Whether you are a system administrator or a user, the responsiveness of your Unix system is likely to be the primary criterion by which you evaluate your machine. Of course, "responsiveness" is a loaded word.
What about your system is responsive? Responsive to whom? How fast
does the system need to be to be responsive? There is no one silver
bullet that will slay all system latencies, but there are tools that
isolate performance bottlenecks -- the most important of which you
carry on your shoulders.
This chapter deals with issues that affect system performance generally and with how you go about finding and alleviating system bottlenecks. Of course, this chapter cannot be a comprehensive guide to tuning your system for your needs, since that depends far too much on the flavors of Unix and the machines on which they run. However,
there are principles and programs that are widely available that will
help you assess how much more performance you can expect from your
hardware.
One of the fundamental illusions in a multiuser, multiprocessing operating system like Unix is that every user and every process is made to think that they are alone on the machine. This is by design. At the kernel level, a program called the scheduler attempts to juggle the needs of each user and each process, balancing competing goals such as keeping interactive sessions responsive and giving every process a fair share of the CPU. System performance degrades when one of these goals overwhelms the others. The effects are intuitive: if five times the normal number of users are logged into your system, chances are that your session will be less responsive than at less busy times.
Performance tuning is a multifaceted problem. At its most basic,
performance issues can be looked at
as being either global or
local problems. Global problems affect the
system as a whole and can generally be fixed only by the system
administrator. These problems include insufficient RAM or hard drive
space, inadequately powerful CPUs, and scanty network bandwidth. The
global problems are really the result of a host of local issues,
which all involve how each process on the system consumes resources.
Often, it is up to the users to fix the bottlenecks in their own
processes.
Global problems are diagnosed with tools that report system-wide statistics. For instance, when a system appears sluggish, most administrators run uptime (Section 26.4) to check the load averages, which indicate how many processes have recently been competing to run. If these numbers are significantly higher than usual, something is amiss (perhaps your web server has been slashdotted).
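For instance, here is what uptime might report on a busy machine (the numbers are illustrative):
$ uptime
 10:42am  up 14 days,  2:32,  6 users,  load average: 4.83, 4.51, 3.92
The three load averages cover the last 1, 5, and 15 minutes. On a single-CPU machine, sustained values well above 1 mean processes are queuing up for the processor.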
If uptime suggests increased activity, the next
tool to use is either ps or
top to see if you can find the set of
processes causing the trouble. Because it shows you
"live" numbers,
top can be particularly useful in this situation.
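For example, on systems with a procps-style ps (most Linux systems), you can list the top CPU consumers directly; note that the --sort option shown here is specific to that version of ps:
$ ps aux --sort=-%cpu | head -6
The first line of output is the column header, followed by the five hungriest processes.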
I also recommend checking the amount of free disk space with df, since a full filesystem is often an unhappy one, and its misery spreads quickly.
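A quick check might look like this (the numbers are illustrative):
$ df -k
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/hda1             19405572  18235612    184112  99% /
A Use% at or near 100% on a busy filesystem deserves immediate attention.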
Once particular processes have been isolated as being problematic, it's time to think locally. Process performance suffers either when there isn't enough CPU time available to finish a task (such a process is known as CPU-bound) or when the process is waiting on some I/O resource (i.e., it is I/O-bound), such as the hard drive or the network. One strategy for dealing with CPU-bound processes,
if you have the source code for them, is to use a
profiler like
GNU's gprof. Profilers give an accounting of how much CPU time is spent in each subroutine of a given program. For instance, if I want to profile one of my programs, I'd first compile it with gcc, using the -pg compilation flag. Then I'd run the program. This creates the gmon.out data file that gprof can read. Now I can use gprof to give me a report with the following invocation:
$ gprof -b executable gmon.out
Here's an abbreviated version of the output:
Flat profile:

Each sample counts as 0.01 seconds.
 no time accumulated

  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
 0.00      0.00     0.00        2     0.00     0.00  die_if_fault_occurred
 0.00      0.00     0.00        1     0.00     0.00  get_double
 0.00      0.00     0.00        1     0.00     0.00  print_values
Here, we see that three subroutines defined in this program
(die_if_fault_occurred,
get_double, and print_values)
were called. In fact, the first subroutine was called twice. Because this program is neither processor- nor I/O-intensive, no significant time is recorded for any of them. If one subroutine took significantly longer to run than the others, or was called significantly more often, you would want to see how you could make that problem subroutine faster. This is just the tip of the profiling iceberg.
Consult your language's profiler documentation for
more details.
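To recap the whole cycle, here is a minimal sketch, assuming a C source file with the hypothetical name myprog.c:
$ gcc -pg -o myprog myprog.c    # compile with profiling support
$ ./myprog                      # run normally; writes gmon.out on exit
$ gprof -b myprog gmon.out      # print the brief report
The -b flag suppresses the verbose field explanations that gprof otherwise prints.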
One less detailed way to look at processes is to get an accounting of how much time a program took to run in user space, in kernel space, and in real (wall-clock) time. For this, there is the time (Section 26.2) command, built into both the C shell and bash. As an external program, /bin/time gives a slightly less detailed report.
No special compilation is necessary to use this program, so
it's a good tool to use to get a first approximation
of the bottlenecks in a particular process.
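For example, timing a run under bash looks like this (the numbers are illustrative):
$ time ./myprog

real    0m12.392s
user    0m10.214s
sys     0m0.841s
A large gap between real and user plus sys suggests the process spent most of its time waiting on I/O or on other processes; that is, it is I/O-bound rather than CPU-bound.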
Resolving I/O-bound issues is difficult for users. Only administrators can both tweak the low-level system settings that control system I/O buffering and install new hardware, if needed. CPU-bound processes
might be improved by dividing the program into smaller programs that
feed data to each other. Ideally, these smaller programs can be
spread across several machines. This is the basis of distributed
computing.
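The classic Unix pipeline is the simplest form of this decomposition: each stage runs as its own process, so on a multiprocessor machine the stages can execute concurrently. A hypothetical log-crunching job (the filename and pattern are illustrative) might look like this:
$ gzip -dc big.log.gz | grep ERROR | sort | uniq -c | sort -rn
Here the decompression, the search, and the sorting each get their own process, and data flows between them without ever landing in a temporary file.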
Sometimes, you want a particular process to hog all the system resources. This is the idea behind a dedicated server, such as one that hosts the Apache web server or an Oracle database. Often, server
software will have configuration switches that help the administrator
allocate system resources based on typical usage. This, of course, is
far beyond the scope of this book, but do check out Web
Performance Tuning and Oracle Performance
Tuning from O'Reilly for more details.
For more system-wide tips, pick up System Performance
Tuning, also from O'Reilly.
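As a taste of what such switches look like, here are two directives from an Apache 1.3-era httpd.conf (the values are illustrative, and directive names vary across Apache versions):
# httpd.conf -- cap simultaneous server processes and idle connection time
MaxClients        150
KeepAliveTimeout  15
The first limits how many requests are served at once; the second controls how long an idle connection is kept open.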
As with so many things in life, you can improve performance only so
much. In fact, by improving performance in one area, you're likely to see performance degrade for other tasks. Unless you've got a machine
that's dedicated to a very specific task, beware the
temptation to over-optimize.
-- JJ