39.5 Why Is the System So Slow?

To a user, performance means: "How much time does it take to run my job?" For a system manager, this question is much too simple: a user's job may take a long time to execute because it is badly written or because it doesn't really use the computer appropriately. Furthermore, a system manager must optimize performance for all system users, which is much more complicated than optimizing performance for a single user. Here are some of the things that affect performance.

The UNIX utility /bin/time (39.2) reports the amount of time required to execute a program, breaking down the total time into several important components. For example, consider the report below:

% /bin/time application

  4.8 real      0.5 user      0.7 sys

This report shows that the program ran in roughly 4.8 seconds. This is the elapsed or wall-clock time: it is the actual time that the program runs as it would be measured by a user sitting at the terminal with a stopwatch. The amount of time that the system spent working on your program is much smaller. It spent 0.5 seconds of user time, which is time spent executing code in the user state, and about 0.7 seconds of system time, which is time spent in the system state (i.e., time spent executing UNIX system code) on behalf of the user. The total amount of CPU time (actual execution time on the main processor) was only 1.2 seconds, or only one-quarter of the elapsed time. [1]

[1] Note that BSD and System V versions of /bin/time have different output formats but provide the same information. /bin/time also differs from the C shell's time command (39.3), which provides a more elaborate report.
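
The arithmetic behind this report can be reproduced with a short script. What follows is a minimal sketch in Python, not a description of how /bin/time itself is implemented. The helper names cpu_share and time_command are hypothetical, and the sketch assumes a UNIX system, where os.times() reports the CPU time used by child processes that have been waited for.

```python
import os
import subprocess
import sys
import time


def cpu_share(real, user, system):
    """Return (total CPU seconds, fraction of elapsed time spent on the CPU)."""
    cpu = user + system
    return cpu, cpu / real


def time_command(argv):
    """Run argv and return (real, user, sys) seconds, like the three fields above."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, check=False)
    real = time.perf_counter() - start
    after = os.times()
    # children_user/children_system cover child processes that have been waited for.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system


if __name__ == "__main__":
    # Figures taken from the sample report: 4.8 real, 0.5 user, 0.7 sys.
    cpu, share = cpu_share(4.8, 0.5, 0.7)
    print(f"CPU time {cpu:.1f}s, {share:.0%} of elapsed time")

    # Time a small stand-in command (an assumption; substitute any program).
    real, user, system = time_command([sys.executable, "-c", "sum(range(10**6))"])
    print(f"{real:.1f} real  {user:.1f} user  {system:.1f} sys")
```

The gap between the real figure and the sum of user and sys is the time the script cannot attribute to the CPU.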

Where did the rest of the time go? Some time was spent performing I/O (input/output) operations, which /bin/time doesn't report. Handling I/O requires some computation, which is attributed to system time. But time that is spent by disk drives, network interfaces, terminal controllers, or other hardware isn't accounted for. Most of the remaining time was spent running jobs on behalf of other users. This entails its own performance overhead (context-switch time, swapping time, etc.).

Many different components contribute to a program's total running time. When you understand the roles these components play, you will understand the problem. Here is a summary of the different components:

User-state CPU time. The actual amount of time the CPU spends running your program in the user state. It includes time spent executing library functions but excludes time spent executing system calls (i.e., time spent in the UNIX kernel on behalf of the process). Programmers can control user-state time by knowing which library routines are efficient and which aren't, and they should know how to run profilers on the program to find out where it's spending its time.
System-state CPU time. The amount of time the CPU spends in the system state (i.e., the amount of time spent executing kernel code) on behalf of the program. This includes time spent executing system calls and performing administrative functions on the program's behalf. The distinction between time spent in simple library routines and time spent in system services is important and often confused. A call to strcpy, which copies a character string, executes entirely in the user state because it doesn't require any special handling by the kernel. Calls to printf, fork, and many other routines are much more complex. These functions do require services from the UNIX kernel, so they spend part of their time, if not most of it, in the system state. All I/O routines require the kernel's services.

System-state CPU time is partially under the programmer's control. Although programmers cannot change the amount of time it takes to service any system call, they can rewrite the program to issue system calls more efficiently (for example, to make I/O transfers in larger blocks).
I/O time. The amount of time the I/O subsystem spends servicing the I/O requests that the job issues. Under UNIX, I/O time is difficult to measure; however, there are some tools for determining whether the I/O system is overloaded and some configuration considerations that can help alleviate load problems.
Network time. The amount of time that the I/O subsystem spends servicing network requests that the job issues. This is really a subcategory of I/O time and depends critically on configuration and usage issues.
Time spent running other programs. As system load increases, the CPU spends less time working on any given job, thus increasing the elapsed time required to run the job. This is an annoyance, but barring some problem with I/O or virtual memory performance, there is little you can do about it.
Virtual memory performance. This is by far the most complex aspect of system performance. Ideally, all active jobs would remain in the system's physical memory at all times. But when physical memory is fully occupied, the operating system starts moving parts of jobs to disk, thus freeing memory for the job it wants to run. This takes time. It also takes time when the jobs that were moved to disk need to run again and must be brought back into memory. When running jobs with extremely large memory requirements, system performance can degrade significantly.
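
The advice under system-state CPU time, to make I/O transfers in larger blocks, can be illustrated by counting how many read calls it takes to consume the same file with different block sizes. This is a rough sketch in Python under stated assumptions: count_reads and make_test_file are hypothetical helper names, the file is an ordinary file on a local filesystem, and os.read is used because each call corresponds to one request to the kernel.

```python
import os
import tempfile


def count_reads(path, block_size):
    """Read the whole file with os.read and return how many read calls it took."""
    calls = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, block_size)
            calls += 1
            if not chunk:  # an empty result means end of file
                break
    finally:
        os.close(fd)
    return calls


def make_test_file(size):
    """Create a temporary file of `size` bytes and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"x" * size)
    return path


if __name__ == "__main__":
    path = make_test_file(1 << 20)  # 1 MiB of data
    try:
        for block_size in (512, 65536):
            calls = count_reads(path, block_size)
            print(f"{block_size}-byte blocks: {calls} read calls")
    finally:
        os.remove(path)
```

The time to service each call is outside the program's control, but the number of calls is not: the larger block size moves the same data with far fewer trips into the kernel.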

If you spend most of your time running standard utilities and commercial applications, you can't do much about user-state or system-state time. To make a significant dent in these, you have to rewrite the program. But you can do a lot to improve your memory and I/O performance, and you can do a lot to run your big applications more efficiently.
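
Before deciding which component to work on, it helps to see how a job's resource use divides up. The sketch below is one possible approach in Python, assuming a UNIX system where the standard resource module is available. The helper names usage_snapshot and usage_delta are hypothetical, and some kernels do not maintain every field (the block counts in particular may stay at zero).

```python
import resource


def usage_snapshot():
    """Return CPU, paging, and block I/O counters for this process so far."""
    ru = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "user": ru.ru_utime,            # user-state CPU seconds
        "sys": ru.ru_stime,             # system-state CPU seconds
        "major_faults": ru.ru_majflt,   # page faults that required I/O
        "blocks_in": ru.ru_inblock,     # block input operations
        "blocks_out": ru.ru_oublock,    # block output operations
    }


def usage_delta(before, after):
    """Subtract two snapshots field by field."""
    return {key: after[key] - before[key] for key in before}


if __name__ == "__main__":
    before = usage_snapshot()
    total = sum(i * i for i in range(10**6))  # stand-in workload: pure computation
    after = usage_snapshot()
    for name, value in usage_delta(before, after).items():
        print(f"{name:>12}: {value}")
```

A workload dominated by user time with few faults and little block I/O leaves little to tune short of rewriting the program; high fault or block counts point toward memory and I/O instead.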

Keyboard response is an extremely important issue to users, although it really doesn't contribute to a program's execution time. If there is a noticeable gap between the time when a user types a character and the time when the system echoes that character, the user will think performance is bad, regardless of how much time it takes to run a job. In order to prevent terminal buffers from overflowing and losing characters, most UNIX systems give terminal drivers (42.1) very high priority. As a side effect, the high priority of terminals means that keyboard response should be bad only under exceptionally high loads. If you are accessing a remote system across a network, however, network delays can cause poor keyboard response. Network performance is an extremely complex issue.

- ML from O'Reilly & Associates' System Performance Tuning, Chapter 1