B.1 About Processes
Unix
is a multitasking operating system. Every task that the computer is
performing at any moment—every user running a word processor
program, for example—has a process. The
process is the operating system's fundamental tool
for controlling the computer.
Nearly everything that Unix does is done with a process. One process
displays the characters login: on the
user's terminal and reads the characters that the
user types to log into the system. Another process spools PostScript
to the laser printer. (If you don't have a
PostScript-based printer, yet another process translates PostScript
into whatever language your printer happens to use—for example,
PCL.) On a workstation, a special process called the
window server displays text in windows on the
screen. (Another process called the window
manager lets the user move those windows around.)
At any given moment, the average Unix operating system might be
running anywhere from a few dozen to several hundred different
processes. Large multiuser systems typically run hundreds to
thousands of processes, as Unix runs at least one process for every
user who is logged in, another process for every program that every
user is running, another process for every hardwired terminal that is
waiting for a new user, and a few dozen processes to manage servers
and background tasks.
But regardless of whether you are responsible for security on a small
system or a large one, understanding how processes work and the
process lifecycle is vital to understanding security issues.
B.1.1 Processes and Programs
The goal of the Unix process system is to
share resources (such as access to the CPU) among multiple programs
while providing a high degree of isolation between individual
instances of execution. Each executing process is given its own
context, which consists of
a private address space, a private stack, and its own set of file
descriptors and CPU registers (including its own program counter).
The underlying hardware and operating system software manage the
contents of registers in such a way that each process views the
computer's resources as its
"own" while it is running.
In modern programming parlance, a
thread is a flow of execution in a process.
Most processes are single-threaded and manage only a single flow of
execution. However, many Unix kernels (and programming libraries)
support the creation of multiple threads in a single process. Each
thread gets its own stack and registers, but shares most other
resources, such as address space, with other threads in the same
process. On some Unix operating systems, the system calls that create
threads allow the programmer to choose which aspects of context are
shared and which are private when a new thread is created.
In multithreaded programs, threads are often referred to as
"lightweight processes." Because
threads in the same process share so many more resources than
separate processes, the kernel can switch much more quickly between
the threads' contexts than it can between processes.
This is especially useful in applications such as web servers, in
which individual threads serving each web request can profitably
share most of the process context.
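The difference between sharing an address space and copying one can be seen in a few lines. The following is a minimal Python sketch, not taken from any particular system: the function name thread_vs_fork is hypothetical, and it assumes a Unix system where os.fork( ) is available (fork( ) itself is described in Section B.1.4).

```python
import os
import threading


def thread_vs_fork():
    """Return the counter as the parent sees it after a thread, then a child, increments it."""
    counter = [0]

    def bump():
        counter[0] += 1

    # A thread runs in the same address space, so its update is visible here.
    worker = threading.Thread(target=bump)
    worker.start()
    worker.join()
    after_thread = counter[0]

    # A forked child works on its own copy of memory; its update stays private.
    pid = os.fork()
    if pid == 0:
        bump()
        os._exit(0)  # leave the child immediately so it never runs the parent's code
    os.waitpid(pid, 0)
    after_fork = counter[0]
    return after_thread, after_fork


if __name__ == "__main__":
    print(thread_vs_fork())
```

The thread's increment shows up in the parent, while the child's increment is made to a private copy and is lost when the child exits.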
On a single-processor system only one process at a time is actually
running, of course; the operating system allows each process to run
until it "blocks" because it
requests information that is currently unavailable, because it
explicitly waits for some other event to occur, or because it has
exceeded its allowable amount of CPU time. Once a process blocks, the
operating system turns over control to another process that is ready
to run. The switching normally happens so fast that all of the
processes appear to be running concurrently. Multiprocessor
computers can run several processes truly simultaneously, although
they also switch execution contexts when there are more runnable
processes than processors.
Every Unix process (except perhaps the very first) is associated with
a program. Programs are usually referred to by the names of the files
in which they are kept. For example, the program that lists files is
named /bin/ls, and the program that spools data
to the printer is typically named /usr/lib/lpd.
Processes normally run a single program and then exit. However, a
program can cause another program to run. In this case, the same
process starts running another program.
There are three ways that a process can run executable code that is
not stored in a file:
The process may have been specially crafted in a block of memory and
then executed. This is the method that the Unix kernel uses to begin
the first process when the operating system starts up. This usually
happens only at startup.
The program's file can be deleted after its process
starts up. In this case, the process's program is
really stored in a file, but the file no longer has a name and cannot
be accessed by any other processes. The file is deleted automatically
when the process exits or runs another program.
A process can load additional machine code into its memory space and
then execute it. This is the technique that is used by shared
libraries, loadable object modules, and many
"plug-in" architectures. This is
also the technique that is used by many buffer overflow attacks.
Because there are many ways to dynamically modify the code that is
executing in the address space of a process, you should not assume
that the code a process is running on your computer is the same as
the contents of the program file from which it was loaded.
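The second case above, a file that no longer has a name but is still in use, can be observed from user space. The following Python sketch is an illustration only: it uses an ordinary data file held open through a file descriptor rather than a running program's executable, and the function name read_after_unlink is hypothetical.

```python
import os
import tempfile


def read_after_unlink(payload=b"still here"):
    """Write a file, remove its name, and read the data back through the open descriptor."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)  # remove the only name that refers to the file
        name_gone = not os.path.exists(path)
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, len(payload))  # the contents are still reachable
    finally:
        os.close(fd)  # dropping the last reference lets the kernel free the storage
    return name_gone, data


if __name__ == "__main__":
    print(read_after_unlink())
```

The name disappears at once, but the contents remain reachable until the last reference is closed.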
B.1.2 The ps Command
The ps
command gives you a snapshot
of all of the processes running at any given moment.
ps tells you information about the running
programs on your system, as well as which programs the operating
system is spending its time executing.
Many system administrators routinely use the ps
command to see why their computers are running so slowly; system
administrators should also regularly use the command to look for
suspicious processes. (Suspicious processes are any processes that
you don't expect to be running. Methods of
identifying suspicious processes are described in detail in earlier
chapters.)
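One way to make a routine check for unexpected processes repeatable is to compare a ps snapshot against a list of programs you expect to see. The following Python sketch shows the idea under stated assumptions: the function names and the expected-program set are hypothetical, and the ps invocation relies on the POSIX -e and -o options, which you should confirm against ps(1) on your own system.

```python
import subprocess


def parse_ps(text):
    """Parse lines of 'PID PPID USER COMMAND...' into a list of dictionaries."""
    procs = []
    for line in text.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 4 or not fields[0].isdigit():
            continue  # skip blank lines and any header line
        pid, ppid, user, command = fields
        procs.append({"pid": int(pid), "ppid": int(ppid),
                      "user": user, "command": command})
    return procs


def unexpected(procs, expected_programs):
    """Return the processes whose program name is not in the expected set."""
    return [p for p in procs
            if p["command"].split()[0] not in expected_programs]


def snapshot():
    """Run ps with headerless columns; assumes a POSIX-style ps is installed."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "ppid=", "-o", "user=", "-o", "args="],
        capture_output=True, text=True, check=True)
    return parse_ps(result.stdout)


if __name__ == "__main__":
    expected = {"/usr/sbin/cron", "/usr/sbin/syslogd"}  # hypothetical allowlist
    for proc in unexpected(snapshot(), expected):
        print(proc["pid"], proc["user"], proc["command"])
```

A name-based comparison like this is only a first filter, because a process can change what ps displays for it.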
The top command is another popular program
for viewing which processes are currently running.
top prints an ASCII screen with a continuously
updated view of the top-running processes, defined as those processes
that are consuming the most CPU time (although other sorting rules,
such as memory usage, are also available). Although
top is an extremely useful command, you should
not let it become a substitute for ps, as there
are many important processes that will never appear in the output of
the top command simply because they do not
consume enough resources.
B.1.2.1 Listing processes with Solaris and other Unix systems derived from System V
The System V ps
command will normally print only the processes that are associated
with the terminal on which the program is being run. To list all of
the processes that are running on your computer, you must run the
program with the -ef options. The options are:
-e
    List all processes
-f
    Produce a full listing
For example:
sun.vineyard.net% /bin/ps -ef
UID PID PPID C STIME TTY TIME COMD
root 0 0 64 Nov 16 ? 0:01 sched
root 1 0 80 Nov 16 ? 9:56 /etc/init -
root 2 0 80 Nov 16 ? 0:10 pageout
root 3 0 80 Nov 16 ? 78:20 fsflush
root 227 1 24 Nov 16 ? 0:00 /usr/lib/saf/sac -t 300
root 269 1 18 Nov 16 console 0:00 /usr/lib/saf/ttymon -g -
root 97 1 80 Nov 16 ? 1:02 /usr/sbin/rpcbind
root 208 1 80 Nov 16 ? 0:01 /usr/dt/bin/dtlogin
root 99 1 21 Nov 16 ? 0:00 /usr/sbin/keyserv
root 117 1 12 Nov 16 ? 0:00 /usr/lib/nfs/statd
root 105 1 12 Nov 16 ? 0:00 /usr/sbin/kerbd
root 119 1 27 Nov 16 ? 0:00 /usr/lib/nfs/lockd
root 138 1 12 Nov 16 ? 0:00 /usr/lib/autofs/automoun
root 162 1 62 Nov 16 ? 0:01 /usr/lib/lpsched
root 142 1 41 Nov 16 ? 0:00 /usr/sbin/syslogd
root 152 1 80 Nov 16 ? 0:07 /usr/sbin/cron
root 169 162 8 Nov 16 ? 0:00 lpNet
root 172 1 80 Nov 16 ? 0:02 /usr/lib/sendmail -q1h
root 199 1 80 Nov 16 ? 0:02 /usr/sbin/vold
root 180 1 80 Nov 16 ? 0:04 /usr/lib/utmpd
root 234 227 31 Nov 16 ? 0:00 /usr/lib/saf/listen tcp
simsong 14670 14563 13 12:22:12 pts/11 0:00 rlogin next
root 235 227 45 Nov 16 ? 0:00 /usr/lib/saf/ttymon
simsong 14673 14535 34 12:23:06 pts/5 0:00 rlogin next
simsong 14509 1 80 11:32:43 ? 0:05 /usr/dt/bin/dsdm
simsong 14528 14520 80 11:32:51 ? 0:18 dtwm
simsong 14535 14533 66 11:33:04 pts/5 0:01 /usr/local/bin/tcsh
simsong 14529 14520 80 11:32:56 ? 0:03 dtfile -session dta003TF
root 14467 1 11 11:32:23 ? 0:00 /usr/openwin/bin/fbconso
simsong 14635 14533 80 11:48:18 pts/12 0:01 /usr/local/bin/tcsh
simsong 14728 14727 65 15:29:20 pts/9 0:01 rlogin next
root 332 114 80 Nov 16 ? 0:02 /usr/dt/bin/rpc.ttdbserv
root 14086 208 80 Dec 01 ? 8:26 /usr/openwin/bin/Xsun :0
simsong 13121 13098 80 Nov 29 pts/6 0:01 /usr/local/bin/tcsh
simsong 15074 14635 20 10:48:34 pts/12 0:00 /bin/ps -ef
Table B-1 summarizes the meaning of each field in
this output.
Table B-1. Fields in ps output (System V)
UID | Username or user ID the program is running as.
PID | Process's identification number (see the next section).
PPID | Process ID of the process's parent process.
C | Processor utilization, which is an indication of how much CPU time the process is using at the moment.
STIME | Time or date when the process started executing.
TTY | Controlling terminal for the process. Processes with no controlling terminal display a "?" in this column.
TIME | Total amount of CPU time that the process has used.
COMD | Command that was used to start the process. More precisely, this column shows all of the command's arguments, beginning with argv[0], which is usually the command's name. Processes can, however, set argv[0] to other values (several network servers that spawn multiple processes, such as sendmail, change this so that ps displays information about what each sendmail process is responsible for doing).
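The PID and PPID columns together describe a tree, which is often easier to reason about than a flat listing. The following Python sketch rebuilds part of that tree from a few rows copied from the sample output above; the function names are hypothetical, and the rows are hand-entered rather than parsed from ps.

```python
from collections import defaultdict

# (PID, PPID, command) rows copied from the sample System V listing.
SAMPLE = [
    (0, 0, "sched"),
    (1, 0, "/etc/init -"),
    (227, 1, "/usr/lib/saf/sac -t 300"),
    (234, 227, "/usr/lib/saf/listen tcp"),
    (235, 227, "/usr/lib/saf/ttymon"),
    (162, 1, "/usr/lib/lpsched"),
    (169, 162, "lpNet"),
]


def children_by_parent(rows):
    """Map each parent PID to the list of its children's PIDs."""
    tree = defaultdict(list)
    for pid, ppid, _command in rows:
        if pid != ppid:  # PID 0 lists itself as its own parent in the sample
            tree[ppid].append(pid)
    return tree


def ancestry(pid, parent_of):
    """Follow PPID links upward from pid until the chain ends."""
    chain = [pid]
    while chain[-1] in parent_of and parent_of[chain[-1]] != chain[-1]:
        chain.append(parent_of[chain[-1]])
    return chain


if __name__ == "__main__":
    parent_of = {pid: ppid for pid, ppid, _ in SAMPLE}
    print(dict(children_by_parent(SAMPLE)))
    print(ancestry(169, parent_of))
```

Walking upward from an unfamiliar process in this way shows which daemon or login session started it.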
B.1.2.2 Listing processes with versions of Unix derived from BSD, including Linux
With Berkeley Unix and Linux, you can
use the command:
% ps auxww
to display detailed information about every process running on your
computer.
The options specified in this command are:
a
    Lists all processes
u
    Displays the information in a user-oriented style
x
    Includes information on processes that do not have controlling ttys
ww
    Includes the complete command lines, even if they run past 132 columns
For example:
% ps auxww
USER PID %CPU %MEM SZ RSS TT STAT TIME COMMAND
simsong 1996 62.6 0.6 1136 1000 q8 R 0:02 ps auxww
root 111 0.0 0.0 32 16 ? I 1:10 /etc/biod 4
daemon 115 0.0 0.1 164 148 ? S 2:06 /etc/syslog
root 103 0.0 0.1 140 116 ? I 0:44 /etc/portmap
root 116 0.0 0.5 860 832 ? I 12:24 /etc/mountd -i -s
root 191 0.0 0.2 384 352 ? I 0:30 /usr/etc/bin/lpd
root 73 0.0 0.3 528 484 ? S < 7:31 /usr/etc/ntpd -n
root 4 0.0 0.0 0 0 ? I 0:00 tpathd
root 3 0.0 0.0 0 0 ? R 0:00 idleproc
root 2 0.0 0.0 4096 0 ? D 0:00 pagedaemon
root 239 0.0 0.1 180 156 co I 0:00 std.9600 console
root 0 0.0 0.0 0 0 ? D 0:08 swapper
root 178 0.0 0.3 700 616 ? I 6:31 /etc/snmpd
root 174 0.0 0.1 184 148 ? S 5:06 /etc/inetd
root 168 0.0 0.0 56 44 ? I 0:16 /etc/cron
root 132 0.0 0.2 452 352 co I 0:11 /usr/etc/lockd
jdavis 383 0.0 0.1 176 96 p0 I 0:03 rlogin hymie
ishii 1985 0.0 0.1 284 152 q1 S 0:00 /usr/ucb/mail bl
root 26795 0.0 0.1 128 92 ? S 0:00 timed
root 25728 0.0 0.0 136 56 t3 I 0:00 telnetd
jdavis 359 0.0 0.1 540 212 p0 I 0:00 -tcsh (tcsh)
root 205 0.0 0.1 216 168 ? I 0:04 /usr/local/cap/atis
kkarahal 16296 0.0 0.4 1144 640 ? I 0:00 emacs
root 358 0.0 0.0 120 44 p0 I 0:03 rlogind
root 26568 0.0 0.0 0 0 ? Z 0:00 <exiting>
root 10862 0.0 0.1 376 112 ? I 0:00 rshd
The fields in this output are summarized in Table B-2. Individual STAT
characters are summarized in Tables B-3, B-4, and B-5.
Table B-2. Fields in ps output (Berkeley-derived)
USER | Username of the process. If the process has a UID (described in the next section) that does not appear in /etc/passwd, the UID is printed instead.
PID | Process's identification number.
%CPU, %MEM | Percentage of the system's CPU and memory that the process is using.
SZ | Amount of virtual memory that the process is using.
RSS | Resident set size of the process, i.e., the amount of physical memory that the process is occupying.
TT | Terminal that is controlling the process.
STAT | Field denoting the status of the process; up to three letters (four under SunOS) are shown.
TIME | CPU time used by the process.
COMMAND | Name of the command (and arguments).
Table B-3. Runnability of process (first letter of STAT field)
R | Actually running or runnable.
S | Sleeping (sleeping < 20 seconds).
I | Idle (sleeping > 20 seconds).
T | Stopped.
H | Halted.
P | In page wait.
D | In disk wait. Processes in this state are waiting for hardware to become available and cannot be interrupted.
Z | Zombie. A zombie is a defunct child process that has exited and expects to report its status back to its parent, but whose parent has not called wait( ) to collect the status and "reap" the child process. When the parent of a zombie exits, the init process reaps any remaining zombies. Zombies take up an entry in the process table, but no other resources.
Table B-4. Status of process swapping (second letter of STAT field)
<Blank> | In memory (often referred to as "in core")
W | Swapped out
> | Process that has exceeded a soft limit on memory requirements
Table B-5. Status of processes running with altered CPU schedules (third letter of STAT field)
N | Process is running at a low priority
# | nice (a number greater than 0)
< | Process is running at a high priority
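The three STAT tables can be combined into a small decoder. The following Python sketch is a hypothetical helper, not part of ps: it assumes the positional layout described in Tables B-3 through B-5 (one character per position, with a blank meaning "nothing unusual"), it treats a blank third position as normal priority, which is an assumption the tables do not state, and it omits the "#" entry. Versions of ps that append flags such as "+" or "s" use a different layout and are not handled.

```python
RUNNABILITY = {
    "R": "running or runnable", "S": "sleeping", "I": "idle",
    "T": "stopped", "H": "halted", "P": "in page wait",
    "D": "in disk wait", "Z": "zombie",
}
SWAPPING = {" ": "in memory", "W": "swapped out",
            ">": "exceeded a soft memory limit"}
SCHEDULING = {" ": "normal priority",  # assumption: blank means unaltered
              "N": "low priority", "<": "high priority"}


def decode_stat(stat):
    """Decode a positional three-character STAT field into three descriptions."""
    padded = stat.ljust(3)[:3]  # pad short fields with blanks, ignore a fourth letter
    return (RUNNABILITY.get(padded[0], "unknown"),
            SWAPPING.get(padded[1], "unknown"),
            SCHEDULING.get(padded[2], "unknown"))


if __name__ == "__main__":
    for field in ("S <", "I", "Z"):
        print(repr(field), decode_stat(field))
```

For instance, the "S <" shown for ntpd in the sample listing decodes as a sleeping process that is in memory and running at a high priority.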
B.1.3 Process Properties
The kernel maintains a set of properties
for every Unix process. Most of these properties are denoted by
numbers. Some of these numbers refer to processes, while others
determine what privileges the processes have.
B.1.3.1 Process identification numbers (PIDs)
Every process is assigned a unique number called the
process identifier, or PID. The first process
to run, called init, is given the number 1.
On many systems, process numbers range from 1 to 65,535, although the
exact upper limit varies among Unix versions. When the kernel runs out of process
numbers, it recycles them. The kernel guarantees that no two
active processes will ever have the same number.
B.1.3.2 Process real and effective UIDs
Every Unix process has two user identifiers: a
real UID and an effective UID.
The real UID (RUID) is the actual user
identifier (UID) of the entity (usually a person, but possibly a
daemon service such as mail) that is running the
program. It is usually the same as the UID of the actual person who
is logged into the computer, sitting in front of the terminal (or
workstation).
The effective UID (EUID) identifies the actual
privileges of the process that is running.
Normally, the real UID and the effective UID are the same. That is,
you have only the privileges associated with your own UID. Sometimes,
however, the real and effective UIDs can be different. This occurs
when a user runs a special kind of program called a SUID program.
SUID programs are often used to accomplish specific functions that
require extra privileges (such as changing the
user's password). SUID programs are described in
Chapter 5.
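A process can inspect both of its own identifiers. The following Python sketch uses os.getuid( ) and os.geteuid( ), which wrap the corresponding system calls; the function names uid_report and username are hypothetical. The username helper mirrors the ps behavior noted in Table B-2 by falling back to the number when the UID has no entry in the password database.

```python
import os
import pwd


def uid_report():
    """Return this process's real and effective UIDs and whether they differ."""
    ruid, euid = os.getuid(), os.geteuid()
    return {"ruid": ruid, "euid": euid, "differs": ruid != euid}


def username(uid):
    """Translate a UID to a username, or return the number as text if none exists."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


if __name__ == "__main__":
    report = uid_report()
    print("real:", username(report["ruid"]),
          "effective:", username(report["euid"]),
          "differ:", report["differs"])
```

Run from an ordinary login shell, the two values normally match; they differ only when the process was started from a SUID program or has changed its own identity.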
B.1.3.3 Process priority and niceness
Although Unix is a multitasking operating
system, a computer with a single processor can run only a single
process at a time.
Every fraction of a second, the Unix operating system rapidly
switches between many different processes so that each one gets a
little bit of work done within a given amount of time. A tiny but
important part of the Unix kernel called the process
scheduler
decides which process is allowed to run at any given moment and how
much CPU time that process should get.
To calculate which process it should run next, the scheduler computes
the priority of every process. The process
with the lowest priority number (the highest priority) runs. A
process's priority is determined with a complex
formula that includes what the process is doing and how much CPU time
the process has already consumed. A special number called the
nice number, or simply the
nice, biases this calculation: the lower a
process's nice number, the higher its calculated
priority, and the more likely that it will be run. Put another way,
the nicer the program, the less time it expects (and gets) from the
kernel.
On most versions of Unix, nice numbers are limited to the range -20 to
+20. Most processes have a nice of 0. A process with a nice number of
+19 will probably not run until the system is almost completely idle;
likewise, a process with a nice number of -19 will probably preempt
every other user process on the system.
Sometimes, you will want to make a process run slower. In some cases,
processes take more than their "fair
share" of the CPU, but you don't
want to kill them outright. An example is a program that a researcher
left running overnight to perform mathematical calculations that
hasn't finished the next morning. In this case,
rather than killing the process and forcing the researcher to restart
it later from the beginning, you could simply cut the amount of CPU
time that the process is getting and let it finish slowly during the
day. The program /etc/renice lets you change a
process's niceness.
For example, suppose that Simson left a program running before he
went home. Now it's late at night, and
Simson's program is taking up most of the
computer's CPU time:
% ps aux | head -5
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
simsong 20655 82.2 0.3 1712 1304 p1 S+ 1:34AM 343:48.71 rsync -avz --rsh=ssh /raid4/project g3:/usr/bak
simsong 20656 11.3 0.3 2548 1688 p1 R+ 1:34AM 62:55.55 ssh g3 rsync --server -vlogDtprz . /usr/bak
spaf 86311 0.0 0.2 1440 1036 p1 Is Fri05PM 0:00.23 -tcsh (tcsh)
spaf 91856 0.0 1.0 8412 5272 p1 T Fri11PM 0:00.88 emacs .
beth 5643 0.0 0.2 1436 1036 p3 Ss Sat08AM 0:00.21 -tcsh (tcsh)
You could slow down Simson's program by renicing it
to a higher nice number.
For security reasons, normal users are only allowed to increase the
nice numbers of their own processes. Only the superuser can lower the
nice number of a process or raise the nice number of somebody
else's process. (Fortunately, in this example we
know the superuser password!)
% /bin/su
password: another39
# /etc/renice +4 20655
20655: old priority 0, new priority 4
# ps 20655
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
simsong 20655 65.2 0.3 1712 1304 p1 RN+ 1:34AM 343:48.71 rsync -avz --rsh=ssh /raid4/project g3:/usr/bak
The N in the STAT field indicates that the
rsync process is now running at a lower priority
(it is "niced"). Notice that the
process's CPU consumption has already decreased. Any
new processes that are spawned by the process with PID 20655 will
inherit this new nice value, too.
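The same adjustment can be made from a program. The following Python sketch uses os.getpriority( ) and os.setpriority( ), which wrap the getpriority( ) and setpriority( ) system calls; the function name renice_child is hypothetical, and the upper bound of 19 is an assumption chosen to stay within the range that most systems accept. Because the sketch only raises the nice number of a process owned by the same user, it does not need superuser privileges.

```python
import os
import signal


def renice_child(increment=4):
    """Fork a child, raise its nice number by `increment`, and return (old, new)."""
    pid = os.fork()
    if pid == 0:
        # Child: do nothing except wait for the parent's signal.
        signal.pause()
        os._exit(0)
    try:
        old = os.getpriority(os.PRIO_PROCESS, pid)
        # Raising the nice number of one's own process needs no privileges.
        os.setpriority(os.PRIO_PROCESS, pid, min(old + increment, 19))
        new = os.getpriority(os.PRIO_PROCESS, pid)
    finally:
        os.kill(pid, signal.SIGTERM)  # clean up the helper process
        os.waitpid(pid, 0)
    return old, new


if __name__ == "__main__":
    print(renice_child())
```

Passing a negative increment as an ordinary user would fail with a permission error, which matches the rule that only the superuser can lower a nice number.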
You can also use /etc/renice to lower the
nice number of a process to make it finish faster. Although giving a
process a lower nice number won't speed up the CPU
or make your computer's hard disk transfer data
faster, a negative nice number will cause Unix to run that
process more often than it runs others on the system. Of course, if you ran
every process with the same negative nice number,
there wouldn't be any apparent benefit.
Some versions of the renice command allow you to
change the nice of all processes belonging to a user or all processes
in a process group (described in the next section). For instance, to
speed up all of Simson's processes, you might type:
# renice -2 -u simsong
Remember: processes with a lower nice number run
faster.
Note that because of the Unix scheduling system, renicing several
processes to lower numbers is likely to increase paging activity if
there is limited physical memory, and therefore adversely impact
overall system performance.
What do process priority and niceness have to do with security? If an
intruder has broken into your system and you have contacted the
authorities and are tracing the phone call, slowing down the
intruder's processes with a nice number of +10 or +15 will limit the
damage that the intruder can do without hanging up the phone (and
losing your chance to catch the intruder). Of course, any time that an
intruder is on a system, exercise extreme caution.
Also, running your own shell with a higher priority may give you an
advantage if the system is heavily loaded. The easiest way to do so
is by typing:
# renice -5 $$
The shell will replace the $$ with the PID of
the shell's process.
B.1.3.4 Process groups and sessions
With Berkeley-derived versions of Unix, including SVR4, each process
is assigned a process ID (PID), a
process group ID, and a session ID.
Process groups and sessions are used to implement job control.
For each process, the PID is a unique number, the process group ID is
the PID of the process group leader process, and the session ID is
the PID of the session leader process. When a process is created, it
inherits the process group ID and the session ID of its parent
process. Any process may create a new process group by calling
setpgrp( ) and may create a new session by calling the
Unix system call setsid( ). All processes that have the same process
group ID are said to be in the same process group.
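The inheritance rule and the effect of setsid( ) can be checked directly. The following Python sketch forks a child that reports its process group ID and session ID before and after calling os.setsid( ); the function name session_demo and the use of a pipe to carry the numbers back to the parent are illustrative choices, not anything prescribed by the system.

```python
import os


def session_demo():
    """Return (parent_ids, inherited_ids, ids_after_setsid, child_pid).

    Each *_ids value is a (process group ID, session ID) pair.
    """
    parent_ids = (os.getpgrp(), os.getsid(0))
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            os.close(read_end)
            inherited = (os.getpgrp(), os.getsid(0))
            os.setsid()  # become leader of a new session and a new process group
            own = (os.getpgrp(), os.getsid(0))
            os.write(write_end, ("%d %d %d %d" % (inherited + own)).encode())
            status = 0
        finally:
            os._exit(status)  # never fall through into the parent's code
    os.close(write_end)
    data = os.read(read_end, 128)
    os.close(read_end)
    os.waitpid(pid, 0)
    values = [int(v) for v in data.split()]
    return parent_ids, tuple(values[:2]), tuple(values[2:]), pid


if __name__ == "__main__":
    print(session_demo())
```

Before the call the child reports its parent's IDs; afterward both of its IDs equal its own PID, because it now leads a new process group and a new session.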
Each Unix process group belongs to a session group. This is used to
help manage signals and orphaned processes. Once a user has logged
in, the user may start multiple sets of processes, or jobs, using the
shell's job control mechanism. A job may have a
single process, such as a single invocation of the
ls command. Alternatively, a job may have
several processes, such as a complex shell pipeline. For each of
these jobs, there is a process group. Unix also keeps track of the
particular process group that is controlling the terminal. This can
be set or changed with ioctl( ) system calls. Only the controlling
process group can read or write to the terminal.
A process could become an orphan if its parent process exits but it
continues to run. Historically, these processes would be inherited by
the init process but would remain in their
original process group. If a signal were sent by the controlling
terminal (process group), then it would go to the orphaned process,
even though it no longer had any real connection to the terminal or
the rest of the process group.
To counter this situation,
POSIX defines an orphaned process
group. This is a process group in which the parent of every member
either is not a member of the process group's
session or is itself a member of the same process group. Orphaned
process groups are not sent terminal signals when they are generated.
Because of the way in which new sessions are created, the initial
process in the first process group is always an orphan (its ancestor
is not in the session). Command interpreters are usually spawned as
session leaders, so they ignore TSTP signals from the
terminal.
B.1.4 Creating Processes
A Unix process can create a new process with the fork( ) system
function. fork( ) makes an identical copy of the calling process, with
the exception that one process is identified as the parent or parent
process, while the other is identified as the child or child process.
Note the following differences between child and parent:
They have different PIDs.
They have different PPIDs (parent PIDs).
Accounting information is reset for the child.
They each have their own copy of the file descriptors.
Each has its own unique program counter register value.
Usually, each has its own memory space, although the
child's is a copy of the parent's
immediately after the fork( ).
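The first two differences in this list are easy to observe. The following Python sketch, with the hypothetical function name fork_ids, forks a child that reports its own PID and PPID through a pipe so that the parent can compare them with what it knows. It assumes a Unix system where os.fork( ) is available.

```python
import os


def fork_ids():
    """Return (parent_pid, child_pid_seen_by_parent, pid_reported, ppid_reported)."""
    parent_pid = os.getpid()
    read_end, write_end = os.pipe()
    child_pid = os.fork()  # returns 0 in the child, the child's PID in the parent
    if child_pid == 0:
        try:
            os.close(read_end)
            report = "%d %d" % (os.getpid(), os.getppid())
            os.write(write_end, report.encode())
        finally:
            os._exit(0)  # the child must not continue into the parent's code
    os.close(write_end)
    reported_pid, reported_ppid = (int(v) for v in os.read(read_end, 64).split())
    os.close(read_end)
    os.waitpid(child_pid, 0)
    return parent_pid, child_pid, reported_pid, reported_ppid


if __name__ == "__main__":
    print(fork_ids())
```

The value that fork( ) returns to the parent matches the PID the child reports for itself, and the child's PPID is the parent's PID.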
The exec family of system functions lets a
process change the program that it is running. This is equivalent to
replacing the contents of memory, resetting the stack and registers,
and jumping to the start location of the program. Processes terminate
when they call the _exit( ) system function or when
they generate an exception, e.g., an attempt to use an illegal
instruction or address an invalid region of memory.
Unix uses special programs called
shells
(/bin/ksh, /bin/sh, and
/bin/csh are all common shells) to read commands
from the user and run other programs. The shell runs other programs
by first executing one of the fork family of
system calls to create a near-duplicate second process; the second
process then uses one of the exec family of
calls to run a new program, while the first process waits until the
second process finishes. This technique is used to run virtually
every program in Unix, from small programs such as
/bin/ls to large programs such as Emacs.
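The fork, exec, and wait sequence that a shell performs can be sketched in a few lines. The following Python example is a simplified illustration, not the code of any real shell: the function name run_program is hypothetical, the exit status 127 for a failed exec follows a common shell convention, and the negative return value for a signal death is an arbitrary choice made for this sketch.

```python
import os


def run_program(argv):
    """Run argv the way a shell does: fork, exec in the child, wait in the parent."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)  # replaces the child's program; returns only on failure
        except OSError:
            pass
        os._exit(127)  # exec failed, so report that to the waiting parent
    _, status = os.waitpid(pid, 0)  # the parent blocks until the child finishes
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # child was killed by a signal


if __name__ == "__main__":
    print(run_program(["/bin/ls", "/"]))
```

The parent never runs the new program itself; it only learns the result from the exit status that the child leaves behind.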
If all of the processes on the system were to suddenly die (or exit),
the computer would be unusable because there would be no way to start a
new process. In practice, this scenario never occurs, for reasons
we'll describe later.