16.2 Tips on Avoiding Security-Related Bugs
Software engineers define errors as mistakes made by humans when
designing and coding software. Faults are manifestations of errors in
programs that may result in failures. Failures are deviations from
program specifications. In common usage, faults are called bugs.
Why do we bother to explain these formal terms? For three reasons:
To remind you that although bugs (faults) may be present in the code,
they aren't necessarily a problem until they trigger
a failure. Testing is designed to trigger such a failure before the
program becomes operational...and results in damage.
Bugs don't suddenly appear in code. They are there
because some person made a mistake—from ignorance, from haste,
from carelessness, or from some other reason. Ultimately,
unintentional flaws that allow someone to compromise your system were
caused by people who made errors.
Almost every piece of Unix software (as well as software for several
other widely used operating systems) has been developed without
comprehensive specifications. As a result, you cannot easily tell
when a program has actually failed. Indeed, what appears to be a bug
to users of the program might be a feature that was intentionally
planned by the program's authors.
When you write a program that will run as superuser or in some other
critical context, you must try to make the program as bug-free as
possible because a bug in a program that runs as superuser can leave
your entire computer system wide open.
Even when your program will run as an unprivileged user,
it's important to design and implement it carefully,
especially if it will be accessed by anonymous or untrusted others.
Bugs become vulnerabilities through privilege escalation; an
untrusted remote user exploits a bug in a network daemon to gain
access as an ordinary local user, and then uses that access to
exploit bugs that allow him to act as a privileged user, or even as
the superuser.
Of course, no program can be guaranteed to be perfect. A library
routine can be faulty, or a stray gamma ray may flip a bit in memory
to cause your program to misbehave. Nevertheless, there are a variety
of techniques that you can employ when writing programs that will
tend to minimize the security implications of any bugs that may be
present. You can also program defensively to try to counter any
problems that you can't anticipate now.
Here are some general rules to code by.
16.2.1 Design Principles
Carefully design the program before you
start. Be certain that you understand what you are trying to build.
Carefully consider the environment in which it will run, the input
and output behavior, files used, arguments recognized, signals
caught, and other aspects of behavior. Try to list all of the errors
that might occur, and how you will deal with them.
Remember: you will need to design your program. Either you will
design the program before you start writing it, or you will design it
while you are writing it. You might as well design as much of the
program as you can before you write the code. That way, if
you decide to change your design in the process, there will be less
code to change.
Document your program before you start
writing the code. Write a theory-of-operation document for your code,
describing what it will do and how it will do it. Outline the major
modules. Most importantly, revise this document while you write your
program. If you can't or won't do
that, at least consider writing documentation that includes a
complete manual page before you write any code.
Doing so can serve as a valuable exercise to focus your thoughts on
the code and its intended behavior.
Make the critical portion of your program as small and as simple as
possible.
Resist adding new features simply because you can. Add features and
options only when there is an identified need that cannot be met by
combining programs (one of the strengths of Unix). The less code you write, the less likely you are to
introduce bugs, and the more likely you are to understand how the
code actually works.
Resist rewriting standard functions. Although bugs have been found in
standard library functions and system calls, you are much more likely
to introduce newer and more dangerous bugs in your versions than in
the standard versions.
Be aware of race
conditions. These can manifest themselves as a deadlock, or as
failure of two calls to execute in close sequence.
- Deadlock conditions
-
Remember: more than one copy of your program may be running at the
same time. Consider using file locking for any files that you modify.
Provide a way to recover the locks in the event that the program
crashes while a lock is held. Avoid deadlocks or
"deadly embraces," which can occur
when one program attempts to lock file A and then file B, while
another program already holds a lock for file B and then attempts to
lock file A.
- Sequence conditions
-
Be aware that your program does not execute atomically. That is, the
program can be interrupted between any two operations to let another
program run for a while—including one that is trying to abuse
yours. Thus, check your code carefully for any pair of operations
that might fail if arbitrary code is executed between them.
In particular, when you are performing a series of operations on a
file, such as changing its owner, stating the
file, or changing its mode, first open the file and then use the
fchown( ),
fstat( ), or fchmod( )
system calls. Doing so will prevent the file from being replaced
while your program is running (a possible race condition). Also avoid
the use of the access(
) function to determine your ability to
access a file: using the access( ) function
followed by an open( ) is a race condition, and
almost always a bug.
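The open-first pattern described above can be sketched as follows. This is only an illustration: the helper name open_checked( ) and its policy (a regular file with an expected owner) are our own assumptions, not part of any standard library.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` read-only, then verify on the open
 * descriptor that it is a regular file owned by `owner`.
 * Returns the descriptor on success, -1 on any failure.
 */
int open_checked(const char *path, uid_t owner)
{
    struct stat st;
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return -1;

    /* fstat() examines the file that was actually opened, so the name
     * cannot be pointed at a different file between check and use. */
    if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode) || st.st_uid != owner) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Every later check or change (fchown( ), fchmod( ), and so on) should use the returned descriptor rather than the pathname.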
Write for clarity and correctness before optimizing the code. Trying
to write clever shortcuts may be a stimulating challenge, but it is a
place where errors often creep in. In practice, most optimizations
have little visible effect unless the code is executed in
time-critical places (e.g., interrupt handling) or is invoked tens of
thousands of times per day. Meanwhile, the penalties for writing
dense, difficult-to-understand code can include longer testing time,
increased maintenance effort, and more lurking bugs. Spending two
days of hacking to save 100 instruction cycles per day is also a very
poor return on investment.
You may not believe that system calls can
fail for a program that is running as root. For
instance, you might not believe that a chdir( ) call
could fail, as root has permission to change
into any directory. However, if the directory in question is mounted
via NFS, root usually has no special privileges.
The directory might not exist, again causing the chdir(
) call to fail. If the target program is started in the
wrong directory and you fail to check the return codes, the results
will not be what you expected when you wrote the code.
Also consider the open( ) call. It can
fail for root, too. For example, you
can't open a file on a CD-ROM for writing because
a CD-ROM is a read-only medium. Or consider someone creating several
thousand zero-length files to use up all the inodes on the disk. Even
root can't create a file if all
the free inodes are gone.
The fork( ) system call
may fail if the process table is full, exec( ) may
fail if the swap space is exhausted, and sbrk(
) (the call that allocates memory for
malloc( )) may fail if a process has already
allocated the maximum amount of memory allowed by process limits. An
attacker can easily arrange for these cases to occur. The difference
between a safe and an unsafe program may be how that program deals
with these situations.
If you don't like to type explicit checks for each
call, then consider writing a set of macros to
"wrap" the calls and do it for you.
You will need one macro for calls that return -1
on failure, and another for calls that return 0 on
failure.
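Here are some macros that you may find helpful:
#include <assert.h>
#define Call0(s) assert((s) != 0)   /* for calls that return 0 on failure */
#define Call1(s) assert((s) >= 0)   /* for calls that return -1 on failure */
Here is how to use them. Because open( ) returns -1 on failure, it
takes Call1:
Call1(fd = open("foo", O_RDWR, 0666));
Note, however, that these simply cause the program to terminate
without any cleanup, and that assert( ) expands to nothing when NDEBUG
is defined, which removes the wrapped call itself. You may prefer to
change the macros to call some common routine first to do cleanup and
logging.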
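One way to build such wrappers without assert( ) is sketched below. The names CALL0, CALL1, and call_failed( ) are hypothetical, and the cleanup step is left as a comment because it depends on your program.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical common failure routine: log, clean up, then exit. */
static void call_failed(const char *expr, const char *file, int line)
{
    fprintf(stderr, "%s:%d: %s failed: %s\n",
            file, line, expr, strerror(errno));
    /* Remove temporary files, release locks, etc. here. */
    exit(EXIT_FAILURE);
}

/* For calls that return 0 (or NULL) on failure, such as malloc(). */
#define CALL0(s) \
    do { if ((s) == 0) call_failed(#s, __FILE__, __LINE__); } while (0)

/* For calls that return -1 on failure, such as open(). */
#define CALL1(s) \
    do { if ((s) < 0) call_failed(#s, __FILE__, __LINE__); } while (0)
```

Unlike the assert( ) versions, these are not removed when the program is compiled with NDEBUG defined, so the wrapped call always runs. A call then looks like CALL1(fd = open("foo", O_RDWR)).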
16.2.2 Coding Standards
Check all of your input
arguments. An astonishing number of
security-related bugs arise because an attacker sends an unexpected
argument or an argument with an unanticipated format to a program or
a function within a program. A simple way to avoid these kinds of
problems is by having your program always check all of its
arguments. Argument checking will not noticeably slow down
most programs, but it will make them less susceptible to hostile
users. As an added benefit, argument checking and error reporting
will make the process of catching non-security-related bugs easier.
When you are checking arguments in your program, pay extra attention
to the following:
Check arguments passed to your program on the command line. Check to
make sure that each command-line argument is properly formed and
bounded.
Check arguments that you pass to
Unix system functions. Even though your program is calling the system
function, you should check the arguments to be sure that they are
what you expect them to be. For example, if you think that your
program is opening a file in the current directory, you might want to
use the strchr( ) function (index( ) on older systems) to see if the
filename contains a slash character (/). If the filename contains a
slash, and it shouldn't, the program should not open the
file.
Check arguments passed to your program via environment variables,
including general environment variables (e.g., HOME) and such
variables as the LESS argument.
Do bounds
checking on every variable. If you only define an option as valid
from 1 to 5, be sure that no one tries to set it to
0, 6, -1,
32767, or 32768. If string
arguments are supposed to be 16 bytes or less, check the length
before you copy them into a local buffer (and
don't forget the room required for the terminating
null byte). If you are supposed to have three arguments, be sure you
have three.
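The two checks just described, a numeric range and a string length, might look like this in C. The function names and the NAME_MAX_LEN constant are illustrative assumptions only.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define NAME_MAX_LEN 16   /* hypothetical limit, excluding the null byte */

/* Parse `arg` as a decimal option in the range 1 to 5. Returns 0 and
 * stores the value in *out on success; returns -1 for anything else. */
int parse_level(const char *arg, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(arg, &end, 10);
    if (errno != 0 || end == arg || *end != '\0')
        return -1;              /* overflow, empty string, or trailing junk */
    if (v < 1 || v > 5)
        return -1;              /* outside the documented range */
    *out = (int)v;
    return 0;
}

/* Copy `src` into a fixed buffer only if it fits with its terminator. */
int copy_name(char dst[NAME_MAX_LEN + 1], const char *src)
{
    size_t len = strlen(src);

    if (len > NAME_MAX_LEN)
        return -1;
    memcpy(dst, src, len + 1);  /* +1 copies the terminating null byte */
    return 0;
}
```

Using strtol( ) rather than atoi( ) lets the caller tell "0" apart from input that is not a number at all.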
Check all return codes from system
calls. Practically every single Unix operating system call has a
return code. Check them! Even system calls that you think cannot
fail, such as write( ), chdir(
), and chown( ), can fail under
exceptional circumstances and return appropriate return codes. When
the calls fail, check the errno variable to
determine why they failed. Have your program log
the unexpected value and then cleanly terminate if the system call
fails for any unexpected reason. This approach will be a great help
in tracking down problems later on. If you think that a system call should not fail and it does, do
something appropriate. If you can't think of
anything appropriate to do, then have your program delete all of its
temporary files and exit.
Have internal consistency-checking code. Use the
assert macro if you are programming in C. If you
have a variable that you know should be either a 1 or a 2, then your
program should not be running if the variable is anything else.
Include lots of
logging. You are
almost always better off having too much logging rather than too
little. Report your log information into a dedicated log file. Or
consider using the syslog facility so that logs
can be redirected to users or files, piped to programs, and/or sent
to other machines. And remember to do bounds checking on arguments
passed to syslog( ) to avoid buffer overflows. Here is specific information that you might wish to log:
The time that the program was run.
The UID and effective UID of the process.
The GID and effective GID of the process.
The terminal from which it was run.
The process number (PID). If you log with
syslog, including the
LOG_PID option in the openlog(
) call will do this automatically.
Command-line arguments.
Invalid arguments, or failures in consistency checking.
The host from which the request came (in the case of network servers).
The result of an ident lookup on that remote
host.
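As a rough sketch, a program might record several of these items at startup as shown here. The function names are hypothetical, and LOG_DAEMON is only an assumed facility; choose whichever suits your program.

```c
#include <stdio.h>
#include <sys/types.h>
#include <syslog.h>
#include <unistd.h>

/* Format the real and effective IDs into buf; returns snprintf()'s result. */
int format_identity(char *buf, size_t size)
{
    return snprintf(buf, size, "uid=%ld euid=%ld gid=%ld egid=%ld",
                    (long)getuid(), (long)geteuid(),
                    (long)getgid(), (long)getegid());
}

/* Hypothetical startup logger: identity first, then each argument. */
void log_startup(const char *ident, int argc, char *const argv[])
{
    char line[128];
    int i;

    openlog(ident, LOG_PID, LOG_DAEMON);  /* LOG_PID adds the PID to each entry */
    format_identity(line, sizeof line);
    syslog(LOG_INFO, "%s", line);

    /* Never pass untrusted text as the format string, and bound its
     * length with a precision so one argument cannot flood the log. */
    for (i = 0; i < argc; i++)
        syslog(LOG_INFO, "argv[%d]=%.200s", i, argv[i]);
}
```

syslog( ) adds the timestamp itself, so the time the program was run does not need to be formatted by hand.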
Always use full pathnames for any filename argument, for both
commands and data files.
Check anything supplied by the user for shell metacharacters if the
user-supplied input is passed on to another program, written into a
file, or used as a filename. In general, checking for good characters
is safer than checking for a set of "bad
characters" and is not that restrictive in most
situations.
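A check for good characters can be quite short. The following is a minimal sketch; the function name and the particular set of allowed characters are assumptions that you should adapt to what your program really needs.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if `s` is non-empty, at most `max` bytes long, and made up
 * only of characters from the allowed set; returns 0 otherwise. */
int is_safe_token(const char *s, size_t max)
{
    static const char ok[] =
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "0123456789_.-";
    size_t len = strlen(s);

    if (len == 0 || len > max)
        return 0;
    if (s[0] == '-' || s[0] == '.')   /* no option-like or dot-leading names */
        return 0;
    /* strspn() counts the leading bytes that are in the allowed set. */
    return strspn(s, ok) == len;
}
```

Because the slash, space, and every shell metacharacter are simply absent from the allowed set, there is no list of "bad" characters to keep up to date.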
If you are expecting to create a new file with the
open call, then use
the O_EXCL | O_CREAT flags to cause the routine to fail if the file
exists. If you expect the file to be there, be sure to omit the
O_CREAT flag so that the routine will fail if the file is not
there.
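In code, the two cases differ only in their flags. This sketch uses hypothetical function names, and the mode 0600 is an assumption.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a file that must not already exist. With O_CREAT | O_EXCL,
 * open() fails if the name is already present, even as a symbolic link. */
int create_new(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
}

/* Open a file that must already exist. Without O_CREAT, open() fails
 * if the name is missing instead of quietly creating it. */
int open_existing(const char *path)
{
    return open(path, O_RDWR);
}
```

Both return -1 on failure, so the caller still has to check the result and examine errno.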
If you think that a file should be a regular file, use lstat( )
to make sure that it is not a link.
However, remember that what you check may change before you can get
around to opening it if it is in a public directory.
If you need to create a temporary file, consider using the
tmpfile( ) or mkstemp( )
functions. tmpfile( ) creates a temporary file,
opens the file, deletes the file, and returns a file handle. The open
file can be passed to a subprocess created with fork(
) and exec( ), but the contents of
the file cannot be read by any other program on the system. The space
associated with the file will automatically be returned to the
operating system when your program exits. If possible, create the
temporary file in a closed directory, such as
/tmp/root/. mkstemp( ) does
not delete the file and provides its name as well as its file handle,
and thus is suitable for files that need more persistence.
Older versions of mkstemp( ) could create
world-writable files. Make sure yours doesn't. Never
use the mktemp( ) or tmpnam(
) library calls if they exist on your system—they
are not safe in programs running with extra privilege. The code as
provided on most older versions of Unix had a race condition between
a file test and a file open. This condition is a well-known problem
and is relatively easy to exploit.
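A cautious way to call mkstemp( ) is sketched below. The wrapper name is hypothetical; the temporary umask is there to guard against the older implementations mentioned above.

```c
#define _POSIX_C_SOURCE 200809L   /* request the mkstemp() declaration */

#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper. `tmpl` must be a writable array whose last six
 * characters are "XXXXXX"; mkstemp() overwrites them with the name used.
 * Returns an open descriptor, or -1 on failure.
 */
int make_private_temp(char *tmpl)
{
    mode_t old = umask(077);   /* no group or other access, whatever libc does */
    int fd = mkstemp(tmpl);

    umask(old);                /* restore the caller's umask */
    return fd;
}
```

A caller would declare something like char name[] = "/tmp/root/workXXXXXX" and pass it in; the directory shown is the closed directory used as an example in the text.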
Make good use of available tools.
If you are using C and have an ANSI C
compiler available, use it, and use prototypes for calls. If you
don't have an ANSI C compiler, then be sure to use
the -Wall option to your C compiler (if
supported) or the lint program to check for
common mistakes. Use bounds checkers, memory testers, and any other
commercial tools to which you have access.
16.2.3 Things to Avoid
Don't use routines
that fail to check buffer boundaries when manipulating strings of
arbitrary length. In the C programming language in particular, note the following:
Avoid            Use instead
gets( )          fgets( )
strcpy( )        strncpy( )
strcat( )        strncat( )
sprintf( )       snprintf( )
vsprintf( )      vsnprintf( )
Use the following library calls with great care—they can
overflow either a destination buffer or an internal, static buffer on
some systems if the input is
"cooked" to do so: fscanf( ), scanf( ),
sscanf( ), realpath( ),
getopt( ), getpass( ),
streadd( ), strecpy( ), and
strtrns( ). Check to make sure that you have the
version of the syslog( ) library that checks the
length of its arguments.
There may be other routines in libraries on your system of which you
should be somewhat cautious. Note carefully if a copy or
transformation is performed into a string argument without benefit of
a length parameter to delimit it. Also note if the documentation for
a function says that the routine returns a pointer to a result in
static storage (e.g., strtok( )). If an attacker
can provide the necessary input to overflow these buffers, you may
have a major problem.
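Even the bounded routines need care: strncpy( ) does not add a terminating null byte when the source fills the buffer, and snprintf( ) silently truncates. The following sketch, with a hypothetical function name, treats truncation as an error.

```c
#include <stdio.h>

/* Build "<dir>/<name>" in dst. Returns 0 on success, or -1 if the
 * result would not fit (dst is then left as an empty string). */
int join_path(char *dst, size_t size, const char *dir, const char *name)
{
    int n;

    if (size == 0)
        return -1;
    n = snprintf(dst, size, "%s/%s", dir, name);
    /* snprintf() returns the length it wanted to write; a value of
     * `size` or more means the output was cut short. */
    if (n < 0 || (size_t)n >= size) {
        dst[0] = '\0';
        return -1;
    }
    return 0;
}
```

Rejecting the input is usually safer than continuing with a truncated pathname, which may name a different file from the one intended.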
Don't design your program to depend on Unix
environment variables.
The simplest way to write a secure program is to make absolutely no
assumptions about your environment and to set everything
explicitly (e.g., signals, umask, current directory,
environment variables). A common way of attacking programs is to make
changes in the runtime environment that the programmer did not
anticipate. Thus, you should make certain that your program environment is in a
known state. Here are some of the things you may want to do:
If you absolutely must pass information to the program in its
environment, then have your program test for the necessary
environment variables and then erase the environment completely.
Otherwise, wipe the environment clean of all but the most essential
variables. On most systems, this is the TZ variable that specifies the local
time zone, and possibly some variables to indicate locale. Cleaning
the environment avoids any possible interactions between it and the
Unix system libraries.
You might also consider constructing a new
envp
and passing it to exec( ), rather than using
even a scrubbed original envp. Doing so is safer
because you explicitly create the environment rather than try to
clean it.
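Constructing a fresh environment takes only a few lines. In this sketch the function name and the variable values are hypothetical; in particular, the PATH entry and the UTC time zone are assumptions to be replaced with values appropriate to your system.

```c
#include <unistd.h>

/* Hypothetical minimal environment, built from scratch rather than
 * inherited. Nothing from the caller's environment survives. */
static char *const clean_env[] = {
    "PATH=/bin:/usr/bin",
    "TZ=UTC",
    NULL
};

/* Replace the current process with `path`, passing only clean_env.
 * Like execve() itself, this returns (with -1) only on failure. */
int exec_clean(const char *path, char *const argv[])
{
    return execve(path, argv, clean_env);
}
```

Because the new program sees only the entries listed in the array, variables you have never heard of cannot reach it.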
Make sure that the file descriptors that you expect to
be open are open, and that the file descriptors you expect to be
closed are closed. Consider what you'll do if
stdin, stdout, or
stderr is closed when your program starts (a
safe option is usually to connect them to
/dev/null). For example, components of Wietse
Venema's Postfix mailer often include this C
snippet:
for (fd = 0; fd < 3; fd++)
    if (fstat(fd, &st) == -1 && (close(fd), open("/dev/null", O_RDWR, 0)) != fd)
        msg_fatal("open /dev/null: %m");
Ensure that your signals are set to a sensible state.
Set your umask appropriately.
Explicitly chdir( ) to an appropriate directory
when the program starts.
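The last three steps might be combined into one startup routine, as in this sketch. The function name, the list of signals, and the umask of 077 are all assumptions; what counts as "sensible" depends on your program.

```c
#include <signal.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical startup routine: put signals, umask, and the current
 * directory into a known state. Returns 0 on success, -1 on failure. */
int reset_runtime_state(const char *workdir)
{
    static const int sigs[] = { SIGHUP, SIGINT, SIGPIPE, SIGTERM, SIGCHLD };
    size_t i;

    /* Undo any dispositions (such as "ignore") inherited from the parent. */
    for (i = 0; i < sizeof sigs / sizeof sigs[0]; i++)
        if (signal(sigs[i], SIG_DFL) == SIG_ERR)
            return -1;

    umask(077);                 /* new files are private by default */

    if (chdir(workdir) == -1)   /* this can fail, even for root */
        return -1;
    return 0;
}
```

Call it before the program opens any files, and treat a failure as fatal.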
Do not provide shell escapes in interactive programs (they are not
needed).
Never use system(
) or popen( ) calls.
Both invoke the shell, and can have unexpected results when they are
passed arguments with funny characters, or in cases where environment
variables have peculiar definitions.
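The usual replacement is to fork( ) and exec the program directly, so that no shell ever parses the arguments. This is a minimal sketch; the function name and the fixed PATH value are hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `path` with `argv` and no shell. Returns the child's exit status,
 * 127 if the exec failed, or -1 on any other error. */
int run_program(const char *path, char *const argv[])
{
    static char *const env[] = { "PATH=/bin:/usr/bin", NULL };
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;              /* fork() can fail; report it */
    if (pid == 0) {
        execve(path, argv, env);
        _exit(127);             /* reached only if execve() failed */
    }
    while (waitpid(pid, &status, 0) == -1)
        if (errno != EINTR)
            return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Each element of argv reaches the new program exactly as given, so an argument such as "; rm -rf /" is just an odd-looking string rather than a second command.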
Do not create files in world-writable directories.
Don't have your program dump core except during your
testing. Core
files can fill up a filesystem and contain confidential information.
In some cases, an attacker can actually use the fact that a program
dumps core to break into a system. Instead of dumping core, have your
program log the appropriate problem and exit. Use the
setrlimit( ) function
or equivalent to limit the size of the core file to 0. While
you're at it, consider setting limits on the number
of files and stack size to appropriate values if they might not be
appropriate at the start of the program.
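Turning off core dumps takes one call. The function name in this sketch is hypothetical.

```c
#include <sys/resource.h>

/* Set both the soft and hard core-file limits to zero, so that neither
 * this process nor its children can raise them again.
 * Returns 0 on success, -1 on failure. */
int disable_core_dumps(void)
{
    struct rlimit rl;

    rl.rlim_cur = 0;
    rl.rlim_max = 0;
    return setrlimit(RLIMIT_CORE, &rl);
}
```

The same pattern, with RLIMIT_NOFILE or RLIMIT_STACK and suitable values, covers the other limits mentioned above.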
16.2.4 Before You Finish
Read through your code. After you have written your program, think of
how you might attack it yourself. What happens if the program gets
unexpected input? What happens if you are able to delay the program
between two system calls?
Test
it carefully for assumptions about the operating environments. For
example:
If you assume that the program is always run by somebody who is not
root, what happens if the program is run by
root? (Many programs designed to be run as
daemon or bin can cause security
problems when run as root, for instance.)
If you assume that the program will be run by
root, what happens if it is not run as
root?
If you assume that the program always runs in the
/tmp or /tmp/root directory, what happens if it is run somewhere else? What
if /tmp/root is a symlink? What if it
doesn't exist?
Test your program thoroughly. If you have a system based on SVR4,
consider using (at the least) tcov, a
statement-coverage tester (and if your system uses GNU tools, try
gcov). Consider using commercial products, such
as Centerline's
CodeCenter and Rational's PurifyPlus (from personal
experience, we can tell you that these programs are very useful).
Remember that finding a bug in testing is better than letting some
anonymous attacker find it for you!
Have your code reviewed by another competent programmer (or two, or
more). After she has reviewed it, "walk
through" the code with her and explain what each
part does. We have found that such reviews are a surefire way to
discover logic errors. Trying to explain why something is done a
certain way often results in an exclamation of "Wait
a moment . . . why did I do
that?"
Simply making your code available for download is not the same as
having a focused review! The majority of code published on the Web
and via FTP is not carefully examined by competent reviewers with
training in security and code review. In most cases, the people who
download your code are more interested in using it, or porting it to
run on their toaster than they are in providing meaningful code
review. Keep this in mind about code you download,
too—especially if someone claims that the code must be correct
because it has had thousands of downloads.
If you need to use a shell as part of your program,
don't use the C
shell. Many versions have known flaws that can be exploited, and
nearly every version performs an implicit eval
$TERM on startup, enabling all sorts of attacks. We recommend the use of
ksh
(used for some of the shell scripts in this book). It is
well-designed, fast, powerful, and well-documented (see Appendix C). Alternatively, you could write your scripts
in Perl, which has good security
for many system-related tasks.
Remember: many security bugs are actually programming bugs, which is
good news for programmers. When you make your program more secure,
you simultaneously make it more reliable.