16.1 One Bug Can Ruin Your Whole Day . . .
The Unix security model makes
a tremendous investment in the infallibility of the superuser and in
the reliability of software that runs with the privileges of the
superuser. If the superuser account is compromised, then the system
is left wide open—hence, our many admonitions in this book to
protect the superuser account and restrict the number of people who
must know the superuser password.
Unfortunately, even if you prevent
users from logging into the superuser account, many Unix programs
need to run with some sort of administrative privileges. Many of
these programs are set up to run with superuser
privileges—typically by having them run as SUID
root programs, by having the programs launched
when the computer starts up, or by having them started by other
programs running with superuser privileges (the common manner in
which network servers are started). A single bug in any of these
complicated programs can compromise the safety of your entire system.
Furthermore, the environment and trusted inputs to these programs
also need to be protected to prevent unexpected (and unwanted!)
behavior. This
characteristic is a security architecture design flaw, but it is
basic to the design of Unix and is not likely to change.
16.1.1 The Lesson of the Internet Worm
One of
the best examples of how a single line of code in a program can
result in the compromise of thousands of machines dates back to the
pre-dawn of the commercial Internet. The year was 1988, and a
graduate student at Cornell University had discovered several
significant security flaws in versions of Unix that were widely used
on the Internet. Using his knowledge, the student created a program
(known as a worm) that would find vulnerable
computers, exploit one of these flaws, transfer a copy of itself to
the compromised system, and then repeat the process. The program
infected between 2,000 and 6,000 computers within hours of being
released. While that does not seem like a lot of machines today, in
1988 it represented a substantial percentage of the academic and
commercial mail servers on the Internet. The Internet was effectively
shut down for two days following the worm's release.
Although the worm used several techniques for compromising systems,
the most effective attack in its arsenal was a
buffer overflow attack directed against
the Unix fingerd daemon.
The original fingerd program contained these
lines of code:
char line[512];
...
line[0] = '\0';
gets(line);
Because the gets( ) function does not check the length
of the line it reads, any client that supplied more than 512 bytes of
data would overrun the memory allocated to the
line[] array and, ultimately, corrupt the
program's stack frame. The worm contained code that
used this overflow to overwrite the return address on
fingerd's stack, causing the program to
execute a shell; because at the time it was standard
practice to run fingerd as the superuser, this
shell inherited superuser access to the server computer.
fingerd didn't need to run as
superuser, but it was spawned as a root process
during the system startup and never switched to a different user
ID. Because
fingerd's standard input and
standard output file descriptors were connected to the TCP socket,
the remote process that caused the overflow was given complete,
interactive control of the system.
The fix for the fingerd program was simple:
replace the gets( ) function with the
fgets( ) function. Whereas gets(
) takes one parameter, the buffer, the fgets(
) function takes three arguments: the buffer, the size of
the buffer, and the file handle from which to fetch the data:
fgets(line,sizeof(line),stdin);
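As a minimal sketch of the same idea (our own illustration, not the
actual fingerd source; read_request( ) is a
hypothetical helper), the bounded read might look like this:

#include <stdio.h>
#include <string.h>

/* Read one request line into a caller-supplied buffer of the given size.
   fgets() writes at most size-1 bytes plus a terminating NUL, so the
   buffer cannot be overrun; longer input is simply truncated. */
static void read_request(char *line, size_t size)
{
    line[0] = '\0';
    if (fgets(line, size, stdin) == NULL)
        return;                             /* EOF or read error */
    line[strcspn(line, "\r\n")] = '\0';     /* strip the trailing newline */
}

Note that truncation is still a policy decision: a robust server must
decide whether an over-long request is rejected as an error or silently
cut short.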
When the original fingerd program was written,
it was common practice among many programmers to use gets(
) instead of fgets( )—probably
because using gets( ) required typing fewer
characters each time. Nevertheless, because of the way that the C
programming language and the Standard IO library were designed, any
program that used gets( ) to fill a buffer on
the stack potentially had—and still has—this
vulnerability.
Although it seems like ancient history now, this story continues to
illustrate many important lessons:
The worm demonstrated that a single flaw in a single innocuous
Internet server could compromise the security of an entire
system—and, indeed, an entire network.
Many of the administrators whose systems were compromised by the worm
did not even know what the fingerd program did
and had not made a conscious decision to have the service running.
Likewise, many of the security flaws that have been discovered in the
years since have been with software that was installed by default and
not widely used.
Although the worm did not use its superuser access to intentionally
damage programs or data on computers that it penetrated, the program
did result in significant losses. Many of those losses were the
result of lost time, lost productivity, and the loss of confidence in
the compromised systems. There is no such thing as a
"harmless break-in."
The worm showed that flaws in deployed software might lurk for years
before being exploited by someone with the right tools and the wrong
motives. Indeed, the flaw in the finger code had
gone unnoticed for more than six years, from the time of the first
Berkeley Unix network software release until the day that the worm
ran loose. This illustrates a fundamental lesson: simply because no
hole has ever been discovered in a program does not mean that no hole exists.
The fact that a hole has not been exploited today does not guarantee
that the hole will not be exploited tomorrow.
Interestingly enough, the fallible human component of secure
programming is illustrated by the same example. Shortly after the
problem with the gets( ) subroutine was exposed,
the Berkeley programming group went through all of its code and
eliminated every similar use of the gets( ) call
in a network server. Most vendors did the same with their code.
Several people, including Spafford in his paper analyzing the
operations and effects of the worm, publicly warned that uses of
other library calls that wrote to buffers without bounds checks also
needed to be examined. These included calls to the
sprintf( ) routine,
and byte-copy routines such as strcpy( ).
However, those admonitions were not
heeded.
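Bounded counterparts exist for each of these calls. The fragment below
is our own illustrative sketch (format_greeting( ) and
copy_name( ) are hypothetical helpers, not code from
any vendor) of replacing the unbounded calls:

#include <stdio.h>
#include <string.h>

/* Unbounded: sprintf(buf, "Welcome, %s!", user) and strcpy(buf, user)
   write past the end of buf whenever user is long enough. */

void format_greeting(char *buf, size_t size, const char *user)
{
    snprintf(buf, size, "Welcome, %s!", user);  /* truncates, never overruns */
}

int copy_name(char *dst, size_t size, const char *src)
{
    if (strlen(src) >= size)
        return -1;              /* refuse input that does not fit */
    strcpy(dst, src);           /* copy is now provably within bounds */
    return 0;
}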
In late 1995, as we were finishing the second edition of this book, a
new security vulnerability in several versions of Unix was widely
publicized. It was based on buffer overruns in the
syslog library routine. An attacker could
carefully craft an argument to a network daemon such that, when an
attempt was made to log it using syslog, the
message overran the buffer and compromised the system in a manner
hauntingly similar to the fingerd problem. After
seven years, a close cousin to the fingerd bug
was discovered. What underlying library calls contribute to the
problem? The sprintf( ) library call does, and
so do byte-copy routines such as strcpy( ).
In the summer of 2002, as we were working on the third edition of
this book, not one but four separate overflow vulnerabilities were
found in the popular OpenSSL security library, all stemming from
effectively the same kind of flaw. In use on more than a million Internet
servers, this SSL library is the basis of the SSL offering used by
the Apache web server and all Unix
SSL-wrapped mail services.
While many Unix security bugs are the result of poor programming
tools and methods, even more regrettable is the failure to learn from
old mistakes, and the failure to redesign the underlying operating
system or programming languages so that this broad class of attacks
will no longer be effective.
16.1.2 An Empirical Study of the Reliability of Unix Utilities
In December 1990, the
Communications of the ACM published an article
by Miller, Fredrickson, and So entitled
"An Empirical Study of the Reliability of Unix
Utilities" (Volume 33, issue 12, pp. 32-44). The
paper started almost as a joke: a researcher was logged into a Unix
computer from home, and the programs he was running kept crashing
because of line noise from a poor modem connection. Eventually,
Barton Miller, a professor at the University of Wisconsin, decided to
subject the Unix utility programs from a variety of different vendors
to a selection of random inputs and monitor the results.
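The idea is simple enough to reproduce in a few lines of C. The
program below is our own illustrative sketch of the approach (not the
actual Fuzz tool): it writes a stream of random
bytes that can be piped into a utility under test.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Emit COUNT random bytes on standard output.  Typical use:
       ./randbytes 100000 | some_utility
   where randbytes is whatever you name this program. */
int main(int argc, char *argv[])
{
    long count = (argc > 1) ? atol(argv[1]) : 100000L;

    srand((unsigned) time(NULL));
    while (count-- > 0)
        putchar(rand() % 256);      /* one arbitrary byte, 0-255 */
    return 0;
}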
16.1.2.1 What he found
The results were discouraging. Between 25% and 33% of the Unix
utilities could be crashed or hung by supplying them with unexpected
inputs—sometimes input that was as simple as an end-of-file in
the middle of an input line. On at least one occasion, crashing a
program tickled an operating system bug and caused the entire
computer to crash. Many times, programs would freeze for no apparent
reason.
In 1995 a new team headed by Miller repeated the experiment, this
time running a program called Fuzz on nine
different Unix platforms. The team also tested Unix network servers,
and a variety of X Window System applications (both clients and
servers). Here are some of the highlights:
According
to the 1995 paper, vendors were still shipping a distressingly buggy
set of programs: "...the failure rate of utilities
on the commercial versions of Unix that we tested (from Sun, IBM,
SGI, DEC, and NeXT) ranged from 15-43%."
Unix vendors don't seem to be overly concerned about
bugs in their programs: "Many of the bugs discovered
(approximately 40%) and reported in 1990 are still present in their
exact form in 1995. The 1990 study was widely published in at least
two languages. The code was made freely available via anonymous FTP.
The exact random data streams used in our testing were made freely
available via FTP. The identification of failures that we found were
also made freely available via FTP; these include code fragments with
file and line numbers for the errant code. According to our records,
over 2000 copies of the...tools and bug identifications were fetched
from our FTP sites...It is difficult to understand why a vendor would
not partake of a free and easy source of reliability
improvements."
The two lowest failure rates in the study were the Free Software
Foundation's GNU utilities (failure rate of 7%) and the
utilities included with the freely distributed Linux version of the
Unix operating system (failure rate 9%). Interestingly enough, the Free Software
Foundation has strict coding rules that forbid the use of
fixed-length buffers. (Miller et al. failed to note that many of the
Linux utilities were repackaged GNU utilities.)
There were a few bright points in the 1995 paper. Most notable was
the fact that Miller's group was unable to crash any
Unix network server. The group was also unable to crash any X Window
System server.
On the other hand, the group discovered that many X clients will
readily crash when fed random streams of data. Others will lock
up—and in the process, freeze the X server until the programs
are terminated.
In 2000, Professor Miller and Justin Forrester ran
the Fuzz tests a third time, although this time
exclusively against Windows NT. Their testing revealed that they
could crash or hang 45% of all programs expecting user input. When
they tried sending random Win32 messages to applications (something
any user can accomplish), they disrupted 100% of all applications!
16.1.2.2 Where's the beef?
Many of the errors that Miller's group discovered
resulted from common programming mistakes with the C programming
language: programmers wrote clumsy or confusing code that did the
wrong things; programmers neglected to check for array boundary
conditions; and programmers assumed that their char variables were of
type unsigned, when in fact they were signed.
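The signed-versus-unsigned confusion in particular is easy to
reproduce. The fragment below is a contrived example of our own (not
code from the study): on platforms where plain char is signed, bytes
above 0x7F become negative and index memory before the start of the
array, which is exactly the kind of latent bug that random input exposes.

#include <stdio.h>

static int count[256];

/* Buggy: *s is promoted from (possibly signed) char to int, so a byte
   such as 0xFF can become -1, and count[-1] is written. */
void tally_buggy(const char *s)
{
    while (*s)
        count[(int) *s++]++;
}

/* Fixed: convert each byte to unsigned char before using it as an index. */
void tally_fixed(const char *s)
{
    while (*s)
        count[(unsigned char) *s++]++;
}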
While these errors can certainly cause programs to crash when they
are fed random streams of data, these errors are exactly the kinds of
problems that can be exploited by carefully crafted streams of data
to achieve malicious results. Think back to the Internet worm: if
tested by the Miller Fuzz program, the original
fingerd program would have crashed. But when
presented with the carefully crafted stream that was present in the
worm, the program gave its attacker a root
shell!
What is somewhat frightening about the study is that the tests
employed by Miller's group are among the least
comprehensive known to testers: random, black-box testing. Different
patterns of input could possibly cause more programs to fail. Inputs
made under different environmental circumstances could also lead to
abnormal behavior. Other testing methods could expose these problems
whereas random testing, by its very nature, would not.
Miller's group also found that use of several
commercially available tools enabled them to discover errors and
perform other tests, including discovery of buffer overruns and
related memory errors. These tools were readily available; however,
vendors were apparently not using them.
Why don't vendors care more about quality? Well,
according to many of them, they do care, but quality does not sell.
Writing good code and testing it carefully is not a quick or simple
task. It requires extra effort and extra time. The extra time spent
on ensuring quality will result in increased cost, and an increase in
time-to-market. To date, few customers (possibly including you,
gentle reader) have indicated a willingness to pay extra for
better-quality software. Vendors have thus put their efforts into
what customers are willing to buy, such as new features. Although we
believe that most vendors could do a better job in this respect (and
some could do a much better job), we must be
fair and point the finger at the user population, too.
In some sense, any program you write might fare as well as
vendor-supplied software. However, that isn't good
enough if the program is running in a sensitive role and could
potentially be abused. Therefore, you must practice good coding
habits, and pay special attention to common trouble spots.