20.3 Detecting Changes After the Fact
As we saw in the last section,
there may be circumstances in which we cannot use read-only media to
protect files and directories. Or, we may have a case in which some
of the important files are relatively volatile and need to change on
a regular basis. In cases such as these, we want to be able to detect
whether unauthorized changes occur.
There are basically three approaches to detecting changes to files
and inodes:
Use comparison copies of the data to be monitored. This is the most
reliable way.
Monitor metadata about the items to be protected. This includes
monitoring the modification time of entries as kept by the operating
system, and monitoring any logs or audit trails that show alterations
to files.
Use some form of signature of the data to be
monitored, and periodically recompute and compare the
signature against a stored
value.
Each of these approaches has drawbacks and benefits, as we discuss in
the following sections. But before we explain them in detail, we need
to explain a fundamental problem common to all of these schemes.
20.3.1 The Achilles Heel of Integrity Management Systems
The remainder of this chapter describes several different integrity
management systems. All of these systems perform more or less the
same function: they examine files on a computer's
disk drive to determine whether the files have been changed in any
significant way.
Although there are many reasons that you might want to examine the
integrity of your system's files, one of the most
common is to determine what has changed after a computer has been
attacked, broken into, and compromised.
If you suspect that a system has been compromised, there are many
ways that you can examine its files for evidence of this fact:
Physically remove the hard disk from the computer in question, attach
the disk to a second computer as an auxiliary disk, boot the second
computer, mount the disk read-only, and use the second
computer's operating system to examine the disk.
(For extra credit, you can use a tool like dd on
the second computer to make a block-for-block copy of the [unmounted]
disk in question on a spare drive, as sketched after this list. This will minimize the
chance that the drive might be inadvertently modified as part of the
analysis process.)
Leave the suspect disk in the suspect computer, but boot the suspect
computer with a clean operating system from a CD-ROM or a floppy
disk. Then, using only the tools on the CD-ROM or floppy, you could
proceed to mount the suspect disk read-only and analyze the possibly
compromised filesystem.
Log into the suspect computer and run whatever integrity-checking
tools happen to be installed.
Try to determine what hole the attacker used, close it, and continue
operations as normal.
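For reference, here is a minimal sketch of the disk-imaging and read-only
mounting steps from the first two techniques. The device names and mount
point are hypothetical and must be verified on your own hardware before
anything is run:
# Block-for-block image of the unmounted suspect disk onto a spare
# drive (hypothetical device names; double-check them before running).
dd if=/dev/sdb of=/dev/sdc bs=65536 conv=noerror,sync

# Mount the suspect disk (or the image) read-only for analysis.
mkdir -p /forensics
mount -o ro /dev/sdb1 /forensics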
Clearly, the most thorough way to examine the suspect system is the
first technique. In practice, the third and fourth techniques are the
most common. And to all of the people who have simply treated the
symptoms of a compromised system, rather than taken a more thorough
approach, we have one question:
- Which part of the word "compromised" do you not understand?
If an attacker truly compromises your computer system, all bets are
off. Nothing should be trusted. It is possible that the attacker has
done nothing to affect the integrity of critical system programs such
as login, ps,
ls, and netstat. On the
other hand, it is possible that the attacker has replaced all of
these programs with modified programs that contain Trojan horses and
back doors, and then modified your computer's kernel
so that integrity-checking tools cannot tell the
difference!
Sadly, it takes a lot of extra time to do things the right way.
It's much easier to log into a suspect computer and
run a copy of Tripwire or AIDE to check for
modifications—rather than going to the trouble of booting a
kernel from CD-ROM that is known to be good. That's
why many people—the authors included—will occasionally
run automated tools on possibly compromised machines before breaking
out the CD-ROMs and the screwdrivers. But beware: if it looks like
nothing is wrong, everything could be wrong.
20.3.2 Comparison Copies
The most direct and assured
method of detecting changes to data is to keep a copy of the
unaltered data and do a byte-by-byte comparison when needed. If there
is a difference, this indicates not only that a change occurred, but
also exactly what that change involved. There is no more reliable and
complete method of detecting changes.
Comparison copies,
however, are unwieldy. They require that you keep copies of every
file of interest. Not only does such a method require twice as much
storage as the original files, it also may involve a violation of the
licenses or copyrights of the files. (Copyright law allows one copy
for archival purposes, and your distribution media is that one
copy.)
To use a comparison copy means that both the original and the copy
must be read through, byte by byte, each time a check is made. And,
of course, the comparison copy needs to be saved in a protected
location.
Even with these drawbacks, comparison copies have a particular
benefit: if you discover an unauthorized change, you can simply
replace the altered version with the saved comparison copy, thus
restoring the system to normal. These copies can be made locally, at
remote sites, or over the network, as we describe in the following
sections.
20.3.2.1 Local copies
One standard method of storing comparison copies is to put them on
another disk. Many people report success with storing copies of
critical system files on removable media drives. If there is any
question about a particular file, the appropriate disk is placed in
the drive, mounted, and compared. If you are careful about how you
configure these disks, you get the added (and valuable) benefit of
having a known good version of the system to boot up if the system is
compromised by accident or attack. Making regular backups to
removable or write-once media such as tapes and CDs can provide
similar benefits.
A second standard method of storing comparison copies is to make
on-disk copies somewhere else on the system. For instance, you might
keep a copy of /bin/login in
/usr/adm/.hidden/.bin/login. Furthermore, you
can compress and/or encrypt the copy to help reduce disk use and to keep
it safe from tampering; otherwise, an attacker could alter both the
original /bin/login and the copy to match, and any
comparison you made would show no change. The disadvantage to
compression and encryption is that it then requires extra processing
to recover the files if you want to compare them against the working
copies. This extra effort may be significant if you wish to do
comparisons daily (or more often!). If you make these copies in
single-user mode and mark them as immutable (as described earlier),
you prevent them from being altered or removed by an attacker.
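As a rough sketch of the compress-and-encrypt idea, here is one way to do
it with gzip and OpenSSL; the hidden pathname follows the example above,
and any equivalent compression and encryption tools would serve as well:
# Make a compressed, encrypted comparison copy of /bin/login.
# openssl will prompt for a passphrase; do not store it on this host.
gzip -9c /bin/login | openssl enc -aes-256-cbc -salt \
    -out /usr/adm/.hidden/.bin/login.gz.enc

# Later, recover the copy and compare it against the working file.
openssl enc -d -aes-256-cbc -in /usr/adm/.hidden/.bin/login.gz.enc |
    gzip -dc | cmp - /bin/login && echo "/bin/login unchanged"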
20.3.2.2 Remote copies
A third method of using comparison copies is to store them on a
remote site and make them available remotely in some manner. For
instance, you might place copies of all the system files on a disk
partition on a secured server, and export that partition read-only
using NFS or some similar protocol. All the client hosts could then
mount that partition and use the copies in local comparisons. Of
course, you need to ensure that the programs used in the
comparison (e.g., cmp,
find, and diff) are taken
from the remote partition and not from the local disk. Otherwise, an
attacker could modify those files to not report changes!
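Here is a minimal sketch of such an arrangement. The host and directory
names are hypothetical, and the export syntax shown is the Linux
/etc/exports form; other systems use different export files and options:
# On the secured server, export the baseline partition read-only
# (an /etc/exports entry on Linux):
/export/baseline    client1.example.com(ro) client2.example.com(ro)

# On each client, mount the baseline and run the comparison tools
# from the mounted partition rather than from the local disk:
mount -t nfs -o ro server.example.com:/export/baseline /baseline
/baseline/usr/bin/cmp /baseline/bin/login /bin/login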
Remember that it is not enough to keep copies of executable programs.
Shared libraries and configuration files must usually be compared as
well.
20.3.2.3 rdist
Another method of remote comparison involves using a program to do
the comparison across the network. The
rdist utility is one such program that works
well in this context. The drawback to using
rdist, however, is the same as with using full
comparison copies: you need to read both versions of each file, byte
by byte. The problem is compounded, however, because you need to
transfer one copy of each file across the network each time you
perform a check. (If you use rdist, always use
it with the -P ssh option rather than relying
on the Berkeley "r" commands.)
One scenario that works well with rdist is to
have a "master" configuration for
each architecture you support at your site. This master machine
should not generally support user accounts, and it should have extra
security measures in place. On this machine, you put your master
software copies, possibly installed on read-only disks.
Periodically, the master machine pushes a clean copy of the
rdist binary to the client machine to be
checked. The master machine then initiates an
rdist session using the
-b option (byte-by-byte compare) against the
client. Differences are reported or, optionally, fixed. In this
manner, you can scan and correct dozens or hundreds of machines
automatically. If you use the -R option, you can
also check for new files or directories that are not supposed to be
present on the client machine.
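A minimal sketch of this arrangement follows. The host and file names are
hypothetical, and the exact option spellings vary somewhat between rdist
versions, so check your own documentation:
# Distfile on the master machine
HOSTS = ( client1.example.com client2.example.com )
FILES = ( /bin /sbin /usr/lib /etc )

${FILES} -> ${HOSTS}
        install ;
        notify root@master.example.com ;

# Compare byte-by-byte over ssh and fix any differences; -R also
# removes files on the clients that are not present on the master.
rdist -b -R -P /usr/bin/ssh -f Distfile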
The normal mode of operation of rdist, without
the -b option, does not do a byte-by-byte
compare. Instead, it compares only the metadata in the inode
concerning times and file sizes. As we discuss in the next section,
this information can be spoofed.
An rdist master machine has other advantages. It
makes it much easier to install new and updated software on a large
set of client machines. This feature is especially helpful when you
are in a rush to install the latest security patch in software on
every one of your machines. It also provides a way to ensure that the
owners and modes of system files are set correctly on all the
clients. The downside of this is that if you are not careful, and an
attacker modifies your master machine, rdist
will just as efficiently install the same security hole on every one
of your clients automatically!
20.3.3 Checklists and Metadata
Saving an extra copy of each
critical file and performing a byte-by-byte comparison can be unduly
expensive. It requires substantial disk space to store the copies.
Furthermore, if the comparison is performed over the network, either
via rdist or NFS, it will involve substantial
disk and network overhead each time the comparisons are made.
A more efficient approach would be to store a summary of important
characteristics of each file and directory. When the time comes to do
a comparison, the characteristics are regenerated and compared with
the saved information. If the characteristics are comprehensive and
smaller than the file contents (on average), then this method is
clearly a more efficient way of doing the comparison.
Furthermore, this approach can capture changes that a simple
comparison copy cannot: comparison copies detect changes in the
contents of files, but do little to detect changes in metadata such
as file owners or protection modes. It is this data—the data
normally kept in the inodes of files and directories—that is
sometimes more important than the data within the files themselves.
For instance, changes in owner or protection bits may result in
disaster if they occur to the wrong file or directory.
Thus, we would like to compare the values in the
inodes of critical files and directories
with a database of comparison values. The values we wish to compare
and monitor for critical changes are owner, group, and protection
modes. We also wish to monitor the
mtime (modification time) and the file
size to determine if the file contents change in an unauthorized or
unexpected manner. We may also wish to monitor the link count, inode
number, and ctime as additional indicators of change. All of this
material can be listed with the ls command.
20.3.3.1 Simple listing
The simplest form of a checklist mechanism is to run the
ls command on a list of files and compare the output
against a saved version. The most primitive approach might be a shell
script such as this:
#!/bin/sh
cat /usr/adm/filelist | xargs ls -ild > /tmp/now
diff -b /usr/adm/savelist /tmp/now
The file /usr/adm/filelist would contain a list
of files to be monitored. The /usr/adm/savelist
file would contain a base listing of the same files, generated on a
known secure version of the system. The
-i option adds the inode number to the
listing. The -d option includes directory
properties, rather than contents, if the entry is a directory name.
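A line of the resulting listing might look like the following (the values
shown are made up); the first field is the inode number, followed by the
permissions, link count, owner, group, size, and modification time:
1203945 -rwsr-xr-x   1 root  wheel   27544 Nov  3 14:02 /bin/login
Note that ls reports the mtime by default; adding the -c option to a long
listing displays the ctime instead.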
This approach has some drawbacks. First of all, the output does not
contain all of the information we might want to monitor. A more
complete listing can be obtained by using the
find
command:
#!/bin/sh
find `cat /usr/adm/filelist` -ls > /tmp/now
diff -b /usr/adm/savelist /tmp/now
This will not only give us the data to compare on the entries, but it
will also disclose if files have been deleted or added to any of the
monitored directories.
Writing a script to perform this operation and running it
periodically from a cron file may seem tempting. The difficulty
with this approach is that an attacker may modify the
cron entry or the script itself to not report
any changes. Thus, be cautious if you take this approach and be sure
to review and then execute the script manually on a regular basis.
20.3.3.2 Ancestor directories
You must be sure to check the ancestor directories of
all critical files and directories—i.e., all the directories
between the root directory and the files being
monitored. These are often overlooked, but can present a significant
problem if their owners or permissions are altered. An attacker might
then be able to rename one of the directories and install a
replacement or a symbolic link to a replacement that contains
dangerous information. For instance, if the /etc
directory is set to mode 777, then anyone could temporarily rename
the password file, install a replacement containing a
root entry with no password, run
su, and reinstall the old password file. Any
commands or scripts you have that monitor the password file would
show no change unless they happen to run during the few seconds of
the actual attack—something the attacker can usually avoid.
The following script takes a list of absolute file pathnames,
determines the names of all of the directories that contain them (all
the way up to the root), and then prints them:
#!/bin/ksh
typeset pdir

function getdir         # Gets the real, physical pathname
{
    if [[ $1 != /* ]]
    then
        print -u2 "$1 is not an absolute pathname"
        return 1
    elif cd "${1%/*}"
    then
        pdir=$(pwd -P)
        cd ~-
    else
        print -u2 "Unable to attach to directory of $1"
        return 2
    fi
    return 0
}

cd /
print /                 # Ensure we always have the root directory included

while read name
do
    getdir $name || continue
    while [[ -n $pdir ]]
    do
        print $pdir
        pdir=${pdir%/*}
    done
done | sort -u
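One way to put the script to use (the pathnames here are only
illustrative) is to merge the ancestor directories it finds into the
monitored file list, so that the find-based checklist covers them as well:
#!/bin/sh
# Assumes the ksh script above is saved as /usr/adm/ancestors.
# Collect the ancestor directories of everything in the file list
# and merge them into the list, removing duplicates.
/usr/adm/ancestors < /usr/adm/filelist > /tmp/dirs
sort -u /usr/adm/filelist /tmp/dirs > /usr/adm/filelist.new
mv /usr/adm/filelist.new /usr/adm/filelist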
20.3.4 Checksums and Signatures
Unfortunately, the
approach we described for monitoring files can be defeated with a
little effort. Files can be modified in such a way that the
information we monitor will not disclose the change. For instance, a
file might be modified by writing to the raw disk device after the
appropriate block is known. As the modification did not go through
the filesystem, none of the information in the inodes will be
altered.
An attacker could also surreptitiously alter a file by setting the
system clock back to the time of the last legitimate change, making
the edits, and then setting the clock forward again. If this is done
quickly enough, no one will notice the change. Furthermore, all the
times on the file (including the ctime) will be set to the
"correct" values. Several so-called
"rootkits" in widespread use on the
Internet actually take this approach. It is easier and safer than
writing to the raw device. It is also more portable.
Thus, we need to have a stronger approach in place to check the
contents of files against a known good value. Obviously, we could use
comparison copies, but we have already noted that they are expensive.
A second approach would be to create a signature of the
file's contents to determine if a change occurred.
The first, naive approach using such a signature might involve the
use of a standard CRC checksum, as implemented by the
sum command. CRC polynomials are often used
to detect changes in message transmissions, so they could logically
be applied here. However, this application would be a mistake.
CRC checksums are designed to detect random bit changes, not
purposeful alterations. As such, CRC checksums are good at finding a
few bits changed at random. However, because they are generated with
well-known polynomials, an attacker can pad or adjust an edited file so
that it produces any desired CRC value. In fact, some of
the same attacker toolkits that allow files to be changed without
altering the time also contain code to adjust the file contents so that
the altered file generates the same sum output as
the original. These tools have been generally available
since at least 1992.
To generate a checksum that cannot be easily spoofed, we need to use
a stronger mechanism, such as the message digests described in Section 7.4. These are also dependent on the contents of
the file, but they are far more difficult to spoof after changes have
been made.
If we had a program to generate the MD5 checksum of a file, we might
alter our checklist script to be:
#!/bin/sh
find `cat /usr/adm/filelist` -ls -type f -exec md5 {} \; > /tmp/now
diff -b /usr/adm/savelist /tmp/now
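The corresponding baseline would be generated with the same command on a
known-good system and stored in a protected location; the pathnames follow
the earlier examples, and on systems without an md5 command you would
substitute md5sum:
#!/bin/sh
# Run on a known secure system (ideally in single-user mode) and keep
# the result on read-only or offline media.
find `cat /usr/adm/filelist` -ls -type f -exec md5 {} \; > /usr/adm/savelist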
Both the mtree command and the Tripwire system
(discussed later in this chapter) employ cryptographic checksums for
this purpose.