[Chapter 9] 9.2 Detecting Change

9.2 Detecting Change

As we saw in the last section, there may be circumstances in which we cannot use read-only media to protect files and directories. Or, we may have a case in which some of the important files are relatively volatile and need to change on a regular basis. In cases such as these, we want to be able to detect if unauthorized changes occur.

There are basically three approaches to detecting changes to files and inodes. The first and most certain way is to use comparison copies of the data to be monitored. A second way is to monitor metadata about the items to be protected. This includes monitoring the modification time of entries as kept by the operating system, and monitoring any logs or audit trails that show alterations to files. The third way is to use some form of signature of the data to be monitored, and to periodically recompute and compare the signature against a stored value. Each of these approaches has drawbacks and benefits, as we discuss below.

9.2.1 Comparison Copies

The most direct and assured method of detecting changes to data is to keep a copy of the unaltered data, and do a byte-by-byte comparison when needed. If there is a difference, this indicates not only that a change occurred, but what that change involved. There is no more certain and complete method of detecting changes.

Comparison copies, however, are unwieldy. They require that you keep copies of every file of interest. Not only does such a method require twice as much storage as the original files, it also may involve a violation of license or copyright of the files (copyright law allows one copy for archival purposes, and your distribution media is that one copy).[3] To use a comparison copy means that both the original and the copy must be read through byte by byte each time a check is made. And, of course, the comparison copy needs to be saved in a protected location.

[3] Copyright law does not allow for copies on backups.

Even with these drawbacks, comparison copies have a particular benefit - if you discover an unauthorized change, you can simply replace the altered version with the saved comparison copy, thus restoring the system to normal.

9.2.1.1 Local copies

One standard method of storing comparison copies is to put them on another disk. Many people report success at storing copies of critical system files on removable media, such as Syquest or Bernoulli drives. If there is any question about a particular file, the appropriate disk is placed in the drive, mounted, and compared. If you are careful about how you configure these disks, you get the added (and valuable) benefit of having a known good version of the system to boot up if the system is compromised by accident or attack.

A second standard method of storing comparison copies is to make on-disk copies somewhere else on the system. For instance, you might keep a copy of /bin/login in /usr/adm/.hidden/.bin/login . Furthermore, you might compress and/or encrypt the copy to help reduce disk use and keep it safe from tampering; if an attacker were to alter both the original /bin/login and the copy, then any comparison you made would show no change. The disadvantage to compression and encryption is that it then requires extra processing to recover the files if you want to compare them against the working copies. This extra effort may be significant if you wish to do comparisons daily (or more often!).

9.2.1.2 Remote copies

A third method of using comparison copies is to store them on a remote site and make them available remotely in some manner. For instance, you might place copies of all the system files on a disk partition on a secured server, and export that partition read-only using NFS or some similar protocol. All the client hosts could then mount that partition and use the copies in local comparisons. Of course, you need to ensure that whatever programs are used in the comparison (e.g., cmp , find , and diff ) are taken from the remote partition and not from the local disk. Otherwise, an attacker could modify those files to not report changes!

9.2.1.3 rdist

Another method of remote comparison would be to use a program to do the comparison across the network. The rdist utility is one such program that works well in this context. The drawback to using rdist , however, is the same as with using full comparison copies: you need to read both versions of each file, byte by byte. The problem is compounded, however, because you need to transfer one copy of each file across the network each time you perform a check. Another drawback is that rdist depends on the Berkeley trusted hosts mechanism to work correctly (unless you have Kerberos installed), and this may open the systems more than you want to allow.

NOTE: If you have an older version of rdist , be certain to check with your vendor for an update. Some versions of rdist can be exploited to gain root access to a system. See CERT advisories 91-20 and 94-04 for more information.

One scenario we have found to work well with rdist is to have a "master" configuration for each architecture you support at your site. This "master" machine should not generally support user accounts, and it should have extra security measures in place. On this machine, you put your master software copies, possibly installed on read-only disks.

Periodically, the master machine copies a clean copy of the rdist binary to the client machine to be checked. The master machine then initiates an rdist session involving the -b option (byte-by-byte compare) against the client. Differences are reported, or optionally, fixed. In this manner you can scan and correct dozens or hundreds of machines automatically. If you use the -R option, you can also check for new files or directories that are not supposed to be present on the client machine.

An rdist "master" machine has other advantages. It makes it much easier to install new and updated software on a large set of client machines. This feature is especially helpful when you are in a rush to install the latest security patch in software on every one of your machines. It also provides a way to ensure that the owners and modes of system files are set correctly on all the clients. The down side of this is that if you are not careful, and an attacker modifies your master machine, rdist will as efficiently install the same security hole on every one of your clients, automatically!

Note that the normal mode of operation of rdist , without the -b option, does not do a byte-by-byte compare. Instead, it only compares the metadata in the Inode concerning times and file sizes. As we discuss in the next section, this information can be spoofed.

9.2.2 Checklists and Metadata

Saving an extra copy of each critical file and performing a byte-by-byte comparison can be unduly expensive. It requires substantial disk space to store the copies. Furthermore, if the comparison is performed over the network, either via rdist or NFS , it will involve substantial disk and network overhead each time the comparisons are made.

A more efficient approach would be to store a summary of important characteristics of each file and directory. When the time comes to do a comparison, the characteristics are regenerated and compared with the saved information. If the characteristics are comprehensive and smaller than the file contents (on average), then this method is clearly a more efficient way of doing the comparison.

Furthermore, this approach can capture changes that a simple comparison copy cannot: comparison copies detect changes in the contents of files, but do little to detect changes in metadata such as file owners or protection modes. It is this data - the data normally kept in the Inodes of files and directories - that is sometimes more important than the data within the files themselves. For instance, changes in owner or protection bits may result in disaster if they occur to the wrong file or directory.

Thus, we would like to compare the values in the inodes of critical files and directories with a database of comparison values. The values we wish to compare and monitor for critical changes are owner, group, and protection modes. We also wish to monitor the mtime (modification time) and the file size to determine if the file contents change in an unauthorized or unexpected manner. We may also wish to monitor the link count, Inode number, and ctime as additional indicators of change. All of this material can be listed with the ls command.

9.2.2.1 Simple listing

The simplest form of a checklist mechanism is to run the ls command on a list of files and compare the output against a saved version. The most primitive approach might be a shell script such as this:

#!/bin/sh  

cat /usr/adm/filelist | xargs ls -ild > /tmp/now 
diff -b /usr/adm/savelist /tmp/now

The file /usr/adm/filelist would contain a list of files to be monitored. The /usr/adm/savelist file would contain a base listing of the same files, generated on a known secure version of the system. The -i option adds the inode number in the listing. The -d option includes directory properties, rather than contents, if the entry is a directory name.

This approach has some drawbacks. First of all, the output does not contain all of the information we might want to monitor. A more complete listing can be obtained by using the find command:

#!/bin/sh

find `cat /usr/adm/filelist` -ls > /tmp/now 
diff -b /usr/adm/savelist /tmp/now

This will not only give us the data to compare on the entries, but it will also disclose if files have been deleted or added to any of the monitored directories.

NOTE: Writing a script to do the above and running it periodically from a cron file can seem tempting. The difficulty with this approach is that an attacker may modify the cron entry or the script itself to not report any changes. Thus, be cautious if you take this approach and be sure to review and then execute the script manually on a regular basis.

9.2.2.2 Ancestor directories

We should mention here that you must check the ancestor directories of all critical files and directories - all the directories between the root directory and the files being monitored. These are often overlooked, but can present a significant problem if their owner or permissions are altered. An attacker could then be able to rename one of the directories and install a replacement or a symbolic link to a replacement that contains dangerous information. For instance, if the /etc directory becomes mode 777, then anyone could temporarily rename the password file, install a replacement containing a root entry with no password, run su , and reinstall the old password file. Any commands or scripts you have that monitor the password file would show no change unless they happen to run during the few seconds of the actual attack - something the attacker can usually avoid.

The following script takes a list of absolute file pathnames, determines the names of all containing directories, and then prints them. These files can then be added to your comparison list (checklist) such as the script shown earlier:

#!/bin/ksh

typeset pdir

function getdir      # Gets the real, physical pathname
{
   if [[ $1 != /* ]]
   then
      print -u2 "$1 is not an absolute pathname"
      return 1
   elif cd "${1%/*}"
   then
      pdir=$(pwd -P)
      cd ~-
   else
      print -u2 "Unable to attach to directory of $1"
      return 2
   fi
   return 0
}

cd /
print /     # ensure we always have the root directory included

while read name
do
   getdir $name || continue
   while [[ -n $pdir ]]
   do
      print $pdir
      pdir=${pdir%/*}
   done
done | sort -u

9.2.3 Checksums and Signatures

Unfortunately, the approach we described above for monitoring files can be defeated with a little effort. Files can be modified in such a way that the information we monitor will not disclose the change. For instance, a file might be modified by writing to the raw disk device after the appropriate block is known. As the modification did not go through the filesystem, none of the information in the inodes will be altered.

You could also surreptitiously alter a file by setting the system clock back to the time of the last legitimate change, making the edits, and then setting the clock forward again. If this is done quickly enough, no one will notice the change. Furthermore all the times on the file (including the ctime) will be set to the "correct" values. Several hacker toolkits in widespread use on the Internet actually take this approach. It is easier and safer than writing to the raw device. It is also more portable.

Thus, we need to have some stronger approach in place to check the contents of files against a known good value. Obviously, we could use comparison copies, but we have already noted that they are expensive. A second approach would be to create a signature of the file's contents to determine if a change occurred.

The first, naive approach with such a signature might involve the use of a standard CRC checksum, as implemented by the sum command. CRC polynomials are often used to detect changes in message transmissions, so they could logically be applied here. However, this application would be a mistake.

CRC checksums are designed to detect random bit changes, not purposeful alterations. As such, CRC checksums are good at finding a few bits changed at random. However, because they are generated with well-known polynomials, one can alter the input file so as to generate an arbitrary CRC polynomial after an edit. In fact, some of the same hacker toolkits that allow files to be changed without altering the time also contain code to set the file contents to generate the same sum outputs for the altered file as for the original.

To generate a checksum that cannot be easily spoofed, we need to use a stronger mechanism, such as the message digests described in "Message Digests and Digital Signatures" in Chapter 6, Cryptography . These are also dependent on the contents of the file, but they are too difficult to spoof after changes have been made.

If we had a program to generate the MD5 checksum of a file, we might alter our checklist script to be:

#!/bin/sh  

find `cat /usr/adm/filelist` -ls -type f -exec md5 {}\; > /tmp/now 
diff -b /usr/adm/savelist /tmp/now

9.2.4 Tripwire

Above, we described a method of generating a list of file attributes and message digests. The problem with this approach is that we don't really want that information for every file. For instance, we want to know if the owner or protection modes of /etc/passwd change, but we don't care about the size or checksum because we expect the contents to change. At the same time, we are very concerned if the contents of /bin/login are altered.

We would also like to be able to use different message digest algorithms. In some cases, we are concerned enough that we want to use three strong algorithms, even if they take a long time to make the comparison; after all, one of the algorithms might be broken soon.[4] In other environments, a fast but less secure algorithm, used in conjunction with other methods, might be all that is necessary.

[4] This is not so farfetched. As this book went to press, reports circulated of a weakness in the MD4 message digest algorithm.

In an attempt to meet these needs[5] the Tripwire package was written at Purdue by Gene Kim and Gene Spafford. Tripwire is a program that runs on every major version of UNIX (and several obscure versions). It reads a configuration file of files and directories to monitor, and then tracks changes to inode information and contents. The database is highly configurable, and allows the administrator to specify particular attributes to monitor, and particular message digest algorithms to use for each file.

[5] And more: see the papers that come with the distribution.

9.2.4.1 Building Tripwire

To build the Tripwire package, you must first download a copy from the canonical distribution site, located at ftp://coast.cs.purdue.edu/pub/COAST/Tripwire .[6] The distribution has been signed with a detached PGP digital signature to verify that the version you download has not been altered in an unauthorized manner (See the discussion of digital signatures in "PGP detached signatures" in Chapter 6 .)

[6] As this book goes to press, several companies are discussing a commercial version of Tripwire. If one is available, a note will be on the FTP site noting its source. Be aware that Purdue University holds the copyright on Tripwire and only allows its use for limited, noncommercial use without additional license.

Next, read all the README files in the distribution. Be certain you understand the topics discussed. Pay special attention to the details of customization for local considerations, including the adaption for the local operating system. These changes normally need to be made in the include/config.h file. You also need to set the CONFIG_PATH and DATABASE_PATH defines to the secured directory where you will store the data.

One additional change that might be made is to the default flags field, defined in the DEFAULTIGNORE field. This item specifies the fields that Tripwire, by default, will monitor or ignore for files without an explicit set of flags. As shipped, this is set to the value "R-3456789" - the flags for read-only files, and only message digests 1 and 2. These are the MD5 and Snefru signatures. You may want to change this to include other signatures, or replace one of these signatures with a different one; the MD5, SHA , and Haval algorithms appear to be the strongest algorithms to use.

Next, you will want to do a make followed by a make test . The test exercises all of the Tripwire functions, and ensures that it was built properly for your machine. This will also demonstrate how the output from Tripwire appears.

The next step is to build the configuration file. Tripwire scans files according to a configuration file. The file contains the names of files and directories to scan, and flags to specify what to note for changes. For example, you might have the following in your configuration file:

/.rhosts R # may not exist

/.profile R # may not exist

/.forward R-12+78 # may not exist

/usr/spool/at L

=/tmp L-n

In this example, the /.rhosts , /.profile , and /.forward file have everything in the inode checked for changes except the access time. Thus, the owner, group, size, protection modes, modification time, link count, and ctime are all monitored for change. Further, the first two files are also checksummed using the MD5 and Snefru signatures, and the /.forward file is checksummed with the SHA and HAVAL algorithms. The directory /usr/spool/at and everything inside it is checked for changes to owner, group, protection modes, and link count; changes to contents are allowed and ignored. The /tmp directory is checked for the same changes, but its contents are not checked.

Other flags and combinations are also possible. Likely skeleton configuration files are provided for several major platforms.

Finally, you will want to move the binary for Tripwire and the configuration file to the protected directory that is located on (normally) read-only storage. You will then need to initialize the database; be sure that you are doing this on a known clean system - reinstall the software, if necessary. After you have generated the database, set the protections on the database and binary.

When you build the database, you will see output similar to this:

### Phase 1:   Reading configuration file
### Phase 2:   Generating file list
### Phase 3:   Creating file information database
###
### Warning:   Database file placed in ./databases/tw.db_mordor.cs.purdue.edu.
###
###            Make sure to move this file file and the configuration
###            to secure media!
###
###            (
Tripwire
 expects to find it in `/floppy/system.db'.)

NOTE: When possible, build Tripwire statically to prevent its using shared libraries which might have Trojan horses in them. This type of attack is one of the few remaining vulnerabilities in the use of Tripwire .

9.2.4.2 Running Tripwire

You run Tripwire from the protected version on a periodic basis to check for changes. You should run it manually sometimes rather than only from cron . This step ensures that Tripwire is actually run and you will see the output. When you run it and everything checks out as it should, you will see output something like this:

### Phase 1:   Reading configuration file
### Phase 2:   Generating file list
### Phase 3:   Creating file information database
### Phase 4:   Searching for inconsistencies
###
###                     Total files scanned:            2640
###                           Files added:              0
###                           Files deleted:            0
###                           Files changed:            2586
###
###                     After applying rules:
###                           Changes discarded:        2586
###                           Changes remaining:        0
###

This output indicates no changes. If files or directories had changed, you would instead see output similar to:

### Phase 1:   Reading configuration file
### Phase 2:   Generating file list
### Phase 3:   Creating file information database
### Phase 4:   Searching for inconsistencies
###
###                     Total files scanned:            2641
###                           Files added:              1
###                           Files deleted:            0
###                           Files changed:            2588
###
###                     After applying rules:
###                           Changes discarded:        2587
###                           Changes remaining:        3
###
added:   -rw------- root           27 Nov  8 00:33:40 1995 /.login
changed: -rw-r--r-- root         1594 Nov  8 00:36:00 1995 /etc/mnttab
changed: drwxrwxrwt root         1024 Nov  8 00:42:37 1995 /tmp
### Phase 5:   Generating observed/expected pairs for changed files
###
### Attr        Observed (what it is)      Expected (what it should be)
### =========== ================================================
/etc/mnttab
      st_mtime: Wed Nov  8 00:36:00 1995      Tue Nov  7 18:44:47 1995      
      st_ctime: Wed Nov  8 00:36:00 1995      Tue Nov  7 18:44:47 1995      
    md5 (sig1): 2tIRAXU5G9WVjUKuRGTkdi        0TbwgJStEO1boHRbXkBwcD        
 snefru (sig2): 1AvEJqYMlsOMUAE4J6byKJ        3A2PMKEy3.z8KIbwgwBkRs        

/tmp
      st_nlink: 5                             4

This output shows that a file has been added to a monitored directory (Tripwire also detects when files are deleted), and that both a directory and a file had changes. In each case, the nature of the changes are reported to the user. This report can then be used to determine what to do next.

Tripwire has many options, and can be used for several things other than simple change detection. The papers and man pages provided in the distribution are quite detailed and should be consulted for further information .