Handling Timing Glitches (Programming Perl)

23.2. Handling Timing Glitches

Sometimes your program's behavior is exquisitely sensitive to the timing of external events beyond your control. This is always a concern when other programs, particularly inimical ones, might be vying with your program for the same resources (such as files or devices). In a multitasking environment, you cannot predict the order in which processes waiting to run will be granted access to the processor. Instruction streams among all eligible processes are interleaved, so first one process gets some CPU, and then another process, and so on. Whose turn it is to run, and how long they're allowed to run, appears to be random. With just one program that's not a problem, but with several programs sharing common resources, it can be.

Thread programmers are especially sensitive to these issues. They quickly learn not to say:

$var++ if $var == 0;

when they should say:

{
    lock($var);
    $var++ if $var == 0;
}

The former produces unpredictable results when multiple execution threads attempt to run this code at the same time. (See Chapter 17, "Threads".) If you think of files as shared objects, and processes as threads contending for access to those shared objects, you can see how the same issues arise. A process, after all, is really just a thread with an attitude. Or vice versa.

Timing unpredictabilities affect both privileged and nonprivileged situations. We'll first describe how to cope with a long-standing bug in old Unix kernels that affects any set-id program. Then we'll move on to discuss race conditions in general, how they can turn into security holes, and steps you can take to avoid falling into these holes.

23.2.1. Unix Kernel Security Bugs

Beyond the obvious problems that stem from giving special privileges to interpreters as flexible and inscrutable as shells, older versions of Unix have a kernel bug that makes any set-id script insecure before it ever gets to the interpreter. The problem is not the script itself, but a race condition in what the kernel does when it finds a set-id executable script. (The bug doesn't exist on machines that don't recognize #! in the kernel.) When a kernel opens such a file to see which interpreter to run, there's a delay before the (now set-id) interpreter starts up and reopens the file. That delay gives malicious entities a chance to change the file, especially if your system supports symbolic links.

Fortunately, sometimes this kernel "feature" can be disabled. Unfortunately, there are a couple of different ways to disable it. The system can outlaw scripts with the set-id bits set, which doesn't help much. Alternatively, it can ignore the set-id bits on scripts. In the latter case, Perl can emulate the setuid and setgid mechanism when it notices the (otherwise useless) set-id bits on Perl scripts. It does this via a special executable called suidperl, which is automatically invoked for you if it's needed.[7]

[7] Needed and permitted--if Perl detects that the filesystem on which the script resides was mounted with the nosuid option, that option will still be honored. You can't use Perl to sneak around your sysadmin's security policy this way.

However, if the kernel set-id script feature isn't disabled, Perl will complain loudly that your setuid script is insecure. You'll either need to disable the kernel set-id script "feature", or else put a C wrapper around the script. A C wrapper is just a compiled program that does nothing except call your Perl program. Compiled programs are not subject to the kernel bug that plagues set-id scripts.

Here's a simple wrapper, written in C:

#include <unistd.h>
#define REAL_FILE "/path/to/script"
int main(int ac, char **av)
{
    execv(REAL_FILE, av);
    return 1;   /* reached only if execv fails */
}

Compile this wrapper into an executable image and then make it rather than your script set-id. Be sure to use an absolute filename, since C isn't smart enough to do taint checking on your PATH.

(Another possible approach is to use the experimental C code generator for the Perl compiler. A compiled image of your script will not have the race condition. See Chapter 18, "Compiling".)

Vendors in recent years have finally started to provide systems free of the set-id bug. On such systems, when the kernel gives the name of the set-id script to the interpreter, it no longer uses a filename subject to meddling, but instead passes a special file representing the file descriptor, like /dev/fd/3. This special file is already opened on the script so that there can be no race condition for evil scripts to exploit.[8] Most modern versions of Unix use this approach to avoid the race condition inherent in opening the same filename twice.

[8] On these systems, Perl should be compiled with -DSETUID_SCRIPTS_ARE_SECURE_NOW. The Configure program that builds Perl tries to figure this out for itself, so you should never have to specify this explicitly.

23.2.2. Handling Race Conditions

Which runs us right into the topic of race conditions. What are they really? Race conditions turn up frequently in security discussions. (Although less often than they turn up in insecure programs. Unfortunately.) That's because they're a fertile source of subtle programming errors, and such errors can often be turned into security exploits (the polite term for screwing up someone's security). A race condition exists when the result of several interrelated events depends on the ordering of those events, but that order cannot be guaranteed due to nondeterministic timing effects. Each event races to be the first one done, and the final state of the system is anybody's guess.

Imagine you have one process overwriting an existing file, and another process reading that same file. You can't predict whether you read in old data, new data, or a haphazard mixture of the two. You can't even know whether you've read all the data. The reader could have won the race to the end of the file and quit. Meanwhile, if the writer kept going after the reader hit end-of-file, the file would grow past where the reader stopped reading, and the reader would never know it.

Here the solution is simple: just have both parties flock the file. The reader typically requests a shared lock, and the writer typically requests an exclusive one. So long as all parties request and respect these advisory locks, reads and writes cannot be interleaved, and there's no chance of mutilated data. See the section "File Locking" in Chapter 16, "Interprocess Communication".
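
As a rough sketch of that arrangement, the two subroutines below take an exclusive lock for writing and a shared lock for reading. The names write_locked and read_locked are ours, purely for illustration, and the code assumes a filesystem on which flock actually works.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Writer: hold an exclusive lock for the whole update.
sub write_locked {
    my ($path, $data) = @_;
    # Open for append so nothing is clobbered before we hold the lock.
    open(my $fh, ">>", $path) or die "can't open $path: $!";
    flock($fh, LOCK_EX)       or die "can't lock $path: $!";
    truncate($fh, 0)          or die "can't truncate $path: $!";
    print {$fh} $data         or die "can't write $path: $!";
    close($fh)                or die "can't close $path: $!";  # flushes, then unlocks
}

# Reader: a shared lock admits other readers but keeps writers out.
sub read_locked {
    my ($path) = @_;
    open(my $fh, "<", $path) or die "can't open $path: $!";
    flock($fh, LOCK_SH)      or die "can't lock $path: $!";
    local $/;                # slurp mode
    my $data = <$fh>;
    close($fh);
    return $data;
}
```

Note that the writer truncates only after it holds the lock; opening with ">" would empty the file before any lock could be requested.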

You risk a far less obvious form of race condition every time you let operations on a filename govern subsequent operations on that file. When used on filenames rather than filehandles, the file test operators represent something of a garden path leading straight into a race condition. Consider this code:

if (-e $file) {
    open(FH, "< $file")
        or die "can't open $file for reading: $!";
}
else {
    open(FH, "> $file")
        or die "can't open $file for writing: $!";
}

The code looks just about as straightforward as it gets, but it's still subject to races. There's no guarantee that the answer returned by the -e test will still be valid by the time either open is called. In the if block, another process could have removed the file before it could be opened, and you wouldn't find the file you thought was going to be there. In the else block, another process could have created the file before the second open could get its turn to create the file, so the file that you thought would not be there, would be. The simple open function creates new files but overwrites existing ones. You may think you want to overwrite any existing file, but consider that the existing file might be a newly created alias or symbolic link to a file elsewhere on the system that you very much don't want to overwrite. You may think you know what a filename means at any particular instant, but you can never really be sure, as long as any other processes with access to the file's directory are running on the same system.

To fix this problem of overwriting, you'll need to use sysopen, which provides individual controls over whether to create a new file or to clobber an existing one. And we'll ditch that -e file existence test since it serves no useful purpose here and only increases our exposure to race conditions.

use Fcntl qw/O_WRONLY O_CREAT O_EXCL/;
open(FH, "<", $file)
    or sysopen(FH, $file, O_WRONLY | O_CREAT | O_EXCL)
    or die "can't create new file $file: $!";

Now even if the file somehow springs into existence between when open fails and when sysopen tries to open a new file for writing, no harm is done, because with the flags provided, sysopen will refuse to open a file that already exists.

If someone is trying to trick your program into misbehaving, there's a good chance they'll go about it by having files appear and disappear when you're not expecting. One way to reduce the risk of deception is by promising yourself you'll never operate on a filename more than once. As soon as you have the file opened, forget about the filename (except maybe for error messages), and operate only on the handle representing the file. This is much safer because, even though someone could play with your filenames, they can't play with your filehandles. (Or if they can, it's because you let them--see "Passing Filehandles" in Chapter 16, "Interprocess Communication".)

Earlier in this chapter, we showed a handle_looks_safe function which called Perl's stat function on a filehandle (not a filename) to check its ownership and permissions. Using the filehandle is critical to correctness--if we had used the name of the file, there would have been no guarantee that the file whose attributes we were inspecting was the same one we just opened (or were about to open). Some pesky evil doer could have deleted our file and quickly replaced it with a file of nefarious design, sometime between the stat and the open. It wouldn't matter which was called first; there'd still be the opportunity for foul play between the two. You may think that the risk is very small because the window is very short, but there are many cracking scripts out in the world that will be perfectly happy to run your program thousands of times to catch it the one time it wasn't careful enough. A smart cracking script can even lower the priority of your program so it gets interrupted more often than usual, just to speed things up a little. People work hard on these things--that's why they're called exploits.

By calling stat on a filehandle that's already open, we only access the filename once and so avoid the race condition. A good strategy for avoiding races between two events is to somehow combine both into one, making the operation atomic.[9] Since we access the file by name only once, there can't be any race condition between multiple accesses, so it doesn't matter whether the name changes. Even if our cracker deletes the file we opened (yes, that can happen) and puts a different one there to trick us with, we still have a handle to the real, original file.
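
Here is a minimal sketch of the open-once-then-inspect pattern. It is not the handle_looks_safe function from earlier in the chapter; open_if_private is a hypothetical helper, and the particular checks (owner is root or the real uid, no group or other write bits) are just one plausible policy.

```perl
use strict;
use warnings;
use File::stat;   # object interface to stat

# Hypothetical helper: open $path for reading, then vet the *handle*.
# Returns the handle, or nothing if the open fails or the file looks unsafe.
sub open_if_private {
    my ($path) = @_;
    open(my $fh, "<", $path) or return;   # the only use of the name
    my $info = stat($fh)     or return;   # inspects the open descriptor
    return if $info->uid != 0 && $info->uid != $<;   # owner must be root or us
    return if $info->mode & 022;                     # no group/other write bits
    return $fh;
}
```

Because the name is consulted exactly once, whatever the checks report is true of the file the caller will actually read from.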

[9] Yes, you may still perform atomic operations in a nuclear-free zone. When Democritus gave the word "atom" to the indivisible bits of matter, he meant literally something that could not be cut: a- (not) + tomos (cuttable). An atomic operation is an action that can't be interrupted. (Just you try interrupting an atomic bomb sometime.)

23.2.3. Temporary Files

Apart from allowing buffer overruns (which Perl scripts are virtually immune to) and trusting untrustworthy input data (which taint mode guards against), creating temporary files improperly is one of the most frequently exploited security holes. Fortunately, temp file attacks usually require crackers to have a valid user account on the system they're trying to crack, which drastically reduces the number of potential bad guys.

Careless or casual programs use temporary files in all kinds of unsafe ways, like placing them in world-writable directories, using predictable filenames, and not making sure the file doesn't already exist. Whenever you find a program with code like this:

open(TMP, ">/tmp/foo.$$")
    or die "can't open /tmp/foo.$$: $!";

you've just found all three of those errors at once. That program is an accident waiting to happen.

The way the exploit plays out is that the cracker first plants a file with the same name as the one you'll use. Appending the PID isn't enough for uniqueness; surprising though it may sound, guessing PIDs really isn't difficult.[10] Now along comes the program with the careless open call, and instead of creating a new temporary file for its own purposes, it overwrites the cracker's file instead.

[10] Unless you're on a system like OpenBSD, which randomizes new PID assignments.

So what harm can that do? A lot. The cracker's file isn't really a plain file, you see. It's a symbolic link (or sometimes a hard link), probably pointing to some critical file that crackers couldn't normally write to on their own, such as /etc/passwd. The program thought it opened a brand new file in /tmp, but it clobbered an existing file somewhere else instead.

Perl provides two functions that address this issue, if properly used. The first is POSIX::tmpnam, which just returns a filename that you're expected to open for yourself:

# Keep trying names until we get one that's brand new.
use POSIX;
do {
    $name = tmpnam();
} until sysopen(TMP, $name, O_RDWR | O_CREAT | O_EXCL, 0600);
# Now do I/O using TMP handle.

The second is IO::File::new_tmpfile, which gives you back an already opened handle:

# Or else let the module do that for us.
use IO::File;
my $fh = IO::File::new_tmpfile();  # this is POSIX's tmpfile(3)
# Now do I/O using $fh handle.

Neither approach is perfect, but of the two, the first is the better approach. The major problem with the second one is that Perl is subject to the foibles of whatever implementation of tmpfile(3) happens to be in your system's C library, and you have no guarantee that this function doesn't do something just as dangerous as the open we're trying to fix. (And some, sadly enough, do.) A minor problem is that it doesn't give you the name of the file at all. Although it's better if you can handle a temp file without a name--because that way you'll never provoke a race condition by trying to open it again--often you can't.

The major problem with the first approach is that you have no control over the location of the pathname, as you do with the C library's mkstemp(3) function. For one thing, you never want to put the file on an NFS-mounted filesystem. The O_EXCL flag is not guaranteed to work correctly under NFS, so multiple processes that request an exclusive create at nearly the same time might all succeed. For another, because the path returned is probably in a directory others can write to, someone could plant a symbolic link pointing to a nonexistent file, forcing you to create your file in a location they prefer.[11] If you have any say in it, don't put temp files in a directory that anyone else can write to. If you must, make sure to use the O_EXCL flag to sysopen, and try to use directories with the owner-delete-only flag (the sticky bit) set on them.

[11] A solution to this, which works only under some operating systems, is to call sysopen and OR in the O_NOFOLLOW flag. This makes the function fail if the final component of the path is a symbolic link.
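
A sketch of that footnote's suggestion might look like the following. The helper name create_nofollow is hypothetical, and the code assumes a platform that defines O_NOFOLLOW; where it isn't defined, the Fcntl module raises an error when the constant is used.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL O_NOFOLLOW);

# Hypothetical helper: create $path only if nothing is there, and refuse
# to follow a symbolic link in the final path component.
# Assumes the platform defines O_NOFOLLOW.
sub create_nofollow {
    my ($path) = @_;
    sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL | O_NOFOLLOW, 0600)
        or return;    # $! says why: the name exists, is a symlink, etc.
    return $fh;
}
```

If someone has planted a symbolic link under the name you ask for, the call fails and the link's target is left untouched.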

As of version 5.6.1 of Perl, there is a third way. The standard File::Temp module takes into account all the difficulties we've mentioned. You might use the default options like this:

use File::Temp "tempfile";
$handle = tempfile();

Or you might specify some of the options like this:

use File::Temp "tempfile";
($handle, $filename) = tempfile("plughXXXXXX",
                                DIR => "/var/spool/adventure",
                                SUFFIX => '.dat');

The File::Temp module also provides security-conscious emulations of the other functions we've mentioned (though the native interface is better because it gives you an opened filehandle, not just a filename, which is subject to race conditions). See Chapter 32, "Standard Modules", for a longer description of the options and semantics of this module.
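
To illustrate working through the handle alone, here is a small sketch that writes to a temp file, rewinds, and reads the data back without ever touching a filename. The subroutine name roundtrip_via_tempfile is ours, made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: stash $text in a temp file and read it back
# through the same handle, never reopening anything by name.
sub roundtrip_via_tempfile {
    my ($text) = @_;
    my $fh = tempfile();        # scalar context: we get the handle only
    print {$fh} $text  or die "can't write temp file: $!";
    seek($fh, 0, 0)    or die "can't rewind temp file: $!";
    local $/;                   # slurp mode
    my $copy = <$fh>;
    return $copy;
}
```

Since the caller never learns a name, there is no name for anyone to swap out from under it.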

Once you have your filehandle, you can do whatever you want with it. It's open for both reading and writing, so you can write to the handle, seek back to the beginning, and then if you want, overwrite what you'd just put there or read it back again. The thing you really, really want to avoid doing is ever opening that filename again, because you can't know for sure that it's really the same file you opened the first time around.[12]

[12] Except afterwards by doing a stat on both filehandles and comparing the first two return values of each (the device/inode pair). But it's too late by then because the damage is already done. All you can do is detect the damage and abort (and maybe sneakily send email to the system administrator).
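
The after-the-fact comparison that footnote describes could be sketched like this. The name same_file is hypothetical; the code relies only on the first two values that stat returns, the device and inode numbers.

```perl
use strict;
use warnings;

# Hypothetical helper: true if two open handles refer to the same file,
# judged by the device and inode numbers that stat returns first.
sub same_file {
    my ($fh1, $fh2) = @_;
    my ($dev1, $ino1) = stat($fh1) or return;
    my ($dev2, $ino2) = stat($fh2) or return;
    return $dev1 == $dev2 && $ino1 == $ino2;
}
```

A false result tells you only that you've been had, not how to undo it, which is why avoiding the second open remains the better plan.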

When you launch another program from within your script, Perl normally closes all filehandles for you to avoid another vulnerability. If you use fcntl to clear your close-on-exec flag (as demonstrated at the end of the entry on open in Chapter 29, "Functions"), other programs you call will inherit this new, open file descriptor. On systems that support the /dev/fd/ directory, you could provide another program with a filename that really means the file descriptor by constructing it this way:

$virtname = "/dev/fd/" . fileno(TMP);

If you only needed to call a Perl subroutine or program that's expecting a filename as an argument, and you knew that subroutine or program used regular open for it, you could pass the handle using Perl's notation for indicating a filehandle:

$virtname = "<&=" . fileno(TMP);

When that file "name" is passed with a regular Perl open of one or two arguments (not three, which would dispel this useful magic), you gain access to the duplicated descriptor. In some ways, this is more portable than passing a file from /dev/fd/, because it works everywhere that Perl works; not all systems have a /dev/fd/ directory. On the other hand, the special Perl open syntax for accessing file descriptors by number works only with Perl programs, not with programs written in other languages.
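
As a sketch of how that plays out, suppose some existing routine insists on a filename and calls two-argument open on it. Here count_lines stands in for such a routine (it is made up for this example), and the descriptor is spelled with the "<&=" prefix that the open documentation gives for reusing a file descriptor number.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for legacy code that insists on a "filename" and uses
# two-argument open on it.
sub count_lines {
    my ($name) = @_;
    open(my $in, $name) or die "can't open $name: $!";
    my $n = 0;
    $n++ while <$in>;
    return $n;
}

my $tmp = tempfile();
print {$tmp} "one\ntwo\nthree\n";
seek($tmp, 0, 0) or die "can't rewind: $!";

my $virtname = "<&=" . fileno($tmp);   # names the descriptor, not a path
my $lines = count_lines($virtname);
```

The routine never sees a real pathname, so there's nothing on disk for an attacker to substitute between your open and its open.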