home | O'Reilly's CD bookshelfs | FreeBSD | Linux | Cisco | Cisco Exam  


Book HomeCGI Programming with PerlSearch this book

8.4. Perl's Taint Mode

If you have been paying close attention, you may have noticed that the example scripts in this chapter are all a little different from previous examples. The difference appears at the end of the first line. All of our prior examples have had this as the first line:

#!/usr/bin/perl -wT

In this chapter, they have started like this:

#!/usr/bin/perl -w

The difference is the -T option, which enables Perl's taint mode. Taint mode tells Perl to keep track of data that comes from the user and avoid doing anything insecure with it. Because our examples this chapter intentionally showed insecure ways of doing things, they wouldn't have worked with the -T flag, thus we omitted it. From this it should be clear, however, that taint mode is generally a very good thing.

The purpose of taint mode is to not allow any data from outside your application from affecting anything else external to your application. Thus, Perl will not allow user-inputted values to be used in an eval, passed through a shell, or used in any of the Perl commands that affect external files and processes. It was created for situations when security is important, such as writing Perl programs that run as root or CGI scripts. You should always use taint mode in your CGI scripts.

8.4.1. How Taint Works

When taint mode is enabled, Perl monitors every variable to see if it is tainted. Tainted data, according to Perl, is any data that comes from outside your code. Because this includes anything read from STDIN (or any other file input) as well as all environment variables, this covers everything your CGI script receives from the user.

Not only does Perl keep track of whether variables are tainted or not, but that taintedness follows the data in the variable around if you try to assign it to another variable. For example, because it is an environment variable, Perl considers the HTTP request method stored in $ENV{REQUEST_METHOD} to be tainted. If you then assign this to another variable, that variable also becomes tainted.

my $method = $ENV{REQUEST_METHOD};

Here $method also becomes tainted. It does not matter whether the expression simple or complex. If a tainted value is used in an expression, then the result of that expression is also tainted, and any variable it is assigned to will also become tainted.

You can use this subroutine to test whether a variable is tainted.[16] It returns a true or false value:

[16]The perlsec manpage suggests a subroutine that uses Perl's kill function to test for taintedness. Unfortunately, the kill function is not supported by many systems. The subroutine provided here should work on any platform.

sub is_tainted {
    my $var = shift;
    my $blank = substr( $var, 0, 0 );
    return not eval { eval "1 || $blank" || 1 };
}

We set $blank to a zero-length substring of the variable we're testing. If the value is tainted and we are running in taint mode, Perl will throw an error when we evaluate this in the quoted expression on the following line. This error is caught by the outer eval, which then returns undef. If the variable is not tainted or we are not running in taint mode, then the expression within the outer eval evaluates to 1. The not reverses the resulting values.

8.4.3. How Taintedness Is Removed

Taint mode would be much too restrictive if there was no way to untaint your data. Of course, you do not want to untaint data without checking it to verify that it is safe. Fortunately, one command can accomplish both of these tasks. It turns out that Perl does allow one expression involving tainted values to evaluate to an untainted value. If you match a variable with a regular expression, then the pattern match variables that correspond to the matched parentheses (e.g., $1, $2, etc.) are untainted. If, for example, you wanted to get a particular filename for the user while making sure that it doesn't include a full path (so the user cannot write to a file outside the directory you are intending), you could untaint the user input this way:

$q->param( "filename" ) =~ /^([\w+.])$/;
my $filename = $1;

unless ( $filename ) {
    .
    .
    .

You can reduce the first two lines to one line because a regular expression match returns a list of matches, and these are also untainted:

my( $filename ) = $q->param( "filename" ) =~ /^([\w.])$/;

unless ( $filename ) {
    .
    .
    .

You have seen this notation previously in many of our examples. Note that because the result of the regular expression is a list, you must include parentheses around $filename to evaluate it in a list context. Otherwise, $filename will be set to the number of successful parenthesized matches (1 in this case).

8.4.3.1. Allowing versus disallowing

Remember what we said previously. It is generally better to determine what characters to allow than to try to determine what not to allow. Build your untaint regular expressions with this in mind. In this example, we only allowed letters, numbers, underscores, and periods in the filename, which is much simpler than scanning against possible file path delimiters.

8.4.4. Why Use Taint Mode?

Perl's taint mode doesn't do anything for you that you can't do for yourself. It simply monitors the data and stops you if you're in danger of shooting yourself in the foot. You could be careful on your own, but it certainly helps to have Perl do its best to help. In general, the best argument for using taint mode is simply turn the question around and ask "Why not use taint mode?"

Many CGI developers can come up with excuses for not using taint mode, but none of them really hold water. Some may find it too difficult or complicated to deal with the restrictions that taint mode imposes. This is generally because they don't fully understand how taint mode works and they find it easier to turn it off than to learn how to fix the problems Perl is trying to point out (see the next section for some help).

Other developers may argue that taint mode slows their scripts down more than they can afford. Believe it or not, taint mode does not significantly slow down your scripts. If you are concerned about performance, don't implicitly assume that taint mode must slow down your code. Use the Benchmark module and test the difference; you may be surprised at the results. We'll discuss how to use the Benchmark module in Chapter 17, "Efficiency and Optimization".

The final reason to use taint mode is that CGI scripts rarely remain unchanged. Bugs are fixed, new features are added, and even though the original code may have been perfectly safe, someone may accidentally change all that. You can think of taint mode as an ongoing security audit that Perl provides for free.



Library Navigation Links

Copyright © 2001 O'Reilly & Associates. All rights reserved.