Writing Apache Modules with Perl and C

By:	Lincoln Stein and Doug MacEachern
Published:	O'Reilly & Associates, Inc. - March 1999

Chapter 4 - Content Handlers / Content Handlers as File Processors
Adding a Canned Footer to Pages

To show you how content handlers work, we'll develop a module with the Perl API that adds a canned footer to all pages in a particular directory. You could use this, for example, to automatically add copyright information and a link back to the home page. Later on, we'll turn this module into a full-featured navigation bar.

Figure 4-1. The footer on this page was generated automatically by Apache::Footer.

Example 4-1 gives the code for Apache::Footer, and Figure 4-1 shows a screenshot of it in action. Since this is our first substantial module, we'll step through the code section by section.

package Apache::Footer;

use strict;
use Apache::Constants qw(:common);
use Apache::File ();

The code begins by declaring its package name and loading the Perl modules that it depends on. The use strict pragma activates Perl checks that prevent us from using global variables before declaring them, disallow bareword identifiers that are not subroutine names, forbid symbolic references, and catch other unsafe practices. The Apache::Constants module defines constants for the various Apache and HTTP result codes; we bring in only those constants that belong to the frequently used :common set. Apache::File defines methods that are useful for manipulating files.

sub handler {
    my $r = shift;
    return DECLINED unless $r->content_type() eq 'text/html';

The handler() subroutine does all the work of generating the content. It is roughly divided into three parts. In the first part, it fetches information about the requested file and decides whether it wants to handle it. In the second part, it creates the canned footer dynamically from information that it gleans about the file. In the third part, it rewrites the file to include the footer.

In the first part of the process, the handler retrieves the Apache request object and stores it in $r. Next it calls the request's content_type() method to retrieve its MIME type. Unless the document is of type text/html, the handler stops here and returns a DECLINED result code to the server. This tells Apache to pass the document on to any other handlers that have declared their willingness to handle this type of document. In most cases, this means that the document or image will be passed through to the browser in the usual way.

    my $file = $r->filename;

    unless (-e $r->finfo) {
        $r->log_error("File does not exist: $file");
        return NOT_FOUND;
    }
    unless (-r _) {
        $r->log_error("File permissions deny access: $file");
        return FORBIDDEN;
    }

At this point we recover the file path by calling the request object's filename() method. Just because Apache has assigned the document a MIME type doesn't mean that it actually exists or, if it exists, that its permissions allow it to be read by the current process. The next two blocks of code check for these cases. Using the Perl -e file test, we check whether the file exists. If not, we log an error to the server log using the request object's log_error() method and return a result code of NOT_FOUND. This causes the server to return a page displaying the 404 "Not Found" error (exactly what is displayed is under the control of the ErrorDocument directive).

There are several ways to perform file status checks in the Perl API. The simplest way is to recover the file's pathname using the request object's filename() method, and pass the result to the Perl -e file test:

unless (-e $r->filename) {
    $r->log_error("File does not exist: $file");
    return NOT_FOUND;
}

A more efficient way, however, is to take advantage of the fact that during its path walking operation Apache already performed a system stat() call to collect filesystem information on the file. The resulting status structure is stored in the request object and can be retrieved with the object's finfo() method. So the more efficient idiom is to use the test -e $r->finfo.

Once finfo() is called, the stat() information is stored into the magic Perl filehandle _ and can be used for subsequent file testing and stat() operations, saving even more CPU time. Using the _ filehandle, we next test that the file is readable by the current process and return FORBIDDEN if this isn't the case. This displays a 403 "Forbidden" error.

    my $modtime = localtime((stat _)[9]);

After performing these tests, we get the file modification time by calling stat(). We can use the _ filehandle here too, avoiding the overhead of repeating the stat() system call. The modification time is passed to the built-in Perl localtime() function to convert it into a human-readable string.
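The same idiom can be tried outside the server. The following is only a minimal sketch in plain Perl: there is no request object here, so the first file test is applied to a pathname (and therefore performs the stat() call itself) rather than to $r->finfo. The helper name check_file() is hypothetical and not part of Apache::Footer.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Apache::Footer): classify a path using
# one real stat() call, then reuse the cached result via the "_" filehandle.
sub check_file {
    my ($path) = @_;
    return 'missing'    unless -e $path;   # this test performs the stat() call
    return 'unreadable' unless -r _;       # reuses the cached stat information
    my $mtime = (stat _)[9];               # element 9 is the modification time
    return scalar localtime($mtime);       # human-readable date string
}

print check_file($0), "\n";
```

Only the first test touches the filesystem; the -r test and the stat() call both read the information cached in _.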

    my $fh;
    unless ($fh = Apache::File->new($file)) {
        $r->log_error("Couldn't open $file for reading: $!");
        return SERVER_ERROR;
    }

At this point, we attempt to open the file for reading using Apache::File's new() method. For the most part, Apache::File acts just like Perl's IO::File object-oriented I/O package, returning a filehandle on success or undef on failure. Since we've already handled the two failure modes that we know how to deal with, we return a result code of SERVER_ERROR if the open is unsuccessful. This immediately aborts all processing of the document and causes Apache to display a 500 "Internal Server Error" message.
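To see the "filehandle on success, undef on failure" convention in isolation, here is a small sketch that uses the standard IO::File module as a stand-in for Apache::File, on the assumption stated above that the two behave alike for simple opens. The helper name open_for_reading() and the sample path are hypothetical.

```perl
use strict;
use warnings;
use IO::File ();

# Hypothetical helper: open a file for reading, returning the handle on
# success or undef (with the reason left in $!) on failure.
sub open_for_reading {
    my ($path) = @_;
    my $fh = IO::File->new($path, 'r');   # undef if the open fails
    return $fh;
}

my $fh = open_for_reading('/no/such/file.html');
print "open failed: $!\n" unless $fh;
```

Because the failure value is undef, the result can be tested directly in an unless or || expression, which is what the handler does.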

    my $footer = <<END;
<hr>
&copy; 1998 <a href="http://www.ora.com/">O'Reilly &amp; Associates</a><br>
<em>Last Modified: $modtime</em>
END

Having successfully opened the file, we build the footer. The footer in this example script is entirely static, except for the document modification date that is computed on the fly.
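The footer construction can be isolated as a small function. This is a sketch only: make_footer() is a hypothetical name, and the markup is trimmed to the part that depends on the modification time.

```perl
use strict;
use warnings;

# Hypothetical helper: build a footer for a given modification time
# (seconds since the epoch), interpolating the formatted date into a
# here-document.
sub make_footer {
    my ($mtime) = @_;
    my $modtime = localtime($mtime);   # scalar context yields a readable string
    return <<END;
<hr>
<em>Last Modified: $modtime</em>
END
}

print make_footer(time);
```

Because the here-document is unquoted, $modtime is interpolated just as it would be in a double-quoted string.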

    $r->send_http_header;

    while (<$fh>) {
        s!(</BODY>)!$footer$1!oi;
    } continue {
        $r->print($_);
    }

The last phase is to rewrite the document. First we tell Apache to send the HTTP header. There's no need to set the content type first because it already has the appropriate value. We then loop through the document looking for the closing </BODY> tag. When we find it, we use a substitution statement to insert the footer in front of it. The possibly modified line is now sent to the browser using the request object's print() method.
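The rewriting step can be exercised on its own, without a server. The sketch below applies the same substitution to an in-memory list of lines; add_footer() is a hypothetical helper, and the sample page and footer text are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the given lines with $footer
# inserted in front of the closing </BODY> tag (matched case-insensitively).
sub add_footer {
    my ($footer, @lines) = @_;
    for my $line (@lines) {
        $line =~ s!(</BODY>)!$footer$1!i;
    }
    return @lines;
}

my @page = ("<html><body>\n", "<p>Hello</p>\n", "</body></html>\n");
print add_footer("<hr>Footer\n", @page);
```

Note that a document with no closing </BODY> tag passes through unchanged, so no footer is added to it.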

    return OK;
}

1;

At the end, we return an OK result code to Apache and end the handler subroutine definition. Like any other .pm file, the module itself must end by returning a true value (usually 1) to signal Perl that it compiled correctly.

If all this checking for the existence and readability of the file before processing seems a bit pedantic, don't worry: strictly speaking, it is unnecessary. Instead of explicitly checking the file, we could have simply returned DECLINED if the attempt to open the file failed. Apache would then pass the URI to the default file handler, which performs its own checks and displays the appropriate error messages. Therefore we could have replaced the file tests with the single line:

my $fh = Apache::File->new($file) || return DECLINED;

Doing the tests inside the module this way makes the checks explicit and gives us a chance to intervene to rescue the situation. For example, we might choose to search for a text file of the same name and present it instead. The explicit tests also improve module performance slightly, since the system wastes a small amount of CPU time when it attempts to open a nonexistent file. If most of the files the module serves do exist, however, this penalty won't be significant.
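As one illustration of such a rescue, the following sketch looks for a text file with the same base name as a missing HTML file. The helper text_fallback() and the .html-to-.txt naming rule are assumptions made for this example; they are not part of Apache::Footer.

```perl
use strict;
use warnings;

# Hypothetical helper: given the path of an HTML file, return the path of a
# same-named .txt file if one exists and is readable; otherwise return
# nothing (undef in scalar context).
sub text_fallback {
    my ($file) = @_;
    my $alt = $file;
    return unless $alt =~ s/\.html?$/.txt/i;   # not an .html or .htm path
    return unless -e $alt && -r _;             # no readable text twin
    return $alt;
}

my $alt = text_fallback('/no/such/page.html');
print defined $alt ? "would serve $alt\n" : "no fallback available\n";
```

A handler could call something like this from its NOT_FOUND branch and continue processing with the alternate path instead of returning the error.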

Example 4-1. Adding a Canned Footer to HTML Pages

package Apache::Footer;
# file: Apache/Footer.pm

use strict;
use Apache::Constants qw(:common);
use Apache::File ();

sub handler {
    my $r = shift;
    return DECLINED unless $r->content_type() eq 'text/html';

    my $file = $r->filename;

    unless (-e $r->finfo) {
        $r->log_error("File does not exist: $file");
        return NOT_FOUND;
    }
    unless (-r _) {
        $r->log_error("File permissions deny access: $file");
        return FORBIDDEN;
    }

    my $modtime = localtime((stat _)[9]);

    my $fh;
    unless ($fh = Apache::File->new($file)) {
        $r->log_error("Couldn't open $file for reading: $!");
        return SERVER_ERROR;
    }

    my $footer = <<END;
<hr>
&copy; 1998 <a href="http://www.ora.com/">O'Reilly &amp; Associates</a><br>
<em>Last Modified: $modtime</em>
END

    $r->send_http_header;

    while (<$fh>) {
        s!(</BODY>)!$footer$1!oi;
    } continue {
        $r->print($_);
    }

    return OK;
}

1;
__END__

There are several ways to install and use the Apache::Footer content handler. If all the files that needed footers were gathered in one place in the directory tree, you would probably want to attach Apache::Footer to that location:

<Location /footer>
 SetHandler perl-script
 PerlHandler Apache::Footer
</Location>

If the files were scattered about the document tree, it might be more convenient to map Apache::Footer to a unique filename extension, such as .footer. To achieve this, the following directives would suffice:

AddType text/html .footer
<Files ~ "\.footer$">
  SetHandler  perl-script
  PerlHandler Apache::Footer
</Files>

Note that it's important to associate MIME type text/html with the new extension; otherwise, Apache won't be able to determine its content type during the MIME type checking phase.

If your server is set up to allow per-directory access control files to include file information directives (that is, AllowOverride FileInfo is in effect), you can place any of these handler directives inside a .htaccess file. This allows you to change handlers without restarting the server. For example, you could replace the <Location> section shown earlier with a .htaccess file in the directory where you want the footer module to be active:

SetHandler perl-script
PerlHandler Apache::Footer
