

Writing Apache Modules with Perl and C
By:   Lincoln Stein and Doug MacEachern
Published:   O'Reilly & Associates, Inc.  - March 1999

Copyright © 1999 by O'Reilly & Associates, Inc.


 



Chapter 9 - Perl API Reference Guide
The Apache::File Class

In this section...

Introduction
Apache::File Methods
Apache Methods Added by Apache::File
Using Apache::File to Send Static Files

Introduction


The Perl API includes a class named Apache::File which, when loaded, provides advanced functions for opening and manipulating files at the server side.

Apache::File does two things. First, it provides an object-oriented interface to filehandles similar to Perl's standard IO::File class. While the Apache::File module does not provide all the functionality of IO::File, its methods are approximately twice as fast as the equivalent IO::File methods. Second, when you use Apache::File, it adds several new methods to the Apache class which provide support for handling files under the HTTP/1.1 protocol.

As with IO::File, the main advantage of accessing filehandles through Apache::File's object-oriented interface is the ability to create new anonymous filehandles without worrying about namespace collisions. Furthermore, you don't have to close the filehandle explicitly before exiting the subroutine that uses it; this is done automatically when the filehandle object goes out of scope:

{
 use Apache::File;
 my $fh = Apache::File->new($config);
 # no need to close
}

However, Apache::File is still not as fast as Perl's native open() and close() functions. If you wish to get the highest performance possible, you should use open() and close() in conjunction with the standard Symbol::gensym or Apache::gensym functions:

{ # using standard Symbol module
  use Symbol 'gensym';
  my $fh = gensym;
  open $fh, $config;
  close $fh;
}
{ # Using Apache::gensym() method
  my $fh = Apache->gensym;
  open $fh, $config;
  close $fh;
}

A little-known feature of Perl is that when lexically scoped variables go out of scope, any indirect filehandle stored in them is automatically closed. So, in fact, there's no reason to perform an explicit close() on the filehandles in the two preceding examples unless you want to test the close operation's return value. As always with Perl, there's more than one way to do it.
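The automatic close can be observed outside the server with nothing but core modules. The following is a minimal sketch, not code from Apache::File: the helper name write_without_close() and the scratch file name are made up for the illustration, and it assumes a Perl recent enough to support the three-argument form of open().

```perl
use strict;
use warnings;
use Symbol 'gensym';
use File::Temp qw(tempdir);

# Write some text through an anonymous filehandle and return
# without ever calling close().
sub write_without_close {
    my ($path, $text) = @_;
    my $fh = gensym;
    open($fh, '>', $path) or die "Can't open $path: $!";
    print $fh $text;
    # $fh goes out of scope here; Perl flushes and closes the handle.
}

my $dir  = tempdir(CLEANUP => 1);   # scratch directory, removed at exit
my $path = "$dir/demo.txt";
write_without_close($path, "hello\n");

# The buffered text is already on disk because the handle was closed
# when the subroutine returned.
print "size on disk: ", (-s $path), " bytes\n";
```

An explicit close() is still the better choice when the data matters, since it is the only way to find out that the final flush failed.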

Apache::File Methods


These are methods associated directly with Apache::File objects. They form a subset of what's available from the Perl IO::File and FileHandle classes.

new()

This method creates a new filehandle, returning the filehandle object on success and undef on failure. If an additional argument is given, it will be passed to the open() method automatically.

use Apache::File ();
# create an unopened filehandle object
my $fh = Apache::File->new;
# or create the object and open a file in one step
my $fh = Apache::File->new($filename) or die "Can't open $filename: $!";

open()

Given an Apache::File object previously created with new(), this method opens a file and associates it with the object. The open() method accepts the same types of arguments as the standard Perl open() function, including support for file modes.

$fh->open($filename);
$fh->open(">$out_file");
$fh->open("|$program");

close()

The close() method is equivalent to the Perl built-in close() function, returning true upon success and false upon failure.

$fh->close or die "Can't close $filename: $!";

tmpfile()

The tmpfile() method is responsible for opening a unique temporary file. It is similar to the tmpnam() function in the POSIX module but doesn't come with all the memory overhead that loading POSIX does. It will choose a suitable temporary directory (which must be writable by the web server process). It then generates a series of filenames using the current process ID and the $TMPNAM package global. Once a unique name is found, it is opened for reading and writing, using flags that will cause the file to be created only if it does not already exist. This prevents race conditions in which the function finds what seems to be an unused name, but someone else claims the same name before it can be created.

As an added bonus, tmpfile() calls the register_cleanup() method behind the scenes to make sure the file is unlinked after the transaction is finished.

Called in a list context, tmpfile() returns the temporary file name and a filehandle opened for reading and writing. In a scalar context, only the filehandle is returned.

my($tmpnam, $fh) = Apache::File->tmpfile;
my $fh = Apache::File->tmpfile;
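The create-only-if-absent technique described above can be sketched in plain Perl with sysopen() and the O_EXCL flag. This is an illustration of the idea rather than the module's actual source: the function name my_tmpfile(), the "sketch" filename prefix, the $counter variable (standing in for $TMPNAM), and the limit of 100 attempts are all assumptions made for the example.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT O_EXCL);
use File::Spec;
use Symbol 'gensym';

my $counter = 0;    # hypothetical stand-in for the $TMPNAM package global

# Try a series of names built from the process ID and a counter.
# O_CREAT|O_EXCL makes sysopen() fail if the name already exists, so
# "check that the name is free" and "create the file" are one atomic step.
sub my_tmpfile {
    my $dir = File::Spec->tmpdir;
    for (1 .. 100) {                       # give up after 100 attempts
        my $name = File::Spec->catfile($dir, "sketch" . $$ . "-" . $counter++);
        my $fh   = gensym;
        if (sysopen($fh, $name, O_RDWR | O_CREAT | O_EXCL, 0600)) {
            return wantarray ? ($name, $fh) : $fh;
        }
    }
    return;                                # no unique name was found
}

my ($name, $fh) = my_tmpfile() or die "Can't create temporary file: $!";
print $fh "scratch data\n";
close $fh or die "Can't close $name: $!";
unlink $name;    # tmpfile() arranges this through register_cleanup(); done by hand here
```

Like tmpfile(), the sketch returns the name and the filehandle in a list context and only the filehandle in a scalar context.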

Apache Methods Added by Apache::File


When a handler pulls in Apache::File, the module adds a number of new methods to the Apache request object. These methods are generally of interest to handlers that wish to serve static files from disk or memory using the features of the HTTP/1.1 protocol that provide increased performance through client-side document caching.

To take full advantage of the HTTP/1.1 protocol, your content handler will test the meets_conditions() method before sending the body of a static document. This avoids sending a document that is already cached and up-to-date on the browser's side of the connection. You will then want to call set_content_length() and update_mtime() in order to make the outgoing HTTP headers correctly reflect the size and modification time of the requested file. Finally, you may want to call set_etag() in order to set the file's entity tag when communicating with HTTP/1.1-compliant browsers.

In the section following this one, we demonstrate these methods fully by writing a pure Perl replacement for the http_core module's default document retrieval handler.

discard_request_body()

The majority of GET method handlers do not deal with incoming client data, unlike POST and PUT handlers. However, according to the HTTP/1.1 specification, any method, including GET, can include a request body. The discard_request_body() method tests for the existence of a request body and, if present, simply throws away the data. This discarding is especially important when persistent connections are being used, so that the request body will not be attached to the next request. If the request is malformed, an error code will be returned, which the module handler should propagate back to Apache.

if ((my $rc = $r->discard_request_body) != OK) {
   return $rc;
}

meets_conditions()

In the interest of HTTP/1.1 compliance, the meets_conditions() method is used to implement conditional GET rules. These rules include inspection of client headers, including If-Modified-Since, If-Unmodified-Since, If-Match, and If-None-Match. Consult RFC 2068 section 9.3 (which you can find at http://www.w3.org/Protocols) if you are interested in the nitty-gritty details.

As far as Apache modules are concerned, they need only check the return value of this method before sending a response body. If the return value is anything other than OK, the module should return from the handler with that value. A common return value is HTTP_NOT_MODIFIED, which is sent when the document is already cached on the client side and has not changed since it was cached.

if((my $rc = $r->meets_conditions) != OK) {
    return $rc;
}
# else ... go and send the response body ...

mtime()

This method returns the last modified time of the requested file, expressed as seconds since the epoch. The last modified time may also be changed using this method, although the update_mtime() method is better suited to this purpose.

my $date_string = localtime $r->mtime;

set_content_length()

This method sets the outgoing Content-Length header based on its argument, which should be expressed in bytes. If no argument is specified, the method uses the size of the file named by $r->filename. This method is a bit faster and more concise than setting Content-Length in the headers_out table yourself.

$r->set_content_length;
$r->set_content_length(-s $r->finfo); #same as above
$r->set_content_length(-s $filename);

set_etag()

This method is used to set the outgoing ETag header corresponding to the requested file. ETag is an opaque string that identifies the current version of the file and changes whenever the file is modified. This string is tested by the meets_conditions() method if the client provides an If-Match or If-None-Match header.

$r->set_etag;
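To make the role of the entity tag concrete, here is a deliberately simplified, standalone sketch of how If-Match and If-None-Match can be evaluated against a document's current ETag. It is not Apache's implementation of meets_conditions(): the function names etag_listed() and check_etag_conditions() are invented for the example, the request headers are passed in as a plain hash, and refinements such as weak validators and the date-based headers are ignored. The constants carry the same numeric values as their Apache::Constants namesakes.

```perl
use strict;
use warnings;

use constant OK                       => 0;
use constant HTTP_NOT_MODIFIED        => 304;
use constant HTTP_PRECONDITION_FAILED => 412;

# True if $etag appears in a comma-separated header value, or if the
# header is "*", which matches any existing entity.
sub etag_listed {
    my ($header, $etag) = @_;
    return 1 if $header =~ /^\s*\*\s*$/;
    for my $tag (split /\s*,\s*/, $header) {
        $tag =~ s/^\s+|\s+$//g;           # trim stray whitespace
        return 1 if $tag eq $etag;
    }
    return 0;
}

# Decide whether the response body should be sent at all.
sub check_etag_conditions {
    my ($headers_in, $etag) = @_;

    # If-Match: proceed only if the client's copy is the current version.
    my $if_match = $headers_in->{'If-Match'};
    return HTTP_PRECONDITION_FAILED
        if defined $if_match && !etag_listed($if_match, $etag);

    # If-None-Match: the client already holds this version, so skip the body.
    my $if_none_match = $headers_in->{'If-None-Match'};
    return HTTP_NOT_MODIFIED
        if defined $if_none_match && etag_listed($if_none_match, $etag);

    return OK;
}

my $rc = check_etag_conditions({ 'If-None-Match' => '"v1", "v2"' }, '"v2"');
print "status: $rc\n";
```

In a real handler none of this is written by hand; calling set_etag() and then testing the result of meets_conditions() is sufficient.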

set_last_modified()

This method is used to set the outgoing Last-Modified header from the value returned by $r->mtime. The method checks that the specified time is not in the future. In addition, using set_last_modified() is faster and more concise than setting Last-Modified in the headers_out table yourself.

You may provide an optional time argument, in which case the method will first call update_mtime() to set the file's last modification date. It will then set the outgoing Last-Modified header as before.

$r->update_mtime((stat $r->finfo)[9]);
$r->set_last_modified;
$r->set_last_modified((stat $r->finfo)[9]); # same as the two lines above
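For readers curious about what ends up in the header, the following standalone sketch formats an epoch time as an HTTP date and refuses to advertise a time in the future. The function names http_date() and last_modified_value() are hypothetical, and clamping a future time to the current time is an assumption about how the check is resolved, not something stated above.

```perl
use strict;
use warnings;

my @DAY = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MON = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Format seconds since the epoch as an HTTP date, which is always in GMT.
sub http_date {
    my ($time) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year, $wday) = gmtime $time;
    return sprintf "%s, %02d %s %04d %02d:%02d:%02d GMT",
        $DAY[$wday], $mday, $MON[$mon], $year + 1900, $hour, $min, $sec;
}

# Build a Last-Modified value, never later than "now".
# $now is optional and defaults to the current time.
sub last_modified_value {
    my ($mtime, $now) = @_;
    $now = time unless defined $now;
    $mtime = $now if $mtime > $now;    # assumption: clamp future times
    return http_date($mtime);
}

print "Last-Modified: ", last_modified_value((stat $0)[9]), "\n";
```

The day and month names are taken from fixed tables rather than from strftime() so that the output does not vary with the server's locale.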

update_mtime()

Rather than setting the request record mtime field directly, you should use the update_mtime() method to change the value of this field. It will only be updated if the new time is more recent than the current mtime. If no time argument is present, the default is the last modified time of $r->filename.

$r->update_mtime;
$r->update_mtime((stat $r->finfo)[9]); #same as above
$r->update_mtime(time);