Chapter 4 - Content Handlers / Content Handlers as File Processors
One of us (Lincoln) thought the virtual navigation bar was so neat that he immediately ran out and used it for all documents on his site. Unfortunately, he had some pretty large (>400 MB) files there, and he soon noticed something interesting. Before installing the navigation bar handler, browsers would cache the large HTML files locally and only download them again when they had changed. After installing the handler, however, the files were always downloaded. What happened?
When a browser is asked to display a document that it has cached locally, it sends the remote server a GET request with an additional header field named If-Modified-Since. The request looks something like this:
GET /index.html HTTP/1.0
If-Modified-Since: Tue, 24 Feb 1998 11:19:03 GMT
User-Agent: (etc. etc. etc.)
The server will compare the document's current modification date to the time given in the request. If the document is more recent than that, it will return the whole document. Otherwise, the server will respond with a 304 "not modified" message and the browser will display its cached copy. This reduces network bandwidth usage dramatically.
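The comparison the server makes can be sketched in a few lines of plain Perl. This is only an illustration of the decision, not Apache's actual code: parse_http_date() and status_for() are hypothetical helper names, and only the date format shown in the request above is recognized.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my %MONTH = (Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4,  Jun => 5,
             Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11);

# Hypothetical helper: turn a date such as "Tue, 24 Feb 1998 11:19:03 GMT"
# into epoch seconds. Returns undef if the string cannot be parsed.
sub parse_http_date {
    my ($date) = @_;
    return undef unless defined($date)
        && $date =~ /^\w{3}, (\d{2}) (\w{3}) (\d{4}) (\d{2}):(\d{2}):(\d{2}) GMT$/;
    my ($day, $mon, $year, $h, $m, $s) = ($1, $2, $3, $4, $5, $6);
    return undef unless exists $MONTH{$mon};
    # timegm() dies on impossible dates, so trap that and return undef instead.
    return scalar eval { timegm($s, $m, $h, $day, $MONTH{$mon}, $year) };
}

# Hypothetical helper: 200 means "send the whole document",
# 304 means "not modified, use the cached copy".
sub status_for {
    my ($mtime, $if_modified_since) = @_;
    my $since = parse_http_date($if_modified_since);
    return 200 unless defined $since;    # no usable header: send everything
    return $mtime > $since ? 200 : 304;
}
```

For a file on disk, the first argument would be its modification time, for example status_for((stat $file)[9], $header_value).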
When you install a custom content handler, the If-Modified-Since mechanism no longer works unless you implement it. In fact, you can generally ignore If-Modified-Since because content handlers usually generate dynamic documents that change from access to access. However, in some cases the content you provide is sufficiently static that it pays to cache the documents. The navigation bar is one such case because even though the bar is generated dynamically, it rarely changes from day to day.
In order to handle If-Modified-Since caching, you have to settle on a definition for the document's most recent modification date. In the case of a static document, this is simply the modification time of the file. In the case of composite documents that consist equally of static file content and a dynamically generated navigation bar, the modification date is either the time that the HTML file was last changed or the time that the navigation bar configuration file was changed, whichever happens to be more recent. Fortunately for us, we're already storing the configuration file's modification date in the NavBar object, so finding this aggregate modification time is relatively simple.
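To use Apache's update_mtime(), set_last_modified(), and meets_conditions() routines, simply add the following just before the call to $r->send_http_header in the handler() subroutine:
# $bar is the NavBar object; modified() is assumed here to be the accessor
# that returns its configuration file's modification date
$r->update_mtime($bar->modified);
$r->set_last_modified;
my $rc = $r->meets_conditions;
return $rc unless $rc == OK;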
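We first call the update_mtime() function with the navigation bar's modification date. This function will compare the specified date with the modification date of the request document and update the request's internal mtime field to the most recent of the two. We then call set_last_modified() to copy the mtime field into the outgoing Last-Modified header. Finally, meets_conditions() checks the request's conditional header fields, such as If-Modified-Since, against these values. If it returns anything other than OK (for example, the 304 "not modified" status), the handler returns that code immediately without sending the document. If a synthesized document depends on several configuration files, you should call update_mtime() once for each configuration file, followed by set_last_modified() at the very end.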
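The several-configuration-files case might be wrapped up along the following lines. This is a sketch under stated assumptions rather than code from Apache::NavBar: check_freshness() is a hypothetical helper, and $r is assumed to be a request object on which update_mtime(), set_last_modified(), and meets_conditions() are available.

```perl
use strict;
use warnings;

# Hypothetical helper. $r is assumed to be a mod_perl request object that
# provides update_mtime(), set_last_modified(), and meets_conditions().
# @files lists every file that the generated page depends on.
sub check_freshness {
    my ($r, @files) = @_;
    for my $file (@files) {
        my $mtime = (stat $file)[9];
        # Skip files that cannot be stat'ed instead of passing undef along.
        $r->update_mtime($mtime) if defined $mtime;
    }
    $r->set_last_modified;          # copy the newest mtime into Last-Modified
    return $r->meets_conditions;    # OK, or a status code to return as is
}
```

Inside handler() this could be used as my $rc = check_freshness($r, $r->filename, @conf_files); return $rc unless $rc == OK; where @conf_files is likewise a hypothetical list of configuration file paths. Passing $r->filename explicitly is a precaution so that the requested file's own modification time is always part of the comparison.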
The complete code for the new and improved Apache::NavBar, with the If-Modified-Since improvements, can be found at this book's companion web site.
If you think carefully about this module, you'll see that it still isn't strictly correct. There's a third modification date that we should take into account, that of the module source code itself. Changes to the source code may affect the appearance of the document without changing the modification date of either the configuration file or the HTML file. We could add a new update_mtime() with the modification time of the Apache::NavBar module, but then we'd have to worry about modification times of libraries that Apache::NavBar depends on, such as Apache::File. This gets hairy very quickly, which is why caching becomes a moot issue for any dynamic document much more complicated than this one. See "The Apache::File Class" in Chapter 9, Perl API Reference Guide, for a complete rundown of the methods that are available to you for controlling HTTP/1.1 caching.
Sending Static Files
If you want your content handler to send a file through without modifying it, the easiest way is to let Apache do all the work for you. Simply return DECLINED from your handler (before you send the HTTP header or the body) and the request will fall through to Apache's default handler. This is a lot easier, not to mention faster, than opening up the file, reading it line by line, and transmitting it unchanged. In addition, Apache will automatically handle a lot of the details for you, first and foremost of which is handling the If-Modified-Since header and other aspects of client-side caching.
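A handler that processes HTML itself and declines everything else could look roughly like this. It is a minimal sketch, not the book's code: under mod_perl the constants come from use Apache::Constants qw(:common), and they are defined locally here only so that the example runs on its own. The printed comment is a stand-in for real processing.

```perl
use strict;
use warnings;

# Under mod_perl these come from "use Apache::Constants qw(:common);".
# They are defined locally only to keep the sketch self-contained.
use constant { OK => 0, DECLINED => -1 };

sub handler {
    my $r = shift;

    # Anything that is not HTML is passed through untouched: returning
    # DECLINED before any output lets Apache's default handler send it.
    return DECLINED unless ($r->content_type || '') eq 'text/html';

    $r->send_http_header;
    $r->print("<!-- processed by the sketch handler -->\n");   # stand-in body
    return OK;
}
```

The important detail is the ordering: the DECLINED test comes before send_http_header and before anything is printed.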
If you have a compelling reason to send static files manually, see "Using Apache::File to Send Static Files" in Chapter 9 for a full description of the technique. Also see "Redirection," later in this chapter, for details on how to direct the browser to request a different URI or to make Apache send the browser a different document from the one that was specifically requested.
1 At least in theory, you can divine what MIME types a browser prefers by examining the contents of the Accept header with $r->header_in('Accept'). According to the HTTP protocol, this should return a list of MIME types that the browser can handle along with a numeric preference score. The CGI.pm module even has an accept() function that leverages this information to choose the best format for a given document type. Unfortunately, this part of the HTTP protocol has atrophied, and neither Netscape's nor Microsoft's browsers give enough information in the Accept header to make it useful for content negotiation.
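For illustration, the list of MIME types and preference scores in an Accept header could be pulled apart as follows. This is a simplified sketch: parse_accept() is a hypothetical helper, it is not how CGI.pm's accept() is implemented, and it ignores every parameter except q.

```perl
use strict;
use warnings;

# Hypothetical helper: split an Accept header value into [type, q] pairs,
# highest preference first. A type without a q parameter counts as q=1.
sub parse_accept {
    my ($header) = @_;
    my @prefs;
    for my $item (split /,/, defined($header) ? $header : '') {
        $item =~ s/^\s+|\s+$//g;                  # trim surrounding whitespace
        my ($type, @params) = split /\s*;\s*/, $item;
        next unless defined($type) && length($type);
        my $q = 1;
        for my $param (@params) {
            $q = $1 if $param =~ /^q=([01](?:\.\d{0,3})?)$/i;
        }
        push @prefs, [lc($type), $q + 0];
    }
    return sort { $b->[1] <=> $a->[1] } @prefs;
}
```

Under mod_perl it would be called in list context as my @prefs = parse_accept($r->header_in('Accept')); each element is then a two-item array reference holding the type and its score.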
Copyright © 1999 by O'Reilly & Associates, Inc.