Chapter 4 - Content Handlers
Content Handlers as File Processors
Early web servers were designed as engines for transmitting physical files from the host machine to the browser. Even though Apache does much more, the file-oriented legacy still remains. Files can be sent to the browser unmodified or passed through content handlers to transform them in various ways before sending them on to the browser. Even though many of the documents that you produce with modules have no corresponding physical files, some parts of Apache still behave as if they did.
When Apache receives a request, the URI is passed through any URI translation
handlers that may be installed (see Chapter 7, Other Request Phases, for information on how to roll
your own), transforming it into a file path. The mod_alias translation
handler (compiled in by default) will first process any Alias, ScriptAlias,
Redirect, or other mod_alias directives. If none applies,
the http_core default translator will simply prepend the DocumentRoot
directory to the beginning of the URI.
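As a rough illustration of the translation step, a configuration along the following lines would send most URIs through the default translator while diverting two prefixes through mod_alias. The directory paths are assumptions chosen for the example, not taken from any particular installation.

```apache
# Default translation: /abc/def/ghi becomes /home/www/abc/def/ghi
DocumentRoot /home/www

# mod_alias translation (hypothetical paths):
# /icons/logo.gif becomes /usr/local/apache/icons/logo.gif
Alias /icons/ /usr/local/apache/icons/

# Like Alias, but also marks the target directory as containing CGI scripts
ScriptAlias /cgi-bin/ /usr/local/apache/cgi-bin/
```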
Next, Apache attempts to divide the file path into two parts: a "filename" part which usually (but not always) corresponds to a physical file on the host's filesystem, and an "additional path information" part corresponding to additional stuff that follows the filename. Apache divides the path using a very simple-minded algorithm. It steps through the path components from left to right until it finds something that doesn't correspond to a directory on the host machine. The part of the path up to and including this component becomes the filename, and everything that's left over becomes the additional path information.
Consider a site with a document root of /home/www that has just received
a request for URI /abc/def/ghi. The way Apache splits the file path
into filename and path information parts depends on what directories it finds
in the document root:
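Directories in Document Root    Filename                 Additional Path Information
/home/www only                  /home/www/abc            /def/ghi
/home/www/abc                   /home/www/abc/def        /ghi
/home/www/abc/def               /home/www/abc/def/ghi    (empty)
/home/www/abc/def/ghi           /home/www/abc/def/ghi    (empty)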
Note that the presence of any actual files in the path is irrelevant to this process. The division between the filename and the path information depends only on what directories are present.
Once Apache has decided where the file is in the path, it determines what MIME type it might be. This is again one of the places where you can intervene to alter the process with a custom type handler. The default type handler (mod_mime) just compares the filename's extension to a table of MIME types. If there's a match, this becomes the MIME type. If no match is found, then the MIME type is undefined. Again, note that this mapping from filename to MIME type occurs even when there's no actual file there.
There are two special cases. If the last component of the filename happens
to be a physical directory, then Apache internally assigns it a "magic" MIME
type, defined by the
DIR_MAGIC_TYPE constant as httpd/unix-directory.
This is used by the directory module to generate automatic directory listings.
The second special case occurs when you have the optional mod_mime_magic
module installed and the file actually exists. In this case Apache will peek
at the first few bytes of the file's contents to determine what type of file
it might be. Chapter 7 shows you how to write
your own MIME type checker handlers to implement more sophisticated MIME type
determination schemes.
After Apache has determined the name and type of the file referenced by the
URI, it decides what to do about it. One way is to use information hard-wired
into the module's static data structures. The module's handler_rec
table, which we describe in detail in Chapter 10, C API Reference
Guide, Part I, declares the module's
willingness to handle one or more magic MIME types and associates a content
handler with each one. For example, the mod_cgi module associates MIME
type application/x-httpd-cgi with its cgi_handler() handler
subroutine. When Apache detects that a filename is of type application/x-httpd-cgi
it invokes cgi_handler() and passes it information about the file.
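One way a file can acquire such a magic MIME type is through mod_mime's AddType directive. The sketch below assumes you want files ending in .cgi treated as CGI scripts; the extension is only an example.

```apache
# Files ending in .cgi receive the magic type that mod_cgi's handler claims
AddType application/x-httpd-cgi .cgi
```

For the scripts to actually run, the ExecCGI option must also be enabled for the directory that contains them.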
A module can also declare its desire to handle an ordinary MIME type, such as
video/quicktime, or even a wildcard type, such as video/*.
In this case, all requests for URIs with matching MIME types will be passed
through the module's content handler unless some other module registers a more
specific MIME type.
Newer modules use a more flexible method in which content handlers are associated with files at runtime using explicit names. When this method is used, the module declares one or more content handler names in its
handler_rec array instead of, or in addition to, MIME types. Some examples of content handler names you might have seen include cgi-script, server-info, server-parsed, imap-file, and perl-script. Handler names can be associated with files using either AddHandler or SetHandler directives. AddHandler associates a handler with a particular file extension. For example, a typical configuration file will contain this line to associate .shtml files with the server-side include handler:
AddHandler server-parsed .shtml
Now, the server-parsed handler defined by mod_include will be called on to process all files ending in ".shtml" regardless of their MIME type.
SetHandler is used within <Directory>, <Location>, and <Files> sections to associate a particular handler with an entire section of the site's URI space. In the two examples that follow, the <Location> section attaches the server-parsed handler to all files within the virtual directory /shtml, while the <Files> section attaches imap-file to all files that begin with the prefix "map-":
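A minimal sketch of a <Location> section and a <Files> section of this kind might look like the following. The patterns are reconstructed from the description in the text rather than taken from a tested configuration.

```apache
# Everything under the virtual directory /shtml is server-parsed
<Location /shtml>
  SetHandler server-parsed
</Location>

# Any file whose name begins with "map-" is treated as an image map
<Files map-*>
  SetHandler imap-file
</Files>
```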
Surprisingly, the AddHandler and SetHandler directives are
not actually implemented in the Apache core. They are implemented by the standard
mod_mime module, which is compiled into the server by default. The related
mod_actions module builds on handler names with its Action directive; in
Chapter 7, we show how to reimplement mod_actions
using the Perl API.
You'll probably want to use explicitly named content handlers in your modules rather than hardcoded MIME types. Explicit handler names make configuration files cleaner and easier to understand. Plus, you don't have to invent a new magic MIME type every time you add a handler.
Things are slightly different for
mod_perl users because two directives are needed to assign a content handler to a directory or file. The reason for this is that the only real content handler defined by
mod_perl is its internal
perl-script handler. You use SetHandler to assign
perl-script the responsibility for a directory or partial URI, and then use a PerlHandler directive to tell the perl-script handler which Perl module to execute. Directories supervised by Perl API content handlers will look something like this:
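A section of this kind could be sketched as shown here. Both the /graph location and the Apache::Graph module name are hypothetical, chosen only to show how the two directives pair up.

```apache
<Location /graph>
  # Hand this part of the URI space to mod_perl's internal handler
  SetHandler  perl-script
  # Apache::Graph is a hypothetical module name used for illustration
  PerlHandler Apache::Graph
</Location>
```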
Don't try to assign perl-script to a file extension using something like
AddHandler perl-script .pl; this is generally useless because you'd need to set PerlHandler too. If you'd like to associate a Perl content handler with an extension, you should use the <Files> directive. Here's an example:
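<Files ~ "\.graph$">
  SetHandler  perl-script
  # Apache::Graph is a placeholder; name your own handler module here
  PerlHandler Apache::Graph
</Files>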
There is no UnSetHandler directive to undo the effects of SetHandler. However, should you ever need to restore a subdirectory's handler to the default, you can do it with the directive
SetHandler default-handler, as follows:
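One possible arrangement, assuming a hypothetical /graph location handled by mod_perl with a /graph/tutorial subdirectory that should serve plain files again, is sketched here. The paths and the Apache::Graph module name are illustrative only.

```apache
<Location /graph>
  SetHandler  perl-script
  # Apache::Graph is a hypothetical module name
  PerlHandler Apache::Graph
</Location>

# Listed after the parent section so that it overrides the parent's handler
<Location /graph/tutorial>
  SetHandler default-handler
</Location>
```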
Copyright © 1999 by O'Reilly & Associates, Inc.