Chapter 3 - The Apache Module Architecture and API / The Handler API
Perl API Configuration Directives
This section lists the configuration directives that the Perl API makes available. Most of these directives install handlers, but there are a few that affect the Perl engine in other ways.
- PerlRequire
- PerlModule
These directives are used to load Perl modules and files from disk. Both
are implemented using the Perl built-in require operator. However,
there are subtle differences between the two. A PerlModule must
be a "bareword," that is, a package name without any path information. Perl
will search the @INC paths for a .pm file that
matches the name. Example:
PerlModule Apache::Plotter
This will do the same as either of the following Perl language statements:
require Apache::Plotter;
use Apache::Plotter ();
In contrast, the PerlRequire directive expects an absolute or
relative path to a file. The Perl API will enclose the path in quotes, then
pass it to the require function. If you use a relative path, Perl
will search through the @INC list for a match. Examples:
PerlRequire /opt/www/lib/directory_colorizer.pl
PerlRequire scripts/delete_temporary_files.pl
This will do the same as the following Perl language statements:
require '/opt/www/lib/directory_colorizer.pl';
require 'scripts/delete_temporary_files.pl';
As with modules and files pulled in directly by the require operator,
PerlRequire and PerlModule also require the modules to return
a true value (usually 1) to indicate that they were evaluated successfully.
Like require, these directives add each loaded file to the %INC
hash so that it will not be evaluated more than once. The Apache::StatINC
module and the PerlFreshRestart directive can alter this behavior
so modules can be reloaded.
Both directives will accept any number of modules and files:
PerlModule CGI LWP::Simple Apache::Plotter
PerlRequire scripts/startup.pl scripts/config.pl
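The once-only behavior of %INC and the need for a true return value can be observed with plain Perl, outside the server. The following is a minimal sketch and not part of mod_perl; the file name counter_lib.pl and the $load_count variable are made up for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write a tiny throwaway library file (hypothetical name) that counts
# how many times it is evaluated.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/counter_lib.pl";
open my $fh, '>', $file or die "cannot write $file: $!";
print {$fh} "\$main::load_count++;\n1;  # true value tells require it succeeded\n";
close $fh;

our $load_count = 0;
require $file;    # first call evaluates the file and records it in %INC
require $file;    # second call finds the %INC entry and does nothing

print "evaluated $load_count time(s)\n";    # prints: evaluated 1 time(s)
print "recorded in %INC\n" if exists $INC{$file};
```

Deleting the %INC entry before the second require would cause the file to be evaluated again.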
All PerlModule and PerlRequire files are loaded by mod_perl
at server startup, during the module_init phase.
The value of the ServerRoot directive is added to the @INC
paths by mod_perl as an added convenience.
Remember that all the code that is run at server initialization time is
run with root privileges when the server is bound to a privileged port, such
as the default 80. This means that anyone who has write access to one of the
server configuration files, or who has write access to a script or module
that is loaded by PerlModule or PerlRequire, effectively
has superuser access to the system. There is a new PerlOpmask directive
and PERL_OPMASK_DEFAULT compile-time option, currently in the experimental
stages, for disabling possibly dangerous operators.
The PerlModule and PerlRequire directives are also permitted
in .htaccess files. The modules and files they name will be loaded at request
time and run as the unprivileged web user.
- PerlChildInitHandler
This directive installs a handler that is called immediately after a
child process is launched. On Unix systems, it is called every time the
parent process forks a new child to add to the flock of listening daemons.
The handler is called only once in the Win32 version of Apache because
that server uses a single-process model.
In contrast to the server initialization phase, the child will be running
as an unprivileged user when this handler is called. All child_init
handlers will be called unless one aborts by logging an error message
and calling exit() to terminate the process. Example:
PerlChildInitHandler Apache::DBLogin
This directive can appear in the main configuration files and within virtual
host sections, but not within <Directory>, <Location>,
or <Files> sections or within .htaccess files.
- PerlPostReadRequestHandler
The post_read_request handler is called every time an Apache
process receives an incoming request, at the point at which the server
has read the incoming request's data and parsed the HTTP header fields
but before the server has translated the URI to a filename. It is called
once per transaction and is intended to allow modules to step in and perform
special processing on the incoming data. However, because there's no way
for modules to step in and actually contribute to the parsing of the HTTP
header, this phase is more often used just as a convenient place to do
processing that must occur once per transaction. All post_read_request
handlers will be called unless one aborts by returning an error code or
terminating the phase with DONE. Example:
PerlPostReadRequestHandler Apache::StartTimer
This directive can appear in the main configuration files and within virtual
host sections but not within <Directory>, <Location>,
or <Files> sections or within .htaccess files. The
reason for this restriction is simply that the request has not yet been
associated with a particular filename or directory.
- PerlInitHandler
When found at the "top level" of a configuration file, that is, outside
of any <Location>, <Directory>, or <Files>
sections, this handler is an alias for PerlPostReadRequestHandler.
When found inside one of these containers, this handler is an alias for
PerlHeaderParserHandler, described later. Its name makes it easy
to remember that this is the first handler invoked when serving an HTTP
request.
- PerlTransHandler
The uri_translate handler is invoked after Apache has parsed
out the request. Its job is to take the request, which is in the form
of a partial URI, and transform it into a filename.
The handler can also step in to alter the URI itself, to change the
request method, or to install new handlers based on the URI. The URI translation
phase is often used to recognize and handle proxy requests; we give examples
in Chapter 7. Example:
PerlTransHandler Apache::AdBlocker
Apache will walk through the registered uri_translate handlers
until one returns a status other than DECLINED. This is in contrast to
most of the other phases, for which Apache will continue to invoke registered
handlers even after one has returned OK.
Like PerlPostReadRequestHandler, the PerlTransHandler
directive may appear in the main configuration files and within virtual
host sections but not within <Directory>, <Location>,
or <Files> sections or within .htaccess files. This
is because the request has not yet been associated with a particular file
or directory.
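A URI translation handler of the kind described above might look like the following sketch. The module name Apache::MapDocs, the /docs/ prefix, and the target directory are all hypothetical. The status constants are declared locally, using the numeric values from Apache 1.3, only so that the code can be exercised outside the server. The uri() and filename() calls are assumed to be the mod_perl 1.x request object methods.

```perl
package Apache::MapDocs;    # hypothetical module name
use strict;
use warnings;

# Stand-ins so the sketch also runs outside the server; under mod_perl
# you would write: use Apache::Constants qw(OK DECLINED);
use constant { OK => 0, DECLINED => -1 };

my $PREFIX = '/docs/';               # hypothetical URI prefix
my $ROOT   = '/opt/www/manuals/';    # hypothetical directory on disk

sub handler {
    my $r   = shift;                 # the Apache request object
    my $uri = $r->uri;

    # Not ours: let the next uri_translate handler (or Apache's default) try.
    return DECLINED unless index($uri, $PREFIX) == 0;

    # Map the remainder of the URI onto the directory and end the phase.
    $r->filename($ROOT . substr($uri, length $PREFIX));
    return OK;
}

1;
```

Because the phase stops at the first handler that returns something other than DECLINED, returning OK here means that no later translation handler sees the request.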
- PerlHeaderParserHandler
After the URI translation phase, Apache gives you another chance
to examine the request headers and to take special action in the header_parser
phase. Unlike the post_read_request phase, at this point the
URI has been mapped to a physical pathname. Therefore PerlHeaderParserHandler
is the first handler directive that can appear within <Directory>,
<Location>, or <Files> sections or within
.htaccess files.
The header_parser phase is free to examine and change request
fields in the HTTP header, or even to abort the transaction entirely.
For this reason, it's common to use this phase to block abusive robots
before they start chewing into the resources that may be required in the
phases that follow. All registered header_parser handlers will
be run unless one returns an error code or DONE. Example:
PerlHeaderParserHandler Apache::BlockRobots
- PerlAccessHandler
The access_checker handler is the first of three handlers that
are involved in authentication and authorization. We go into this topic
in greater depth in Chapter 6.
The access_checker handler is designed to do simple access
control based on the browser's IP address, hostname, phase of the moon,
or other aspects of the transaction that have nothing to do with the remote
user's identity. The handler is expected to return OK to
allow the transaction to continue, FORBIDDEN to abort the
transaction with an unauthorized access error, or DECLINED
to punt the decision to the next handler. Apache will continue to step
through all registered access handlers until one returns a code other
than DECLINED or OK. Example:
PerlAccessHandler Apache::DayLimit
The PerlAccessHandler directive can occur anywhere, including
<Directory> sections and .htaccess files.
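As an illustration of the return codes just described, here is a minimal sketch of an access handler that admits only clients from one address range. The module name Apache::LocalOnly and the 192.168. prefix are hypothetical. The constants are local stand-ins with the Apache 1.3 numeric values. The connection->remote_ip, log_reason, and filename calls are assumed to be the mod_perl 1.x request API.

```perl
package Apache::LocalOnly;    # hypothetical module name
use strict;
use warnings;

# Stand-ins so the sketch also runs outside the server; under mod_perl
# you would write: use Apache::Constants qw(OK FORBIDDEN);
use constant { OK => 0, FORBIDDEN => 403 };

my $ALLOWED_PREFIX = '192.168.';    # hypothetical private network

sub handler {
    my $r  = shift;
    my $ip = $r->connection->remote_ip;

    # The decision depends only on where the request came from,
    # not on who the remote user is.
    return OK if index($ip, $ALLOWED_PREFIX) == 0;

    $r->log_reason("access from $ip refused", $r->filename);
    return FORBIDDEN;
}

1;
```

Returning OK does not end the phase: any other registered access handlers still get a chance to refuse the request.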
- PerlAuthenHandler
The authentication handler (sometimes referred to in the Apache
documentation as check_user_id) is called whenever the requested
file or directory is password-protected. This, in turn, requires that
the directory be associated with AuthName, AuthType,
and at least one require directive. The interactions among these
directives are covered more fully in Chapter 6.
It is the job of the authentication handler to check a user's
identification credentials, usually by checking the username and password
against a database. If the credentials check out, the handler should return
OK. Otherwise the handler returns AUTH_REQUIRED
to indicate that the user has not authenticated successfully. When Apache
sends the HTTP header with this code, the browser will normally pop up
a dialog box that prompts the user for login information.
Apache will call all registered authentication handlers, only
ending the phase after the last handler has had a chance to weigh in on
the decision or when a handler aborts the transaction by returning AUTH_REQUIRED
or another error code. As usual, handlers may also return DECLINED
to defer the decision to the next handler in line. Example:
PerlAuthenHandler Apache::AuthAnon
PerlAuthenHandler can occur anywhere in the server configuration
or in .htaccess files.
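The credential check described above could be sketched as follows for Basic authentication. The module name Apache::AuthSketch and the in-memory password table are hypothetical stand-ins for a real user database. The constants are declared locally with the Apache 1.3 numeric values. The get_basic_auth_pw, connection->user, note_basic_auth_failure, and log_reason calls are assumed to be the mod_perl 1.x request API.

```perl
package Apache::AuthSketch;    # hypothetical module name
use strict;
use warnings;

# Stand-ins so the sketch also runs outside the server; under mod_perl
# you would write: use Apache::Constants qw(OK AUTH_REQUIRED);
use constant { OK => 0, AUTH_REQUIRED => 401 };

# Hypothetical stand-in for a user database. Plain-text passwords are
# for illustration only.
my %PASSWORD_FOR = ( fred => 'bedrock' );

sub handler {
    my $r = shift;

    # Ask Apache for the Basic authentication password; a status other
    # than OK means the browser has not sent usable credentials yet.
    my ($status, $sent_pw) = $r->get_basic_auth_pw;
    return $status unless $status == OK;

    my $user     = $r->connection->user;
    my $expected = defined $user ? $PASSWORD_FOR{$user} : undef;

    unless (defined $expected && defined $sent_pw && $sent_pw eq $expected) {
        # Have Apache send the challenge that triggers the login dialog.
        $r->note_basic_auth_failure;
        $r->log_reason('user not authenticated', $r->filename);
        return AUTH_REQUIRED;
    }
    return OK;
}

1;
```

A handler that recognizes only some users could return DECLINED instead of AUTH_REQUIRED for unknown names, leaving the decision to the next handler in line.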
- PerlAuthzHandler
Provided that the authentication handler has successfully verified the
user's identity, the transaction passes into the authorization
handler, where the server determines whether the authenticated user is
authorized to access the requested URI. This is often used in conjunction
with databases to restrict access to a document based on the user's membership
in a particular group. However, the authorization handler can base its
decision on anything that can be derived from the user's name, such as
the user's position in an organizational chart or the user's gender.
Handlers for the authorization phase are only called when the file or
directory is password-protected, using the same criteria described earlier
for authentication. The handler is expected to return DECLINED
to defer the decision, OK to indicate its acceptance of the
user's authorization, or AUTH_REQUIRED to indicate that the
user is not authorized to access the requested document. As with authentication
handlers, Apache will try all the authorization handlers in turn until
one returns AUTH_REQUIRED or another error code.
The authorization handler interacts with the require
directive in a way described fully in Chapter 6.
Example:
PerlAuthzHandler Apache::AuthzGender
The PerlAuthzHandler directive can occur anywhere in the server
configuration files or in individual .htaccess files.
- PerlTypeHandler
After the optional access control and authentication phases, Apache
enters the type_checker phase. It is the responsibility of the
type_checker handler to assign a provisional MIME type to the
requested document. The assigned MIME type will be taken into consideration
when Apache decides what content handler to call to generate the body
of the document. Because content handlers are free to change the MIME
types of the documents they process, the MIME type chosen during the type
checking phase is not necessarily the same MIME type that is ultimately
sent to the browser. The type checker is also used by Apache's automatic
directory indexing routines to decide what icon to display next to the
filename.
The default Apache type checker generally just looks up the filename
extension in a table of MIME types. By declaring a custom type checker,
you can replace this with something more sophisticated, such as looking
up the file's MIME type in a document management database.
Because it makes no sense to have multiple handlers trying to set the
MIME type of a file according to different sets of rules, the type checker
handlers behave like content handlers and URI translation handlers. Apache
steps through each registered handler in turn until one returns OK
or aborts with an error code. The phase finishes as soon as one module
indicates that it has successfully handled the transaction. Example:
PerlTypeHandler Apache::MimeDBI
The PerlTypeHandler directive can occur anywhere in the server
configuration or in .htaccess files.
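A custom type checker along these lines might be sketched as below. The module name Apache::TypeTable is hypothetical, and the %TYPE_FOR hash merely stands in for the document management database mentioned above; its paths are invented. The constants are local stand-ins with the Apache 1.3 numeric values. The filename() and content_type() calls are assumed to be the mod_perl 1.x request object methods.

```perl
package Apache::TypeTable;    # hypothetical module name
use strict;
use warnings;

# Stand-ins so the sketch also runs outside the server; under mod_perl
# you would write: use Apache::Constants qw(OK DECLINED);
use constant { OK => 0, DECLINED => -1 };

# Hypothetical lookup table standing in for a document database.
my %TYPE_FOR = (
    '/opt/www/htdocs/reports/q1' => 'application/pdf',
    '/opt/www/htdocs/reports/q2' => 'text/html',
);

sub handler {
    my $r    = shift;
    my $type = $TYPE_FOR{ $r->filename };

    # Unknown file: defer to the next type checker, such as the default
    # extension-based lookup.
    return DECLINED unless defined $type;

    $r->content_type($type);    # provisional MIME type for later phases
    return OK;                  # ends the type_checker phase
}

1;
```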
- PerlFixupHandler
After the type_checker phase but before the content handling
phase is an odd beast called the fixup phase. This phase is a
chance to make any last-minute changes to the transaction before the response
is sent. The fixup handler's job is like that of the restaurant
prep cook who gets all the ingredients cut, sorted, and put in their proper
places before the chef goes to work. As an example alluded to earlier,
mod_env defines a fixup handler to add variables to the environment
from configured SetEnv and PassEnv directives. These
variables are put to use by several different modules in the upcoming
response phase, including mod_cgi, mod_include, and
mod_perl.
All fixup handlers are run during an HTTP request, stopping
only when a module aborts with an error code. Example:
PerlFixupHandler Apache::HTTP::Equiv
The PerlFixupHandler directive can occur anywhere in the server
configuration files or in .htaccess files.
- PerlHandler
The next step is the content generation, or response phase,
installed by the generic-sounding PerlHandler directive. Because
of its importance, probably 90 percent of the modules you'll write will
handle this part of the transaction. The content handler is the master
chef of the Apache kitchen, taking all the ingredients assembled by the
previous phases--the URI, the translated pathname, the provisional MIME
type, and the parsed HTTP headers--whipping them up into a tasty document
and serving the result to the browser.
Apache chooses the content handler according to a set of rules governed
by the SetHandler, AddHandler, AddType, and
ForceType directives. We go into the details in Chapter 4.
For historical reasons as much as anything else, the idiom for installing
a Perl content handler uses a combination of the SetHandler and
PerlHandler directives:
<Directory /home/http/htdocs/compressed>
SetHandler perl-script
PerlHandler Apache::Uncompress
</Directory>
The SetHandler directive tells Apache that the Perl interpreter
will be the official content handler for all documents in this directory.
The PerlHandler directive in turn tells Perl to hand off responsibility
for the phase to the handler() subroutine in the Apache::Uncompress
package. If no PerlHandler directive is specified, Perl will
return an empty document.
It is also possible to use the <Files> and <FilesMatch>
directives to assign mod_perl content handlers selectively
to individual files based on their names. In this example, all files ending
with the suffix .gz are passed through Apache::Uncompress:
<FilesMatch "\.gz$">
SetHandler perl-script
PerlHandler Apache::Uncompress
</FilesMatch>
There can be only one master chef in a kitchen, and so it is with Apache
content handlers. If multiple modules have registered their desire to
be the content handler for a request, Apache will try them each in turn
until one returns OK or aborts the transaction with an error
code. If a handler returns DECLINED, Apache moves on to the
next module in the list.
The Perl API relaxes this restriction somewhat, allowing several content
handlers to collaborate to build up a composite document using a technique
called "chaining." We show you how to take advantage of this feature in
the next chapter.
The PerlHandler directive can appear anywhere in Apache's configuration
files, including virtual host sections, <Location> sections,
<Directory> sections, and <Files> sections.
It can also appear in .htaccess files.
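For orientation, the handler() subroutine that PerlHandler hands the request to can be very small. The following sketch uses a hypothetical module name, Apache::Hello, and a local stand-in for the OK constant (0 in Apache 1.3). The content_type, send_http_header, print, and uri calls are assumed to be the mod_perl 1.x request object methods.

```perl
package Apache::Hello;    # hypothetical module name
use strict;
use warnings;

# Stand-in so the sketch also runs outside the server; under mod_perl
# you would write: use Apache::Constants qw(OK);
use constant OK => 0;

sub handler {
    my $r = shift;                      # the Apache request object

    $r->content_type('text/plain');     # final MIME type of the response
    $r->send_http_header;               # emit the HTTP header
    $r->print("Hello from ", $r->uri, "\n");

    return OK;                          # tell Apache the request was handled
}

1;
```

It would be installed with the same SetHandler perl-script and PerlHandler pairing shown earlier, naming Apache::Hello instead of Apache::Uncompress.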
- PerlLogHandler
Just before entering the cleanup phase, the log handler will be called
in the logging phase. This is true regardless of whether the
transaction was successfully completed or was aborted somewhere along
the way with an error. Everything known about the transaction, including
the original request, the translated file name, the MIME type, the number
of bytes sent and received, the length of time the transaction took, and
the status code returned by the last handler to be called, is passed to
the log handler in the request record. The handler typically records the
information in some way, either by writing the information to a file,
as the standard logging modules do, or by storing the information into
a relational database. Log handlers can of course do whatever they like
with the information, such as keeping a running total of the number of
bytes transferred and throwing out the rest. We show several practical
examples of log handlers in Chapter 7.
All registered log handlers are called in turn, even after one of them
returns OK. If a log handler returns an HTTP error status,
it and all the log handlers that ordinarily follow it, including the built-in
ones, will be aborted. This should be avoided unless you really want to
prevent some transactions from being logged. Example:
PerlLogHandler Apache::LogMail
The PerlLogHandler directive can occur anywhere in the server
configuration files or in .htaccess files.
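The running-total idea mentioned above can be sketched in a few lines. The module name Apache::ByteCount and the total_bytes() accessor are hypothetical. OK is a local stand-in for the Apache::Constants value (0 in Apache 1.3), and bytes_sent() is assumed to be the mod_perl 1.x request object method. Note that each child process would keep its own independent total.

```perl
package Apache::ByteCount;    # hypothetical module name
use strict;
use warnings;

# Stand-in so the sketch also runs outside the server; under mod_perl
# you would write: use Apache::Constants qw(OK);
use constant OK => 0;

# Lives for the lifetime of the process, so under Apache each child
# accumulates its own figure.
my $total_bytes = 0;

sub handler {
    my $r = shift;
    $total_bytes += $r->bytes_sent;    # keep the total, discard the rest

    # Returning OK lets the remaining log handlers, including the
    # built-in ones, run as usual.
    return OK;
}

# Hypothetical accessor for reading the total, e.g. from a status page.
sub total_bytes { return $total_bytes }

1;
```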
- PerlCleanupHandler
After each transaction is done, Apache cleans up. During this phase
any module that has registered a cleanup handler will be called. This
gives the module a chance to deallocate shared memory structures, close
databases, clean up temporary files, or do whatever other housekeeping
tasks it needs to perform. This phase is always invoked after logging,
even if some previous handlers aborted the request handling process by
returning some error code.
Internally the cleanup phase is different from the other phases we've
discussed. In fact, there isn't really a cleanup phase per se. In the
C API, modules that need to perform post-transaction housekeeping tasks
register one or more function callbacks with the resource pool that they
are passed during initialization. Before the resource pool is deallocated,
Apache calls each of the module's callbacks in turn. For this reason,
the structure of a cleanup handler routine in the C API is somewhat different
from the standard handler. It has this function prototype:
void cleanup_handler (void* data);
We discuss how to register and use C-language cleanup handlers in Chapter 10.
The Perl API simplifies the situation by making cleanup handlers look
and act like other handlers. The PerlCleanupHandler directive
installs a Perl subroutine as a cleanup handler. Modules may also use
the register_cleanup() call to install cleanup handlers themselves.
Like other handlers in the Perl API, the cleanup subroutine will be called
with the Apache request object as its argument. Unlike other handlers,
however, a cleanup handler doesn't have to return a function result. If
it does return a result code, Apache will ignore the value. An important
implication of this is that all registered cleanup functions are always
called, regardless of the status code returned by previous handlers. Example:
PerlCleanupHandler Apache::Plotter::clean_ink_cartridges
The PerlCleanupHandler directive can occur anywhere in the server
configuration files or in .htaccess files.
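As an illustration of the register_cleanup() route, the sketch below has a content handler create a scratch file and register a callback that removes it once the transaction is over. The module name Apache::ScratchFile and the scratch_file() helper are hypothetical. OK is a local stand-in for the Apache::Constants value. The request methods used are assumed to be the mod_perl 1.x API, with register_cleanup() taking a code reference.

```perl
package Apache::ScratchFile;    # hypothetical module name
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in so the sketch also runs outside the server; under mod_perl
# you would write: use Apache::Constants qw(OK);
use constant OK => 0;

# Hypothetical helper: create a scratch file and arrange for its
# removal when the transaction ends.
sub scratch_file {
    my $r = shift;
    my ($fh, $path) = tempfile();
    close $fh;

    # The callback runs after logging, whatever status the other phases
    # returned; its own return value is ignored.
    $r->register_cleanup(sub { unlink $path });
    return $path;
}

sub handler {
    my $r    = shift;
    my $path = scratch_file($r);

    # A real handler would write intermediate results to $path here.
    $r->content_type('text/plain');
    $r->send_http_header;
    $r->print("working file was $path\n");
    return OK;
}

1;
```

Registering the callback at the point where the resource is created keeps the housekeeping next to the code that needs it, instead of in a separate PerlCleanupHandler module.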
- PerlChildExitHandler
The last handler to be called is the child exit handler. This is called
just before the child server process dies. On Unix systems the child exit
handler will be called multiple times (but only once per process). On
NT systems, the exit handler is called just once before the server itself
exits. Example:
PerlChildExitHandler Apache::Plotter::close_driver
- PerlFreshRestart
When this directive is set to On, mod_perl will
reload all the modules found in %INC whenever the server
is restarted. This feature is very useful during module development because
otherwise, changes to .pm files would not take effect until the
server was completely stopped and restarted.
The standard Apache::Registry module also respects the value
of PerlFreshRestart by flushing its cache and reloading all
scripts when the server is restarted.
This directive can only appear in the main part of the configuration files
or in <VirtualHost> sections.
- PerlDispatchHandler
- PerlRestartHandler
These two handlers are not part of the Apache API, but pseudophases
added by mod_perl to give programmers the ability to fine-tune
the Perl API. They are rarely used but handy for certain specialized applications.
The PerlDispatchHandler callback, if defined, takes over the
process of loading and executing handler code. Instead of processing the
Perl*Handler directives directly, mod_perl will invoke
the routine pointed to by PerlDispatchHandler and pass it the Apache
request object and a second argument indicating the handler that would ordinarily
be invoked to process this phase. If the handler has already been compiled,
then the second argument is a CODE reference. Otherwise, it is the name
of the handler's module or subroutine.
The dispatch handler should handle the request, which it will usually do
by running the passed module's handler() method. The Apache::Safe
module, currently under development, takes advantage of PerlDispatchHandler
to put handlers into a restricted execution space using Malcolm Beattie's Safe
library.
Unlike other Perl*Handler directives, PerlDispatchHandler
must always point to a subroutine name, not to a module name. This means that
the dispatch module must be preloaded using PerlModule:
PerlModule Apache::Safe
<Files *.shtml>
PerlDispatchHandler Apache::Safe::handler
</Files>
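The calling convention described above, a request object plus either a CODE reference or a name, suggests a dispatch routine shaped roughly like the following sketch. This is not the Apache::Safe implementation. The module name Apache::TraceDispatch is hypothetical, and returning the wrapped handler's status is an assumption. SERVER_ERROR is a local stand-in for the Apache::Constants value (500). log_error() and uri() are assumed to be the mod_perl 1.x request object methods.

```perl
package Apache::TraceDispatch;    # hypothetical module name
use strict;
use warnings;

# Stand-in so the sketch also runs outside the server; under mod_perl
# you would write: use Apache::Constants qw(SERVER_ERROR);
use constant SERVER_ERROR => 500;

sub handler {
    my ($r, $target) = @_;

    my $code;
    if (ref $target eq 'CODE') {
        $code = $target;                    # handler was already compiled
    }
    else {
        no strict 'refs';                   # $target is a symbolic name here
        $code = defined &{$target}
              ? \&{$target}                 # name of a subroutine
              : $target->can('handler');    # name of a module
    }
    return SERVER_ERROR unless $code;       # nothing runnable was found

    # Extra work wrapped around every handler invocation goes here.
    $r->log_error("dispatching request for " . $r->uri);
    return $code->($r);                     # assumed: pass the status through
}

1;
```

A sketch like this assumes the named module is already loaded; a fuller version would have to load it on demand before looking up its handler() subroutine.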
PerlRestartHandler points to a routine that is called when the
server is restarted. This gives you the chance to step in and perform any
cleanup required to tweak the Perl interpreter. For example, you could use
this opportunity to trim the global @INC path or collect statistics
about the modules that have been loaded.
Copyright © 1999 by O'Reilly & Associates, Inc.