Chapter 7 - Other Request Phases

Customizing the Type Checking Phase

Following the successful completion of the access control and authentication
steps (if configured), Apache tries to determine the MIME type (e.g., image/gif)
and encoding type (e.g., x-gzip) of the requested document. The types
and encodings are usually determined by filename extensions. (The term "suffix"
is used interchangeably with "extension" in the Apache source code and documentation.)
Table 7-1 lists a few common examples.
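Table 7-1. MIME Types and Encodings for Common File Extensions

MIME types
+---------------+-----------------+
| extension     | type            |
+---------------+-----------------+
| .txt          | text/plain      |
| .html, .htm   | text/html       |
| .gif          | image/gif       |
| .jpg, .jpeg   | image/jpeg      |
| .mpeg, .mpg   | video/mpeg      |
| .pdf          | application/pdf |
+---------------+-----------------+

Encodings
+---------------+-----------------+
| extension     | encoding        |
+---------------+-----------------+
| .gz           | x-gzip          |
| .Z            | x-compress      |
+---------------+-----------------+
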
By default, Apache's type checking phase is handled by the standard mod_mime module, which combines the information stored in the server's conf/mime.types file with AddType and AddEncoding directives to map file extensions onto MIME types and encodings.
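The extension-based mapping can be sketched in plain Perl. This is only a simplified model, not mod_mime's actual code: the two lookup tables are filled in by hand from Table 7-1, and the function name guess_type() is invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical lookup tables, filled in from Table 7-1. A real server
# builds its tables from conf/mime.types plus AddType/AddEncoding.
my %TYPE_FOR = (
    txt  => 'text/plain',
    html => 'text/html',  htm  => 'text/html',
    gif  => 'image/gif',
    jpg  => 'image/jpeg', jpeg => 'image/jpeg',
    mpeg => 'video/mpeg', mpg  => 'video/mpeg',
    pdf  => 'application/pdf',
);
my %ENCODING_FOR = (gz => 'x-gzip', Z => 'x-compress');

# Return (type, encoding) for a filename; either value may be undef.
sub guess_type {
    my $filename = shift;
    my ($type, $encoding);

    # Skip the base name, then examine every dot-separated suffix.
    my (undef, @suffixes) = split /\./, $filename;
    for my $suffix (@suffixes) {
        $type     = $TYPE_FOR{lc $suffix}  if exists $TYPE_FOR{lc $suffix};
        $encoding = $ENCODING_FOR{$suffix} if exists $ENCODING_FOR{$suffix};
    }
    return ($type, $encoding);
}

my ($type, $encoding) = guess_type('manual.html.gz');
print "$type $encoding\n";    # prints "text/html x-gzip"
```

A file with no recognized suffix, such as test7, comes back with both values undefined, which is the case a custom type handler can step in to cover.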
The contents of the request record's content_type field are used to set the default outgoing Content-Type header, which the client uses to decide how to render the document. However, as we've seen, content handlers can, and often do, change the content type during the later response phase.
In addition to its responsibility for choosing MIME and encoding types for
the requested document, the type checking phase handler also performs the crucial
task of selecting the content handler for the document. mod_mime looks
first for a SetHandler directive in the current directory or location.
If one is set, it uses that handler for the requested document. Otherwise, it
dispatches the request based on the MIME type of the document. This process
was described in more detail at the beginning of Chapter 4.
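The order of that decision can be modeled in a few lines of Perl. This is a hypothetical sketch of the logic just described, not the server's implementation: choose_handler() and its arguments are invented names, and the type-to-handler table holds a single illustrative entry.

```perl
use strict;
use warnings;

# Hypothetical model of the choice described above. $set_handler is the
# value of a SetHandler directive (undef if none applies); %$by_type
# maps MIME types to handler names.
sub choose_handler {
    my ($set_handler, $content_type, $by_type) = @_;
    return $set_handler if defined $set_handler;    # SetHandler wins
    return undef unless defined $content_type;      # nothing to dispatch on
    return $by_type->{$content_type};               # else dispatch on type
}

my %by_type = ('application/x-httpd-cgi' => 'cgi-script');
print choose_handler('perl-script', 'text/html', \%by_type), "\n";
print choose_handler(undef, 'application/x-httpd-cgi', \%by_type), "\n";
```

When the sketch returns undef, no specific handler was found; in the real server that is the point at which the default content handler takes over.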
Also see "Reimplementing mod_mime in Perl," in Chapter 8,
Customizing the Apache Configuration Process, where we reproduce
all of mod_mime's functionality with a Perl module.

A DBI-Based Type Checker

In this section, we'll show you a simple type checking handler that determines
the MIME type of the document on the basis of a DBI database lookup. Each record
of the database table will contain the name of the file, its MIME type,
and its encoding (see footnote 6). If no type is registered in
the database, we fall through to the default mod_mime handler.
This module, Apache::MimeDBI, makes use of the simple Tie::DBI class that was introduced in the previous chapter. Briefly, this class lets you tie a hash to a relational database table. The tied variable appears as a hash of hashes in which the outer hash is a list of table records indexed by the table's primary key and the inner hash contains the columns of that record, indexed by column name. To give a concrete example, for the purposes of this module we'll set up a database table named doc_types having this structure:
+----------+------------+------------+
| filename | mime_type  | encoding   |
+----------+------------+------------+
| test1    | text/plain | NULL       |
| test2    | text/html  | NULL       |
| test3    | text/html  | x-compress |
| test4    | text/html  | x-gzip     |
| test5    | image/gif  | NULL       |
+----------+------------+------------+
Assuming that a hash named %DB is tied to this table, we'll be able to access its columns in this way:
$type = $DB{'test2'}{'mime_type'};
$encoding = $DB{'test2'}{'encoding'};
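To experiment with this access pattern without a database, an ordinary hash of hashes with the same shape will do. The sketch below is a stand-in, not a tied hash: the data is copied by hand from the table above, and NULL columns are written as undef, which is how DBI reports them.

```perl
use strict;
use warnings;

# In-memory stand-in for the tied %DB: the outer key is the filename,
# the inner hash holds the remaining columns of that record.
my %DB = (
    test1 => { mime_type => 'text/plain', encoding => undef },
    test2 => { mime_type => 'text/html',  encoding => undef },
    test3 => { mime_type => 'text/html',  encoding => 'x-compress' },
    test4 => { mime_type => 'text/html',  encoding => 'x-gzip' },
    test5 => { mime_type => 'image/gif',  encoding => undef },
);

for my $file (sort keys %DB) {
    my $record   = $DB{$file};
    my $encoding = defined $record->{encoding} ? $record->{encoding} : 'none';
    print "$file: $record->{mime_type} (encoding: $encoding)\n";
}
```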
Example 7-6 gives the source for Apache::MimeDBI.
package Apache::MimeDBI;
# file Apache/MimeDBI.pm
use strict;
use Apache::Constants qw(:common);
use Tie::DBI ();
use File::Basename qw(basename);
use constant DEFAULT_DSN => 'mysql:test_www';
use constant DEFAULT_LOGIN => ':';
use constant DEFAULT_TABLE => 'doc_types';
use constant DEFAULT_FIELDS => 'filename:mime_type:encoding';
The module starts by pulling in necessary Perl libraries, including Tie::DBI and the File::Basename filename parser. It also defines a series of default configuration constants. DEFAULT_DSN is the default DBI data source to use, written as colon-separated fields that begin with the driver and the database name (the default, mysql:test_www, names the test_www database under the mysql driver). DEFAULT_LOGIN is the username and password for the web server to use to log into the database, separated by a : character. Both fields are blank by default, indicating that no username or password needs to be provided. DEFAULT_TABLE is the name of the table in which to look for the MIME type and encoding information. DEFAULT_FIELDS holds the names of the filename, MIME type, and encoding columns, again separated by the : character. These default values can be overridden with the per-directory Perl configuration variables MIMEDatabase, MIMELogin, MIMETable, and MIMEFields.
sub handler {
    my $r = shift;

    # get filename
    my $file = basename $r->filename;

    # get configuration information
    my $dsn   = $r->dir_config('MIMEDatabase') || DEFAULT_DSN;
    my $table = $r->dir_config('MIMETable')    || DEFAULT_TABLE;
    my($filefield, $mimefield, $encodingfield) =
        split ':', $r->dir_config('MIMEFields') || DEFAULT_FIELDS;
    my($user, $pass) =
        split ':', $r->dir_config('MIMELogin') || DEFAULT_LOGIN;
The handler() subroutine begins by shifting the request object off the subroutine call stack and using it to recover the requested document's filename. The directory part of the filename is then stripped away using the basename() routine imported from File::Basename. Next, we fetch the values of our four configuration variables. If any are undefined, we default to the values defined by the previously declared constants.
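The fallback-and-split idiom can be tried outside the server. In the sketch below, dir_config() is a hypothetical stand-in for the request object's method, backed by a plain hash, and the login value www:secret is made up for the example.

```perl
use strict;
use warnings;
use constant DEFAULT_LOGIN  => ':';
use constant DEFAULT_FIELDS => 'filename:mime_type:encoding';

# Hypothetical stand-in for $r->dir_config: undef for unset variables.
my %config = (MIMELogin => 'www:secret');
sub dir_config { return $config{ $_[0] } }

# || binds more tightly than the comma, so split sees whichever string
# was chosen: the configured value if set, the default otherwise.
my ($filefield, $mimefield, $encodingfield) =
    split ':', dir_config('MIMEFields') || DEFAULT_FIELDS;
my ($user, $pass) =
    split ':', dir_config('MIMELogin') || DEFAULT_LOGIN;

# With the default ':' both login fields come back undefined, because
# split discards trailing empty fields.
my ($no_user, $no_pass) = split ':', DEFAULT_LOGIN;

print "columns: $filefield, $mimefield, $encodingfield\n";
print "login:   $user\n";
print "default login is ", (defined $no_user ? 'set' : 'unset'), "\n";
```

The last case is why a DEFAULT_LOGIN of ':' works as "no username, no password": both variables end up undefined rather than holding empty strings.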
    tie my %DB, 'Tie::DBI', {
        'db' => $dsn, 'table' => $table, 'key' => $filefield,
        'user' => $user, 'password' => $pass,
    };
    my $record;
We now tie a hash named %DB to the indicated database by calling the tie() operator. If the hash is successfully tied to the database, this routine will return a true value (actually, an object reference to the underlying Tie::DBI object itself). Otherwise, we return a value of DECLINED and allow other modules their chance at the MIME checking phase.
    return DECLINED unless tied %DB and $record = $DB{$file};
The next step is to check the tied hash to see if there is a record corresponding to the current filename. If there is, we store the record in a variable named $record. Otherwise, we again return DECLINED. This allows files that are not specifically named in the database to fall through to the standard file extension-based MIME type determination.
    $r->content_type($record->{$mimefield});
    $r->content_encoding($record->{$encodingfield})
        if $record->{$encodingfield};
Since the file is listed in the database, we fetch the values of the MIME type and encoding columns and write them into the request record by calling the request object's content_type() and content_encoding(), respectively. Since most documents do not have an encoding type, we only call content_encoding() if the column is defined.
    return OK;
}
Our work is done, so we exit the handler subroutine with an OK status code.
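The control flow of the handler can be exercised without Apache or a database. The following is a sketch under stated assumptions: MockRequest is a hypothetical class that imitates only the three request methods used here, the OK and DECLINED constants are local stand-ins for the ones Apache::Constants exports, and %DB is a plain hash rather than a tied one.

```perl
use strict;
use warnings;

# Local stand-ins for the Apache::Constants status codes.
use constant OK       => 0;
use constant DECLINED => -1;

# Hypothetical mock of the request object, just enough for this sketch.
package MockRequest;
sub new {
    my ($class, $filename) = @_;
    return bless { filename => $filename }, $class;
}
sub filename { $_[0]{filename} }
sub content_type {
    my $self = shift;
    $self->{content_type} = shift if @_;
    return $self->{content_type};
}
sub content_encoding {
    my $self = shift;
    $self->{content_encoding} = shift if @_;
    return $self->{content_encoding};
}

package main;
use File::Basename qw(basename);

# Plain hash standing in for the tied doc_types table.
my %DB = (
    test1 => { mime_type => 'text/plain', encoding => undef },
    test4 => { mime_type => 'text/html',  encoding => 'x-gzip' },
);

sub type_handler {
    my $r      = shift;
    my $file   = basename($r->filename);
    my $record = $DB{$file} or return DECLINED;    # unknown file: fall through

    $r->content_type($record->{mime_type});
    $r->content_encoding($record->{encoding}) if $record->{encoding};
    return OK;
}

for my $path ('/docs/test4', '/docs/test6.html') {
    my $r      = MockRequest->new($path);
    my $status = type_handler($r);
    printf "%s => %s\n", $path, $status == OK ? $r->content_type : 'DECLINED';
}
```

The second path has no record, so the sketch declines it, which in the real server is what hands the request on to mod_mime.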
At the end of the code listing is a short shell script which you can use to initialize a test database named test_www. It will create the table shown in this example.
To install this module, add a PerlTypeHandler directive like this one to one of the configuration files or a .htaccess file:
<Location /mimedbi>
PerlTypeHandler Apache::MimeDBI
</Location>
If you need to change the name of the database, the login information, or the table structure, be sure to include the appropriate PerlSetVar directives as well.
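As an illustration, a configuration that overrides all four settings might look like the fragment below. The variable names are the ones the module reads; every value shown (database, login, table, and column names) is a made-up placeholder to be replaced with your own.

```
<Location /mimedbi>
  PerlTypeHandler Apache::MimeDBI
  # All values below are hypothetical examples
  PerlSetVar MIMEDatabase mysql:www_docs
  PerlSetVar MIMELogin    webuser:secret
  PerlSetVar MIMETable    file_types
  PerlSetVar MIMEFields   name:type:enc
</Location>
```

Any variable left out keeps the default defined by the corresponding constant in the module.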
Figure 7-2 shows the automatic listing of a directory
under the control of Apache::MimeDBI. The directory contains several
files. test1 through test5 are listed in the database
with the MIME types and encodings shown in the previous table. Their icons reflect
the MIME types and encodings returned by the handler subroutine. This MIME type
will also be passed to the browser when it loads and renders the document. test6.html
doesn't have an entry in the database, so it falls through to the standard MIME
checking module, which figures out its type through its file extension. test7
has neither an entry in the database nor a recognized file extension, so it
is displayed with the "unknown document" icon. Without help from Apache::MimeDBI,
all the files without extensions would end up as unknown MIME types.
Figure 7-2. An automatic listing of a directory controlled by Apache::MimeDBI

If you use this module, you should be sure to install and load Apache::DBI
during the server startup phase, as described in Chapter 5.
This will make the underlying database connections persistent, dramatically
decreasing the time necessary for the handler to do its work.
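A minimal sketch of that setup, assuming the modules are loaded from the main server configuration file: Apache::DBI is listed first so that it is in place before any module that opens DBI connections.

```
# Load Apache::DBI before modules that use DBI
PerlModule Apache::DBI
PerlModule Apache::MimeDBI
```

Chapter 5 remains the reference for the details; this fragment only shows the load order.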
Example 7-6. A DBI-Based MIME Type Checker

package Apache::MimeDBI;
# file Apache/MimeDBI.pm
use strict;
use Apache::Constants qw(:common);
use Tie::DBI ();
use File::Basename qw(basename);
use constant DEFAULT_DSN => 'mysql:test_www';
use constant DEFAULT_LOGIN => ':';
use constant DEFAULT_TABLE => 'doc_types';
use constant DEFAULT_FIELDS => 'filename:mime_type:encoding';
sub handler {
    my $r = shift;

    # get filename
    my $file = basename $r->filename;

    # get configuration information
    my $dsn   = $r->dir_config('MIMEDatabase') || DEFAULT_DSN;
    my $table = $r->dir_config('MIMETable')    || DEFAULT_TABLE;
    my($filefield, $mimefield, $encodingfield) =
        split ':', $r->dir_config('MIMEFields') || DEFAULT_FIELDS;
    my($user, $pass) =
        split ':', $r->dir_config('MIMELogin') || DEFAULT_LOGIN;

    # pull information out of the database
    tie my %DB, 'Tie::DBI', {
        'db' => $dsn, 'table' => $table, 'key' => $filefield,
        'user' => $user, 'password' => $pass,
    };
    my $record;
    return DECLINED unless tied %DB and $record = $DB{$file};

    # set the content type and encoding
    $r->content_type($record->{$mimefield});
    $r->content_encoding($record->{$encodingfield})
        if $record->{$encodingfield};

    return OK;
}
1;
__END__
# Here's a shell script to add the test data:
#!/bin/sh
mysql test_www <<END
DROP TABLE IF EXISTS doc_types;
CREATE TABLE doc_types (
    filename  char(127) primary key,
    mime_type char(30) not null,
    encoding  char(30)
);
INSERT into doc_types values ('test1','text/plain',null);
INSERT into doc_types values ('test2','text/html',null);
INSERT into doc_types values ('test3','text/html','x-compress');
INSERT into doc_types values ('test4','text/html','x-gzip');
INSERT into doc_types values ('test5','image/gif',null);
END
Footnotes

6 An obvious limitation of this module is that it can't distinguish
between similarly named files in different directories.
Copyright © 1999 by O'Reilly & Associates, Inc.