4.4. Apache Configuration in Perl

With <Perl> ... </Perl> sections, you can configure your server entirely in Perl. This is probably not worth the effort if your configuration files are simple, but if you run many virtual hosts or have a complicated setup for any other reason, <Perl> sections become very handy. With <Perl> sections you can easily create the configuration on the fly, thus reducing duplication and easing maintenance.[27]

[27]You may also find that mod_macro is useful to simplify the configuration if you have to insert many repetitive configuration snippets.

To enable <Perl> sections, build mod_perl with:

panic% perl Makefile.PL PERL_SECTIONS=1 [ ... ]

or with EVERYTHING=1.

4.4.1. Constructing <Perl> Sections

<Perl> sections can contain any Perl code, and as much of it as you wish. <Perl> sections are compiled into a special package called Apache::ReadConfig. mod_perl looks through the symbol table for Apache::ReadConfig for Perl variables and structures to grind through the Apache core configuration gears. Most of the configuration directives can be represented as scalars ($scalar) or arrays (@array). A few directives become hashes.

How do you know which Perl global variables to use? Just take the Apache directive name and prepend either $, @, or % (as shown in the following examples), depending on what the directive accepts. If you misspell the directive, it is silently ignored, so it's a good idea to check your settings.

Since Apache directives are case-insensitive, their Perl equivalents are case-insensitive as well. The following statements are equivalent:

$User = 'stas';
$user = 'stas'; # the same

Let's look at all possible cases we might encounter while configuring Apache in Perl:

Directives that accept zero or one argument are represented as scalars. For example, CacheNegotiatedDocs is a directive with no arguments. In Perl, we just assign it an empty string:

<Perl>
    $CacheNegotiatedDocs = '';
</Perl>

Directives that accept a single value are simple to handle. For example, to configure Apache so that child processes run as user httpd and group httpd, use:

User  httpd
Group httpd

What if we don't want user and group definitions to be hardcoded? Instead, what if we want to define them on the fly using the user and group with which the server is started? This is easily done with <Perl> sections:

<Perl>
    $User  = getpwuid($>) || $>;
    $Group = getgrgid($)) || $);
</Perl>

We use the power of the Perl API to retrieve the data on the fly. $User is set to the name of the effective user ID with which the server was started or, if the name is not defined, the numeric user ID. Similarly, $Group is set to either the symbolic value of the effective group ID or the numeric group ID.

Notice that we've just taken the Apache directives and prepended a $, as they represent scalars.

Directives that accept more than one argument are represented as arrays or as a space-delimited string. For example, this directive:

PerlModule Mail::Send Devel::Peek

becomes:

<Perl>
    @PerlModule = qw(Mail::Send Devel::Peek);
</Perl>

@PerlModule is an array variable, and we assign it a list of modules. Alternatively, we can use the scalar notation and pass all the arguments as a space-delimited string:

<Perl>
    $PerlModule = "Mail::Send Devel::Peek";
</Perl>

Directives that can be repeated more than once with different values are represented as arrays of arrays. For example, this configuration:

AddEncoding x-compress Z
AddEncoding x-gzip gz tgz

becomes:

<Perl>
    @AddEncoding = (
        ['x-compress' => qw(Z)],
        ['x-gzip'     => qw(gz tgz)],
    );
</Perl>
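Since @AddEncoding is an ordinary Perl array, it doesn't have to be typed in by hand. The following is a minimal sketch, assuming a hypothetical %encodings lookup table (it is not part of mod_perl). The package declaration is there only so that the snippet also runs outside the server; inside a <Perl> section it is already in effect:

```perl
package Apache::ReadConfig;
use strict;
use warnings;
our @AddEncoding;

# Hypothetical lookup table: encoding name => file extensions.
my %encodings = (
    'x-compress' => [qw(Z)],
    'x-gzip'     => [qw(gz tgz)],
);

# One inner array per AddEncoding line: the encoding name followed
# by its extensions. Sorting the keys keeps the order stable.
@AddEncoding = map { [ $_ => @{ $encodings{$_} } ] } sort keys %encodings;
```

Each inner array corresponds to one AddEncoding line in httpd.conf, just as in the handwritten version.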

Directives that implement a container block, with beginning and ending delimiters such as <Location> ... </Location>, are represented as Perl hashes. In these hashes, the keys are the arguments of the opening directive, and the values are the contents of the block. For example:

Alias /private /home/httpd/docs/private
<Location /private>
    DirectoryIndex  index.html index.htm
    AuthType        Basic
    AuthName        "Private Area"
    AuthUserFile    /home/httpd/docs/private/.htpasswd
    Require         valid-user
</Location>

These settings tell Apache that URIs starting with /private are mapped to the physical directory /home/httpd/docs/private/ and will be processed according to the following rules:

The users are to be authenticated using basic authentication.
Private Area will be used as the title of the pop-up box displaying the login and password entry form.
Only valid users listed in the password file /home/httpd/docs/private/.htpasswd and who provide a valid password may access the resources under /private/.
If the filename is not provided, Apache will attempt to respond with the index.html or index.htm directory index file, if found.

Now let's see the equivalent <Perl> section:

<Perl>
    push @Alias, qw(/private /home/httpd/docs/private);
    $Location{"/private"} = {
        DirectoryIndex => [qw(index.html index.htm)],
        AuthType       => 'Basic',
        AuthName       => '"Private Area"',
        AuthUserFile   => '/home/httpd/docs/private/.htpasswd',
        Require        => 'valid-user',
    };
</Perl>

First, we convert the Alias directive into an array @Alias. Instead of assigning, however, we push the values at the end. We do this because it's possible that we have assigned values earlier, and we don't want to overwrite them. Alternatively, you may want to push references to lists, like this:

push @Alias, [qw(/private /home/httpd/docs/private)];

Second, we convert the Location block, using /private as a key to the hash %Location and the rest of the block as its value. When the structures are nested, the normal Perl rules apply—that is, arrays and hashes turn into references. Therefore, DirectoryIndex points to an array reference. As shown earlier, we can always replace this array with a space-delimited string:

$Location{"/private"} = {
    DirectoryIndex => 'index.html index.htm',
    ...
};

Also notice how we specify the value of the AuthName attribute:

AuthName => '"Private Area"',

The value is quoted twice because Apache expects a single value for this argument, and if we write:

AuthName => 'Private Area',

<Perl> will pass two values to Apache, "Private" and "Area", and Apache will refuse to start, with the following complaint:

[Thu May 16 17:01:20 2002] [error] <Perl>: AuthName takes one
argument, The authentication realm (e.g. "Members Only")

If a block section accepts two or more identical keys (as the <VirtualHost> ... </VirtualHost> section does), the same rules as in the previous case apply, but a reference to an array of hashes is used instead.

In one company, we had to run an Intranet machine behind a NAT/firewall (using the 10.0.0.10 IP address). We decided up front to have two virtual hosts to make both the management and the programmers happy. We had the following simplistic setup:

NameVirtualHost 10.0.0.10

<VirtualHost 10.0.0.10>
    ServerName  tech.intranet
    DocumentRoot /home/httpd/docs/tech
    ServerAdmin webmaster@tech.intranet
</VirtualHost>

<VirtualHost 10.0.0.10>
    ServerName   suit.intranet
    DocumentRoot /home/httpd/docs/suit
    ServerAdmin  webmaster@suit.intranet
</VirtualHost>

In Perl, we wrote it as follows:

<Perl>
    $NameVirtualHost = '10.0.0.10';
    my $doc_root = "/home/httpd/docs";
    $VirtualHost{'10.0.0.10'} = [
        {
         ServerName   => 'tech.intranet',
         DocumentRoot => "$doc_root/tech",
         ServerAdmin  => 'webmaster@tech.intranet',
        },
        {
         ServerName   => 'suit.intranet',
         DocumentRoot => "$doc_root/suit",
         ServerAdmin  => 'webmaster@suit.intranet',
        },
    ];
</Perl>

Because normal Perl rules apply, more entries can be added as needed using push( ).[28] Let's say we want to create a special virtual host for the company's president to show off to his golf partners, but his fancy vision doesn't really fit the purpose of the Intranet site. We just let him handle his own site:

[28]For complex configurations with multiple entries, consider using the module Tie::DxHash, which implements a hash that preserves insertion order and allows duplicate keys.

push @{ $VirtualHost{'10.0.0.10'} },
    {
     ServerName   => 'president.intranet',
     DocumentRoot => "$doc_root/president",
     ServerAdmin  => 'webmaster@president.intranet',
    };

Nested block directives naturally become Perl nested data structures. Let's extend an example from the previous section:

<Perl>
    my $doc_root = "/home/httpd/docs";
    push @{ $VirtualHost{'10.0.0.10'} },
        {
         ServerName   => 'president.intranet',
         DocumentRoot => "$doc_root/president",
         ServerAdmin  => 'webmaster@president.intranet',
         Location     => {
             "/private"    => {
                 Options       => 'Indexes',
                 AllowOverride => 'None',
                 AuthType      => 'Basic',
                 AuthName      => '"Do Not Enter"',
                 AuthUserFile  => 'private/.htpasswd',
                 Require       => 'valid-user',
             },
             "/perlrun" => {
                 SetHandler     => 'perl-script',
                 PerlHandler    => 'Apache::PerlRun',
                 PerlSendHeader => 'On',
                 Options        => '+ExecCGI',
             },
         },
        };
</Perl>

We have added two Location blocks. The first, /private, is for the juicy stuff and accessible only to users listed in the president's password file. The second, /perlrun, is for running dirty Perl CGI scripts, to be handled by the Apache::PerlRun handler.
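When several virtual hosts share the same layout, the array of hashes can be generated instead of written out. Here is a minimal sketch that reuses the host names from the examples above. The package declaration is needed only when running the snippet outside a <Perl> section:

```perl
package Apache::ReadConfig;
use strict;
use warnings;
our %VirtualHost;

my $ip       = '10.0.0.10';
my $doc_root = '/home/httpd/docs';

# Build one hash per virtual host. The unary plus tells Perl that
# the inner braces are an anonymous hash and not a code block.
$VirtualHost{$ip} = [
    map {
        +{
            ServerName   => "$_.intranet",
            DocumentRoot => "$doc_root/$_",
            ServerAdmin  => "webmaster\@$_.intranet",
        }
    } qw(tech suit president)
];
```

Hosts that need extra settings, such as the Location blocks shown above, can still be adjusted individually after the list has been built.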

<Perl> sections don't provide equivalents for <IfModule> and <IfDefine> containers. Instead, you can use the module( ) and define( ) methods from the Apache package. For example:
```
<IfModule mod_ssl.c>
    Include ssl.conf
</IfModule>
```
can be written as:
```
if (Apache->module("mod_ssl.c")) {
    push @Include, "ssl.conf";
}
```
And this configuration example:
```
<IfDefine SSL>
    Include ssl.conf
</IfDefine>
```
can be written as:
```
if (Apache->define("SSL")) {
    push @Include, "ssl.conf";
}
```
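Because these are plain Perl conditions, they can be combined, and an else branch can be added, which in httpd.conf would take nested and negated containers. A sketch, assuming a hypothetical nossl.conf fallback file:

```perl
# Works only inside a <Perl> section, where the Apache class is available.
if (Apache->define("SSL") && Apache->module("mod_ssl.c")) {
    push @Include, "ssl.conf";
}
else {
    push @Include, "nossl.conf";   # hypothetical fallback file
}
```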
Now that you know how to convert the usual configuration directives to Perl code, there's no limit to what you can do with it. For example, you can put environment variables in an array and then pass them all to the children with a single configuration directive, rather than listing each one via PassEnv or PerlPassEnv:
```
<Perl>
    my @env = qw(MYSQL_HOME CVS_RSH);
    push @PerlPassEnv, \@env;
</Perl>
```
Or suppose you have a cluster of machines with similar configurations and only small distinctions between them. Ideally, you would want to maintain a single configuration file, but because the configurations aren't exactly the same (for example, the ServerName directive will have to differ), it's not quite that simple.

<Perl> sections come to the rescue. Now you can have a single configuration file and use the full power of Perl to tweak the local configuration. For example, to solve the problem of the ServerName directive, you might have this <Perl> section:
```
<Perl>
    use Sys::Hostname;
    $ServerName = hostname( );
</Perl>
```
and the right machine name will be assigned automatically.

Or, if you want to allow personal directories on all machines except the ones whose names start with secure, you can use:
```
<Perl>
    use Sys::Hostname;
    $ServerName = hostname( );
    if ($ServerName !~ /^secure/) {
        $UserDir = "public.html";
    }
</Perl>
```

4.4.2. Breaking Out of <Perl> Sections

Behind the scenes, mod_perl defines a package called Apache::ReadConfig in which it keeps all the variables that you define inside the <Perl> sections. So <Perl> sections aren't the only way to use mod_perl to configure the server: you can also place the Perl code in a separate file that will be called during the configuration parsing with either PerlModule or PerlRequire directives, or from within the startup file. All you have to do is to declare the package Apache::ReadConfig before writing any code in this file.

Using the last example from the previous section, we place the code into a file named apache_config.pl, shown in Example 4-4.

Example 4-4. apache_config.pl

package Apache::ReadConfig;

use Sys::Hostname;
$ServerName = hostname( );
if ($ServerName !~ /^secure/) {
    $UserDir = "public.html";
}
1;

Then we execute it either from httpd.conf:

PerlRequire /home/httpd/perl/lib/apache_config.pl

or from the startup.pl file:

require "/home/httpd/perl/lib/apache_config.pl";
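Since apache_config.pl uses nothing that requires a running server, it can also be loaded from the command line to see which values it would set. The script below is a sketch only: report( ) is a hypothetical helper, and the path of the file to load is expected as the first command-line argument (give an absolute path or one starting with ./ so that require( ) does not search @INC):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return one "Name value" line for each of the
# named globals in the Apache::ReadConfig package.
sub report {
    my @lines;
    for my $name (@_) {
        no strict 'refs';   # symbolic lookup of the package global
        my $value = ${"Apache::ReadConfig::$name"};
        push @lines, sprintf "%-12s %s", $name,
            defined $value ? $value : '(not set)';
    }
    return @lines;
}

# Load the configuration file the same way startup.pl would.
my $file = shift @ARGV;
require $file if defined $file;

print "$_\n" for report(qw(ServerName UserDir));
```

On a machine whose name starts with secure, UserDir should be reported as not set.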

4.4.3. Cheating with Apache->httpd_conf

In fact, you can create a complete configuration file in Perl. For example, instead of putting the following lines in httpd.conf:

NameVirtualHost         10.0.0.10

<VirtualHost 10.0.0.10>
    ServerName   tech.intranet
    DocumentRoot /home/httpd/docs/tech
    ServerAdmin  webmaster@tech.intranet
</VirtualHost>

<VirtualHost 10.0.0.10>
    ServerName   suit.intranet
    DocumentRoot /home/httpd/docs/suit
    ServerAdmin  webmaster@suit.intranet
</VirtualHost>

You can write it in Perl:

use Socket;
use Sys::Hostname;
my $hostname = hostname( );
(my $domain = $hostname) =~ s/[^.]+\.//;
my $ip = inet_ntoa(scalar gethostbyname($hostname || 'localhost'));
my $doc_root = '/home/httpd/docs';

Apache->httpd_conf(qq{
NameVirtualHost $ip

<VirtualHost $ip>
  ServerName  tech.$domain
  DocumentRoot $doc_root/tech
  ServerAdmin webmaster\@tech.$domain
</VirtualHost>

<VirtualHost $ip>
  ServerName   suit.$domain
  DocumentRoot $doc_root/suit
  ServerAdmin  webmaster\@suit.$domain
</VirtualHost>
 });

First, we prepare the data, such as deriving the domain name and IP address from the hostname. Next, we construct the configuration file in the "usual" way, but using the variables that were created on the fly. We can reuse this configuration file on many machines, and it will work anywhere without any need for adjustment.

Now consider that you have many more virtual hosts with a similar configuration. You have probably already guessed what we are going to do next:

use Socket;
use Sys::Hostname;
my $hostname = hostname( );
(my $domain = $hostname) =~ s/[^.]+\.//;
my $ip = inet_ntoa(scalar gethostbyname($hostname || 'localhost'));
my $doc_root = '/home/httpd/docs';
my @vhosts = qw(suit tech president);

Apache->httpd_conf("NameVirtualHost $ip");

for my $vh (@vhosts) {
  Apache->httpd_conf(qq{
<VirtualHost $ip>
  ServerName  $vh.$domain
  DocumentRoot $doc_root/$vh
  ServerAdmin webmaster\@$vh.$domain
</VirtualHost>
 });
}

In the loop, we create new virtual hosts. If we need to create 100 hosts, it doesn't take a long time—just adjust the @vhosts array.
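It can be handy to look at the generated text before Apache sees it. One way to do that, sketched here with a hypothetical vhost_block( ) helper and fixed illustrative values in place of the ones derived from the hostname, is to build the whole configuration as a string first:

```perl
use strict;
use warnings;

# Hypothetical helper: return the text of one <VirtualHost> block.
sub vhost_block {
    my ($ip, $vh, $domain, $doc_root) = @_;
    return <<"EOC";
<VirtualHost $ip>
  ServerName   $vh.$domain
  DocumentRoot $doc_root/$vh
  ServerAdmin  webmaster\@$vh.$domain
</VirtualHost>
EOC
}

# Illustrative values; in the real file they would be computed
# from hostname( ) as in the previous examples.
my $conf = join "\n",
    "NameVirtualHost 10.0.0.10\n",
    map { vhost_block('10.0.0.10', $_, 'intranet', '/home/httpd/docs') }
        qw(suit tech president);

print $conf;                   # inspect the generated configuration
# Apache->httpd_conf($conf);   # inside the server, pass it to Apache
```

The same string can be printed during testing and passed to Apache->httpd_conf in production, so what you inspect is exactly what the server receives.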

4.4.4. Declaring Package Names in Perl Sections

Be careful when you declare package names inside <Perl> sections. For example, this code has a problem:

<Perl>
    package Book::Trans;
    use Apache::Constants qw(:common);
    sub handler { OK }

    $PerlTransHandler = "Book::Trans";
</Perl>

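When you put code inside a <Perl> section, by default it goes into the Apache::ReadConfig package, which is already declared for you. Here, however, the package Book::Trans declaration is still in effect when $PerlTransHandler is assigned, so the assignment actually creates $Book::Trans::PerlTransHandler. This means that the PerlTransHandler we tried to define will be ignored, since it's not a global variable in the Apache::ReadConfig package.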

If you define a different package name within a <Perl> section, make sure to close the scope of that package and return to the Apache::ReadConfig package when you want to define the configuration directives. You can do this by either explicitly declaring the Apache::ReadConfig package:

<Perl>
    package Book::Trans;
    use Apache::Constants qw(:common);
    sub handler { OK }

    package Apache::ReadConfig;
    $PerlTransHandler = "Book::Trans";
</Perl>

or putting the code that resides in a different package into a block:

<Perl>
    {
        package Book::Trans;
        use Apache::Constants qw(:common);
        sub handler { OK }
    }

    $PerlTransHandler = "Book::Trans";
</Perl>

so that when the block is over, the Book::Trans package's scope is over, and you can use the configuration variables again.
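The effect of the stray package declaration can be observed in plain Perl, without a server. This sketch reproduces the problem case and then checks where the variable ended up. The first package line only imitates the package in which a <Perl> section starts:

```perl
use warnings;
no warnings 'once';          # each global below is named only once

package Apache::ReadConfig;  # where a <Perl> section starts

# Problem case: this declaration stays in effect until the end of
# the enclosing block or file, so the assignment below creates
# $Book::Trans::PerlTransHandler.
package Book::Trans;
sub handler { 1 }
$PerlTransHandler = "Book::Trans";

package main;
my $seen = defined $Apache::ReadConfig::PerlTransHandler ? 'set' : 'not set';
print "Apache::ReadConfig::PerlTransHandler is $seen\n";
print "Book::Trans::PerlTransHandler is $Book::Trans::PerlTransHandler\n";
```

Wrapping the Book::Trans code in a block, or declaring package Apache::ReadConfig again before the assignment, puts the variable where mod_perl looks for it.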

However, it's probably a good idea to use <Perl> sections only to create or adjust configuration directives. If you need to run some other code not related to configuration, it might be better to place it in the startup file or in its own module. Your mileage may vary, of course.

4.4.5. Verifying <Perl> Sections

How do we know whether the configuration made inside <Perl> sections was correct?

First we need to check the validity of the Perl syntax. To do that, we should turn the section into a Perl script by adding #!perl at the top of the section and __END__ at the bottom:

<Perl>
#!perl
# ... code here ...
__END__
</Perl>

Notice that #!perl and __END__ must start in column zero. Also, the same rules as we saw earlier with validation of the startup file apply: if the <Perl> section includes some modules that can be loaded only when mod_perl is running, this validation is not applicable.

Now we may run:

perl -cx httpd.conf

If the Perl code doesn't compile, the server won't start. If the Perl code is syntactically correct, but the generated Apache configuration is invalid, <Perl> sections will just log a warning and carry on, since there might be globals in the section that are not intended for the configuration at all.

If you have more than one <Perl> section, you will have to repeat this procedure for each section, to make sure they all work.

To check the Apache configuration syntax, you can use the variable $Apache::Server::StrictPerlSections, added in mod_perl Version 1.22. If you set this variable to a true value:

$Apache::Server::StrictPerlSections = 1;

then mod_perl will not tolerate invalid Apache configuration syntax and will croak (die) if it encounters invalid syntax. The default value is 0. If you don't set $Apache::Server::StrictPerlSections to 1, you should localize variables unrelated to configuration with my( ) to avoid errors.
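The following sketch shows what the body of a strict section might look like. It assumes that the variable may be set at the top of the section itself, which the text above does not state explicitly:

```perl
# Contents of a <Perl> section.

# Die on invalid directives instead of just logging a warning.
$Apache::Server::StrictPerlSections = 1;

# A lexical never enters the Apache::ReadConfig symbol table,
# so mod_perl does not try to treat it as a directive.
my $doc_root = "/home/httpd/docs";

# A real directive, assigned to a package global as usual.
$DocumentRoot = "$doc_root/mine";
```

Had $doc_root been a global, mod_perl would have tried to feed a nonexistent doc_root directive to Apache, and in strict mode that would be fatal.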

If the syntax is correct, the next thing we need to look at is the parsed configuration as seen by Perl. There are two ways to see it. First, we can dump it at the end of the section:

<Perl>
    use Apache::PerlSections ( );
    # code goes here
    print STDERR Apache::PerlSections->dump( );
</Perl>

Here, we load the Apache::PerlSections module at the beginning of the section, and at the end we can use its dump( ) method to print out the configuration as seen by Perl. Notice that only the configuration created in the section will be seen in the dump. No plain Apache configuration can be found there.

For example, if we adjust this section (parts of which we have seen before) to dump the parsed contents:

<Perl>
    use Apache::PerlSections ( );
    $User  = getpwuid($>) || $>;
    $Group = getgrgid($)) || $);
    push @Alias, [qw(/private /home/httpd/docs/private)];
    my $doc_root = "/home/httpd/docs";
    push @{ $VirtualHost{'10.0.0.10'} },
        {
         ServerName   => 'president.intranet',
         DocumentRoot => "$doc_root/president",
         ServerAdmin  => 'webmaster@president.intranet',
         Location     => {
             "/private"    => {
                 Options       => 'Indexes',
                 AllowOverride => 'None',
                 AuthType      => 'Basic',
                 AuthName      => '"Do Not Enter"',
                 AuthUserFile  => 'private/.htpasswd',
                 Require       => 'valid-user',
             },
             "/perlrun" => {
                 SetHandler     => 'perl-script',
                 PerlHandler    => 'Apache::PerlRun',
                 PerlSendHeader => 'On',
                 Options        => '+ExecCGI',
             },
         },
        };
    print STDERR Apache::PerlSections->dump( );
</Perl>

This is what we get as a dump:

package Apache::ReadConfig;
#hashes:

%VirtualHost = (
  '10.0.0.10' => [
    {
      'Location' => {
        '/private' => {
          'AllowOverride' => 'None',
          'AuthType' => 'Basic',
          'Options' => 'Indexes',
          'AuthUserFile' => 'private/.htpasswd',
          'AuthName' => '"Do Not Enter"',
          'Require' => 'valid-user'
        },
        '/perlrun' => {
          'PerlHandler' => 'Apache::PerlRun',
          'Options' => '+ExecCGI',
          'PerlSendHeader' => 'On',
          'SetHandler' => 'perl-script'
        }
      },
      'DocumentRoot' => '/home/httpd/docs/president',
      'ServerAdmin' => 'webmaster@president.intranet',
      'ServerName' => 'president.intranet'
    }
  ]
);

#arrays:

@Alias = (
  [
    '/private',
    '/home/httpd/docs/private'
  ]
);

#scalars:

$Group = 'stas';

$User = 'stas';

1;
__END__

You can see that the configuration was created properly. The dump places the output into three groups: arrays, hashes, and scalars. The server was started as user stas, so the $User and $Group settings were dynamically assigned to the user stas.

A different approach to seeing the dump at any time (not only during startup) is to use the Apache::Status module (see Chapter 9). First we store the Perl configuration:

<Perl>
    $Apache::Server::SaveConfig = 1;
    # the actual configuration code
</Perl>

Now the Apache::ReadConfig namespace (in which the configuration data is stored) will not be flushed, making configuration data available to Perl modules at request time. If the Apache::Status module is configured, you can view it by going to the /perl-status URI (or another URI that you have chosen) in your browser and selecting "Perl Section Configuration" from the menu. The configuration data should look something like that shown in Figure 4-1.

Figure 4-1. <Perl> sections configuration dump

Since the Apache::ReadConfig namespace is not flushed when the server is started, you can access the configuration values from your code—the data resides in the Apache::ReadConfig package. So if you had the following Perl configuration:

<Perl>
    $Apache::Server::SaveConfig = 1;
    $DocumentRoot = "/home/httpd/docs/mine";
</Perl>

at request time, you could access the value of $DocumentRoot with the fully qualified name $Apache::ReadConfig::DocumentRoot. But usually you don't need to do this, because mod_perl provides you with an API to access the most interesting and useful server configuration bits.
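As an illustration, here is a minimal sketch of a mod_perl 1.x content handler that prints both the saved value and the one returned by the request API. Book::ShowDocRoot is a hypothetical module name, and the code can run only inside the server:

```perl
package Book::ShowDocRoot;   # hypothetical module name
use strict;
use Apache::Constants qw(OK);

sub handler {
    my $r = shift;

    # Set in the <Perl> section; still there only if
    # $Apache::Server::SaveConfig was true at startup.
    my $from_perl = $Apache::ReadConfig::DocumentRoot;

    $r->send_http_header('text/plain');
    $r->print("From <Perl> section: ",
              defined $from_perl ? $from_perl : '(not saved)', "\n");
    $r->print("From the API:        ", $r->document_root, "\n");
    return OK;
}
1;
```

The API call works whether or not the configuration was saved, which is why it is usually the better choice.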

4.4.6. Saving the Perl Configuration

Instead of dumping the generated Perl configuration, you may decide to store it in a file. For example, if you want to store it in httpd_config.pl, you can do the following:

<Perl>
    use Apache::PerlSections ( );
    # code goes here
    Apache::PerlSections->store("httpd_config.pl");
</Perl>

You can then require( ) that file in some other <Perl> section. If you have the whole server configuration in Perl, you can start the server using the following trick:

panic% httpd -C "PerlRequire httpd_config.pl"

Apache will fetch all the configuration directives from httpd_config.pl, so you don't need httpd.conf at all.

4.4.7. Debugging

If your configuration doesn't seem to do what it's supposed to do, you should debug it. First, build mod_perl with:

panic% perl Makefile.PL PERL_TRACE=1 [...]

Next, set the environment variable MOD_PERL_TRACE to s (as explained in Chapter 21). Now you should be able to see how the <Perl> section globals are converted into directive string values. For example, suppose you have the following Perl section:

<Perl>
    $DocumentRoot = "/home/httpd/docs/mine";
</Perl>

If you start the server in single-server mode (e.g., under bash):

panic% MOD_PERL_TRACE=s httpd -X

you will see these lines among the printed trace:

...
SVt_PV: $DocumentRoot = `/home/httpd/docs/mine'
handle_command (DocumentRoot /home/httpd/docs/mine): OK
...

But what if you mistype the directory name and pass two values instead of a single value? When you start the server, you'll see the following error:

...
SVt_PV: $DocumentRoot = `/home/httpd/docs/ mine'
handle_command (DocumentRoot /home/httpd/docs/ mine):
DocumentRoot takes one argument,
Root directory of the document tree
...

and of course the error will be logged in the error_log file:

[Wed Dec 20 23:47:31 2000] [error]
(2)No such file or directory: <Perl>:
DocumentRoot takes one argument,
Root directory of the document tree