

Practical mod_perl

6.4. Perl Specifics in the mod_perl Environment

In the following sections, we discuss the specifics of Perl's behavior under mod_perl.

6.4.1. exit( )

Perl's core exit( ) function shouldn't be used in mod_perl code. Calling it causes the mod_perl process to exit, which defeats the purpose of using mod_perl. The Apache::exit( ) function should be used instead. Starting with Perl Version 5.6.0, mod_perl overrides exit( ) behind the scenes using CORE::GLOBAL::, a new magical package.

The CORE:: Package

CORE:: is a special package that provides access to Perl's built-in functions. You may need to use this package to override some of the built-in functions. For example, if you want to override the exit( ) built-in function, you can do so with:

use subs qw(exit);
exit( ) if $DEBUG;
sub exit { warn "exit( ) was called"; }

Now when you call exit( ) in the same scope in which it was overridden, the program won't exit, but instead will just print a warning "exit( ) was called". If you want to use the original built-in function, you can still do so with:

# the 'real' exit
CORE::exit( );
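
The CORE::GLOBAL:: mechanism can be tried outside of mod_perl. The following is a minimal sketch, not mod_perl's actual implementation: it installs a replacement for exit( ) at compile time so that code calling exit( ) no longer ends the process. The legacy_code( ) function, the @exit_calls array, and the EXIT_OVERRIDE marker are hypothetical names used only for illustration.

```perl
use strict;
use warnings;

our @exit_calls;    # statuses passed to the intercepted exit() calls

BEGIN {
    # The override must be installed at compile time, before any code
    # that calls exit() is compiled.
    no warnings 'once';
    *CORE::GLOBAL::exit = sub {
        my $status = @_ ? $_[0] : 0;
        push @exit_calls, $status;
        die "EXIT_OVERRIDE\n";    # unwind the caller instead of exiting
    };
}

# Hypothetical legacy code that was written for a one-shot CGI script.
sub legacy_code {
    exit(3);
    return "not reached";
}

my $result = eval { legacy_code() };
my $error  = $@;

print "intercepted exit with status $exit_calls[0]\n";
print "process $$ is still running\n";
```

Using die( ) inside the replacement is one possible design choice: it stops the calling code at the point of the exit( ) call, and the surrounding eval block regains control. The real built-in remains reachable as CORE::exit( ).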

Apache::Registry and Apache::PerlRun override exit( ) with Apache::exit( ) behind the scenes; therefore, scripts running under these modules don't need to be modified to use Apache::exit( ).

If CORE::exit( ) is used in scripts running under mod_perl, the child will exit, but the current request won't be logged. More importantly, a proper exit won't be performed. For example, if there are some database handles, they will remain open, causing costly memory and (even worse) database connection leaks.

If the child process needs to be killed, Apache::exit(Apache::Constants::DONE) should be used instead. This will cause the child process to exit gracefully, completing the logging functions and protocol requirements.

If the child process needs to be killed cleanly after the request has completed, use the $r->child_terminate method. This method can be called anywhere in the code, not just at the end. It sets the value of the MaxRequestsPerChild configuration directive to 1 and clears the keepalive flag. After the request is serviced, the current connection is closed because the keepalive flag is false, and the parent tells the child to quit cleanly because the number of requests served has reached MaxRequestsPerChild.

In an Apache::Registry script you would write:

Apache->request->child_terminate;

and in httpd.conf:

PerlFixupHandler "sub { shift->child_terminate }"

You would want to use the latter example only if you wanted the child to terminate every time the registered handler was called. This is probably not what you want.

You can also use a post-processing handler to trigger child termination. You might do this if you wanted to execute your own cleanup code before the process exits:

my $r = shift;
$r->post_connection(\&exit_child);

sub exit_child {
    # the callback receives the request object as its first argument
    my $r = shift;
    # some logic here if needed
    $r->child_terminate;
}

This is the code that is used by the Apache::SizeLimit module, which terminates processes that grow bigger than a preset quota.

6.4.3. Global Variable Persistence

Under mod_perl a child process doesn't exit after serving a single request. Thus, global variables persist inside the same process from request to request. This means that you should be careful not to rely on the value of a global variable if it isn't initialized at the beginning of each request. For example:

# the very beginning of the script
use strict;
use vars qw($counter);
$counter++;

relies on the fact that Perl interprets an undefined value of $counter as a zero value, because of the increment operator, and therefore sets the value to 1. However, when the same code is executed a second time in the same process, the value of $counter is not undefined any more; instead, it holds the value it had at the end of the previous execution in the same process. Therefore, a cleaner way to code this snippet would be:

use strict;
use vars qw($counter);
$counter = 0;
$counter++;

In practice, you should avoid using global variables unless there really is no alternative. Most of the problems with global variables arise from the fact that they keep their values across functions, and it's easy to lose track of which function modifies the variable and where. This problem is solved by localizing these variables with local( ). But if you are already doing this, using lexical scoping (with my( )) is even better because its scope is clearly defined, whereas localized variables are seen and can be modified from anywhere in the code. Refer to the perlsub manpage for more details. Our example will now be written as:

use strict;
my $counter = 0;
$counter++;

Note that it is a good practice to both declare and initialize variables, since doing so will clearly convey your intention to the code's maintainer.
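
The effect of persistence can be reproduced in a plain Perl program by calling the same code several times within one process. This is only a simulation under stated assumptions: the two subroutines below are hypothetical stand-ins for a script body that is run once per request, and the variable names are invented for the example.

```perl
use strict;
use warnings;
use vars qw($persistent $initialized);

# Stand-in for a script that never initializes its global.
sub request_without_init {
    $persistent++;
    return $persistent;
}

# Stand-in for a script that resets its global on every run.
sub request_with_init {
    $initialized = 0;
    $initialized++;
    return $initialized;
}

# Three "requests" served by the same process.
my @without = map { request_without_init() } 1 .. 3;
my @with    = map { request_with_init() } 1 .. 3;

print "without initialization: @without\n";
print "with initialization:    @with\n";
```

Only the uninitialized variable carries its value from one call to the next, which is what happens to a global in a script that stays compiled inside a long-lived child process.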

You should be especially careful with Perl special variables, which cannot be lexically scoped. With special variables, local( ) must be used. For example, if you want to read in a whole file at once, you need to undef( ) the input record separator. The following code reads the contents of an entire file in one go:

open(IN, $file) or die "Cannot open $file: $!";
$/ = undef; # changes $/ for the whole process
$content = <IN>; # slurp the whole file in
close IN;

Since you have modified the special Perl variable $/ globally, it'll affect any other code running under the same process. If somewhere in the code (or any other code running on the same server) there is a snippet reading a file's content line by line, relying on the default value of $/ (\n), this code will work incorrectly. Localizing the modification of this special variable solves this potential problem:

{
  local $/; # $/ is undef now
  $content = <IN>; # slurp the whole file in
}

Note that the localization is enclosed in a block. When control passes out of the block, the previous value of $/ will be restored automatically.
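
A self-contained way to check this behavior is sketched below. It assumes nothing beyond core Perl: a temporary file (created with File::Temp, chosen here only for convenience) is slurped inside a block, and then read line by line after the block to confirm that $/ has its default value again.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small three-line file to read back.
my ($tmp, $file) = tempfile(UNLINK => 1);
print $tmp "line 1\nline 2\nline 3\n";
close($tmp) or die "Cannot close $file: $!";

my $content;
{
    open(my $in, '<', $file) or die "Cannot open $file: $!";
    local $/;            # $/ is undef only inside this block
    $content = <$in>;    # slurp the whole file in
    close($in);
}

# Outside the block $/ is "\n" again, so list-context reads split on lines.
open(my $in, '<', $file) or die "Cannot open $file: $!";
my @lines = <$in>;
close($in);

printf "slurped %d characters, then read %d lines\n",
       length($content), scalar(@lines);
```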

6.4.7. Formats

The interface to file handles that are linked to variables with Perl's tie( ) function is not yet complete. The format( ) and write( ) functions are missing. If you configure Perl with sfio, write( ) and format( ) should work just fine.

Instead of format( ), you can use printf( ). For example, the following formats are equivalent:

format   printf
---------------
##.##    %2.2f
####.##  %4.2f

To print a string with fixed-length elements, use the printf( ) format %n.ms where n is the length of the field allocated for the string and m is the maximum number of characters to take from the string. For example:

printf "[%5.3s][%10.10s][%30.30s]\n",
       12345, "John Doe", "1234 Abbey Road";

prints:

[  123][  John Doe][                1234 Abbey Road]

Notice that the first string was allocated five characters in the output, but only three were used because n=5 and m=3 (%5.3s). If you want to ensure that the text will always be correctly aligned without being truncated, n should always be greater than or equal to m.

You can change the alignment to the left by adding a minus sign (-) after the %. For example:

printf "[%-5.5s][%-10.10s][%-30.30s]\n",
       123, "John Doe", "1234 Abbey Road";

prints:

[123  ][John Doe  ][1234 Abbey Road                ]

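Right alignment is the default, so no flag is needed for it. A plus sign (+) after the % is accepted with %s but has no effect on strings (with numeric conversions it forces a leading sign). For example:

printf "[%+5s][%+10s][%+30s]\n",
       123, "John Doe", "1234 Abbey Road";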

prints:

[  123][  John Doe][                1234 Abbey Road]

Another alternative to format( ) and printf( ) is to use the Text::Reform module from CPAN.

In the examples above we've printed the number 123 as a string (because we used the %s format specifier), but numbers can also be printed using numeric formats. See perldoc -f sprintf for full details.
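
As an illustration of mixing string and numeric specifiers, here is a small sketch that builds a fixed-width report with sprintf( ). The row data and column widths are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical rows: name, quantity, price.
my @rows = (
    [ "Widget", 3,  4.5    ],
    [ "Gadget", 12, 19.999 ],
);

my $report = '';
for my $row (@rows) {
    # %-10.10s  left-aligned string, padded or truncated to 10 characters
    # %3d       right-aligned integer in a 3-character field
    # %8.2f     right-aligned number in an 8-character field, 2 decimals
    $report .= sprintf "%-10.10s %3d %8.2f\n", @$row;
}

print $report;
```

Each line of the report has the same width, and the price is rounded to two decimal places by the %8.2f specifier.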

6.4.9. BEGIN blocks

Perl executes BEGIN blocks as soon as possible, when it's compiling the code. The same is true under mod_perl. However, since mod_perl normally compiles scripts and modules only once, either in the parent process or just once per child, BEGIN blocks are run only once. As the perlmod manpage explains, once a BEGIN block has run, it is immediately undefined. In the mod_perl environment, this means that BEGIN blocks will not be run during the response to an incoming request unless that request happens to be the one that causes the compilation of the code. However, there are cases when BEGIN blocks will be rerun for each request.

BEGIN blocks in modules and files pulled in via require( ) or use( ) will be executed:

  • Only once, if pulled in by the parent process.

  • Once per child process, if not pulled in by the parent process.

  • One additional time per child process, if the module is reloaded from disk by Apache::StatINC.

  • One additional time in the parent process on each restart, if PerlFreshRestart is On.

  • On every request, if the module with the BEGIN block is deleted from %INC before the module's compilation is needed. The same thing happens when do( ) is used, which loads the module even if it's already loaded.
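
The interaction between require( ), %INC, and do( ) described in this list can be observed in plain Perl, without a server. The sketch below assumes a throwaway module file, here called BeginDemo.pm (a hypothetical name), whose BEGIN block increments a counter in the main package.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

our $begin_runs = 0;    # incremented by the module's BEGIN block

# Write a tiny module into a temporary directory.
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'BeginDemo.pm');
open(my $out, '>', $file) or die "Cannot write $file: $!";
print $out "package BeginDemo;\nBEGIN { \$main::begin_runs++ }\n1;\n";
close($out) or die "Cannot close $file: $!";

require $file;          # first load: the file is compiled, BEGIN runs
require $file;          # already recorded in %INC: nothing is recompiled
my $after_require = $begin_runs;

delete $INC{$file};     # forget that the file was loaded
require $file;          # compiled again, so BEGIN runs again
my $after_delete = $begin_runs;

do $file;               # do() ignores %INC and always recompiles
my $after_do = $begin_runs;

print "after two requires: $after_require\n";
print "after deleting the \%INC entry: $after_delete\n";
print "after do(): $after_do\n";
```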

BEGIN blocks in Apache::Registry scripts will be executed:

  • Only once, if pulled in by the parent process via Apache::RegistryLoader.

  • Once per child process, if not pulled in by the parent process.

  • One additional time per child process, each time the script file changes on disk.

  • One additional time in the parent process on each restart, if pulled in by the parent process via Apache::RegistryLoader and PerlFreshRestart is On.

Note that this second list is applicable only to the scripts themselves. For the modules used by the scripts, the previous list applies.

6.4.10. END Blocks

As the perlmod manpage explains, an END subroutine is executed when the Perl interpreter exits. In the mod_perl environment, the Perl interpreter exits only when the child process exits. Usually a single process serves many requests before it exits, so END blocks cannot be used if they are expected to do something at the end of each request's processing.
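
The timing of END blocks in a long-lived interpreter can be seen with a small experiment. This sketch assumes a Unix-like system, because it uses the list form of a piped open( ). It starts a second Perl interpreter that serves three simulated requests and contains one END block. The snippet in $script is a hypothetical stand-in for a persistent process.

```perl
use strict;
use warnings;

# A stand-in for a long-lived process: three "requests" and one END block.
my $script = q{
    END { print "END block runs\n" }
    print "request $_ served\n" for 1 .. 3;
};

# Run the snippet in a separate interpreter ($^X is the running perl binary).
open(my $child, '-|', $^X, '-e', $script)
    or die "Cannot start $^X: $!";
my @output = <$child>;
close($child) or die "Child perl failed: $! (status $?)";

print @output;
```

The END block's message appears once, after all three requests, rather than after each of them.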

If there is a need to run some code after a request has been processed, the $r->register_cleanup( ) method should be used. This method accepts a reference to a function to be called during the PerlCleanupHandler phase, which behaves just like the END block in the normal Perl environment. For example:

$r->register_cleanup(sub { warn "$$ does cleanup\n" });

or:

sub cleanup { warn "$$ does cleanup\n" };
$r->register_cleanup(\&cleanup);

will run the registered code at the end of each request, similar to END blocks under mod_cgi.

As you already know by now, Apache::Registry handles things differently. It does execute all END blocks encountered during compilation of Apache::Registry scripts at the end of each request, like mod_cgi does. That includes any END blocks defined in the packages use( )d by the scripts.

If you want something to run only once in the parent process on shutdown and restart, you can use register_cleanup( ) in startup.pl:

warn "parent pid is $$\n";
Apache->server->register_cleanup(
    sub { warn "server cleanup in $$\n" });

This is useful when some server-wide cleanup should be performed when the server is stopped or restarted.




Copyright © 2003 O'Reilly & Associates. All rights reserved.