home | O'Reilly's CD bookshelfs | FreeBSD | Linux | Cisco | Cisco Exam  


Practical mod_perlPractical mod_perlSearch this book

6.3. Namespace Issues

If your service consists of a single script, you will probably have no namespace problems. But web services usually are built from many scripts and handlers. In the following sections, we will investigate possible namespace problems and their solutions. But first we will refresh our understanding of two special Perl variables, @INC and %INC.

6.3.2. The %INC Hash

Perl's %INC hash is used to cache the names of the files and modules that were loaded and compiled by use( ), require( ), or do( )statements. Every time a file or module is successfully loaded, a new key-value pair is added to %INC. The key is the name of the file or module as it was passed to one of the three functions we have just mentioned. If the file or module was found in any of the @INC directories (except "."), the filenames include the full path. Each Perl interpreter, and hence each process under mod_perl, has its own private %INC hash, which is used to store information about its compiled modules.

Before attempting to load a file or a module with use( ) or require( ), Perl checks whether it's already in the %INC hash. If it's there, the loading and compiling are not performed. Otherwise, the file is loaded into memory and an attempt is made to compile it. Note that do( ) loads the file or module unconditionally—it does not check the %INC hash. We'll look at how this works in practice in the following examples.

First, let's examine the contents of @INC on our system:

panic% perl -le 'print join "\n", @INC'
/usr/lib/perl5/5.6.1/i386-linux
/usr/lib/perl5/5.6.1
/usr/lib/perl5/site_perl/5.6.1/i386-linux
/usr/lib/perl5/site_perl/5.6.1
/usr/lib/perl5/site_perl
.

Notice . (the current directory) as the last directory in the list.

Let's load the module strict.pm and see the contents of %INC:

panic% perl -le 'use strict; print map {"$_ => $INC{$_}"} keys %INC'
strict.pm => /usr/lib/perl5/5.6.1/strict.pm

Since strict.pm was found in the /usr/lib/perl5/5.6.1/ directory and /usr/lib/perl5/5.6.1/ is a part of @INC, %INC includes the full path as the value for the key strict.pm.

Let's create the simplest possible module in /tmp/test.pm:

1;

This does absolutely nothing, but it returns a true value when loaded, which is enough to satisfy Perl that it loaded correctly. Let's load it in different ways:

panic% cd /tmp
panic% perl -e 'use test; \
       print map { "$_ => $INC{$_}\n" } keys %INC'
test.pm => test.pm

Since the file was found in . (the directory the code was executed from), the relative path is used as the value. Now let's alter @INC by appending /tmp:

panic% cd /tmp
panic% perl -e 'BEGIN { push @INC, "/tmp" } use test; \
       print map { "$_ => $INC{$_}\n" } keys %INC'
test.pm => test.pm

Here we still get the relative path, since the module was found first relative to ".". The directory /tmp was placed after . in the list. If we execute the same code from a different directory, the "." directory won't match:

panic% cd /
panic% perl -e 'BEGIN { push @INC, "/tmp" } use test; \
       print map { "$_ => $INC{$_}\n" } keys %INC'
test.pm => /tmp/test.pm

so we get the full path. We can also prepend the path with unshift( ), so that it will be used for matching before ".". We will get the full path here as well:

panic% cd /tmp
panic% perl -e 'BEGIN { unshift @INC, "/tmp" } use test; \
       print map { "$_ => $INC{$_}\n" } keys %INC'
test.pm => /tmp/test.pm

The code:

BEGIN { unshift @INC, "/tmp" }

can be replaced with the more elegant:

use lib "/tmp";

This is almost equivalent to our BEGIN block and is the recommended approach.

These approaches to modifying @INC can be labor intensive: moving the script around in the filesystem might require modifying the path.

6.3.3. Name Collisions with Modules and Libraries

In this section, we'll look at two scenarios with failures related to namespaces. For the following discussion, we will always look at a single child process.

6.3.3.1. A first faulty scenario

It is impossible to use two modules with identical names on the same server. Only the first one found in a use( ) or a require( )statement will be loaded and compiled. All subsequent requests to load a module with the same name will be skipped, because Perl will find that there is already an entry for the requested module in the %INC hash.

Let's examine a scenario in which two independent projects in separate directories, projectA and projectB, both need to run on the same server. Both projects use a module with the name MyConfig.pm, but each project has completely different code in its MyConfig.pm module. This is how the projects reside on the filesystem (all located under the directory /home/httpd/perl):

projectA/MyConfig.pm
projectA/run.pl
projectB/MyConfig.pm
projectB/run.pl

Examples Example 6-6, Example 6-7, Example 6-8, and Example 6-9 show some sample code.

Example 6-6. projectA/run.pl

use lib qw(.);
use MyConfig;
print "Content-type: text/plain\n\n";
print "Inside project: ", project_name( );

Example 6-7. projectA/MyConfig.pm

sub project_name { return 'A'; }
1;

Example 6-8. projectB/run.pl

use lib qw(.);
use MyConfig;
print "Content-type: text/plain\n\n";
print "Inside project: ", project_name( );

Example 6-9. projectB/MyConfig.pm

sub project_name { return 'B'; }
1;

Both projects contain a script, run.pl, which loads the module MyConfig.pm and prints an indentification message based on the project_name( ) function in the MyConfig.pm module. When a request to /perl/projectA/run.pl is issued, it is supposed to print:

Inside project: A

Similarly, /perl/projectB/run.pl is expected to respond with:

Inside project: B

When tested using single-server mode, only the first one to run will load the MyConfig.pm module, although both run.pl scripts call use MyConfig. When the second script is run, Perl will skip the use MyConfig;statement, because MyConfig.pm is already located in %INC. Perl reports this problem in the error_log:

Undefined subroutine
&Apache::ROOT::perl::projectB::run_2epl::project_name called at
/home/httpd/perl/projectB/run.pl line 4.

This is because the modules didn't declare a package name, so the project_name( )subroutine was inserted into projectA/run.pl's namespace, Apache::ROOT::perl::projectB::run_2epl. Project B doesn't get to load the module, so it doesn't get the subroutine either!

Note that if a library were used instead of a module (for example, config.pl instead of MyConfig.pm), the behavior would be the same. For both libraries and modules, a file is loaded and its filename is inserted into %INC.

6.3.3.6. A third solution

This solution makes use of package-name declaration in the require( ) d modules. For example:

package ProjectA::Config;

Similarly, for ProjectB, the package name would be ProjectB::Config.

Each package name should be unique in relation to the other packages used on the same httpd server. %INC will then use the unique package name for the key instead of the filename of the module. It's a good idea to use at least two-part package names for your private modules (e.g., MyProject::Carp instead of just Carp), since the latter will collide with an existing standard package. Even though a package with the same name may not exist in the standard distribution now, in a later distribution one may come along that collides with a name you've chosen.

What are the implications of package declarations? Without package declarations in the modules, it is very convenient to use( ) and require( ), since all variables and subroutines from the loaded modules will reside in the same package as the script itself. Any of them can be used as if it was defined in the same scope as the script itself. The downside of this approach is that a variable in a module might conflict with a variable in the main script; this can lead to hard-to-find bugs.

With package declarations in the modules, things are a bit more complicated. Given that the package name is PackageA, the syntax PackageA::project_name( )should be used to call a subroutine project_name( ) from the code using this package. Before the package declaration was added, we could just call project_name( ). Similarly, a global variable $foo must now be referred to as $PackageA::foo, rather than simply as $foo. Lexically defined variables (declared with my( )) inside the file containing PackageA will be inaccessible from outside the package.

You can still use the unqualified names of global variables and subroutines if these are imported into the namespace of the code that needs them. For example:

use MyPackage qw(:mysubs sub_b $var1 :myvars);

Modules can export any global symbols, but usually only subroutines and global variables are exported. Note that this method has the disadvantage of consuming more memory. See the perldoc Exporter manpage for information about exporting other variables and symbols.

Let's rewrite the second scenario in a truly clean way. This is how the files reside on the filesystem, relative to the directory /home/httpd/perl:

project/MyProject/Config.pm
project/runA.pl
project/runB.pl

Examples Example 6-15, Example 6-16, and Example 6-17 show how the code will look.

Example 6-15. project/MyProject/Config.pm

package MyProject::Config
sub project_name { return 'Super Project'; }
1;

Example 6-16. project/runB.pl

use lib qw(.);
use MyProject::Config;
print "Content-type: text/plain\n\n";
print "Script B\n";
print "Inside project: ", MyProject::Config::project_name( );

Example 6-17. project/runA.pl

use lib qw(.);
use MyProject::Config;
print "Content-type: text/plain\n\n";
print "Script A\n";
print "Inside project: ", MyProject::Config::project_name( );

As you can see, we have created the MyProject/Config.pm file and added a package declaration at the top of it:

package MyProject::Config

Now both scripts load this module and access the module's subroutine, project_name( ), with a fully qualified name, MyProject::Config::project_name( ).

See also the perlmodlib and perlmod manpages.

From the above discussion, it also should be clear that you cannot run development and production versions of the tools using the same Apache server. You have to run a dedicated server for each environment. If you need to run more than one development environment on the same server, you can use Apache::PerlVINC, as explained in Appendix B.



Library Navigation Links

Copyright © 2003 O'Reilly & Associates. All rights reserved.