5. Packages, Modules, and Object Classes

This chapter, more than any other in this book, is about Laziness, Impatience, and Hubris - because this chapter is about good software design. We've all fallen into the trap of using cut-and-paste when we should have chosen to define a higher-level abstraction, if only just a loop or subroutine.[ 1 ] To be sure, some folks have gone to the opposite extreme of defining ever-growing mounds of higher-level abstractions when they should have used cut-and-paste.[ 2 ] Generally, though, most of us need to think about using more abstraction rather than less.
(Caught somewhere in the middle are the people who have a balanced view of how much abstraction is good, but who jump the gun on writing their own abstractions when they should be reusing existing code.)[ 3 ]
Whenever you're tempted to do any of these things, you need to sit back and think about what will do the most good for you and your neighbor over the long haul. If you're going to pour your creative energies into a lump of code, why not make the world a better place while you're at it? (Even if you're only aiming for the program to succeed, you need to make sure it fits its ecological niche.)

The first step toward ecologically sustainable programming is simply: don't litter in the park. When you write a chunk of code, think about giving the code its own namespace, so that your variables and functions don't clobber anyone else's, or vice versa. A namespace is a bit like your home, where you're allowed to be as messy as you like, as long as you keep your external interface to other citizens moderately civil. In Perl, a namespace is called a package. Packages provide the fundamental building block upon which the higher-level concepts of modules and classes are constructed.

Like the notion of "home", the notion of "package" is a bit nebulous. Packages are independent of files. You can have many packages in a single file, or a single package that spans several files, just as your home could be one part of a larger building, if you live in an apartment, or could comprise several buildings, if your name happens to be Queen Elizabeth. But the usual size of a home is one building, and the usual size of a package is one file. Perl has some special help for people who want to put one package in one file, as long as you're willing to name the file with the same name as the package and give your file an extension of ".pm", which is short for "perl module". The module is the unit of reusability in Perl. Indeed, the way you use a module is with the use command, which is a compiler directive that controls the importation of functions and variables from a module. Every example of use you've seen until now has been an example of module reuse.
Object classes are another concept built on the package concept. The concept of classes therefore cuts across the concepts of files and modules. But the typical class is nevertheless implemented with a module. (If you're starting to get the feeling that much of Perl culture is governed by mere convention, then you're starting to get the right feeling, civilly speaking. The trend over the last 20 years or so has been to design computer languages that enforce a state of paranoia. You're expected to program every module as if it were in a state of siege. Certainly there are some feudal cultures where this is appropriate, but not all cultures are like this. In Perl culture, by contrast, you're expected to stay out of someone's home because you weren't invited in, not because there are bars[ 4 ] on the windows.)
Anyway, back to classes. When you use a module that implements a class, you're benefiting from the direct reuse of the software that implements that module. But with object classes you can get the additional benefits of indirect software reuse when the class you're using turns around and reuses other classes that it gets some characteristics from.

But this is not primarily a book about object-oriented methodology, and we're not here to convert you into a raving object-oriented zealot, even if you want to be converted. There are already plenty of books out there for that. Perl's philosophy of object-oriented design fits right in with Perl's philosophy of everything else: use object-oriented design where it makes sense, and avoid it where it doesn't. Your call.

As we mentioned in the previous chapter, object-oriented programming in Perl is accomplished through use of references that happen to refer to thingies that know which class they're associated with. In fact, now that you know about references, you know almost everything hard about objects. The rest of it just "lays under the fingers", as a violinist would say. You will need to practice a little, though.

In this chapter we will discuss creation and use of packages, modules, and classes. Then we will review some of the essentials of object-oriented programming, explain how references become objects, and illustrate how these objects are manipulated as members of one or more classes. We'll also tell you how to tie ordinary variables into object classes to turn them into magical variables.

5.1 Packages

Perl provides a mechanism to protect different sections of code from inadvertently tampering with each other's variables. In fact, apart from certain magical variables, there's really no such thing as a global variable in Perl. Code is always compiled in the current package. The initial current package is package main, but at any time you can switch the current package to another one using the package declaration.
The current package determines which symbol table is used for name lookups (for names that aren't otherwise package-qualified). The notion of "current package" is both a compile-time and run-time concept. Most name lookups happen at compile-time, but run-time lookups happen when symbolic references are dereferenced, and also when new bits of code are parsed under eval. In particular, eval operations know which package they were invoked in, and propagate that package inward as the current package of the evaluated code. (You can always switch to a different package within the eval string, of course, since an eval string counts as a block, as does a file loaded in with do, require, or use.)

The scope of a package declaration is from the declaration itself through the end of the innermost enclosing block (or until another package declaration at the same level, which hides the earlier one). All subsequent identifiers (except those declared with my, or those qualified with a different package name) will be placed in the symbol table belonging to the package. Typically, you would put a package declaration as the first declaration in a file to be included by require or use. But again, that's by convention. You can put a package declaration anywhere you can put a statement. You could even put it at the end of a block, in which case it would have no effect whatsoever. You can switch into a package in more than one place; it merely influences which symbol table is used by the compiler for the rest of that block. (This is how a given package can span more than one file.)
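To make the scoping rules concrete, here is a minimal sketch of how a package declaration selects the symbol table for the names that follow it, and how a declaration inside a block stops applying when the block ends. The package names Garden and Shed, the variable $name, and the sub where are invented for illustration.

```perl
use strict;
use warnings;

# All package, variable, and sub names below are illustrative.
our $name = "main's name";        # lives in package main

package Garden;                   # switch the current package
our $name = "Garden's name";      # a different variable: $Garden::name

{
    package Shed;                 # applies only until the end of this block
    our $name = "Shed's name";    # $Shed::name
}

# The block has ended, so the current package is Garden again.
sub where { return __PACKAGE__ }

package main;

print "$main::name\n";            # main's name
print "$Garden::name\n";          # Garden's name
print "$Shed::name\n";            # Shed's name
print Garden::where(), "\n";      # Garden
```

The three variables all share the short name $name, yet none of them clobbers the others, because each one was compiled into a different package's symbol table.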
You can refer to identifiers[ 5 ] in other packages by prefixing ("qualifying") the identifier with the package name and a double colon, as in $Package::Variable.

Packages may be nested inside other packages, as in $OUTER::INNER::var.
Only identifiers (names starting with letters or underscore) are stored in the current package's symbol table. All other symbols are kept in package main, including all the magical punctuation-only variables like $! and $_. In addition, the identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV, INC, and SIG are forced to be in package main, even when used for purposes other than their built-in ones.

Assignment of a string to %SIG assumes the signal handler specified is in the main package, if the name assigned is unqualified. Qualify the signal handler name if you want to have a signal handler in a package, or don't use a string at all: assign a typeglob or a function reference instead:

    $SIG{QUIT} = "quit_catcher";   # implies "main::quit_catcher"
    $SIG{QUIT} = *quit_catcher;    # forces current package's sub
    $SIG{QUIT} = \&quit_catcher;   # forces current package's sub
    $SIG{QUIT} = sub { print "Caught SIGQUIT\n" };   # anonymous sub

See my and local in Chapter 3, Functions, for other scoping issues. See the "Signals" section in Chapter 6, Social Engineering, for more on signal handlers.

5.1.1 Symbol Tables

The symbol table for a package happens to be stored in a hash whose name is the same as the package name with two colons appended. The main symbol table's name is thus %main::.
When we say that a symbol table "contains" another symbol table, we mean that it contains a reference to the other symbol table. Since package main is a top-level package, it contains a reference to itself, with the result that %main:: is the same as %main::main::, and so on.
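As a small illustration of a symbol table being an ordinary hash, the following sketch lists the names stored in a package and checks that package main has an entry for itself. The package Toolbox and everything in it are made up for the example.

```perl
use strict;
use warnings;

# Package, variable, and sub names are illustrative.
package Toolbox;
our $hammer = 1;
our @nails  = (1, 2, 3);
sub saw { return "zzz" }

package main;

# The symbol table of package Toolbox is the hash %Toolbox::,
# so its keys are the identifiers defined in that package.
my @names = sort keys %Toolbox::;
print "Toolbox contains: @names\n";

# Package main's own table has an entry for itself under the key "main::".
print "main:: is listed in its own symbol table\n"
    if exists $main::{"main::"};
```

The scalar, the array, and the subroutine each appear as a single key, because one symbol table entry covers every variable type that shares that name.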
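The keys in a symbol table hash are the identifiers of the symbols in the symbol table. The values in a symbol table hash are the corresponding typeglob values. So when you use the *name typeglob notation, you're really just accessing a value in the hash that holds the current package's symbol table. The following two statements have the same effect: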

    local *somesym = *main::variable;
    local *somesym = $main::{"variable"};

Since a package is a hash, you can look up the keys of the package, and hence all the variables of the package. Try this:

    foreach $symname (sort keys %main::) {
        local *sym = $main::{$symname};
        print "\$$symname is defined\n" if defined $sym;
        # defined @array and defined %hash are fatal errors in Perl 5.22
        # and later, so test for contents instead
        print "\@$symname is defined\n" if @sym;
        print "\%$symname is defined\n" if %sym;
    }

Since all packages are accessible (directly or indirectly) through package main, you can visit every package variable in the program, using code written in Perl. The Perl debugger does precisely that when you ask it to dump all your variables.

Assignment to a typeglob performs an aliasing operation; that is,
*dick = *richard;
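causes everything accessible via the identifier richard to also be accessible via the identifier dick. If you only want to alias a particular variable or subroutine, assign a reference instead: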
*dick = \$richard;
This makes $richard and $dick the same variable, but leaves @richard and @dick as separate arrays.

This mechanism may be used to pass and return cheap references into or from subroutines if you don't want to copy the whole thing:

    %some_hash = ();
    *some_hash = fn( \%another_hash );
    sub fn {
        local *hashsym = shift;
        # now use %hashsym normally, and you
        # will affect the caller's %another_hash
        my %nhash = ();   # populate this hash at will
        return \%nhash;
    }

On return, the reference will overwrite the hash slot in the symbol table specified by the *some_hash typeglob.

Another use of symbol tables is for making "constant" scalars:
*PI = \3.14159265358979;
Now you cannot alter $PI.

When you do that assignment, you're just replacing one reference within the typeglob. If you think about it sideways, the typeglob itself can be viewed as a kind of hash, with entries for the different variable types in it. In this case, the keys are fixed, since a typeglob can contain exactly one scalar, one array, one hash, and so on. But you can pull out the individual references, like this:

    *pkg::sym{SCALAR}       # same as \$pkg::sym
    *pkg::sym{ARRAY}        # same as \@pkg::sym
    *pkg::sym{HASH}         # same as \%pkg::sym
    *pkg::sym{CODE}         # same as \&pkg::sym
    *pkg::sym{GLOB}         # same as \*pkg::sym
    *pkg::sym{FILEHANDLE}   # internal filehandle, no direct equivalent
    *pkg::sym{NAME}         # "sym" (not a reference)
    *pkg::sym{PACKAGE}      # "pkg" (not a reference)

This is primarily used to get at the internal filehandle reference, since the other internal references are already accessible in other ways. But we thought we'd generalize it because it looks kind of pretty. Sort of. You probably don't need to remember all this unless you're planning to write a Perl debugger. So let's get back to the topic of writing good software.

5.1.2 Package Constructors and Destructors: BEGIN and END

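Two special subroutine definitions that function as package constructors and destructors[ 7 ] are the BEGIN and END routines.

A BEGIN subroutine is executed as soon as possible, that is, the moment it is completely defined, even before the rest of the containing file is parsed.

An END subroutine is executed as late as possible, that is, when the interpreter is being exited, even if it is exiting as a result of a die function.

When you use the -n and -p switches to Perl, BEGIN and END work just as they do in awk(1). For example, the output order of the following program is red, green, blue: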

    die "green\n";
    END { print "blue\n" }
    BEGIN { print "red\n" }

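The timing of BEGIN and END relative to ordinary statements can be observed directly. The following is a sketch only; the @events array and the messages are invented for illustration. It records the order in which a BEGIN block and a run-time statement execute, even though the run-time statement appears first in the file.

```perl
use strict;
use warnings;

our @events;    # package variable shared by all the blocks below

push @events, "first run-time statement";   # runs only after compilation ends

BEGIN { push @events, "BEGIN block" }        # runs the moment it is compiled

END   { print "END block runs last\n" }      # runs as the interpreter exits

print "$_\n" for @events;
# Prints:
#   BEGIN block
#   first run-time statement
#   END block runs last
```

Because the BEGIN block has already run by the time the first ordinary statement executes, its entry comes first in the list despite coming later in the source.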
Just as eval provides a way to get compilation behavior during run-time, so too BEGIN provides a way to get run-time behavior during compilation.

    system "rm -rf '$dir'"

you should always check that

5.1.3 Autoloading

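Normally you can't call a subroutine that isn't defined. However, if there is a subroutine named AUTOLOAD in the undefined subroutine's package, then the AUTOLOAD subroutine is called with the arguments that would have been passed to the original subroutine. The fully qualified name of the original subroutine appears in the package variable $AUTOLOAD.

Most AUTOLOAD routines load a definition for the undefined subroutine and then execute the newly defined subroutine.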
The standard AutoSplit module is a tool used by module writers to help split their modules into separate files (with filenames ending in .al), each holding one routine. The files are placed in the auto/ directory of the Perl library. These files can then be loaded on demand by the standard AutoLoader module. A similar approach is taken by the SelfLoader module, except that it autoloads functions from the file's own DATA area.
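But an AUTOLOAD routine can also just emulate the routine and never define it. For example, the following makes any function that isn't defined call system with its arguments:

    sub AUTOLOAD {
        my $program = $AUTOLOAD;
        $program =~ s/.*:://;   # trim package name
        system($program, @_);
    }
    date();
    who('am', 'i');
    ls('-l');

In fact, if you predeclare the functions you want to call that way, you don't even need the parentheses: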

    use subs qw(date who ls);
    date;
    who "am", "i";
    ls "-l";

A more complete example of this is the standard Shell module described in Chapter 7, which can treat undefined subroutine calls as calls to programs.
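To watch AUTOLOAD at work without actually running external programs, here is a hedged variant of the routine above. Instead of calling system, it returns a string describing the call it intercepted. The string format and the DESTROY guard are our own additions for illustration, not part of the original example.

```perl
use strict;
use warnings;

our $AUTOLOAD;    # Perl sets this to the fully qualified name of the missing sub

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;                 # trim package name
    return if $name eq 'DESTROY';      # ignore object destruction calls
    # Describe the intercepted call instead of running a program.
    return "$name(" . join(", ", @_) . ")";
}

print who('am', 'i'), "\n";    # who(am, i)
print ls('-l'), "\n";          # ls(-l)
```

Neither who nor ls is defined anywhere, so both calls fall through to AUTOLOAD, which recovers the intended name from $AUTOLOAD and the arguments from @_.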