24.5. Program Generation

Almost from the time people first figured out that they could write programs, they started writing programs that write other programs. We often call these program generators. (If you're a history buff, you might know that RPG stood for Report Program Generator long before it stood for Role Playing Game.) Nowadays they'd probably be called "program factories", but the generator people got there first, so they got to name it.

Now, anyone who has written a program generator knows that it can make your eyes go crossed even when you're wide awake. The problem is simply that much of your program's data looks like real code, but isn't (at least not yet). The same text file contains both stuff that does something and similar looking stuff that doesn't. Perl has various features that make it easy to mix Perl together with other languages, textually speaking.

(Of course, these features also make it easier to write Perl in Perl, but that's rather to be expected by now, we should think.)

24.5.1. Generating Other Languages in Perl

Perl is (among other things) a text-processing language, and most computer languages are textual. Beyond that, Perl's lack of arbitrary limits together with the various quoting and interpolation mechanisms make it easy to visually isolate the code of the other language you're spitting out. For example, here is a small chunk of s2p, the sed-to-perl translator:

print &q(<<"EOT");
:       #!$bin/perl
:       eval 'exec $bin/perl -S \$0 \${1+"\$@"}'
:               if \$running_under_some_shell;
EOT

Here the enclosed text happens to be legal in two languages, both Perl and sh. We've used an idiom right off the bat that will preserve your sanity in the writing of a program generator: the trick of putting a "noise" character and a tab on the front of every quoted line, which visually isolates the enclosed code, so you can tell at a glance that it's not the code that is actually being executed. One variable, $bin, is interpolated in the multiline quote in two places, and then the string is passed through a function to strip the colon and tab.
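The stripping function itself isn't shown above, so here is a minimal sketch of what one could look like. The name strip_margin, the value of $bin, and the generated two-line program are all assumptions made for illustration; this is not the code from s2p. The sketch also accepts a single space after the colon as well as a tab, so that it survives editors that expand tabs.

```perl
use strict;
use warnings;

# Hypothetical helper (not the function from s2p): remove the "noise"
# colon, plus one following tab or space, from the front of every line.
sub strip_margin {
    my ($text) = @_;
    $text =~ s/^:[ \t]?//mg;    # /m makes ^ match at the start of each line
    return $text;
}

my $bin = "/usr/bin";           # assumed install directory, illustration only

# $bin is interpolated; \\n stays a literal backslash-n in the output.
my $generated = strip_margin(<<"EOT");
: #!$bin/perl
: print "hello from generated code\\n";
EOT

print $generated;
```

Everything with a colon in front is data, and everything without one is the generator, which is the whole point of the idiom.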

Of course, you aren't required to use multiline quotes. One often sees CGI scripts containing millions of print statements, one per line. It seems a bit like driving to church in an F-16, but hey, if it gets you there... (We will admit that a column of print statements has its own form of visual distinctiveness.)

When you are embedding a large, multiline quote containing some other language (such as HTML), it's often helpful to pretend you're programming inside-out, enclosing Perl into the other language instead, much as you might do with overtly everted languages such as PHP:

print <<"XML";
    blah blah blah @{[ scalar EXPR ]} blah blah blah
    blah blah blah @{[ LIST ]} blah blah blah
XML

You can use either of those two tricks to interpolate the values of arbitrarily complicated expressions into the long string.
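To make the EXPR and LIST placeholders concrete, here is a small sketch that uses both tricks in one here-document. The @items data and the element names are invented for the example. Note that a list interpolated this way is joined with the value of $", which is a single space by default.

```perl
use strict;
use warnings;

my @items = ("apple", "pear", "plum");    # hypothetical data

my $xml = <<"XML";
<basket count="@{[ scalar @items ]}">
  @{[ map "<item>$_</item>", @items ]}
</basket>
XML

print $xml;
```

The first @{[ ... ]} forces scalar context to get the element count; the second interpolates the whole list returned by map.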

Some program generators don't look much like program generators, depending on how much of their work they hide from you. In Chapter 22, "CPAN", we saw how a small Makefile.PL program could be used to write a Makefile. The Makefile can easily be 100 times bigger than the Makefile.PL that produced it. Think how much wear and tear that saves your fingers. Or don't think about it--that's the point, after all.

24.5.2. Generating Perl in Other Languages

It's easy to generate other languages in Perl, but the converse is also true. Perl can easily be generated in other languages because it's both concise and malleable. You can pick your quotes not to interfere with the other language's quoting mechanisms. You don't have to worry about indentation, or where you put your line breaks, or whether to backslash your backslashes Yet Again. You aren't forced to define a package as a single string in advance, since you can slide into your package's namespace repeatedly, whenever you want to evaluate more code in that package.
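The last point can be sketched in Perl itself, standing in for whatever outer program is producing the code. The package name Gen::Demo and both subroutines are hypothetical. Each chunk is compiled separately, and each one simply starts by naming the package again.

```perl
use strict;
use warnings;

# Hypothetical chunks of generated Perl, as another program might emit
# them at different times. Each chunk re-enters the same package.
my @chunks = (
    q{package Gen::Demo; sub greet { return "hello, $_[0]" }},
    q{package Gen::Demo; sub part  { return "goodbye, $_[0]" }},
);

for my $code (@chunks) {
    eval $code;
    die "generated code failed to compile: $@" if $@;
}

my $greeting = Gen::Demo::greet("world");
my $farewell = Gen::Demo::part("world");
print "$greeting\n$farewell\n";
```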

Another thing that makes it easy to write Perl in other languages (including Perl) is the #line directive. Perl knows how to process these as special directives that reconfigure its idea of the current filename and line number. This can be useful in error or warning messages, especially for strings processed with eval (which, when you think about it, is just Perl writing Perl). The syntax for this mechanism is the one used by the C preprocessor: when Perl encounters a # symbol and the word line, followed by a number and a filename, it sets __LINE__ to the number and __FILE__ to the filename.[3]

[3]Technically, it matches the pattern /^#\s*line\s+(\d+)\s*(?:\s"([^"]+)")?\s*$/, with $1 providing the line number for the next line, and $2 providing the optional filename specified within quotes. (A null filename leaves __FILE__ unchanged.)

Here are some examples that you can test by typing into perl directly. We've used a Control-D to indicate end-of-file, which is typical on Unix. DOS/Windows and VMS users can type Control-Z. If your shell uses something else, you'll have to use that to tell perl you're done. Alternatively, you can always type in __END__ to tell the compiler there's nothing left to parse.

Here, Perl's built-in warn function prints out the new filename and line number:

% perl
# line 2000 "Odyssey"
# the "#" on the previous line must be the first char on line
warn "pod bay doors";  # or die
^D
pod bay doors at Odyssey line 2001.

And here, the exception raised by die within the eval found its way into the $@ ($EVAL_ERROR) variable, along with the temporary new filename and line:

# line 1996 "Odyssey"
eval qq{
#line 2025 "Hal"
    die "pod bay doors";
};
print "Problem with $@";
warn "I'm afraid I can't do that";
^D
Problem with pod bay doors at Hal line 2025.
I'm afraid I can't do that at Odyssey line 2001.

This shows how a #line directive affects only the current compilation unit (file or eval STRING), and that when that unit is done being compiled, the previous settings are automatically restored. This way you can set up your own messages inside an eval STRING or do FILE without affecting the rest of your program.
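A program generator can use the same mechanism to make errors point back at whatever the code was generated from. The following is a sketch under stated assumptions: the %fragment hash, the file name report.tmpl, and the line number 42 are all made up, standing in for a piece of code lifted out of some template.

```perl
use strict;
use warnings;

# Hypothetical fragment of code, with the file and line it came from.
my %fragment = (
    file => "report.tmpl",
    line => 42,
    code => 'die "bad field";',
);

# Emit a #line directive first, so that diagnostics name the template
# rather than an anonymous "(eval N)".
my $source = qq{#line $fragment{line} "$fragment{file}"\n$fragment{code}\n};

eval $source;
my $error = $@;
print "Caught: $error";
```

The directive numbers the line that follows it, so the die is reported at line 42 of report.tmpl.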

Perl has a -P switch that invokes the C preprocessor, which emits #line directives. The C preprocessor was the original impetus for implementing #line, but it is seldom used these days, since there are usually better ways to do what we used to rely on it for. Perl has a number of other preprocessors, however, including the AutoSplit module. The JPL (Java Perl Lingo) preprocessor turns .jpl files into .java, .pl, .h, and .c files. It makes use of #line to keep the error messages accurate.

One of the very first Perl preprocessors was the sed-to-perl translator, s2p. In fact, Larry delayed the initial release of Perl in order to complete s2p and awk-to-perl (a2p), because he thought they'd improve the acceptance of Perl. Hmm, maybe they did.

See the online docs for more on these, as well as the find2perl translator.

24.5.3. Source Filters

If you can write a program to translate random stuff into Perl, then why not have a way of invoking that translator from within Perl?

The notion of a source filter started with the idea that a script or module should be able to decrypt itself on the fly, like this:

use MyDecryptFilter;

But the idea grew from there, and now a source filter can be defined to do any transformation on the input text you like. Put that together with the notion of the -x switch mentioned in Chapter 19, "The Command-Line Interface", and you have a general mechanism for pulling any chunk of program out of a message and executing it, regardless of whether it's written in Perl or not.

Using the Filter module from CPAN, one can now even do things like programming Perl in awk:

use Filter::exec "a2p";         # the awk-to-perl translator
1,30 { print $1 }

Now that's definitely what you might call idiomatic. But we won't pretend for a moment that it's common practice.

Copyright © 2001 O'Reilly & Associates. All rights reserved.