6.4 Cooperating with Other Languages

Just as there are many levels on which languages can compete, so too there are many levels on which languages can cooperate. Here we'll talk primarily about generation, translation and embedding (via linking).

6.4.1 Program Generation

Almost from the time people first figured out that they could write programs, they started writing programs that write other programs. These are called program generators. (If you're a history buff, you might know that RPG stood for Report Program Generator long before it stood for Role Playing Game.) Now, anyone who has written a program generator knows that it can make your eyes go crossed even when you're wide awake. The problem is simply that much of your program's data looks like real code, but isn't (at least not yet). The same text file contains both stuff that does something and similar looking stuff that doesn't. Perl has various features that make it easier to mix it together with other languages, textually speaking.

Of course, these features also make it easier to write Perl in Perl, but it's rather expected that Perl would cooperate with itself.

6.4.1.1 Generating other languages in Perl

Perl is, of course, a text-processing language, and most computer languages are textual. Beyond that, the lack of arbitrary limits together with the various quoting and interpolation mechanisms make it pretty easy to visually isolate the code of the other language you're spitting out. For example, here is a small chunk of s2p, the sed-to-perl translator:

    print &q(<<"EOT");
    :	#!$bin/perl
    :	eval 'exec $bin/perl -S \$0 \${1+"\$@"}'
    :	    if \$running_under_some_shell;
    :
    EOT
Here the enclosed text happens to be legal in two languages, both Perl and shell. We've used the trick of putting a colon and a tab on the front of every line, which visually isolates the enclosed code. One variable, $bin, is interpolated into the quoted text; the other dollar signs are backslashed so that they come out literally.

Of course, you aren't required to use multi-line quotes. One often sees CGI scripts containing millions of print statements, one per line. It seems a bit like driving to church in an F-16, but hey, if it gets you there....

When you are embedding a large, multi-line quote containing some other language (such as HTML), it's sometimes helpful to pretend you're enclosing Perl into the other language instead:

    print <<"END";
    stuff blah blah blah ${ \(
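As a minimal sketch of this kind of interpolation, two well-known Perl idioms let you drop an arbitrary expression into a double-quoted here-document: `${ \( EXPR ) }` dereferences a reference to a scalar expression, and `@{[ LIST ]}` dereferences an anonymous array built from a list expression. The variable names and the HTML below are hypothetical and are not taken from the listing above.

```perl
use strict;
use warnings;

# Hypothetical data for the page being generated.
my $title = "inventory";
my @items = ("plums", "apples", "pears");

# ${ \( EXPR ) } interpolates the value of one scalar expression.
# @{[ LIST ]} interpolates a list expression; here the list has a
# single element because join() has already built the string.
my $page = <<"END";
<h1>${ \( ucfirst $title ) }</h1>
<p>${ \( scalar @items ) } items: @{[ join ", ", sort @items ]}</p>
END

print $page;
```

The point of both forms is that the surrounding text stays readable as HTML, while the Perl expressions sit inside it as clearly delimited islands.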
You can use either of those two tricks to interpolate the value of any scalar

6.4.1.2 Generating Perl in other languages

Perl can easily be generated in other languages because it's both concise and malleable. You can pick your quotes not to interfere with the other language's quoting mechanisms. You don't have to worry about indentation, or where you put your line breaks, or whether to backslash your backslashes yet again. You aren't forced to define a package as a single string in advance, since you can slide into your package's namespace repeatedly, whenever you want to evaluate more code in that package.

6.4.2 Translation from Other Languages

One of the very first Perl applications was the sed-to-perl translator, s2p. In fact, Larry delayed the initial release of Perl in order to complete s2p and awk-to-perl (a2p), because he thought they'd improve the acceptance of Perl. Hmm, maybe they did.

6.4.2.1 s2p

The s2p program takes a sed script specified on the command line (or from standard input) and produces a comparable Perl script on the standard output. Options include:
The Perl script produced looks very sed-like, and there may very well be better ways to express what you want to do in Perl. For instance, s2p does not make any use of the split operator, but you might want to.

The Perl script you end up with may be either faster or slower than the original sed script. If you're only interested in speed you'll just have to try it both ways. Of course, if you want to do something sed doesn't do, you have no choice. It's often possible to speed up the Perl script by various methods, such as deleting all references to $\ and chop.

6.4.2.2 a2p

The a2p program takes an awk script specified on the command line (or from standard input) and produces a comparable Perl script on the standard output. Options include:
a2p cannot do as good a job translating as a human would, but it usually does pretty well. There are some areas where you may want to examine the Perl script produced and tweak it some. Here are some of them, in no particular order.
There is an awk idiom of putting

Perl differentiates numeric comparison from string comparison. awk has one operator for both that decides at run-time which comparison to do. a2p does not try to do a complete job of awk emulation at this point. Instead it guesses which one you want. It's almost always right, but it can be spoofed. All such guesses are marked with the comment

It would be possible to emulate awk's behavior in selecting string versus numeric operations at run-time by inspection of the operands, but it would be gross and inefficient. Besides, a2p almost always guesses right.
Perl does not attempt to emulate the behavior of awk in which nonexistent array elements spring into existence simply by being referenced. If somehow you are relying on this mechanism to create null entries for a subsequent
If a2p makes a split command that assigns to a list of variables that looks like
The "exit" statement in awk doesn't necessarily exit; it goes to the
Perl has two kinds of arrays, numerically indexed and associative. awk arrays are usually translated to associative arrays, but if you happen to know that the index is always going to be numeric, you could change the

awk starts by assuming OFMT has the value

Near the top of the line loop will be the split operator that is implicit in the awk script. There are times when you can move this operator down past some conditionals that test the entire record, so that the split is not done as often.
For aesthetic reasons you may wish to change the array base $[ from

Cute comments that say:

    # Here's a workaround because awk is so dumb.

are, of course, passed through unmodified.

awk scripts are often embedded in a shell script that pipes stuff into and out of awk. Often the shell script wrapper can be incorporated into the Perl script, since Perl can start up pipes into and out of itself, and can do other things that awk can't do by itself.
Scripts that refer to the special variables

The produced Perl script may have subroutines defined to deal with awk's semantics regarding "getline" and "print". Since a2p usually picks correctness over efficiency, it is almost always possible to rewrite such code to be more efficient by discarding the semantic sugar.
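Two of the differences noted in the list above are easy to check directly in Perl. The following is a minimal sketch, with hypothetical variable names, showing that Perl chooses numeric or string comparison by operator rather than by inspecting the operands, and that merely reading a nonexistent hash element does not create it.

```perl
use strict;
use warnings;

# Numeric and string comparison are separate operators in Perl.
my ($x, $y) = ("10", "10.0");
my $numeric_equal = ($x == $y) ? 1 : 0;   # compares 10 and 10 as numbers
my $string_equal  = ($x eq $y) ? 1 : 0;   # compares "10" and "10.0" as strings

# Reading a missing element yields undef but does not create the key.
my %seen;
my $value   = $seen{missing};
my $created = exists $seen{missing} ? 1 : 0;

# If a translated script relied on the entry springing into existence,
# one option is to assign it explicitly.
$seen{wanted} = "" unless exists $seen{wanted};
my @keys = sort keys %seen;

print "numeric: $numeric_equal, string: $string_equal\n";
print "created by reading: $created, keys: @keys\n";
```

When reviewing a2p output, the comparison operators are the first thing worth checking against what the awk script actually meant, since a wrong guess there changes results silently.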
6.4.2.3 find2perl

The find2perl program is really easy to understand if you already understand the UNIX find(1) program. Just type find2perl instead of find, and give it the same arguments you would give to find. It will spit out an equivalent Perl script. There are a couple of options you can use that your ordinary find(1) command probably doesn't support:
6.4.2.4 Source filters

The notion of a source filter started with the idea that a script or module should be able to decrypt itself on the fly, like this:

    #!/usr/bin/perl
    use MyDecryptFilter;
    @*x$]`0uN&k^Zx02jZ^X{.?s!(f;9Q/^A^@~~8H]|,%@^P:q-=
    ...

But the idea grew from there, and now a source filter can be defined to do any transformation on the input text you like. One can now even do things like this:

    #!/usr/bin/perl
    use Filter::exec "a2p";
    1,30{print $1}

Put that together with the notion of the -x switch mentioned at the beginning of this chapter, and you have a general mechanism for pulling any chunk of program out of an article and executing it, regardless of whether it's written in Perl or not. Now that's cooperation.

The Filter module is available from CPAN.

6.4.3 Translation to Other Languages

Historically, the Perl interpreter has been rather self-contained. When Perl was redesigned for Version 5, however, one of the requirements was that it be possible to write extension modules that could traverse the parsed syntax tree and emit code in other languages, either low-level or high-level. This has now come to pass.
More precisely, this is now coming to pass. Malcolm Beattie has been developing a "real compiler" for Perl. As of this writing, it's in Alpha 2 state, which means it mostly works, except for the really hard bits. The compiler consists of an ordinary Perl parser and interpreter (since you need to be able to execute

    perl -MO=C foo.pl >foo.c

There are three backends at the moment. The C backend rather woodenly spits out C calls into the ordinary Perl interpreter, but it can translate almost anything except the most egregious abuses of the dynamic capabilities of the interpreter. The Bytecode module is also fairly complete, and spits out an external Perl bytecode representation, which can then be read back in and executed by a suitably clued version of Perl. Finally, the CC backend attempts to translate into more idiomatic C with a lot of optimization. Obviously, that's a bit harder to do than the other thing. Nevertheless, it already works on a majority of the Perl regression tests. It's possible with some care to get C code that runs considerably faster than Perl 5's interpreter, which is no slouch to begin with. And Malcolm hasn't put in all the optimizations he wants to yet.

This is an ongoing topic of research, but you'll want to keep track of it. You are quite likely to be using this someday soon, if you aren't already. Look for it on CPAN of course, if it's not already a part of the standard Perl distribution by the time you read this.

6.4.4 Embedding Perl in C and C++
Another part of the design of Perl 5 was that it be possible to embed a Perl interpreter in a C or C++ program. And in fact, the ordinary perl executable pretends to have an embedded interpreter in it; the

    PerlInterpreter *my_perl;

    int
    main(int argc, char **argv)
    {
        int exitstatus;

        my_perl = perl_alloc();
        perl_construct( my_perl );

        exitstatus = perl_parse( my_perl, xs_init, argc, argv, (char **) NULL );
        if (exitstatus)
            exit( exitstatus );

        exitstatus = perl_run( my_perl );

        perl_destruct( my_perl );
        perl_free( my_perl );

        exit(exitstatus);
    }
The important parts are the calls to
There are many other useful entry points into the interpreter, such as
A number of programs in the real world already have Perl embedded in them; the authors know of several proprietary products shipping with embedded Perl interpreters. There are also a couple of modules for the Apache HTTP servers that use an embedded Perl interpreter to avoid process startup costs on CGI-like scripting. And then there's the version of Berkeley's nvi editor with a Perl engine in it. Watch out, emacs, you've got company.

6.4.5 Embedding C and C++ in Perl

If a respectable number of programs embed a Perl interpreter, then a veritable flood of extension modules embed C and C++ into Perl. Again, the Perl distribution itself does this with many of its standard extension modules, including DB_File, DynaLoader, Fcntl, FileHandle, GDBM_File, NDBM_File, ODBM_File, POSIX, Safe, SDBM_File, and Socket. And many of the modules on CPAN do this. So if you decide to do it yourself, you won't feel like you're researching a Ph.D. dissertation.

And again, we only have space to give you teasers for the online documentation, which is exhaustively extensive. We recommend you start with the perlxstut(3) manpage, which is a tutorial on the XS language, a preprocessor that spits out the glue routines you need to do the "impedance matching" between Perl and C or C++. You'll also be interested in perlxs(3), perlguts(3), and perlcall(3).

And once again, let us reiterate that your best resource is the Perl community itself. They invented a lot of this stuff, and are emotionally committed to making you like it, whether you like it or not. You'd better cooperate.