Compiling Your Code (Programming Perl)

18.2. Compiling Your Code

Perl is always in one of two modes of operation: either it is compiling your program, or it is executing it--never both at the same time. Throughout this book, we refer to certain events as happening at compile time, or we say that "the Perl compiler does this and that". At other points, we mention that something else occurs at run time, or that "the Perl interpreter does this and that". Although you can get by with thinking of both the compiler and interpreter as simply "Perl", understanding which of these two roles Perl is playing at any given point is essential to understanding why many things happen as they do. The perl executable implements both roles: first the compiler, then the interpreter. (Other roles are possible, too; perl is also an optimizer and a code generator. Occasionally, it's even a trickster--but all in good fun.)

It's also important to understand the distinction between compile phase and compile time, and between run phase and run time. A typical Perl program gets one compile phase, and then one run phase. A "phase" is a large-scale concept. But compile time and run time are small-scale concepts. A given compile phase does mostly compile-time stuff, but it also does some run-time stuff via BEGIN blocks. A given run phase does mostly run-time stuff, but it can do compile-time stuff through operators like evalSTRING.

In the typical course of events, the Perl compiler reads through your entire program source before execution starts. This is when Perl parses the declarations, statements, and expressions to make sure they're syntactically legal.[3] If it finds a syntax error, the compiler attempts to recover from the error so it can report any other errors later in the source. Sometimes this works, and sometimes it doesn't; syntax errors have a noisy tendency to trigger a cascade of false alarms. Perl bails out in frustration after about 10 errors.

[3] No, there's no formal syntax diagram like a BNF, but you're welcome to peruse the perly.y file in the Perl source tree, which contains the yacc(1) grammar Perl uses. We recommend that you stay out of the lexer, which has been known to induce eating disorders in lab rats.

In addition to the interpreter that processes the BEGIN blocks, the compiler processes your program with the connivance of three notional agents. The lexer scans for each minimal unit of meaning in your program. These are sometimes called "lexemes", but you'll more often hear them referred to as tokens in texts about programming languages. The lexer is sometimes called a tokener or a scanner, and what it does is sometimes called lexing or tokenizing. The parser then tries to make sense out of groups of these tokens by assembling them into larger constructs, such as expressions and statements, based on the grammar of the Perl language. The optimizer rearranges and reduces these larger groupings into more efficient sequences. It picks its optimizations carefully, not wasting time on marginal optimizations, because the Perl compiler has to be blazing fast when used as a load-and-go compiler.

This doesn't happen in independent stages, but all at once with a lot of cross talk between the agents. The lexer occasionally needs hints from the parser to know which of several possible token types it's looking at. (Oddly, lexical scope is one of the things the lexical analyzer doesn't understand, because that's the other meaning of "lexical".) The optimizer also needs to keep track of what the parser is doing, because some optimizations can't happen until the parse has reached a certain point, like finishing an expression, statement, block, or subroutine.

You may think it odd that the Perl compiler does all these things at once instead of one after another, but it's really just the same messy process you go through to understand natural language on the fly, while you're listening to it or reading it. You don't wait till the end of a chapter to figure out what the first sentence meant. You could think of the following correspondences:

Computer Language	Natural Language
Character	Letter
Token	Morpheme
Term	Word
Expression	Phrase
Statement	Sentence
Block	Paragraph
File	Chapter
Program	Story

Assuming the parse goes well, the compiler deems your input a valid story, er, program. If you use the -c switch when running your program, it prints out a "syntax OK" message and exits. Otherwise, the compiler passes the fruits of its efforts on to other agents. These "fruits" come in the form of a parse tree. Each fruit on the tree--or node, as it's called--represents one of Perl's internal opcodes, and the branches on the tree represent that tree's historical growth pattern. Eventually, the nodes will be strung together linearly, one after another, to indicate the execution order in which the run-time system will visit those nodes.

Each opcode is the smallest unit of executable instruction that Perl can think about. You might see an expression like $a = -($b + $c) as one statement, but Perl thinks of it as six separate opcodes. Laid out in a simplified format, the parse tree for that expression would look like Figure 18-2. The numbers represent the visitation order that the Perl run-time system will eventually follow.

Figure 18.2. Opcode visitation order of $a = -($b + $c)

Perl isn't a one-pass compiler as some might imagine. (One-pass compilers are great at making things easy for the computer and hard for the programmer.) It's really a multipass, optimizing compiler consisting of at least three different logical passes that are interleaved in practice. Passes 1 and 2 run alternately as the compiler repeatedly scurries up and down the parse tree during its construction; pass 3 happens whenever a subroutine or file is completely parsed. Here are those passes:

Pass 1: Bottom-Up Parsing

During this pass, the parse tree is built up by the yacc(1) parser using the tokens it's fed from the underlying lexer (which could be considered another logical pass in its own right). Bottom-up just means that the parser knows about the leaves of the tree before it knows about its branches and root. It really does figure things out from bottom to top in Figure 18-2, since we drew the root at the top, in the idiosyncratic fashion of computer scientists. (And linguists.)

As each opcode node is constructed, per-opcode sanity checks verify correct semantics, such as the correct number and types of arguments used to call built-in functions. As each subsection of the tree takes shape, the optimizer considers what transformations it can apply to the entire subtree now beneath it. For instance, once it knows that a list of values is being fed to a function that takes a specific number of arguments, it can throw away the opcode that records the number of arguments for functions that take a varying number of arguments. A more important optimization, known as constant folding, is described later in this section.

This pass also constructs the node visitation order used later for execution, which is a really neat trick because the first place to visit is almost never the top node. The compiler makes a temporary loop of opcodes, with the top node pointing to the first opcode to visit. When the top-level opcode is incorporated into something bigger, that loop of opcodes is broken, only to make a bigger loop with the new top node. Eventually the loop is broken for good when the start opcode gets poked into some other structure such as a subroutine descriptor. The subroutine caller can still find that first opcode despite its being way down at the bottom of the tree, as it is in Figure 18-2. There's no need for the interpreter to recurse back down the parse tree to figure out where to start.

Pass 2: Top-Down Optimizer

A person reading a snippet of Perl code (or of English code, for that matter) cannot determine the context without examining the surrounding lexical elements. Sometimes you can't decide what's really going on until you have more information. Don't feel bad, though, because you're not alone: neither can the compiler. In this pass, the compiler descends back down the subtree it's just built to apply local optimizations, the most notable of which is context propagation. The compiler marks subjacent nodes with the appropriate contexts (void, scalar, list, reference, or lvalue) imposed by the current node. Unwanted opcodes are nulled out but not deleted, because it's now too late to reconstruct the execution order. We'll rely on the third pass to remove them from the provisional execution order determined by the first pass.

Pass 3: Peephole Optimizer

Certain units of code have their own storage space in which they keep lexically scoped variables. (Such a space is called a scratchpad in Perl lingo.) These units include evalSTRINGs, subroutines, and entire files. More importantly from the standpoint of the optimizer, they each have their own entry point, which means that while we know the execution order from here on, we can't know what happened before, because the construct could have been called from anywhere. So when one of these units is done being parsed, Perl runs a peephole optimizer on that code. Unlike the previous two passes, which walked the branch structure of the parse tree, this pass traverses the code in linear execution order, since this is basically the last opportunity to do so before we cut the opcode list off from the parser. Most optimizations were already performed in the first two passes, but some can't be.

Assorted late-term optimizations happen here, including stitching together the final execution order by skipping over nulled out opcodes, and recognizing when various opcode juxtapositions can be reduced to something simpler. The recognition of chained string concatenations is one important optimization, since you'd really like to avoid copying a string back and forth each time you add a little bit to the end. This pass doesn't just optimize; it also does a great deal of "real" work: trapping barewords, generating warnings on questionable constructs, checking for code unlikely to be reached, resolving pseudohash keys, and looking for subroutines called before their prototypes had been compiled.

Pass 4: Code Generation

This pass is optional; it isn't used in the normal scheme of things. But if any of the three code generators--B::Bytecode, B::C, and B::CC--are invoked, the parse tree is accessed one final time. The code generators emit either serialized Perl bytecodes used to reconstruct the parse tree later or literal C code representing the state of the compile-time parse tree.

Generation of C code comes in two different flavors. B::C simply reconstructs the parse tree and runs it using the usual runops() loop that Perl itself uses during execution. B::CC produces a linearized and optimized C equivalent of the run-time code path (which resembles a giant jump table) and executes that instead.

During compilation, Perl optimizes your code in many, many ways. It rearranges code to make it more efficient at execution time. It deletes code that can never be reached during execution, like an if (0) block, or the elsifs and the else in an if (1) block. If you use lexically typed variables declared with my ClassName $var or our ClassName $var, and the ClassName package was set up with the use fields pragma, accesses to constant fields from the underlying pseudohash are typo-checked at compile time and converted into array accesses instead. If you supply the sort operator with a simple enough comparison routine, such as {$a <=> $b} or {$b cmp $a}, this is replaced by a call to compiled C code.

Perl's most dramatic optimization is probably the way it resolves constant expressions as soon as possible. For example, consider the parse tree shown in Figure 18-2. If nodes 1 and 2 had both been literals or constant functions, nodes 1 through 4 would have been replaced by the result of that computation, something like Figure 18-3.

Figure 18.3. Constant folding

This is called constant folding. Constant folding isn't limited to simple cases such as turning 2**10 into 1024 at compile time. It also resolves function calls--both built-ins and user-declared subroutines that meet the criteria from the section Section 18.4.1, "Inlining Constant Functions" in Chapter 6, "Subroutines". Reminiscent of FORTRAN compilers' notorious knowledge of their own intrinsic functions, Perl also knows which of its own built-ins to call during compilation. That's why if you try to take the log of 0.0 or the sqrt of a negative constant, you'll incur a compilation error, not a run-time error, and the interpreter is never run at all.[4]

[4] Actually, we're oversimplifying here. The interpreter does get run, because that's how the constant folder is implemented. But it is run immediately at compile time, similar to how BEGIN blocks are executed.

Even arbitrarily complicated expressions are resolved early, sometimes triggering the deletion of complete blocks such as the one here:

if (2 * sin(1)/cos(1) < 3 && somefn()) { whatever() }

No code is generated for what can never be evaluated. Because the first part is always false, neither somefn nor whatever can ever be called. (So don't expect to goto labels inside that block, because it won't even exist at run time.) If somefn were an inlinable constant function, then even switching the evaluation order like this:

if (somefn() && 2 * sin(1)/cos(1) < 3)) { whatever() }

wouldn't change the outcome, since the entire expression still resolves at compile time. If whatever were inlinable, it wouldn't be called at run time, nor even during compilation; its value would just be inserted as though it were a literal constant. You would then incur a warning about a "Useless use of a constant in void context". This might surprise you if you didn't realize it was a constant. However, if whatever were the last statement evaluated in a function called in a nonvoid context (as determined by the optimizer), you wouldn't see the warning.

You can see the final result of the constructed parse tree after all optimization stages with perl -Dx. (The -D switch requires a special, debugging-enabled build of Perl). Also see the section on B::Deparse described below.

All in all, the Perl compiler works hard (but not too hard) to optimize code so that, come run time, overall execution is sped up. It's about time to get your program running, so let's do that now.