2.6 Statements and Declarations

A Perl program consists of a sequence of declarations and statements. A declaration may be placed anywhere a statement may be placed, but it has its primary (or only) effect at compile time. (Some declarations do double duty as ordinary statements, while others are totally transparent at run-time.)

After compilation, the main sequence of statements is executed just once, unlike in sed and awk scripts, where the sequence of statements is executed for each input line. While this means that you must explicitly loop over the lines of your input file (or files), it also means you have much more control over which files and which lines you look at.[ 36 ]

Unlike many high-level languages, Perl requires only subroutines and report formats to be explicitly declared. All other user-created objects spring into existence with a null or 0 value unless they are defined by some explicit operation such as assignment.[ 37 ]
You may declare your variables though, if you like. You may even make it an error to use an undeclared variable. This kind of discipline is fine, but you have to declare that you want the discipline. (This seems appropriate, somehow.) See use strict under "Pragmas" later in this section.

2.6.1 Simple Statements

A simple statement is an expression evaluated for its side effects. Every simple statement must end in a semicolon, unless it is the final statement in a block. In this case, the semicolon is optional (but strongly encouraged in any multiline block, since you may eventually add another line).
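Even though some operators (like eval {} and do {}) look like compound statements, they aren't. From the outside they are just terms in an expression, and thus need an explicit semicolon at the end.

Any simple statement may optionally be followed by a single modifier, just before the terminating semicolon (or block ending). The possible modifiers are:

    if EXPR
    unless EXPR
    while EXPR
    until EXPR

The if and unless modifiers work pretty much as you'd expect if you speak English:

    $trash->take('out') if $you_love_me;
    shutup() unless $you_want_me_to_leave;

The while and until modifiers evaluate the statement repeatedly: while for as long as the condition is true, until for as long as it is false:

    $expression++ while -e "$file$expression";
    kiss('me') until $I_die;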
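As a minimal sketch of the four modifiers on made-up data (the variables @log, $verbose, and $n are illustrative and not part of the text above):

```perl
use strict;
use warnings;

my @log;
my $verbose = 0;
my $n       = 0;

push @log, "quiet" unless $verbose;   # runs, because $verbose is false
push @log, "loud"  if $verbose;       # skipped

$n++    while $n < 5;                 # condition is tested before each increment
$n -= 2 until $n <= 0;                # 5, 3, 1, -1: stops once $n <= 0

print "@log $n\n";                    # quiet -1
```

Each modifier governs only the single statement it follows, which is why these read naturally as one-liners.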
The while and until modifiers also have the usual while-loop semantics (conditional evaluated first), except when applied to a do block, in which case the block executes once before the conditional is evaluated. This lets you write loops like:

    do {
        $line = <STDIN>;
        ...
    } until $line eq ".\n";

See the do entry in Chapter 3. Note also that the loop-control statements described later will not work in this construct, since modifiers don't take loop labels. Sorry. You can always wrap another block around it to do that sort of thing. Or write a real loop with multiple loop-control commands inside. Speaking of real loops, we'll talk about compound statements next.

2.6.2 Compound Statements
A sequence of statements that defines a scope is called a block. Sometimes a block is delimited by the file containing it (in the case of either a "require"d file, or the program as a whole), and sometimes it's delimited by the extent of a string (in the case of an eval). But generally, a block is delimited by braces ({ }).
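Compound statements are built out of expressions and blocks. The following conditionals and loops may be used to control flow (EXPR is an expression, BLOCK a brace-delimited block, and LABEL an optional loop label):

    if (EXPR) BLOCK
    if (EXPR) BLOCK else BLOCK
    if (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK
    LABEL while (EXPR) BLOCK
    LABEL while (EXPR) BLOCK continue BLOCK
    LABEL for (EXPR; EXPR; EXPR) BLOCK
    LABEL foreach VAR (LIST) BLOCK
    LABEL BLOCK continue BLOCK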
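Note that unlike in C and Pascal, these are defined in terms of blocks, not statements. This means that the braces are required; no dangling statements are allowed. If you want to write conditionals without braces, there are several other ways to do it. The following all do the same thing:

    if (!open(FOO, $foo)) { die "Can't open $foo: $!"; }
    die "Can't open $foo: $!" unless open(FOO, $foo);
    open(FOO, $foo) or die "Can't open $foo: $!";     # FOO or bust!
    open(FOO, $foo) ? 'hi mom' : die "Can't open $foo: $!";
                        # a bit exotic, that last one

Your readers would tend to prefer the third of those under most circumstances.

2.6.3 If Statements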
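The if statement is straightforward. Since blocks are always bounded by braces, there is never any ambiguity about which if an else or an elsif goes with. If you use unless in place of if, the sense of the test is reversed. That is:

    unless ($OS_ERROR) ...

is equivalent to:[ 38 ]

    if (not $OS_ERROR) ...

2.6.4 Loop Statements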
All compound loop statements have an optional label, an identifier followed by a colon that names the loop.

2.6.4.1 While statements

The while statement repeatedly executes the block as long as the condition is true.
The while statement has an optional extra block on the end called a continue block. This is a block that is executed every time the block is continued, either by falling off the end of the first block, or by an explicit loop-control command that goes to the next iteration. The continue block is not heavily used in practice, but it's in there so we can define the for loop rigorously. So let's do that.

2.6.4.2 For loops

The C-style for loop has three semicolon-separated expressions within its parentheses. These three expressions function respectively as the initialization, the condition, and the re-initialization expressions of the loop. (All three expressions are optional, and the condition, if omitted, is assumed to be true.) The for loop can be defined in terms of the corresponding while loop. Thus, the following:

    for ($i = 1; $i < 10; $i++) {
        ...
    }

is the same as:

    $i = 1;
    while ($i < 10) {
        ...
    }
    continue {
        $i++;
    }

(Defining the for loop in terms of a continue block allows us to preserve the correct semantics even when the loop is continued via a next statement. This is unlike C, in which there is no way to write the exact equivalent of a continued for loop without chicanery.)

If you want to iterate through two variables simultaneously, just separate the parallel expressions with commas:

    for ($i = 0, $bit = 1; $mask & $bit; $i++, $bit <<= 1) {
        print "Bit $i is set\n";
    }

Besides the normal array index looping, for can lend itself to many other interesting applications. There doesn't even have to be an explicit loop variable. Here's one example that avoids the problem you get into if you explicitly test for end-of-file on an interactive file descriptor, causing your program to appear to hang.

    $on_a_tty = -t STDIN && -t STDOUT;
    sub prompt { print "yes? " if $on_a_tty }
    for ( prompt(); <STDIN>; prompt() ) {
        # do something
    }

One final application for the for loop results from the fact that all three expressions are optional. If you do leave all three expressions out, you have written an "infinite" loop in a way that is customary in the culture of both Perl and C:

    for (;;) {
        ...
    }

If the notion of infinite loops bothers you, we should point out that you can always terminate such a loop from the inside with an appropriate loop-control command. Of course, if you're writing the code to control a cruise missile, you may not actually need to write a loop exit. The loop will be terminated automatically at the appropriate moment.[ 39 ]
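The equivalence between the C-style for loop and a while loop with a continue block can be checked with a small sketch. The arrays @for_seen and @while_seen are illustrative names; the point is that next still triggers the re-initialization step in both forms.

```perl
use strict;
use warnings;

my (@for_seen, @while_seen);

for (my $i = 1; $i < 10; $i++) {
    next if $i % 2;            # skip odd numbers; $i++ still runs
    push @for_seen, $i;
}

my $j = 1;
while ($j < 10) {
    next if $j % 2;            # next jumps to the continue block
    push @while_seen, $j;
}
continue {
    $j++;
}

print "@for_seen\n";           # 2 4 6 8
print "@while_seen\n";         # 2 4 6 8
```

Without the continue block, the next in the while version would skip the increment and the loop would never terminate.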
2.6.4.3 Foreach loops
The foreach loop iterates over a list value and sets the control variable (VAR) to be each element of the list in turn:

    foreach VAR (LIST) BLOCK

The variable is implicitly local to the loop and regains its former value upon exiting the loop. If the variable was previously declared with my, that variable instead of the global one is used, but it's still localized to the loop.
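A short sketch of that localization, using an illustrative lexical named $item: the loop borrows the variable for its duration, and the earlier value is back once the loop exits.

```perl
use strict;
use warnings;

my $item = "before";
my @seen;

foreach $item (qw(a b c)) {
    push @seen, $item;         # $item is a, then b, then c
}

print "$item @seen\n";         # before a b c
```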
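The foreach keyword is actually a synonym for the for keyword, so you can use foreach for readability or for for brevity. If the loop variable is omitted, $_ is used. If the list is an actual array (as opposed to an expression returning a list value), you can modify each element of the array by modifying the loop variable inside the loop, because the variable is an implicit alias for each item in the list. Some examples: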
for (@ary) { s/ham/turkey/ } # substitution
foreach $elem (@elements) { # multiply by 2
$elem *= 2;
}
for $count (10,9,8,7,6,5,4,3,2,1,'BOOM') { # do a countdown
print $count, "\n"; sleep(1);
}
for $count (reverse 'BOOM', 1..10) { # same thing
print $count, "\n"; sleep(1);
}
for $item (split /:[\\\n:]*/, $TERMCAP) { # any
That last one is the canonical way to print out the values of a hash in sorted order.

Note that there is no way with foreach to tell where you are in a list. You can compare adjacent elements by remembering the previous one in a variable, but sometimes you just have to break down and write an ordinary for loop with subscripts. That's what for is there for, after all.

Here's how a C programmer might code up a particular algorithm in Perl:

    for ($i = 0; $i < @ary1; $i++) {
        for ($j = 0; $j < @ary2; $j++) {
            if ($ary1[$i] > $ary2[$j]) {
                last;       # can't go to outer :-(
            }
            $ary1[$i] += $ary2[$j];
        }
        # this is where that last takes me
    }

Whereas here's how a Perl programmer more comfortable with list processing might do it:

    WID: foreach $this (@ary1) {
        JET: foreach $that (@ary2) {
            next WID if $this > $that;
            $this += $that;
        }
    }

See how much easier this is? It's cleaner, safer, and faster. It's cleaner because it's less noisy. It's safer because if code gets added between the inner and outer loops later on, the new code won't be accidentally executed: next explicitly iterates the other loop rather than merely terminating the inner one. And it's faster because Perl executes a foreach statement more rapidly than it would the equivalent for loop because the elements are accessed directly instead of through subscripting.

Like the while statement, the foreach statement can also take a continue block.

We keep dropping hints about next, but now we're going to explain it.

2.6.4.4 Loop control
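We mentioned that you can put a label on a loop to give it a name. The label identifies the loop for the loop-control commands next, last, and redo. Loops are typically named for the item the loop is processing on each iteration. This interacts nicely with the loop-control commands, which are designed to read like English when used with an appropriate label and a statement modifier. The archetypical loop processes lines, so the archetypical loop label is LINE:

    next LINE if /^#/;      # discard comments

The syntax for the loop-control commands is:

    last LABEL
    next LABEL
    redo LABEL

The label is optional; if it is omitted, the loop-control command refers to the innermost enclosing loop.

The last command is like the break statement in C (as used in loops); it immediately exits the loop in question. The continue block, if any, is not executed. The following example exits the loop at the first blank line:

    LINE: while (<STDIN>) {
        last LINE if /^$/;      # exit when done with header
        ...
    }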
The next command is like the continue statement in C; it skips the rest of the current iteration and starts the next iteration of the loop. If there is a continue block on the loop, it is always executed just before the conditional is about to be evaluated again, so it can be used to increment a loop variable even when a particular iteration has been interrupted by a next:

    LINE: while (<STDIN>) {
        next LINE if /^#/;      # skip comments
        next LINE if /^$/;      # skip blank lines
        ...
    } continue {
        $count++;
    }

The redo command restarts the loop block without evaluating the conditional again. The continue block, if any, is not executed. This command is normally used by programs that want to lie to themselves about what was just input. Suppose you are processing a file like /etc/termcap. If your input line ends with a backslash to indicate continuation, you want to skip ahead and get the next record.

    while (<>) {
        chomp;
        if (s/\\$//) {
            $_ .= <>;
            redo;
        }
        # now process $_
    }

which is Perl shorthand for the more explicitly written version:

    LINE: while ($line = <ARGV>) {
        chomp($line);
        if ($line =~ s/\\$//) {
            $line .= <ARGV>;
            redo LINE;
        }
        # now process $line
    }
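To see next, last, and redo interact with a continue block, here is a self-contained sketch that reads from an in-memory string instead of STDIN. The sample text, the @records array, and the $count counter are assumptions made for illustration.

```perl
use strict;
use warnings;

# Sample input: a comment, a record continued with a backslash,
# a blank line, and a line that should never be reached.
my $text = "# comment\nname=foo\\\nbar\n\nafter blank\n";
open my $fh, '<', \$text or die "Can't open in-memory file: $!";

my @records;
my $count = 0;
LINE: while (my $line = <$fh>) {
    chomp $line;
    if ($line =~ s/\\$//) {            # trailing backslash: join with next line
        my $more = <$fh>;
        $line .= $more if defined $more;
        redo LINE;                     # rerun the block on the joined line
    }
    next LINE if $line =~ /^#/;        # skip comments
    last LINE if $line =~ /^$/;        # stop at the first blank line
    push @records, $line;
}
continue {
    $count++;                          # runs after next and normal completion,
}                                      # but not after redo or last
close $fh;

print "@records ($count)\n";           # name=foobar (2)
```

The counter ends at 2 because only the comment line and the joined record pass through the continue block.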
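One more point about loop-control commands. You may have noticed that we are not calling them "statements". That's because they aren't statements, though they can be used for statements. (This is unlike C, where break and continue are allowed only as statements.) You can almost think of them as unary operators that just happen to cause a change in control flow, so you can use them anywhere it makes sense to use them in an expression. One sometimes sees this coding error:

    open FILE, $file
        or warn "Can't open $file: $!\n", next FILE;    # WRONG

The intent is fine, but the next FILE is parsed as one of the arguments to warn, which is a list operator. So the next executes before the warn gets a chance to emit the warning. In this case, it's easily fixed by turning the warn list operator into a function call with some suitably situated parentheses:

    open FILE, $file
        or warn("Can't open $file: $!\n"), next FILE;   # okay

2.6.5 Bare Blocks and Case Structures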
A block by itself (labeled or not) is semantically equivalent to a loop that executes once. Thus you can use last to leave the block or redo to restart it. The bare block is particularly nice for doing case structures (multiway switches).

    SWITCH: {
        if (/^abc/) { $abc = 1; last SWITCH; }
        if (/^def/) { $def = 1; last SWITCH; }
        if (/^xyz/) { $xyz = 1; last SWITCH; }
        $nothing = 1;
    }

There is no official switch statement in Perl, because there are already several ways to write the equivalent. In addition to the above, you could write: [ 42 ]

    SWITCH: {
        $abc = 1, last SWITCH  if /^abc/;
        $def = 1, last SWITCH  if /^def/;
        $xyz = 1, last SWITCH  if /^xyz/;
        $nothing = 1;
    }

or:

    SWITCH: {
        /^abc/ && do { $abc = 1; last SWITCH; };
        /^def/ && do { $def = 1; last SWITCH; };
        /^xyz/ && do { $xyz = 1; last SWITCH; };
        $nothing = 1;
    }

or, formatted so it stands out more as a "proper" switch statement:

    SWITCH: {
        /^abc/      && do {
                            $abc = 1;
                            last SWITCH;
                       };
        /^def/      && do {
                            $def = 1;
                            last SWITCH;
                       };
        /^xyz/      && do {
                            $xyz = 1;
                            last SWITCH;
                       };
        $nothing = 1;
    }

or:

    SWITCH: {
        /^abc/ and $abc = 1, last SWITCH;
        /^def/ and $def = 1, last SWITCH;
        /^xyz/ and $xyz = 1, last SWITCH;
        $nothing = 1;
    }

or even, horrors:

    if (/^abc/)
        { $abc = 1 }
    elsif (/^def/)
        { $def = 1 }
    elsif (/^xyz/)
        { $xyz = 1 }
    else
        { $nothing = 1 }

You might think it odd to write a loop over a single value, but a common idiom for a switch statement is to use foreach's aliasing capability to make a temporary assignment to $_ for convenient matching:

    for ($some_ridiculously_long_variable_name) {
        /In Card Names/     and do { push @flags, '-e'; last; };
        /Anywhere/          and do { push @flags, '-h'; last; };
        /In Rulings/        and do {                    last; };
        die "unknown value for form variable where: `$where'";
    }
Notice how the last commands in that example ignore the do {} blocks, which aren't loops, and exit the main loop instead.
2.6.6 Goto
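Although not for the faint of heart (or the pure of heart, for that matter), Perl does support a goto command. There are three forms:

    goto LABEL
    goto EXPR
    goto &NAME

The goto LABEL form finds the statement labeled with LABEL and resumes execution there. It may not be used to go into any construct that requires initialization, such as a subroutine or a foreach loop.

The goto EXPR form is a generalization of goto LABEL. It expects the expression to return a label name, whose location has to be resolved dynamically at run time. This allows for computed gotos, but isn't necessarily recommended if you're optimizing for maintainability:

    goto ("FOO", "BAR", "GLARCH")[$i];

In almost all cases like this, it's a far, far better idea to use the structured control flow mechanisms of next, last, or redo instead of resorting to a goto. For certain applications, a hash of function pointers or the catch-and-throw pair of eval and die for exception processing can also be prudent approaches.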
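The two alternatives mentioned above, a hash of code references and eval/die, can be sketched together. The %handler table and its three entries are hypothetical stand-ins for the FOO, BAR, and GLARCH labels; this is one way to structure it, not the only one.

```perl
use strict;
use warnings;

# Hypothetical handlers keyed by name, used instead of a computed goto.
my %handler = (
    FOO    => sub { "handled foo" },
    BAR    => sub { "handled bar" },
    GLARCH => sub { die "glarch is not supported\n" },   # "throw"
);

my @results;
for my $name (qw(FOO BAR GLARCH)) {
    my $result = eval { $handler{$name}->() };           # "catch"
    if (!defined $result) {
        $result = "error: $@";                           # $@ holds the thrown message
        chomp $result;
    }
    push @results, $result;
}

print "$_\n" for @results;
```

Looking up a handler by name keeps the dispatch explicit and lets an unknown or failing case be handled in one place.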
The goto &NAME form substitutes a call to the named subroutine for the currently running subroutine.

2.6.7 Global Declarations

Subroutine and format declarations are global declarations. No matter where you place them, they declare global thingies (actually, package thingies, but packages are global) that are visible from everywhere. Global declarations can be put anywhere a statement can, but have no effect on the execution of the primary sequence of statements - the declarations take effect at compile time. Typically the declarations are put at the beginning or the end of your program, or off in some other file. However, if you're using lexically scoped private variables created with my, you'll want to make sure your format or subroutine definition is within the same block scope as the my if you expect to be able to access those private variables.[ 43 ]

Formats are bound to a filehandle and accessed implicitly via the write function. For more on formats, see "Formats" later in this chapter.
Subroutines are generally accessed directly, but don't actually have to be defined before calls to them can be compiled. The difference between a subroutine definition and a mere declaration is that the definition supplies a block containing the code to be executed, while the declaration doesn't.

Declaring a subroutine allows a subroutine name to be used as if it were a list operator from that point forward in the compilation. You can declare a subroutine without defining it by just saying:

    sub myname;
    $me = myname $0 or die "can't get myname";

Note that it functions as a list operator, though, not as a unary operator, so be careful to use or instead of || here. The || binds too tightly to use after a list operator (at least, not without extra parentheses to turn the list operator back into a function call).

Subroutine definitions can be loaded from other files with the require statement, but there are two problems with that. First, the other file will typically insert the subroutine names into a package (a namespace) of its own choosing, not your package. Second, a require happens at run-time, so the declaration occurs too late to serve as a declaration in the file invoking the require.

A more useful way to pull in declarations and definitions is via the use declaration, which essentially performs a require at compile time and then lets you import declarations into your own namespace. Because it is importing names into your own (global) package at compile time, this aspect of use can be considered a kind of global declaration. See Chapter 5 for details on this.

2.6.8 Scoped Declarations

Like global declarations, lexically scoped declarations have an effect at the time of compilation. Unlike global declarations, lexically scoped declarations have an effect only from the point of the declaration to the end of the innermost enclosing block. That's why we call them lexically scoped, though perhaps "textually scoped" would be more accurate, since lexical scoping has nothing to do with lexicons. But computer scientists the world around know what "lexically scoped" means, so we perpetuate the usage here.
We mentioned that some aspects of use could be considered global declarations, but there are other aspects that are lexically scoped. In particular, use is not only used to perform symbol importation but also to implement various magical pragmas (compiler hints). Most such pragmas are lexically scoped, including the use strict 'vars' pragma described under "Pragmas" below.

A package declaration, oddly enough, is lexically scoped, despite the fact that a package is a global entity. But a package declaration merely declares the identity of the default package for the rest of the enclosing block. Undeclared, unqualified variable names will be looked up in that package. In a sense, a package isn't declared at all, but springs into existence when you refer to a variable that belongs in the package. It's all very Perlish.

The most frequently seen form of lexically scoped declaration is the declaration of my variables. A related form of scoping known as dynamic scoping applies to local variables, which are really global variables in disguise. If you refer to a variable that has not been declared, its visibility is global by default, and its lifetime is forever. A variable used at one point in your program is accessible from anywhere else in the program.[ 45 ] If this were all there were to the matter, Perl programs would quickly become unwieldy as they grew in size. Fortunately, you can easily create private variables using my, and semi-private values of global variables using local.

A my or a local declares the listed variables (in the case of my), or the values of the listed global variables (in the case of local), to be confined to the enclosing block, subroutine, eval, or file. If more than one variable is listed, the list must be placed in parentheses. All listed elements must be legal lvalues. (For my the constraints are even tighter: the elements must be simple scalar, array, or hash variables, and nothing else.) Here are some examples of declarations of lexically scoped variables:
    my $name = "fred";
    my @stuff = ("car", "house", "club");
    my ($vehicle, $home, $tool) = @stuff;

(These declarations also happen to perform an initializing assignment at run-time.)

A local variable is dynamically scoped, whereas a my variable is lexically scoped. The difference is that any dynamic variables are also visible to functions called from within the block in which those variables are declared. Lexical variables are not. They are totally hidden from the outside world, including any called subroutines (even if it's the same subroutine called from itself or elsewhere - every instance of the subroutine gets its own copy of the variables).[ 46 ] In either event, the variable (or local value) disappears when the program exits the lexical scope in which the my or local finds itself.

By and large, you should prefer to use my over local because it's faster and safer. But you have to use local if you want to temporarily change the value of an existing global variable, such as any of the special variables listed at the end of this chapter. Only alphanumeric identifiers may be lexically scoped. We won't talk much more about the semantics of local here. See local in Chapter 3 for more information.
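The difference between dynamic and lexical scoping is easiest to see when a subroutine calls another one. In this sketch the names $global, $lexical, show, and test are illustrative only.

```perl
use strict;
use warnings;

our $global  = "global";       # package variable
my  $lexical = "outer";        # file-scoped lexical

# show() sees the package variable and the file-scoped lexical.
sub show { return "$global/$lexical" }

sub test {
    local $global  = "temporary";   # visible to called subs until test() returns
    my    $lexical = "inner";       # a new variable, invisible to show()
    return show();
}

my $during = test();
my $after  = show();

print "$during $after\n";      # temporary/outer global/outer
```

The local value leaks into show() for the duration of the call and is restored afterwards, while the inner my variable never leaves test().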
Syntactically, my and local are simply modifiers (adjectives) on an lvalue expression. When you assign to a modified lvalue, the modifier doesn't change whether the lvalue is viewed as a scalar or a list. To figure out how the assignment will work, just pretend that the modifier isn't there. So:

    my ($foo) = <STDIN>;
    my @FOO = <STDIN>;

both supply a list context to the right-hand side, while:

    my $foo = <STDIN>;

supplies a scalar context.

The my binds more tightly (with higher precedence) than the comma does. The following only declares one variable because the list following my is not enclosed in parentheses:

    my $foo, $bar = 1;

This has the same effect as:

    my $foo;
    $bar = 1;

(You'll get a warning about the mistake if you use -w.)

The declared variable is not introduced (is not visible) until after the current statement. Thus:

    my $x = $x;

can be used to initialize the new inner $x with the value of the old outer $x. On the other hand, the expression:

    my $x = 123 and $x == 123

is false unless the old $x happened to have the value 123.

Declaring a lexical variable of a particular name hides any previously declared lexical variable of the same name. It also hides any unqualified global variable of the same name, but you can always get to the global variable by explicitly qualifying it with the name of the package the global is in. For example:
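A minimal sketch of that hiding, assuming a package variable $x in package main (the variable names and the @seen array are illustrative):

```perl
use strict;
use warnings;

our $x = 123;                  # package variable $main::x

my @seen;
{
    my $x = $x + 1;            # the right-hand $x is still the package variable
    push @seen, $x;            # 124: the lexical hides the global
    push @seen, $main::x;      # 123: the global is reachable when qualified
}
push @seen, $x;                # 123: the lexical went away with its block

print "@seen\n";               # 124 123 123
```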
A statement sequence may contain declarations of lexically scoped variables, but apart from declaring variable names, the declarations act like ordinary statements, and each of them is elaborated within the sequence of statements as if it were an ordinary statement.

2.6.9 Pragmas

Many languages allow you to give hints to the compiler. In Perl these hints are conveyed to the compiler with the use declaration. Some of the pragmas are:

    use integer
    use strict
    use lib
    use sigtrap
    use subs
    use vars

All the Perl pragmas are described in Chapter 7, but we'll talk about some of the more useful ones here.

By default, Perl assumes that it must do much of its arithmetic in floating point. But by saying:

    use integer;

you may tell the compiler that it's okay to use integer operations from here to the end of the enclosing block. An inner block may countermand this by saying:

    no integer;

which lasts until the end of that inner block.

Some users may wish to encourage the use of lexical variables. As an aid to catching implicit references to package variables, if you say:

    use strict 'vars';

then any variable reference from there to the end of the enclosing block must either refer to a lexical variable, or must be fully qualified with the package name. A compilation error results otherwise. An inner block may countermand this with:

    no strict 'vars'
You can also turn on strict checking of symbolic references and barewords with this pragma. Often people just say use strict; to turn on all three strictures.

Subroutines and variables that are imported from other modules have special privileges in Perl. Imported subroutines can override many built-in operators, and imported variables are exempt from use strict 'vars', since importation is considered a form of declaration. Sometimes you want to confer these privileges on your own subroutines and variables. You can do this with:

    use subs qw(read write);

and:

    use vars qw($fee $fie $foe $foo @sic);

Finally, Perl searches for modules in a standard list of locations. You need to be able to add to that list at compile time, because when you use modules they're loaded at compile time, and adding to the list at run-time would be too late. So you can put:

    use lib "/my/own/lib/directory";

at the front of your program to do this. Note that these last three pragmas all modify global structures, and can therefore have effects outside of the current lexical scope.
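In contrast to those global pragmas, the block scoping of use integer and no integer described earlier in this section can be sketched as follows. The variable names are illustrative.

```perl
use strict;
use warnings;

my $float = 10 / 3;            # floating-point division by default
my ($int, $inner);
{
    use integer;
    $int = 10 / 3;             # integer division inside this block: 3
    {
        no integer;
        $inner = 10 / 4;       # floating point again in the inner block: 2.5
    }
}

print "$int $inner\n";         # 3 2.5
```

Each pragma setting ends with the block that contains it, so the division on the first line is unaffected.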