Scoped Declarations (Programming Perl)

4.8. Scoped Declarations

Like global declarations, lexically scoped declarations have an effect at the time of compilation. Unlike global declarations, lexically scoped declarations only apply from the point of the declaration through the end of the innermost enclosing scope (block, file, or eval--whichever comes first). That's why we call them lexically scoped, though perhaps "textually scoped" would be more accurate, since lexical scoping has little to do with lexicons. But computer scientists the world over know what "lexically scoped" means, so we perpetuate the usage here.

Perl also supports dynamically scoped declarations. A dynamic scope also extends to the end of the innermost enclosing block, but in this case "enclosing" is defined dynamically at run time rather than textually at compile time. To put it another way, blocks nest dynamically by invoking other blocks, not by including them. This nesting of dynamic scopes may correlate somewhat to the nesting of lexical scopes, but the two are generally not identical, especially when any subroutines have been invoked.

We mentioned that some aspects of use could be considered global declarations, but other aspects of use are lexically scoped. In particular, use not only imports package symbols but also implements various magical compiler hints, known as pragmas (or if you're into classical forms, pragmata). Most pragmas are lexically scoped, including the use strict 'vars' pragma, which forces you to declare your variables before you can use them. See Section 4.9, "Pragmas", later in this chapter.

A package declaration, oddly enough, is itself lexically scoped, despite the fact that a package is a global entity. But a package declaration merely declares the identity of the default package for the rest of the enclosing block. Undeclared, unqualified variable names[5] are looked up in that package. In a sense, a package is never declared at all, but springs into existence when you refer to something that belongs to that package. It's all very Perlish.

[5]Also unqualified names of subroutines, filehandles, directory handles, and formats.
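As a small sketch of that lexical scoping (the package name Zoo and the variable names are made up for illustration), a package declaration inside a block stops applying at the closing brace:

```perl
use strict;
use warnings;

our $who = "main";              # declares $main::who

{
    package Zoo;                # default package is Zoo until the closing brace
    our $who = "zoo";           # declares $Zoo::who, a different global
    sub keeper { return __PACKAGE__ }   # compiled as Zoo::keeper
}

# The package declaration ended with the block, so we are back in main.
my $here = __PACKAGE__;         # "main"
my $seen = $who;                # the outer "our" is visible again: $main::who
```

The two globals never collide, because each unqualified name was looked up in whichever package was current where it was compiled.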

4.8.1. Scoped Variable Declarations

Most of the rest of the chapter is about using global variables. Or rather, it's about not using global variables. There are various declarations that help you not use global variables--or at least, not use them foolishly.

We already mentioned the package declaration, which was introduced into Perl long ago to allow globals to be split up into separate packages. This works pretty well for certain kinds of variables. Packages are used by libraries, modules, and classes to store their interface data (and some of their semi-private data) to avoid conflicting with variables and functions of the same name in your main program or in other modules. If you see someone write $Some::stuff,[6] they're using the $stuff scalar variable from the package Some. See Chapter 10, "Packages".

[6] Or the archaic $Some'stuff, which probably shouldn't be encouraged outside of Perl poetry.

If this were all there were to the matter, Perl programs would quickly become unwieldy as they got longer. Fortunately, Perl's three scoping declarations make it easy to create completely private variables (using my), to give selective access to global ones (using our), and to provide temporary values to global variables (using local):

my $nose;
our $House;
local $TV_channel;

If more than one variable is listed, the list must be placed in parentheses. For my and our, the elements may only be simple scalar, array, or hash variables. For local, the constraints are somewhat more relaxed: you may also localize entire typeglobs and individual elements or slices of arrays and hashes:

my ($nose, @eyes, %teeth);
our ($House, @Autos, %Kids);
local (*Spouse, $phone{HOME});

Each of these modifiers offers a different sort of "confinement" to the variables they modify. To oversimplify slightly: our confines names to a scope, local confines values to a scope, and my confines both names and values to a scope.

Each of these constructs may be assigned to, though they differ in what they actually do with the values, since they have different mechanisms for storing values. They also differ somewhat if you don't (as we didn't above) assign any values to them: my and local cause the variables in question to start out with values of undef or (), as appropriate; our, on the other hand, leaves the current value of its associated global unchanged.
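Here is a rough illustration of those starting values. The globals and the three helper subroutines are hypothetical, and the globals are initialized by their qualified names only so that each declaration can be shown on its own:

```perl
use strict;
use warnings;

$main::Count = 5;                   # package globals with existing values
@main::Queue = (1, 2, 3);

sub fresh_my {
    my ($s, @a);                    # brand-new lexicals: undef and ()
    return (defined $s ? 1 : 0, scalar @a);
}

sub fresh_local {
    local ($main::Count, @main::Queue);   # same globals, temporarily emptied
    return (defined $main::Count ? 1 : 0, scalar @main::Queue);
}

sub plain_our {
    our ($Count, @Queue);           # only names are declared; values untouched
    return ($Count, scalar @Queue);
}
```

Calling fresh_my or fresh_local reports an undefined scalar and an empty array, while plain_our still sees 5 and three elements. Once fresh_local returns, the globals have their old values back.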

Syntactically, my, our, and local are simply modifiers (like adjectives) on an lvalue expression. When you assign to a modified lvalue, the modifier doesn't change whether the lvalue is viewed as a scalar or a list. To figure out how the assignment will work, just pretend that the modifier isn't there. So either of:

my ($foo) = <STDIN>;
my @array = <STDIN>;

supplies a list context to the righthand side, while:

my $foo = <STDIN>;

supplies a scalar context.
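If you'd rather see this than take our word for it, here is a sketch that uses a made-up helper, context, in place of <STDIN>. It relies on the built-in wantarray to report how it was called:

```perl
use strict;
use warnings;

# Hypothetical helper: reports the context it was called in.
sub context { return wantarray ? "list" : "scalar" }

my ($first) = context();    # parentheses make this a list assignment
my @all     = context();    # array on the left: list assignment
my $single  = context();    # bare scalar: scalar assignment
```

Both $first and $all[0] end up holding "list", while $single holds "scalar", exactly as they would without the my.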

Modifiers bind more tightly (with higher precedence) than the comma does. The following example erroneously declares only one variable, not two, because the list following the modifier is not enclosed in parentheses.

my $foo, $bar = 1;              # WRONG

This has the same effect as:

my $foo;
$bar = 1;

You'll get a warning about the mistake if warnings are enabled, whether via the -w or -W command-line switches, or, preferably, through the use warnings declaration explained later in Section 4.9, "Pragmas".
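For comparison, here is the parenthesized form, as a minimal sketch with illustrative values. (Under use strict 'vars', the mistaken form above would not even compile unless $bar had been declared somewhere else.)

```perl
use strict;
use warnings;

my ($foo, $bar) = (0, 1);   # both are lexicals; $foo is 0 and $bar is 1

# A one-element righthand side fills only the first variable.
my ($left, $right) = (1);   # $left is 1, $right is undef
```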

In general, it's best to declare a variable in the smallest possible scope that suits it. Since variables declared in a control-flow statement are visible only in the block governed by that statement, their visibility is reduced. It reads better in English this way, too.

sub check_warehouse {
    for my $widget (our @Current_Inventory) {
        print "I have a $widget in stock today.\n";
    }
}

The most frequently seen form of declaration is my, which declares lexically scoped variables for which both the names and values are stored in the current scope's temporary scratchpad and may not be accessed globally. Closely related is the our declaration, which enters a lexically scoped name in the current scope, just as my does, but actually refers to a global variable that anyone else could access if they wished. In other words, it's a global variable masquerading as a lexical.

The other form of scoping, dynamic scoping, applies to local variables, which despite the word "local" are really global variables and have nothing to do with the local scratchpad.

4.8.2. Lexically Scoped Variables: my

To help you avoid the maintenance headaches of global variables, Perl provides lexically scoped variables, often called lexicals for short. Unlike globals, lexicals guarantee you privacy. Assuming you don't hand out references to these private variables that would let them be fiddled with indirectly, you can be certain that every possible access to these private variables is restricted to code within one discrete and easily identifiable section of your program. That's why we picked the keyword my, after all.

A statement sequence may contain declarations of lexically scoped variables. Such declarations tend to be placed at the front of the statement sequence, but this is not a requirement. In addition to declaring variable names at compile time, the declarations act like ordinary run-time statements: each of them is elaborated within the sequence of statements as if it were an ordinary statement without the modifier:

my $name = "fred";
my @stuff = ("car", "house", "club");
my ($vehicle, $home, $tool) = @stuff;

These lexical variables are totally hidden from the world outside their immediately enclosing scope. Unlike the dynamic scoping effects of local (see the next section), lexicals are hidden from any subroutine called from their scope. This is true even if the same subroutine is called from itself or elsewhere--each instance of the subroutine gets its own "scratchpad" of lexical variables.
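A short sketch of both claims, using hypothetical subroutine names: a called subroutine cannot see its caller's lexicals, and each recursive call gets its own copy.

```perl
use strict;
use warnings;

our $label = "global";

sub peek { return $label }              # compiled to refer to $main::label

sub hidden_from_peek {
    my $label = "private";              # invisible to peek()
    return peek();                      # so peek() still reports the global
}

sub descend {
    my ($depth) = @_;
    my $label = "level $depth";         # private to this particular call
    descend($depth - 1) if $depth > 0;  # inner calls get their own $label
    return $label;                      # still this call's value
}
```

Here hidden_from_peek() returns "global", and descend(3) returns "level 3" no matter how many inner calls created their own $label along the way.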

Unlike block scopes, file scopes don't nest; there's no "enclosing" going on, at least not textually. If you load code from a separate file with do, require, or use, the code in that file cannot access your lexicals, nor can you access lexicals from that file.

However, any scope within a file (or even the file itself) is fair game. It's often useful to have scopes larger than subroutine definitions, because this lets you share private variables among a limited set of subroutines. This is how you create variables that a C programmer would think of as "static":

{
    my $state = 0;

    sub on     { $state = 1 }
    sub off    { $state = 0 }
    sub toggle { $state = !$state }
}
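The same pattern gives you a private counter. This is only a sketch, and the names next_id and reset_id are ours:

```perl
use strict;
use warnings;

{
    my $next_id = 0;                    # visible only inside this block

    sub next_id  { return ++$next_id }
    sub reset_id { $next_id = 0; return }
}

my @ids = (next_id(), next_id());       # 1, then 2
```

Nothing outside the block can name $next_id, so the only way to change it is through the two subroutines. Note that the assignment of 0 happens when the block is executed at run time. If the subroutines might be called before that point, one common remedy is to make the block a BEGIN block.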

The eval STRING operator also works as a nested scope, since the code in the eval can see its caller's lexicals (as long as the names aren't hidden by identical declarations within the eval's own scope). Anonymous subroutines can likewise access any lexical variables from their enclosing scopes; if they do so, they're what are known as closures.[7] Combining those two notions, if a block evals a string that creates an anonymous subroutine, the subroutine becomes a closure with full access to the lexicals of both the eval and the block, even after the eval and the block have exited. See the section "Closures" in Chapter 8, "References".

[7]As a mnemonic, note the common element between "enclosing scope" and "closure". (The actual definition of closure comes from a mathematical notion concerning the completeness of sets of values and operations on those values.)
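A minimal sketch of both ideas follows. The names make_counter, $greeting, and $shout are invented for the example:

```perl
use strict;
use warnings;

sub make_counter {
    my ($start) = @_;
    my $count = defined $start ? $start : 0;
    return sub { return $count++ };     # captures this call's $count
}

my $from_zero = make_counter();
my $from_ten  = make_counter(10);
my @got = ($from_zero->(), $from_zero->(), $from_ten->(), $from_zero->());
# @got is (0, 1, 10, 2): each closure keeps its own $count

# An eval STRING sees the surrounding lexicals, so the anonymous
# subroutine it creates closes over $greeting.
my $greeting = "hello";
my $shout    = eval 'sub { return uc $greeting }';
```

Calling $shout->() returns "HELLO", even though the eval that compiled the subroutine finished long ago.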

The newly declared variable (or value, in the case of local) does not show up until the statement after the statement containing the declaration. Thus you could mirror a variable this way:

my $x = $x;

That initializes the new inner $x with the current value of $x, whether the current meaning of $x is global or lexical. (If you don't initialize the new variable, it starts out with an undefined or empty value.)

Declaring a lexical variable of a particular name hides any previously declared lexical of the same name. It also hides any unqualified global variable of the same name, but you can always get to the global variable by explicitly qualifying it with the name of the package the global is in, for example, $PackageName::varname.
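Putting the last few paragraphs together, here is a sketch (with an illustrative @seen array to record what each name refers to) of mirroring a global into a lexical and then reaching past the lexical:

```perl
use strict;
use warnings;

our $x = "global";
my @seen;

{
    my $x = $x;                 # the righthand $x is still the global
    $x .= " (private copy)";
    push @seen, $x;             # the lexical
    push @seen, $main::x;       # the global, reached by qualifying it
}

push @seen, $x;                 # the lexical is gone; this is the global again
```

Afterward @seen holds "global (private copy)", "global", and "global": the change made to the inner $x never touched the package variable.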

4.8.3. Lexically Scoped Global Declarations: our

A better way to access globals, especially for programs and modules running under the use strict declaration, is the our declaration. This declaration is lexically scoped in that it applies only through the end of the current scope. But unlike the lexically scoped my or the dynamically scoped local, our does not isolate anything to the current lexical or dynamic scope. Instead, it provides access to a global variable in the current package, hiding any lexicals of the same name that would have otherwise hidden that global from you. In this respect, our variables act just like my variables.

If you place an our declaration outside any brace-delimited block, it lasts through the end of the current compilation unit. Often, though, people put it just inside the top of a subroutine definition to indicate that they're accessing a global variable:

sub check_warehouse {
    our @Current_Inventory;
    my  $widget;
    foreach $widget (@Current_Inventory) {
        print "I have a $widget in stock today.\n";
    }
}

Since global variables are longer in life and broader in visibility than private variables, we like to use longer and flashier names for them than for temporary variables. This practice alone, if studiously followed, can do as much as use strict can toward discouraging the use of global variables, especially in less prestidigitatorial typists.

Repeated our declarations do not meaningfully nest. Every nested my produces a new variable, and every nested local a new value. But every time you use our, you're talking about the same global variable, irrespective of nesting. When you assign to an our variable, the effects of that assignment persist after the scope of the declaration. That's because our never creates values; it just exposes a limited form of access to the global, which lives forever:

our $PROGRAM_NAME = "waiter";
{
    our $PROGRAM_NAME = "server";
    # Code called here sees "server".
    ...
}
# Code executed here still sees "server".

Contrast this with what happens under my or local, where after the block, the outer variable or value becomes visible again:

my $i = 10;
{
    my $i = 99;
    ...
}
# Code compiled here sees outer variable.


local $PROGRAM_NAME = "waiter";
{
    local $PROGRAM_NAME = "server";
    # Code called here sees "server".
    ...
}
# Code executed here sees "waiter" again.

It usually only makes sense to assign to an our declaration once, probably at the very top of the program or module, or, more rarely, when you preface the our with a local of its own:

{
    local our @Current_Inventory = qw(bananas);
    check_warehouse();  # no, we haven't no bananas :-)
}
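To watch the temporary inventory come and go, here is a self-contained sketch. The banana_special wrapper is hypothetical, and this check_warehouse returns a string instead of printing so that the result is easy to inspect:

```perl
use strict;
use warnings;

@main::Current_Inventory = qw(apples pears);

sub check_warehouse {
    our @Current_Inventory;             # same global, by its short name
    return join ", ", @Current_Inventory;
}

sub banana_special {
    local our @Current_Inventory = qw(bananas);  # temporary contents
    return check_warehouse();           # sees only bananas
}

my $special = banana_special();         # "bananas"
my $normal  = check_warehouse();        # "apples, pears"
```

The our supplies the name and the local supplies the temporary value, so the original list is back as soon as banana_special returns.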

4.8.4. Dynamically Scoped Variables: local

Using a local operator on a global variable gives it a temporary value each time local is executed, but it does not affect that variable's global visibility. When the program reaches the end of that dynamic scope, this temporary value is discarded and the original value restored. But it's always still a global variable that just happens to hold a temporary value while that block is executing. If you call some other function while your global contains the temporary value and that function accesses that global variable, it sees the temporary value, not the original one. In other words, that other function is in your dynamic scope, even though it's presumably not in your lexical scope.[8]

[8] That's why lexical scopes are sometimes called static scopes: to contrast them with dynamic scopes and emphasize their compile-time determinability. Don't confuse this use of the term with how static is used in C or C++. The term is heavily overloaded, which is why we avoid it.
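A sketch of that difference, with made-up names: the called subroutine sees a value set with local but not a variable declared with my.

```perl
use strict;
use warnings;

our $Volume = "quiet";

sub report { return "volume is $Volume" }   # reads the global

sub loud_report {
    local $Volume = "loud";     # temporary value for this dynamic scope
    return report();            # report() runs inside that scope: "loud"
}

sub my_report {
    my $Volume = "loud";        # a separate lexical that report() can't see
    return report();            # so the global is untouched: "quiet"
}
```

After loud_report returns, a plain call to report() says "volume is quiet" again.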

If you have a local that looks like this:

{
    local $var = $newvalue;
    some_func();
    ...
}

you can think of it purely in terms of run-time assignments:

{
    $oldvalue = $var;
    $var = $newvalue;
    some_func();
    ...
}
continue {
    $var = $oldvalue;
}

The difference is that with local the value is restored no matter how you exit the block, even if you prematurely return from that scope. The variable is still the same global variable, but the value found there depends on which scope the function was called from. That's why it's called dynamic scoping--because it changes during run time.
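Here is a sketch of two such premature exits, an early return and a die caught by eval. The subroutine names are ours:

```perl
use strict;
use warnings;

our $Depth = 0;

sub early_return {
    local $Depth = $Depth + 1;
    return $Depth if $Depth > 0;    # leaves the scope early
    return "not reached";
}

sub blows_up {
    local $Depth = 99;
    die "giving up at depth $Depth\n";
}

my $inside       = early_return();  # 1, the temporary value
my $after_return = $Depth;          # 0, restored on return

eval { blows_up() };
my $error     = $@;                 # "giving up at depth 99\n"
my $after_die = $Depth;             # 0, restored even though we died
```

The hand-written save-and-restore version would have skipped its restore step in both cases.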

As with my, you can initialize a local with a copy of the same global variable. Any changes to that variable during the execution of a subroutine (and any others called from within it, which of course can still see the dynamically scoped global) will be thrown away when the subroutine returns. You'd certainly better comment what you are doing, though:

# WARNING: Changes are temporary to this dynamic scope.
local $Some_Global = $Some_Global;

A global variable then is still completely visible throughout your whole program, no matter whether it was explicitly declared with our or just allowed to spring into existence, or whether it's holding a local value destined to be discarded when the scope exits. In tiny programs, this isn't so bad, but for large ones, you'll quickly lose track of where in the code all these global variables are being used. You can forbid accidental use of globals, if you want, through the use strict 'vars' pragma, described in the next section.

Although both my and local confer some degree of protection, by and large you should prefer my over local. Sometimes, though, you have to use local so you can temporarily change the value of an existing global variable, like those listed in Chapter 28, "Special Names". Only alphanumeric identifiers may be lexically scoped, and many of those special variables aren't strictly alphanumeric. You also need to use local to make temporary changes to a package's symbol table, as shown in the section "Symbol Tables" in Chapter 10, "Packages". Finally, you can use local on a single element or a whole slice of an array or a hash. This even works if the array or hash happens to be a lexical variable, layering local's dynamic scoping behavior on top of those lexicals. We won't talk much more about the semantics of local here. See local in Chapter 29, "Functions" for more information.
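Two of those uses can be sketched briefly. The hash and subroutine names are hypothetical, and $" is the special variable that separates list elements interpolated into a string:

```perl
use strict;
use warnings;

my %settings = (mode => "normal");      # a lexical hash

sub mode { return $settings{mode} }

sub debug_mode {
    local $settings{mode} = "debug";    # localize one element of the lexical
    return mode();                      # sees "debug" for now
}

sub dashed {
    my @parts = @_;
    local $" = "-";                     # temporary list separator
    return "@parts";
}
```

Here debug_mode() returns "debug" while a later mode() returns "normal", and dashed(1, 2, 3) returns "1-2-3" without disturbing the separator for anyone else.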