
4.2. Data Types and Variables

Perl has three basic data types: scalars, arrays, and hashes.

Scalars are essentially simple variables. They are preceded by a dollar sign ($). A scalar is either a number, a string, or a reference. (A reference is a scalar that points to another piece of data. References are discussed later in this chapter.) If you provide a string in which a number is expected or vice versa, Perl automatically converts the operand using fairly intuitive rules.

Arrays are ordered lists of scalars accessed with a numeric subscript (subscripts start at 0). They are preceded by an "at" sign (@).

Hashes are unordered sets of key/value pairs accessed with the keys as subscripts. They are preceded by a percent sign (%).
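As a minimal sketch of the three types side by side (the variable names here are illustrative only and do not come from the text), the following declares one of each and shows the automatic string/number conversion described above:

```perl
use strict;
use warnings;

# A scalar holds a single number, string, or reference.
my $count = "3";                  # a string that looks like a number
my $total = $count + 4;           # "3" is converted to the number 3
my $label = $total . " items";    # 7 is converted to the string "7"

# An array is an ordered list of scalars; subscripts start at 0.
my @colors = ('red', 'green', 'blue');
my $first  = $colors[0];

# A hash holds key/value pairs; the key is the subscript.
my %stock  = ('apples', 3, 'oranges', 6);
my $apples = $stock{'apples'};

print "$label, first color is $first, $apples apples in stock\n";
```

Note that a single element of an array or hash is itself a scalar, which is why $colors[0] and $stock{'apples'} are written with a dollar sign.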

4.2.2. String Interpolation

Strings are sequences of characters. String literals are usually delimited by either single (') or double (") quotes. Double-quoted string literals are subject to backslash and variable interpolation, and single-quoted strings are not (except for \' and \\, used to put single quotes and backslashes into single-quoted strings). You can embed newlines directly in your strings.
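The difference between the two quote styles can be sketched as follows (the variable names are illustrative assumptions):

```perl
use strict;
use warnings;

my $name = 'World';

my $double = "Hello, $name!\n";     # $name and \n are both interpolated
my $single = 'Hello, $name!\n';     # kept literally: dollar sign, backslash, n
my $quoted = 'It\'s a C:\\ path';   # \' and \\ are the only escapes honored here

print $double;
print $single, "\n";
print $quoted, "\n";
```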

Table 4-1 lists all the backslashed or escape characters that can be used in double-quoted strings.

Table 4-1. Double-quoted string representations

Code    Meaning
\n      Newline
\r      Carriage return
\t      Horizontal tab
\f      Form feed
\b      Backspace
\a      Alert (bell)
\e      ESC character
\033    ESC in octal
\x7f    DEL in hexadecimal
\cC     Ctrl-C
\\      Backslash
\"      Double quote
\u      Force next character to uppercase
\l      Force next character to lowercase
\U      Force all following characters to uppercase
\L      Force all following characters to lowercase
\Q      Backslash all following non-alphanumeric characters
\E      End \U, \L, or \Q
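A short sketch of the case-changing and quoting escapes from Table 4-1 in use (the sample strings are made up for illustration):

```perl
use strict;
use warnings;

my $word = 'perl';
my $loud = 'LOUD';

my $capital = "\u$word";            # uppercase the next character only
my $shout   = "\U$word\E rocks";    # uppercase everything up to \E
my $quiet   = "\L$loud\E Noise";    # lowercase everything up to \E
my $escaped = "\Qa.b*c\E";          # backslash the non-alphanumeric characters

print "$capital\n$shout\n$quiet\n$escaped\n";

# The character escapes all produce single characters.
my $same_esc = ("\e" eq "\033") ? 'yes' : 'no';
print "ESC written two ways is the same character: $same_esc\n";
print "Ctrl-C is character number ", ord("\cC"), "\n";
```

The \Q form is mostly useful when a string is about to be interpolated into a pattern and its punctuation should be matched literally.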

Table 4-2 lists alternative quoting schemes that can be used in Perl. These are useful in diminishing the number of commas and quotes you may have to type, and they allow you not to worry about escaping characters such as backslashes when there are many instances in your data. The generic forms allow you to use any non-alphanumeric, non-whitespace characters as delimiters in place of the slash (/). If the delimiters are single quotes, no variable interpolation is done on the pattern. Parentheses, brackets, braces, and angle brackets can be used as delimiters in their standard opening and closing pairs.

Table 4-2. Quoting syntax in Perl

Customary   Generic   Meaning         Interpolation
''          q//       Literal         No
""          qq//      Literal         Yes
``          qx//      Command         Yes
( )         qw//      Word list       No
(none)      qr//      Pattern         Yes
//          m//       Pattern match   Yes
s///        s///      Substitution    Yes
y///        tr///     Translation     No
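The generic forms can be sketched as follows. The data values are invented for illustration, and qx// is left out because its result depends on the commands available on the host system:

```perl
use strict;
use warnings;

my $user = 'alice';

my $literal = q{$user said "it's fine"};    # no interpolation, no escaping needed
my $interp  = qq{$user said "it's fine"};   # $user is interpolated
my @words   = qw(apple banana cherry);      # ('apple', 'banana', 'cherry')

# Brace delimiters save escaping every slash in a path.
my $pattern = qr{^/usr/(\w+)/};
my ($dir)   = '/usr/local/bin' =~ $pattern;

(my $path  = '/usr/local/bin') =~ s{/usr/local}{/opt};
(my $upper = 'hello')          =~ tr/a-z/A-Z/;

print "$literal\n$interp\n@words\n$dir\n$path\n$upper\n";
```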

4.2.3. Here Documents

A line-oriented form of quoting is based on the Unix shell "here-document" syntax. Following a <<, you specify a string to terminate the quoted material, and all lines following the current line down to the terminating string are the value of the item. This is of particular importance if you're trying to print something like HTML that would be cleaner to print as a chunk instead of as individual lines. For example:

#!/usr/local/bin/perl -w

my $Price = 'right';
    
print <<"EOF";
The price is $Price.
EOF

The terminating string does not have to be quoted. For example, the previous example could have been written as:

#!/usr/local/bin/perl -w

my $Price = 'right';
    
print <<EOF;
The price is $Price.
EOF

You can assign here documents to a string:

my $assign_this_heredoc = <<"EOS";
This string is assigned to \$assign_this_heredoc.
EOS

You can use a here document to execute commands:

#!/usr/local/bin/perl -w

print <<`CMD`;
ls -l
CMD

You can stack here documents:

#!/usr/local/bin/perl -w

print <<"joe", <<"momma"; # You can stack them
I said foo.
joe
I said bar.
momma

One caveat about here documents: you may have noticed in each of these examples that the quoted text is always left-justified. That's because any whitespace used for indentation will be included in the string. For example:

#!/usr/local/bin/perl -w

print <<"    INDENTED";
    Same old, same old.
    INDENTED

Although you can use a trick of including whitespace in the terminating tag to keep it indented (as we did here), the string itself will still have the whitespace embedded. In this case, its value is "    Same old, same old." with the four leading spaces intact.
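One common workaround, sketched here as a suggestion and not as something the text prescribes, is to strip the leading whitespace with a substitution as the here document is assigned. The terminator is left unindented to keep the example simple:

```perl
use strict;
use warnings;

# Assign the here document, then remove leading spaces and tabs from each line.
# The /m modifier makes ^ match at the start of every line, not just the string.
(my $text = <<"END") =~ s/^[ \t]+//gm;
    Same old, same old.
    Still the same.
END

print $text;
```

Perl 5.26 and later also accept an indented here document written with <<~, which removes the common leading whitespace automatically. The substitution shown above works on older versions as well.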

4.2.5. Variables

A variable always begins with the character that identifies its type: $, @, or %. Most of the variable names you create can begin with a letter or underscore, followed by any combination of letters, digits, or underscores, up to 255 characters in length. Upper- and lowercase letters are distinct. Variable names that begin with a digit can contain only digits, and variable names that begin with a character other than an alphanumeric or underscore can contain only that character. The latter forms are usually predefined variables in Perl, so it is best to name your variables beginning with a letter or underscore.

Variables have the undef value before they are first assigned or when they become "empty." For scalar variables, undef evaluates to 0 when used as a number, and a zero-length, empty string ("") when used as a string.
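A small sketch of how undef behaves in each role (the variable names are illustrative). The uninitialized warning is switched off inside the block because warnings would otherwise report each use of the undefined value:

```perl
use strict;
use warnings;

my $unset;                                # declared but never assigned: undef

my $state = defined($unset) ? 'defined' : 'undefined';

my ($as_number, $as_string);
{
    no warnings 'uninitialized';          # silence "Use of uninitialized value"
    $as_number = $unset + 5;              # undef acts as 0
    $as_string = "[" . $unset . "]";      # undef acts as ""
}

print "$state, $as_number, $as_string\n";
```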

Simple variable assignment uses the assignment operator (=) with the appropriate data. For example:

$age = 26;                # Assigns 26 to $age
@date = (8, 24, 70);      # Assigns the three-element list to @date
%fruit = ('apples', 3, 'oranges', 6);
                          # Assigns the list elements to %fruit in key/value pairs

Scalar variables are always named with an initial $, even when referring to a scalar value that is part of an array or hash.

Every variable type has its own namespace. You can, without fear of conflict, use the same name for a scalar variable, an array, or a hash (or, for that matter, a filehandle, a subroutine name, or a label). This means that $foo and @foo are two different variables. It also means that $foo[1] is an element of @foo, not a part of $foo.
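The separate namespaces can be sketched like this, reusing the name foo from the text for all three variable types (the values are made up):

```perl
use strict;
use warnings;

my $foo = 'scalar value';            # the scalar $foo
my @foo = ('zero', 'one', 'two');    # the array @foo, unrelated to $foo
my %foo = ('one', 'hash value');     # the hash %foo, unrelated to both

my $element = $foo[1];               # element 1 of @foo
my $value   = $foo{'one'};           # value stored under 'one' in %foo

print "$foo / $element / $value\n";
```

The bracket style after the name decides which variable is meant: square brackets select the array, braces select the hash, and no subscript at all means the plain scalar.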

4.2.6. Scalar and List Contexts

Every operation that you invoke in a Perl script is evaluated in a specific context, and how that operation behaves may depend on the context it is being called in. There are two major contexts: scalar and list. All operators know which context they are in, and some return lists in contexts wanting a list and scalars in contexts wanting a scalar. For example, the localtime function returns a nine-element list in list context:

($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime( );

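But in a scalar context, localtime returns a single human-readable date string such as "Thu Sep  9 12:02:35 1999" (the number of seconds since January 1, 1970 is what the time function returns):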

$now = localtime( );
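The two behaviors can be compared directly. This is a minimal sketch, and the time function is included only to show where the seconds-since-1970 value comes from:

```perl
use strict;
use warnings;

# List context: nine separate fields.
my @fields = localtime();
my ($sec, $min, $hour, $mday, $mon, $year) = @fields;

# Scalar context: one formatted date string.
my $stamp = localtime();

# Seconds since January 1, 1970 come from time, not from localtime.
my $epoch = time();

# $year counts from 1900 and $mon from 0, so both need adjusting for display.
printf "%04d-%02d-%02d\n", $year + 1900, $mon + 1, $mday;
print "$stamp\n";
print "$epoch\n";
```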

Statements that look confusing are easy to evaluate by identifying the proper context. For example, assigning what is commonly a list literal to a scalar variable:

$a = (2, 4, 6, 8);

gives $a the value 8. The context forces the right side to evaluate to a scalar, and the action of the comma operator in the expression (in the scalar context) returns the value farthest to the right.

Another type of statement that might be confusing is the evaluation of an array or hash variable as a scalar. For example:

$b = @c;

When an array variable is evaluated as a scalar, the number of elements in the array is returned. This type of evaluation is useful for finding the number of elements in an array. The special $#array form of an array value returns the index of the last member of the list (one less than the number of elements).

If necessary, you can force a scalar context in the middle of a list by using the scalar function.
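The array-related cases above can be sketched together (the array contents are invented for illustration):

```perl
use strict;
use warnings;

my @c = ('a', 'b', 'c', 'd');

my $count      = @c;      # scalar context: number of elements
my $last_index = $#c;     # index of the last element

# Inside a list, @c would normally be flattened into its elements.
# The scalar function forces the count to be used in that position instead.
my @report = ('count', scalar(@c), 'items', @c);

print "$count elements, last index $last_index\n";
print "@report\n";
```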




Copyright © 2002 O'Reilly & Associates. All rights reserved.