

Programming Perl, Second Edition

Chapter 2
The Gory Details
 

2.5 Operators

The terms of an expression often need to be combined and modified in various ways, and that's what operators are for. The tightness with which operators bind is controlled by the precedence of the operators. Perl operators have the following associativity and precedence, listed from highest precedence to lowest.[27]

[27] Classic Camel readers will note that we reversed this table from the old edition. The higher precedence operators are now higher on the page, which makes some kind of metaphorical sense.

Associativity   Operators
Left            Terms and list operators (leftward)
Left            ->
Nonassociative  ++ --
Right           **
Right           ! ~ \ and unary + and -
Left            =~ !~
Left            * / % x
Left            + - .
Left            << >>
Nonassociative  Named unary operators
Nonassociative  < > <= >= lt gt le ge
Nonassociative  == != <=> eq ne cmp
Left            &
Left            | ^
Left            &&
Left            ||
Nonassociative  ..
Right           ?:
Right           = += -= *= and so on
Left            , =>
Nonassociative  List operators (rightward)
Right           not
Left            and
Left            or xor

It may seem like there are too many precedence levels. Well, you're right, there are. Fortunately, there are two things going for you here. First, the precedence levels as they're defined usually follow your intuition, presuming you're not psychotic. And second, if you're merely neurotic, you can always put in extra parentheses to relieve your anxiety.

Note that any operators borrowed from C keep the same precedence relationship with each other, even where C's precedence is slightly screwy. (This makes learning Perl easier for C folks.)

In the following sections, these operators are covered in precedence order. With very few exceptions, these all operate on scalar values only, not list values. We'll mention the exceptions as they come up.

Terms and List Operators (Leftward)

Any term is of highest precedence in Perl. These include variables, quote and quotelike operators, any expression in parentheses, and any function whose arguments are parenthesized. Actually, there aren't really any functions in this sense, just list operators and unary operators behaving as functions because you put parentheses around their arguments. These operators are all covered in Chapter 3, Functions.

Now, listen carefully. Here are a couple of rules that are very important and simplify things greatly, but may occasionally produce counterintuitive results for the unwary. If any list operator (such as print) or any named unary operator (such as chdir) is followed by a left parenthesis as the next token on the same line,[28] the operator and its arguments within parentheses are taken to be of highest precedence, just like a normal function call. The rule is: If it looks like a function call, it is a function call. You can make it look like a non-function by prefixing the arguments with a unary plus, which does absolutely nothing, semantically speaking--it doesn't even convert the argument to numeric.

[28] And we nearly had you convinced Perl was a free-form language.

For example, since || has lower precedence than chdir, we get:

chdir $foo    || die;       # (chdir $foo) || die
chdir($foo)   || die;       # (chdir $foo) || die
chdir ($foo)  || die;       # (chdir $foo) || die
chdir +($foo) || die;       # (chdir $foo) || die

but, because * has higher precedence than chdir, we get:

chdir $foo * 20;            # chdir ($foo * 20)
chdir($foo) * 20;           # (chdir $foo) * 20
chdir ($foo) * 20;          # (chdir $foo) * 20
chdir +($foo) * 20;         # chdir ($foo * 20)

Likewise for numeric operators:

rand 10 * 20;               # rand (10 * 20)
rand(10) * 20;              # (rand 10) * 20
rand (10) * 20;             # (rand 10) * 20
rand +(10) * 20;            # rand (10 * 20)
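Since rand returns a different value each time, the same four forms are easier to check with int, which is also a named unary operator. This is only a sketch of the rule; the variable names are ours.

```perl
use strict;
use warnings;

# int is a named unary operator, so * binds more tightly than it does
# unless the argument is parenthesized like a function call.
my $no_parens   = int 7.5 * 2;      # int (7.5 * 2)  => 15
my $call        = int(7.5) * 2;     # (int 7.5) * 2  => 14
my $spaced_call = int (7.5) * 2;    # (int 7.5) * 2  => 14
my $unary_plus  = int +(7.5) * 2;   # int (7.5 * 2)  => 15

print "$no_parens $call $spaced_call $unary_plus\n";   # 15 14 14 15
```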

In the absence of parentheses, the precedence of list operators such as print, sort, or chmod is either very high or very low depending on whether you look at the left side of the operator or the right side of it. (That's what the "Leftward" is doing in the title of this section.) For example, in:

@ary = (1, 3, sort 4, 2);
print @ary;         # prints 1324

the commas on the right of the sort are evaluated before the sort, but the commas on the left are evaluated after. In other words, a list operator tends to gobble up all the arguments that follow it, and then act like a simple term with regard to the preceding expression. Note that you have to be careful with parentheses:

# These evaluate exit before doing the print:
print($foo, exit);  # Obviously not what you want.
print $foo, exit;   # Nor is this.
# These do the print before evaluating exit:
(print $foo), exit; # This is what you want.
print($foo), exit;  # Or this.
print ($foo), exit; # Or even this.

Also note that:

print ($foo & 255) + 1, "\n";   # prints ($foo & 255)

probably doesn't do what you expect at first glance. Fortunately, mistakes of this nature generally produce warnings like "Useless use of addition in a void context" when you use the -w command-line switch.
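If you really did mean to print the sum, either of the following forms should do it. This is a small sketch; the value given to $foo is purely illustrative.

```perl
use strict;
use warnings;

my $foo = 0x1234;                 # illustrative value; low byte is 0x34 (52)

# An extra pair of parentheses makes the whole list print's argument.
print(($foo & 255) + 1, "\n");    # prints 53

# A unary plus keeps "print (" from looking like a function call.
print +($foo & 255) + 1, "\n";    # prints 53

my $expected = ($foo & 255) + 1;  # the value both statements print
```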

Also parsed as terms are the do {} and eval {} constructs, as well as subroutine and method calls, the anonymous array and hash composers [] and {}, and the anonymous subroutine composer sub {}.

The Arrow Operator

Just as in C and C++, -> is an infix dereference operator. If the right side is either a [...] or {...} subscript, then the left side must be either a hard or symbolic reference to an array or hash (or a location capable of holding a hard reference, if it's an lvalue (assignable)). More on this in Chapter 4, References and Nested Data Structures.

Otherwise, the right side must be a method name or a simple scalar variable containing the method name, and the value of the left side must either be an object (a blessed reference) or a class name (that is, a package name). See Chapter 5, Packages, Modules, and Object Classes.

Autoincrement and Autodecrement

The ++ and -- operators work as in C. That is, if placed before a variable, they increment or decrement the variable before returning the value, and if placed after, they increment or decrement the variable after returning the value. For example, $a++ increments the value of scalar variable $a, returning the value before it performs the increment. Similarly, --$b{(/(\w+)/)[0]} decrements the element of the hash %b indexed by the first "word" in the default search variable ($_) and returns the value after the decrement.[29]

[29] OK, so that wasn't exactly fair. We just wanted to make sure you were paying attention. Here's how that expression works. First the pattern match finds the first word in $_ using the regular expression \w+. The parentheses around that cause the word to be returned as a single-element list value, because the pattern match is in a list context. The list context is supplied by the list slice operator, (...)[0], which returns the first (and only) element of the list. That value is then used as the key for the hash, and the hash entry (value) is decremented and returned. In general, when confronted with a complex expression, analyze it from the inside out to see what order things happen in.

The autoincrement operator has a little extra built-in magic to it. If you increment a variable that is numeric, or that has ever been used in a numeric context, you get a normal increment. If, however, the variable has only been used in string contexts since it was set, and has a value that is not null and matches the pattern /^[a-zA-Z]*[0-9]*$/, the increment is done as a string, preserving each character within its range, with carry:

print ++($foo = '99');      # prints '100'
print ++($foo = 'a0');      # prints 'a1'
print ++($foo = 'Az');      # prints 'Ba'
print ++($foo = 'zz');      # prints 'aaa'

The autodecrement operator, however, is not magical.
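Here is a short sketch that runs the magical increment over a few more strings and then shows the unmagical decrement. The extra inputs 'a9' and 'Zz' are ours.

```perl
use strict;
use warnings;

my @results;
for my $start ('99', 'a0', 'Az', 'zz', 'a9', 'Zz') {
    my $s = $start;        # fresh copy, so far used only as a string
    $s++;                  # magical increment with carry
    push @results, "$start -> $s";
}
print "$_\n" for @results;
# 99 -> 100, a0 -> a1, Az -> Ba, zz -> aaa, a9 -> b0, Zz -> AAa

my $t = 'ab';
{
    no warnings 'numeric'; # 'ab' is not a number, which -w would point out
    $t--;                  # plain numeric decrement: 'ab' counts as 0
}
print "$t\n";              # -1
```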

Exponentiation

Binary ** is the exponentiation operator. Note that it binds even more tightly than unary minus, so -2**4 is -(2**4), not (-2)**4. The operator is implemented using C's pow(3) function, which works with doubles internally. It calculates using logarithms, which means that it works with fractional powers, but you sometimes get results that aren't as exact as a straight multiplication would produce.
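A few lines are enough to check these claims, along with the right associativity listed in the precedence table. The variable names are ours.

```perl
use strict;
use warnings;

my $neg   = -2**4;       # -(2**4)   => -16
my $paren = (-2)**4;     #           =>  16
my $right = 2**3**2;     # 2**(3**2) => 512, since ** associates to the right
my $root  = 2**0.5;      # fractional power: roughly 1.41421356

printf "%d %d %d %.5f\n", $neg, $paren, $right, $root;
```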

Ideographic Unary Operators

Most unary operators just have names (see "Named Unary and File Test Operators" below), but some operators are deemed important enough to merit their own special symbolic representation. Most of these operators seem to have something to do with negation. Blame the mathematicians.

Unary ! performs logical negation, that is, "not". See also not for a lower precedence version of this. The value of a negated operation is 1 if the operand is false (numeric 0, string "0", null string, or undefined); otherwise, the value is that of the null string.

Unary - performs arithmetic negation if the operand is numeric. If the operand is an identifier, a string consisting of a minus sign concatenated with the identifier is returned. Otherwise, if the string starts with a plus or minus, a string starting with the opposite sign is returned. One effect of these rules is that -bareword is equivalent to "-bareword". This is most useful for Tk and CGI programmers.

Unary ~ performs bitwise negation, that is, 1's complement. For example, on a 32-bit machine, ~123 is 4294967172. But you knew that already.

(What you perhaps didn't know is that if the argument to ~ happens to be a string instead of a number, a string of identical length is returned, but with all the bits of the string complemented. This is a fast way to flip a lot of bits all at once. See also the bitwise logical operators, which also have stringish variants.)
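The three negation operators can be tried out with a sketch like this one. It assumes the bitwise feature of recent Perls is not enabled, since that feature makes ~ always numeric. The 32-bit mask is there only so the result matches the figure above on 64-bit machines.

```perl
use strict;
use warnings;

my $not_true  = !1;                  # the null string
my $not_false = !0;                  # 1

my $flag    = -"foo";                # "-foo"
my $flipped = -"-foo";               # "+foo"

my $low32 = ~123 & 0xFFFFFFFF;       # 4294967172 (low 32 bits of ~123)
my $bits  = ~"\x0f\xf0";             # string complement: "\xf0\x0f"

print "[$not_true] [$not_false] $flag $flipped $low32\n";
# [] [1] -foo +foo 4294967172
```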

Unary + has no semantic effect whatsoever, even on strings. It is syntactically useful for separating a function name from a parenthesized expression that would otherwise be interpreted as the complete list of function arguments. (See examples above under the section "Terms and List Operators".)

Unary \ creates a reference to whatever follows it (see Chapter 4, References and Nested Data Structures). Do not confuse this behavior with the behavior of backslash within a string, although both forms do convey the notion of protecting the next thing from interpretation. This resemblance is not entirely accidental.

The \ operator may also be used on a parenthesized list value in a list context, in which case it returns references to each element of the list.
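As a small sketch of the list form (the variable names are ours):

```perl
use strict;
use warnings;

my ($x, $y, $z) = (1, 2, 3);
my @refs = \($x, $y, $z);      # same as (\$x, \$y, \$z)

${ $refs[1] } = 20;            # assigns to $y through its reference
print scalar(@refs), " refs, y is now $y\n";   # 3 refs, y is now 20
```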

Binding Operators

Binary =~ binds a scalar expression to a pattern match, substitution, or translation. These operations search or modify the string $_ by default. The binding operator makes those operations work on some other string instead. The argument on the right is the search pattern, substitution, or translation. The left argument is what is supposed to be searched, substituted, or translated instead of the default $_. The return value indicates the success of the operation. If the right argument is an expression rather than a search pattern, substitution, or translation, it is interpreted as a search pattern at run-time. That is, $_ =~ $pat is equivalent to $_ =~ /$pat/. This is less efficient than an explicit search, since the pattern must be compiled every time the expression is evaluated. (But /$pat/o doesn't recompile it because of the /o modifier.)

Binary !~ is just like =~ except the return value is negated in the logical sense. The following expressions are functionally equivalent:

$string !~ /pattern/
not $string =~ /pattern/

We said that the return value indicates success, but there are many kinds of success. Substitutions return the number of successful substitutions, as do translations. (In fact, the translation operator is often used to count characters.) Since any non-zero result is true, it all works out. The most spectacular kind of true value is a list value: in a list context, pattern matches can return substrings matched by the parentheses in the pattern. But again, according to the rules of list assignment, the list assignment itself will return true if anything matched and was assigned, and false otherwise. So you sometimes see things like:

if ( ($k,$v) = $string =~ m/(\w+)=(\w*)/ ) {
    print "KEY $k VALUE $v\n";
}

Let's pick that apart. The =~ binds $string to the pattern match on the right, which is scanning for occurrences of things that look like KEY=VALUE in your string. It's in a list context because it's on the right side of a list assignment. If it matches, it does a list assignment to $k and $v. The list assignment itself is in a scalar context, so it returns 2, the number of values on the right side of the assignment. And 2 happens to be true, since our scalar context is also a Boolean context. When the match fails, no values are assigned, which returns 0, which is false.
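To watch both outcomes, you can capture the value of the list assignment yourself. The following is a sketch with made-up input strings. It also shows the character-counting use of the translation operator mentioned earlier.

```perl
use strict;
use warnings;

my %count_for;
for my $string ('colour=blue', 'no pair here') {
    # In scalar context the list assignment yields the number of values
    # the match returned: 2 on success, 0 on failure.
    my $count = ( my ($k, $v) = $string =~ m/(\w+)=(\w*)/ );
    $count_for{$string} = $count;
    print "KEY $k VALUE $v\n" if $count;    # KEY colour VALUE blue
}

my $word = 'banana';
my $as   = ($word =~ tr/a//);   # counts the a's without changing $word: 3
```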

Multiplicative Operators

Perl provides the C-like operators * (multiply), / (divide), and % (modulus). The * and / work exactly as you might expect, multiplying or dividing their two operands. Division is done in floating-point, unless you've used the integer library module.

The % operator converts its operands to integers before finding the remainder according to integer division. For the same operation in floating-point, you may prefer to use the fmod() function from the POSIX module (see Chapter 7, The Standard Perl Library).
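The difference between the two is easy to see with a fractional operand. A minimal sketch:

```perl
use strict;
use warnings;
use POSIX qw(fmod);

my $int_rem   = 7.5 % 2;         # operands become 7 and 2 => 1
my $float_rem = fmod(7.5, 2);    # floating-point remainder => 1.5

print "$int_rem $float_rem\n";   # 1 1.5
```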

Binary x is the repetition operator. In scalar context, it returns a concatenated string consisting of the left operand repeated the number of times specified by the right operand.

print '-' x 80;             # print row of dashes
print "\t" x ($tab/8), ' ' x ($tab%8);      # tab over

In list context, if the left operand is a list in parentheses, the x works as a list replicator rather than a string replicator. This is useful for initializing all the elements of an array of indeterminate length to the same value:

@ones = (1) x 80;           # a list of 80 1's
@ones = (5) x @ones;        # set all elements to 5

Similarly, you can also use x to initialize array and hash slices:

@keys = qw(perls before swine);
@hash{@keys} = ("") x @keys;

If this mystifies you, note that @keys is being used both as a list on the left side of the assignment, and as a scalar value (returning the array length) on the right side of the assignment. The above has the same effect on %hash as:

$hash{perls}  = "";
$hash{before} = "";
$hash{swine}  = "";
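Pulling the repetition examples together, a sketch along these lines (with shorter counts than the ones above, to keep the output readable) exercises both the string and the list forms:

```perl
use strict;
use warnings;

my $rule = '-' x 10;                 # string repetition: ten dashes
my @ones = (1) x 4;                  # list repetition: (1, 1, 1, 1)
@ones    = (5) x @ones;              # @ones in scalar context gives 4

my @keys = qw(perls before swine);
my %hash;
@hash{@keys} = ("") x @keys;         # one "" for each key

print "$rule\n@ones\n", join(',', sort keys %hash), "\n";
```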

Additive Operators

Strangely enough, Perl also has the customary + (addition) and - (subtraction) operators. Both operators convert their arguments from strings to numeric values if necessary, and return a numeric result.

Additionally, Perl provides a string concatenation operator ".". For example:

$almost = "Fred" . "Flintstone";    # returns FredFlintstone

Note that Perl does not place a space between the strings being concatenated. If you want the space, or if you have more than two strings to concatenate, you can use the join operator, described in Chapter 3, Functions. Most often, though, people do their concatenation implicitly inside a double-quoted string:

$fullname = "$firstname $lastname";
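The following sketch contrasts the numeric and string operators on the same operands, and shows three ways of getting the space in. The names are ours.

```perl
use strict;
use warnings;

my $sum    = "10" + "20";            # strings converted to numbers: 30
my $concat = "10" . "20";            # plain concatenation: "1020"

my ($firstname, $lastname) = ('Fred', 'Flintstone');
my $dotted = $firstname . ' ' . $lastname;
my $joined = join ' ', $firstname, $lastname;
my $interp = "$firstname $lastname";

print "$sum $concat $interp\n";      # 30 1020 Fred Flintstone
```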

Shift Operators

The bit-shift operators (<< and >>) return the value of the left argument shifted to the left (<<) or to the right (>>) by the number of bits specified by the right argument. The arguments should be integers. For example:

1 << 4;     # returns 16
32 >> 4;    # returns 2

Named Unary and File Test Operators

Some of the "functions" described in Chapter 3, Functions are really unary operators, including:

-X (file tests)  gethostbyname   localtime  rmdir
alarm            getnetbyname    log        scalar
caller           getpgrp         lstat      sin
chdir            getprotobyname  my         sleep
chroot           glob            oct        sqrt
cos              gmtime          ord        srand
defined          goto            quotemeta  stat
delete           hex             rand       uc
do               int             readlink   ucfirst
eval             lc              ref        umask
exists           lcfirst         require    undef
exit             length          reset
exp              local           return

These are all unary operators, with a higher precedence than some of the other binary operators. For example:

sleep 4 | 3;

does not sleep for 7 seconds; it sleeps for 4 seconds, and then takes the return value of sleep (typically zero) and ORs that with 3, as if the expression were parenthesized as:

(sleep 4) | 3;

Compare this with:

print 4 | 3;

which does take the value of 4 ORed with 3 before printing it (7 in this case), as if it were written:

print (4 | 3);

This is because print is a list operator, not a simple unary operator. Once you've learned which operators are list operators, you'll have no trouble telling them apart. When in doubt, you can always use parentheses to turn a named unary operator into a function. Remember, if it looks like a function, it is a function.
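Since nobody wants to wait for sleep, here is the same idea sketched with length, another named unary operator. It binds more tightly than the comparison operators but less tightly than concatenation. The variable names are ours.

```perl
use strict;
use warnings;

my $s = 'abc';

my $len_cat = length $s . 'de';    # length ($s . 'de') => 5
my $is_short = length $s < 5;      # (length $s) < 5    => true
my $as_func = length($s) . 'de';   # (length $s) . 'de' => '3de'

print "$len_cat $is_short $as_func\n";   # 5 1 3de
```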

Another funny thing about named unary operators is that many of them default to $_ if you don't supply an argument. However, if the thing following the named unary operator looks like it might be the start of an argument, Perl will get confused. When the next character in your program is one of the following characters, the Perl tokener returns different token types depending on whether a term or operator is expected:

Char  Operator               Term
+     Addition               Unary plus
-     Subtraction            Unary minus
*     Multiplication         *typeglob
/     Division               /pattern/
<     Less than, left shift  <HANDLE>, <<END
.     Concatenation          .3333
?     ?:                     ?pattern?
%     Modulo                 %assoc
&     &, &&                  &subroutine

So a typical boo-boo is:

next if length < 80;

in which the < looks to the parser like the beginning of the <> input symbol (a term) instead of the "less than" (an operator) you were thinking of. There's really no way to fix this, and still keep Perl pathologically eclectic. If you're so incredibly lazy that you cannot bring yourself to type the two characters $_, then say one of these instead:

next if length() < 80;
next if (length) < 80;
next if 80 > length;
next unless length >= 80;

A file test operator is a unary operator that takes one argument, either a filename or a filehandle, and tests the associated file to see if something is true about it. If the argument is omitted, it tests $_, except for -t, which tests STDIN. Unless otherwise documented, it returns 1 for true and "" for false, or the undefined value if the file doesn't exist. The operator may be any of the following:

Operator Meaning
-r File is readable by effective uid/gid.
-w File is writable by effective uid/gid.
-x File is executable by effective uid/gid.
-o File is owned by effective uid.
   
-R File is readable by real uid/gid.
-W File is writable by real uid/gid.
-X File is executable by real uid/gid.
-O File is owned by real uid.
   
-e File exists.
-z File has zero size.
-s File has non-zero size (returns size).
   
-f File is a plain file.
-d File is a directory.
-l File is a symbolic link.
-p File is a named pipe (FIFO).
-S File is a socket.
-b File is a block special file.
-c File is a character special file.
-t Filehandle is opened to a tty.
   
-u File has setuid bit set.
-g File has setgid bit set.
-k File has sticky bit set.
   
-T File is a text file.
-B File is a binary file (opposite of -T).
   
-M Age of file (at startup) in days since modification.
-A Age of file (at startup) in days since last access.
-C Age of file (at startup) in days since inode change.

The interpretation of the file permission operators -r, -R, -w, -W, -x, and -X is based solely on the mode of the file and the user and group IDs of the user. There may be other reasons you can't actually read, write, or execute the file, such as Andrew File System (AFS) access control lists. Also note that for the superuser, -r, -R, -w, and -W always return 1, and -x and -X return 1 if any execute bit is set in the mode. Scripts run by the superuser may thus need to do a stat in order to determine the actual mode of the file, or temporarily set the uid to something else. Example:

while (<>) {
    chomp;
    next unless -f $_;      # ignore "special" files
    ...
}

Note that -s/a/b/ does not do a negated substitution. Saying -exp($foo) still works as expected, however--only single letters following a minus are interpreted as file tests.

The -T and -B switches work as follows. The first block or so of the file is examined for odd characters such as strange control codes or characters with the high bit set. If too many odd characters (>30%) are found, it's a -B file, otherwise it's a -T file. Also, any file containing null in the first block is considered a binary file. If -T or -B is used on a filehandle, the current input (standard I/O or "stdio") buffer is examined rather than the first block of the file. Both -T and -B return true on a null file, or on a file at EOF (end of file) when testing a filehandle. Because you have to read a file to do the -T test, on most occasions you want to use a -f against the file first, as in:

next unless -f $file && -T _;

If any of the file tests (or either the stat or lstat operators) are given the special filehandle consisting of a solitary underline, then the stat structure of the previous file test (or stat operator) is used, thereby saving a system call. (This doesn't work with -t, and you need to remember that lstat and -l will leave values in the stat structure for the symbolic link, not the real file.)[30] Example:

[30] Likewise, -l _ will always be false after a normal stat.

print "Can do.\n" if -r $a || -w _ || -x _;
stat($filename);
print "Readable\n" if -r _;
print "Writable\n" if -w _;
print "Executable\n" if -x _;
print "Setuid\n" if -u _;
print "Setgid\n" if -g _;
print "Sticky\n" if -k _;
print "Text\n" if -T _;
print "Binary\n" if -B _;
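For something you can run without knowing what files are lying around, here is a sketch that makes its own scratch file with the standard File::Temp module (our choice, not something this section relies on) and then reuses a single stat through the _ filehandle:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the demonstration; removed when the program exits.
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} "hello\n";
close $fh or die "Cannot close $filename: $!\n";

my (@facts, $size);
if (-e $filename) {                 # this test does the stat...
    push @facts, 'plain file' if -f _;       # ...and these reuse it
    push @facts, 'not a directory' unless -d _;
    $size = -s _;                            # size in bytes, non-zero here
    push @facts, 'non-empty' if $size;
}
print join(', ', @facts), "\n";     # plain file, not a directory, non-empty
```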

File ages for -M, -A, and -C are returned in days (including fractional days) since the time when the script started running. (This time is stored in the special variable $^T.) Thus, if the file changed after the script started, you would get a negative time. Note that most times (86,399 out of 86,400, on average) are fractional, so testing for equality with an integer without using the int function is usually futile. Examples:

next unless -M $file > .5;      # files older than 12 hours
&newfile if -M $file < 0;       # file is newer than process
&mailwarning if int(-A) == 90;  # file ($_) accessed 90 days ago today

To reset the script's start time to the current time, change $^T as follows:

$^T = time;
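To see the arithmetic at work, the sketch below backdates a scratch file by exactly two days relative to $^T and then asks for its age. File::Temp and the utime call are our additions for the sake of a repeatable example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $file) = tempfile(UNLINK => 1);     # scratch file, ours
close $fh or die "Cannot close $file: $!\n";

# Set access and modification times to two days before script start.
my $two_days_ago = $^T - 2 * 86_400;
utime $two_days_ago, $two_days_ago, $file
    or die "Cannot set times on $file: $!\n";

my $age = -M $file;                          # ($^T - mtime) / 86400 => 2
my $old = (-M $file > .5) ? 'yes' : 'no';    # older than 12 hours? yes

print "age in days: $age, older than 12 hours: $old\n";
```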

Relational Operators

Perl has two classes of relational operators. One class operates on numeric values, and the other class operates on string values. To repeat the table given in the overview:

Numeric String Meaning
> gt Greater than
>= ge Greater than or equal to
< lt Less than
<= le Less than or equal to

These operators return 1 for true, and "" for false. String comparisons are based on the ASCII collating sequence, and unlike in some languages, trailing spaces count in the comparison. Note that relational operators are non-associating, which means that $a < $b < $c is a syntax error.
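Choosing the wrong class of operator is a classic mistake, so a short sketch of the difference may help. The values are chosen by us.

```perl
use strict;
use warnings;

my $numeric  = 10 > 9;             # 1: ten is greater than nine
my $stringy  = '10' gt '9';        # "": the character '1' sorts before '9'
my $trailing = 'abc ' gt 'abc';    # 1: the trailing space counts

print "[$numeric] [$stringy] [$trailing]\n";   # [1] [] [1]
```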

Equality Operators

The equality operators are much like the relational operators.

Numeric String Meaning
== eq Equal to
!= ne Not equal to
<=> cmp Comparison, with signed result

The equal and not-equal operators return 1 for true, and "" for false (just as the relational operators do). The <=> and cmp operators return -1 if the left operand is less than the right operand, 0 if they are equal, and +1 if the left operand is greater than the right. Although these appear to be very similar to the relational operators, they do have a different precedence level, so $a < $b <=> $c < $d is syntactically valid.

For reasons that are apparent to anyone who has seen Star Wars, the <=> operator is known as the "spaceship" operator.
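The signed result is exactly what sort wants from a comparison block (sort itself is described in Chapter 3). Here is a sketch with numbers of our own choosing:

```perl
use strict;
use warnings;

my $num_cmp = 2 <=> 10;        # -1: numerically, 2 is less than 10
my $str_cmp = '2' cmp '10';    #  1: as strings, '2' sorts after '1...'

my @by_number = sort { $a <=> $b } (10, 9, 100, 1);   # 1 9 10 100
my @by_string = sort { $a cmp $b } (10, 9, 100, 1);   # 1 10 100 9

print "@by_number\n@by_string\n";
```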

Bitwise Operators

Like C, Perl has bitwise AND, OR, and XOR (exclusive OR) operators: &, |, and ^. Note from the table at the start of this section that bitwise AND has a higher precedence than bitwise OR and XOR. These operators work differently on numeric values than they do on strings. (This is one of the few places where Perl cares about the difference.) If either operand is a number (or has been used as a number), then both operands are converted to type integer, and the bitwise operation is performed between the two integers. These integers are guaranteed to be at least 32 bits long, but may be 64 bits on some machines. The point is that there's an arbitrary limit imposed by the machine's architecture.

If both operands are strings (and have not been used as numbers since being set), these operators do bitwise operations between corresponding bits from the two strings. In this case, there's no arbitrary limit, since strings aren't arbitrarily limited in size. If one string is longer than the other, the shorter string is considered to have a sufficient number of 0 bits on the end to make up the difference.

For example, if you AND together two strings:

"123.45" & "234.56"

you get another string:

"020.44"

But if you AND together a string and a number:

"123.45" & 234.56

The string is first converted to a number, giving:

123.45 & 234.56

The numbers are then converted to integer:

123 & 234

which evaluates to 106.

Note that all bit strings are true (unless they come out to being the string "0"). This means that tests of the form:

if ( "fred" & "\1\2\3\4" ) { ... }

would need to be written instead as:

if ( ("fred" & "\1\2\3\4") !~ /^\0+$/ ) { ... }
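The examples in this section can be checked with a few lines. This sketch assumes the bitwise feature of recent Perls is not in effect, since it changes how & treats strings.

```perl
use strict;
use warnings;

my $str_and = "123.45" & "234.56";   # both strings: "020.44"
my $num_and = "123.45" & 234.56;     # one number: 123 & 234 => 106

my $masked   = "fred" & "\1\2\3\4";  # "\0\2\1\4", true as a string
my $has_bits = $masked !~ /^\0+$/;   # true: some bit survived the AND

print "$str_and $num_and ", ($has_bits ? 'bits set' : 'no bits'), "\n";
# 020.44 106 bits set
```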

C-style Logical (Short Circuit) Operators

Like C, Perl provides the && (logical AND) and || (logical OR) operators. They evaluate from left to right (with && having slightly higher precedence than ||) testing the truth of the statement. These operators are known as short-circuit operators because they determine the truth of the statement by evaluating the fewest number of operands possible. For example, if the left operand of an && operator is false, the right operand is never evaluated because the result of the operator is false regardless of the value of the right operand.

Example   Name  Result
$a && $b  And   $a if $a is false, $b otherwise
$a || $b  Or    $a if $a is true, $b otherwise

Such short circuits are not only time savers, but are frequently used to control the flow of evaluation. For example, an oft-appearing idiom in Perl programs is:

open(FILE, "somefile") || die "Cannot open somefile: $!\n";

In this case, Perl first evaluates the open function. If the value is true (because somefile was successfully opened), the execution of the die function is unnecessary, and is skipped. You can read this literally as "Open some file or die!"

The || and && operators differ from C's in that, rather than returning 0 or 1, they return the last value evaluated. This has the delightful result that you can select the first of a series of values that happens to be true. Thus, a reasonably portable way to find out the home directory might be:

$home = $ENV{HOME} 
     || $ENV{LOGDIR} 
     || (getpwuid($<))[7] 
     || die "You're homeless!\n";

Perl also provides lower precedence and and or operators that are more readable and don't force you to use parentheses as much. They also short-circuit.
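Both properties, returning the last value evaluated and skipping the right operand, show up in this sketch. The %config hash and the expensive subroutine are hypothetical.

```perl
use strict;
use warnings;

my %config = (editor => '', pager => 'less');   # hypothetical settings

my $pager  = $config{pager}  || 'more';                   # 'less'
my $editor = $config{editor} || $config{visual} || 'vi';  # 'vi'

my $and_false = $config{editor} && 'set';   # '': the false left operand
my $and_true  = $config{pager}  && 'set';   # 'set': the right operand

my $calls = 0;
sub expensive { $calls++; return 'computed' }
my $value = $config{pager} || expensive();  # left side true: never called

print "$pager $editor [$and_false] $and_true $value $calls\n";
# less vi [] set less 0
```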

Range Operator

The .. range operator is really two different operators depending on the context. In a list context, it returns a list of values counting (by ones) from the left value to the right value. This is useful for writing for (1..10) loops and for doing slice operations on arrays.[31]

[31] Be aware that under the current implementation, a temporary array is created, so you'll burn a lot of memory if you write something like this:

for (1 .. 1_000_000) {
    # code
}

In a scalar context, .. returns a Boolean value. The operator is bi-stable, like an electronic flip-flop, and emulates the line-range (comma) operator of sed, awk, and various editors. Each scalar .. operator maintains its own Boolean state. It is false as long as its left operand is false. Once the left operand is true, the range operator stays true until the right operand is true, after which the range operator becomes false again. (The operator doesn't become false until the next time it is evaluated. It can test the right operand and become false on the same evaluation as the one where it became true (the way awk's range operator behaves), but it still returns true once. If you don't want it to test the right operand until the next evaluation (which is how sed's range operator works), just use three dots (...) instead of two.) The right operand is not evaluated while the operator is in the false state, and the left operand is not evaluated while the operator is in the true state.

The precedence is a little lower than || and &&. The value returned is either the null string for false, or a sequence number (beginning with 1) for true. The sequence number is reset for each range encountered. The final sequence number in a range has the string "E0" appended to it, which doesn't affect its numeric value, but gives you something to search for if you want to exclude the endpoint. You can exclude the beginning point by waiting for the sequence number to be greater than 1. If either operand of scalar .. is a numeric literal, that operand is evaluated by comparing it to the $. variable, which contains the current line number for your input file. Examples:

As a scalar operator:

if (101 .. 200) { print; }  # print 2nd hundred lines
next line if (1 .. /^$/);   # skip header lines
s/^/> / if (/^$/ .. eof()); # quote body

As a list operator:

for (101 .. 200) { print; }            # prints 101102...199200
@foo = @foo[0 .. $#foo];               # an expensive no-op
@foo = @foo[ -5 .. -1];                # slice last 5 items
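The scalar examples above need an input file. The sketch below runs the flip-flop over an in-memory list instead, with patterns rather than line numbers as operands, and records the sequence numbers it returns. The data and the BEGIN and END markers are made up.

```perl
use strict;
use warnings;

my @lines = ('header', 'BEGIN', 'a', 'b', 'END', 'trailer');
my (@seq, @inside);

for (@lines) {
    my $state = /^BEGIN$/ .. /^END$/;    # scalar context: flip-flop
    next unless $state;                  # null string outside the range
    push @seq, $state;                   # 1, 2, 3, then 4E0 at the end
    # Exclude both endpoints: skip sequence number 1 and the E0 line.
    push @inside, $_ unless $state == 1 || $state =~ /E0$/;
}

print "@seq\n@inside\n";
# 1 2 3 4E0
# a b
```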

The range operator (in a list context) makes use of the magical autoincrement algorithm if the operands are strings.[32] So you can say:

[32] If the final value specified is not in the sequence that the magical increment would produce, the sequence goes until the next value would be longer than the final value specified.

@alphabet = ('A' .. 'Z');

to get all the letters of the alphabet, or:

$hexdigit = (0 .. 9, 'a' .. 'f')[$num & 15];

to get a hexadecimal digit, or:

@z2 = ('01' .. '31');  print $z2[$mday];

to get dates with leading zeros. You can also say:

@combos = ('aa' .. 'zz');

to get all combinations of two lowercase letters. However, be careful of something like:

@bigcombos = ('aaaaaa' .. 'zzzzzz');

since that will require lots of memory. More precisely, it'll need space to store 308,915,776 scalars. Let's hope you allocated a large swap partition. Perhaps you should consider an iterative approach instead.
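One possible iterative approach, sketched here on the two-letter case so that it finishes quickly, is to let the magical autoincrement step through the values one at a time instead of building the whole list:

```perl
use strict;
use warnings;

my $count = 0;
my $combo = 'aa';
while (1) {
    $count++;                  # process $combo here
    last if $combo eq 'zz';    # stop after the final value
    $combo++;                  # magical increment: aa, ab, ... zz
}

print "$count combinations\n";   # 676 combinations
```

Only one string is held in memory at a time, whatever the length of the combinations.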

Conditional Operator

Trinary ?: is the conditional operator, just as in C. It works as:

TEST_EXPR ? IF_TRUE_EXPR : IF_FALSE_EXPR

much like an if-then-else, except that it can safely be embedded within other operations and functions. If the TEST_EXPR is true, only the IF_TRUE_EXPR is evaluated, and the value of that expression becomes the value of the entire expression. Otherwise, only the IF_FALSE_EXPR is evaluated, and its value becomes the value of the entire expression.

printf "I have %d dog%s.\n", $n, 
        ($n == 1) ? "" : "s";

Scalar or list context propagates downward into the second or third argument, whichever is selected. (The first argument is always in scalar context, since it's a conditional.)

$a = $ok ? $b : $c;  # get a scalar
@a = $ok ? @b : @c;  # get an array
$a = $ok ? @b : @c;  # get a count of elements in one of the arrays

You can assign to the conditional operator[33] if both the second and third arguments are legal lvalues (meaning that you can assign to them), provided that both are scalars or both are lists (or Perl won't know which context to supply to the right side of the assignment):

[33] This is not necessarily guaranteed to contribute to the readability of your program. But it can be used to create some cool entries in an Obfuscated Perl contest.

($a_or_b ? $a : $b) = $c;  # sets either $a or $b to equal $c
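Here is a runnable sketch of the context propagation and the lvalue form, with data of our own:

```perl
use strict;
use warnings;

my @b  = (1, 2, 3);
my @c  = (4, 5);
my $ok = 0;

my @copy  = $ok ? @b : @c;     # list context: (4, 5)
my $count = $ok ? @b : @c;     # scalar context: 2, the length of @c

my ($x, $y) = (0, 0);
($ok ? $x : $y) = 42;          # $ok is false, so $y is assigned

my $n   = 3;
my $msg = sprintf "I have %d dog%s.", $n, ($n == 1) ? "" : "s";

print "@copy | $count | $x $y | $msg\n";
# 4 5 | 2 | 0 42 | I have 3 dogs.
```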

Assignment Operators

Perl recognizes the C assignment operators, as well as providing some of its own. There are quite a few of them:

=    **=    +=    *=    &=    <<=    &&=
            -=    /=    |=    >>=    ||=
            .=    %=    ^=
                  x=

Each operator requires an lvalue (a variable or array element) on the left side, and some expression on the right side. For the simple assignment operator, =, the value of the expression is stored into the designated variable. For the other operators, Perl evaluates the expression:

$var OP= $value

as if it were written:

$var = $var OP $value

except that $var is evaluated only once. Compare the following two operations:

$var[$a++] += $value;               # $a is incremented once
$var[$a++] = $var[$a++] + $value;   # $a is incremented twice
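To see the difference for yourself, the following sketch counts how often the index is bumped in each form. The array contents and the names $i and $j are assumptions made for the demonstration. We deliberately check only the counters in the second case, since which elements it reads and writes depends on evaluation order.

```perl
use strict;
use warnings;

my @var   = (10, 20, 30);
my $value = 5;

my $i = 0;
$var[$i++] += $value;                  # index expression evaluated once
print "i=$i first=$var[0]\n";          # i=1 first=15

my $j = 0;
$var[$j++] = $var[$j++] + $value;      # index expression evaluated twice
print "j=$j\n";                        # j=2
```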

Unlike in C, the assignment operator produces a valid lvalue. Modifying an assignment is equivalent to doing the assignment and then modifying the variable that was assigned to. This is useful for modifying a copy of something, like this:

($tmp = $global) += $constant;

which is the equivalent of:

$tmp = $global + $constant;

Likewise:

($a += 2) *= 3;

is equivalent to:

$a += 2;
$a *= 3;

That's not actually very useful, but you often see this idiom:

($new = $old) =~ s/foo/bar/g;
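As a quick check that the idiom really does leave the original alone, here is a sketch with a made-up string:

```perl
use strict;
use warnings;

my $old = "foo fighters eat food";     # sample data, purely illustrative
(my $new = $old) =~ s/foo/bar/g;       # copy first, then substitute in the copy

print "old: $old\n";                   # old: foo fighters eat food
print "new: $new\n";                   # new: bar fighters eat bard
```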

In all cases, the value of the assignment is the new value of the variable. Since assignment operators associate right-to-left, this can be used to assign many variables the same value, as in:

$a = $b = $c = 0;

which assigns 0 to $c, and the result of that (still 0) to $b, and the result of that (still 0) to $a.

List assignment may be done only with the plain assignment operator, =. In a list context, list assignment returns the list of new values just as scalar assignment does. In a scalar context, list assignment returns the number of values that were available on the right side of the assignment, as we mentioned earlier in "List Values and Arrays". This makes it useful for testing functions that return a null list when they're "unsuccessful", as in:

while (($key, $value) = each %gloss) { ... }
next unless ($dev, $ino, $mode) = stat $file;
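The count returned in scalar context is the number of values on the right, not the number of variables on the left. A sketch, using a hypothetical function pair_or_nothing() that returns either three values or a null list:

```perl
use strict;
use warnings;

# Hypothetical helper: three values on success, a null list on failure.
sub pair_or_nothing {
    my ($ok) = @_;
    return $ok ? ('key', 'value', 'extra') : ();
}

my ($key, $value);
my $n_ok  = (($key, $value) = pair_or_nothing(1));   # 3, though only two were stored
my $n_bad = (($key, $value) = pair_or_nothing(0));   # 0, which is false

print "success count: $n_ok, failure count: $n_bad\n";   # success count: 3, failure count: 0
```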

Comma Operators

Binary "," is the comma operator. In a scalar context it evaluates its left argument, throws that value away, then evaluates its right argument and returns that value. This is just like C's comma operator. For example:

$a = (1, 3);

assigns 3 to $a. Do not confuse the scalar context use with the list context use. In a list context, it's just the list argument separator, and inserts both its arguments into the LIST. It does not throw any values away.

For example, if you change the above to:

@a = (1, 3);

you are constructing a two-element list, while:

atan2(1, 3);

is calling the function atan2 with two arguments.

The => digraph is mostly just a synonym for the comma operator. It's useful for documenting arguments that come in pairs. It also forces any identifier to the left of it to be interpreted as a string.
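For instance (the hash and list below are invented for illustration):

```perl
use strict;
use warnings;

# Identifiers to the left of => are quoted for you, even under "use strict".
my %color = (red => 0xff0000, green => 0x00ff00);

# Apart from the quoting, => behaves like a comma.
my @flat = (name => 'camel', humps => 2);   # same as ('name', 'camel', 'humps', 2)

print scalar(@flat), " elements, first is $flat[0]\n";   # 4 elements, first is name
print "red is $color{red}\n";                            # red is 16711680
```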

List Operators (Rightward)

The right side of a list operator governs all the list operator's arguments, which are comma separated, so the precedence of list operators is looser than comma if you're looking to the right.
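A sketch of what that means in practice, using join as the list operator. The list contents are arbitrary.

```perl
use strict;
use warnings;

# Without parentheses, join swallows every comma-separated term to its right.
my @greedy = (1, 2, join ':', 3, 4, 5);      # (1, 2, '3:4:5')

# Parentheses around the arguments stop it from eating the rest.
my @fenced = (1, 2, join(':', 3, 4), 5);     # (1, 2, '3:4', 5)

print scalar(@greedy), " elements, last is $greedy[-1]\n";   # 3 elements, last is 3:4:5
print scalar(@fenced), " elements, last is $fenced[-1]\n";   # 4 elements, last is 5
```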

Logical and, or, not, and xor

As more readable alternatives to &&, ||, and !, Perl provides the and, or and not operators. The behavior of these operators is identical--in particular, they short-circuit the same way.[34]

[34] Obviously the unary not doesn't short circuit, just as ! doesn't.

The precedence of these operators is much lower, however, so you can safely use them after a list operator without the need for parentheses:

unlink "alpha", "beta", "gamma"
        or gripe(), next LINE;

With the C-style operators that would have to be written like this:

unlink("alpha", "beta", "gamma")
        || (gripe(), next LINE);
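If you forget the parentheses with ||, the operator binds to the last argument rather than to the whole call. The sketch below shows the difference with two hypothetical subroutines: try_it(), a stand-in list operator that always fails, and a gripe() that merely records its message.

```perl
use strict;
use warnings;

my @log;
sub try_it { return 0 }                      # hypothetical: always "fails"
sub gripe  { push @log, $_[0]; return 1 }    # hypothetical: records a complaint

# Parsed as: try_it("alpha", "beta") or gripe(...)
try_it "alpha", "beta" or gripe("or: ran because try_it failed");

# Parsed as: try_it("alpha", ("beta" || gripe(...)))
# "beta" is true, so gripe is never called.
try_it "alpha", "beta" || gripe("||: never runs");

print "$_\n" for @log;                       # or: ran because try_it failed
```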

There is also a logical xor operator that has no exact counterpart in C or Perl, since the other XOR operator (^) works on bits. The best equivalent for $a xor $b is perhaps !$a != !$b.[35] This operator can't short-circuit either, since both sides must be evaluated.

[35] One could also write !$a ^ !$b or even $a ? !$b : !!$b, of course. The point is that both $a and $b have to evaluate to true or false in a Boolean context, and the existing bitwise operator doesn't provide a Boolean context.
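You can convince yourself that xor and !$a != !$b agree with a short truth-table check. The sample values below are our own choice of true and false scalars.

```perl
use strict;
use warnings;

my $mismatches = 0;
for my $p (0, 1, '', 'yes') {
    for my $q (0, 1, '', 'yes') {
        my $via_xor = ($p xor $q)   ? 1 : 0;   # parentheses needed: xor binds very loosely
        my $via_not = (!$p != !$q)  ? 1 : 0;   # the suggested equivalent
        $mismatches++ if $via_xor != $via_not;
    }
}
print "mismatches: $mismatches\n";             # mismatches: 0
```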

C Operators Missing from Perl

Here is what C has that Perl doesn't:

unary &

The address-of operator. Perl's \ operator (for taking a reference) fills the same ecological niche, however:

$ref_to_var = \$var;

But references are much safer than addresses.

unary *

The dereference-address operator. Since Perl doesn't have addresses, it doesn't need to dereference addresses. It does have references though, so Perl's variable prefix characters serve as dereference operators, and indicate type as well: $, @, % and &. Oddly enough, there actually is a * dereference operator, but since * is the funny character indicating a typeglob, you wouldn't use it the same way.

(TYPE)

The typecasting operator. Nobody likes to be typecast anyway.
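To make the stand-ins for unary & and unary * concrete, here is a brief sketch that takes references with \ and dereferences them with the prefix characters. All variable and subroutine names are illustrative.

```perl
use strict;
use warnings;

my $var   = 10;
my @arr   = (1, 2, 3);
my %table = (k => 'v');
sub hello { return "hi" }

my $sref = \$var;     $$sref += 5;                  # $var is now 15
my $aref = \@arr;     push @$aref, 4;               # @arr now has four elements
my $href = \%table;   my @keys = keys %$href;       # ('k')
my $cref = \&hello;   my $greeting = &$cref();      # "hi"

print "$var ", scalar(@arr), " @keys $greeting\n";  # 15 4 k hi
```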

