13.4.1. A Sample Program
What we really like about Perl is that it lets you jump immediately
to the task at hand; you don't have to write extensive code to set up
data structures, open files or pipes, allocate space for data, and
so on. All these chores are taken care of for you in a very
friendly way.
The example of login times, just discussed, serves to introduce many of the basic
features of Perl. First, we'll give the entire script (complete with
comments) and then a description of how it works. This script reads
the output of the last command (see the previous example) and prints
an entry for each user on the system, giving that user's total login
time and number of logins. Line numbers are printed to the left of
each line for reference:
 1  #!/usr/bin/perl
 2
 3  while (<STDIN>) {    # While we have input...
 4      # Find lines and save username, login time
 5      if (/^(\S*)\s*.*\((.*):(.*)\)$/) {
 6          # Increment total hours, minutes, and logins
 7          $hours{$1} += $2;
 8          $minutes{$1} += $3;
 9          $logins{$1}++;
10      }
11  }
12
13  # For each user in the array...
14  foreach $user (sort(keys %hours)) {
15      # Calculate hours from total minutes
16      $hours{$user} += int($minutes{$user} / 60);
17      $minutes{$user} %= 60;
18      # Print the information for this user
19      print "User $user, total login time ";
20      # Perl has printf, too
21      printf "%02d:%02d, ", $hours{$user}, $minutes{$user};
22      print "total logins $logins{$user}.\n";
23  }
Line 1 tells the loader that this script should be executed
through Perl, not as a shell script.
Line 3 is the beginning of the program. It is the head of a simple
while loop, which C and shell programmers will be familiar with: the
code within the braces from lines 4-10 should be executed while
a certain expression is true. However, the conditional expression
<STDIN> looks funny. Actually, this expression reads the next line
from the STDIN filehandle (which refers to standard input, as you
might guess) and is true as long as a line was available to read.
The line just read is stored in the default variable $_, which is
what the pattern match on line 5 examines.
Perl reads input one line at a time (unless you tell it to do otherwise).
It also reads by default from standard input, again, unless you tell it
to do otherwise. Therefore, this while loop will continuously
read lines from standard input, until there are no lines left to be
read.
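The line-at-a-time loop can be tried without piping anything in. The following is a minimal sketch, assuming an in-memory string (the hypothetical $text) opened as a filehandle in place of standard input; the variable names are ours, not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for standard input.
my $text = "first line\nsecond line\nthird line\n";

# Open a read-only filehandle on the in-memory string.
open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";

my $count = 0;
while (<$fh>) {     # each pass reads one line into $_
    chomp;          # strip the trailing newline from $_
    $count++;
    print "line $count: $_\n";
}
close($fh);
```

The loop ends on its own when the filehandle has no more lines to return, just as the script's loop ends when last has no more output.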
The evil-looking mess on line 5 is just an if statement. As with
most programming languages, the code within the braces (on lines
6-9) will be executed if the expression that follows the if is
true. But what is the expression between the parentheses?
Those readers familiar with Unix tools such as grep and sed
will peg this immediately as a regular
expression: a cryptic
but useful way to represent a pattern to be matched in the input text.
Regular expressions are usually found between delimiting slashes
(/…/).
This particular regular expression matches lines of
the form:
mdw ttypf loomer.vpizza.co Sun Jan 16 15:30 - 15:54 (00:23)
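The parentheses in the pattern save the parts of the line they match:
the username goes into the variable $1, the hours of login time into
$2, and the minutes into $3. Lines 6-9 record these values in three
associative arrays (hours, minutes, and logins), each indexed by
username rather than by number.
As an example, referencing the variable
$hours{'mdw'} returns the total number of hours
that the user mdw was logged in.
Similarly, if the username mdw is stored in the variable $1,
referencing $hours{$1} produces the same effect.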
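To see what the pattern on line 5 saves, it can be applied to the sample line shown earlier. This is only a sketch; the names $line, $user, $hours, and $minutes are our own choices for illustration.

```perl
use strict;
use warnings;

# Sample line copied from the text above.
my $line = 'mdw ttypf loomer.vpizza.co Sun Jan 16 15:30 - 15:54 (00:23)';

my ($user, $hours, $minutes);
if ($line =~ /^(\S*)\s*.*\((.*):(.*)\)$/) {
    # Each parenthesized group is saved in $1, $2, $3 in order.
    ($user, $hours, $minutes) = ($1, $2, $3);
    print "user=$user hours=$hours minutes=$minutes\n";
}
# prints: user=mdw hours=00 minutes=23
```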
In lines 6-9, we increment the values of these arrays according to the
data on the present line of input. For example, given the input line:
jem ttyq2 mallard.vpizza.c Sun Jan 16 13:55 - 13:59 (00:03)
Line 7 increments the value of the hours array, indexed with
$1 (the username, jem), by the number of hours that
jem was logged in (stored in the variable $2). The
Perl addition-assignment operator += is equivalent to the corresponding
C operator. Line 8 increments the value of minutes for the
appropriate user similarly. Line 9 increments the value of the logins
array by one, using the ++ operator.
Associative arrays are one of the most useful features of Perl. They
allow you to build up complex databases while parsing text. It would
be nearly impossible to use a standard array for this same task. We
would first have to count the number of users in the input stream
and then allocate an array of the appropriate size, assigning a position
in the array to each user (through the use of a hash function or some
other indexing scheme). An associative array, however, allows you to
index data directly using strings and without regard for the size of
the array in question. (Of course, performance issues always arise when
attempting to use large arrays, but for most applications this isn't
a problem.)
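The accumulation step can be sketched in isolation. The records below are hypothetical (user, hours, minutes) triples, as if the pattern match had already pulled them out of the input; note that no entry has to be created or sized in advance.

```perl
use strict;
use warnings;

# Hypothetical records, as if already parsed from the output of last.
my @records = (
    ['mdw', 0, 23],
    ['jem', 0, 3],
    ['mdw', 1, 50],
);

my (%hours, %minutes, %logins);
for my $rec (@records) {
    my ($user, $h, $m) = @$rec;
    $hours{$user}   += $h;    # the entry springs into being on first use
    $minutes{$user} += $m;
    $logins{$user}++;
}

print "mdw: $hours{mdw}h $minutes{mdw}m over $logins{mdw} logins\n";
print "jem: $hours{jem}h $minutes{jem}m over $logins{jem} logins\n";
```

At this stage the minutes for mdw add up to 73; folding them into hours is a separate step, as in the script.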
Let's move on. Line 14 uses the Perl foreach statement, which
you may be used to if you write csh scripts. (In Perl, foreach and
for are interchangeable keywords; used this way, the loop walks over
a list rather than counting as a C for loop usually
does.) Here, in each iteration of the loop, the variable
$user is assigned the next value in the list given by the
expression sort(keys %hours). %hours simply
refers to the entire associative array hours that we
have constructed. The function keys returns a list of all
the keys used to index the array, which is in this case a list
of usernames. Finally, the sort function sorts the list returned
by keys. Therefore, we are looping over a sorted list of usernames,
assigning each username in turn to the variable $user.
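A short sketch of looping over sorted keys follows. The hash contents are hypothetical totals, and @order is a helper of ours that records the order in which the keys are visited.

```perl
use strict;
use warnings;

# Hypothetical totals; Perl does not promise any particular key order.
my %hours = (mdw => 153, linus => 98, kibo => 0, johnsonm => 1);

my @order;
foreach my $user (sort(keys %hours)) {
    push @order, $user;              # remember the visiting order
    print "$user: $hours{$user}\n";
}
```

Without sort, the usernames would still all be visited, but in whatever order keys happened to return them.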
Lines 16 and 17 correct for situations where the number of
minutes is 60 or more: line 16 determines the number of whole hours
contained in the minutes entry for this user and increments
hours accordingly, and line 17 keeps only the leftover minutes.
The int function returns the integral
portion of its argument. (Yes, Perl handles floating-point numbers
as well; that's why use of int is necessary.)
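The arithmetic can be checked with a small worked case. The starting values here are hypothetical, chosen so that the minutes overflow by more than one hour.

```perl
use strict;
use warnings;

# Hypothetical running totals for one user.
my $hours   = 2;
my $minutes = 135;

# 135 / 60 is 2.25 in Perl; int() keeps only the whole hours (2).
$hours += int($minutes / 60);

# Keep the remainder after removing the whole hours (15).
$minutes %= 60;

print "$hours hours, $minutes minutes\n";
# prints: 4 hours, 15 minutes
```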
Finally, lines 19-22 print the total login time and number of logins for
each user. The simple print function just prints its arguments,
like the awk function of the same name. Note that variable
evaluation can be done within a print statement, as on lines 19 and 22.
However, if you want to do some fancy text formatting, you need
to use the printf function (which is just like its C equivalent).
In this case, we wish to set the minimum output width of the
hours and minutes values for this user to 2 characters,
and to left-pad the output with zeroes. To do this, we use the
printf function on line 21 with the %02d format.
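The effect of the format can be seen with sprintf, which builds the same string that printf would print. The values below are hypothetical, picked to match two rows of the sample output that follows.

```perl
use strict;
use warnings;

# Both fields are shorter than 2 digits, so each is padded with a zero.
my $short = sprintf("%02d:%02d", 1, 7);

# 153 already exceeds the minimum width, so it is printed in full.
my $long = sprintf("%02d:%02d", 153, 3);

print "$short\n";   # prints: 01:07
print "$long\n";    # prints: 153:03
```

The width in the format is only a minimum, which is why large hour totals are never truncated.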
If this script is saved in the file logintime, made executable
(for example, with chmod +x), and placed in a directory on your
path, we can execute it as follows:
papaya$ last | logintime
User johnsonm, total login time 01:07, total logins 11.
User kibo, total login time 00:42, total logins 3.
User linus, total login time 98:50, total logins 208.
User mdw, total login time 153:03, total logins 290.
papaya$
Of course, this example doesn't serve well as a Perl tutorial, but it should
give you some idea of what the language can do. We encourage you to read one of
the excellent Perl books out there to learn more.