home | O'Reilly's CD bookshelfs | FreeBSD | Linux | Cisco | Cisco Exam    

2.4 Pass the Envelope

Let us say we are given a text file containing Academy Award (Oscar) winners by year and category, formatted as follows:

1995:Actor:Nicholas Cage
1995:Supporting Actor:Kevin Spacey
1994:Actor:Tom Hanks
1994:Picture:Forrest Gump

We would like to provide the following services:[ 2 ]

  • Given a year and category, print the corresponding entry.

  • Given a year, print all entries for that year.

  • Given a category, print the year and title of all entries for that category.

  • Print all entries sorted by category or by year.

[2] To see real historical databases for the Oscars, look at http://oscars.guide.com/ . (Illustra, an object-oriented database from Informix, is used for the grunge work.)

2.4.1 Data Representation

Since we would like to retrieve entries by category or by year, we use a double indexing scheme, as shown in Figure 2.2 .

Figure 2.2: Data structure to represent Oscar winners

Figure 2.2

Each entry includes a category, a year, and the name of the corresponding winner. We choose to keep this information in an anonymous array (an anonymous hash would do just as well). The two indices %year_index and %category_index map the year and category to anonymous arrays containing references to the entries. Here is one way to build this structure:

open (F, "oscar.txt") || die "Could not open database: $:";
%category_index = (); %year_index = ();
while ($line = <F>) {
    chomp $line;
    ($year, $category, $name) = split (/:/, $line);
    create_entry($year, $category, $name) if $name;

sub create_entry {             # create_entry (year, category, name)
    my($year, $category, $name) = @_;
    # Create an anonymous array for each entry
    $rlEntry = [$year, $category, $name];
    # Add this to the two indices
    push (@{$year_index {$year}}, $rlEntry);         # By Year
    push (@{$category_index{$category}}, $rlEntry);  # By Category

Notice that each push statement does a fair bit of work. It creates an entry in the index (if required), hangs an anonymous array off that entry (if required), and pushes the reference to the entry into that array.

Another important thing to notice is how braces have been used to specify the correct precedence in the expression @{$year_index{$year}} . If we had omitted the braces, the expression @$year_index would be evaluated first and then indexed as a hash, according to the rules explained in the section "Confusion About Precedence" in Chapter 1 .

2.4.2 Print All Entries for a Given Year

This is a simple matter of traversing the %year_index hash:

sub print_entries_for_year {
    my($year) = @_;
    print ("Year : $year \n");
    foreach $rlEntry (@{$year_index{$year}}) {
        print ("\t",$rlEntry->[1], "  : ",$rlEntry->[2], "\n");

2.4.3 Print All Entries Sorted by Year

We already know how to print entries for a given year. Find out all years for which we have data, sort them, and call print_entries_for_year for each year:

sub print_all_entries_for_year {
    foreach $year (sort keys %year_index) {

2.4.4 Print a Specific Entry, Given a Year and Category

We can traverse either index, and we choose to traverse the %year_index index, since there are substantially fewer categories per year than the number of years for which a category is valid:

sub print_entry {
    my($year, $category) = @_;
    foreach $rlEntry (@{$year_index{$year}}) {
        if ($rlEntry->[1] eq $category) {
            print "$category ($year), ", $rlEntry->[2], "\n";
    print "No entry for $category ($year) \n";