4.6 A Brief Tutorial: Manipulating Lists of ListsThere are many kinds of nested data structures. The simplest kind to build is a list of lists (also called an array of arrays, or a multi-dimensional array). It's reasonably easy to understand, and almost everything that applies here will also be applicable to the fancier data structures. 4.6.1 Composition and AccessHere's how to put together a two-dimensional array value: # assign to an array a list of list references @LoL = ( [ "fred", "barney" ], [ "george", "jane", "elroy" ], [ "homer", "marge", "bart" ], ); print $LoL[2][2]; # prints "bart" The overall list is enclosed by parentheses, not brackets. That's because you're assigning a list to an array. If you didn't want the result to be a list, but rather a reference to an array, then you would use brackets on the outside: # assign to a scalar variable a reference to a list of list references $ref_to_LoL = [ [ "fred", "barney", "pebbles", "bambam", "dino", ], [ "homer", "bart", "marge", "maggie", ], [ "george", "jane", "elroy", "judy", ], ]; print $ref_to_LoL->[2][2]; # prints "elroy"
Remember that there is an implied $LoL[2][2] $ref_to_LoL->[2][2] are equivalent to these two lines: $LoL[2]->[2] $ref_to_LoL->[2]->[2]
There is, however, no implied
4.6.2 Growing Your OwnNow those big list assignments are well and good for creating a fixed data structure, but what if you want to calculate each element on the fly, or otherwise build the structure piecemeal? First, let's look at reading a data structure in from a file. We'll assume that there's a flat file in which each line is a row of the structure, and each word an element. Here's how to proceed: while (<>) { @tmp = split; push @LoL, [ @tmp ]; } You can also load the array from a function: for $i ( 1 .. 10 ) { @tmp = somefunc($i); $LoL[$i] = [ @tmp ]; } Of course, you don't need to name the temporary array: while (<>) { push @LoL, [ split ]; } and: for $i ( 1 .. 10 ) { $LoL[$i] = [ somefunc($i) ]; } You also don't have to use push . You could keep track of where you are in the array, and assign each line of the file to the appropriate row of the array: my (@LoL, $i, $line); for $i ( 0 .. 10 ) { # just first 11 lines $line = <>; $LoL[$i] = [ split ' ', $line ]; } Simplifying, you can avoid the assignment of the line to a mediating variable: my (@LoL, $i); for $i ( 0 .. 10 ) { # just first 11 lines $LoL[$i] = [ split ' ', <> ]; }
In general, you should be leery of using potential list functions like
my (@LoL, $i); for $i ( 0 .. 10 ) { # just first 11 lines $LoL[$i] = [ split ' ', scalar(<>) ]; }
If you want a my $ref_to_LoL; while (<>) { push @$ref_to_LoL, [ split ]; } So much for adding new rows to the list of lists. What about adding new columns? If you're just dealing with matrices, it's often easiest to use simple assignment: for $x (1 .. 10) { for $y (1 .. 10) { $LoL[$x][$y] = func($x, $y); } } for $x ( 3, 7, 9 ) { $LoL[$x][20] += func2($x); }
It doesn't matter whether the subscripted elements of # add new columns to an existing row push @{ $LoL[0] }, "wilma", "betty"; Notice that this wouldn't work: push $LoL[0], "wilma", "betty"; # WRONG!
In fact, that wouldn't even compile, because the argument to
push
must be a real array, not just a reference
to an array. Therefore, the first argument absolutely must begin
with an 4.6.3 Access and PrintingNow it's time to print your data structure. If you only want one element, do this: print $LoL[0][0]; If you want to print the whole thing, though, you can't just say: print @LoL; # WRONG because you'll get references listed, and Perl will never automatically dereference thingies for you. Instead, you have to roll yourself a loop or two. The following code prints the whole structure, using the shell-style for construct to loop through the outer set of subscripts: for $array_ref ( @LoL ) { print "\t [ @$array_ref ],\n"; } Beware of the brackets. In this and the following example, the (non-subscripting) brackets do not indicate the creation of a reference. The brackets occur inside a quoted string, not in a place where a term is expected, and therefore lose their special meaning. They are just part of the string that print outputs. If you want to keep track of subscripts, you might do this: for $i ( 0 .. $#LoL ) { print "\t element $i is [ @{$LoL[$i]} ],\n"; } or maybe even this (notice the inner loop): for $i ( 0 .. $#LoL ) { for $j ( 0 .. $#{$LoL[$i]} ) { print "element $i $j is $LoL[$i][$j]\n"; } } As you can see, things are getting a bit complicated. That's why sometimes it's easier to use a temporary variable on your way through: for $i ( 0 .. $#LoL ) { $aref = $LoL[$i]; for $j ( 0 .. $#{$aref} ) { print "element $i $j is $aref->[$j]\n"; } } But that's still a bit ugly. How about this: for $i ( 0 .. $#LoL ) { $aref = $LoL[$i]; $n = @$aref - 1; for $j ( 0 .. $n ) { print "element $i $j is $aref->[$j]\n"; } } 4.6.4 SlicesIf you want to get at a slice (part of a row) in a multi-dimensional array, you're going to have to do some fancy subscripting. That's because, while we have a nice synonym for a single element via the pointer arrow, no such convenience exists for slices. However, you can always write a loop to do a slice operation. Here's how to create a one-dimensional slice of one subarray of a two-dimensional array, using a loop. We'll assume a list-of-lists variable (rather than a reference to a list of lists): @part = (); $x = 4; for ($y = 7; $y < 13; $y++) { push @part, $LoL[$x][$y]; } That same loop could be replaced with a slice operation: @part = @{ $LoL[4] } [ 7..12 ];
If you want a
two-dimensional slice
, say, with
@newLoL = (); for ($startx = $x = 4; $x <= 8; $x++) { for ($starty = $y = 7; $y <= 12; $y++) { $newLoL[$x - $startx][$y - $starty] = $LoL[$x][$y]; } }
In this example, the individual values within each subarray of
for ($x = 4; $x <= 8; $x++) { push @newLoL, [ @{ $LoL[$x] } [ 7..12 ] ]; }
Of course, if you do this very often, you should probably write a
subroutine called something like 4.6.5 Common MistakesAs mentioned previously, every array or hash in Perl is implemented in one dimension. "Multi-dimensional" arrays, too, are one-dimensional, but the values in this one-dimensional array are references to other data structures. If you print these values out without dereferencing them, you will get the references rather than the data referenced. For example, these two lines: @LoL = ( [2, 3], [4, 5, 7], [0] ); print "@LoL"; result in: ARRAY(0x83c38) ARRAY(0x8b194) ARRAY(0x8b1d0) On the other hand, this line: print $LoL[1][2];
yields
Perl dereferences your variables only when you employ one of the
dereferencing mechanisms. But remember that
my $listref = [ [ "fred", "barney", "pebbles", "bambam", "dino", ], [ "homer", "bart", "marge", "maggie", ], [ "george", "jane", "elroy", "judy", ], ]; print $listref[2][2]; # WRONG!
Here, print $listref->[2][2];
By contrast, use strict 'vars'; # or just use strict then the use of the undeclared array will be flagged as an error at compile time. In constructing an array of arrays, remember to take a reference for the daughter arrays. Otherwise, you will just create an array containing the element counts of the daughter arrays, like this: for $i (1..10) { @list = somefunc($i); $LoL[$i] = @list; # WRONG! }
Here Another common error involves taking a reference to the same memory location over and over again: for $i (1..10) { @list = somefunc($i); $LoL[$i] = \@list; # WRONG! }
Every reference generated by the second line of the
for
loop is
the same, namely, a reference to the single array Here's a more successful approach: for $i (1..10) { @list = somefunc($i); $LoL[$i] = [ @list ]; }
The brackets make a reference to a new array with a
copy
of what's in A similar result - though much more difficult to read - would be produced by: for $i (1..10) { @list = somefunc($i); @{$LoL[$i]} = @list; }
Since
But there
is
a situation in which you might use it.
Suppose $LoL[3] = \@original_list;
And now suppose that you want to change @{$LoL[3]} = @list;
In this case, the reference itself does not change, but the elements of
the array being referred to do. You need to be aware, however,
that this approach overwrites the values of Finally, the following dangerous-looking code actually works fine: for $i (1..10) { my @list = somefunc($i); $LoL[$i] = \@list; } That's because the my variable is created afresh each time through the loop. So even though it looks as though you stored the same variable reference each time, you actually did not. This is a subtle distinction, but the technique can produce more efficient code, at the risk of misleading less enlightened programmers. It's more efficient because there's no copy in the final assignment. On the other hand, if you have to copy the values anyway (which the first assignment above is doing), then you might as well use the copy implied by the brackets and avoid the temporary variable: for $i (1..10) { $LoL[$i] = [ somefunc($i) ]; } In summary: $LoL[$i] = [ @list ]; # safest, sometimes fastest $LoL[$i] = \@list; # fast but risky, depends on my-ness of list @{ $LoL[$i] } = @list; # too tricky for most uses |
|