Chapter 5. Manipulating Complex Data Structures
Now that you've seen the basics of references, let's look at additional ways to manipulate complex data. We'll start by using the debugger to examine complex data structures and then use Data::Dumper to show the data under programmatic control. Next, you'll learn to store and retrieve complex data easily and quickly using Storable, and finally we'll wrap up with a review of grep and map and see how they apply to complex data.

5.1. Using the Debugger to View Complex Data

The Perl debugger can display complex data easily. For example, let's single-step through one version of the byte-counting program from Chapter 4:

    my %total_bytes;
    while (<>) {
      my ($source, $destination, $bytes) = split;
      $total_bytes{$source}{$destination} += $bytes;
    }
    for my $source (sort keys %total_bytes) {
      for my $destination (sort keys %{ $total_bytes{$source} }) {
        print "$source => $destination:",
          " $total_bytes{$source}{$destination} bytes\n";
      }
      print "\n";
    }

Here's the data you'll use to test it:

    professor.hut gilligan.crew.hut 1250
    professor.hut lovey.howell.hut 910
    thurston.howell.hut lovey.howell.hut 1250
    professor.hut lovey.howell.hut 450
    ginger.girl.hut professor.hut 1218
    ginger.girl.hut maryann.girl.hut 199

You can start the debugger in a number of ways. One of the easiest is to invoke Perl with a -d switch on the command line:

    myhost% perl -d bytecounts bytecounts-in

    Loading DB routines from perl5db.pl version 1.19
    Editor support available.

    Enter h or `h h' for help, or `man perldebug' for more help.

    main::(bytecounts:2):   my %total_bytes;
      DB<1> s
    main::(bytecounts:3):   while (<>) {
      DB<1> s
    main::(bytecounts:4):     my ($source, $destination, $bytes) = split;
      DB<1> s
    main::(bytecounts:5):     $total_bytes{$source}{$destination} += $bytes;
      DB<1> x $source, $destination, $bytes
    0  'professor.hut'
    1  'gilligan.crew.hut'
    2  1250

If you're playing along at home, be aware that each release of the debugger behaves a little differently from the others, so your screen probably won't look exactly like this. Also, if you get stuck at any time, type h for help, or look at perldoc perldebug.

Each line of code is shown before it is executed. That means that, at this point, you're about to trigger autovivification, and you've got your keys established. The s command single-steps the program, while the x command dumps a list of values in a nice format. You can see that $source, $destination, and $bytes are correct, and now it's time to update the data:

      DB<2> s
    main::(bytecounts:3):   while (<>) {

You've created the hash entries through autovivification. Let's see what you've got:

      DB<2> x \%total_bytes
    0  HASH(0x132dc)
       'professor.hut' => HASH(0x37a34)
          'gilligan.crew.hut' => 1250

When x is given a hash reference, it dumps the entire contents of the hash, showing the key/value pairs. If any of the values are also hash references, they are dumped as well, recursively. What you'll see is that the %total_bytes hash has a single key of professor.hut, whose corresponding value is another hash reference. The referenced hash contains a single key of gilligan.crew.hut, with a value of 1250, as expected.
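
To make the recursive behavior of x more concrete, here is a minimal sketch of a routine that walks nested hash references in a similar way. The dump_hash name and its output layout are assumptions made for illustration; this is not how the debugger itself is implemented, and it handles only hashes of hashes.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the debugger): renders nested hash
# references as indented text, loosely in the spirit of the x command.
sub dump_hash {
    my ($ref, $depth) = @_;
    $depth ||= 0;
    my $indent = '   ' x $depth;
    my $out    = '';
    for my $key (sort keys %$ref) {
        my $value = $ref->{$key};
        if (ref($value) eq 'HASH') {
            # A nested hash reference: show the key, then recurse one level deeper.
            $out .= "$indent'$key' => HASH\n";
            $out .= dump_hash($value, $depth + 1);
        }
        else {
            # A plain scalar: show the key/value pair directly.
            $out .= "$indent'$key' => $value\n";
        }
    }
    return $out;
}

# The same structure the debugger displayed after the first line of input.
my %total_bytes = (
    'professor.hut' => { 'gilligan.crew.hut' => 1250 },
);

my $dump = dump_hash(\%total_bytes);
print $dump;
```

Each nested hash reference is indented one level further than its parent, which is the same visual cue the debugger output relies on.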

Let's see what happens just after the next assignment:

      DB<3> s
    main::(bytecounts:4):     my ($source, $destination, $bytes) = split;
      DB<3> s
    main::(bytecounts:5):     $total_bytes{$source}{$destination} += $bytes;
      DB<3> x $source, $destination, $bytes
    0  'professor.hut'
    1  'lovey.howell.hut'
    2  910
      DB<4> s
    main::(bytecounts:3):   while (<>) {
      DB<4> x \%total_bytes
    0  HASH(0x132dc)
       'professor.hut' => HASH(0x37a34)
          'gilligan.crew.hut' => 1250
          'lovey.howell.hut' => 910

Now you've added bytes flowing from professor.hut to lovey.howell.hut. The top-level hash hasn't changed, but the second-level hash has added a new entry. Let's continue:

      DB<5> s
    main::(bytecounts:4):     my ($source, $destination, $bytes) = split;
      DB<6> s
    main::(bytecounts:5):     $total_bytes{$source}{$destination} += $bytes;
      DB<6> x $source, $destination, $bytes
    0  'thurston.howell.hut'
    1  'lovey.howell.hut'
    2  1250
      DB<7> s
    main::(bytecounts:3):   while (<>) {
      DB<7> x \%total_bytes
    0  HASH(0x132dc)
       'professor.hut' => HASH(0x37a34)
          'gilligan.crew.hut' => 1250
          'lovey.howell.hut' => 910
       'thurston.howell.hut' => HASH(0x2f9538)
          'lovey.howell.hut' => 1250

Ah, now it's getting interesting. A new entry in the top-level hash has a key of thurston.howell.hut, and a new hash reference, autovivified initially to an empty hash. Immediately after the new empty hash was put in place, a new key/value pair was added, indicating 1250 bytes transferred from thurston.howell.hut to lovey.howell.hut.
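
The autovivification step can also be checked outside the debugger. The following is a small sketch, assuming a fresh and empty %total_bytes; the $existed_before and $kind variables are illustrative names that do not appear in the original program.

```perl
use strict;
use warnings;

my %total_bytes;

# Before the assignment, the top-level key does not exist at all.
my $existed_before = exists $total_bytes{'thurston.howell.hut'} ? 1 : 0;

# Using the missing element as a hash reference autovivifies an empty
# anonymous hash, and += then creates the inner entry with the value 1250.
$total_bytes{'thurston.howell.hut'}{'lovey.howell.hut'} += 1250;

# The top-level value is now a reference to a hash.
my $kind = ref $total_bytes{'thurston.howell.hut'};

print "existed before: $existed_before\n";
print "value is a $kind reference\n";
print "bytes: $total_bytes{'thurston.howell.hut'}{'lovey.howell.hut'}\n";
```

No explicit step creates the inner hash; the single assignment line does both jobs, which is why the debugger shows the new top-level key and its first entry appearing together.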

Let's step some more:

      DB<8> s
    main::(bytecounts:4):     my ($source, $destination, $bytes) = split;
      DB<8> s
    main::(bytecounts:5):     $total_bytes{$source}{$destination} += $bytes;
      DB<8> x $source, $destination, $bytes
    0  'professor.hut'
    1  'lovey.howell.hut'
    2  450
      DB<9> s
    main::(bytecounts:3):   while (<>) {
      DB<9> x \%total_bytes
    0  HASH(0x132dc)
       'professor.hut' => HASH(0x37a34)
          'gilligan.crew.hut' => 1250
          'lovey.howell.hut' => 1360
       'thurston.howell.hut' => HASH(0x2f9538)
          'lovey.howell.hut' => 1250

Now you're adding in some more bytes from professor.hut to lovey.howell.hut, reusing the existing entry: the earlier 910 plus the new 450 gives 1360. Nothing too exciting there. Let's keep stepping:

      DB<10> s
    main::(bytecounts:4):     my ($source, $destination, $bytes) = split;
      DB<10> s
    main::(bytecounts:5):     $total_bytes{$source}{$destination} += $bytes;
      DB<10> x $source, $destination, $bytes
    0  'ginger.girl.hut'
    1  'professor.hut'
    2  1218
      DB<11> s
    main::(bytecounts:3):   while (<>) {
      DB<11> x \%total_bytes
    0  HASH(0x132dc)
       'ginger.girl.hut' => HASH(0x297474)
          'professor.hut' => 1218
       'professor.hut' => HASH(0x37a34)
          'gilligan.crew.hut' => 1250
          'lovey.howell.hut' => 1360
       'thurston.howell.hut' => HASH(0x2f9538)
          'lovey.howell.hut' => 1250

This time, you added a new source, ginger.girl.hut. Notice that the top-level hash now has three elements, and each element has a different hash reference value.
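
The accumulation can be reproduced without the debugger by running the same loop body over the sample data. This sketch assumes the six input lines are held in an array named @lines rather than read from a file; that array and the $top_level_count variable are illustrative additions, not part of the original program.

```perl
use strict;
use warnings;

# The sample data from this section, held in memory for the sketch.
my @lines = (
    'professor.hut gilligan.crew.hut 1250',
    'professor.hut lovey.howell.hut 910',
    'thurston.howell.hut lovey.howell.hut 1250',
    'professor.hut lovey.howell.hut 450',
    'ginger.girl.hut professor.hut 1218',
    'ginger.girl.hut maryann.girl.hut 199',
);

my %total_bytes;
for (@lines) {
    # Same loop body as the byte-counting program: split on whitespace,
    # then add the byte count to the source/destination total.
    my ($source, $destination, $bytes) = split;
    $total_bytes{$source}{$destination} += $bytes;
}

# In scalar context, keys returns the number of top-level sources.
my $top_level_count = keys %total_bytes;

print "sources: $top_level_count\n";
print "professor.hut => lovey.howell.hut: ",
    "$total_bytes{'professor.hut'}{'lovey.howell.hut'} bytes\n";
```

The two professor.hut to lovey.howell.hut lines land in the same inner entry, so their byte counts are summed rather than stored separately.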

Let's step some more:

      DB<12> s
    main::(bytecounts:4):     my ($source, $destination, $bytes) = split;
      DB<12> s
    main::(bytecounts:5):     $total_bytes{$source}{$destination} += $bytes;
      DB<12> x $source, $destination, $bytes
    0  'ginger.girl.hut'
    1  'maryann.girl.hut'
    2  199
      DB<13> s
    main::(bytecounts:3):   while (<>) {
      DB<13> x \%total_bytes
    0  HASH(0x132dc)
       'ginger.girl.hut' => HASH(0x297474)
          'maryann.girl.hut' => 199
          'professor.hut' => 1218
       'professor.hut' => HASH(0x37a34)
          'gilligan.crew.hut' => 1250
          'lovey.howell.hut' => 1360
       'thurston.howell.hut' => HASH(0x2f9538)
          'lovey.howell.hut' => 1250

Now you've added a second destination to the hash that records information for all bytes originating at ginger.girl.hut. Because that was the final line of data (in this run), a step brings you down to the lower foreach loop:

      DB<14> s
    main::(bytecounts:8):   for my $source (sort keys %total_bytes) {

Even though you can't directly examine the list value from inside those parentheses, you can display it:

      DB<14> x sort keys %total_bytes
    0  'ginger.girl.hut'
    1  'professor.hut'
    2  'thurston.howell.hut'

This is the list the foreach now scans. These are all the sources for transferred bytes seen in this particular logfile. Here's what happens when you step into the inner loop:

      DB<15> s
    main::(bytecounts:9):     for my $destination (sort keys %{ $total_bytes{$source} }) {

At this point, you can work from the inside out to determine exactly what list the expression inside the parentheses will produce. Let's look at the pieces:

      DB<15> x $source
    0  'ginger.girl.hut'
      DB<16> x $total_bytes{$source}
    0  HASH(0x297474)
       'maryann.girl.hut' => 199
       'professor.hut' => 1218
      DB<18> x keys %{ $total_bytes{$source } }
    0  'maryann.girl.hut'
    1  'professor.hut'
      DB<19> x sort keys %{ $total_bytes{$source } }
    0  'maryann.girl.hut'
    1  'professor.hut'

Note that dumping $total_bytes{$source} shows that it is a hash reference. Also, the sort appears not to have done anything, but only because keys happened to return the keys already in sorted order here. The output of keys is not necessarily in any sorted order, so the sort is still needed.
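
The point about ordering can be illustrated with a short sketch. It assumes a hand-built %total_bytes holding only the ginger.girl.hut entries from the dump above; the @unsorted and @sorted arrays are illustrative names. The order of @unsorted may differ from run to run, so only the sorted list is printed.

```perl
use strict;
use warnings;

# Only the ginger.girl.hut portion of the structure, built by hand.
my %total_bytes = (
    'ginger.girl.hut' => {
        'professor.hut'    => 1218,
        'maryann.girl.hut' => 199,
    },
);
my $source = 'ginger.girl.hut';

# keys returns every key, but in no guaranteed order.
my @unsorted = keys %{ $total_bytes{$source} };

# sort imposes a predictable (string) order on that list.
my @sorted = sort keys %{ $total_bytes{$source} };

print "destinations: @sorted\n";
```

Relying on @unsorted happening to come back in order would make the report's line order depend on Perl's internal hash layout, which is why the program sorts explicitly.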

The next step finds the data:

      DB<20> s
    main::(bytecounts:10):      print "$source => $destination:",
    main::(bytecounts:11):        " $total_bytes{$source}{$destination} bytes\n";
      DB<20> x $source, $destination
    0  'ginger.girl.hut'
    1  'maryann.girl.hut'
      DB<21> x $total_bytes{$source}{$destination}
    0  199

As you can see, with the debugger, you can easily show the data, even structured data, to help you understand your program.

Copyright © 2003 O'Reilly & Associates. All rights reserved.