11. References and Records
With as little a web as this will I ensnare as great a fly as Cassio.
- Shakespeare, Othello, Act II, scene i

11.0. Introduction

Perl provides three fundamental data types: scalars, arrays, and hashes. It's certainly possible to write many programs without recourse to complex records, but most programs need something more complex than simple variables and lists. Perl's three built-in types combine with references to produce arbitrarily complex and powerful data structures, the records that users of ancient versions of Perl desperately yearned for. Selecting the proper data structure and algorithm can make the difference between an elegant program that does its job quickly and an ungainly concoction that's glacially slow to execute and consumes system resources voraciously.

The first part of this chapter shows how to create and use plain references. The second part shows how to use references to create higher order data structures.

References

To grasp the concept of references, you must first understand how Perl stores values in variables. Each defined variable has a name and the address of a chunk of memory associated with it. This idea of storing addresses is fundamental to references because a reference is a value that holds the location of another value. The scalar value that contains the memory address is called a reference. Whatever value lives at that memory address is called a referent. (You may also call it a "thingie" if you prefer to live a whimsical existence.) See Figure 11.1. The referent could be any of Perl's built-in types (scalar, array, hash, ref, code, or glob) or a user-defined type based on one of the built-in ones.

Figure 11.1: Reference and referent

Referents in Perl are typed. This means you can't treat a reference to an array as though it were a reference to a hash, for example. Attempting to do so produces a runtime exception. No mechanism for type casting exists in Perl. This is considered a feature.
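The typing rule can be seen in a small experiment. This is only a sketch: the variable names are hypothetical, and it relies on Perl's built-in ref function to report the referent's type and on eval to trap the runtime exception.

```perl
use strict;
use warnings;

my @colors = ("red", "green", "blue");
my $aref   = \@colors;          # reference to an array

# ref() reports what kind of referent the reference points to.
my $type = ref($aref);
print "referent type: $type\n";

# Treating the array reference as a hash reference dies at runtime;
# eval traps the exception so the script can carry on.
my $ok    = eval { my @keys = keys %$aref; 1 };
my $error = $ok ? "" : $@;
print "hash dereference failed: $error" unless $ok;
```

The failure happens when the dereference executes, not when the program is compiled, which is why the eval block is able to catch it.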
So far, it may look as though a reference were little more than a raw address with strong typing. But it's far more than that. Perl takes care of automatic memory allocation and deallocation (garbage collection) for references, just as it does for everything else. Every chunk of memory in Perl has a reference count associated with it, representing how many places know about that referent. The memory used by a referent is not returned to the process's free pool until its reference count reaches zero. This ensures that you never have a reference that isn't valid - no more core dumps and general protection faults from mismanaged pointers as in C. Freed memory is returned to Perl for later use, but few operating systems reclaim it and decrease the process's memory footprint. This is because most memory allocators use a stack, and if you free up memory in the middle of the stack, the operating system can't take it back without moving the rest of the allocated memory around. That would destroy the integrity of your pointers and blow XS code out of the water.
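A short sketch of the reference-counting behavior described above, using hypothetical variable names: the array outlives the block that named it because a reference to it still exists.

```perl
use strict;
use warnings;

my $survivor;
{
    my @temp = (1, 2, 3);   # the name @temp is visible only inside this block
    $survivor = \@temp;     # a second place now knows about the array
}

# The name @temp is gone, but the array is still there
# because $survivor continues to refer to it.
my $contents = "@$survivor";
print "still reachable: $contents\n";

undef $survivor;            # drop the last reference; Perl can now reuse the memory
```

Nothing here frees memory by hand. The array is released only once nothing refers to it any longer.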
To follow a reference to its referent, preface the reference with the appropriate type symbol for the data you're accessing. For instance, if $sref is a reference to a scalar, you can say:

print $$sref;    # prints the scalar value that the reference $sref refers to
$$sref = 3;      # assigns to $sref's referent
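The same prefix rule carries over to the other referent types. The following is a minimal sketch with hypothetical names (@nums, %pairs, and describe are made up for illustration):

```perl
use strict;
use warnings;

my @nums  = (3, 4, 5);
my %pairs = (How => "Now", Brown => "Cow");
sub describe { return "called with @_" }

my $aref = \@nums;
my $href = \%pairs;
my $cref = \&describe;

my @all    = @$aref;          # @ prefix: the whole array
my %all    = %$href;          # % prefix: the whole hash
my $result = &$cref(1, 2);    # & prefix: call the function

print "array: @all\n";
print "hash keys: ", join(", ", sort keys %all), "\n";
print "code: $result\n";
```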
To access one element of an array or hash whose reference you have, use the infix pointer-arrow notation (->).
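A minimal sketch of element access through the arrow, assuming hypothetical variables @list and %table:

```perl
use strict;
use warnings;

my @list  = ("zero", "one", "two");
my %table = (wilma => "flintstone");

my $aref = \@list;
my $href = \%table;

my $elem  = $aref->[1];       # same element as $$aref[1]
my $value = $href->{wilma};   # same value as $$href{wilma}

$aref->[2] = "deux";          # assigning through the reference changes @list itself

print "$elem $value $list[2]\n";
```

Because $aref refers to @list rather than to a copy of it, the assignment is visible through the original variable name.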
Perl's syntax rules make dereferencing complex expressions tricky - it falls into the category of "hard things that should be possible." Mixing right associative and left associative operators doesn't work out well. For example,
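One way to see the difficulty is to give the same name to a scalar holding an array reference and to an unrelated array. This is a sketch with made-up data, not an example taken from the original text:

```perl
use strict;
use warnings;

my $x = [ "a", "b", "c", "d", "e" ];   # $x holds a reference to an array
my @x = (0, 1, 2, 3, \'from @x');      # unrelated array @x; element 4 is a scalar reference

my $via_ref   = $$x[4];       # treats $x as an array reference, then takes element 4
my $braced    = ${$x}[4];     # the same thing with explicit braces
my $arrow     = $x->[4];      # the same thing with arrow notation
my $via_array = ${ $x[4] };   # element 4 of @x, dereferenced as a scalar reference

print "$via_ref $braced $arrow | $via_array\n";
```

The first three forms all reach into the array that $x refers to. Only the last one touches the array @x, and the braces are what make that intent explicit.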
In the simple cases using $$sref above, you could instead write:

print ${$sref};    # prints the scalar $sref refers to
${$sref} = 3;      # assigns to $sref's referent

For safety, some programmers use this notation exclusively.
When passed a reference, the
You can create references in Perl by taking references to things that are already there, using the backslash operator, or by creating anonymous data, as described under Anonymous Data. For example, to take a reference to an existing array:

$aref = \@array;

You can even create references to constant values; future attempts to change the value of the referent will cause a runtime error:

$pi = \3.14159;
$$pi = 4;           # runtime error

Anonymous Data

Taking references to existing data is helpful when you're using pass-by-reference in a function call, but for dynamic programming, it becomes cumbersome. You need to be able to grow data structures at will, to allocate new arrays and hashes (or scalars or functions) on demand. You don't want to be bogged down with having to give them names each time.

Perl can explicitly create anonymous arrays and hashes, which allocate a new array or hash and return a reference to that memory:

$aref = [ 3, 4, 5 ];                               # new anonymous array
$href = { "How" => "Now", "Brown" => "Cow" };      # new anonymous hash

Perl can also create a reference implicitly by autovivification. This is what happens when you try to assign through an undefined reference and Perl automatically creates the reference you're trying to use.
undef $aref;
@$aref = (1, 2, 3);
print $aref;
Notice how we went from an undefined variable to one with an array reference in it without actually assigning anything? Perl filled in the undefined reference for you. This is the property that permits something like this to work as the first statement in your program:

$a[4][23][53][21] = "fred";
print $a[4][23][53][21];
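To see what autovivification actually built, the intermediate levels can be inspected. This sketch uses a hypothetical array @grid in place of @a and the built-in ref function:

```perl
use strict;
use warnings;

my @grid;                                # starts out empty
$grid[4][23][53][21] = "fred";           # one assignment, four levels deep

my $outer_len = scalar @grid;            # indices 0..4 now exist
my $kind      = ref $grid[4];            # an array reference, created on demand
my $inner_len = scalar @{ $grid[4] };    # indices 0..23 at the second level
my $untouched = defined $grid[0] ? "defined" : "undef";

print "$outer_len $kind $inner_len $untouched\n";   # prints "5 ARRAY 24 undef"
```

Only the slots along the path to the assigned element receive references. The other slots are left undefined.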
The following table shows mechanisms for producing references to both named and anonymous scalars, arrays, hashes, and functions. (Anonymous typeglobs are too scary to show - and virtually never used. It's best to use
These diagrams illustrate the differences between named and anonymous values. Figure 11.2 shows named values.

Figure 11.2: Named values

In other words, saying

Figure 11.3 shows anonymous values.

Figure 11.3: Anonymous values
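The named and anonymous cases can be placed side by side in code. The following is a sketch with hypothetical names, covering scalars, arrays, hashes, and functions:

```perl
use strict;
use warnings;

# Named values: each referent already has a variable (or subroutine) name.
my $scalar = "fred";
my @array  = (1, 2, 3);
my %hash   = (key => "value");
sub func { return "named" }

my $named_sref = \$scalar;
my $named_aref = \@array;
my $named_href = \%hash;
my $named_cref = \&func;

# Anonymous values: the referent can be reached only through the reference.
my $anon_sref = \"fred";                  # reference to a read-only constant
my $anon_aref = [ 1, 2, 3 ];
my $anon_href = { key => "value" };
my $anon_cref = sub { return "anonymous" };

# Changing a named referent through its reference changes the variable too.
$$named_sref = "barney";
print "$scalar\n";                        # prints "barney"
```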
Every reference evaluates as true, by definition, so if you write a subroutine that returns a reference, you can return a false value such as undef on error and check for it with:

$op_cit = cite($ibid) or die "couldn't make a reference";
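A sketch of a subroutine written to that convention follows. The cite function, the %citations table, and its contents are all hypothetical stand-ins, since the text does not define them:

```perl
use strict;
use warnings;

# Hypothetical lookup table of records, keyed by a short name.
my %citations = (example => { title => "Example Title" });

# Returns a hash reference on success, or undef when the key is unknown.
sub cite {
    my ($key) = @_;
    return exists $citations{$key} ? $citations{$key} : undef;
}

my $found   = cite("example") or die "couldn't make a reference";
my $missing = cite("nobody");   # no "or die" here, so the failure can be inspected

print "found: $found->{title}\n";
print "missing lookup returned ", (defined $missing ? "a value" : "undef"), "\n";
```

The truth test works because a successful call always returns a reference, so any false result can only mean failure.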
Records

The classic use of references in Perl is to circumvent the restriction that arrays and hashes may hold scalars only. References are scalars, so to make an array of arrays, make an array of array references. Similarly, hashes of hashes are implemented as hashes of hash references, arrays of hashes as arrays of hash references, hashes of arrays as hashes of array references, and so on.

Once you have these complex structures, you can use them to implement records. A record is a single logical unit composed of different attributes. For instance, a name, an address, and a birthday might comprise a record representing a person. C calls such things structs, and Pascal calls them RECORDs. Perl doesn't have a particular name for these because you can implement this notion in different ways.

The most common technique in Perl is to treat a hash as a record, where the keys of the hash are the record's field names and the values of the hash are those fields' values. For instance, we might create a "person" record like this:

$Nat = {
    "Name"     => "Leonhard Euler",
    "Address"  => "1729 Ramanujan Lane\nMathworld, PI 31416",
    "Birthday" => 0x5bb5580,
};
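Fields of a hash-based record are read through the reference with the arrow notation. This is a minimal sketch that reuses the "person" record and adds only hypothetical helper variables:

```perl
use strict;
use warnings;

my $Nat = {
    "Name"     => "Leonhard Euler",
    "Address"  => "1729 Ramanujan Lane\nMathworld, PI 31416",
    "Birthday" => 0x5bb5580,
};

my $name   = $Nat->{Name};        # read one field
my @fields = sort keys %$Nat;     # list the record's field names

print "Name: $name\n";
print "Fields: @fields\n";
printf "Birthday (seconds since the epoch): %d\n", $Nat->{Birthday};
```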
Because
The attributes of a record, including the "person" record, are always scalars. You can certainly use numbers as readily as strings there, but that's no great trick. The real power play happens when you use even more references for values in the record. At this point, we've conceptually moved beyond simple records. We're now creating elaborate data structures that represent complicated relationships between the data they hold.

Although we can use these to implement traditional data structures like linked lists, the recipes in the second half of this chapter don't deal specifically with any particular structure. Instead, they give generic techniques for loading, printing, copying, and saving generic data structures. The final program example demonstrates how to manipulate binary trees.

See Also

Chapter 4 of Programming Perl; perlref(1), perllol(1), and perldsc(1)