Merging Hashes (Perl Cookbook, 2nd Edition)

5.11.3. Discussion

The first method, like the earlier recipe on inverting a hash, uses the hash-list equivalence explained in the introduction. (%A, %B) evaluates to a list of paired keys and values. When we assign it to %merged, Perl turns that list of pairs back into a hash.

Here's an example of that technique:

# %food_color as per the introduction
%drink_color = ( Galliano  => "yellow",
                 "Mai Tai" => "blue" );

%ingested_color = (%drink_color, %food_color);

A key that occurs in both input hashes appears just once in the output hash. If a food and a drink shared the same name, for instance, the first merging technique would keep whichever value it saw last. In the example above, that is the value from %food_color, because %food_color is listed second.
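Here is a minimal sketch of that last-one-wins behavior. The data is made up for illustration (it is not the %food_color from the introduction), and "Lemon" is deliberately placed in both hashes:

```perl
use strict;
use warnings;

# Hypothetical sample data; "Lemon" appears in both hashes on purpose.
my %drink_color = ( Galliano => "yellow", Lemon => "clear" );
my %food_color  = ( Apple    => "red",    Lemon => "yellow" );

# %food_color is listed last, so its value for "Lemon" overwrites the drink's.
my %ingested_color = (%drink_color, %food_color);

# Swapping the order makes the drink's value win instead.
my %reversed = (%food_color, %drink_color);

print "Lemon: $ingested_color{Lemon}\n";             # Lemon: yellow
print "Lemon (reversed): $reversed{Lemon}\n";        # Lemon (reversed): clear
print "keys: ", scalar(keys %ingested_color), "\n";  # keys: 3
```

The order of the hashes inside the parentheses is therefore the only control this technique gives you over duplicates.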

This style of direct assignment, as in the first example, is easier to read and write, but requires a lot of memory if the hashes are large. That's because Perl has to unroll both hashes into a temporary list before the assignment to the merged hash is done. Step-by-step merging using each, as in the second technique, spares you that cost and lets you decide what to do with duplicate keys.

The first example could be rewritten to use the each technique:

# %food_color per the introduction, then
%drink_color = ( Galliano  => "yellow",
                 "Mai Tai" => "blue" );

%substance_color = ( );
while (($k, $v) = each %food_color) {
    $substance_color{$k} = $v;
}
while (($k, $v) = each %drink_color) {
    $substance_color{$k} = $v;
}

That technique duplicated the while and assignment code. Here's a sneaky way to get around that:

foreach $substanceref ( \%food_color, \%drink_color ) {
    while (($k, $v) = each %$substanceref) {
        $substance_color{$k} = $v;
    }
}

If we're merging hashes with duplicates, we can insert our own code to decide what to do with those duplicates:

foreach $substanceref ( \%food_color, \%drink_color ) {
    while (($k, $v) = each %$substanceref) {
        if (exists $substance_color{$k}) {
            print "Warning: $k seen twice.  Using the first definition.\n";
            next;
        }
        $substance_color{$k} = $v;
    }
}
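The duplicate check can also be factored out so that the policy is supplied by the caller. The following is only a sketch: merge_with is a hypothetical helper (not part of Perl or of this recipe), the callback signature is an assumption chosen for illustration, and the data is made up.

```perl
use strict;
use warnings;

# merge_with() is a hypothetical helper, not part of Perl or the recipe.
# $on_dup->($key, $old, $new) returns the value to keep for a repeated key.
sub merge_with {
    my ($on_dup, @hashrefs) = @_;
    my %merged;
    for my $href (@hashrefs) {
        while (my ($k, $v) = each %$href) {
            $merged{$k} = exists $merged{$k}
                        ? $on_dup->($k, $merged{$k}, $v)
                        : $v;
        }
    }
    return %merged;
}

# Hypothetical sample data with one shared key.
my %food_color  = ( Apple    => "red",    Lemon => "yellow" );
my %drink_color = ( Galliano => "yellow", Lemon => "clear" );

# Policy 1: keep the first definition, as in the loop above.
my %first = merge_with(sub { $_[1] }, \%food_color, \%drink_color);

# Policy 2: keep both values, joined into one string.
my %both  = merge_with(sub { "$_[1]/$_[2]" }, \%food_color, \%drink_color);

print "$first{Lemon}\n";   # yellow
print "$both{Lemon}\n";    # yellow/clear
```

Because the hashes are still walked one pair at a time with each, this keeps the memory advantage of the step-by-step technique.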

In the special case of appending one hash to another, we can use the hash slice notation to give an elegant shorthand:

@all_colors{keys %new_colors} = values %new_colors;

This requires enough memory for lists of the keys and values of %new_colors. As with the first technique, the memory requirement might make this technique infeasible when such lists would be large. Any key already present in %all_colors is overwritten by its value from %new_colors.
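A small sketch of the slice assignment in action, using made-up data in which "Lemon" exists in both hashes:

```perl
use strict;
use warnings;

# Hypothetical sample data; "Lemon" is in both hashes.
my %all_colors = ( Apple => "red",   Lemon => "yellow" );
my %new_colors = ( Lemon => "green", Plum  => "purple" );

# keys and values return their lists in the same order for an unmodified
# hash, so each key lines up with its own value.
@all_colors{keys %new_colors} = values %new_colors;

for my $k (sort keys %all_colors) {
    print "$k => $all_colors{$k}\n";
}
# Apple => red
# Lemon => green
# Plum => purple
```

The slice works only because %new_colors is not modified between the calls to keys and values. If it were, the two lists could no longer be relied on to correspond.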
