
4.24. Finding the Union, Intersection, or Difference of Two Arrays

4.24.3. Discussion

Many of the components needed for these calculations are built into PHP; it's just a matter of combining them in the proper sequence.

To find the union, you merge the two arrays to create one giant array with all values. But, array_merge( ) allows duplicate values when merging two numeric arrays, so you call array_unique( ) to filter them out. This can leave gaps between entries because array_unique( ) doesn't compact the array. It isn't a problem, however, as foreach and each( ) handle sparsely filled arrays without a hitch.
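As a minimal sketch of the union step described above, the following uses two small sample arrays ($a and $b are illustrative values, not taken from the recipe). It shows the gap that array_unique( ) leaves in the keys and, as an optional extra, how array_values( ) can close it if you need consecutive indexes:

```php
<?php
// Sample data, assumed for illustration only.
$a = array('apple', 'banana', 'cherry');
$b = array('banana', 'cherry', 'date');

// array_merge() appends $b to $a and renumbers the keys 0 through 5.
// array_unique() then keeps the first occurrence of each value,
// along with that occurrence's key, so keys 3 and 4 disappear.
$union = array_unique(array_merge($a, $b));
print_r($union);

// Optional: reindex from 0 if the gaps matter to later code.
$compact = array_values($union);
print_r($compact);
```

Here $union holds the keys 0, 1, 2, and 5, while $compact holds the same four values under the keys 0 through 3.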

The function to calculate the intersection is simply named array_intersect( ) and requires no additional work on your part.
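A short sketch of the intersection, using PHP's built-in array_intersect( ) on sample arrays ($a and $b are illustrative values, not taken from the recipe). Note that the result keeps the keys of the first argument:

```php
<?php
// Sample data, assumed for illustration only.
$a = array('apple', 'banana', 'cherry');
$b = array('banana', 'cherry', 'date');

// Values of $a that also appear in $b; the keys come from $a.
$intersection = array_intersect($a, $b);
print_r($intersection);
```

The result contains 'banana' and 'cherry' under their original keys from $a, 1 and 2.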

The array_diff( ) function returns an array containing every element of $old whose value doesn't appear in $new, with the original keys preserved. This is known as the simple difference:

$old = array('To', 'be', 'or', 'not', 'to', 'be');
$new = array('To', 'be', 'or', 'whatever');
$difference = array_diff($old, $new);
print_r($difference);
Array
(
    [3] => not
    [4] => to
)

The resulting array, $difference, contains 'not' and 'to'. The lowercase 'to' is included because array_diff( ) is case-sensitive, so it doesn't match 'To' in $new. The array doesn't contain 'whatever' because that value doesn't appear in $old.

To get a reverse difference, or in other words, to find the unique elements in $new that are lacking in $old, flip the arguments:

$old = array('To', 'be', 'or', 'not', 'to', 'be');
$new = array('To', 'be', 'or', 'whatever');
$reverse_diff = array_diff($new, $old);
print_r($reverse_diff);
Array
(
    [3] => whatever
)

The $reverse_diff array contains only 'whatever'.

If you want to apply a function or other filter to array_diff( ), roll your own diffing algorithm:

// implement case-insensitive diffing; diff -i

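// build a lookup table keyed by the lowercased values of $new
$seen = array( );
foreach ($new as $n) {
    $seen[strtolower($n)] = true;
}

// collect the lowercased values of $old that are missing from the lookup
$diff = array( );
foreach ($old as $o) {
    $o = strtolower($o);
    if (!isset($seen[$o])) { $diff[$o] = $o; }
}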

The first foreach builds an associative array to serve as a lookup table. You then loop through $old and, if you can't find an entry in the lookup table, add the element to $diff.

It can be a little faster to combine array_diff( ) with array_map( ):

$diff = array_diff(array_map('strtolower', $old), array_map('strtolower', $new));
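To make the one-line version concrete, here is a sketch that runs it on the $old and $new arrays from the earlier examples. Because array_map( ) preserves the numeric keys and array_diff( ) keeps the keys of its first argument, the surviving element stays under its original index:

```php
<?php
$old = array('To', 'be', 'or', 'not', 'to', 'be');
$new = array('To', 'be', 'or', 'whatever');

// Lowercase both arrays first, so 'to' and 'To' compare as equal.
$diff = array_diff(array_map('strtolower', $old),
                   array_map('strtolower', $new));

// Only 'not' is left, still under key 3.
print_r($diff);
```

Unlike the case-sensitive example, 'to' no longer appears in the result, since it now matches the lowercased 'To' from $new.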

The symmetric difference is what's in $a, but not $b, and what's in $b, but not $a:

$difference = array_merge(array_diff($a, $b), array_diff($b, $a));

Once stated, the algorithm is straightforward. You call array_diff( ) twice and find the two differences. Then you merge them together into one array. There's no need to call array_unique( ), since you've intentionally constructed these arrays to have nothing in common.
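A brief sketch of the symmetric difference on sample arrays ($a and $b are illustrative values, not taken from the recipe). Each array_diff( ) call keeps its own keys, and array_merge( ) then renumbers the combined result from 0:

```php
<?php
// Sample data, assumed for illustration only.
$a = array('apple', 'banana', 'cherry');
$b = array('banana', 'cherry', 'date');

$only_a = array_diff($a, $b);   // 'apple' under key 0
$only_b = array_diff($b, $a);   // 'date' under key 2

// array_merge() renumbers numeric keys, so the result is indexed 0, 1.
$difference = array_merge($only_a, $only_b);
print_r($difference);
```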




Copyright © 2003 O'Reilly & Associates. All rights reserved.