14.5. Sorting Large DBM Files14.5.1. ProblemYou want to process a large dataset you'd like to commit to a DBM file in a particular order. 14.5.2. SolutionUse the DB_File's B-tree bindings and supply a comparison function of your own devising:
14.5.3. DescriptionAn annoyance of hashes, whether in memory or as DBM files, is that they do not maintain proper ordering. The CPAN module Tie::IxHash can make a regular hash in memory maintain its insertion order, but that doesn't help for DBM databases or arbitrary sorting criteria. The DB_File module supports a nice solution to this using a B-tree implementation. One advantage of a B-tree over a regular DBM hash is its ordering. When the user defines a comparison function, all calls to keys, values, and each are automatically ordered. For example, Example 14-3 is a program that maintains a hash whose keys will always be sorted case-insensitively. Example 14-3. sortdemo
By default, the entries in a B-tree DB_File database are stored alphabetically. Here, though, we provide a case-insensitive comparison function, so using each to fetch all the keys would show: by 6 camp 4 Can't 1 down 5 Gibraltar 7 go 3 you 2 This sorting property on hashes is so convenient that it's worth using even without a permanent database. If you pass undef where the filename is expected on the tie, DB_File will create a file in /tmp and then immediately unlink it, giving an anonymous database:
Remember these two things if you supply a comparison for your BTREE database. One, the new compare function must be specified when you create the database. Two, you cannot change the ordering once the database has been created; you must use the same compare function every time you access the database. Using BTREE databases under DB_File also permits duplicate or partial keys. See its documentation for examples. 14.5.4. See Also
Copyright © 2003 O'Reilly & Associates. All rights reserved. |
|