9.3. XML Programming Tools
Now we'll cover software that performs a somewhat inverse role compared to the ground we just covered. Instead of giving you Perl-lazy ways to work with XML documents, it uses XML standards to make things easier for a task that doesn't explicitly involve XML. Recently, some key folk from the perl-xml mailing list have been seeking a mini-platform of universal data handling in Perl with SAX at its core. Some very interesting (and useful) examples have been born from this research, including Ilya Sterin's XML::SAXDriver::Excel and XML::SAXDriver::CSV, and Matt Sergeant's XML::Generator::DBI. All three modules share the ability to take a data format -- Microsoft Excel files, comma-separated value files, and SQL databases, respectively -- and wrap a SAX API around it (the same sort covered in Chapter 5, "SAX"), so that any programmer can merrily pretend that the format is as well behaved and manageable as all the other XML documents they've seen (even if the underlying module is quietly performing acrobatics akin to medicating cats).
We'll look more closely at one of these tools, as its subject matter has some interesting implications involving recent developments, before we move on to this chapter's final section.
XML::Generator::DBI is a fine example of a glue module, a simple piece of software whose only job is to take two existing (but not entirely unrelated) pieces of software and let them talk to one another. In this case, when you construct an object of this class, you hand it two additional objects: a DBI-flavored database handle and a SAX-speaking handler object.
XML::Generator::DBI does not know or care how or where the objects came from, but only trusts that they respond to the standard method calls of their respective families (either DBI, SAX, or SAX2). Then you can call an execute method on the XML::Generator::DBI object with an ordinary SQL statement, much as you would with a DBI-created database handle.
The following example shows this module in action. The SAX handler in question is an instance of Michael Koehne's XML::Handler::YAWriter module, a pleasantly configurable module that turns SAX events into textual output. Using this program, we can turn, say, a SQL table of CDs into well-formed XML and then have it printed to standard output:
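#!/usr/bin/perl

use warnings;
use strict;

use XML::Generator::DBI;
use XML::Handler::YAWriter;
use DBI;

# The SAX handler: writes the events it receives as XML text on standard output
my $ya = XML::Handler::YAWriter->new(AsFile => "-");

# The database handle: an ordinary DBI connection
my $dbh = DBI->connect("dbi:mysql:dbname=test", "jmac", "");

# The glue: hand both objects to the generator
my $generator = XML::Generator::DBI->new(
    Handler => $ya,
    dbh     => $dbh
);

# Run the query; its results reach the handler as SAX events
my $sql = "select * from cds";
$generator->execute($sql);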
The result is this:
<?xml version="1.0" encoding="UTF-8"?><database>
  <select query="select * from cds">
    <row>
      <id>1</id>
      <artist>Donald and the Knuths</artist>
      <title>Greatest Hits Vol. 3.14159</title>
      <genre>Rock</genre>
    </row>
    <row>
      <id>2</id>
      <artist>The Hypnocrats</artist>
      <title>Cybernetic Grandmother Attack</title>
      <genre>Electronic</genre>
    </row>
    <row>
      <id>3</id>
      <artist>The Sam Handwich Quartet</artist>
      <title>Handwich a la Yogurt</title>
      <genre>Jazz</genre>
    </row>
  </select>
</database>
This example isn't very interesting, but it looks good in print. The point is that we didn't have to use YAWriter. We could have used any SAX handler Perl package on our system, including ones we wrote ourselves, and tossed it into the mix when baking a new XML::Generator::DBI object. Given the same database table as the example above used, when the $generator object's execute method is called, it would act as if it had just parsed the previous XML document (modulo the whitespace that YAWriter inserted to make things more human-readable). It would act this way even though the actual source isn't an XML document at all, but a database table.
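To make the idea of a home-grown handler concrete, here is a minimal sketch of one. The TitleCollector package and its internals are invented for illustration; only the event method names and the Name and Data hash keys come from the Perl SAX conventions. So that the sketch runs without a database, a small loop stands in for the generator and fires the events by hand. In a real program you would instead pass the object as the Handler argument to XML::Generator::DBI->new, and you would probably inherit from XML::SAX::Base so that events the class doesn't define are quietly ignored.

```perl
use strict;
use warnings;

# Hypothetical handler: remembers the text of every <title> element it sees.
package TitleCollector;

sub new {
    my $class = shift;
    return bless { titles => [], in_title => 0, buffer => '' }, $class;
}

sub start_element {
    my ($self, $element) = @_;
    if ($element->{Name} eq 'title') {
        $self->{in_title} = 1;
        $self->{buffer}   = '';
    }
}

# Character data may arrive in several chunks, so accumulate it.
sub characters {
    my ($self, $chars) = @_;
    $self->{buffer} .= $chars->{Data} if $self->{in_title};
}

sub end_element {
    my ($self, $element) = @_;
    if ($element->{Name} eq 'title') {
        push @{ $self->{titles} }, $self->{buffer};
        $self->{in_title} = 0;
    }
}

sub titles {
    my $self = shift;
    return @{ $self->{titles} };
}

package main;

my $collector = TitleCollector->new;

# Stand-in for the generator (an assumption made to keep this runnable):
# fire by hand the kind of events a row of the cds table would produce.
for my $title ('Greatest Hits Vol. 3.14159', 'Handwich a la Yogurt') {
    $collector->start_element({ Name => 'row' });
    $collector->start_element({ Name => 'title' });
    $collector->characters({ Data => $title });
    $collector->end_element({ Name => 'title' });
    $collector->end_element({ Name => 'row' });
}

print "$_\n" for $collector->titles;
```

The handler never learns where its events came from, which is exactly why the same class could sit behind a parser, a database generator, or a hand-written loop like this one.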
9.3.2. Further Ruminations on DBI and SAX
The main reason the Perl DBI earned its position as the preeminent Perl database interface is its architecture. When installing DBI, one must obtain two separate pieces: DBI.pm contains all the code behind the DBI API and its documentation, but it alone won't let you drive a database with Perl; you also need at least one DBD module suited to the type of database you plan to use. CPAN has many of these modules to choose from, including DBD::mysql, DBD::Oracle, and DBD::Pg (for PostgreSQL). While the programmer interacts only with the DBI module, feeding it SQL queries and receiving results from it, the appropriate DBD module communicates directly with the actual database. The DBD module turns the abstract DBI methods into highly specific and platform-dependent database commands. It does this far underneath the level at which the DBI user works, so that any Perl program using DBI will work on any database for which somebody has made available a DBD driver.
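The division of labor is easiest to see in the connection string, which is the only place a driver is named. The sketch below assumes that both DBI and the DBD::SQLite driver are installed, and it uses an in-memory SQLite database purely so that it can run without a server. The cds table and its rows are borrowed from the earlier example.

```perl
use strict;
use warnings;
use DBI;

# Only the DSN names a driver. Swapping in "dbi:Pg:dbname=test" or
# "dbi:mysql:dbname=test" (plus credentials) should leave the rest
# untouched, provided the SQL itself is portable.
my $dsn = "dbi:SQLite:dbname=:memory:";
my $dbh = DBI->connect($dsn, "", "", { RaiseError => 1 });

$dbh->do("create table cds (id integer, artist text, title text)");

# Placeholders keep quoting rules out of our hands; the driver handles them.
my $insert = $dbh->prepare(
    "insert into cds (id, artist, title) values (?, ?, ?)"
);
$insert->execute(1, "Donald and the Knuths", "Greatest Hits Vol. 3.14159");
$insert->execute(2, "The Hypnocrats", "Cybernetic Grandmother Attack");

my $rows = $dbh->selectall_arrayref(
    "select artist, title from cds order by id"
);
print "$_->[0]: $_->[1]\n" for @$rows;

$dbh->disconnect;
```

Everything after the connect call is plain DBI; none of it mentions SQLite by name.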
A similar movement is on the ascent in the Perl and XML world. It started in 2001 with the SAX drivers mentioned at the start of this section and has led to the XML::SAX module, a SAX2 implementation that works like DBI. Tell it you want a SAX parser, optionally specifying the SAX features your program's gotta have, and it roots around on your system to find the best tool for the job, which it instantiates and hands back to you. Then you plug in the SAX handler package of your choice (much as with XML::Generator::DBI) and go to town.
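In code, the request looks roughly like the following sketch, which assumes that XML::SAX is installed (it ships with a pure-Perl parser, so some parser should be found). The RowCounter handler is invented for illustration; it inherits from XML::SAX::Base so that it only has to define the one event it cares about.

```perl
use strict;
use warnings;

# Hypothetical handler: counts <row> elements and ignores everything else.
package RowCounter;
use base 'XML::SAX::Base';

sub start_element {
    my ($self, $element) = @_;
    $self->{rows}++ if $element->{Name} eq 'row';
}

package main;
use XML::SAX::ParserFactory;

my $handler = RowCounter->new;

# Ask for any installed parser that supports the namespaces feature.
my $factory = XML::SAX::ParserFactory->new;
$factory->require_feature('http://xml.org/sax/features/namespaces');
my $parser = $factory->parser(Handler => $handler);

$parser->parse_string('<database><row/><row/><row/></database>');

# Which parser class was chosen depends on what is installed locally.
print ref($parser), " saw $handler->{rows} rows\n";
```

The program never names a parser class; the factory decides, much as DBI picks a DBD driver from the connection string.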
Instead of a variety of DBD drivers that let you use a standard interface to pull data from a variety of databases, PerlSAX drivers let you use a standard interface to pull data from any imaginable data source. As with DBI, it requires only one intrepid hacker to wade through the data format in question, and suddenly other Perl programmers with a clue about SAX hacking can find themselves using a standard API to handle this once-alien format.
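As a rough illustration of what such an intrepid hacker produces, here is a toy driver for a comma-separated format, written with core Perl only. It is a sketch, not a description of how XML::SAXDriver::CSV is actually implemented: the LineDriver and EventLogger packages, the element names, and the naive split on commas (with no support for quoted fields) are all assumptions made for brevity.

```perl
use strict;
use warnings;

# Hypothetical driver: turns "a,b" lines into SAX events on a handler.
package LineDriver;

sub new {
    my ($class, %args) = @_;
    my $self = { Handler => $args{Handler}, columns => $args{columns} };
    return bless $self, $class;
}

sub parse_lines {
    my ($self, @lines) = @_;
    my $handler = $self->{Handler};

    $handler->start_document({});
    $handler->start_element({ Name => 'records' });
    for my $line (@lines) {
        my @fields = split /,/, $line;    # naive: no quoted commas
        $handler->start_element({ Name => 'record' });
        for my $i (0 .. $#fields) {
            my $name = $self->{columns}[$i];
            $handler->start_element({ Name => $name });
            $handler->characters({ Data => $fields[$i] });
            $handler->end_element({ Name => $name });
        }
        $handler->end_element({ Name => 'record' });
    }
    $handler->end_element({ Name => 'records' });
    $handler->end_document({});
}

# Hypothetical handler: records each event as a scrap of markup.
package EventLogger;

sub new {
    my $class = shift;
    return bless { log => [] }, $class;
}

sub start_document { }
sub end_document   { }
sub start_element  { my ($s, $e) = @_; push @{ $s->{log} }, "<$e->{Name}>" }
sub end_element    { my ($s, $e) = @_; push @{ $s->{log} }, "</$e->{Name}>" }
sub characters     { my ($s, $c) = @_; push @{ $s->{log} }, $c->{Data} }

package main;

my $logger = EventLogger->new;
my $driver = LineDriver->new(
    Handler => $logger,
    columns => [qw(artist title)],
);
$driver->parse_lines('The Hypnocrats,Cybernetic Grandmother Attack');

print join('', @{ $logger->{log} }), "\n";
```

Once the driver exists, nothing downstream needs to know that commas were ever involved; any SAX handler can be dropped in where EventLogger sits.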
Copyright © 2002 O'Reilly & Associates. All rights reserved.