22.3. Parsing XML into SAX Events

22.3.1. Problem

You want to receive Simple API for XML (SAX) events from an XML parser because event-based parsing is faster and uses less memory than building a DOM tree.

22.3.2. Solution

Use the XML::SAX module from CPAN:

    use XML::SAX::ParserFactory;
    use MyHandler;

    my $handler = MyHandler->new( );
    my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);
    $parser->parse_uri($FILENAME);
    # or
    $parser->parse_string($XML);

Logic for handling events goes into the handler class (MyHandler in this example), which you write yourself.
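A minimal sketch of such a handler might look like the following. It is not the recipe's own listing. To stay self-contained it keeps the MyHandler package in the same file as the stub instead of a separate MyHandler.pm, and the element_count bookkeeping and accessor are hypothetical names chosen for illustration. Only the XML::SAX::Base parent class and the start_element callback come from the library.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Handler class. In the Solution this would live in MyHandler.pm.
package MyHandler;
use base qw(XML::SAX::Base);

# The parser calls this once for every opening tag.
# $data is the element hash described in the Discussion.
sub start_element {
    my ($self, $data) = @_;
    $self->{element_count}++;    # hypothetical bookkeeping
}

# Hypothetical accessor, not part of XML::SAX::Base.
sub element_count { return $_[0]{element_count} || 0 }

# Stub program that connects a parser to the handler.
package main;
use XML::SAX::ParserFactory;

my $handler = MyHandler->new;
my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);
$parser->parse_string('<a><b/><b/></a>');

print "Saw ", $handler->element_count, " start tags\n";
```

Events that the class does not override fall through to the defaults inherited from XML::SAX::Base, so a handler only needs methods for the events it cares about.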
22.3.3. Discussion

An XML processor that uses SAX has three parts: the XML parser that generates SAX events, the handler that reacts to them, and the stub that connects the two. The XML parser can be XML::Parser, XML::LibXML, or the pure Perl XML::SAX::PurePerl that comes with XML::SAX. The XML::SAX::ParserFactory module selects a parser for you and connects it to your handler. Your handler takes the form of a class that inherits from XML::SAX::Base. The stub is the program shown in the Solution.

The XML::SAX::Base module provides default implementations of the methods that the XML parser calls on your handler, so your class overrides only the events it cares about. Those methods are listed in Table 22-2, and are the methods defined by the SAX1 and SAX2 standards at http://www.saxproject.org/. The Perl implementation uses more Perl-ish data structures and is described in the XML::SAX::Intro manpage.

Table 22-2. XML::SAX::Base methods
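The parser selection described above can be observed, and overridden, from the stub. The sketch below first asks the factory for its default choice, then pins XML::SAX::PurePerl by setting the $XML::SAX::ParserPackage variable that XML::SAX::ParserFactory consults. A bare XML::SAX::Base object stands in for a real handler, and which parser the factory picks by default depends on what is installed, so no particular output is assumed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::SAX::ParserFactory;
use XML::SAX::Base;

# A bare XML::SAX::Base object is enough to stand in for a handler here.
my $handler = XML::SAX::Base->new;

# Let the factory choose among the installed SAX parsers.
my $default = XML::SAX::ParserFactory->parser(Handler => $handler);
print "Factory chose: ", ref($default), "\n";

# Pin the pure-Perl parser, but only inside this block.
my $pure;
{
    local $XML::SAX::ParserPackage = 'XML::SAX::PurePerl';
    $pure = XML::SAX::ParserFactory->parser(Handler => $handler);
}
print "Forced parser: ", ref($pure), "\n";
```

Using local keeps the override from leaking into other code that relies on the factory's normal choice.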
The two data structures you need most often are those representing elements and attributes. The $data parameter to start_element and end_element is a hash reference. The keys of the hash are given in Table 22-3.

Table 22-3. An XML::SAX element hash
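To make the shape of the element hash concrete, here is a hand-built stand-in for what a handler might receive. The keys (Name, LocalName, Prefix, NamespaceURI, Attributes) are those of the Perl SAX 2 binding described in XML::SAX::Intro. The mail:message element, its namespace URI, and the describe_element helper are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for the hash reference passed to start_element.
# The element and namespace are hypothetical examples.
my $data = {
    Name         => 'mail:message',                       # qualified name as written
    LocalName    => 'message',                            # name without the prefix
    Prefix       => 'mail',                               # namespace prefix, '' if none
    NamespaceURI => 'http://example.com/dtds/mailspec/',  # URI bound to the prefix
    Attributes   => {},                                   # attribute hash, empty here
};

# Hypothetical helper that summarizes an element hash.
sub describe_element {
    my ($el) = @_;
    return sprintf '%s (local name %s, namespace %s, %d attribute(s))',
        $el->{Name}, $el->{LocalName},
        $el->{NamespaceURI} || 'none',
        scalar keys %{ $el->{Attributes} };
}

print describe_element($data), "\n";
```

Inside a real handler the same lookups apply directly to the $data argument of start_element and end_element.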
An attribute hash has a key for each attribute. The key is structured as "{namespaceURI}attrname". For example, if the current namespace URI is http://example.com/dtds/mailspec/ and the attribute is msgid, the key in the attribute hash is:

    {http://example.com/dtds/mailspec/}msgid
The value stored under each key is itself a hash; its keys are given in Table 22-4.

Table 22-4. An XML::SAX attribute hash
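The lookup this implies can be sketched as follows. The per-attribute keys (Name, LocalName, Prefix, NamespaceURI, Value) follow the Perl SAX 2 binding. The attr_value helper, the mail prefix, and the sample value abc123 are assumptions made for the example. An attribute that is in no namespace is keyed with empty braces, as in "{}id".

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $ns = 'http://example.com/dtds/mailspec/';

# Stand-in for $data->{Attributes} as a handler might see it.
my $attrs = {
    '{' . $ns . '}msgid' => {
        Name         => 'mail:msgid',   # qualified name as written
        LocalName    => 'msgid',
        Prefix       => 'mail',
        NamespaceURI => $ns,
        Value        => 'abc123',       # hypothetical attribute value
    },
};

# Hypothetical helper: build the "{namespaceURI}attrname" key and
# return the attribute's value, or undef if it is absent.
sub attr_value {
    my ($attributes, $uri, $local) = @_;
    my $attr = $attributes->{ '{' . $uri . '}' . $local };
    return defined $attr ? $attr->{Value} : undef;
}

print attr_value($attrs, $ns, 'msgid'), "\n";
```

Building the key in one helper keeps the brace syntax out of the rest of the handler.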
Example 22-4 shows how to list the book titles using SAX events. It's more complex than the DOM solution because with SAX we must keep track of where we are in the XML document.

Example 22-4. sax-titledumper
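The sketch below illustrates the kind of state tracking involved. It is not the original sax-titledumper listing. It assumes a hypothetical document in which each book element contains a title element, and the TitleDumper class, its in_title flag, and its titles accessor are names invented for this example.

```perl
#!/usr/bin/perl -w
use strict;

package TitleDumper;    # hypothetical handler class
use base qw(XML::SAX::Base);

# Entering a <title>: start collecting text.
sub start_element {
    my ($self, $data) = @_;
    if ($data->{LocalName} eq 'title') {
        $self->{in_title} = 1;
        $self->{title}    = '';
    }
}

# Text may arrive in several chunks, so append rather than assign.
sub characters {
    my ($self, $data) = @_;
    $self->{title} .= $data->{Data} if $self->{in_title};
}

# Leaving a <title>: store the collected text.
sub end_element {
    my ($self, $data) = @_;
    if ($data->{LocalName} eq 'title') {
        push @{ $self->{titles} }, $self->{title};
        $self->{in_title} = 0;
    }
}

sub titles { return @{ $_[0]{titles} || [] } }

package main;
use XML::SAX::ParserFactory;

# Sample input with an assumed layout.
my $xml = <<'XML';
<books>
  <book id="1"><title>Programming Perl</title></book>
  <book id="2"><title>Perl &amp; XML</title></book>
</books>
XML

my $handler = TitleDumper->new;
XML::SAX::ParserFactory->parser(Handler => $handler)->parse_string($xml);
print "$_\n" for $handler->titles;
```

The in_title flag is the "where we are" state: characters events carry no element name, so the handler has to remember which element it is currently inside.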
The XML::SAX::Intro manpage provides a gentle introduction to XML::SAX parsing.

22.3.4. See Also

Chapter 5 of Perl & XML; the documentation for the CPAN modules XML::SAX, XML::SAX::Base, and XML::SAX::Intro
Copyright © 2003 O'Reilly & Associates. All rights reserved.