home | O'Reilly's CD bookshelfs | FreeBSD | Linux | Cisco | Cisco Exam  


Perl CookbookPerl CookbookSearch this book

22.4. Making Simple Changes to Elements or Text

22.4.3. Discussion

A SAX filter accepts SAX events and triggers new ones. The XML::SAX::Base module detects whether your handler object is called as a filter. If so, the XML::SAX::Base methods pass the SAX events onto the next filter in the chain. If your handler object is not called as a filter, then the XML::SAX::Base methods consume events but do not emit them. This makes it almost as simple to write events as it is to consume them.

The XML::SAX::Machines module chains the filters for you. Import its Pipeline function, then say:

my $machine = Pipeline(Filter1 => Filter2 => Filter3 => Filter4);
$machine->parse_uri($FILENAME);

SAX events triggered by parsing the XML file go to Filter1, which sends possibly different events to Filter2, which in turn sends events to Filter3, and so on to Filter4. The last filter should print or otherwise do something with the incoming SAX events. If you pass a reference to a typeglob, XML::SAX::Machines writes the XML to the filehandle in that typeglob.

Example 22-5 shows a filter that turns the id attribute in book elements from the XML document in Example 22-1 into a new id element. For example, <book id="1"> becomes <book><id>1</id>.

Example 22-5. filters-rewriteids

package RewriteIDs;
# RewriteIDs.pm -- turns "id" attributes into elements

use base qw(XML::SAX::Base);

my $ID_ATTRIB = "{  }id";   # the attribute hash entry we're interested in

sub start_element {
    my ($self, $data) = @_;

    if ($data->{Name} eq 'book') {
        my $id = $data->{Attributes}{$ID_ATTRIB}{Value};
        delete $data->{Attributes}{$ID_ATTRIB};
        $self->SUPER::start_element($data);

        # make new element parameter data structure for the <id> tag
        my $id_node = {  };
        %$id_node = %$self;
        $id_node->{Name} = 'id';     # more complex if namespaces involved
        $id_node->{Attributes} = {  };

        # build the <id>$id</id>
        $self->SUPER::start_element($id_node);
        $self->SUPER::characters({ Data => $id });
        $self->SUPER::end_element($id_node);
    } else {
        $self->SUPER::start_element($data);
    }
}

1;

Example 22-6 is the stub that uses XML::SAX::Machines to create the pipeline for processing books.xml and print the altered XML.

Example 22-6. filters-rewriteprog

#!/usr/bin/perl -w
# rewrite-ids -- call RewriteIDs SAX filter to turn id attrs into elements

use RewriteIDs;
use XML::SAX::Machines qw(:all);

my $machine = Pipeline(RewriteIDs => *STDOUT);
$machine->parse_uri("books.xml");

The output of Example 22-6 is as follows (truncated for brevity):

<book><id>1</id>
    <title>Programming Perl</title>
 ...
<book><id>2</id>
    <title>Perl &amp; LWP</title>
 ...

To save the XML to the file new-books.xml, use the XML::SAX::Writer module:

#!/usr/bin/perl -w

use RewriteIDs;
use XML::SAX::Machines qw(:all);
use XML::SAX::Writer;

my $writer = XML::SAX::Writer->new(Output => "new-books.xml");
my $machine = Pipeline(RewriteIDs => $writer);
$machine->parse_uri("books.xml");

You can also pass a scalar reference as the Output parameter to have the XML appended to the scalar; as an array reference to have the XML appended to the array, one array element per SAX event; or as a filehandle to have the XML printed to that filehandle.



Library Navigation Links

Copyright © 2003 O'Reilly & Associates. All rights reserved.