Filters and Writers (Java & XML, 2nd Edition)

4.3. Filters and Writers

At this point, I want to diverge from the beaten path. So far, I've detailed everything that's in a "standard" SAX application, from the reader to the callbacks to the handlers. However, there are a lot of additional features in SAX that can really turn you into a power developer, and take you beyond the confines of "standard" SAX. In this section, I'll introduce you to two of these: SAX filters and writers. Using classes both in the standard SAX distribution and available separately from the SAX web site (http://www.megginson.com/SAX), you can add some fairly advanced behavior to your SAX applications. This will also get you in the mindset of using SAX as a pipeline of events, rather than a single layer of processing. I'll explain this concept in more detail, but suffice it to say that it really is the key to writing efficient and modular SAX code.

4.3.1. XMLFilters

Example 4-5. NamespaceFilter class

    package javaxml2;

    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.XMLReader;
    import org.xml.sax.helpers.XMLFilterImpl;

    public class NamespaceFilter extends XMLFilterImpl {

        /** The old URI, to replace */
        private String oldURI;

        /** The new URI, to replace the old URI with */
        private String newURI;

        public NamespaceFilter(XMLReader reader, 
                               String oldURI, String newURI) {
            super(reader);
            this.oldURI = oldURI;
            this.newURI = newURI;
        }

        public void startPrefixMapping(String prefix, String uri)
            throws SAXException {

            // Change URI, if needed
            if (uri.equals(oldURI)) {
                super.startPrefixMapping(prefix, newURI);
            } else {
                super.startPrefixMapping(prefix, uri);
            }
        }

        public void startElement(String uri, String localName,
                                 String qName, Attributes attributes)
            throws SAXException {

            // Change URI, if needed
            if (uri.equals(oldURI)) {
                super.startElement(newURI, localName, qName, attributes);
            } else {
                super.startElement(uri, localName, qName, attributes);
            }
        }

        public void endElement(String uri, String localName, String qName)
            throws SAXException {

            // Change URI, if needed
            if (uri.equals(oldURI)) {
                super.endElement(newURI, localName, qName);
            } else {
                super.endElement(uri, localName, qName);
            }
        }
    }

    public void buildTree(DefaultTreeModel treeModel, 
                          DefaultMutableTreeNode base, String xmlURI) 
        throws IOException, SAXException {

        // Create instances needed for parsing
        XMLReader reader = 
            XMLReaderFactory.createXMLReader(vendorParserClass);
        NamespaceFilter filter = 
            new NamespaceFilter(reader, 
                "http://www.oreilly.com/javaxml2",
                "http://www.oreilly.com/catalog/javaxml2");
        ContentHandler jTreeContentHandler = 
            new JTreeContentHandler(treeModel, base, reader);
        ErrorHandler jTreeErrorHandler = new JTreeErrorHandler( );

        // Register content handler
        filter.setContentHandler(jTreeContentHandler);

        // Register error handler
        filter.setErrorHandler(jTreeErrorHandler);

        // Register entity resolver
        filter.setEntityResolver(new SimpleEntityResolver( ));

        // Parse
        InputSource inputSource = 
            new InputSource(xmlURI);
        filter.parse(inputSource);
    }

Figure 4-2. SAXTreeViewer on contents.xml with NamespaceFilter in place

        XMLReader reader = 
            XMLReaderFactory.createXMLReader(vendorParserClass);
        NamespaceFilter xslFilter = 
            new NamespaceFilter(reader, 
                "http://www.w3.org/TR/XSL",
                "http://www.w3.org/1999/XSL/Transform");
        NamespaceFilter xsdFilter = 
            new NamespaceFilter(xslFilter, 
                "http://www.w3.org/TR/XMLSchema",
                "http://www.w3.org/2001/XMLSchema");
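To see a chain like this one actually run, here is a minimal sketch that uses only classes shipped with the JDK. Several things in it are my assumptions rather than part of the example code: UriSwapFilter is a cut-down, hypothetical stand-in for NamespaceFilter (it rewrites element URIs only), the parser comes from SAXParserFactory instead of a vendor class name, and the document is a made-up inline string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical stand-in for NamespaceFilter: rewrites element URIs only. */
class UriSwapFilter extends XMLFilterImpl {
    private final String oldURI;
    private final String newURI;

    UriSwapFilter(XMLReader parent, String oldURI, String newURI) {
        super(parent);
        this.oldURI = oldURI;
        this.newURI = newURI;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        // Pass the event along, swapping the URI if it matches
        super.startElement(oldURI.equals(uri) ? newURI : uri,
                           localName, qName, atts);
    }
}

class FilterChainDemo {

    /** Parses xml through two chained filters; returns "localName -> URI" per element. */
    static List<String> elementUris(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();

        // Same shape as the snippet above: each filter's parent is the previous link
        UriSwapFilter xslFilter = new UriSwapFilter(reader,
            "http://www.w3.org/TR/XSL",
            "http://www.w3.org/1999/XSL/Transform");
        UriSwapFilter xsdFilter = new UriSwapFilter(xslFilter,
            "http://www.w3.org/TR/XMLSchema",
            "http://www.w3.org/2001/XMLSchema");

        // The handler is registered on the outermost filter, which is also parsed
        final List<String> seen = new ArrayList<String>();
        xsdFilter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                seen.add(localName + " -> " + uri);
            }
        });
        xsdFilter.parse(new InputSource(new StringReader(xml)));
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<doc>"
          + "<a:template xmlns:a='http://www.w3.org/TR/XSL'/>"
          + "<b:schema xmlns:b='http://www.w3.org/TR/XMLSchema'/>"
          + "</doc>";
        for (String line : elementUris(xml)) {
            System.out.println(line);
        }
    }
}
```

By the time events reach the handler, both old URIs have been replaced: the template element reports http://www.w3.org/1999/XSL/Transform and the schema element reports http://www.w3.org/2001/XMLSchema, while doc, which has no namespace, passes through untouched.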

4.3.2. XMLWriter

Now that you understand how filters work in SAX, I want to introduce you to a specific filter, XMLWriter. This class, as well as a subclass of it, DataWriter, can be downloaded from David Megginson's SAX site at http://www.megginson.com/SAX. XMLWriter extends XMLFilterImpl, and DataWriter extends XMLWriter. Both of these filter classes are used to output XML, which may seem a bit at odds with what you've learned so far about SAX. However, just as you could insert statements that output to Java Writers in SAX callbacks, so can these classes. I'm not going to spend a lot of time on XMLWriter, because it's not really the way you want to be outputting XML in the general sense; it's much better to use DOM, JDOM, or another XML API if you want mutability. However, the XMLWriter class offers a valuable way to inspect what's going on in a SAX pipeline. By inserting it between other filters and readers in your pipeline, it can be used to output a snapshot of your data at whatever point it resides in your processing chain.

For example, in the case where I'm changing namespace URIs, it might be that you want to actually store the XML document with the new namespace URI (be it a modified O'Reilly URI, an updated XSL one, or the XML Schema one) for later use. This becomes a piece of cake by using the XMLWriter class.

Since you've already got SAXTreeViewer using the NamespaceFilter, I'll use that as an example. First, add import statements for java.io.FileWriter (for output) and the com.megginson.sax.XMLWriter class. Once that's in place, you'll need to add an instance of XMLWriter to the filter chain, positioned so that output occurs after namespaces have been changed but before the visual events occur. Change your code as shown here:

    public void buildTree(DefaultTreeModel treeModel, 
                          DefaultMutableTreeNode base, String xmlURI) 
        throws IOException, SAXException {

        // Create instances needed for parsing
        XMLReader reader = 
            XMLReaderFactory.createXMLReader(vendorParserClass);
        // The namespace filter sits closest to the parser...
        NamespaceFilter filter = 
            new NamespaceFilter(reader, 
                "http://www.oreilly.com/javaxml2",
                "http://www.oreilly.com/catalog/javaxml2");
        // ...and the writer wraps the filter, so it sees the changed URIs
        XMLWriter writer =
            new XMLWriter(filter, new FileWriter("snapshot.xml"));
        ContentHandler jTreeContentHandler = 
            new JTreeContentHandler(treeModel, base, reader);
        ErrorHandler jTreeErrorHandler = new JTreeErrorHandler( );

        // Register content handler
        writer.setContentHandler(jTreeContentHandler);

        // Register error handler
        writer.setErrorHandler(jTreeErrorHandler);
            
        // Register entity resolver
        writer.setEntityResolver(new SimpleEntityResolver( ));

        // Parse
        InputSource inputSource = 
            new InputSource(xmlURI);
        writer.parse(inputSource);
    }

Be sure you set the parent of the XMLWriter instance to be the NamespaceFilter, not the XMLReader, and that you register the handlers and call parse( ) on the XMLWriter, which is now the outermost link in the chain. Events flow from the parser outward through each filter in turn, so a writer whose parent is the XMLReader would record the original, unchanged namespace URI, and a writer that is left out of the chain produces no output at all. Once you've got these changes compiled in, run the example. You should get a snapshot.xml file created in the directory you're running the example from; an excerpt from that document is shown here:

<?xml version="1.0" standalone="yes"?>

<book xmlns="http://www.oreilly.com/catalog/javaxml2">
  <title ora:series="Java" 
         xmlns:ora="http://www.oreilly.com">Java and XML</title>
  

  <contents>
    <chapter title="Introduction" number="1">
      <topic name="XML Matters"></topic>
      <topic name="What's Important"></topic>
      <topic name="The Essentials"></topic>
      <topic name="What's Next?"></topic>
    </chapter>
    <chapter title="Nuts and Bolts" number="2">
      <topic name="The Basics"></topic>
      <topic name="Constraints"></topic>
      <topic name="Transformations"></topic>
      <topic name="And More..."></topic>
      <topic name="What's Next?"></topic>
    </chapter>
    <!-- Other content... -->

  </contents>
</book>

Notice that the namespace, as changed by NamespaceFilter, is modified here. Snapshots like this, created by XMLWriter instances, can be great tools for debugging and logging of SAX events.
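Where a snapshot filter sits in the chain determines what it records, because each filter sees an event only after the filters between it and the parser have handled it. The sketch below illustrates this with JDK classes alone, so it does not use XMLWriter itself: TapFilter is a hypothetical stand-in that records element URIs instead of writing XML, SwapFilter is a reduced stand-in for NamespaceFilter, and the one-element document is invented for the demonstration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical "tap": records each element's URI, then forwards the event. */
class TapFilter extends XMLFilterImpl {
    final List<String> seen = new ArrayList<String>();

    TapFilter(XMLReader parent) {
        super(parent);
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        seen.add(localName + " {" + uri + "}");
        super.startElement(uri, localName, qName, atts);
    }
}

/** Reduced stand-in for NamespaceFilter: rewrites element URIs only. */
class SwapFilter extends XMLFilterImpl {
    private final String oldURI;
    private final String newURI;

    SwapFilter(XMLReader parent, String oldURI, String newURI) {
        super(parent);
        this.oldURI = oldURI;
        this.newURI = newURI;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        super.startElement(oldURI.equals(uri) ? newURI : uri,
                           localName, qName, atts);
    }
}

class TapPositionDemo {
    static final String OLD_URI = "http://www.oreilly.com/javaxml2";
    static final String NEW_URI = "http://www.oreilly.com/catalog/javaxml2";

    /** Returns what the inner tap and the outer tap recorded for the root element. */
    static String[] rootUris(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();

        // Event flow: parser -> inner tap -> swap -> outer tap
        TapFilter inner = new TapFilter(reader);
        SwapFilter swap = new SwapFilter(inner, OLD_URI, NEW_URI);
        TapFilter outer = new TapFilter(swap);

        // Always parse through the outermost filter
        outer.parse(new InputSource(new StringReader(xml)));
        return new String[] { inner.seen.get(0), outer.seen.get(0) };
    }

    public static void main(String[] args) throws Exception {
        String[] uris = rootUris("<book xmlns='" + OLD_URI + "'/>");
        System.out.println("inner tap: " + uris[0]);
        System.out.println("outer tap: " + uris[1]);
    }
}
```

The inner tap, whose parent is the parser, records the old URI; the outer tap, whose parent is the swapping filter, records the new one. A snapshot filter follows the same rule: to capture the result of a change, its parent must be the filter that makes the change.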

Both XMLWriter and DataWriter offer a lot more in terms of methods to output XML, both in full and in part, and you should check out the Javadoc included with the downloaded package. I do not encourage you to use these classes for general output. In my experience, they are most useful in the case demonstrated here.
