Chapter 5. Other SAX Classes

The preceding chapters have addressed all of the most important SAX2 classes and interfaces. You may need to use a handful of other classes, including simple implementations of a few more interfaces and SAX1 support. This chapter briefly presents those remaining classes and interfaces. Your parser distribution should have SAX2 support, with complete javadoc for these classes. Consult that documentation if you need more information than found in this book. The API summary in Appendix A, "SAX2 API Summary" should also be helpful.

5.1. Helper Classes

There are several classes in the org.xml.sax.helpers package that you will probably find useful from time to time.

5.1.1. The AttributesImpl Class

This is a general-purpose implementation of the SAX2 Attributes interface. As well as reading attribute information (as defined in the interface), you can write and modify it. This class is quite handy when your application code is producing SAX2 events, perhaps because it is converting data structures to a SAX event stream.

Remember the attributes provided to the ContentHandler.startElement() event callback are only valid for the duration of that call. If you need a copy of those attributes for later use, it's simplest to use this class; just create a new instance using the copy constructor. That copy constructor is one of the most widely used APIs in this class, other than the Attributes methods.

It's often handy to keep a stack around to track the currently open elements and attributes. If you support xml:base, you'll also want to track base URIs for the document and for any external parsed entities. This is easy to implement using another key method provided by this class, addAttribute(). Example 5-1 shows how to maintain such a stack with xml:base support. It shows full support for XML namespaces, unlike Example 2-2, which is simple and attribute-free (shown in Chapter 2, "Introducing SAX2" in Section 2.3, "Basic ContentHandler Events").
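Before the full listing, a smaller sketch may help isolate the two AttributesImpl operations just mentioned: the copy constructor and addAttribute(). The class name AttributesCopyDemo, its method names, and the example.com URI are illustrative assumptions, not part of SAX.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical demo class; only AttributesImpl and Attributes come from SAX2.
class AttributesCopyDemo {
    // Snapshot attributes so they stay valid after startElement() returns.
    static Attributes snapshot(Attributes live) {
        return new AttributesImpl(live);   // copy constructor
    }

    // Build an attribute list from scratch, as an event producer would.
    static AttributesImpl withBase(String baseUri) {
        AttributesImpl atts = new AttributesImpl();
        atts.addAttribute("http://www.w3.org/XML/1998/namespace",
                "base", "xml:base", "CDATA", baseUri);
        return atts;
    }

    public static void main(String[] args) {
        AttributesImpl original = withBase("http://example.com/doc.xml");
        Attributes copy = snapshot(original);
        original.clear();   // simulate a parser reusing its attribute object
        // The copy still holds the value even though the original is empty.
        System.out.println(copy.getValue("xml:base"));
    }
}
```

The call to clear() stands in for what a parser is allowed to do once the callback returns; the snapshot is unaffected because the copy constructor duplicates the data rather than wrapping the original object.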
Example 5-1. Maintaining an element and attribute stack

import java.io.IOException;
import java.net.URL;
import java.util.Hashtable;

import org.xml.sax.*;
import org.xml.sax.ext.*;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

public class XStack extends DefaultHandler
    implements LexicalHandler, DeclHandler
{
    static class StackEntry
    {
        final String nsURI, localName;
        final String qName;
        final Attributes atts;
        final StackEntry parent;

        StackEntry (
            String namespace, String local,
            String name,
            Attributes attrs,
            StackEntry next
        ) {
            this.nsURI = namespace;
            this.localName = local;
            this.qName = name;
            // copy: the parser's attributes are only valid during the callback
            this.atts = new AttributesImpl (attrs);
            this.parent = next;
        }
    }

    private Locator locator;
    private StackEntry current;
    private Hashtable extEntities = new Hashtable ();

    private static final String xmlNamespace
        = "http://www.w3.org/XML/1998/namespace";

    private void addMarker (String label, String uri)
    throws SAXException
    {
        AttributesImpl atts = new AttributesImpl ();

        if (locator != null && locator.getSystemId () != null)
            uri = locator.getSystemId ();

        // guard against InputSource objects without system IDs
        if (uri == null)
            throw new SAXParseException ("Entity URI is unknown", locator);

        // guard against illegal relative URIs (Xerces)
        try {
            new URL (uri);
        } catch (IOException e) {
            throw new SAXParseException ("parser bug: relative URI", locator);
        }

        atts.addAttribute (xmlNamespace, "base", "xml:base", "CDATA", uri);
        current = new StackEntry ("", "", label, atts, current);
    }

    // walk up stack to get values for xml:space, xml:lang, and so on
    public String getInheritedAttribute (String uri, String name)
    {
        String retval = null;
        boolean useNS = (uri != null && uri.length () != 0);

        for (StackEntry here = current;
                retval == null && here != null;
                here = here.parent) {
            if (useNS)
                retval = here.atts.getValue (uri, name);
            else
                retval = here.atts.getValue (name);
        }
        return retval;
    }

    // knows about XML Base recommendation, and xml:base attributes
    // can be used in callbacks for elements, PIs, comments,
    // characters, ignorable whitespace, and so on.
    public URL getBaseURI ()
    throws IOException
    {
        return getBaseURI (current);
    }

    private URL getBaseURI (StackEntry here)
    throws IOException
    {
        String uri = null;

        while (uri == null && here != null) {
            uri = here.atts.getValue (xmlNamespace, "base");
            if (uri != null)
                break;
            here = here.parent;
        }

        // nothing on the stack (called outside a document)?
        if (here == null)
            throw new IOException ("no base URI is known");

        // marker for document or entity boundary?  absolute.
        if (here.qName.charAt (0) == '#')
            return new URL (uri);

        // else it might be a relative uri.
        int offset = uri.indexOf (":/");

        if (offset == -1 || uri.indexOf (':') < offset)
            return new URL (getBaseURI (here.parent), uri);
        else
            return new URL (uri);
    }

    // from ContentHandler interface
    public void startElement (
        String namespace, String local,
        String name,
        Attributes attrs
    ) throws SAXException
    {
        current = new StackEntry (namespace, local, name, attrs, current);
    }

    public void endElement (String namespace, String local, String name)
    throws SAXException
    {
        current = current.parent;
    }

    public void setDocumentLocator (Locator l)
        { locator = l; }

    public void startDocument ()
    throws SAXException
        { addMarker ("#DOCUMENT", null); }

    public void endDocument ()
        { current = null; }

    // DeclHandler interface
    public void externalEntityDecl (String name,
            String publicId, String systemId)
    throws SAXException
    {
        // parameter entities are not tracked
        if (name.charAt (0) == '%')
            return;

        // absolutize URL
        try {
            URL url = new URL (locator.getSystemId ());
            systemId = new URL (url, systemId).toString ();
        } catch (IOException e) {
            // what could we do?
        }
        extEntities.put (name, systemId);
    }

    public void elementDecl (String name, String model) { }
    public void attributeDecl (String element, String name,
            String type, String mode, String defaultValue) {}
    public void internalEntityDecl (String name, String value) { }

    // LexicalHandler interface
    public void startEntity (String name)
    throws SAXException
    {
        String uri = (String) extEntities.get (name);
        if (uri != null)
            addMarker ("#ENTITY", uri);
    }

    public void endEntity (String name)
    throws SAXException
    {
        // pop only the markers that startEntity() pushed; internal
        // entities and the DTD subset never added one
        if (extEntities.get (name) != null)
            current = current.parent;
    }

    public void startDTD (String root, String publicId, String systemId) {}
    public void endDTD () {}
    public void startCDATA () {}
    public void endCDATA () {}
    public void comment (char buf[], int off, int len) {}
}

With such a stack of attributes, it's easy to find the current values of inherited attributes like xml:space, xml:lang, xml:base, and their application-specific friends. For example, an application might have a policy that all unspecified attributes with #IMPLIED default values are inherited from some ancestor element's value or are calculated using data found in such a context stack.

Notice how this code added marker entries on the stack with synthetic xml:base attributes holding the true base URIs for the document and external general entities. That information is needed to correctly implement the recommendation, and lets getBaseURI() work entirely from this stack. If you need such functionality very often, you might want to provide a more general API, not packaged as internal to one handler implementation.

5.1.2. The LocatorImpl Class

This is a general-purpose implementation of the Locator interface. As well as reading location properties (as defined in the interface), you can write and modify them. It's part of SAX1 and is still useful in SAX2.

The locator provided through ContentHandler.setDocumentLocator() can be used during any event callback, but the values it returns will change over time.
If you need a copy of those values for later use, it's simplest to use this class; just create a new instance using the copy constructor. More typically, you will pass the locator to the constructor for some kind of SAXException, or just save the current base URI to use with relative URIs you find in document (or attribute) content.

5.1.3. The NamespaceSupport Class

When your code needs to track namespaces or their prefixes, use this SAX2 class. One audience for this class is authors of XML parsers; that's probably not you. More likely you're writing code that, like XPath or W3C's XML schemas, needs to parse prefixed names when they're found in attribute values or element content; this class can help. Or you may be writing code to select or generate element or attribute name prefixes for output. (If you only need to put those names in element or attribute names, you should be able to package that work in an event filter component that postprocesses your output and ensures that its namespace content matches XML 1.0 rules.)

What this class does is maintain a stack of namespace contexts, in which each context holds a set of prefix-to-URI mappings; the contexts normally correspond to an element. This is the right model to use when you're writing an XML parser. If you try to use this class in a layer on top of a SAX2 parser, you'll notice a slight mismatch: all the prefix-mapping events for an element's namespace context precede the startElement() events for that element. That is, you'll need to create and populate new contexts before you see the element that signifies a new context.[23] One simple way to work around this is with a Boolean flag indicating whether a new context is active yet.
To use this class with a SAX2 parser that's set to report namespace prefix mappings, you have to modify some of your ContentHandler callbacks to maintain that stack of contexts. This is done in much the same way as you produce those callbacks yourself:
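A minimal sketch of that bookkeeping, using the Boolean-flag workaround described above, might look like the following. The class name PrefixTracker and the field names are hypothetical; the NamespaceSupport calls (reset(), pushContext(), declarePrefix(), popContext()) are the standard ones.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.NamespaceSupport;

// Hypothetical handler that mirrors the parser's namespace contexts.
class PrefixTracker extends DefaultHandler {
    final NamespaceSupport namespaces = new NamespaceSupport();

    // true once a context has been pushed for the element about to start
    private boolean contextPushed = false;

    @Override
    public void startDocument() {
        namespaces.reset();
        contextPushed = false;
    }

    @Override
    public void startPrefixMapping(String prefix, String uri) {
        // Mappings arrive before startElement(), so open the context early.
        if (!contextPushed) {
            namespaces.pushContext();
            contextPushed = true;
        }
        namespaces.declarePrefix(prefix, uri);
    }

    @Override
    public void startElement(String uri, String local, String qName,
                             Attributes atts) {
        // Elements that declare no prefixes still need their own context.
        if (!contextPushed)
            namespaces.pushContext();
        contextPushed = false;
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        namespaces.popContext();
    }
}
```

Every element gets exactly one context in this sketch, whether or not it declares any prefixes, so the single popContext() call in endElement() always balances the push. Application logic in startElement() would go after the flag handling, once the element's context is known to be current.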
If you follow these rules, you can use processName() to interpret element and attribute names that you find according to the current prefix bindings, or you can use getPrefix() to choose a prefix given a particular namespace URI:
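As a hedged illustration of both calls, the sketch below resolves a prefixed name and then goes the other way, from a namespace URI to a prefix. The class QNameDemo and its helper methods are hypothetical, and the XML Schema namespace is used only as a familiar example binding.

```java
import org.xml.sax.helpers.NamespaceSupport;

// Hypothetical helpers wrapping two NamespaceSupport lookups.
class QNameDemo {
    // Resolve a prefixed name found in an attribute value or element content.
    // Returns {namespaceURI, localName, qName}, or null if the prefix
    // is not declared in the current context.
    static String[] resolve(NamespaceSupport ns, String qName) {
        // false: treat the name like an element name, so an unprefixed
        // name picks up the default namespace
        return ns.processName(qName, new String[3], false);
    }

    // Build a prefixed name for output, or return null if no prefix is
    // bound to the URI (getPrefix() never returns the default prefix).
    static String prefixed(NamespaceSupport ns, String uri, String local) {
        String prefix = ns.getPrefix(uri);
        return prefix == null ? null : prefix + ":" + local;
    }

    public static void main(String[] args) {
        NamespaceSupport ns = new NamespaceSupport();
        ns.pushContext();
        ns.declarePrefix("xsd", "http://www.w3.org/2001/XMLSchema");

        String[] parts = resolve(ns, "xsd:string");
        System.out.println(parts[0] + " " + parts[1]);
        System.out.println(
            prefixed(ns, "http://www.w3.org/2001/XMLSchema", "integer"));
    }
}
```

In a real handler the NamespaceSupport instance would be the one maintained from the ContentHandler callbacks rather than one populated by hand as in main().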
Consult the class documentation (javadoc) for full details about the methods on this class.

Copyright © 2002 O'Reilly & Associates. All rights reserved.