
XML in a Nutshell

Chapter 25. SAX Reference

SAX, the Simple API for XML, is a straightforward, event-based API used to parse XML documents. David Megginson, SAX's original author, placed SAX in the public domain. SAX is bundled with all parsers that implement the API, including Xerces, MSXML, Crimson, the Oracle XML Parser for Java, and Ælfred. You can also download SAX separately, along with its full source code, from http://sax.sourceforge.net/.

SAX was originally defined as a Java API and is intended primarily for parsers written in Java, so this chapter focuses on its Java implementation. However, ports to other object-oriented languages, such as C++, Python, Perl, and Eiffel, are common and usually quite similar to the Java version.

TIP: This chapter covers SAX2 exclusively. In 2002, all major parsers that support SAX support SAX2. The major change from SAX1 to SAX2 was the addition of namespace support. This addition necessitated changing the names and signatures of almost every method and class in SAX. The old SAX1 methods and classes are still available, but they're now deprecated and shouldn't be used.

25.1. The org.xml.sax Package

The org.xml.sax package contains the core interfaces and classes that comprise the Simple API for XML.

The XMLReader Interface

The XMLReader interface represents the XML parser that reads XML documents. You generally do not implement this interface yourself. Instead, use the org.xml.sax.helpers.XMLReaderFactory class to build a parser-specific implementation. Then use this parser's various setter methods to configure the parsing process. Finally, invoke the parse( ) method to read the document. As the document is read, the parser calls back to methods in your own implementations of ContentHandler, ErrorHandler, EntityResolver, and DTDHandler:

package org.xml.sax;

import java.io.IOException;

public interface XMLReader {

  public boolean getFeature(String name)
   throws SAXNotRecognizedException, SAXNotSupportedException;
  public void    setFeature(String name, boolean value)
   throws SAXNotRecognizedException, SAXNotSupportedException;
  public Object  getProperty(String name)
   throws SAXNotRecognizedException, SAXNotSupportedException;
  public void    setProperty(String name, Object value)
   throws SAXNotRecognizedException, SAXNotSupportedException;

  public void           setEntityResolver(EntityResolver resolver);
  public EntityResolver getEntityResolver( );
  public void           setDTDHandler(DTDHandler handler);
  public DTDHandler     getDTDHandler( );
  public void           setContentHandler(ContentHandler handler);
  public ContentHandler getContentHandler( );
  public void           setErrorHandler(ErrorHandler handler);
  public ErrorHandler   getErrorHandler( );

  public void parse(InputSource input) throws IOException, SAXException;
  public void parse(String systemID) throws IOException, SAXException;

}
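
The sequence described above (obtain a reader, configure it, register handlers, parse) might be sketched as follows. This is a minimal illustration rather than part of the SAX distribution: the ElementCounter class and its countElements( ) method are hypothetical names invented for the example, and the code assumes the Java runtime supplies a default SAX2 parser. Recent JDKs mark XMLReaderFactory as deprecated, although the class is still present.

```java
import java.io.IOException;
import java.io.StringReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

// Hypothetical example class: counts the elements in a document.
// DefaultHandler supplies do-nothing versions of the ContentHandler,
// ErrorHandler, EntityResolver, and DTDHandler callbacks.
public class ElementCounter extends DefaultHandler {

    private int count = 0;

    // Called by the parser once for every start-tag (or empty-element tag).
    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        count++;
    }

    @SuppressWarnings("deprecation") // XMLReaderFactory is deprecated in newer JDKs
    public static int countElements(String xml)
            throws IOException, SAXException {
        // 1. Build a parser-specific XMLReader implementation.
        XMLReader reader = XMLReaderFactory.createXMLReader();

        // 2. Configure it. This is the standard SAX2 namespaces feature.
        reader.setFeature("http://xml.org/sax/features/namespaces", true);

        // 3. Register the callback objects.
        ElementCounter handler = new ElementCounter();
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler);

        // 4. Parse; the reader invokes startElement( ) as it reads.
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.count;
    }

    public static void main(String[] args) throws Exception {
        // doc, item, item: prints 3
        System.out.println(countElements("<doc><item/><item/></doc>"));
    }
}
```

Wrapping the string in an InputSource keeps the example self-contained. To read from a URL or file instead, the parse(String systemID) overload could be used.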

The SAXException Class

Most exceptions thrown by SAX methods are instances of the SAXException class or one of its subclasses. The single exception to this rule is the parse( ) method of XMLReader, which may throw a raw IOException if a purely I/O-related error occurs, for example, if a socket is broken before the parser finishes reading the document from the network.
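
A caller of parse( ) can therefore distinguish the two kinds of failure with separate catch clauses. The following sketch assumes the JDK's default parser; ParseFailureKinds, classify( ), and BrokenReader are hypothetical names, and BrokenReader merely simulates a connection that fails partway through reading. How a particular parser reports a given low-level failure may vary, so treat the classification as illustrative.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

public class ParseFailureKinds {

    // Hypothetical Reader that fails on the first read, standing in
    // for a socket that breaks while the document is being fetched.
    public static class BrokenReader extends Reader {
        @Override
        public int read(char[] buf, int off, int len) throws IOException {
            throw new IOException("simulated broken connection");
        }
        @Override
        public void close() { }
    }

    // Returns "ok", "io", or "sax" depending on how parsing ended.
    @SuppressWarnings("deprecation") // XMLReaderFactory is deprecated in newer JDKs
    public static String classify(Reader in) {
        try {
            XMLReader reader = XMLReaderFactory.createXMLReader();
            // DefaultHandler.fatalError( ) rethrows the SAXParseException.
            reader.setErrorHandler(new DefaultHandler());
            reader.parse(new InputSource(in));
            return "ok";
        } catch (IOException ex) {
            return "io";   // purely I/O-related failure
        } catch (SAXException ex) {
            return "sax";  // well-formedness or other SAX-level failure
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(new StringReader("<a/>")));
        System.out.println(classify(new StringReader("<a><b></a>")));
        System.out.println(classify(new BrokenReader()));
    }
}
```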

Besides the usual exception methods, such as getMessage( ) and printStackTrace( ), that SAXException inherits from or overrides in its superclasses, SAXException adds a getException( ) method to return the nested exception that caused the SAXException to be thrown in the first place:

package org.xml.sax;

public class SAXException extends Exception {

    public SAXException(String message);
    public SAXException(Exception ex);
    public SAXException(String message, Exception ex);

    public String    getMessage( );
    public Exception getException( );
    public String    toString( );

}
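
As a small illustration of getException( ), the sketch below wraps an IOException in a SAXException and then recovers it. NestedExceptionDemo, loadLookupTable( ), and describe( ) are hypothetical names chosen for the example. The only SAX calls used are the constructors and accessors listed above.

```java
import java.io.IOException;
import org.xml.sax.SAXException;

public class NestedExceptionDemo {

    // Hypothetical helper, standing in for work done inside a handler
    // callback that is only allowed to throw SAXException.
    static void loadLookupTable() throws SAXException {
        try {
            throw new IOException("disk unavailable");
        } catch (IOException ex) {
            // Wrap the underlying failure so it can cross the SAX API.
            throw new SAXException("could not load lookup table", ex);
        }
    }

    // Builds a message that includes the nested exception, if any.
    public static String describe(SAXException ex) {
        Exception nested = ex.getException();
        if (nested == null) {
            return ex.getMessage();
        }
        return ex.getMessage() + " (caused by "
            + nested.getClass().getSimpleName() + ": "
            + nested.getMessage() + ")";
    }

    public static void main(String[] args) {
        try {
            loadLookupTable();
        } catch (SAXException ex) {
            // prints: could not load lookup table (caused by IOException: disk unavailable)
            System.out.println(describe(ex));
        }
    }
}
```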



Copyright © 2002 O'Reilly & Associates. All rights reserved.