1.6. Some Popular SAX2 Parser Distributions

Today a variety of high-quality SAX2 parsers are available. Increasingly, they are packaged with Java programming environments, so you may not need to fetch one yourself unless you need upgrades (or bug fixes), or are constructing such a programming environment yourself (perhaps packaging an embedded system or a standalone application). You should be able to bootstrap any SAX parser. As a rule, if an XML parser is part of your Java programming environment, it already supports SAX and probably SAX2. The documentation should say whether SAX2 is supported. If it mentions only SAX1, you can upgrade to get most of the core SAX2 features; see Section 5.2, "SAX1 Support", in Chapter 5, "Other SAX Classes", for more information.
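
As a rough illustration of bootstrapping whatever parser your environment provides, the following sketch asks JAXP for its default parser, obtains the SAX2 XMLReader behind it, and counts the elements in a small document. It assumes a JAXP-capable environment; the class and method names (SaxBootstrapDemo, countElements) are invented for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of any parser distribution.
final class SaxBootstrapDemo {

    /** Parses the given XML text and returns the number of elements seen. */
    static int countElements(String xml) {
        final int[] count = {0};
        try {
            // Ask JAXP for whichever parser the environment is configured to use.
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);

            // Getting an XMLReader (rather than a SAX1 Parser) is the SAX2 entry point.
            XMLReader reader = factory.newSAXParser().getXMLReader();

            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    count[0]++;
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Wrapped only to keep the sketch short; real code would
            // handle SAXException and IOException separately.
            throw new IllegalStateException("could not bootstrap or run the parser", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countElements("<doc><item/><item/></doc>"));
    }
}
```

Going through JAXP keeps the code independent of any particular parser class, so swapping in a different distribution shouldn't require code changes.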

If your programming environment doesn't include a SAX parser, you'll need to get and install one. This section provides a brief summary of some of the most widely available open source SAX2 parsers.[5] These packages all include SAX2, DOM Level 2, and JAXP 1.1 support, and can validate XML for you. They also have full support for the standard SAX2 extensions. If the distribution you download doesn't include the SAX2 documentation, you can get it from the same site as the parser. All of these perform well in most applications, as long as you avoid the memory penalties of DOM.
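
To make validation and the SAX2 extensions a little more concrete, here is a minimal sketch that turns on DTD validation through JAXP and registers an extension handler through the standard lexical-handler property. It assumes the environment's default parser supports both, as the parsers summarized here are described as doing; the class and method names (SaxExtensionsDemo, isValid, countComments) are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical demo class; not part of any parser distribution.
final class SaxExtensionsDemo {

    /** Returns true if the document parses without any reported error. */
    static boolean isValid(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true); // request a validating parser
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setErrorHandler(new DefaultHandler2() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e; // validity errors are recoverable; treat them as failures here
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // a validity or well-formedness error was reported
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Counts comments using the LexicalHandler extension, or returns -1 if unsupported. */
    static int countComments(String xml) {
        final int[] comments = {0};
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            // Extension handlers are registered as properties, not through setters.
            reader.setProperty("http://xml.org/sax/properties/lexical-handler",
                new DefaultHandler2() {
                    @Override
                    public void comment(char[] ch, int start, int length) {
                        comments[0]++;
                    }
                });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return -1; // this parser doesn't offer the extension
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return comments[0];
    }
}
```

For instance, isValid should accept a document that matches its own DTD and reject one that doesn't, while countComments reports how many comments the parser passed along.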

[5]Proprietary SAX2 parsers exist, such as one from Oracle that is commonly used in Oracle-hosted server-side applications. More information is available on the Oracle web site, http://www.oracle.com/xml/.

Current versions of all these parsers do quite well on the open source SAX/XML conformance tests, available at http://xmlconf.sourceforge.net/java/. Those tests verify that these processors report the essential information required of a SAX1 processor, and evaluate how well they support the XML 1.0 specification. SAX2 conformance testing isn't yet as advanced, though some tests are now available.

In addition to a SAX2 parser, you will likely want to have some SAX2/XML utilities that are layered on top of that parser. The packages described here include a DOM implementation, which is normally provided as a clean layer over SAX2. You might also consider other more Java-friendly packages such as DOM4J (http://www.dom4j.org) or JDOM (http://www.jdom.org), both of which are layered over SAX2, as well as other APIs that provide more data-structure options. When you're learning SAX, having access to the source code of tools and applications built with SAX can help you learn the API, at least if it's high-quality source that uses the SAX APIs correctly.

1.6.1. Ælfred2

One of the original XML parsers mentioned earlier, Ælfred, has long been recognized for its simplicity, small size, and good performance. As XML parsers go, it is easy to read and understand. With a different maintainer (your humble author), this parser was updated to be the first with full native SAX2 support, and to substantially improve its conformance to the XML specification. This updated version is called Ælfred2, and versions have been incorporated in a variety of applications where its simplicity, size, and conformance are compelling features. It is now part of the GNU Classpath Extensions project and forms the core of the GNU JAXP library.

The updated version has taken SAX2 further than most other parsers. It has a highly modular structure; the reference distribution is able to use an optional "stream validator" that uses the SAX2 events. The model of an XML pipeline of such events is a natural and powerful way to think about SAX; the SAX2 pipeline package in this distribution lets applications compose arbitrary processing modules in series or parallel. This style of SAX2 processing is emphasized in this book, and some of the examples show how to use these advanced components. Validation and DOM support remain completely modular, and use SAX event pipelines, so Ælfred can still be distributed as a lightweight nonvalidating parser without those components. Likewise, the validation and DOM support don't need Ælfred to work.
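
The pipeline idea can be sketched with nothing more than the standard org.xml.sax.helpers.XMLFilterImpl class. The example below is not the pipeline package from the Ælfred2 distribution; it is a minimal stand-in that chains two stages in series over whatever parser JAXP supplies. The stage names (WhitespaceStripper, NameUppercaser) and the run method are invented for this illustration.

```java
import java.io.StringReader;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical demo class; uses only standard SAX2 helper classes.
final class SaxPipelineDemo {

    /** Stage 1: drops character events that contain only whitespace. */
    static final class WhitespaceStripper extends XMLFilterImpl {
        WhitespaceStripper(XMLReader parent) { super(parent); }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            if (!new String(ch, start, length).trim().isEmpty()) {
                super.characters(ch, start, length); // forward to the next stage
            }
        }
    }

    /** Stage 2: upper-cases element names before passing events on. */
    static final class NameUppercaser extends XMLFilterImpl {
        NameUppercaser(XMLReader parent) { super(parent); }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) throws SAXException {
            super.startElement(uri, localName.toUpperCase(Locale.ROOT),
                               qName.toUpperCase(Locale.ROOT), atts);
        }

        @Override
        public void endElement(String uri, String localName, String qName)
                throws SAXException {
            super.endElement(uri, localName.toUpperCase(Locale.ROOT),
                             qName.toUpperCase(Locale.ROOT));
        }
    }

    /** Runs the document through both stages and returns a trace of the events. */
    static String run(String xml) {
        final StringBuilder trace = new StringBuilder();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader parser = factory.newSAXParser().getXMLReader();

            // Events flow: parser -> WhitespaceStripper -> NameUppercaser -> handler.
            XMLReader pipeline = new NameUppercaser(new WhitespaceStripper(parser));
            pipeline.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName,
                                         Attributes atts) {
                    trace.append('<').append(qName).append('>');
                }

                @Override
                public void endElement(String uri, String localName, String qName) {
                    trace.append("</").append(qName).append('>');
                }

                @Override
                public void characters(char[] ch, int start, int length) {
                    trace.append(ch, start, length);
                }
            });
            pipeline.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return trace.toString();
    }
}
```

Because each stage is itself an XMLReader, the final consumer neither knows nor cares how many stages sit in front of it, which is what makes this style of composition attractive.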

The current version of Ælfred is licensed under the GNU General Public License (GPL), with the "library exception" clause to ensure that it can be used in proprietary applications (notably, embedded systems) that aren't themselves licensed under the GPL. That license is used with many GNU libraries, such as the GCC Java (GCJ) runtime libraries. The Ælfred distribution includes a gnujaxp.jar file, which you need to install before you can use the parser.

See http://www.gnu.org/software/classpathx/jaxp/ for information about the current distribution of Ælfred.


Copyright © 2002 O'Reilly & Associates. All rights reserved.