B.8. Character Information Items
Along with element and attribute information items,
characters are one of the core types of information used by
XML applications. SAX2 reports characters in groups,
rather than one at a time.
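Because a parser may deliver one run of text through several characters() calls, a handler usually appends each group to a buffer instead of treating a single call as the complete text. The following is a minimal sketch of that idea; the class name TextCollector and its collect() helper are hypothetical, and the parser is obtained through the standard JAXP SAXParserFactory.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: gathers all character data in a document,
// no matter how many characters() calls the parser uses to report it.
class TextCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private int calls = 0;

    @Override
    public void characters(char[] ch, int start, int length) {
        // Only the slice [start, start + length) belongs to this event.
        text.append(ch, start, length);
        calls++;
    }

    String getText() {
        return text.toString();
    }

    int getCalls() {
        return calls;
    }

    // Convenience helper (an assumption of this sketch): parse a string
    // and return the concatenated character data.
    static String collect(String xml) {
        try {
            TextCollector handler = new TextCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.getText();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```

The number of calls is parser dependent (many parsers start a new group at an entity reference, for example), so application code should rely only on the concatenated result.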
Property | Callbacks | Explanation |
[character code] | ContentHandler.characters(), ContentHandler.ignorableWhitespace() | These calls provide one or more characters in the UTF-16 encoding. Normally, each Java char is a single [character code], but surrogate pairs are used to encode characters from the "Astral Planes," which don't fit into 16 bits. (No whitespace characters need surrogate pairs.) |
[element content whitespace] | | When known, this Boolean property is encoded by using the ignorableWhitespace() callback instead of characters(). Most SAX parsers report this property even when they aren't validating, though that's not required. (If any external parameter entities are skipped, it is not possible to reliably provide this information.) |
[parent] | | Applications must keep track of this information item if it is needed. |
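The three properties in the table can be recovered together in one handler. The sketch below is one possible approach, not the only one: it keeps a stack of open element names to supply [parent], counts [character code] values by skipping the second half of each surrogate pair, and tallies whatever arrives through ignorableWhitespace() separately. The class name CharacterInfoHandler and its helper methods are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: tracks [parent] itself and counts character
// codes (Unicode code points) per parent element.
class CharacterInfoHandler extends DefaultHandler {
    private final Deque<String> openElements = new ArrayDeque<>();
    private final Map<String, Integer> codePoints = new HashMap<>();
    private int ignorableChars = 0;

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        openElements.push(qName);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        openElements.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        int count = 0;
        for (int i = start; i < start + length; i++) {
            // A low surrogate is the second half of a pair, so it does
            // not start a new character code. This stays correct even
            // if a parser splits a pair across two calls.
            if (!Character.isLowSurrogate(ch[i])) {
                count++;
            }
        }
        // The top of the stack is the [parent] of these characters.
        codePoints.merge(openElements.peek(), count, Integer::sum);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        // Only called when the parser knows this is element content
        // whitespace; whitespace never needs surrogate pairs.
        ignorableChars += length;
    }

    int codePointsIn(String parent) {
        return codePoints.getOrDefault(parent, 0);
    }

    int getIgnorableChars() {
        return ignorableChars;
    }

    // Convenience helper (an assumption of this sketch).
    static CharacterInfoHandler parse(String xml) {
        try {
            CharacterInfoHandler handler = new CharacterInfoHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```

Whether ignorableWhitespace() is ever invoked depends on the parser and on the document having a DTD that declares element content models; without that knowledge, the same whitespace is reported through characters().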
SAX2 permits reporting of a character property that the XML
Infoset doesn't address: whether the characters are in a CDATA
section. (DOM requires this information.)
Such section boundaries are reported using the startCDATA() and endCDATA()
methods of the LexicalHandler interface.
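As an illustration of CDATA boundary reporting, the sketch below separates character data inside CDATA sections from all other character data. It assumes the parser supports the standard SAX2 lexical-handler property (http://xml.org/sax/properties/lexical-handler), which is optional for parsers; the class name CdataSplitter and its parse() helper are hypothetical. It extends org.xml.sax.ext.DefaultHandler2, which provides empty implementations of the LexicalHandler methods.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical handler: records which characters were reported
// between startCDATA() and endCDATA().
class CdataSplitter extends DefaultHandler2 {
    private boolean inCdata = false;
    private final StringBuilder cdataText = new StringBuilder();
    private final StringBuilder otherText = new StringBuilder();

    @Override
    public void startCDATA() {
        inCdata = true;
    }

    @Override
    public void endCDATA() {
        inCdata = false;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The characters themselves arrive through the usual callback;
        // only the section boundaries come from LexicalHandler.
        (inCdata ? cdataText : otherText).append(ch, start, length);
    }

    String getCdataText() {
        return cdataText.toString();
    }

    String getOtherText() {
        return otherText.toString();
    }

    // Convenience helper (an assumption of this sketch).
    static CdataSplitter parse(String xml) {
        try {
            CdataSplitter handler = new CdataSplitter();
            XMLReader reader =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            // LexicalHandler is registered as a property, not with a setter.
            reader.setProperty(
                    "http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```

A parser that does not recognize the property throws SAXNotRecognizedException from setProperty(), which this sketch simply wraps in an unchecked exception.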
Copyright © 2002 O'Reilly & Associates. All rights reserved.