Example 4-1. XML fragment
<recipe>
<name>peanut butter and jelly sandwich</name>
<!-- add picture of sandwich here -->
<ingredients>
<ingredient>Gloppy&trade; brand peanut butter</ingredient>
<ingredient>bread</ingredient>
<ingredient>jelly</ingredient>
</ingredients>
<instructions>
<step>Spread peanut butter on one slice of bread.</step>
<step>Spread jelly on the other slice of bread.</step>
<step>Put bread slices together, with peanut butter and
jelly touching.</step>
</instructions>
</recipe>
Apply a parser to the preceding example and it might generate this
list of events:
- A document start (if this is the beginning of a document and not a fragment)
- A start tag for the <recipe> element
- A start tag for the <name> element
- The piece of text "peanut butter and jelly sandwich"
- An end tag for the <name> element
- A comment with the text "add picture of sandwich here"
- A start tag for the <ingredients> element
- A start tag for the <ingredient> element
- The text "Gloppy"
- A reference to the entity trade
- The text "brand peanut butter"
- An end tag for the <ingredient> element
. . . and so on, until the final event, the end of the document, is
reached.
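To see such an event list come out of a real parser, here is a minimal sketch. It uses Python's standard xml.parsers.expat module as a stand-in, not XML::Parser. The function name collect_events and the tuple format are assumptions made for illustration. The fragment is shortened and leaves out the &trade; ingredient, because whether an entity reference is reported as its own event depends on the parser and how it is configured. Whitespace-only text between elements is dropped so the output resembles the list above.

```python
import xml.parsers.expat

# Shortened version of the recipe fragment (no entity reference).
FRAGMENT = """<recipe>
<name>peanut butter and jelly sandwich</name>
<!-- add picture of sandwich here -->
<ingredients>
<ingredient>bread</ingredient>
</ingredients>
</recipe>"""


def collect_events(xml_text):
    """Parse xml_text and return a list of (event_type, value) tuples."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    # Deliver contiguous character data in one piece instead of fragments.
    parser.buffer_text = True

    def on_text(data):
        # Skip the whitespace-only text that sits between elements.
        if data.strip():
            events.append(("text", data))

    parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    parser.CharacterDataHandler = on_text
    parser.CommentHandler = lambda data: events.append(("comment", data.strip()))

    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return events


for event in collect_events(FRAGMENT):
    print(event)
```

This binding has no separate "document start" event, so the sketch begins with the start tag for <recipe>.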
Somewhere between chopping up a stream into tokens and processing the
tokens is a layer one might call a dispatcher. It branches the
processing depending on the type of token. The code that deals with a
particular token type is called a handler.
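As a rough illustration of a dispatcher written by hand, the following Python sketch branches on the token type and passes the token's value to a handler. The token format (a type and value pair) and all function names are hypothetical, chosen only for this example.

```python
def handle_start_tag(name):
    return f"start tag: {name}"


def handle_text(data):
    return f"text: {data!r}"


def handle_end_tag(name):
    return f"end tag: {name}"


def dispatch(token):
    """Branch on the token type and call the matching handler."""
    kind, value = token
    if kind == "start":
        return handle_start_tag(value)
    elif kind == "text":
        return handle_text(value)
    elif kind == "end":
        return handle_end_tag(value)
    else:
        raise ValueError(f"no handler for token type {kind!r}")


# Tokens a tokenizer might produce for the <name> element of the recipe.
tokens = [
    ("start", "name"),
    ("text", "peanut butter and jelly sandwich"),
    ("end", "name"),
]
results = [dispatch(token) for token in tokens]
```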
There could be a handler for start tags, another for character data,
and so on. The dispatcher could be a compound if statement that
switches to a subroutine to handle each case. Or it could be built
into the parser as a callback dispatcher, as is the case with
XML::Parser's stream mode. If you
register a set of subroutines, one per event type, the parser calls
the appropriate one for each token as it's
generated. Which strategy you use depends on the parser.
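The callback approach can be sketched in a few lines of Python. The class CallbackDispatcher and its register and emit methods are hypothetical names for illustration. They do not reproduce XML::Parser's interface, only the general idea of registering one subroutine per event type and having the parser side call it.

```python
class CallbackDispatcher:
    """Hypothetical registry mapping event types to callbacks."""

    def __init__(self):
        self._handlers = {}

    def register(self, event_type, callback):
        # One callback per event type; a later registration replaces an earlier one.
        self._handlers[event_type] = callback

    def emit(self, event_type, payload):
        # Called by the parser side for each token as it is generated.
        callback = self._handlers.get(event_type)
        if callback is not None:
            callback(payload)


seen = []
dispatcher = CallbackDispatcher()
dispatcher.register("start", lambda name: seen.append(f"open {name}"))
dispatcher.register("text", lambda data: seen.append(f"chars {data}"))

# Simulated token stream; events with no registered callback are ignored.
stream = [
    ("start", "step"),
    ("text", "Spread jelly"),
    ("comment", "ignored"),
    ("end", "step"),
]
for event_type, payload in stream:
    dispatcher.emit(event_type, payload)
```

In this arrangement the application never writes the branching logic itself. It supplies only the handlers it cares about, and any other events pass by unnoticed.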