1.5. Keep in Mind...
In many cases, you'll find that the XML modules on
CPAN satisfy 90 percent of your needs. Of course, that final 10
percent is the difference between being an essential member of your
company's staff and ending up slated for the next
round of layoffs. We're going to give you your
money's worth out of this book by showing you in
gruesome detail how XML processing in Perl works at the lowest levels
(relative to any other kind of specialized text munging you may
perform with Perl). To start, let's go over some
basic truths:
-
It doesn't matter where it comes from.
By the time the XML
parsing part of a
program gets its hands on a document, it doesn't
give a camel's hump where the thing came from. It
could have been received over a network, constructed from a database,
or read from disk. To the parser, it's good (or bad)
XML, and that's all it knows.
Mind you, the program as a whole might care a great deal. If we write
a program that implements XML-RPC, for example, it had better know
exactly how to use TCP to fetch and send all that XML data over the
Internet! We can have it do that fetching and sending however we
like, as long as the end product is the same: a clean XML document
fit to pass to the XML processor that lies at the
program's core.
We will get into some detailed examples of larger programs later in
this book.
-
Structurally, all XML documents are similar.
No matter why or how they were put together or to what purpose
they'll be applied, all XML documents must follow
the same basic rules of well-formedness: exactly one root element, no
overlapping elements, all attributes quoted, and so on. Every XML
processor's parser component will, at its core, need
to do the same things as every other XML processor. This, in turn,
means that all these processors can share a common base. Perl
XML-processing programs usually observe this in their use of one of
the many free parsing modules, rather than having to reimplement
basic XML parsing procedures every time.
Furthermore, the one-document, one-element nature of XML makes
processing a pleasantly fractal experience, as any document invoked
through an external entity by another document magically becomes
"just another element" within the
invoker, and the same code that crawled the first document can
skitter into the meat of any reference (and anything to which the
reference might refer) without batting an eye.
-
In meaning, all XML applications are different.
An XML application is the raison d'être of
any one XML document: the higher-level set of rules the document
follows in order to serve some useful purpose, be it filling
out a configuration file, preparing a network transmission, or
describing a comic strip. XML applications exist not only to bless
humble documents with a higher sense of purpose, but also to require
that the documents be written according to a given application
specification.
DTDs help enforce the consistency of this structure. However, you
don't have to have a formal validation scheme to
make an application. You may want to create some validation rules,
though, if you need to make sure that your successors (including
yourself, two weeks in the future) do not stray from the path you had
in mind when they make changes to the program. You should also create
a validation scheme if you want to allow others to write programs
that generate the same flavor of XML.
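The first two truths can be sketched in a few lines of Perl. What
follows is a minimal illustration, not a recommended design: it
assumes the CPAN module XML::Parser is installed, and the helper name
is_well_formed and the sample documents are invented for this
example. It hands the same document to the parser once as a string
and once through a filehandle, then feeds it input that breaks the
well-formedness rules listed above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::Parser;    # assumed to be installed from CPAN

# Hypothetical helper: returns 1 if the parser accepts $source, 0 if it dies.
# $source may be a string holding a whole document or an open filehandle.
sub is_well_formed {
    my ($source) = @_;
    my $parser = XML::Parser->new;
    return eval { $parser->parse($source); 1 } ? 1 : 0;
}

my $good = '<strip><panel id="1">Hello</panel></strip>';

# Same document, two origins: an in-memory string and a file on disk.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} $good;
close $out or die "Cannot close $path: $!";
open my $in, '<', $path or die "Cannot open $path: $!";

my $from_string = is_well_formed($good);
my $from_file   = is_well_formed($in);
close $in;

# Documents that break the basic rules, whatever their purpose.
my %broken = (
    'two root elements'    => '<strip/><strip/>',
    'overlapping elements' => '<strip><a><b></a></b></strip>',
    'unquoted attribute'   => '<strip><panel id=1/></strip>',
);

print "string: $from_string, file: $from_file\n";
for my $problem (sort keys %broken) {
    print "$problem: ", is_well_formed($broken{$problem}), "\n";
}
```

Both origins should report 1, and each broken document should report
0, since the parser judges only the XML it is handed.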
Most of the XML hacking you'll accomplish will
capitalize on this document/application duality. In most cases, your
software will consist of parts that cover all three of these facts:
-
It will accept input in an appropriate way -- listening to a
network socket, for example, or reading a file from disk. This
behavior is very ordinary and Perlish: do whatever's
necessary here to get that data.
-
It will pass captured input to some kind of XML processor. Dollars to
doughnuts, you'll use one of the parsers that
other people in the Perl community have already written and continue
to maintain, such as XML::Simple, or the more
sophisticated modules we'll discuss later.
-
Finally, it will Do Something with whatever that processor did to the
XML. Maybe it will output more XML (or HTML), update a database, or
send mail to your mom. This is the defining point of your XML
application -- it takes the XML and does something meaningful with
it. While we won't cover the infinite possibilities
here, we will discuss the crucial ties between the XML processor and
the rest of your program.
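The three parts above might be sketched as follows. This is only one
possible shape, assuming XML::Simple is installed from CPAN; the
subroutine names fetch_xml, parse_xml, and summarize, and the sample
comic-strip document, are hypothetical and chosen for illustration.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);    # assumed to be installed from CPAN

# Part 1: accept input. This sketch slurps a filehandle; a socket or
# database query could stand in without touching the other two parts.
sub fetch_xml {
    my ($fh) = @_;
    local $/;                 # undefined record separator: read it all
    return scalar <$fh>;
}

# Part 2: pass the captured input to an XML processor.
sub parse_xml {
    my ($xml) = @_;
    # ForceArray keeps <panel> a list even when only one is present;
    # KeyAttr => [] turns off XML::Simple's array-to-hash folding.
    return XMLin($xml, ForceArray => ['panel'], KeyAttr => []);
}

# Part 3: Do Something with the result. Here, a one-line summary.
sub summarize {
    my ($strip) = @_;
    my $count = @{ $strip->{panel} };
    return "$strip->{title}: $count panel(s)";
}

my $document = <<'XML';
<strip title="Camel Days">
  <panel>One hump</panel>
  <panel>Two humps</panel>
</strip>
XML

open my $fh, '<', \$document or die "Cannot open in-memory handle: $!";
my $xml = fetch_xml($fh);
close $fh;

print summarize(parse_xml($xml)), "\n";    # prints "Camel Days: 2 panel(s)"
```

Keeping the parts in separate subroutines means the input side can
change (disk today, network tomorrow) while the parsing and the
application logic stay put.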
Copyright © 2002 O'Reilly & Associates. All rights reserved.