A.3. A Short History of XML Schema Languages
The list of schema languages is long and, to be complete, would need
to include languages developed for SGML (the language used before XML
was born). The list that I propose is far from exhaustive; it includes
only the major proposals that have influenced the schema languages I
see as the most promising.
A.3.1. The DTD Family
DTDs were mandatory for any SGML application, and a
simplified version of the SGML DTD was introduced in the XML 1.0
Recommendation. Even though a DTD is not required for an application
to read and understand an XML document, many developers strongly
recommend writing DTDs for any XML application.
A.3.2. The W3C XML Schema Family
The W3C XML Schema Working Group
received many proposals that were contributed as notes:
-
XML-Data,
submitted as a note (http://www.w3.org/TR/1998/NOTE-XML-data) in
January 1998 by Microsoft, DataChannel, Arbortext, Inso Corporation,
and the University of Edinburgh, included most of the basic concepts
later developed by W3C XML Schema. Although the details were not
fully developed, the note covered a lot of ground that was kept out
of W3C XML Schema, such as internal and external entity definitions
and the mapping to RDF (Resource Description Framework) and OOP
structures.
-
XML-Data-Reduced
(XDR), submitted in July 1998 (http://www.ltg.ed.ac.uk/~ht/XMLData-Reduced.htm)
by Microsoft and the University of Edinburgh, was presented to
"refine and subset those ideas down to a more
manageable size in order to allow faster progress toward adopting a
new schema language for XML" (mappings were left
out). XDR was implemented by Microsoft and used by the BizTalk
framework.
-
DCD (Document
Content Description for XML), also submitted in July 1998
(http://www.w3.org/TR/NOTE-dcd)
by Textuality, Microsoft, and IBM, was a "subset of
the XML-Data Submission (XML-Data) and expressed it in a way which is
consistent with the ongoing W3C RDF (Resource Description Framework)
effort." Mapping considerations were left out, but
the language took care to be consistent with RDF through features
such as "Interchangeability of Elements and
Attributes."
-
SOX (Schema
for Object-Oriented XML) was developed by Veo Systems/Commerce One
and submitted as a note in September 1998; a second version was
submitted in July 1999 (see http://www.w3.org/TR/NOTE-SOX). The note
describes SOX as "informed by the XML 1.0 specification as well as
the XML-Data submission (XML-Data), the Document Content Description
submission (DCD), and the EXPRESS language reference manual
(ISO-10303-11)." SOX was strongly influenced by OOP
language design and included concepts of interface and
implementation, but it was also influenced by DTDs and included
support for "parameters." SOX is
widely used by Commerce One.
-
DDML (Document Definition Markup
Language or XSchema) was the "result of
contributions from a large number of people on the XML-Dev mailing
list, coordinated by a smaller group of editors"
(Ronald Bourret, John Cowan, Ingo Macherius, and Simon St. Laurent)
and was submitted as a note in January 1999 (http://www.w3.org/TR/NOTE-ddml). Its purpose
was to "encode the logical (as opposed to physical)
content of DTDs in an XML document." Great attention
was paid to defining the conversions in both directions between
DTDs and DDML, and the document also includes an
"experimental" chapter proposing
"Inline DDML Elements." DDML made a
clear distinction between structures and data and left datatypes out.
-
W3C XML
Schema, published as a Recommendation in May 2001 (http://www.w3.org/TR/xmlschema-0, http://www.w3.org/TR/xmlschema-1 and
http://www.w3.org/TR/xmlschema-2),
acknowledges the influence of DCD, DDML, SOX, XML-Data, and XDR in
its list of references. It appears to have picked pieces from each of
these proposals but is also a compromise between them. The main
sponsors of the two languages still actively used and developed
(Microsoft for XDR and Commerce One for SOX) both announced that they
would support W3C XML Schema for their new developments. W3C XML
Schema will most likely become the only surviving member of this
family in the long term.
A.3.3. The RELAX NG Family
The RELAX NG family is a more traditional
marriage between grammar-based XML schema languages that have chosen
to unite their strengths.
-
First published in March 2000 as a Japanese Industrial Standard (JIS)
Technical Report written by Murata Makoto, Regular Language description for
XML Core (RELAX; see http://www.xml.gr.jp/relax) is both simple
("Tired of complicated specifications? You just
RELAX!") and built on a solid mathematical
foundation (the adaptation of the hedge automata theory to XML
trees). It was approved as an ISO/IEC Technical Report in May 2001.
-
XDuce (http://xduce.sourceforge.net) was first
announced in March 2000. "XDuce
('transduce') is a typed
programming language that is specifically designed for processing XML
data. One can read an XML document as an XDuce value, extract
information from it or convert it to another format, and write out
the result value as an XML document." Although it is
not meant to be a schema language, its type system has influenced
the schema languages of this family.
-
Published by James Clark in January 2001,
TREX (Tree Regular Expressions for XML;
see http://thaiopensource.com/trex) is
"basically the type system of XDuce with an XML
syntax and with a bunch of additional features." The
names and content models of the elements used to define the tree
patterns of a TREX schema have been carefully chosen, and TREX
schemas are usually as easy to read as a plain text description. The
simplicity of the structure of the language also restores the
consistent treatment of elements and attributes, a feature lost
since DCD.
-
Announced in May 2001, RELAX NG (RELAX New Generation) is a merger of
RELAX and TREX, developed by an OASIS TC (http://www.oasis-open.org/committees/relax-ng),
coedited by James Clark and Murata Makoto. "The key
features of RELAX NG are that it is simple, easy to learn, uses XML
syntax, does not change the information set of an XML document,
supports XML namespaces, treats attributes uniformly with elements so
far as possible, has unrestricted support for unordered content, has
unrestricted support for mixed content, has a solid theoretical
basis, and can partner with a separate datatyping language (such W3C
XML Schema Datatypes)." RELAX NG is now an official
specification of the OASIS RELAX NG Technical Committee and will
probably progress to become an ISO/IEC International Standard as part
of DSDL.
A.3.4. Schematron
Schematron (http://www.ascc.net/xml/resource/schematron/schematron.html),
which was first proposed in September 1999 by Rick Jelliffe of
the Academia Sinica Computing Centre, is unusual among schema
languages: rather than describing a grammar, it defines validation
rules using XPath expressions. Schematron is also described in the
ISO DSDL project.
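To give a feel for what rule-based validation looks like, here is a minimal sketch in Python. It is not Schematron and does not use Schematron syntax; it only imitates the idea of pairing a context path with an assertion evaluated relative to that context. The rule table, the check function, and the sample document are all hypothetical names chosen for illustration, and the standard library's ElementTree module supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules, loosely modeled on the rule/assert idea:
# (context path, path that must match inside each context node, message).
RULES = [
    (".//book", "title", "a book must have a title"),
    (".//book", "author", "a book must have at least one author"),
]

# Hypothetical sample instance document.
SAMPLE = """<library>
  <book><title>Being a Dog</title><author>Snoopy</author></book>
  <book><title>No Author</title></book>
</library>"""


def check(xml_text, rules=RULES):
    """Return the messages of all assertions that fail for the document."""
    root = ET.fromstring(xml_text)
    failures = []
    for context, test, message in rules:
        # Select every node matching the context path...
        for node in root.findall(context):
            # ...and report a failure when the assertion path matches nothing.
            if node.find(test) is None:
                failures.append(message)
    return failures


if __name__ == "__main__":
    for failure in check(SAMPLE):
        print(failure)
```

Note that nothing in the rules describes the overall structure of the document: a rule only fires where its context matches, which is what sets this style apart from grammar-based languages.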
A.3.5. Examplotron
Starting from the observations that
instance documents are usually much easier to understand than the
schemas that describe them, and that schema languages often need to
give examples of instance documents to help human readers to
understand their syntax, I proposed Examplotron (http://examplotron.org) in March 2001, to
define "schemas by example" using
sample instance documents as actual schemas.
Copyright © 2002 O'Reilly & Associates. All rights reserved.