However, DTDs do not provide fine control over the format and data
types of element and attribute values. Apart from the special
attribute types (ID, IDREF, ENTITY, NMTOKEN, and so forth),
once an element or attribute has been declared to contain character
data, no limits can be placed on the length, type, or format of that
content. For narrative documents (such as web pages, book chapters,
and newsletters), this level of control is usually good enough.
But as XML makes inroads into more data-intensive applications (such
as web services using SOAP), more precise control over the text
content of elements and attributes becomes important. The W3C XML
Schema standard addresses this need with the following features:
- Simple and complex data types
- Type derivation and inheritance
- Element occurrence constraints
- Namespace-aware element and attribute declarations
The most important of these features is the addition of simple data
types for parsed character data and attribute values. Unlike DTDs,
schemas can enforce specific rules about the contents of elements and
attributes. In addition to a wide range of built-in simple types
(such as string, integer, decimal, and dateTime), the
schema language provides a framework for declaring new data types,
deriving new types from old types, and reusing types from other
schemas.
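
As a minimal sketch of the difference, consider a hypothetical quantity element that should hold an integer between 1 and 100. A DTD can only declare it as character data, while a schema can derive a restricted type from the built-in integer type. The element name, the type name quantityType, and the bounds are assumptions chosen for illustration; they do not come from any particular application.

```xml
<?xml version="1.0"?>
<!-- DTD equivalent, for comparison: <!ELEMENT quantity (#PCDATA)>
     That declaration accepts any text at all. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical type: an integer restricted to the range 1 to 100 -->
  <xs:simpleType name="quantityType">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="1"/>
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- The element reuses the derived type by name -->
  <xs:element name="quantity" type="quantityType"/>
</xs:schema>
```

Under this sketch, a validating schema processor would accept <quantity>42</quantity> but reject <quantity>many</quantity> or <quantity>250</quantity>, all of which the DTD declaration would allow.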
Besides simple data types, schemas add the ability to place more
explicit restrictions on the number and sequence of child elements
that can appear in a given location. This holds even when elements
are mixed with character data, whereas the mixed content model
(#PCDATA) supported by DTDs cannot constrain the order or number of
the child elements it lists.
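
The following sketch illustrates occurrence constraints applied to mixed content. The letter element and its children are hypothetical names introduced only for this example. In a DTD, the closest declaration would be <!ELEMENT letter (#PCDATA | salutation | paragraph | signature)*>, which permits the children in any order and any quantity.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical letter: text may appear between the children, but the
       children must still occur in this order and within these counts. -->
  <xs:element name="letter">
    <xs:complexType mixed="true">
      <xs:sequence>
        <!-- Exactly one salutation (minOccurs and maxOccurs default to 1) -->
        <xs:element name="salutation" type="xs:string"/>
        <!-- Between one and five paragraphs -->
        <xs:element name="paragraph" type="xs:string"
                    minOccurs="1" maxOccurs="5"/>
        <!-- An optional signature -->
        <xs:element name="signature" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Setting mixed="true" allows character data around the child elements, while the xs:sequence model and the minOccurs and maxOccurs attributes continue to govern which children appear, in what order, and how many times.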