If we're going to use XML to exchange documents electronically, we must be able to judge whether a document meets a certain set of necessary requirements. For example, an electronic invoice must, at minimum, include an invoice number, a date, and at least one item. Our systems should be smart enough to reject an invoice if it doesn't contain the required information. Additionally, we should be able to create these requirements ourselves.
You can associate a document type definition (DTD) with an XML document to enforce these sorts of rules. You can either create a DTD or use one that already exists. A major goal of XML is to encourage various groups (industry, community, academic, etc.) to form standards bodies to define collective DTDs. Eventually, these DTDs will form the basis for a variety of electronic data exchange systems.
A DTD is a lot like a database schema.[3] Just as you would define the columns in a database table, you can use a DTD to define the name and datatype of every element that can appear in an XML document. Just as you define a column constraint, you can require that particular elements appear within the document. Just as you would normalize a set of database tables into one-to-many or one-to-one relationships, you can create the same relationships by defining how the elements can be hierarchically nested.
Let's revisit the invoice example from the beginning of this chapter. If we were to simply model a basic invoice using an entity relationship diagram (ERD), we might wind up with something like Figure 9.2.
We can use this diagram as a guide to constructing a corresponding DTD. For clarity, though, we'll start with the finished DTD and work backwards:
<!ELEMENT INVOICE (INVOICE_NUMBER, DATE, CUSTOMER+, INVOICE_ITEMS, TOTAL?)>
<!ELEMENT INVOICE_NUMBER (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT CUSTOMER (#PCDATA)>
<!ELEMENT INVOICE_ITEMS (ITEM+)>
<!ELEMENT ITEM (ITEM_NAME, QUANTITY, PRICE)>
<!ELEMENT ITEM_NAME (#PCDATA)>
<!ATTLIST ITEM_NAME ITEM_NUM CDATA #REQUIRED>
<!ELEMENT QUANTITY (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ELEMENT TOTAL (#PCDATA)>
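As you can see from the example, the majority of the DTD consists of instructions to define the elements that can appear within an invoice. The first line defines the root element, INVOICE, and lists the child elements it contains in the order in which they must appear.
The symbols in that list control how many times each child can occur. INVOICE_NUMBER, DATE, and INVOICE_ITEMS have no suffix, so each must appear exactly once. The plus sign in CUSTOMER+ means that one or more CUSTOMER elements are required, and the question mark in TOTAL? means that the TOTAL element is optional.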
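Declarations for each of these elements follow the root declaration. The first three (INVOICE_NUMBER, DATE, and CUSTOMER) are the simplest kind of declaration, consisting of a name and a datatype. XML datatypes are much more limited than the standard NUMBER, VARCHAR2, and RAW types used to define table columns. The datatype used here (#PCDATA) stands for parsed character data, which means the element holds plain text rather than other elements.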
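The next declaration, INVOICE_ITEMS, is a container: it must hold one or more ITEM elements (ITEM+). Each ITEM, in turn, consists of an ITEM_NAME, a QUANTITY, and a PRICE, in that order. The ATTLIST declaration gives ITEM_NAME a required attribute, ITEM_NUM, whose value is character data (CDATA).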
That's it -- we've defined everything we need for our simple example: the name of each element, the number of times each element can appear, and the allowable nesting arrangements they can follow. All that remains now is to make sure our XML documents are valid, which means that they are both well-formed and comply with the associated DTD. This is the job of the XML parser.
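To make the idea of validity concrete, here is a minimal sketch of an invoice that should satisfy the DTD above. The DTD is embedded in the document as an internal subset so that the example is self-contained; in practice it would more likely live in a separate file. All of the data values (the invoice number, date format, customer name, and item details) are invented for illustration and are not taken from the chapter.

```xml
<?xml version="1.0"?>
<!DOCTYPE INVOICE [
  <!-- The invoice DTD from this section, embedded as an internal subset -->
  <!ELEMENT INVOICE (INVOICE_NUMBER, DATE, CUSTOMER+, INVOICE_ITEMS, TOTAL?)>
  <!ELEMENT INVOICE_NUMBER (#PCDATA)>
  <!ELEMENT DATE (#PCDATA)>
  <!ELEMENT CUSTOMER (#PCDATA)>
  <!ELEMENT INVOICE_ITEMS (ITEM+)>
  <!ELEMENT ITEM (ITEM_NAME, QUANTITY, PRICE)>
  <!ELEMENT ITEM_NAME (#PCDATA)>
  <!ATTLIST ITEM_NAME ITEM_NUM CDATA #REQUIRED>
  <!ELEMENT QUANTITY (#PCDATA)>
  <!ELEMENT PRICE (#PCDATA)>
  <!ELEMENT TOTAL (#PCDATA)>
]>
<!-- Hypothetical sample data; children appear in the order the DTD requires -->
<INVOICE>
  <INVOICE_NUMBER>1001</INVOICE_NUMBER>
  <DATE>2000-06-15</DATE>
  <!-- CUSTOMER+ requires at least one CUSTOMER element -->
  <CUSTOMER>Acme Hardware</CUSTOMER>
  <INVOICE_ITEMS>
    <!-- ITEM+ requires at least one ITEM element -->
    <ITEM>
      <!-- ITEM_NUM is declared #REQUIRED, so it cannot be omitted -->
      <ITEM_NAME ITEM_NUM="A-17">Claw hammer</ITEM_NAME>
      <QUANTITY>2</QUANTITY>
      <PRICE>12.50</PRICE>
    </ITEM>
  </INVOICE_ITEMS>
  <!-- TOTAL? is optional; the document stays valid without it -->
  <TOTAL>25.00</TOTAL>
</INVOICE>
```

A validating parser should reject this document if, for example, the DATE element were removed, the ITEM_NUM attribute were dropped, or QUANTITY appeared after PRICE. Note that because every element is declared as #PCDATA, the DTD cannot check that QUANTITY or PRICE actually contain numbers.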
Copyright (c) 2000 O'Reilly & Associates. All rights reserved.