A few examples show how this works. Consider the declaration of the
<html> tag, taken from the XHTML DTD:
<!ELEMENT html (head, body)>
This defines the element named html whose content
is a head element followed by a
body element. Notice that you do not enclose the
element names in angle brackets within the DTD; that notation is used
only when the elements are actually used in a document.
Within the same DTD, you can find the declaration of the
<head> tag:
<!ELEMENT head (%head.misc;,
    ((title, %head.misc;, (base, %head.misc;)?) |
     (base, %head.misc;, (title, %head.misc;))))>
Gulp. What on earth does this mean? First, notice that there is a
parameter entity named head.misc used several
times in this declaration. Let's go get it:
<!ENTITY % head.misc "(script|style|meta|link|object)*">
Now things are starting to make sense: head.misc
defines a group of elements, from which you may choose one. However,
the trailing asterisk indicates that you may repeat that choice zero or
more times. The net result is that anywhere
%head.misc; appears, you can include zero or more
script, style,
meta, link, or
object elements, in any order. Sound familiar?
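To see the substitution spelled out, here is a minimal Python sketch that expands the parameter entity by hand. The expand function and the entities dictionary are names made up for this illustration. Plain text replacement is also a simplification: a real XML parser applies extra rules when it includes a parameter entity, such as padding the replacement text with spaces.

```python
import re

# Parameter entity definitions, copied from the DTD excerpt above.
entities = {
    "head.misc": "(script|style|meta|link|object)*",
}

# The content model of the head element, also copied from the excerpt.
head_model = (
    "(%head.misc;, "
    "((title, %head.misc;, (base, %head.misc;)?) | "
    "(base, %head.misc;, (title, %head.misc;))))"
)

def expand(model, entities):
    """Replace each %name; reference with that entity's replacement text."""
    return re.sub(r"%([\w.]+);", lambda m: entities[m.group(1)], model)

expanded = expand(head_model, entities)
print(expanded)
```

The printed result is the head content model with every %head.misc; reference replaced by the group of five element names and its trailing asterisk.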
Returning to the head declaration, we see that we
are allowed to begin with any number of the head
miscellaneous elements. We must then make a choice: either a group
consisting of a title element, any number of
miscellaneous items, and an optional base element
followed by more miscellaneous items; or a group consisting of a
base element, miscellaneous items, a
title element, and some more miscellaneous items.
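One way to convince yourself of this reading is to translate the content model into a regular expression and try a few sequences of child elements. The sketch below is only an illustration, not how a validating parser works internally. The is_valid_head function and the name-plus-space encoding of the children are assumptions made for this example.

```python
import re

# Each child element is written as its name followed by one space, so a
# list of children becomes a string such as "meta title base ".
MISC = r"(?:(?:script|style|meta|link|object) )*"   # stands in for %head.misc;

HEAD = re.compile(
    MISC
    + r"(?:"
    + r"title " + MISC + r"(?:base " + MISC + r")?"   # title first, base optional
    + r"|"
    + r"base " + MISC + r"title " + MISC              # base first, title required
    + r")"
)

def is_valid_head(children):
    """Return True if the list of child element names fits the content model."""
    encoded = "".join(name + " " for name in children)
    return HEAD.fullmatch(encoded) is not None

print(is_valid_head(["title"]))                          # True
print(is_valid_head(["meta", "title", "link", "base"]))  # True
print(is_valid_head(["base", "style", "title"]))         # True
print(is_valid_head(["meta", "link"]))                   # False: no title
print(is_valid_head(["title", "title"]))                 # False: two titles
print(is_valid_head(["base", "title", "base"]))          # False: two bases
```

The two alternatives in the regular expression correspond to the two groups in the declaration: one for a title that comes before any base, and one for a base that comes before the title.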
Why such a convoluted rule for the <head>
tag? Why not just write:
<!ELEMENT head (script|style|meta|link|object|base|title)*>
which allows any number of the head elements to
appear, or none at all? Because the HTML standard requires that every
<head> tag contain exactly one
<title> tag. It also allows at most one
<base> tag. Beyond that, the standard
allows any number of the other head elements,
in any order.
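The gap between the simple rule and the standard's requirements is easy to demonstrate. The following Python sketch models the simple rule as "any allowed element, any number of times" and compares it with the constraints just described. Both function names are hypothetical, and the counting check is a stand-in for those constraints rather than a DTD validator.

```python
# Element names permitted by the simpler declaration.
ALLOWED = {"script", "style", "meta", "link", "object", "base", "title"}

def loose_rule_accepts(children):
    """Model of the simpler rule: any allowed element, any number of times."""
    return all(name in ALLOWED for name in children)

def standard_accepts(children):
    """Constraints from the text: exactly one title and at most one base."""
    return (
        loose_rule_accepts(children)
        and children.count("title") == 1
        and children.count("base") <= 1
    )

samples = ([], ["title", "title"], ["base", "title", "base"], ["meta", "title"])
for children in samples:
    print(children, loose_rule_accepts(children), standard_accepts(children))
```

The simpler rule accepts all four samples, including an empty head and a head with two titles. Only the last sample satisfies the constraints the standard imposes.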
Put simply, the head element declaration, while
initially confusing, lets a validating XML processor ensure that exactly
one title element appears in the
head element and that, if specified, just one
base element appears as well. It also allows
any number of the other head elements, in any order.
This one example demonstrates a lot of the power of XML: the ability
to define commonly used elements using parameter entities and the use
of grammar rules to dictate document syntax. If you can work through
the head element declaration and understand it,
you are well on your way to reading any XML DTD.