

XML in a Nutshell

3.3. Attribute Declarations

As well as declaring its elements, a valid document must declare all the elements' attributes. This is done with ATTLIST declarations. A single ATTLIST can declare multiple attributes for a single element type. However, if the same attribute is repeated on multiple elements, then it must be declared separately for each element where it appears. (Later in this chapter you'll see how to use parameter entity references to make this repetition less burdensome.)

For example, this ATTLIST declaration declares the source attribute of the image element:

<!ATTLIST image source CDATA #REQUIRED>

It says that the image element has an attribute named source. The value of the source attribute is character data, and instances of the image element in the document are required to provide a value for the source attribute.

A single ATTLIST declaration can declare multiple attributes for the same element. For example, this ATTLIST declaration not only declares the source attribute of the image element, but also the width, height, and alt attributes:

<!ATTLIST image source CDATA #REQUIRED
                width  CDATA #REQUIRED
                height CDATA #REQUIRED
                alt    CDATA #IMPLIED
>

This declaration says the source, width, and height attributes are required. However, the alt attribute is optional and may be omitted from particular image elements. All four attributes are declared to contain character data, the most generic attribute type.

This declaration has the same effect and meaning as four separate ATTLIST declarations, one for each attribute. Whether to use one ATTLIST declaration per attribute is a matter of personal preference, but most experienced DTD designers prefer the multiple-attribute form. Given judicious application of whitespace, it's no less legible than the alternative.
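
For comparison, the one-declaration-per-attribute form of the same image declaration might be written as follows. This only restates the declaration above in the alternative style; it introduces no new attributes or types.

```xml
<!ATTLIST image source CDATA #REQUIRED>
<!ATTLIST image width  CDATA #REQUIRED>
<!ATTLIST image height CDATA #REQUIRED>
<!ATTLIST image alt    CDATA #IMPLIED>
```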

3.3.1. Attribute Types

In merely well-formed XML, attribute values can be any string of text. The only restrictions are that any occurrences of < or & must be escaped as &lt; and &amp;, and that whichever kind of quotation mark, single or double, is used to delimit the value must also be escaped (as &apos; or &quot;) when it appears inside the value. However, a DTD allows you to make somewhat stronger statements about the content of an attribute value. Indeed, these are stronger statements than can be made about the contents of an element. For instance, you can say that an attribute value must be unique within the document, that it must be a legal XML name token, or that it must be chosen from a fixed list of values.
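
As a small illustration of the escaping rules for well-formed attribute values, the following fragment uses hypothetical notes and note elements with a text attribute. These names and values are invented for the example and do not come from any particular DTD.

```xml
<notes>
  <!-- < and & are always escaped inside attribute values -->
  <note text="1 &lt; 2 &amp; 2 &lt; 3"/>
  <!-- double-quoted value: inner double quotes become &quot; -->
  <note text="She said &quot;hello&quot; and it's fine"/>
  <!-- single-quoted value: inner apostrophes become &apos; -->
  <note text='It&apos;s the "other" delimiter'/>
</notes>
```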

There are ten attribute types in XML. They are:

  • CDATA

  • NMTOKEN

  • NMTOKENS

  • Enumeration

  • ENTITY

  • ENTITIES

  • ID

  • IDREF

  • IDREFS

  • NOTATION

These are the only attribute types allowed. A DTD cannot say that an attribute value must be an integer or a date between 1966 and 2002, for example.
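
To make this limitation concrete, here is a sketch of a complete document that reuses the image declaration from earlier in this section. The DOCTYPE wrapper, the EMPTY content model for image, and the attribute values are assumptions added only so the example stands alone. The document is valid even though width and height are plainly not numbers, because CDATA accepts any text.

```xml
<?xml version="1.0"?>
<!DOCTYPE image [
  <!ELEMENT image EMPTY>
  <!ATTLIST image source CDATA #REQUIRED
                  width  CDATA #REQUIRED
                  height CDATA #REQUIRED
                  alt    CDATA #IMPLIED
  >
]>
<!-- Valid: the DTD cannot require width and height to be integers -->
<image source="bus.jpg" width="rather wide" height="tall"/>
```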

3.3.1.4. Enumeration

An enumeration is the only attribute type that is not an XML keyword. Rather, it is a list of all possible values for the attribute, separated by vertical bars. Each possible value must be an XML name token. For example, the following declarations say that the value of the month attribute of a date element must be one of the twelve English month names, that the value of the day attribute must be an integer between 1 and 31, and that the value of the year attribute must be an integer between 1970 and 2009:

<!ATTLIST date month (January | February | March | April | May | June
  | July | August | September | October | November | December) #REQUIRED
>
<!ATTLIST date day (1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12
  | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25
  | 26 | 27 | 28 | 29 | 30 | 31) #REQUIRED
>
<!ATTLIST date year (1970 | 1971 | 1972 | 1973 | 1974 | 1975 | 1976
  | 1977 | 1978 | 1979 | 1980 | 1981 | 1982 | 1983 | 1984 | 1985 | 1986
  | 1987 | 1988 | 1989 | 1990 | 1991 | 1992 | 1993 | 1994 | 1995 | 1996
  | 1997 | 1998 | 1999 | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006
  | 2007 | 2008 | 2009 ) #REQUIRED
>
<!ELEMENT date EMPTY>

Given this DTD, this date element is valid:

<date month="January" day="22" year="2001"/>

However, these date elements are invalid:

<date month="01"      day="22" year="2001"/>
<date month="Jan"     day="22" year="2001"/>
<date month="January" day="02" year="2001"/>
<date month="January" day="2"  year="1969"/>
<date month="Janvier" day="22" year="2001"/>

This trick works here because all the desired values happen to be legal XML name tokens. However, we could not use the same trick if the possible values included whitespace or any punctuation besides the underscore, hyphen, colon, and period.
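
For instance, a hypothetical office element with a city attribute could not list New York as an enumerated value, because the space would split it into two tokens. One possible workaround, sketched here with invented element, attribute, and value names, is to substitute one of the allowed characters:

```xml
<?xml version="1.0"?>
<!DOCTYPE offices [
  <!ELEMENT offices (office*)>
  <!ELEMENT office EMPTY>
  <!-- "New York" and "St. Louis, MO" would not be legal choices here -->
  <!ATTLIST office city (New_York | St.Louis | Winston-Salem) #REQUIRED>
]>
<offices>
  <office city="New_York"/>
  <office city="Winston-Salem"/>
</offices>
```

Whether the substituted spellings are acceptable depends on the application that consumes the document, since it must translate them back to the display form itself.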

3.3.1.6. IDREF

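An IDREF type attribute refers to the ID type attribute of some element in the document. Thus, its value must be an XML name, just as an ID value must be, and it must match the value of an ID type attribute that actually appears somewhere in the document. IDREF attributes are commonly used to establish relationships between elements when simple containment won't suffice.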

For example, imagine an XML document that contains a list of project elements and employee elements. Every project has a project_id ID type attribute, and every employee has a social_security_number ID type attribute. Furthermore, each project has team_member child elements that identify who's working on the project, and each employee element has assignment children that indicate to which projects that employee is assigned. Since each project is assigned to multiple employees and some employees are assigned to more than one project, it's not possible to make the employees children of the projects or the projects children of the employees. The solution is to use IDREF type attributes like this:

<project project_id="p1">
  <goal>Develop Strategic Plan</goal>
  <team_member person="ss078-05-1120"/>
  <team_member person="ss987-65-4320"/>
</project>
<project project_id="p2">
  <goal>Deploy Linux</goal>
  <team_member person="ss078-05-1120"/>
  <team_member person="ss9876-12-3456"/>
</project>
<employee social_security_number="ss078-05-1120">
  <name>Fred Smith</name>
  <assignment project_id="p1"/>
  <assignment project_id="p2"/>
</employee>
<employee social_security_number="ss987-65-4320">
  <name>Jill Jones</name>
  <assignment project_id="p1"/>
</employee>
<employee social_security_number="ss9876-12-3456">
  <name>Sydney Lee</name>
  <assignment project_id="p2"/>
</employee>

In this example, the project_id attribute of the project element and the social_security_number attribute of the employee element would be declared to have type ID. The person attribute of the team_member element and the project_id attribute of the assignment element would have type IDREF. The relevant ATTLIST declarations look like this:

<!ATTLIST employee social_security_number ID    #REQUIRED>
<!ATTLIST project  project_id             ID    #REQUIRED>
<!ATTLIST team_member person              IDREF #REQUIRED>
<!ATTLIST assignment  project_id          IDREF #REQUIRED>

These declarations constrain the person attribute of the team_member element and the project_id attribute of the assignment element to match the ID of something in the document. However, they do not constrain the person attribute of the team_member element to match only employee IDs or constrain the project_id attribute of the assignment element to match only project IDs. It would be valid (though not necessarily correct) for a team_member to hold the ID of another project or even the same project.
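
The following sketch shows what that looseness permits. The four ATTLIST declarations are the ones given above; the staffing root element and all of the ELEMENT declarations are assumptions added so the document can be checked by a validating parser. Both references point at the wrong kind of element, yet the document is valid because each one matches some ID in the document.

```xml
<?xml version="1.0"?>
<!DOCTYPE staffing [
  <!ELEMENT staffing (project | employee)*>
  <!ELEMENT project (goal, team_member+)>
  <!ELEMENT goal (#PCDATA)>
  <!ELEMENT team_member EMPTY>
  <!ELEMENT employee (name, assignment*)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT assignment EMPTY>
  <!ATTLIST employee social_security_number ID    #REQUIRED>
  <!ATTLIST project  project_id             ID    #REQUIRED>
  <!ATTLIST team_member person              IDREF #REQUIRED>
  <!ATTLIST assignment  project_id          IDREF #REQUIRED>
]>
<staffing>
  <project project_id="p1">
    <goal>Develop Strategic Plan</goal>
    <!-- Valid but wrong: p1 is the ID of a project, not of an employee -->
    <team_member person="p1"/>
  </project>
  <employee social_security_number="ss078-05-1120">
    <name>Fred Smith</name>
    <!-- Valid but wrong: this ID belongs to an employee, not a project -->
    <assignment project_id="ss078-05-1120"/>
  </employee>
</staffing>
```

By contrast, a reference such as person="ss000-00-0000" would make the document invalid, because no element in it carries that ID. Checking that a reference points at the right kind of element is left to the application.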

3.3.2. Attribute Defaults

As well as providing a data type, each ATTLIST declaration includes a default declaration for that attribute. There are four possibilities for this default:

#IMPLIED
The attribute is optional. Each instance of the element may or may not provide a value for the attribute. No default value is provided.

#REQUIRED
The attribute is required. Each instance of the element must provide a value for the attribute. No default value is provided.

#FIXED
The attribute value is constant and immutable. This attribute has the specified value regardless of whether the attribute is explicitly noted on an individual instance of the element. If it is included, though, it must have the specified value.

Literal
The actual default value is given as a quoted string.

For example, this ATTLIST declaration says that person elements can have but do not have to have born and died attributes:

<!ATTLIST person born CDATA #IMPLIED
                 died CDATA #IMPLIED
>

This ATTLIST declaration says that every circle element must have center_x, center_y, and radius attributes:

<!ATTLIST circle center_x NMTOKEN #REQUIRED
                 center_y NMTOKEN #REQUIRED
                 radius   NMTOKEN #REQUIRED
>
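
As a sketch of how this plays out in an instance document, the following example wraps the circle declaration in a hypothetical drawing root element; the root element, the EMPTY content model, and the attribute values are assumptions. Both circle elements are valid. A circle that omitted radius would not be.

```xml
<?xml version="1.0"?>
<!DOCTYPE drawing [
  <!ELEMENT drawing (circle*)>
  <!ELEMENT circle EMPTY>
  <!ATTLIST circle center_x NMTOKEN #REQUIRED
                   center_y NMTOKEN #REQUIRED
                   radius   NMTOKEN #REQUIRED
  >
]>
<drawing>
  <!-- Valid: all three required attributes are present -->
  <circle center_x="10" center_y="-3.5" radius="2"/>
  <!-- Also valid: NMTOKEN does not require the value to be a number -->
  <circle center_x="left" center_y="top" radius="big"/>
</drawing>
```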

This ATTLIST declaration says that every biography element has an xmlns:xlink attribute and that the value of that attribute is http://www.w3.org/1999/xlink, even if the start-tag of the element does not explicitly include an xmlns:xlink attribute:

<!ATTLIST biography
   xmlns:xlink CDATA #FIXED "http://www.w3.org/1999/xlink">
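
A minimal sketch of the two valid ways an instance can use this fixed attribute follows. The biographies root element, the content models, and the element text are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE biographies [
  <!ELEMENT biographies (biography*)>
  <!ELEMENT biography (#PCDATA)>
  <!ATTLIST biography
     xmlns:xlink CDATA #FIXED "http://www.w3.org/1999/xlink">
]>
<biographies>
  <!-- Attribute omitted: a parser that reads the DTD supplies the fixed value -->
  <biography>First biography text</biography>
  <!-- Attribute present with exactly the fixed value: also valid -->
  <biography xmlns:xlink="http://www.w3.org/1999/xlink">Second biography text</biography>
</biographies>
```

A biography element that set xmlns:xlink to any other value, such as http://www.example.com/, would be invalid.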

This ATTLIST declaration says that every web_page element has a protocol attribute. If a particular web_page element doesn't have an explicit protocol attribute, then the parser will supply one with the default value "http":

<!ATTLIST web_page protocol NMTOKEN "http">
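
Unlike a #FIXED value, a literal default can be overridden by an individual element. The sketch below assumes a hypothetical site root element and simple text content for web_page; only the ATTLIST declaration comes from the text above.

```xml
<?xml version="1.0"?>
<!DOCTYPE site [
  <!ELEMENT site (web_page*)>
  <!ELEMENT web_page (#PCDATA)>
  <!ATTLIST web_page protocol NMTOKEN "http">
]>
<site>
  <!-- No protocol attribute: a parser that reads the DTD reports protocol="http" -->
  <web_page>www.example.com</web_page>
  <!-- Explicit value overrides the default -->
  <web_page protocol="ftp">ftp.example.com</web_page>
</site>
```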



Copyright © 2002 O'Reilly & Associates. All rights reserved.