HTML & XHTML: The Definitive Guide

16.2. Creating XHTML Documents

For the most part, creating an XHTML document is no different from creating an HTML document. Using your favorite text editor, simply add the markup elements in the right order to your document's contents, and display it using your favorite browser. To be strictly correct ("valid," as they say at the W3C), your XHTML document needs a boilerplate declaration up front that specifies the DTD you used to create the document and defines a namespace for the document.

16.2.1. Declaring Document Types

For an XHTML browser to correctly parse and display your XHTML document, you should tell it which version of XML is being used to create the document. You must also state which XHTML DTD defines the elements in your document.

The XML version declaration uses a special XML processing directive. In general, these XML directives begin with <? and end with ?>, but otherwise they look like typical tags in your document.[82]

[82]<! was already taken.

To declare that you are using XML Version 1.0, place this directive on the first line of your document:

<?xml version="1.0" encoding="UTF-8"?>

This tells the browser that you are using XML 1.0 and that the document is stored as UTF-8, an encoding of the Unicode character set built from 8-bit units and the one most commonly used today. The encoding attribute's value should name the encoding in which the file is actually saved. Names for encodings other than UTF-8 are listed in the IANA character set registry.
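
As a rough illustration of why the declared encoding has to match the saved file, the following Python sketch parses the same fragment stored in different ways. Python and its standard xml.etree.ElementTree module are our choice for the demonstration, not something XHTML requires, and the sample text and the parse_text helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content with one non-ASCII character (e with acute accent).
BODY = "<p>caf\u00e9</p>"

def parse_text(declared: str, saved_as: str) -> str:
    """Build a document that declares one encoding, save it in another,
    and return the text the parser recovers."""
    source = f'<?xml version="1.0" encoding="{declared}"?>{BODY}'
    return ET.fromstring(source.encode(saved_as)).text

# Declaration and bytes agree: both documents round-trip cleanly.
utf8_text = parse_text("UTF-8", "utf-8")
latin1_text = parse_text("ISO-8859-1", "iso-8859-1")

# Declaration says UTF-8 but the bytes are ISO-8859-1: the parser rejects it.
try:
    parse_text("UTF-8", "iso-8859-1")
    mismatch_rejected = False
except ET.ParseError:
    mismatch_rejected = True
```

An XML parser is required to stop at such an error, so a wrong encoding declaration makes the document unreadable instead of merely garbling a few characters.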

Once you've gotten the important issue of the XML version squared away, you should then declare the markup language's DTD:

<!DOCTYPE html 
     PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

With this statement, you declare that your document's root element is html, as defined in the DTD whose public identifier is defined as "-//W3C//DTD XHTML 1.0 Strict//EN". The browser may know how to find the DTD matching this public identifier. If it does not, it can use the URL following the public identifier as an alternative location for the DTD.

As you may have noticed, the <!DOCTYPE> directive has told the browser to use the strict XHTML DTD. Here's the one you'll probably use for your transitional XHTML documents:

<!DOCTYPE html 
     PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

And, as you might expect, the <!DOCTYPE> directive for the frame-based XHTML DTD is:

<!DOCTYPE html 
     PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
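
To see the three parts of a <!DOCTYPE> declaration as a parser sees them (root element name, public identifier, and the URL that serves as a fallback location), here is a small Python sketch built on the standard xml.parsers.expat module. The read_doctype helper and the placeholder document are hypothetical; the URL is the location at which the W3C publishes the strict XHTML 1.0 DTD.

```python
import xml.parsers.expat

# The strict XHTML 1.0 declaration, followed by a tiny placeholder document.
DOCUMENT = b"""<!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Placeholder</title></head>
  <body><p>Placeholder</p></body>
</html>"""

def read_doctype(document: bytes) -> dict:
    """Return the root name, public identifier, and system identifier (URL)."""
    found = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Expat reports the URL (system identifier) before the public identifier.
        found.update(root=name, public_id=public_id, system_id=system_id)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(document, True)
    return found

doctype = read_doctype(DOCUMENT)
```

The parser only reports the identifiers here; by default it does not download the DTD from the URL.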

16.2.2. Understanding Namespaces

As described in the last chapter, an XML DTD defines any number of element and attribute names as part of the markup language. These elements and attribute names are stored in a namespace that is unique to the DTD. As you reference elements and attributes in your document, the browser looks them up in the namespace to find out how they should be used.

For instance, the <a> tag's name ("a") and attributes like "href" and "style" are defined in the XHTML DTD and their names are placed in the DTD's namespace. Any "processing agent" -- usually a browser, but your eyes and brain can serve the same function -- can look up the name in the appropriate DTD to figure out what the markup means and what it should do.

With XML, your document actually may use more than one DTD, and therefore need more than one namespace. For example, you might create a transitional XHTML document, but also include special markup for some math expressions according to an XML math language. What happens when both the XHTML DTD and the math DTD use the same name to define different elements, such as <a> for XHTML hypertext and <a> for an absolute value in math? How does the browser choose which namespace to use?

The answer is the xmlns[83] attribute. Use it to define one or more alternative namespaces within your document. It can be placed within the start tag of any element within your document, and its URL-like[84] value defines the namespace that the browser should use for all content within that element.

[83]XML namespace -- xmlns -- get it? This is why XML doesn't let you begin any element or attribute with the three-letter prefix "xml": it's reserved for special XML attributes and elements.

[84]It looks like a URL, and you might think that it references a document that contains the namespace, but alas, it doesn't. It is simply a unique name that identifies the namespace. Display agents use that placeholder to refer to their own resources for how to treat the named element or attribute.

With XHTML, according to the new XML conventions, you should at the very least include an xmlns attribute within your document's <html> tag that identifies the primary namespace used throughout the document:

<html xmlns="http://www.w3.org/1999/xhtml">

If and when you need to include math markup, you use the xmlns attribute again to define the math namespace. So, for instance, you could use the xmlns attribute within some math-specific tag of your otherwise common XHTML document (assuming the MATH element exists, of course):

<div xmlns="http://www.w3.org/1998/Math/MathML">x2/x</div>

In this case, the XML-compliant browser would use the http://www.w3.org/1998/Math/MathML namespace to divine that this is the MATH, not the XHTML, version of the <div> tag, and should therefore be displayed as a division equation.
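
One way to watch a namespace-aware processor tell the two <div> elements apart is to parse a document containing both. The Python sketch below uses the standard xml.etree.ElementTree module, which reports each element name together with its namespace in the form {namespace}name. The sample document is hypothetical, and the math <div> is the same make-believe element used in the example above.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# One <div> inherits the default XHTML namespace from <html>;
# the other overrides it with its own xmlns attribute.
DOCUMENT = f"""<html xmlns="{XHTML_NS}">
  <body>
    <div>An ordinary XHTML division</div>
    <div xmlns="{MATHML_NS}">x2/x</div>
  </body>
</html>"""

root = ET.fromstring(DOCUMENT)

# Collect the fully qualified name of every element whose local name is "div".
div_names = [el.tag for el in root.iter() if el.tag.endswith("}div")]
```

The two entries in div_names share the local name "div" but differ in the namespace part, which is how the processor knows they are different elements.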

It would quickly become tedious if you had to embed the xmlns attribute into each and every <div> tag any time you wanted to show a division equation in your document. A better way -- particularly if you plan to apply it to many different elements in your document -- is to identify and label the namespace at the beginning of your document, and then refer to it by that label as a prefix to the affected element in your document. For example:

<html xmlns="http://www.w3.org/1999/xhtml" 
      xmlns:math="http://www.w3.org/1998/Math/MathML">

The math namespace can now be abbreviated to "math" later in your document. So the streamlined:

<math:div>x2/x</math:div>

now has the same effect as the lengthy earlier example of the math <div> tag containing its own xmlns attribute.
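
The equivalence of the two spellings can be checked with a short Python sketch, again using the standard xml.etree.ElementTree module. Both sample documents and the innermost_tag helper are hypothetical; the prefix name "math" is just a label and could be any legal name.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Long form: the math <div> carries its own xmlns attribute.
LONG_FORM = (
    f'<html xmlns="{XHTML_NS}"><body>'
    f'<div xmlns="{MATHML_NS}">x2/x</div>'
    f'</body></html>'
)

# Short form: the namespace is labeled once on <html> and used as a prefix.
SHORT_FORM = (
    f'<html xmlns="{XHTML_NS}" xmlns:math="{MATHML_NS}"><body>'
    f'<math:div>x2/x</math:div>'
    f'</body></html>'
)

def innermost_tag(document: str) -> str:
    """Return the qualified name of the last element in document order."""
    return list(ET.fromstring(document).iter())[-1].tag

long_tag = innermost_tag(LONG_FORM)
short_tag = innermost_tag(SHORT_FORM)
```

After parsing, the prefix itself is gone: both forms yield the same namespace-qualified element name.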

The vast majority of XHTML authors will never need to define multiple namespaces and so will never have to use fully qualified names containing the namespace prefix. Even so, you should understand that multiple namespaces exist and that you will need to manage them if you choose to embed content based upon one DTD within content defined by another DTD.

16.2.3. A Minimal XHTML Document

As a courtesy to all fledgling XHTML authors, we now present the minimal and correct XHTML document, including all the appropriate XML, XHTML, and namespace declarations. With this most difficult part out of the way, you need only supply content to create a complete XHTML document:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html 
          PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
          "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Every document must have a title</title>
  </head>
  <body>
    ...your content goes here...
  </body>
</html>

Working through the minimal document one element at a time, we begin by declaring that we are basing the document on the XML 1.0 standard and using the UTF-8 encoding of Unicode to express its contents and markup. We then announce, in the familiar HTML-like <!DOCTYPE> statement, that we are following the markup rules defined in the transitional XHTML 1.0 DTD, which allow us free rein to use nearly any and all HTML 4.01 elements in our document.

Our document content actually begins with the <html> tag, which has its xmlns attribute declare that the XHTML namespace will be the default namespace for the entire document. Also note the lang attribute, in both the XML and XHTML namespaces, which declares that the document language is English.

Finally, we include the familiar document <head> and <body> tags, along with the required <title> tag.
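
As a quick sanity check, a complete version of the minimal document can be run through an XML parser to confirm that it is well-formed and that each piece described above is present. The Python sketch below uses the standard xml.etree.ElementTree module; the summarize helper is hypothetical, and the check covers well-formedness only, not validity against the DTD.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
XML_NS = "http://www.w3.org/XML/1998/namespace"  # namespace bound to the xml: prefix

MINIMAL = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
          PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
          "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Every document must have a title</title>
  </head>
  <body>
    ...your content goes here...
  </body>
</html>"""

def summarize(document: bytes) -> dict:
    """Parse the document and report the pieces the text walks through."""
    root = ET.fromstring(document)  # raises ET.ParseError if not well-formed
    ns = {"x": XHTML_NS}
    title = root.find("x:head/x:title", ns)
    return {
        "root": root.tag,
        "title": title.text if title is not None else None,
        "has_body": root.find("x:body", ns) is not None,
        "xml_lang": root.get(f"{{{XML_NS}}}lang"),  # the xml:lang attribute
        "lang": root.get("lang"),                    # the plain lang attribute
    }

summary = summarize(MINIMAL)
```

Note that the parser reports xml:lang and lang as two separate attributes, which is why the minimal document sets both.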

Copyright © 2002 O'Reilly & Associates. All rights reserved.