Every HTML document should conform to the HTML SGML DTD, the formal Document Type Definition that specifies the HTML standard. The DTD defines the tags and syntax used to create an HTML document. You can tell the browser which DTD your document complies with by placing a special SGML (Standard Generalized Markup Language) command on the first line of the document:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
This cryptic message indicates that your document is intended to comply with the HTML 3.2 final DTD defined by the World Wide Web Consortium (W3C). Other versions of the DTD define more restricted versions of the HTML standard, and not all browsers support all versions of the HTML DTD. In fact, specifying any other doctype may cause the browser to misinterpret your document when displaying it for the user. It's also unclear which doctype to use when a document includes tags that are not part of any standard but are popular features of a popular browser (the Netscape extensions, for instance), or tags from the deprecated HTML 3.0 standard, for which a DTD was never released.
Almost no one precedes their HTML documents with the SGML doctype command. Because of the confusion of versions and standards, we don't recommend that you include the prefix with your HTML documents either. There are other mechanisms to better define your document contents, such as the version attribute for the <html> tag.
As we saw earlier, the <html> and </html> tags serve to delimit the beginning and end of an HTML document. Since the typical browser can easily infer from the enclosed source that it is an HTML document, you don't really need to include the tag in your source document.
That said, it's considered good form to include this tag so that other tools, particularly more mundane text-processing ones, can recognize your document as an HTML document. At the very least, the presence of both the beginning and ending <html> tags lets you confirm that neither the beginning nor the end of the document has been inadvertently deleted.
Inside the <html> tag and its end tag are the document's head and body. Within the head, you'll find tags that identify the document and define its place within a document collection. Within the body is the actual document content, defined by tags that determine the layout and appearance of the document text. As you might expect, the document head is contained within a <head> tag and the body is within a <body> tag, both of which are defined below.
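The arrangement described above can be sketched as a minimal document skeleton. The title and paragraph text are placeholders invented for illustration; only the nesting of the tags is the point.

```html
<html>
<!-- The head identifies the document -->
<head>
<title>A Placeholder Title</title>
</head>
<!-- The body holds the actual document content -->
<body>
<p>Placeholder document content.</p>
</body>
</html>
```

Note that the head and body sit side by side inside the <html> tag; neither is nested within the other.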
Netscape Navigator and Internet Explorer extend the <html> tag so that the <body> tag may be replaced by a <frameset> tag, defining one or more display frames that, in turn, contain actual document content. See Chapter 12, Frames, for more information.
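As a rough sketch of that substitution, the following document uses a <frameset> where the <body> would normally appear. The file names contents.html and main.html, the frame names, and the column widths are hypothetical values chosen only for illustration.

```html
<html>
<head>
<title>A Placeholder Frame Document</title>
</head>
<!-- The frameset takes the place of the body tag -->
<frameset cols="30%,70%">
  <!-- Each frame loads its own document; these file names are hypothetical -->
  <frame src="contents.html" name="contents">
  <frame src="main.html" name="main">
  <!-- Content for browsers that do not display frames -->
  <noframes>
    <body>
    <p>This document uses frames, which your browser does not display.</p>
    </body>
  </noframes>
</frameset>
</html>
```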
By far, the most common form of the <html> tag is simply:
<html> ...document head and body content </html>
When the <html> tag appears without the version attribute, both the document server and the browser assume that the server tells the browser which version of HTML the document uses.
The <html> version attribute defines the HTML standard version used to compose the document. If included, the value of the version attribute should read exactly:
version="-//W3C//DTD HTML 3.2 Final//EN"
This attribute identifies an HTML document's origins and contents better than the SGML doctype command does. However, some browsers may alter their processing of the document based on the HTML version specified by this attribute, so be careful. Again, the confusion of extensions and versions and the lack of standards guidance make us uneasy, and we do not recommend that you include version information in your document, except perhaps as part of a leading comment.
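One way such a leading comment might look is sketched below. The wording of the comment and the placeholder title and text are assumptions made for illustration, not a prescribed format. Because browsers ignore comments, the version string here serves only as a note to human readers and tools.

```html
<!-- Written to the HTML 3.2 Final DTD: -//W3C//DTD HTML 3.2 Final//EN -->
<html>
<head>
<title>A Placeholder Title</title>
</head>
<body>
<p>Placeholder document content.</p>
</body>
</html>
```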