The Hypertext Transfer Protocol (HTTP) is the language that Web clients
and Web servers use to communicate with each other. It is essentially
the backbone of the Web.
While HTTP is largely the realm of server and client programming,
a firm understanding of HTTP is also important for CGI programming. In addition, sometimes HTTP filters back to the users--for example, when server error codes are reported in a browser window. In this book, we cover HTTP in four chapters:
- In the current chapter (Chapter 17, HTTP Overview), we give a brief
introduction to HTTP, the structure of HTTP transactions,
and a discussion of client methods.
- In Chapter 18, Server Response Codes, we cover the valid status codes
used in HTTP server responses.
- In Chapter 19, HTTP Headers, we list the headers used by
both clients and servers under HTTP.
- Finally, in Chapter 20, Media Types and Subtypes, we cover the Internet media types
used under HTTP.
All HTTP transactions follow the same general format.
Each client request and server response has three parts: the request or
response line, a header section, and the entity body. The client
initiates a transaction as follows:
- The client contacts the server at a designated port number (by default, 80).
Then it sends a document request by specifying an HTTP command called a
method, followed by a document address, and an HTTP version number. For example:
GET /index.html HTTP/1.0
uses the GET method to request the document index.html using
version 1.0 of HTTP.
HTTP methods are discussed in more detail later in this chapter.
- Next, the client sends optional header information to inform the server of
its configuration and the document formats it will accept. All header
information is given line by line, each with a header name and value.
Chapter 19, HTTP Headers, lists the valid HTTP headers.
For example, this header information sent by the client indicates its
name and version number and specifies several document preferences:
User-Agent: Mozilla/2.02Gold (WinNT; I)
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
The client sends a blank line to end the header.
- After sending the request and headers, the client may send additional
data. This data is mostly used by CGI programs invoked with the POST method. It may also be used by clients like Netscape Navigator-Gold to publish an edited page back onto the Web server.
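The three parts of a client request described above can be sketched in Python. This is a minimal illustration rather than a complete client: the function name build_request, the CGI path, and the form data are hypothetical, while the header values are the ones shown earlier.

```python
def build_request(method, path, headers, body=b"", version="HTTP/1.0"):
    """Assemble a raw HTTP request: request line, headers, blank line, body."""
    lines = [f"{method} {path} {version}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # A blank line marks the end of the header section.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


# A GET request with no entity body.
request = build_request(
    "GET",
    "/index.html",
    {
        "User-Agent": "Mozilla/2.02Gold (WinNT; I)",
        "Accept": "image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*",
    },
)

# A POST request carrying additional data for a (hypothetical) CGI program.
form_data = b"name=value"
post = build_request(
    "POST",
    "/cgi-bin/example.cgi",
    {
        "Content-Type": "application/x-www-form-urlencoded",
        "Content-Length": str(len(form_data)),
    },
    body=form_data,
)

print(request.decode("ascii"))
print(post.decode("ascii"))
```

Sending either byte string over a TCP connection to port 80 of a server would start a transaction; the sketch stops short of any network access.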
The server responds in the following way to the client's request:
- The server replies with a status line containing three fields: HTTP version,
status code, and description. The HTTP version indicates the version of HTTP
that the server is using to respond.
The status code is a three-digit number that indicates the
result of the client's request. The description following the status
code is just human-readable text that describes the status code. For example, this status line:
HTTP/1.0 200 OK
indicates that the server uses version 1.0 of HTTP in its response. A status
code of 200 means that the client's request was successful and the requested
data will be supplied after the headers.
Chapter 18, Server Response Codes, contains a listing of the status codes and their descriptions.
- After the status line, the server sends header information to the client
about itself and the requested document.
HTTP headers are covered in Chapter 19, HTTP Headers. For example:
Date: Fri, 20 Sep 1996 08:17:58 GMT
Last-modified: Mon, 17 Jun 1996 21:53:08 GMT
A blank line ends the header.
- If the client's request is successful, the requested data is sent. This
data may be a copy of a file, or the response from a CGI program. If the
client's request could not be fulfilled, additional data may be a human-readable
explanation of why the server could not fulfill the request.
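The structure of the server's response can be illustrated by taking one apart. The following Python sketch assumes a well-formed response; the function name parse_response is hypothetical, and the entity body is a placeholder rather than a real document.

```python
def parse_response(raw):
    """Split a raw HTTP response into status line fields, headers, and body."""
    # The first blank line separates the header section from the entity body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line: HTTP version, three-digit status code, description.
    version, code, description = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return version, int(code), description, headers, body


raw = (
    b"HTTP/1.0 200 OK\r\n"
    b"Date: Fri, 20 Sep 1996 08:17:58 GMT\r\n"
    b"Last-modified: Mon, 17 Jun 1996 21:53:08 GMT\r\n"
    b"\r\n"
    b"<html>...</html>"  # placeholder body
)
version, code, description, headers, body = parse_response(raw)
print(version, code, description)
print(headers["last-modified"])
```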
In HTTP 1.0, after the server has finished sending the requested data, it disconnects from the client and the transaction is over, unless a Connection: Keep-Alive header is sent. In HTTP 1.1, however, the default is for the server to maintain the connection and allow the client to make additional requests. Since many documents embed other documents as inline images, frames, applets, etc., this saves the overhead of the client having to repeatedly connect to the same server just to draw a single page. Under HTTP 1.1, therefore, the transaction might cycle back to the beginning until either the client or the server explicitly closes the connection.
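The difference between the two defaults can be expressed as a small decision rule. In this Python sketch, the function name connection_persists is hypothetical, the headers are assumed to be a dictionary with lowercase names, and Connection: close (not named above) is the header HTTP 1.1 uses to close a connection explicitly.

```python
def connection_persists(version, headers):
    """Return True if the connection stays open after the response."""
    token = headers.get("connection", "").strip().lower()
    if version == "HTTP/1.1":
        # HTTP 1.1 keeps the connection open unless one side asks to close it.
        return token != "close"
    # HTTP 1.0 closes by default; Keep-Alive asks for the connection to stay open.
    return token == "keep-alive"


print(connection_persists("HTTP/1.0", {}))
print(connection_persists("HTTP/1.0", {"connection": "Keep-Alive"}))
print(connection_persists("HTTP/1.1", {}))
print(connection_persists("HTTP/1.1", {"connection": "close"}))
```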
Being a stateless protocol, HTTP does not maintain any information
from one transaction to the next, so the next transaction needs to
start all over again. The advantage is that an HTTP server can
serve a lot more clients in a given period of time, since there's no additional overhead for tracking sessions from one connection to the next. The disadvantage is that
more elaborate CGI programs need to use hidden input fields (as
described in Chapter 10, HTML Form Tags) or
external tools such as Netscape cookies
(as described in Chapter 12, Cookies) to maintain
information from one transaction to the next.
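As a rough illustration of carrying state across transactions with a cookie, the following Python sketch uses the standard http.cookies module. The helper names, the cookie name session, and its value are hypothetical; a real CGI program would also have to decide what the value means and how long it remains valid.

```python
from http.cookies import SimpleCookie


def make_set_cookie(name, value):
    """Build the response header a server sends to hand state to the client."""
    cookie = SimpleCookie()
    cookie[name] = value
    return cookie.output()


def read_cookie(header_value, name):
    """Recover a value from the Cookie header the client sends back."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return cookie[name].value if name in cookie else None


# First transaction: the server's response includes this header line.
set_cookie_line = make_set_cookie("session", "abc123")
print(set_cookie_line)

# A later transaction: the client returns the value in its Cookie header,
# so the server can pick up where the earlier transaction left off.
print(read_cookie("session=abc123", "session"))
```

The protocol itself remains stateless in this arrangement: the information travels in the headers of each request and response rather than being remembered by the server between connections.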