1.4. How HTTP Clients Work

Once the server is set up, we can get down to business. The client has the easy end: it wants web action on a particular site, and it sends a request with a URL that begins with http to indicate what service it wants (other common services are ftp for File Transfer Protocol or https for HTTP with Secure Sockets Layer, SSL) and continues with these possible parts:

    //<user>:<password>@<host>:<port>/<url-path>

RFC 1738 defines this syntax and allows the user, password, port, and path parts to be omitted.
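To make the URL anatomy concrete, here is a minimal Python sketch that splits a URL into the parts named in this section, using the standard library's urllib.parse.urlsplit. The function name describe_url, the DEFAULT_PORTS table, and the ftp.example.com address used in the usage note are illustrative assumptions, not anything the book defines; the sketch simply mimics what a browser does when the port or path is left out.

```python
from urllib.parse import urlsplit

# Well-known default ports for the schemes mentioned in the text.
# (Hypothetical helper table, introduced only for this illustration.)
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def describe_url(url):
    """Split a URL into scheme, user, password, host, port, and path."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,      # None when the URL carries no user
        "password": parts.password,  # None when the URL carries no password
        "host": parts.hostname,
        # An omitted port means the scheme's default, as a browser assumes.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        # An omitted path is treated as "/", as most browsers do.
        "path": parts.path or "/",
    }
```

Calling describe_url("http://www.apache.org") fills in port 80 and the path /, while describe_url("ftp://fred:secret@ftp.example.com/pub") also reports the user and password.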
In real life, URLs look more like: http://www.apache.org/ — that is, there is no user and password pair, and there is no port. What happens? The browser observes that the URL starts with http: and deduces that it should be using the HTTP protocol. The client then contacts a name server, which uses DNS to resolve www.apache.org to an IP address. At the time of writing, this was 63.251.56.142. One way to check the validity of a hostname is to go to the operating-system prompt[9] and type:
    ping www.apache.org

If that host is connected to the Internet, a response is returned:

    Pinging www.apache.org [63.251.56.142] with 32 bytes of data:

    Reply from 63.251.56.142: bytes=32 time=278ms TTL=49
    Reply from 63.251.56.142: bytes=32 time=620ms TTL=49
    Reply from 63.251.56.142: bytes=32 time=285ms TTL=49
    Reply from 63.251.56.142: bytes=32 time=290ms TTL=49

    Ping statistics for 63.251.56.142:

A URL can be given more precision by attaching a port number: the web address http://www.apache.org doesn't include a port because it is port 80, the default, and the browser takes it for granted. If some other port is wanted, it is included in the URL after a colon, for example, http://www.apache.org:8000/. We will have more to do with ports later.

The URL always includes a path, even if it is only /. If the path is left out by the careless user, most browsers put it back in. If the path were /some/where/foo.html on port 8000, the URL would be http://www.apache.org:8000/some/where/foo.html.

The client now makes a TCP connection to port number 8000 on the IP address resolved above (63.251.56.142) and sends the following message down the connection (if it is using HTTP 1.0):

    GET /some/where/foo.html HTTP/1.0<CR><LF><CR><LF>

These carriage returns and line feeds (CRLF) are very important because they separate the HTTP header from its body. If the request were a POST, there would be data following. The server sends the response back and closes the connection.

To see it in action, connect again to the Internet, get a command-line prompt, and type the following:

    % telnet www.apache.org 80
    > telnet www.apache.org 80
    GET http://www.apache.org/foundation/contact.html HTTP/1.1
    Host: www.apache.org

On Win98, telnet puts up a dialog box. Click Connect, then Remote System, and change Port from "telnet" to "80". In Terminal preferences, check "local echo". Then type this, followed by two Returns:

    GET http://www.apache.org/foundation/contact.html HTTP/1.1
    Host: www.apache.org

You should see text similar to that which follows.
Some implementations of telnet rather unnervingly don't echo what you type to the screen, so it seems that nothing is happening. Nevertheless, a whole mess of response streams past:

    Trying 64.125.133.20...
    Connected to www.apache.org.
    Escape character is '^]'.
    HTTP/1.1 200 OK
    Date: Mon, 25 Feb 2002 15:03:19 GMT
    Server: Apache/2.0.32 (Unix)
    Cache-Control: max-age=86400
    Expires: Tue, 26 Feb 2002 15:03:19 GMT
    Accept-Ranges: bytes
    Content-Length: 4946
    Content-Type: text/html

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
    <title>Contact Information--The Apache Software Foundation</title>
    </head>
    <body bgcolor="#ffffff" text="#000000" link="#525D76">
    <table border="0" width="100%" cellspacing="0">
    <tr><!-- SITE BANNER AND PROJECT IMAGE -->
    <td align="left" valign="top">
    <a href="http://www.apache.org/"><img src="../images/asf_logo_wide.gif" alt="The Apache Software Foundation" align="left" border="0"/></a>
    </td>
    </tr>
    </table>
    <table border="0" width="100%" cellspacing="4">
    <tr><td colspan="2"><hr noshade="noshade" size="1"/></td></tr>
    <tr>
    <!-- LEFT SIDE NAVIGATION -->
    <td valign="top" nowrap="nowrap">
    <p><b><a href="/foundation/projects.html">Apache Projects</a></b></p>
    <menu compact="compact">
    <li><a href="http://httpd.apache.org/">HTTP Server</a></li>
    <li><a href="http://apr.apache.org/">APR</a></li>
    <li><a href="http://jakarta.apache.org/">Jakarta</a></li>
    <li><a href="http://perl.apache.org/">Perl</a></li>
    <li><a href="http://php.apache.org/">PHP</a></li>
    <li><a href="http://tcl.apache.org/">TCL</a></li>
    <li><a href="http://xml.apache.org/">XML</a></li>
    <li><a href="/foundation/conferences.html">Conferences</a></li>
    <li><a href="/foundation/">Foundation</a></li>
    </menu>
    ...... and so on

Copyright © 2003 O'Reilly & Associates. All rights reserved.
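The telnet exchange described in this section can also be sketched in a few lines of Python, using only the standard socket module. This is a minimal illustration under stated assumptions, not a full HTTP client: the function names build_request, split_response, and fetch are hypothetical, the request line uses a plain path rather than the absolute URL typed into telnet, and a Connection: close header is added (it is not in the book's example) so that the server closes the connection after one response and the read loop ends.

```python
import socket

CRLF = "\r\n"


def build_request(host, path="/", version="1.1"):
    """Return the bytes of a minimal GET request."""
    lines = [f"GET {path} HTTP/{version}"]
    if version == "1.1":
        # HTTP/1.1 requires a Host header. Connection: close is an
        # assumption of this sketch, so the server hangs up when done.
        lines.append(f"Host: {host}")
        lines.append("Connection: close")
    # The empty line (a second CRLF) marks the end of the headers.
    return (CRLF.join(lines) + CRLF + CRLF).encode("ascii")


def split_response(raw):
    """Split raw response bytes into (status_line, headers, body)."""
    # The first blank line separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split(CRLF)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


def fetch(host, path="/", port=80, timeout=10.0):
    """Send one GET over a TCP connection and return the parsed response."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(host, path))
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # the server closed the connection
                break
            chunks.append(chunk)
    return split_response(b"".join(chunks))
```

With a working Internet connection, fetch("www.apache.org", "/foundation/contact.html") plays the role of the telnet session; what comes back depends on the server as it is today. build_request("www.apache.org", "/some/where/foo.html", version="1.0") reproduces the HTTP 1.0 message shown earlier, byte for byte.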