4.1. Sending Data to the Server
In the last couple of chapters, we have referred to the options that
a browser can include with an HTTP request. In a GET request, these
options are included as the query string portion of the URL passed in
the request line. In a POST request, they are included as the content
of the HTTP request. These options are typically generated by HTML
forms.
Each HTML form element has an associated name and
value, like this checkbox:
<INPUT TYPE="checkbox" NAME="send_email" VALUE="yes">
If this checkbox is checked, then the option
send_email with a value of yes
is sent to the web server. Other form elements, which we will look at
in a moment, act similarly.
Before the browser can send form data to the server, it must encode
it. There are currently two encodings for form data. The default
encoding, which has the media type application/x-www-form-urlencoded,
is used almost exclusively. The other encoding, multipart/form-data,
is primarily used with forms that allow the user to upload files to
the web server. We will look at this in Section 5.2.4, "File Uploads with CGI.pm".
For now, let's look at how application/x-www-form-urlencoded works.
As we mentioned, each HTML form element has a name and a value.
First, the browser collects the names and values for each element in
the form. It then takes these strings and encodes them according to
the same rules for encoding URL text that we discussed in Chapter 2,
"The Hypertext Transport Protocol". If you recall, characters that
have special meaning for HTTP are replaced with a percent sign and a
two-digit hexadecimal number; spaces are replaced with +. For example,
the string "Thanks for the help!" would be converted to
"Thanks+for+the+help%21".
Next, the browser joins each name and value with an equals sign. For
example, if the user entered "30" when asked for the age, the
key-value pair would be "age=30". The key-value pairs are then joined
together, using the "&" character as a delimiter. Here is an example
of an HTML form:
<HTML>
<HEAD>
<TITLE>Mailing List</TITLE>
</HEAD>
<BODY>
<H1>Mailing List Signup</H1>
<P>Please fill out this form to be notified via email about
updates and future product announcements.</P>
<FORM ACTION="/cgi/register.cgi" METHOD="POST">
<P>
Name: <INPUT TYPE="TEXT" NAME="name"><BR>
Email: <INPUT TYPE="TEXT" NAME="email">
</P>
<HR>
<INPUT TYPE="SUBMIT" VALUE="Submit Registration Info">
</FORM>
</BODY>
</HTML>
Figure 4-1 shows how the form looks in Netscape
with some sample input.
Figure 4-1. Sample HTML form
When this form is submitted, the browser encodes the two named
elements (the submit button has no NAME attribute, so it is not sent)
as:
name=Mary+Jones&email=mjones%40jones.com
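The encoding and joining steps described above can be sketched in a few lines of Python. This is only an illustration, not what any particular browser runs: the helper name encode_form is hypothetical, and Python's quote_plus may treat a few punctuation characters differently than a given browser does. For the sample values used here it produces the same strings shown in the text.

```python
from urllib.parse import quote_plus, urlencode


def encode_form(fields):
    """Encode (name, value) pairs as application/x-www-form-urlencoded.

    Hypothetical helper for illustration: each name and value is
    URL-encoded (spaces become "+"), joined with "=", and the
    resulting pairs are joined with "&".
    """
    pairs = [f"{quote_plus(name)}={quote_plus(value)}" for name, value in fields]
    return "&".join(pairs)


# Encoding a single string: the space becomes "+", the "!" becomes "%21".
print(quote_plus("Thanks for the help!"))  # Thanks+for+the+help%21

# Encoding the two named fields of the mailing list form.
form_fields = [("name", "Mary Jones"), ("email", "mjones@jones.com")]
body = encode_form(form_fields)
print(body)  # name=Mary+Jones&email=mjones%40jones.com

# The standard library's urlencode performs the same steps.
assert body == urlencode(form_fields)
```

In practice, urlencode from the standard library would be the usual choice; the hand-written version is shown only to make the individual steps visible.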
Because the request method is POST in this example, the encoded
string would be added to the HTTP request as the content of that
message. The HTTP request message would look like this:
POST /cgi/register.cgi HTTP/1.1
Host: localhost
Content-Length: 40
Content-Type: application/x-www-form-urlencoded

name=Mary+Jones&email=mjones%40jones.com
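As a rough sketch of how such a message is assembled, the following Python builds the POST request as text. The function name build_post_request is an assumption made for this example, and a real client sends more headers than these. Note that the Content-Length value is the number of bytes in the encoded body (40 for this form data) and that a blank line separates the headers from the content.

```python
from urllib.parse import urlencode


def build_post_request(path, host, fields):
    """Assemble a minimal HTTP/1.1 POST request as text (illustrative only)."""
    body = urlencode(fields)
    # Content-Length counts the bytes of the encoded body, not the
    # characters the user typed into the form.
    content_length = len(body.encode("ascii"))
    header_lines = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        f"Content-Length: {content_length}",
        "Content-Type: application/x-www-form-urlencoded",
    ]
    # HTTP lines end with CRLF; an empty line ends the header section.
    return "\r\n".join(header_lines) + "\r\n\r\n" + body


post_request = build_post_request(
    "/cgi/register.cgi",
    "localhost",
    [("name", "Mary Jones"), ("email", "mjones@jones.com")],
)
print(post_request)
```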
If the request method were set to GET, then the request would be
formatted this way instead:
GET /cgi/register.cgi?name=Mary+Jones&email=mjones%40jones.com HTTP/1.1
Host: localhost
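The GET variant can be sketched the same way. Here the encoded string is appended to the path after a question mark instead of being sent as content. The name build_get_request is hypothetical; the last few lines split the query string back out of the request line and decode it, simply to confirm that the encoding round-trips.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def build_get_request(path, host, fields):
    """Assemble a minimal HTTP/1.1 GET request as text (illustrative only)."""
    query = urlencode(fields)
    # A GET request has no content, so the form data travels in the URL
    # and no Content-Length or Content-Type header is needed.
    return f"GET {path}?{query} HTTP/1.1\r\nHost: {host}\r\n\r\n"


get_request = build_get_request(
    "/cgi/register.cgi",
    "localhost",
    [("name", "Mary Jones"), ("email", "mjones@jones.com")],
)
print(get_request)

# Recover the form data from the request line.
request_line = get_request.split("\r\n")[0]
target = request_line.split(" ")[1]            # path plus query string
decoded = parse_qsl(urlsplit(target).query)    # "+" -> space, "%40" -> "@"
print(decoded)
```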