19.1. Writing a CGI Script

Problem

You want to write a CGI script to process the contents of an HTML form. In particular, you want to access the form contents, and produce valid output in return.

Solution

A CGI script is a server-side program launched by a web server to generate a dynamic document. It receives encoded information from the remote client (the user's browser) via STDIN and environment variables, and it must produce a valid HTTP header and body on STDOUT. The standard CGI module, shown in Example 19.1, painlessly manages the encoding of input and output.

Example 19.1: hiweb

#!/usr/bin/perl -w
# hiweb - load CGI module to decode information given by web server
use strict;

use CGI qw(:standard escapeHTML);

# get a parameter from a form
my $value = param('PARAM_NAME');

# output a document
print header(), start_html("Howdy there!"),
      p("You typed: ", tt(escapeHTML($value))),
      end_html();

Discussion

CGI is just a protocol, a formal agreement between a web server and a separate program. The server encodes the client's form input data, and the CGI program decodes the form and generates output. The protocol says nothing regarding which language the program must be written in; programs and scripts that obey the CGI protocol have been written in C, shell, Rexx, C++, VMS DCL, Smalltalk, Tcl, Python, and (of course) Perl.

The full CGI specification lays out which environment variables hold which data (such as form input parameters) and how it's all encoded. In theory, it should be easy to follow the protocol to decode the input, but in practice, it is surprisingly tricky to get right. That's why we strongly recommend using Lincoln Stein's excellent CGI module. The hard work of handling the CGI requirements correctly and conveniently has already been done, freeing you to write the core of your program without tedious network protocols.
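To give a feel for what the module is sparing you, here is a minimal sketch of decoding one URL-encoded string by hand. The helper name decode_form and the sample string are made up for illustration. The sketch ignores file uploads, character sets, and malformed input, which are the sort of details CGI.pm already handles.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# decode_form (hypothetical helper): turn "a=1&b=x+y" into a hash
# that maps each field name to an array ref of its decoded values
sub decode_form {
    my ($raw) = @_;
    my %form;
    for my $pair (split /[&;]/, $raw) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;                                # a plus stands for a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # %XX is one byte in hex
        }
        push @{ $form{$name} }, $value;             # a field may occur repeatedly
    }
    return \%form;
}

my $form = decode_form('Name=Tom+C.&Choices=red&Choices=%3Cblue%3E');
print "Name: $form->{Name}[0]\n";
print "Choices: @{ $form->{Choices} }\n";
```

Even this toy version has to cope with repeated fields and two layers of escaping, and it still covers only the simplest encoding a browser can send.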

CGI scripts are called in two main ways, referred to as methods (but don't confuse HTTP methods with Perl object methods!). The HTTP GET method is used in document retrievals where an identical request will produce an identical result, such as a dictionary lookup. A GET stores form data in the URL. This means the request can be conveniently bookmarked for canned requests, but it limits the total size of the data sent. The HTTP POST method sends form data in the body of the request, separate from the URL. It has no such size limitation, but a POST cannot be bookmarked. Forms that update information on the server, like mailing in feedback or modifying a database entry, should use POST. Client browsers and intervening proxies are free to cache and refresh the results of GET requests behind your back, but they may not cache POST requests. GET is only safe for short read-only requests, whereas POST is safe for forms of any size, as well as for updates and feedback responses. Therefore, by default, the CGI module uses POST for all forms it generates.
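The difference between the two methods shows up in where a script finds the still-encoded form data. The following is a rough sketch, not what CGI.pm does internally: raw_form_data is a hypothetical helper, and the environment hashes and the in-memory filehandle merely stand in for what a web server would supply.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# raw_form_data (hypothetical helper): return the undecoded form data,
# given an environment hash ref and an input filehandle
sub raw_form_data {
    my ($env, $in) = @_;
    my $method = uc($env->{REQUEST_METHOD} || 'GET');
    if ($method eq 'POST') {
        # POST: the data arrives on standard input, CONTENT_LENGTH bytes of it
        my $len  = $env->{CONTENT_LENGTH} || 0;
        my $body = '';
        read($in, $body, $len) if $len > 0;
        return $body;
    }
    # GET: the data is the part of the URL after the question mark
    my $qs = $env->{QUERY_STRING};
    return defined $qs ? $qs : '';
}

# Simulated GET request
my %get = (REQUEST_METHOD => 'GET', QUERY_STRING => 'word=camel');
print raw_form_data(\%get, \*STDIN), "\n";

# Simulated POST request, using an in-memory filehandle in place of STDIN
my $body = 'comment=hello+there';
open(my $fh, '<', \$body) or die "can't open in-memory handle: $!";
my %post = (REQUEST_METHOD => 'POST', CONTENT_LENGTH => length $body);
print raw_form_data(\%post, $fh), "\n";
```

With the CGI module, param works the same way for both methods, so a script rarely needs to care which one was used.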

With a few exceptions mainly related to file permissions and highly interactive work, CGI scripts can do nearly anything any other program can do. They can send back results in many formats: plain text, HTML documents, sound files, pictures, or anything else specified in the HTTP header. Besides producing plain text or HTML text, they can also redirect the client browser to another location, set cookies, request authentication, and report errors.
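As a hedged illustration of those other kinds of responses, the sketch below builds a redirect, a header that sets a cookie, and an error status using CGI.pm's redirect, cookie, and header functions. The URL, the cookie name, and the message text are placeholders. A real script would send exactly one response, so only the last one is printed here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CGI qw(:standard);

# Send the browser somewhere else (placeholder URL)
my $redirect = redirect('http://www.example.com/moved.html');

# Hand the browser a cookie along with an ordinary HTML header
my $cookie = cookie(-name => 'visits', -value => 1, -expires => '+1h');
my $with_cookie = header(-type => 'text/html', -cookie => $cookie);

# Report an error instead of a document
my $not_found = header(-status => '404 Not Found', -type => 'text/plain');

print $not_found, "Sorry, no such page.\n";
```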

The CGI module provides two different interfaces, a procedural one for casual use, and an object-oriented one for power users with complicated needs. Virtually all CGI scripts should use the simple procedural interface, but unfortunately, most of CGI.pm's documentation uses examples with the original object-oriented approach. For backward compatibility, if you want the simple procedural interface, you need to specifically ask for it using the :standard import tag. See Chapter 12, Packages, Libraries, and Modules, for more on import tags.
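Here is a small side-by-side sketch of the two interfaces. So that it can be run from the command line, it fakes the request by setting two environment variables before the first CGI call; that setup and the field name favorite are assumptions for the demonstration, since a real web server sets these variables itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CGI qw(:standard);    # :standard exports param(), header(), and friends

# Pretend the web server handed us a GET request (demonstration only)
$ENV{REQUEST_METHOD} = 'GET';
$ENV{QUERY_STRING}   = 'favorite=camel';

# Procedural interface: plain function calls on an implicit query object
my $fav_proc = param('favorite');

# Object-oriented interface: the same call as a method on an explicit object
my $q      = CGI->new;
my $fav_oo = $q->param('favorite');

print header('text/plain');
print "procedural:      $fav_proc\n";
print "object-oriented: $fav_oo\n";
```

Both calls return the same value. The procedural form simply saves you from creating and passing around the query object.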

To read the user's form input, pass the param function the name of the field you want. If you had a form field named "favorite", then param("favorite") would return its value. With some types of form fields like scrolling lists, the user can choose more than one option. For these, param returns a list of values, which you could assign to an array.

For example, here's a script that pulls in values of three form fields, the last one having many return values:

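use strict;
use CGI qw(:standard);

my $who   = param("Name");       # single-valued field
my $phone = param("Number");     # single-valued field
my @picks = param("Choices");    # list context: every value chosen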

Called without any arguments, param returns the names of the form parameters in list context, or in scalar context, how many form parameters there were.
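A short sketch of the no-argument form follows. As an assumption for the demonstration, the request is faked by setting environment variables before the first CGI call; the field names are the ones from the example above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CGI qw(:standard);

# Pretend the web server handed us a GET request (demonstration only)
$ENV{REQUEST_METHOD} = 'GET';
$ENV{QUERY_STRING}   = 'Name=Ann&Number=555-1212&Choices=red&Choices=blue';

my @names = param();    # list context: the field names
my $count = param();    # scalar context: how many fields there are

print header('text/plain');
print "$count fields: @names\n";

# Show the first value of each field
for my $name (@names) {
    my $first = param($name);    # scalar context: first value only
    print "$name = $first\n";
}
```

Note that Choices counts as one field even though it carries two values.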

That's all there is to accessing the user's input. Do with it whatever you please, and then generate properly formatted output. This is nearly as easy. Remember that unlike regular programs, a CGI script's output must be formatted in a particular way: it must first emit a set of headers followed by a blank line before it can produce normal output.
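To make the required layout concrete, here is a bare-bones sketch that produces it by hand, without the CGI module. The helper name http_response is invented for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# http_response (hypothetical helper): headers, a blank line, then the body
sub http_response {
    my ($type, $body) = @_;
    return "Content-Type: $type\r\n"    # one header per line
         . "\r\n"                       # the blank line ends the headers
         . $body;                       # everything after it is the document
}

print http_response('text/plain', "Hello, web!\n");
```

Leave out the blank line and the server cannot tell where the headers stop, which typically shows up as a server error rather than as your document.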

As shown in the Solution above, the CGI module helps with output as well as input. The module provides functions for generating HTTP headers and HTML code. The header function builds the text of a header for you. By default, it produces headers for a text/html document, but you can change the Content-Type and supply other optional header parameters as well:

print header( -TYPE    => 'text/plain',
              -EXPIRES => '+3d' );

CGI.pm can also be used to generate HTML. It may seem trivial, but this is where the CGI module shines: the creation of dynamic forms, especially stateful ones such as shopping carts. The CGI module even has functions for generating forms and tables.
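As a rough sketch of form generation, the following builds a small form with CGI.pm's start_form, textfield, popup_menu, submit, and end_form functions. The field names and color choices are placeholders, and the exact markup produced varies between versions of the module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CGI qw(:standard);

# Build the form as a string; start_form() defaults to the POST method
my $form = join "\n",
    start_form(),
    "Name: " . textfield(-name => 'Name'),
    popup_menu(-name => 'Choices', -values => ['red', 'green', 'blue']),
    submit(-name => 'Send'),
    end_form();

print header(), start_html('Pick a color'), $form, end_html();
```

When the same script also processes the submission, these functions fill in the fields with the values from the previous request by default, which is a large part of what makes stateful forms convenient.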

When printing form widgets, the characters &, <, >, and " in HTML output are automatically replaced with their entity equivalents. This is not the case with arbitrary user output. That's why the Solution imports and makes use of the escapeHTML function: if the user types any of those special characters, they won't cause formatting errors in the HTML.
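A quick sketch of what escapeHTML does to hostile-looking input; the sample string is invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CGI qw(:standard escapeHTML);

# Text as a user might have typed it into a form field
my $input = '<b>Tom & "Jerry"</b>';

# The same text, made safe to embed in an HTML page
my $safe = escapeHTML($input);

print "raw:     $input\n";
print "escaped: $safe\n";
```

Printed into a page unescaped, the raw string would turn the text bold; the escaped one shows up in the browser exactly as the user typed it.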

For a full list of functions and their calling conventions, see CGI.pm's documentation, included as POD source within the module itself.

See Also

The documentation for the standard CGI module; Chapter 19 of Learning Perl on "CGI Programming"; http://www.w3.org/CGI/; Recipe 19.7