Chapter 5. CGI.pm
The CGI.pm module has become the standard tool for creating CGI scripts in Perl. It provides a simple interface for most of the common CGI tasks. Not only does it easily parse input parameters, but it also provides a clean interface for outputting headers and a powerful yet elegant way to output HTML code from your scripts.
We will cover most of the basics here and will revisit CGI.pm later to look at some of its other features when we discuss other components of CGI programming. For example, CGI.pm provides a simple way to read and write to browser cookies, but we will wait to review that until we get to our discussion about maintaining state, in Chapter 11, "Maintaining State".
If after reading this chapter you are interested in more information, the author of CGI.pm has written an entire book devoted to it: The Official Guide to Programming with CGI.pm by Lincoln Stein ( John Wiley & Sons).
Because CGI.pm offers so many methods, we'll organize our discussion of CGI.pm into three parts: handling input, generating output, and handling errors. We will look at ways to generate output both with and without CGI.pm. Here is the structure of our chapter:
Let's start with a general overview of CGI.pm.
$ perl -v This is perl, version 5.005 Copyright 1987-1997, Larry Wall Perl may be copied only under the terms of either the Artistic License or the GNU General Public License, which may be found in the Perl 5.0 source kit.
You can verify whether CGI.pm is installed and which version by doing this:
$ perl -MCGI -e 'print "CGI.pm version $CGI::VERSION\n";' CGI.pm version 2.56
If you get something like the following, then you do not have CGI.pm installed, and you will have to download and install it. Appendix B, "Perl Modules", explains how to do this.
Can't locate CGI.pm in @INC (@INC contains: /usr/lib/perl5/i386-linux/5.005 /usr/ lib/perl5 /usr/lib/perl5/site_perl/i386-linux /usr/lib/perl5/site_perl .). BEGIN failed--compilation aborted.
New versions of CGI.pm are released regularly, and most releases include bug fixes. We therefore recommend that you install the latest version and monitor new releases (you can find a version history at the bottom of the cgi_docs.html file distributed with CGI.pm). This chapter discusses features introduced as late as 2.47.
5.1.1. Denial of Service Attacks
Before we get started, you should make a minor change to your copy of CGI.pm. CGI.pm handles HTTP file uploads and automatically saves the contents of these uploads to temporary files. This is a very convenient feature, and we'll talk about this later. However, file uploads are enabled by default in CGI.pm, and it does not impose any limitations on the size of files it will accept. Thus, it is possible for someone to upload multiple large files to your web server and fill up your disk.
Clearly, the vast majority of your CGI scripts do not accept file uploads. Thus, you should disable this feature and enable it only in those scripts where you wish to use it. You may also wish to limit the size of POST requests, which includes file uploads as well as standard forms submitted via the POST method.
To make these changes, locate CGI.pm in your Perl libraries and then search for text that looks like the following:
# Set this to a positive value to limit the size of a POSTing # to a certain number of bytes: $POST_MAX = -1; # Change this to 1 to disable uploads entirely: $DISABLE_UPLOADS = 0;
Set $DISABLE_UPLOADS to 1. You may wish to set $POST_MAX to a reasonable upper bound as well, such as 100KB. POST requests that are not file uploads are processed in memory, so restricting the size of POST requests avoids someone submitting multiple large POST requests that quickly use up available memory on your server. The result looks like this:
# Set this to a positive value to limit the size of a POSTing # to a certain number of bytes: $POST_MAX = 102_400; # 100 KB # Change this to 1 to disable uploads entirely: $DISABLE_UPLOADS = 1;
If you then want to enable uploads and/or allow a greater size for POST requests, you can override these values in your script by setting $CGI::DISABLE_UPLOADS and $CGI::POST_MAX after you use the CGI.pm module, but before you create a CGI.pm object. We will look at how to receive file uploads later in this chapter.
You may need special permission to update your CGI.pm file. If your system administrator for some reason will not make these changes, then you must disable file uploads and limit POST requests on a script by script basis. Your scripts should begin like this:
#!/usr/bin/perl -wT use strict; use CGI; $CGI::DISABLE_UPLOADS = 1; $CGI::POST_MAX = 102_400; # 100 KB my $q = new CGI; . .
Throughout our examples, we will assume that the module has been patched and omit these lines.
5.1.2. The Kitchen Sink
CGI.pm is a big module. It provides functions for accessing CGI environment variables and printing outgoing headers. It automatically interprets form data submitted via POST, via GET, and handles multipart-encoded file uploads. It provides many utility functions to do common CGI-related tasks, and it provides a simple interface for outputting HTML. This interface does not eliminate the need to understand HTML, but it makes including HTML inside a Perl script more natural and easier to validate.
Because CGI.pm is so large, some people consider it bloated and complain that it wastes memory. In fact, it uses many creative ways to increase the efficiency of CGI.pm including a custom implementation of SelfLoader. This means that it loads only code that you need. If you use CGI.pm only to parse input, but do not use it to produce HTML, then CGI.pm does not load the code for producing HTML.
There have also been some alternative, lightweight CGI modules written. One of the lightweight alternatives to CGI.pm was begun by David James; he got together with Lincoln Stein and the result is a new and improved version of CGI.pm that is even smaller, faster, and more modular than the original. It should be available as CGI.pm 3.0 by the time you read this book.
5.1.3. Standard and Object-Oriented Syntax
CGI.pm, like Perl, is powerful yet flexible. It supports two styles of usage: a standard interface and an object-oriented interface. Internally, it is a fully object-oriented module. Not all Perl programmers are comfortable with object-oriented notation, however, so those developers can instead request that CGI.pm make its subroutines available for the developer to call directly.
Here is an example. The object-oriented syntax looks like this:
use strict; use CGI; my $q = new CGI; my $name = $q->param( "name" ); print $q->header( "text/html" ), $q->start_html( "Welcome" ), $q->p( "Hi $name!" ), $q->end_html;
The standard syntax looks like this:
use strict; use CGI qw( :standard ); my $name = param( "name" ); print header( "text/html" ), start_html( "Welcome" ), p( "Hi $name!" ), end_html;
Don't worry about the details of what the code does right now; we will cover all of it during this chapter. The important thing to notice is the different syntax. The first script creates a CGI.pm object and stores it in $q ($q is short for query and is a common convention for CGI.pm objects, although $cgi is used sometimes, too). Thereafter, all the CGI.pm functions are preceded by $q->. The second asks CGI.pm to export the standard functions and simply uses them directly. CGI.pm provides several predefined groups of functions, like :standard , that can be exported into your CGI script.
The standard CGI.pm syntax certainly has less noise. It doesn't have all those $q-> prefixes. Aesthetics aside, however, there are good arguments for using the object oriented syntax with CGI.pm.
Exporting functions has its costs. Perl maintains a separate namespace for different chunks of code referred to as packages. Most modules, like CGI.pm, load themselves into their own package. Thus, the functions and variables that modules see are different from the modules and variables you see in your scripts. This is a good thing, because it prevents collisions between variables and functions in different packages that happen to have the same name. When a module exports symbols (whether they are variables or functions), Perl has to create and maintain an alias of each of the these symbols in your program's namespace, the main namespace. These aliases consume memory. This memory usage becomes especially critical if you decide to use your CGI scripts with FastCGI or mod_perl.
The object-oriented syntax also helps you avoid any possible collisions that would occur if you create a subroutine with the same name as one of CGI.pm's exported subroutines. Also, from a maintenance standpoint, it is clear from looking at the object-oriented script where the code for the header function is: it's a method of a CGI.pm object, so it must be in the CGI.pm module (or one of its associated modules). Knowing where to look for the header function in the second example is much more difficult, especially if your CGI scripts grow large and complex.
Some people avoid the object-oriented syntax because they believe it is slower. In Perl, methods typically are slower than functions. However, CGI.pm is truly an object-oriented module at heart, and in order to provide the function syntax, it must do some fancy footwork to manage an object for you internally. Thus with CGI.pm, the object-oriented syntax is not any slower than the function syntax. In fact, it can be slightly faster.
We will use CGI.pm's object-oriented syntax in most of our examples.
Copyright © 2001 O'Reilly & Associates. All rights reserved.