Chapter 5 - Maintaining State
Introduction
If you've ever written a complicated CGI script, you know that the main inconvenience
of the HTTP architecture is its stateless nature. Once an HTTP transaction is
finished, the server forgets all about it. Even if the same remote user connects
a few seconds later, from the server's point of view it's a completely new interaction
and the script has to reconstruct the previous interaction's state. This makes
even simple applications like shopping carts and multipage questionnaires a
challenge to write.
CGI script developers have come up with a standard bag of tricks for overcoming this restriction. You can save state information inside the fields of fill-out forms, stuff it into the URI as additional path information, save it in a cookie, squirrel it away in a server-side database, or rewrite the URI to include a session ID. In addition to these techniques, the Apache API allows you to maintain state by taking advantage of the persistence of the Apache process itself.
This chapter takes you on a tour of various techniques for maintaining state with the Apache API. In the process, it also shows you how to hook your pages up to relational databases using the Perl DBI library.
Choosing the Right Technique
The main issue in preserving state information is where to store it. Six frequently used places are shown in the following list. They can be broadly broken down into client-side techniques (items 1 through 3) and server-side techniques (items 4 through 6).
1. Store state in hidden fields
2. Store state in cookies
3. Store state in the URI
4. Store state in web server process memory
5. Store state in a file
6. Store state in a database
In client-side techniques the bulk of the state information is saved on the browser's side of the connection. Client-side techniques include those that store information in HTTP cookies and those that put state information in the hidden fields of a fill-out form. In contrast, server-side techniques keep all the state information on the web server host. Server-side techniques include any method for tracking a user session with a session ID.
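As a rough illustration of the hidden-field approach, the sketch below writes a small state dictionary into hidden form fields and reads it back from a submitted form body. It is written in Python purely for illustration; the function names `render_hidden_fields` and `read_state` and the sample field names are hypothetical, not part of any Apache API.

```python
from html import escape
from urllib.parse import parse_qs


def render_hidden_fields(state):
    """Return one <input type="hidden"> tag per state item."""
    return "\n".join(
        '<input type="hidden" name="{}" value="{}">'.format(
            escape(name, quote=True), escape(str(value), quote=True)
        )
        for name, value in state.items()
    )


def read_state(form_body):
    """Recover the state from a URL-encoded form submission."""
    # parse_qs returns a list per field; keep the first value of each.
    return {name: values[0] for name, values in parse_qs(form_body).items()}


# Hypothetical state for a multipage questionnaire.
state = {"page": 2, "answer1": "blue"}
print(render_hidden_fields(state))

# What a browser would send back when the form is submitted.
submitted = "page=2&answer1=blue"
print(read_state(submitted))
```

Note that every value comes back as a string, and that nothing here stops the user from changing the fields before submitting them.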
Each technique for maintaining state has unique advantages and disadvantages.
You need to choose the one that best fits your application. The main advantage
of the client-side techniques is that they require very little overhead for
the web server: no data structures to maintain in memory, no database lookups,
and no complex computations. The disadvantage is that client-side techniques
require the cooperation of remote users and their browser software. If you store
state information in the hidden fields of an HTML form, users are free to peek
at the information (using the browser's "View Source" command) or even to try
to trick your application by sending a modified version of the form back to
you.[1] If you use HTTP cookies to store state information,
you have to worry about older browsers that don't support the HTTP cookie protocol
and the large number of users (estimated at up to 20 percent) who disable cookies
out of privacy concerns. If the amount of state information you need to save
is large, you may also run into bandwidth problems when transmitting the information
back and forth.
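One common defense against modified forms, which the text above does not itself describe, is to send a message authentication code (MAC) alongside the hidden data and check it when the form comes back. The following is a minimal Python sketch under that assumption; `SECRET_KEY`, `sign`, and `verify` are hypothetical names, and the key shown is only a placeholder.

```python
import hashlib
import hmac

# Assumption: a secret known only to the server. This value is a placeholder.
SECRET_KEY = b"replace-with-a-long-random-secret"


def sign(value):
    """Return a hex MAC for value, computed with the server-side secret."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(value, mac):
    """Return True only if mac is the MAC that sign() produces for value."""
    # compare_digest does a constant-time comparison of the two hex strings.
    return hmac.compare_digest(sign(value), mac)


price = "price=19.95"
mac = sign(price)  # would travel in a second hidden field

print(verify(price, mac))         # the form as the server sent it
print(verify("price=0.01", mac))  # the form after a user edited the price
```

The MAC lets the server detect tampering, but it does not hide the data: users can still read the hidden fields with "View Source".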
Server-side techniques solve some of the problems of client-side methods but introduce their own issues. Typically you'll create a "session object" somewhere on the web server system. This object contains all the state information associated with the user session. For example, if the user has completed several pages of a multipage questionnaire, the session will hold the current page number and the responses to previous pages' questions. If the amount of state information is small, and you don't need to hold onto it for an extended period of time, you can keep it in the web server's process memory. Otherwise, you'll have to stash it in some long-term storage, such as a file or database. Because the information is maintained on the server's side of the connection, you don't have to worry about the user peeking at it or modifying it inappropriately.
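The questionnaire example can be sketched as a session object held in process memory. This is an illustrative Python sketch, not the chapter's implementation; the `SESSIONS` dictionary and the helper names are hypothetical.

```python
import secrets

# Assumption: one dictionary per server process, keyed by session ID.
SESSIONS = {}


def new_session():
    """Create an empty questionnaire session and return its ID."""
    session_id = secrets.token_hex(16)  # 32 random hex characters
    SESSIONS[session_id] = {"page": 1, "responses": {}}
    return session_id


def record_answers(session_id, answers):
    """Store the answers for the current page and advance to the next one."""
    session = SESSIONS[session_id]
    session["responses"][session["page"]] = answers
    session["page"] += 1
    return session


sid = new_session()
record_answers(sid, {"name": "Fred"})
record_answers(sid, {"color": "blue"})
print(SESSIONS[sid])
```

A dictionary like this exists only inside the process that created it and disappears when that process exits, which is why it suits only small, short-lived state.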
However, server-side techniques are more complex than client-side ones. First, because these techniques must manage the information from multiple sessions simultaneously, you must worry about such things as database and file locking. Otherwise, you face the possibility of leaving the session storage in an inconsistent state when two web server processes try to update it simultaneously. Second, you have to decide when to expire old sessions that are no longer needed. Finally, you need a way to associate a particular session object with a particular browser. Nothing about a browser is guaranteed to be unique: not its software version number, nor its IP address, nor its DNS name. The browser has to be coerced into identifying itself with a unique session ID, either with one of the client-side techniques or by requiring users to authenticate themselves with usernames and passwords.
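The locking and expiration concerns can be illustrated with a file-per-session store. The Python sketch below assumes a Unix-like system (it uses `fcntl.flock`) and assumes session IDs are generated by the server; `SESSION_DIR`, `MAX_AGE`, and the function names are hypothetical choices made for the example.

```python
import fcntl
import json
import os
import secrets
import tempfile
import time

SESSION_DIR = tempfile.mkdtemp(prefix="sessions-")  # hypothetical location
MAX_AGE = 30 * 60  # hypothetical timeout: 30 minutes, in seconds


def session_path(session_id):
    # Assumes session_id is a server-generated hex string, never user input.
    return os.path.join(SESSION_DIR, session_id + ".json")


def update_session(session_id, changes):
    """Read, modify, and rewrite one session file under an exclusive lock."""
    fd = os.open(session_path(session_id), os.O_RDWR | os.O_CREAT, 0o600)
    with os.fdopen(fd, "r+") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)  # waits for any other writer
        text = handle.read()
        session = json.loads(text) if text else {}
        session.update(changes)
        handle.seek(0)
        handle.truncate()
        json.dump(session, handle)
        handle.flush()
        # Closing the file at the end of the "with" block releases the lock.
    return session


def expire_sessions(now=None):
    """Delete session files not modified within the last MAX_AGE seconds."""
    now = time.time() if now is None else now
    removed = []
    for name in os.listdir(SESSION_DIR):
        path = os.path.join(SESSION_DIR, name)
        if now - os.path.getmtime(path) > MAX_AGE:
            os.remove(path)
            removed.append(name)
    return removed


sid = secrets.token_hex(16)
update_session(sid, {"page": 1})
update_session(sid, {"page": 2, "name": "Fred"})
print(expire_sessions())                               # nothing is old enough yet
print(expire_sessions(now=time.time() + MAX_AGE + 1))  # pretend time has passed
```

The exclusive lock makes each read-modify-write cycle atomic with respect to other processes that use the same function. The expiry pass here is deliberately naive and does not coordinate with writers.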
A last important consideration is the length of time you need to remember
state. If you only need to save state across a single user session and don't
mind losing the state information when the user quits the browser or leaves
your site, then hidden fields and URI-based storage will work well. If you need
state storage that will survive the remote user quitting the browser but don't
mind if state is lost when you reboot the web server, then storing state in
web server process memory is appropriate. However, for long-term storage, such
as saving a user's preferences over a period of months, you'll need to use persistent
cookies on the client side or store the state information in a file or database
on the server side.
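On the client side, the difference between state that ends with the browser session and state that persists comes down to whether the cookie carries a lifetime. The Python sketch below builds both kinds of `Set-Cookie` value with the standard `http.cookies` module; the cookie names, values, and the 90-day lifetime are hypothetical.

```python
from http.cookies import SimpleCookie


def session_cookie(name, value):
    """Build a cookie with no lifetime: the browser drops it when it quits."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    return cookie[name].OutputString()


def persistent_cookie(name, value, days):
    """Build a cookie the browser is asked to keep for the given number of days."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    cookie[name]["max-age"] = days * 24 * 60 * 60  # lifetime in seconds
    return cookie[name].OutputString()


print("Set-Cookie:", session_cookie("session_id", "abc123"))
print("Set-Cookie:", persistent_cookie("prefs", "en", 90))
```

This sketch uses the `Max-Age` attribute; an `Expires` date is the other way to give a cookie a lifetime.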
[1] Some sites that use the hidden fields technique
in their shopping cart scripts report upward of 30 attempts per month by users
to submit fraudulently modified forms in an attempt to obtain merchandise they
didn't pay for.
Copyright © 1999 by O'Reilly & Associates, Inc.