
Writing Apache Modules with Perl and C
By:   Lincoln Stein and Doug MacEachern
Published:   O'Reilly & Associates, Inc.  - March 1999

Copyright 1999 by O'Reilly & Associates, Inc.



Chapter 6 - Authentication and Authorization / Access Control, Authentication, and Authorization
How Authentication and Authorization Work

In contrast to access control, the process of authenticating a remote user is more involved. The question "is the user who he or she claims to be?" sounds simple, but the steps for verifying the answer can be simple or complex, depending on the level of assurance you desire. The HTTP protocol does not provide a way to answer the question of authenticity, only a method of asking it. It's up to the web server itself to decide when a user is or is not authenticated.

When a web server needs to know who a user is, it issues a challenge using the HTTP 401 "Authorization Required" code (Figure 6-1). In addition to this code, the HTTP header includes one or more fields called WWW-Authenticate, indicating the type (or types) of authentication that the server considers acceptable. WWW-Authenticate may also provide other information, such as a challenge string to use in cryptographic authentication protocols.

When a client sees the 401 response code, it studies the WWW-Authenticate header and fetches the requested authentication information if it can. If need be, the client requests some information from the user, such as prompting for an account name and password or requiring the user to insert a smart token containing a cryptographic signature.

Figure 6-1. During web authentication, the server challenges the browser to provide authentication information, and the browser reissues the request with an Authorization header.

Armed with this information, the browser now issues a second request for the URI, but this time it adds an Authorization field containing the information necessary to establish the user's credentials. (Notice that this field is misnamed since it provides authentication information, not authorization information.) The server checks the contents of Authorization, and if it passes muster, the request is passed on to the authorization phase of the transaction, where the server will decide whether the authenticated user has access to the requested URI.

On subsequent requests to this URI, the browser remembers the user's authentication information and automatically provides it in the Authorization field. This way the user doesn't have to provide his credentials each time he fetches a page. The browser also provides the same information for URIs at the same level or beneath the current one, anticipating the common situation in which an entire directory tree is placed under access control. If the authentication information becomes invalid (for example, in a scheme in which authentication expires after a period of time), the server can again issue a 401 response, forcing the browser to request the user's credentials all over again.

The contents of WWW-Authenticate and Authorization are specific to the particular authentication scheme. Fortunately, only three authentication schemes are in general use, and just one dominates the current generation of browsers and servers.[1] This is the Basic authentication scheme, the first authentication scheme defined in the HTTP protocol. Basic authentication is, well, basic! It is the standard account name/password scheme that we all know and love.

Here's what an unauthorized response looks like. Feel free to try it for yourself.

% telnet www.modperl.com 80
Connected to www.modperl.com.
Escape character is '^]'.
GET /private/ HTTP/1.0

HTTP/1.1 401 Authorization Required
Date: Mon, 10 Nov 1998 1:01:17 GMT
Server: Apache/1.3.3 mod_perl/1.16
WWW-Authenticate: Basic realm="Test"
Connection: close
Content-Type: text/html

<TITLE>Authorization Required</TITLE>
<H1>Authorization Required</H1>
This server could not verify that you
are authorized to access the document you
requested.  Either you supplied the wrong
credentials (e.g., bad password), or your
browser doesn't understand how to supply
the credentials required.<P>
Connection closed by foreign host.

In this example, we requested the URI /private/, which has been placed under Basic authentication. The returned HTTP 401 status code indicates that some sort of authentication is required, and the WWW-Authenticate field tells the browser to use Basic authentication. The WWW-Authenticate field also contains scheme-specific information following the name of the scheme. In the case of Basic authentication, this information consists of the authorization "realm," a short label that the browser will display in the password prompt box. One purpose of the realm is to hint to the user which password he should provide on systems that maintain more than one set of accounts. Another purpose is to allow the browser to automatically provide the same authentication information if it later encounters a discontiguous part of the site that uses the same realm name. However, we have found that not all browsers implement this feature.
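
To make the structure of the challenge concrete, here is a minimal Perl sketch of how a client might pull the scheme and the realm out of the header value shown above. The helper name parse_challenge is our own invention, and the code assumes simple name="value" parameters only; it is not how any particular browser does it.

```perl
use strict;
use warnings;

# Split a WWW-Authenticate value into its scheme and its name="value"
# parameters. parse_challenge is a hypothetical helper, and it handles
# only simple quoted parameters such as realm="Test".
sub parse_challenge {
    my ($value) = @_;
    my ($scheme, $rest) = $value =~ /^\s*(\S+)\s*(.*)$/
        or return;
    my %params;
    while ($rest =~ /(\w+)="([^"]*)"/g) {
        $params{lc $1} = $2;    # parameter names are case-insensitive
    }
    return ($scheme, \%params);
}

my ($scheme, $params) = parse_challenge('Basic realm="Test"');
print "scheme: $scheme\n";
print "realm:  $params->{realm}\n";
```

For the response above, this reports the scheme as Basic and the realm as Test, which is the label the browser would show in its password dialog.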

Following the HTTP header is some HTML for the browser to display. Unlike the situation with the 403 status, however, the browser doesn't immediately display this page. Instead it pops up a dialog box to request the user's account name and password. The HTML is only displayed if the user presses "Cancel", or in the rare case of browsers that don't understand Basic authentication.

After the user enters his credentials, the browser attempts to fetch the URI once again, this time providing the credential information in the Authorization field. The request (which you can try yourself) will look something like this:

% telnet www.modperl.com 80
Connected to www.modperl.com.
Escape character is '^]'.
GET /private/ HTTP/1.0
Authorization: Basic Z2FuZGFsZjp0aGUtd2l6YXJk

HTTP/1.1 200 OK
Date: Mon, 10 Nov 1998 1:43:56 GMT
Server: Apache/1.3.3 mod_perl/1.16
Last-Modified: Thu, 29 Jan 1998 11:44:21 GMT
ETag: "1612a-18-34d06b95"
Content-Length: 24
Accept-Ranges: bytes
Connection: close
Content-Type: text/plain

Hi there.
How are you?
Connection closed by foreign host.

The contents of the Authorization field are the security scheme, "Basic" in this case, and scheme-specific information. For Basic authentication, this consists of the user's name and password, joined by a colon and encoded with base64. Although the example makes it look like the password is encrypted in some clever way, it isn't, as you can readily prove to yourself if you have the MIME::Base64 module installed:[2]

% perl -MMIME::Base64 -le 'print decode_base64 "Z2FuZGFsZjp0aGUtd2l6YXJk"'
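
The one-liner prints gandalf:the-wizard, the account name and password separated by a colon. As a rough sketch of both directions, the following script builds an Authorization value and then takes it apart again. The helper names basic_credentials and parse_basic_credentials are hypothetical; only encode_base64 and decode_base64 come from MIME::Base64.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Build the value of an Authorization field for Basic authentication.
# basic_credentials is a hypothetical helper name.
sub basic_credentials {
    my ($user, $password) = @_;
    # The empty second argument suppresses the trailing newline
    # that encode_base64 appends by default.
    return 'Basic ' . encode_base64("$user:$password", '');
}

# Recover the account name and password from such a value.
sub parse_basic_credentials {
    my ($header) = @_;
    my ($encoded) = $header =~ /^Basic\s+(\S+)$/ or return;
    # Split on the first colon only, since a password may contain colons.
    return split /:/, decode_base64($encoded), 2;
}

my $header = basic_credentials('gandalf', 'the-wizard');
print "Authorization: $header\n";

my ($user, $password) = parse_basic_credentials($header);
print "user=$user password=$password\n";
```

The first line of output reproduces the Authorization field from the telnet session, which shows that anyone who can read the request can read the password.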

Standard Apache offers two types of authentication: the Basic authentication shown above, and a more secure method known as Digest. Digest authentication, which became standard with HTTP/1.1, is safer than Basic because passwords are never transmitted in the clear. In Digest authentication, the server generates a random "challenge" string and sends it to the browser. The browser combines the challenge with the user's password, computes a one-way MD5 digest of the result, and returns the digest to the server. The server performs the same computation with the user's stored password and compares its result to the one returned by the browser.[3] If the two match, the server knows that the user knows the correct password. Unfortunately, the commercial browser vendors haven't been as quick to innovate as Apache, so Digest authentication isn't widely implemented on the browser side. At the same time, some might argue that using Basic authentication over the encrypted Secure Sockets Layer (SSL) protocol is simpler, provided that the browser and server both implement SSL. We discuss SSL authentication techniques at the end of this chapter.
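
The comparison can be sketched in a few lines of Perl with the Digest::MD5 module. This sketch assumes the original form of the Digest calculation, without the later "qop" extension: the stored value is the MD5 hash of user:realm:password, and the response is the MD5 hash of that value, the challenge, and a hash of the request method and URI. The function names and all of the sample values are made up for illustration, and a real server would generate a fresh random challenge for each exchange.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Value the server keeps on file in place of the plain-text password.
sub stored_hash {
    my ($user, $realm, $password) = @_;
    return md5_hex("$user:$realm:$password");
}

# Response that browser and server each compute from the stored hash
# and the challenge (original Digest calculation, no "qop" extension).
sub digest_response {
    my ($hash, $nonce, $method, $uri) = @_;
    return md5_hex(join ':', $hash, $nonce, md5_hex("$method:$uri"));
}

# Sample values, invented for this example. The challenge is fixed
# here only so that the script is repeatable.
my $nonce    = 'dcd98b7102dd2f0e8b11d0f600bfb0c093';
my $on_file  = stored_hash('gandalf', 'Test', 'the-wizard');
my $expected = digest_response($on_file, $nonce, 'GET', '/private/');

# What a browser would send for a correct and an incorrect password.
my $good = digest_response(stored_hash('gandalf', 'Test', 'the-wizard'),
                           $nonce, 'GET', '/private/');
my $bad  = digest_response(stored_hash('gandalf', 'Test', 'the-hobbit'),
                           $nonce, 'GET', '/private/');

print "correct password: ", ($good eq $expected ? "match" : "no match"), "\n";
print "wrong password:   ", ($bad  eq $expected ? "match" : "no match"), "\n";
```

Only the final digest crosses the network, so an eavesdropper sees neither the password nor the stored hash.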

Because authentication requires the cooperation of the browser, your options for customizing how authentication works are somewhat limited. You are essentially limited to authenticating based on information that the user provides in the standard password dialog box. However, even within these bounds, there are some interesting things you can do. For example, you can implement an anonymous login system that gives the user a chance to provide contact information without requiring rigorous authentication.

After successfully authenticating a user, Apache enters its authorization phase. Just because a user can prove that he is who he claims to be doesn't mean he has unrestricted access to the site! During this phase Apache applies any number of arbitrary tests to the authenticated username. Apache's default handlers allow you to grant access to users based on their account names or their membership in named groups, using a variety of flat file and hashed lookup table formats.

By writing custom authorization handlers, you can do much more than this. You can perform a SQL query on an enterprise database, consult the company's current organizational chart to implement role-based authorization, or apply ad hoc rules like allowing users named "Fred" access on alternate Tuesdays. Or how about something completely different from the usual web access model, such as a system in which the user purchases a certain number of "pay per view" accesses in advance? Each time he accesses a page, the system decrements a counter in a database. When the user's access count hits zero, the server denies him access.
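
As a rough illustration of the pay-per-view idea, the following sketch shows just the bookkeeping, outside of Apache. The hash stands in for the database table, the account names and counts are invented, and authorize_access is a hypothetical function name rather than part of any Apache or mod_perl API. A real handler would also need to make the check and the decrement a single atomic database operation.

```perl
use strict;
use warnings;

# Return true and charge one access if the user has any left;
# return false otherwise. $table is a hash reference that stands
# in for the database of prepaid accesses.
sub authorize_access {
    my ($table, $user) = @_;
    return 0 unless ($table->{$user} // 0) > 0;
    $table->{$user}--;
    return 1;
}

# Invented sample data: fred has paid for two accesses.
my %accesses_left = (fred => 2, wilma => 0);

for my $attempt (1 .. 3) {
    my $verdict = authorize_access(\%accesses_left, 'fred') ? 'allowed' : 'denied';
    print "fred, attempt $attempt: $verdict\n";
}
```

The first two attempts are allowed and the third is denied, because the counter has reached zero by then.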


[1] The three authentication schemes in general use are Basic, Digest, and Microsoft's proprietary NTLM protocol used by its MSIE and IIS products.

[2] MIME::Base64 is available from CPAN.

[3] Actually, the user's plain-text password is not stored on the server side. Instead, the server stores an MD5 hash of the user's password, and the hash, not the password itself, is used on the server and browser side to encrypt the challenge. Because users tend to use the same password for multiple services, this prevents the compromise of passwords by unscrupulous webmasters.
