
Chapter 12. Running a Big Web Site

In this chapter we try to bring together the major issues that should concern the webmaster in charge of a big site. Of course, the bigger the site, the more diverse the issues that have to be thought about, so we do not at all claim to cover every possible problem. What follows is a bare minimum, most of which just refers to topics that have already been covered elsewhere in this book.

12.1 Machine Setup

Each machine should be set up with the following:

  1. The current, stable versions of the operating system and all the supporting software, such as Apache, database manager, scripting language, etc. It is obviously essential that all machines on the site should be running the same versions of all these products.

  2. A working TCP/IP layer with all up-to-date patches.

  3. The correct time. Since elements of the HTTP protocol use the time of day, it is worth using Unix's xntpd (http://www.eecis.udel.edu/~ntp/), Win32's ntpdate (http://www.eecis.udel.edu/~ntp/ntp_spool/html/ntpdate.html), or Tardis (http://www.kaska.demon.co.uk) to make sure your machines keep accurate time.

12.2 Server Security

There are many changing aspects to securing a server, but the following points should get you started. All of these need to be checked regularly and by someone other than the normal sys admin. Two sets of eyes find more problems, and an independent and knowledgeable review ensures trust.

12.2.1 Root Password

The root password on your server is the linchpin of your security. Do not let people write it on the wall over their monitors or otherwise expose it.

12.2.2 File Positions and Ownerships

File security is a fundamental aspect of web server security. These are rules to follow for file positions and ownership:

  • Files should not be owned by the user(s) that services (http, ftpd, sendmail...) run as. Ideally, ownership of files and services should be as finely divided as possible. For instance, the user that the Apache daemon runs as should probably be different from the user that owns its configuration files; this prevents the server from changing its own configuration even if someone does manage to subvert it. Each service should also have its own user, to increase the difficulty of attacks that use multiple servers. (With different users, it is likely that files dropped off using one server can't be accessed from another, for example.) Qmail, a secure mail server, for instance, uses no fewer than six different users for different parts of its service, and its configuration files are owned by yet another user, usually root.

  • Services shouldn't share file trees.

  • Don't put executable files in the web tree — that is, on or below Apache's DocumentRoot.

  • Don't put service control files in the web tree or ftp tree or anywhere else that can be accessed remotely.

  • Ideally, run each service on a different machine.

These are rules to follow for file permissions:

  • If files are owned by someone else, you have to grant read permissions to the group that includes the relevant service. Similarly, you have to grant execute permissions to compiled binaries. Compiled binaries don't need read permissions, but shell scripts do. Always try to grant the most restrictive permissions possible — so don't grant write permission to the server for configuration files, for instance.

  • In the upgrade procedure (see later) make handoff scripts set permissions and ownerships to avoid mistakes.

12.2.3 The Apache Web Site

The Apache web site offers some hints and tips on security issues in setting up a web server. Some of the suggestions will be general; others specific to Apache.

12.2.3.1 Permissions on ServerRoot directories

In typical operation, Apache is started by the root user, and it switches to the user defined by the User directive to serve hits. As is the case with any command that root executes, you must take care that it is protected from modification by nonroot users. Not only must the files themselves be writable only by root, but so must the directories and parents of all directories. For example, if you choose to place ServerRoot in /usr/local/apache, then it is suggested that you create that directory as root, with commands like these:

mkdir /usr/local/apache
cd /usr/local/apache
mkdir bin conf logs
chown 0 . bin conf logs
chgrp 0 . bin conf logs
chmod 755 . bin conf logs

It is assumed that /, /usr, and /usr/local are only modifiable by root. When you install the httpd executable, you should ensure that it is similarly protected:

cp httpd /usr/local/apache/bin
chown 0 /usr/local/apache/bin/httpd
chgrp 0 /usr/local/apache/bin/httpd
chmod 511 /usr/local/apache/bin/httpd

You can create an htdocs subdirectory that is modifiable by other users — since root never executes any files out of there and shouldn't be creating files in there.

If you allow nonroot users to modify any files that root either executes or writes on, then you open your system to root compromises. For example, someone could replace the httpd binary so that the next time you start it, it will execute some arbitrary code. If the logs directory is writable (by a nonroot user), someone could replace a log file with a symlink to some other system file, and then root might overwrite that file with arbitrary data. If the log files themselves are writable (by a nonroot user), then someone may be able to overwrite the log itself with bogus data.
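The rule above (files, their directories, and every parent directory writable only by root) can be checked mechanically. The following is a minimal Python sketch, not part of Apache; the function name unsafe_components and its owner_uid parameter are our own inventions for illustration. It walks from a path up to the filesystem root and reports anything not owned by the expected user or writable by group or others.

```python
import os
import stat


def unsafe_components(path, owner_uid=0):
    """Return the parts of `path` (the path itself and each parent directory)
    that are not owned by `owner_uid` or are writable by group or others."""
    problems = []
    current = os.path.realpath(path)
    while True:
        info = os.stat(current)
        loosely_writable = bool(info.st_mode & (stat.S_IWGRP | stat.S_IWOTH))
        if info.st_uid != owner_uid or loosely_writable:
            problems.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return problems
```

Calling unsafe_components("/usr/local/apache/bin/httpd") on a correctly installed server should, under these assumptions, return an empty list.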

12.2.3.2 Server-side includes

Server-side includes (SSI) can be configured so that users can execute arbitrary programs on the server. That thought alone should send a shiver down the spine of any sys admin.

One solution is to disable that part of SSI. To do that, you use the IncludesNOEXEC option to the Options directive.

12.2.3.3 Nonscript-aliased CGI

Allowing users to execute CGI scripts in any directory should only be considered if:

  • You trust your users not to write scripts that will deliberately or accidentally expose your system to an attack.

  • You consider security at your site to be so feeble in other areas as to make one more potential hole irrelevant.

  • You have no users, and nobody ever visits your server.

12.2.3.4 Script-aliased CGI

Limiting CGI to special directories gives the sys admin control over what goes into those directories. This is inevitably more secure than nonscript-aliased CGI, but only if users with write access to the directories are trusted or the sys admin is willing to test each new CGI script/program for potential security holes.

Most sites choose this option over the nonscript-aliased CGI approach.

12.2.3.5 CGI in general

Always remember that you must trust the writers of the CGI script/programs or your ability to spot potential security holes in CGI, whether they were deliberate or accidental.

All the CGI scripts will run as the same user, so they have the potential to conflict (accidentally or deliberately) with other scripts. For example, User A hates User B, so she writes a script to trash User B's CGI database. One program that can be used to allow scripts to run as different users is suEXEC, which is included with Apache as of 1.2 and is called from special hooks in the Apache server code. Another popular way of doing this is with CGIWrap.

12.2.3.6 Stopping users overriding system-wide settings...

To run a really tight ship, you'll want to stop users from setting up .htaccess files that can override security features you've configured. Here's one way to do it: in the server configuration file, add the following:

<Directory /> 
AllowOverride None 
Options None 
Allow from all 
</Directory> 

then set up for specific directories. This stops all overrides, includes, and accesses in all directories apart from those named.

12.2.3.7 Protect server files by default

One aspect of Apache, which is occasionally misunderstood, is the feature of default access. That is, unless you take steps to change it, if the server can find its way to a file through normal URL mapping rules, it can serve it to clients. For instance, consider the following example:

  1. # cd /; ln -s / public_html

  2. Accessing http://localhost/~root/

    This would allow clients to walk through the entire filesystem. To work around this, add the following block to your server's configuration:

    <Directory />
        Order Deny,Allow
        Deny from all
    </Directory>

This will forbid default access to filesystem locations. Add appropriate <Directory> blocks to allow access only in those areas you wish. For example:

<Directory /usr/users/*/public_html>
    Order Deny,Allow
    Allow from all
</Directory>
<Directory /usr/local/httpd>
    Order Deny,Allow
    Allow from all
</Directory>

Pay particular attention to the interactions of <Location> and <Directory> directives; for instance, even if <Directory /> denies access, a <Location /> directive might overturn it.

Also be wary of playing games with the UserDir directive; setting it to something like ./ would have the same effect, for root, as the first example earlier. If you are using Apache 1.3 or above, we strongly recommend that you include the following line in your server configuration files:

UserDir disabled root

The Apache Group asks that any other useful security tips be sent to them by filling out a problem report, and that anyone who is confident of having found a security bug in the Apache source code itself let them know.

12.3 Managing a Big Site

A major problem in managing a big site is that it is always in flux. The person in charge therefore has to manage a constant flow of new material from the development machines, through the beta test systems, to the live site. This process can be very complicated and he will need as much help from automation as he can get.

12.3.1 Development Machines

The development hardware has to address two issues: the functionality of the code — running on any machine — and the interaction of the different machines on the live site.

The development of the code — by one or several programmers — will benefit enormously from using a version control system like CVS (see http://www.cvshome.org/). CVS allows you to download files from the archive, work on them, and upload them again. The changes are logged and a note is broadcast to everyone else in the project.[1] At any time you can go back to any earlier version of a file. You can also create "branches" — temporary diversions from the main development that run in parallel.

CVS can operate through a secure shell so that developers can share code securely over the Internet. We used it to control the writing of this edition of this book. It is also used to manage the development of Apache itself, and, in fact, most free software.

The network of development machines needs to resemble the network of live machines so that load balancing and other intersystem activities can be verified. It is possible to simulate multiple machines by running multiple services on one machine. However, this can miss accidental dependencies that arise, so it is not a good idea for the beta test stage.

12.3.2 Beta Test

The beta test site should be separate from the development machines. It should be a replica of the real site in every sense (though perhaps scaled down — e.g., if the live site is 10 load-balanced machines, the beta test site might only have 2), so that all the different ways that networked computers can interfere with each other can have full rein. It should be set up by the sys admins but tested by a very special sort of person: not a programmer, but someone who understands both computing and end users. Like a test pilot, she should be capable of making the crassest mistakes while noting exactly what she did and what happened next.

12.3.3 The Live Site

The configuration of the live site will be dictated by a number of factors: the functionality of the site plus the expected traffic. Quite often a site can be divided into several parts, which are best handled on different machines. One might handle data-intensive actions, such as serving a large stock of images. Another might be concerned with computations and a database, while a third might handle secure access. They might be replicated for backup and maybe mirrored on another continent to minimize long-haul web traffic and improve client access. Load sharing and automatic-backup software will be an issue here (see later).

12.3.4 Upgrade Procedures

An established site will have its own upgrade procedure. If yours does not, it should, and the procedure should incorporate at least some of the elements that follow.

Repeatable

You should be sure that what is handed off to the live site is really, really what was beta tested.

Reversible

When it turns out that it wasn't, or that the beta site got broken in the hand-off process or never worked properly in the first place, you can go back to the previous live site. This may not be possible if databases have changed in the meantime, so backups are a good idea. The upgrade should be designed from the start so that it can be unwound in the event of upgrade failure. For instance, if a field in the client record is to be changed, it would be a good idea to keep the old field and create a new field alongside it into which the value is copied and then changed. The old code will then work on the new data as before.

Cautious

Always incorporate a final testing phase before going live.
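The "Reversible" advice about keeping the old field and adding a new one alongside it can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module as a stand-in for whatever database manager the site uses; the client table, the phone and phone_intl columns, and the conversion rule are all hypothetical.

```python
import sqlite3


def add_international_phone(conn):
    """Add a new column next to the old one instead of changing it in place."""
    conn.execute("ALTER TABLE client ADD COLUMN phone_intl TEXT")
    # Copy the old value into the new column and transform the copy only.
    conn.execute(
        "UPDATE client "
        "SET phone_intl = '+44' || substr(replace(phone, ' ', ''), 2)"
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE client (id INTEGER PRIMARY KEY, phone TEXT)")
conn.execute("INSERT INTO client (phone) VALUES ('01223 555 0100')")
add_international_phone(conn)

# Old code that reads `phone` still sees exactly what it saw before.
old_value, new_value = conn.execute(
    "SELECT phone, phone_intl FROM client"
).fetchone()
```

If the upgrade has to be unwound, the previous code can be put back without touching the data, because the column it reads was never altered.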

As development goes ahead, the transfer of data and scripts between the three sites should be managed by scripts that produce comprehensive logs. This way, when something goes wrong, it can be traced and fixed. These scripts should also explicitly set ownerships and permissions for all the files transferred.
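A handoff script of the kind described above might look something like the following Python sketch. It is an assumption-laden illustration rather than a recommended tool: the function names, the default modes, and the idea of returning a SHA-256 manifest (so that what reaches the live site can be compared with what was beta tested) are ours.

```python
import hashlib
import logging
import os
import shutil

log = logging.getLogger("handoff")


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hand_off(source_root, target_root, file_mode=0o644, dir_mode=0o755, owner=None):
    """Copy a tree, set modes (and ownership if `owner` is a (uid, gid) pair),
    log every file, and return {relative path: SHA-256 of the copy}."""
    manifest = {}
    for directory, _subdirs, filenames in os.walk(source_root):
        relative_dir = os.path.relpath(directory, source_root)
        target_dir = os.path.normpath(os.path.join(target_root, relative_dir))
        os.makedirs(target_dir, exist_ok=True)
        os.chmod(target_dir, dir_mode)
        if owner is not None:
            os.chown(target_dir, *owner)  # normally requires root
        for name in sorted(filenames):
            target = os.path.join(target_dir, name)
            shutil.copyfile(os.path.join(directory, name), target)
            os.chmod(target, file_mode)
            if owner is not None:
                os.chown(target, *owner)
            relative = os.path.normpath(os.path.join(relative_dir, name))
            manifest[relative] = sha256_of(target)
            log.info("copied %s mode=%o sha256=%s",
                     relative, file_mode, manifest[relative])
    return manifest
```

Running the same function for the development-to-beta and beta-to-live transfers and comparing the two manifests gives a simple check on repeatability.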

12.3.5 Maintenance Pages

Once you have an active web site, you — or your marketing people — will want to know as much as you can about who is using it, why they are, and what they think of the experience. Apache has comprehensive logging facilities, and you can write scripts to analyze them; alternatively, you can write scripts to accumulate data in your database as you go along. Either way, you do not want your business rivals finding their way to this sensitive information or monitoring your web traffic while you look at it, so you may want to use SSL to protect your access to your maintenance pages. These pages may well allow you to view, alter, and update confidential customer information: normal prudence and the demands of data protection laws would suggest you screen these activities with SSL.

12.4 Supporting Software

Besides Apache, there are two big chunks of supporting software you will need: a scripting language and a database manager. We cover languages fairly extensively in Chapter 13, Chapter 15, Chapter 16, and Chapter 17. There are also some smaller items.

12.4.1 Database Manager

The computing world divides into two camps — the sort-of-free camp and the definitely expensive camp. If you are reading this, you probably already use or intend to use Apache and you will therefore be in the sort-of-free camp. This camp offers free software under a variety of licences (see later) plus, in varying degrees, commercial support. Nowadays, all DBMs (database managers) use the SQL model, so a good book on this topic is essential.[2] Most of the scripting languages now have more or less standardized interfaces to the leading DBMs. When working with a database manager, the programmer often has a choice between using functions in the DBM or the language. For instance, MySQL has powerful date-formatting routines that will return a date and time from the database served up to your taste. This could equally be done in Perl, though at a cost in labor. It is worth exploring the programming language hidden inside a DBM.
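The choice between doing work in the DBM and doing it in the language can be seen in a small example. The sketch below uses Python's built-in sqlite3 module purely as a stand-in (MySQL's own routine for this job is DATE_FORMAT); the orders table and its contents are invented for illustration.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, placed TEXT)")
conn.execute("INSERT INTO orders (placed) VALUES ('2002-12-25 14:30:00')")

# Option 1: ask the database manager to format the date for us.
(formatted_by_db,) = conn.execute(
    "SELECT strftime('%d/%m/%Y %H:%M', placed) FROM orders"
).fetchone()

# Option 2: fetch the raw value and format it in the scripting language.
(raw,) = conn.execute("SELECT placed FROM orders").fetchone()
formatted_by_script = datetime.strptime(
    raw, "%Y-%m-%d %H:%M:%S"
).strftime("%d/%m/%Y %H:%M")
```

Both routes produce the same string; the first keeps the script shorter, the second keeps the SQL portable.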

These are the significant freeware database managers:

MySQL (http://www.mysql.com)

MySQL is said to be a "lighter weight" DBM. However, we have found it to be very reliable, fast, and easy to use. It follows what one might call the "European" programming style, in which the features most people will want to use are brought to the fore and made easy, while more sophisticated features are accessible if you need them. The "American" style seems to range all the package's features with equal prominence, so that the user has to be aware of what he does not want to use, as well as what he does.

PostgreSQL (http://www.postgresql.org)

PostgreSQL is said to be a more sophisticated, "proper" database. However, it did not, at the time of writing, offer outer joins and a few other useful features. It is also annoyingly literal about the case of table and field names, but requires quotation marks to actually pay attention to them.

mSQL

mSQL used to be everyone's favorite database until MySQL came along and largely displaced it. (It is source available but not free.) In many respects it is very similar to MySQL.

A "real" database manager will offer features like foreign keys and transactions that can be rolled back in case of failure. Both MySQL and PostgreSQL now have these.

If you are buying a commercial database manager, you will probably consider Oracle, Sybase, or Informix: products that do not need our marketing assistance and whose support for free operating systems is limited.

12.4.2 Mailserver

Most web sites need a mailserver to keep in touch with clients and to tell people in the organization what the clients are up to.

The Unix utility Sendmail (http://www.sendmail.org) is old and comprehensive (huge, even). It had a reputation for insecurity, but it seems to have been fixed, and in recent years there have been few exploits against it. It must mean something if the O'Reilly book about it is one of the thickest they publish.[3] It has three younger competitors:

Qmail (http://www.qmail.org)

Qmail is secure, with documentation in English, Castilian Spanish, French, Russian, Japanese, and Korean, but rather restrictive and difficult to deal with, particularly since the author won't allow anyone to redistribute modified versions, nor will he update the package himself. This means that it can be a pretty tedious process getting qmail to do what you want.[4]

Postfix (http://www.postfix.cs.uu.nl)

Postfix is secure and, in our experience, nice.

Exim (http://www.exim.org/)

There is also Exim from the University of Cambridge in the U.K. The home page says the following:

In style it is similar to Smail 3, but its facilities are more extensive, and in particular it has some defences against mail bombs and unsolicited junk mail in the form of options for refusing messages from particular hosts, networks, or senders. It can be installed in place of sendmail, although the configuration of exim is quite different to that of sendmail.

It is available for Unix machines under the GNU licence and has a good reputation among people whose opinions we respect.

12.4.3 PGP

Business email should be encrypted because it may contain confidential details about your business, which you want to keep secret, or about your clients, which you are obliged to keep secret.

Pretty Good Privacy (PGP) (http://www.pgpi.org) is the obvious resource, but it uses the IDEA algorithm, is protected by patents, and is not completely free. GnuPG does not use IDEA and is free: http://www.gnupg.org/. PGP is excellent software, but it has one problem if used interactively. It tries to install itself into your web browsers as a plug-in and then purports to encrypt your email on the fly. We have found that this does not always work, with the result that your darkest secrets get sent en clair. It is much safer to write an email, cut it onto the clipboard, use PGP's encryption tool to encrypt the clipboard, and copy the message — now visibly secure — back into your email.

12.4.4 SSH Access to Server

Your live web site will very likely be on a machine far away that is not under your control. You can connect to the remote end using telnet and run a terminal emulator on your machine, but when you type in the essential root password to get control of the far server, the password goes across the web unencrypted. This is not a good idea.

You therefore need to access it through a secure shell over the Web so that all your traffic is encrypted. Not only your passwords are protected, but also, say, a new version of your client database with all their credit card numbers and account details that you are uploading. The Bad Guys might like to intercept it, but they will not be able to.

You need two software elements to do all this:

  1. Secure shell: free from OpenSSH at www.openssh.org or expensive at http://www.ssh.com.

  2. A terminal emulator that will tunnel through ssh to the target machine and make it seem to you that you have the target's operating system prompt on your desktop. If you are running Win32, we have found that Mindterm (http://www.mindbright.se) works well enough, though it is written in Java and you need to install the JDK. When our version starts up, it throws alarming-looking Java fatal errors, but these don't seem to matter. A good alternative is PuTTY: http://www.chiark.greenend.org.uk/~sgtatham/putty/. If you are running Unix, then it "just works," since you have access to a terminal already.

12.4.5 Credit Cards

The object of business is to part customers from their money (in the nicest possible way), and the essential point of attack is the credit card. It is the tap through which wealth flows, but it may also serve to fill you a poisoned chalice. As soon as you deal in credit card numbers, you are apt to have trouble. Credit card fraud is vast, and the merchant ends up paying for most of it. See the sad advice at, for instance, http://antifraud.com/tips.htm. Conversely, there is little to stop any of your employees who have access to credit card numbers from noting a number and then doing some cheap shopping. Someone more organized than that can get you into trouble in an even bigger way.

Unless you are big and confident and have a big and competent security department, you probably will want to use an intermediary company to handle the credit card transaction and send you most of the money. An interesting overview of the whole complicated process is at http://www.virtualschool.edu/mon/ElectronicProperty/klamond/credit_card.htm.

There are a number of North American intermediaries:

EMS Nationwide http://www.webmall.net/admark/
First of Omaha http://www.synergy.net/channels/studio23/fbo/foomp.html
First USA Paymentech http://www.fusa.com/
First Union - Merchant Sales and Services http://www.firstunion.com/2/business/merchant/
Nova Information Systems http://www.novainfo.com/
Vantage Services http://vanserv.com/

Since we have not dealt with any of them, we cannot comment. The interfaces to your site will vary from company to company, as will the costs and the percentage they will skim off each transaction. It is also very important to look at the small print on customer fraud: who picks up the tab?

We have used WorldPay — a U.K. company operating internationally, owned by HSBC, one of our biggest banks. They offer a number of products, including complete shopping systems and the ability to accept payments in any of the world's currencies and convert the payment to yours at the going rate. We used their entry-level product, Select Junior, which has rather an ingenious interface. We describe it to show how things can be done — no doubt other intermediaries have other methods.

You persuade your customer along to the point of buying and then present her with an HTML form that says something like this:

We are now ready to take your payment by credit card for $50.75.

The form has a number of hidden fields, which contain your merchant ID at WorldPay, the transaction ID you have assigned to this purchase, the amount, the currency, and a description field that you have made up. The customer hits the Submit button, and the form calls WorldPay's secure purchase site. They then handle the collection of credit card details using their own page, which is dropped into a page you have designed and preloaded onto their site to carry through the feel of your web pages. The result combines your image with theirs.

When the customer's credit card dialog has finished, WorldPay will then display one of two more pages you have preloaded: the first, for a successful transaction, thanking the client and giving him a link back to your site; the other for a failed transaction, which offers suitable regrets, hopes for the future, and a link to your main rival. WorldPay then sends you an email and/or calls a link to your site with the transaction details. This link will be to a script that does whatever is necessary to set the purchase in motion. Writing the script that accepts this link is slightly tricky because it does nothing visible in your browser. You have to dump debugging messages to a file.

It is worth checking that the amount of money the intermediary says it has debited from the client really is the amount you want to be paid, because things may have been fiddled by an attacker or just gone wrong during the payment process.
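A check of this kind might be sketched in Python as below. The field names (transaction_id, amount, currency) are hypothetical and do not describe WorldPay's or any other intermediary's actual interface; the point is only that the comparison is made against your own record of the order, using exact decimal arithmetic rather than floating point.

```python
from decimal import Decimal


def payment_matches(expected, reported):
    """Compare our own record of an order with what the intermediary reports.

    Both arguments are dicts with 'transaction_id', 'amount', and 'currency'.
    Amounts are compared as decimals, so "50.75" and "50.750" are equal.
    """
    return (
        expected["transaction_id"] == reported["transaction_id"]
        and expected["currency"] == reported["currency"]
        and Decimal(str(expected["amount"])) == Decimal(str(reported["amount"]))
    )
```

A script that receives the intermediary's callback would look up the order by its transaction ID, call something like this, and refuse to set the purchase in motion if it returns False.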

12.4.6 Passwords

A password is only useful when there is a human in the loop to remember and enter it. Passwords are not useful between processes on the server. For instance, scripts that call the database manager will often have to quote a password. But this password has to be written into the script, where anyone with access to the server can read it, and it is of no use to anyone without that access, so it does nothing to improve security.

However, services should have minimal access, and separate accounts should be used. SSH access with the associated encrypted keys should be necessary when humans do upgrades or perform maintenance activities.

12.4.7 Turn Off Unwanted Services

You should run no more Unix services than are essential. The Unix utility ps tells you what programs are running. You may have the utility sockstat, which looks at what services are using sockets and therefore vulnerable to attacks from outside via TCP/IP. It produces output like this:

USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS    FOREIGN ADDRESS
root     mysqld     157    4 tcp4   127.0.0.1.3306   *.*
root     sshd1      135    3 tcp4   *.22             *.*
root     inetd      100    4 tcp4   *.21             *.*

indicating that MySQL, SSH, and inetd are running.

The utility lsof is more cryptic but more widely supported — it shows open files and sockets and which processes opened them. lsof can be found at ftp://vic.cc.purdue.edu/pub/tools/unix/lsof/.

It is a good idea to restrict services so that they listen only on the appropriate interface. For example, if you have a database manager running, you may want it to listen on localhost so only the CGI stuff can talk to it. If you have two networks (one Internet, one backend), then some stuff may only want to listen on one of the two.
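To see at a glance which services are listening on every interface rather than on a specific one, the sockstat output can be filtered. The Python sketch below assumes output in exactly the column layout shown earlier (local address in the sixth column, written as address.port); other versions of sockstat format their output differently, and the function name is our own.

```python
SAMPLE = """\
USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS    FOREIGN ADDRESS
root     mysqld     157    4 tcp4   127.0.0.1.3306   *.*
root     sshd1      135    3 tcp4   *.22             *.*
root     inetd      100    4 tcp4   *.21             *.*
"""


def wildcard_listeners(sockstat_output):
    """Return (command, port) for each socket bound to all interfaces."""
    found = []
    for line in sockstat_output.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 7:
            continue
        command, local = fields[1], fields[5]
        address, _, port = local.rpartition(".")
        if address == "*":
            found.append((command, port))
    return found
```

For the sample above, mysqld is bound to localhost only and is not reported, while sshd1 and inetd are.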

12.4.8 Backend Networks

Internal services — those not exposed to the Internet, like a database manager — should have their own network. You should partition machines/networks as much as possible so that attackers have to crawl over or under internal walls.

12.4.9 SuEXEC

If there are untrusted internal users on your system (for instance, students on a University system who are allowed to create their own virtual web sites), use suexec to make sure they do not abuse the file permissions they get via Apache.

12.4.10 SSL

When your clients need to talk confidentially to you — and vice versa — you need to use Apache SSL (see Chapter 3). Since there is a performance cost, you want to be sparing about using this facility. A link from an insecure page invokes SSL simply by calling https://<securepage>. Use a known Certificate Authority or customers will get warnings that might shake their confidence in your integrity. You need to start SSL one page early, so that the customer sees the padlock on her browser before you ask her to type her card number.

You might also use SSL for maintenance pages (see earlier).

12.4.11 Certificates

See Chapter 11 on SSL.

12.5 Scalability

Moving a web site from one machine serving a few test requests to an industrial-strength site capable of serving the full flood of web demand may not be a simple matter.

12.5.1 Performance

A busy site will have performance issues, which boil down to the question: "Are we serving the maximum number of customers at the minimum cost?"

12.5.1.1 Tools

You can see how resources are being used under Unix from the utilities: top, vmstat, swapinfo, iostat, and their friends. (See Essential System Administration, by Aeleen Frisch [O'Reilly, 2002].)

12.5.1.2 Apache's mod_info

mod_info can be used to monitor and diagnose processes that deal with HTTPD. See Chapter 10.

12.5.1.3 Bandwidth

Your own hardware may be working wonderfully while being strangled by bandwidth limitations between you and the Web backbone. You should be able to make rough estimates of the bandwidth you need by multiplying the number of transactions per second by the number of bytes transferred in each (making allowance for the substantial HTTP headers that go with each web page). Having done that, check what is actually happening by using a utility like ipfm from http://www.via.ecp.fr/~tibob/ipfm/:

HOST                    IN        OUT      TOTAL 
host1.domain.com        12345     6666684  6679029 
host2.domain.com        1232314   12345    1244659 
host3.domain.com        6645632   123      6645755
...

Or use cricket (http://cricket.sourceforge.net/) to produce pretty graphs.
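The rough bandwidth estimate described above is simple arithmetic, sketched here in Python. The default of 500 bytes of headers per response is an assumption for illustration, not a measured figure; substitute numbers from your own logs.

```python
def required_bits_per_second(requests_per_second, body_bytes, header_bytes=500):
    """Estimate outbound bandwidth: requests/s x (body + headers) x 8 bits."""
    return requests_per_second * (body_bytes + header_bytes) * 8


# Example: 50 requests per second, each returning a 20,000-byte page.
estimate = required_bits_per_second(50, 20000)
print(f"{estimate / 1_000_000:.1f} Mbit/s")
```

With these example figures the estimate comes to 8.2 Mbit/s, which can then be compared with what ipfm reports.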

12.5.1.4 Load balancing

mod_backhand is free software for load balancing, covered later in this chapter. For expensive software, look on the Web for ServerIron, BigIP, and LocalDirector.

12.5.1.5 Image server, text server

The amount of RAM at your disposal limits the number of copies of Apache (as httpd or httpsd) that you can run, and that limits the number of simultaneous clients you can serve. You can reduce the size of some of the httpd instances by having a cut-down version for images, PDF files, or text while running a big version for scripts.

What normally makes the difference in size is the necessity to load a scripting language such as Perl or PHP into httpd. Because these provide persistent storage of modules and variables between requests, they tend to consume far more RAM than servers that only serve static pages and images. The normal answer is to run two copies of Apache, one for the static stuff and one for the scripts. Each copy has to bind to a different IP and port combination, of course, and usually the number of instances of the dynamic one has to be limited to avoid thrashing.
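The effect of process size on capacity can be estimated with a back-of-the-envelope calculation like the Python sketch below. The memory figures are invented for illustration; measure your own httpd processes with top or a similar tool before relying on any such number (for instance, when choosing a value for Apache's MaxClients directive).

```python
def max_children(ram_mb, reserved_mb, per_child_mb):
    """How many httpd children fit in RAM after reserving some for the OS
    and other services, so that the machine does not start thrashing."""
    return max(0, (ram_mb - reserved_mb) // per_child_mb)


# Hypothetical machine: 1024 MB of RAM, 256 MB kept back for everything else.
static_children = max_children(1024, 256, 2)    # cut-down httpd, about 2 MB each
script_children = max_children(1024, 256, 20)   # httpd with Perl or PHP, about 20 MB each
```

Under these assumed sizes the same memory holds roughly ten times as many static servers as script servers, which is the argument for splitting the two.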

12.5.2 Shared Versus Replicated DBs

You may want to speed up database accesses by replicating your database across several machines so that they can serve clients independently. Replication is easy if the data is static (catalogs, texts, libraries of images, etc.). Replication is hard if the database is often updated, as it would be with active clients. However, you can sidestep replication by dividing your client database into chunks (for instance, by surname: A-D, E-G, etc.), each served by a single machine. To increase speed, you divide it into smaller chunks and add more hardware.
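Dividing clients among machines by surname can be sketched in a few lines of Python. The A-D and E-G ranges come from the example above; the remaining ranges and the host names db1 to db5 are hypothetical.

```python
import bisect

# First letter of each range and the machine that serves it:
# A-D -> db1, E-G -> db2, H-M -> db3, N-S -> db4, T-Z -> db5.
RANGE_STARTS = ["A", "E", "H", "N", "T"]
RANGE_HOSTS = ["db1", "db2", "db3", "db4", "db5"]


def host_for(surname):
    """Return the database machine responsible for a client's surname."""
    initial = surname.strip().upper()[:1]
    index = bisect.bisect_right(RANGE_STARTS, initial) - 1
    return RANGE_HOSTS[max(index, 0)]  # anything sorting before "A" goes to db1
```

To divide the data into smaller chunks, you would add entries to both lists and move the affected clients to the new machine.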

12.6 Load Balancing

This section deals with the problems of running a high-volume web site on a number of physical servers. These problems are roughly:

  • Connecting the servers together.

  • Tuning individual servers to get the best out of the hardware and Apache.

  • Spreading the load among a number of servers with mod_backhand.

  • Spreading your data over the servers with Splash so that failure of one database machine does not crash the whole site.

  • Collecting log files in one place with rsync (see http://www.rsync.org/ ) — if you choose not to do your logging in the database.

12.6.1 Spreading the Load

The simplest and, in many ways, the best way to deal with an underpowered web site is to throw hardware at it. PCs are the cheapest way to buy MegaFlops, and TCP/IP connects them together nicely. All that's needed to make a server farm is something to balance the load around the PCs, keeping them all evenly up to the collar, like a well-driven team of horses.

There are expensive solutions: Cisco's LocalDirector, LinuxDirector, ServerIrons, and a host of others.

12.6.2 mod_backhand

The cheap solution is mod_backhand, distributed under the same license as Apache. It originated in the Center for Networking and Distributed Systems at Johns Hopkins University.

Its function is to keep track of the resources of individual machines running Apache and connected in a cluster. It then diverts incoming requests to the machines with the largest available resources. There is a small overhead in the redirection, but overall, the cluster works much better.

In the simplest arrangement, a single server has the site's IP number and farms the requests out to the other servers, which are set up identically (apart from IP addresses) and with identical mod_backhand directives. The machines communicate with each other (once a second, by default, but this can be changed), exchanging information on the resources each currently has available. On the basis of this information, the machine that catches a request can forward it to the machine best able to deal with it. Naturally, there is a computing cost to this, but it is small and predictable.

mod_backhand works like a proxy server, but one that knows the capabilities of its proxies and how that capability varies from moment to moment.

It is possible to vary this setup so that different machines do different things. For instance, you might have some 64-bit processors (DEC Alphas, for example) that specialize in running CGI scripts, while PCs serve images.

A more complex setup is to use multiple servers fielding the incoming requests and handing them off to each other. There are essentially two ways of handling this. The first is to use standard load-balancing hardware to distribute the requests among the servers and then to use mod_backhand to redistribute them more intelligently. An alternative is to use round-robin DNS: give each machine a different IP address, but have the server name resolve to all of the addresses. This has the advantage that you avoid the expense of the load balancer (and the problems of single points of failure, too), but the problem is that if a server dies, there's no easy way to handle the fact that its IP address is no longer being serviced. One answer to this problem is Wackamole, also from CNDS, which builds on the rather marvelous Spread toolkit to ensure that every IP address is always in service on some machine.

This is all very fine and good, and the idea of mod_backhand — choosing a lightly loaded server to service a request on the fly — clearly seems a good one. But there are problems. The main one is deciding on the server. The operating system provides loading information in the form of a one-minute rolling average of the length of the run queue updated every five seconds. Since a busy site could get 5,000 hits before the next update, it is clear that just choosing the most lightly loaded server each time will overwhelm it. The granularity of this data is much too coarse. Consequently, mod_backhand has a number of methods for picking a reasonably lightly loaded server. Just which method is best involves a lot of real-world experimentation, and the jury is still out.
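The granularity problem can be seen in a small Python sketch. It assumes a hypothetical load snapshot for four servers that does not change during one update interval, and compares always picking the least loaded server with picking at random among the three lightest. Neither policy is mod_backhand's actual code; the names and numbers are invented for the illustration.

```python
import random
from collections import Counter

def pick_least_loaded(snapshot):
    """Always pick the server with the lowest reported load."""
    return min(snapshot, key=snapshot.get)

def pick_random_of_lightest(snapshot, k, rng):
    """Pick at random among the k most lightly loaded servers."""
    lightest = sorted(snapshot, key=snapshot.get)[:k]
    return rng.choice(lightest)

# Hypothetical load snapshot, frozen for the whole update interval.
snapshot = {"a": 0.2, "b": 0.3, "c": 0.4, "d": 1.5}
rng = random.Random(0)

# 5,000 hits arrive before the load figures are refreshed.
greedy = Counter(pick_least_loaded(snapshot) for _ in range(5000))
spread = Counter(pick_random_of_lightest(snapshot, 3, rng) for _ in range(5000))

print("greedy:", dict(greedy))
print("spread:", dict(spread))
```

With the greedy policy every one of the 5,000 requests lands on server a, because nothing tells the chooser that a is filling up. Randomizing among a few lightly loaded servers shares the requests out while still avoiding the heavily loaded one.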

12.6.3 Installation of mod_backhand

Download the usual gzipped tarball from http://www.backhand.org/mod_backhand/download/mod_backhand.tar.gz. Surprisingly, it is less than 100KB long and arrives in a flash. Make it a source directory next to Apache's — we put it in /usr/wrc.mod_backhand. Ungzipping and detarring produces a subdirectory — /usr/wrc.mod_backhand/mod_backhand-1.0.1 with the usual source files in it.

The module is so simple it does not need the paraphernalia of configuration files. Just make sure you have a path to the Apache directory by running ls:

ls ../../apache/apache_x.x.x

When it shows the contents of the Apache directory, turn it into:

./precompile ../../apache/apache_x.x.x

This will produce a commentary on the reconfiguration of Apache:

Copying source into apache tree...
Copying sample cgi script and logo into htdocs directory...
Adding libs to Apache's Configure...
Adding to Apache's Configuration.tmpl...
Setting extra shared libraries for FreeBSD (-lm)
Modifying httpd.conf-dist...
Updating Makefile.tmpl...

Now change to the apache source directory:
    ../../apache/apache_1.3.9
And do a ./configure...

If you want to enable backhand (why would you have done this if you didn't?)
then add:  --enable-module=backhand --enable-shared=backhand
to your apache configure command.  For example, I use:

   ./configure --prefix=/var/backhand --enable-module=so \
     --enable-module=rewrite --enable-shared=rewrite \
     --enable-module=speling --enable-shared=speling \
     --enable-module=info --enable-shared=info \
     --enable-module=include --enable-shared=include \
     --enable-module=status --enable-shared=status \
     --enable-module=backhand --enable-shared=backhand

For those who prefer the semimanual route to making Apache, edit Configuration to include the line:

SharedModule modules/backhand/mod_backhand.cso

then run ./Configure and make.

This will make it possible to run mod_backhand as a DSO. The shiny new httpd needs to be moved onto your path — perhaps in /usr/local/bin.

This process, perhaps surprisingly, writes a demonstration set of Directives and Candidacy functions into the file .../apache_x.x.x/conf/httpd.conf-dist. The intention is good, but the data may not be all that fresh. For instance, when we did it, the file included byCPU (see later), which is now deprecated. We suggest you review it in light of what is upcoming in the next section and the latest mod_backhand documentation.

12.6.4 Directives

mod_backhand has ten Apache directives of its own:

Backhand  

Backhand <candidacy function>
Default none 
Directory
 

This directive invokes one of the built-in mod_backhand candidacy functions — see later.

BackhandFromSO  

BackhandFromSO <path to .so file> <name of function> <argument>
Default none 
Directory
 

This directive invokes a DSO version of the candidacy function. At the time of writing the only one available was byHostname (see later). The distribution includes the C source byHostname.c, which one could use as a prototype to write new functions. For example:

BackhandFromSO libexec/byHostname.so byHostname www

would eliminate all hostnames that do not include www.

UnixSocketDir  

UnixSocketDir <Apache user home directory>
Default none 
Server
 

This directive gives mod_backhand a directory where it can write a file containing the performance details of this server — known as the "Arriba". Since mod_backhand has the permissions of Apache, this directory needs to be writable by webuser/webgroup — or whatever user/group you have configured Apache to run as. You might want to create a subdirectory /backhand beneath the Apache user's home directory, for example.

MulticastStats  

MulticastStats <dest addr>:<port>[,ttl]
MulticastStats <myip addr> <dest addr>:<port>[,ttl]
Default none 
Server
 

mod_backhand announces the status of its machine to others in the cluster by broadcasting or multicasting it periodically. By default, it broadcasts to the broadcast address of its own network (i.e., the one the server is listening on), but you may want it to send elsewhere. For example, you may have two networks, an Internet-facing one that receives requests and a backend network for distributing them among the servers. In this case you probably want to configure mod_backhand to broadcast on the backend network. You are also likely to want to accept redirected requests on the backend network, so you'd also use the second form of the command to specify a different IP address for your server. For example, suppose your machine's Internet-facing interface is number 193.2.3.4, but your backend interface is 10.0.0.4 with a /24 netmask. Then you'd want to have this in your Config file:

MulticastStats 10.0.0.4 10.0.0.255:4445

The first form of the command (with only a destination address) is likely to be used when you are using multicast for the statistics instead of broadcast.

Incidentally, mod_backhand listens on all ports on which it is configured to broadcast — obviously, you should choose a UDP port not used for anything else.

AcceptStats  

AcceptStats <ip address>[/<mask>]
Default none 
Server
 

This directive determines from where statistics will be accepted, which can be useful if you are running multiple clusters on a single network or to avoid accidentally picking up stuff that looks like statistics from the wrong network. It simply takes an IP address and netmask. So to correspond to the MulticastStats example given above, you would configure the following:

AcceptStats 10.0.0.0/24

If you need to listen on more than one network (or subnet), then you can use multiple AcceptStats directives. Note that this directive does not include a port number; so to avoid confusion, it would probably be best to use the same port on all networks that share media.
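The addresses in the MulticastStats and AcceptStats examples follow from the backend interface 10.0.0.4 with its /24 netmask. The Python sketch below only checks that arithmetic with the standard ipaddress module. The helper accepted() is a hypothetical name, and the sketch does not reproduce mod_backhand's own matching code.

```python
import ipaddress

# Backend interface from the example: 10.0.0.4 with a /24 netmask.
backend = ipaddress.ip_interface("10.0.0.4/24")
network = backend.network

print(network)                    # network to give to AcceptStats
print(network.broadcast_address)  # destination to give to MulticastStats

def accepted(source, accept_nets):
    """Return True if source falls inside any of the accepted networks."""
    addr = ipaddress.ip_address(source)
    return any(addr in ipaddress.ip_network(net) for net in accept_nets)

print(accepted("10.0.0.7", ["10.0.0.0/24"]))   # a backend peer
print(accepted("193.2.3.4", ["10.0.0.0/24"]))  # the Internet-facing address
```

An entry without a mask, such as a single host address, is treated by ip_network as a network containing just that one address.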

HTTPRedirectToIP  

HTTPRedirectToIP 
Default none  
Directory
 

mod_backhand normally proxies to the other servers if it chooses not to handle the request itself. If HTTPRedirectToIP is used, then it will instead redirect the client, using an IP address rather than a DNS name.

HTTPRedirectToName  

HTTPRedirectToName [format string]
Default [ServerName for the chosen Apache server] 
Directory
 

Like HTTPRedirectToIP, this tells mod_backhand to redirect instead of proxying. However, in this case it redirects to a DNS name constructed from the ServerName and the contents of the Host: header in the request. By default, it is the ServerName, but for complex setups hosting multiple servers on the same server farm, more cunning may be required to end up at the right virtual host on the right machine. So, the format string can be used to control the construction of the DNS name to which you're redirected. We can do no better than to reproduce mod_backhand's documentation:

The format string is just like C format string except that it only has two insertion tokens: %#S and %#H (where # is a number).

%-#S is the server name with the right # parts chopped off. If your server name is www-1.jersey.domain.com, %-3S will yield www-1.

%#S is the server name with only the # left parts preserved. If your server name is www-1.jersey.domain.com, %2S will yield www-1.jersey.

%-#H is the Host: with only the right # parts preserved. If the Host: is www.client.com, %-2H will yield client.com.

%#H will be the Host: with the left # parts chopped off. If the Host: is www.client.com, %1H will yield client.com.

For example, suppose you run a hosting company, hosting.com, and you have 5 machines named www[1-5].sanfran.hosting.com. You host www.client1.com and www.client2.com. You also add appropriate DNS names for www[1-5].sanfran.client[12].com.

Backhand HTTPRedirectToName %-2S.%-2H

This will redirect requests to www.client#.com to one of the www[1-5].sanfran.client#.com.
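To make the token rules concrete, here is a minimal Python sketch that expands a format string according to the four rules quoted above. It is an illustration of the documented behavior, not mod_backhand's implementation. The function name expand() is hypothetical, and the sketch assumes the Host: value carries no port number.

```python
import re

# Matches %#S, %-#S, %#H and %-#H, where # is a decimal number.
TOKEN = re.compile(r"%(-?)(\d+)([SH])")

def expand(fmt, server_name, host_header):
    """Expand the S and H tokens in fmt using dot-separated name parts."""
    def replace(match):
        negative = match.group(1) == "-"
        count = int(match.group(2))
        kind = match.group(3)
        parts = (server_name if kind == "S" else host_header).split(".")
        cut = max(len(parts) - count, 0)
        if kind == "S":
            # %-#S chops # parts off the right; %#S keeps the # leftmost parts.
            kept = parts[:cut] if negative else parts[:count]
        else:
            # %-#H keeps the # rightmost parts; %#H chops # parts off the left.
            kept = parts[cut:] if negative else parts[count:]
        return ".".join(kept)
    return TOKEN.sub(replace, fmt)

print(expand("%-2S.%-2H", "www1.sanfran.hosting.com", "www.client1.com"))
```

For the hosting example, the server name www1.sanfran.hosting.com loses its two rightmost parts and the Host: www.client1.com keeps its two rightmost parts, which gives www1.sanfran.client1.com.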

BackhandSelfRedirect  

BackhandSelfRedirect <On|Off>
Default Off  
Directory
 

A common way to run Apache when heavily loaded is to have two instances of Apache running on the same server: one serving static content and doing load balancing and the second running CGIs, typically with mod_perl or some other built-in scripting module. The reason you do this is that each instance of Apache with mod_perl tends to consume a lot of memory, so you only want them to run when they need to. So, normally one sets them up on a different IP address and carefully arranges only the CGI URLs to go to that server (or uses mod_proxy to reverse proxy some URLs to that server). If you are running mod_backhand, though, you can allow it to redirect to another server on the same host. If BackhandSelfRedirect is off and the candidacy functions indicate that the host itself is the best candidate, then mod_backhand will simply "fall through" and allow the rest of Apache to handle the request. However, if BackhandSelfRedirect is on, then it will redirect to itself as if it were another host, thus invoking the "heavyweight" instance. Note that this requires you to set up the MulticastStats directive to use the interface to which the mod_perl (or whatever) instance is bound, rather than the one to which the "lightweight" instance is bound.

BackhandLogLevel  

BackhandLogLevel <+|-><mbcs|dcsn|net><all|1|2|3|4>
Default Off  
Directory
 

The details seem undocumented, but to get copious error messages in the error log, use this (note the commas):

BackhandLogLevel +net1, +dcsnall

To turn logging off, either don't use the directive at all or use:

BackhandLogLevel -mbcsall, -netall, -dcsnall

BackhandModeratorPIDFile  

BackhandModeratorPIDFile filename
Default none
Server
 

If present, this directive specifies a file in which the PID of the "moderator" process will be put. The moderator is the process that generates and receives statistics.

12.6.5 Candidacy Functions

These built-in candidacy functions, which help to select one server to deal with each incoming request, are given as arguments to the Backhand directive (see earlier):

byAge  

byAge [time in seconds]
Default: 20 
Directory
 

This function steps around machines that are busy, have crashed, or are locked up: it eliminates servers that have not reported their resources for the "time in seconds".

byLoad  

byLoad [bias - a floating point number]
Default none 
Directory
 

The byLoad function produces a list of servers sorted by load. The bias argument, a floating-point number, lets you prefer the server that originally catches the request by offsetting the extra cost of forwarding it. In other words, it may pay to let the first server cope with the request, even if it is not quite the least loaded. Sensible values would be in the region of 0 to 1.0.
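The effect of the bias can be sketched in a few lines of Python. This is an illustration only: the function name by_load() and the load figures are hypothetical, and subtracting the bias from the local server's load is an assumption about how the preference is applied (adding it to every remote server would give the same ordering).

```python
def by_load(loads, local, bias=0.0):
    """Sort server names by load, treating the local server as bias lighter."""
    def effective(server):
        # Assumption: the bias is credited to the server that caught the request.
        return loads[server] - bias if server == local else loads[server]
    return sorted(loads, key=effective)

# Hypothetical one-minute load averages.
loads = {"liberty": 1.2, "fraternity": 0.9, "equality": 1.6}

print(by_load(loads, "liberty"))            # no bias: fraternity comes first
print(by_load(loads, "liberty", bias=0.5))  # liberty keeps the request
```

With a bias of 0.5, liberty keeps the request because its load is within 0.5 of the lightest server, so the cost of proxying is not paid for a small difference.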

byBusyChildren  

byBusyChildren [bias - an integer]
Default none  
Directory
 

This orders by the number of busy Apache children. The bias is subtracted from the current server's number of busy children, so the current server can service the request even if it isn't quite the least busy.

byCPU  

byCPU 
Default  
Directory
 

The byCPU function has the same effect as byLoad but makes its decision on the basis of CPU loading. The FAQ says, "This is mostly useless", and who will argue with that? This function is of historical interest only.

byLogWindow  

byLogWindow 
Default none 
Directory
 

The byLogWindow function keeps only the first log base 2 of the n servers listed and eliminates the rest: if there are 17 servers, it eliminates all after the first 4.
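The window size is the base-2 logarithm of the list length, rounded down. A minimal Python sketch follows, in which the function name by_log_window() is hypothetical. Keeping at least one server when the list has a single entry is an assumption of the sketch, not something the documentation quoted here confirms.

```python
def by_log_window(servers):
    """Keep the first floor(log2(n)) entries of a list of n servers."""
    # For n >= 1, n.bit_length() - 1 equals floor(log2(n)).
    # Assumption: never shrink a non-empty list to nothing.
    keep = max(1, len(servers).bit_length() - 1)
    return servers[:keep]

servers = ["s%d" % i for i in range(17)]
print(by_log_window(servers))
```

Because the function keeps the head of the list, what it selects depends entirely on the order left by the functions that ran before it.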

byRandom  

byRandom 
Default none 
Directory
 

The byRandom function reorders the list of servers using a pseudorandom method.

byCost  

byCost 
Default none  
Directory
 

The byCost function calculates the computing cost (mostly memory use, it seems) of redirection to each server and chooses the cheapest. The logic of the function is explained at http://www.cnds.jhu.edu/pub/papers/dss99.ps.

bySession  

bySession cookie
Default off 
Directory
 

This chooses the server based on the value of a cookie, which should be the IP address of the server to choose. Note that mod_backhand does not set the cookie — it's up to you to arrange that (presumably in a CGI script). This is obviously handy for situations where there's a state associated with the client that is only available on the server to which it first connected.

addPrediction  

addPrediction 
Default none  
Directory
 

If this function is still available, it is strongly deprecated. We only mention it to advise you not to use it.

byHostname  

byHostname <regexp> 
Default none 
Directory
 

This function needs to be run by BackhandFromSO (see earlier). It eliminates servers whose names do not pass the <regexp> regular expression. For example:

BackhandFromSO libexec/byHostname.so byHostname www

would eliminate all hostnames that do not include www.

12.6.6 The Config File

To avoid an obscure bug, make sure that Apache's User and Group directives are above this block:

LoadModule backhand_module libexec/mod_backhand.so
UnixSocketDir @@ServerRoot@@/backhand
# this multicast is actually broadcast because 128 < 224
# so no time to live parameter needed - ',1' restricts to the local networks
# MulticastStats 128.220.221.255:4445
MulticastStats 225.220.221.20:4445,1
AcceptStats 128.220.221.0/24

<Location "/backhand/">
  SetHandler backhand-handler
</Location>

The SetHandler directive produces the mod_backhand status page at the location specified — this shows the current servers, loads, etc.
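The comment in the block above relies on the fact that IPv4 multicast addresses start at 224.0.0.0, so an address beginning with 128 cannot be multicast. The short Python sketch below checks the destination addresses used in this chapter with the standard ipaddress module. It says nothing about how mod_backhand itself makes the distinction.

```python
import ipaddress

# Destination addresses taken from the MulticastStats examples in this chapter.
destinations = ["128.220.221.255", "225.220.221.20", "239.255.0.0"]

# IPv4 multicast is the range 224.0.0.0/4.
results = {dest: ipaddress.ip_address(dest).is_multicast for dest in destinations}

for dest, multicast in results.items():
    print(dest, "multicast" if multicast else "not multicast")
```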

The Candidacy functions should appear in a Directory or Location block. A sample scheme might be:

<Directory cgi-bin>
Backhand byAge 6
BackhandFromSO libexec/byHostname.so byHostname (sun|alpha)
Backhand byRandom
Backhand byLogWindow
Backhand byLoad
</Directory>

This would do the following:

  • Eliminate all servers not heard from for six seconds

  • Choose servers whose names include sun or alpha, to handle heavy CGI requests

  • Randomize the list of servers

  • Take a sample of the random list

  • Sort these servers in ascending order of load

  • Take the server at the top of the list
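The chain of steps above can be sketched in Python as a series of filters and sorts over a list of servers. Everything here is illustrative: the Server record, the function names, and the sample figures are hypothetical, and the window rule of keeping at least one server is an assumption. It is not mod_backhand's code.

```python
import random
import re
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    age: float   # seconds since the server last reported its statistics
    load: float  # reported load average

def by_age(servers, max_age):
    """Drop servers not heard from within max_age seconds."""
    return [s for s in servers if s.age <= max_age]

def by_hostname(servers, pattern):
    """Drop servers whose names do not match the regular expression."""
    return [s for s in servers if re.search(pattern, s.name)]

def by_random(servers, rng):
    """Return the servers in a pseudorandom order."""
    shuffled = list(servers)
    rng.shuffle(shuffled)
    return shuffled

def by_log_window(servers):
    """Keep the first floor(log2(n)) servers (at least one, by assumption)."""
    return servers[:max(1, len(servers).bit_length() - 1)]

def by_load(servers):
    """Sort in ascending order of load."""
    return sorted(servers, key=lambda s: s.load)

def choose(servers, rng):
    """Apply the functions in the order of the sample scheme."""
    candidates = by_age(servers, 6)
    candidates = by_hostname(candidates, r"(sun|alpha)")
    candidates = by_random(candidates, rng)
    candidates = by_log_window(candidates)
    candidates = by_load(candidates)
    return candidates[0] if candidates else None

farm = [Server("sun1", 1, 0.9), Server("sun2", 2, 0.4), Server("alpha1", 30, 0.1),
        Server("pc1", 1, 0.0), Server("alpha2", 3, 0.7), Server("sun3", 1, 0.2)]
print(choose(farm, random.Random()))
```

In this sample, alpha1 is dropped for being silent too long and pc1 for its name, so the choice is the less loaded of two randomly sampled survivors. Sampling before sorting is what keeps the single lightest server from receiving every request between load updates.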

12.6.7 Example Site

Normally, we would construct an example site to illustrate our points, but in the case of mod_backhand, it's rather difficult to do so without using several machines. So, instead, our example will be from a live site that one of the authors (BL) runs, FreeBMD, which is a worldwide volunteer effort to transcribe the Birth, Marriage, and Death Index for England and Wales, currently comprising over 3,000 volunteers. You can see FreeBMD at http://www.freebmd.org.uk/ if you are interested. At the time of writing, FreeBMD was load-balanced across three machines, each with 250 GB of RAID disk, 2 GB of RAM, and around 25 million records in a MySQL database. Users upload and modify files on the machines, from which the database is built, and for that reason the configuration is nontrivial: the files must live on a "master" machine to maintain consistency easily. This means that only part of the site can be load-balanced. Anyway, we will present the configuration file for one of these machines with interleaved comments following the line(s) to which they refer.

HostnameLookups off

This speeds up logging.

User webserv
Group webserv

Just the usual deal, setting a user for the web server.

ServerName liberty.thebunker.net

The three machines are called liberty, fraternity, and equality — clearly, this line is different on each machine.

CoreDumpDirectory /tmp

For diagnostic purposes, we may need to see core dumps. Note that /tmp would not be a good choice on a shared machine, since it is available to all and might leak information. There can also be a security hole allowing people to overwrite arbitrary files using soft links.

UnixSocketDir /var/backhand

This is backhand's internal socket.

MulticastStats 239.255.0.0:10000,1

Since this site shares its network with other servers in the hosting facility (http://www.thebunker.net/) in which it lives, we decided to use multicast for the statistics. Note the TTL of 1, limiting them to the local network.

AcceptStats 213.129.65.176
AcceptStats 213.129.65.177
AcceptStats 213.129.65.178
AcceptStats 213.129.65.179
AcceptStats 213.129.65.180
AcceptStats 213.129.65.181

The three machines each have two IP addresses: one fixed and one administered by Wackamole (see earlier). The fixed address is useful for administration and also for functions that have to be pinned to a single machine. Since we don't know which of these will turn out to be the source address for backhand statistics, we list them all.

NameVirtualHost *:80

The web servers also host several related projects (FreeCEN, FreeREG, and FreeUKGEN), so we used name-based virtual hosting for them.

Listen *:80

Set up the listening port on all IPs.

MinSpareServers 1
MaxSpareServers 1
StartServers 1

Well, this is what happens if you let other people configure your webserver! This is, in fact, terrible practice, but we thought we'd leave it in as an object lesson. Configuring the min and max spare servers to be the same is very bad, because it causes Apache to have to kill and restart child processes constantly and will lead to a somewhat unresponsive site. We'd recommend something more along the lines of a Min of 10 and a Max of 25. StartServers matters somewhat less, but it's useful to avoid horrendous loads at startup.

MaxClients 100

Limit the total number of children to 100. Usually, this limit is determined by how much RAM you have, and the size of the Apache children.
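The arithmetic behind such a limit is simple enough to sketch in Python. The 2 GB total comes from the machine description above, but the amount reserved for the database and operating system and the size of each Apache child are hypothetical figures. You would measure both on your own system.

```python
def max_clients(total_ram_mb, reserved_mb, child_mb):
    """Estimate how many Apache children fit in RAM without swapping."""
    return max(0, (total_ram_mb - reserved_mb) // child_mb)

# 2 GB of RAM; hypothetically 1 GB kept for MySQL and the OS,
# and 10 MB per Apache child.
estimate = max_clients(2048, 1024, 10)
print(estimate)
```

Under these assumed figures the estimate comes out close to the limit of 100 used here. Larger children, as with mod_perl, bring the number down quickly.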

MaxRequestsPerChild 10000

After 10,000 requests, restart the child. This is useful when running mod_perl to limit the total memory consumption, which otherwise tends to climb without limit.

LogFormat "%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\" \
\"%{BackhandProxyRequest}n\" \"%{ProxiedFrom}n\""

This provides extra logging so we can see what backhand is up to.

Port 80

This is probably redundant, but it doesn't hurt.

ServerRoot /home/apache

Again, redundant but harmless.

TransferLog /home/apache/logs/access.log
ErrorLog /home/apache/logs/error.log

The "main" logs should hardly be used, since all the actual hosts are in VirtualHost sections.

PidFile /home/apache/logs/httpd.pid
LockFile /home/apache/logs/lockfile.lock

Again, probably redundant, but harmless.

<VirtualHost *:80>
        Port 80
        ServerName freebmd.rootsweb.com
        ServerAlias www.freebmd.org.uk www3.freebmd.org.uk

Finally, our first virtual host. Note that all of this will be the same on each host, except www3.freebmd.org.uk, which will be www1 or www2 on the others.

        DocumentRoot /home/apache/hosts/freebmd/html
        ServerAdmin register@freebmd.rootsweb.com
        TransferLog "| /home/apache/bin/rotatelogs /home/apache/logs/freebmd/access_log.liberty 86400"
        ErrorLog "| /home/apache/bin/rotatelogs /home/apache/logs/freebmd/error_log.liberty 86400"

Note that we rotate the logs — since this server gets many hits per second, that's a good thing to do before you are confronted with a 10 GB log file!

        SetEnv BMD_USER_DIR /home/apache/hosts/freebmd/users
        SetEnv AUDITLOG /home/apache/logs/freebmd/auditlog
        SetEnv CORRECTIONSLOG /home/apache/logs/freebmd/correctionslog
        SetEnv MASTER_DOMAIN www1.freebmd.org.uk
        SetEnv MY_DOMAIN www3.freebmd.org.uk

These are used to communicate local configurations to various scripts. Some of them exist because of differences between development and live environments, and some exist because of differences between the various platforms.

        AddType text/html .shtml
        AddHandler server-parsed .shtml
        DirectoryIndex index.shtml index.html

Set up server-parsed HTML, and allow for directory indexes using that.

        ScriptAlias /cgi /home/apache/hosts/freebmd/cgi
        ScriptAlias /admin-cgi /home/apache/hosts/freebmd/admin-cgi
        ScriptAlias /special-cgi /home/apache/hosts/freebmd/admin-cgi
        ScriptAlias /join /home/apache/hosts/freebmd/cgi/bmd-add-user.pl

The various different CGIs, some of which are secured below.

        Alias /scans /home/FreeBMD-scans
        Alias /logs /home/apache/logs/freebmd
        Alias /GUS /raid/freebmd/GUS/Live-GUS
        Alias /motd /home/apache/hosts/freebmd/motd
        Alias /icons /home/apache/hosts/freebmd/backhand-icons

And some aliases to keep everything sane.

        <Location /special-cgi>
                AllowOverride none
                AuthUserFile /home/apache/auth/freebmd/special_users
                AuthType Basic
                AuthName "Live FreeBMD - Liberty Special Administration Site"
                require valid-user
                SetEnv Administrator 1
        </Location>

special-cgi needs authentication before you can use it, and is also particular to this machine.

        <Location />
                Backhand byAge
                Backhand byLoad .5
        </Location>

This achieves load balance. byAge means we won't attempt to use servers that are no longer talking to us, and byLoad means use the least loaded machine — except we prefer ourselves if our load is within .5 of the minimum, to avoid silly proxying based on tiny load average differences. We're also looking into using byBusyChildren, which is probably more sensitive than byLoad, and we are also considering writing a backhand module to allow us to proxy by database load instead.

        <LocationMatch /cgi/(show-file|bmd-user-admin|bmd-add-user|bmd-bulk-add|bmd-challenge|bmd-forgotten|bmd-synd|check-range|list-synd|show-synd-info|submitter)\.pl>
                BackHand off
        </LocationMatch>

        <LocationMatch /(special-cgi|admin-cgi)/>
                BackHand off
        </LocationMatch>

        <LocationMatch /join>
                BackHand off
        </LocationMatch>

These scripts should not be load-balanced.

        <LocationMatch /cgi/bmd-files.pl>
                BackhandFromSO libexec/byHostname.so byHostname (equality)
        </LocationMatch>

This script should always go to equality.

        <LocationMatch /(freebmd|freereg|freecen|search)wusage>
                BackhandFromSO libexec/byHostname.so byHostname (fraternity)
        </LocationMatch>

And these should always go to fraternity.

        <Location /backhand>
                SetHandler backhand-handler
        </Location>

This sets the backhand status page up.

</VirtualHost>

For simplicity, we've left out the configuration for the other virtual hosts. They don't do anything any more interesting, anyway.

[1]  Notes can be broadcast if you've added scripts to do it — these are widely available, though they don't come with CVS itself.

[2]  Such as SQL in a Nutshell, by Kevin Kline (O'Reilly, 2000).

[3]  Bryan Costales with Eric Allman, sendmail (O'Reilly, 2002)

[4]  Indeed, it was exactly this kind of situation that led to the formation of the Apache Group in the first place.
