home | O'Reilly's CD bookshelfs | FreeBSD | Linux | Cisco | Cisco Exam  


Apache The Definitive Guide, 3rd EditionApache: The Definitive GuideSearch this book

12.5. Scalability

Moving a web site from one machine serving a few test requests to an industrial-strength site capable of serving the full flood of web demand may not be a simple matter.

12.5.1. Performance

A busy site will have performance issues, which boil down to the question: "Are we serving the maximum number of customers at the minimum cost?"

12.5.1.1. Tools

You can see how resources are being used under Unix from the utilities: top, vmstat, swapinfo, iostat, and their friends. (See Essential System Administration, by Aeleen Frisch [O'Reilly, 2002].)

12.5.1.4. Load balancing

mod_backhand is free software for load balancing, covered later in this chapter. For expensive software look for ServerIron, BigIP, LoadDirector, on the Web.

12.5.1.5. Image server, text server

The amount of RAM at your disposal limits the number of copies of Apache (as httpd or httpsd) that you can run, and that limits the number of simultaneous clients you can serve. You can reduce the size of some of the httpd instances by having a cutdown version for images, PDF files, or text while running a big version for scripts.

What normally makes the difference in size is the necessity to load a scripting language such as Perl or PHP into httpd. Because these provide persistent storage of modules and variables between requests, they tend to consume far more RAM than servers that only serve static pages and images. The normal answer is to run two copies of Apache, one for the static stuff and one for the scripts. Each copy has to bind to a different IP and port combination, of course, and usually the number of instances of the dynamic one has to be limited to avoid thrashing.

12.5.2. Shared Versus Replicated DBs

You may want to speed up database accesses by replicating your database across several machines so that they can serve clients independently. Replication is easy if the data is static, i.e., catalogs, texts, libraries of images, etc. Replication is hard if the database is often updated as it would be with active clients. However, you can sidestep replication by dividing your client database into chunks (for instance, by surname: A-D, E-G,...etc.), each served by a single machine. To increase speed, you divide it smaller and add more hardware.



Library Navigation Links

Copyright © 2003 O'Reilly & Associates. All rights reserved.