home | O'Reilly's CD bookshelfs | FreeBSD | Linux | Cisco | Cisco Exam  


Practical mod_perlPractical mod_perlSearch this book

12.5. Adding a Proxy Server in httpd Accelerator Mode

We have already presented a solution with two servers: one plain Apache server, which is very light and configured to serve static objects, and the other with mod_perl enabled (very heavy) and configured to serve mod_perl scripts and handlers. We named them httpd_docs and httpd_perl, respectively.

In the dual-server setup presented earlier, the two servers coexist at the same IP address by listening to different ports: httpd_docs listens to port 80 (e.g., http://www.example.com/images/test.gif) and httpd_perl listens to port 8000 (e.g., http://www.example.com:8000/perl/test.pl). Note that we did not write http://www.example.com:80 for the first example, since port 80 is the default port for the HTTP service. Later on, we will change the configuration of the httpd_docs server to make it listen to port 81.

This section will attempt to convince you that you should really deploy a proxy server in httpd accelerator mode. This is a special mode that, in addition to providing the normal caching mechanism, accelerates your CGI and mod_perl scripts by taking the responsibility of pushing the produced content to the client, thereby freeing your mod_perl processes. Figure 12-3 shows a configuration that uses a proxy server, a standalone Apache server, and a mod_perl-enabled Apache server.

Figure 12-3

Figure 12-3. A proxy server, standalone Apache, and mod_perl-enabled Apache

The advantages of using the proxy server in conjunction with mod_perl are:

Of course, there are drawbacks. Luckily, these are not functionality drawbacks—they are more administration hassles. The disadvantages are:

  • You have another daemon to worry about, and while proxies are generally stable, you have to make sure to prepare proper startup and shutdown scripts, which are run at boot and reboot as appropriate. This is something that you do once and never come back to again. Also, you might want to set up the crontab to run a watchdog script that will make sure that the proxy server is running and restart it if it detects a problem, reporting the problem to the administrator on the way. Chapter 5 explains how to develop and run such watchdogs.

  • Proxy servers can be configured to be light or heavy. The administrator must decide what gives the highest performance for his application. A proxy server such as Squid is light in the sense of having only one process serving all requests, but it can consume a lot of memory when it loads objects into memory for faster service.

  • If you use the default logging mechanism for all requests on the front- and backend servers, the requests that will be proxied to the backend server will be logged twice, which makes it tricky to merge the two log files, should you want to. Therefore, if all accesses to the backend server are done via the frontend server, it's the best to turn off logging of the backend server.

    If the backend server is also accessed directly, bypassing the frontend server, you want to log only the requests that don't go through the frontend server. One way to tell whether a request was proxied or not is to use mod_proxy_add_forward, presented later in this chapter, which sets the HTTP header X-Forwarded-For for all proxied requests. So if the default logging is turned off, you can add a custom PerlLogHandler that logs only requests made directly to the backend server.

    If you still decide to log proxied requests at the backend server, they might not contain all the information you need, since instead of the real remote IP of the user, you will always get the IP of the frontend server. Again, mod_proxy_add_forward, presented later, provides a solution to this problem.

Let's look at a real-world scenario that shows the importance of the proxy httpd accelerator mode for mod_perl.

First let's explain an abbreviation used in the networking world. If someone claims to have a 56-kbps connection, it means that the connection is made at 56 kilobits per second (~56,000 bits/sec). It's not 56 kilobytes per second, but 7 kilobytes per second, because 1 byte equals 8 bits. So don't let the merchants fool you—your modem gives you a 7 kilobytes-per-second connection at most, not 56 kilobytes per second, as one might think.

Another convention used in computer literature is that 10 Kb usually means 10 kilo-bits and 10 KB means 10 kilobytes. An uppercase B generally refers to bytes, and a lowercase b refers to bits (K of course means kilo and equals 1,024 or 1,000, depending on the field in which it's used). Remember that the latter convention is not followed everywhere, so use this knowledge with care.

In the typical scenario (as of this writing), users connect to your site with 56-kbps modems. This means that the speed of the user's network link is 56/8 = 7 KB per second. Let's assume an average generated HTML page to be of 42 KB and an average mod_perl script to generate this response in 0.5 seconds. How many responses could this script produce during the time it took for the output to be delivered to the user? A simple calculation reveals pretty scary numbers:

(42KB)/(0.5sx7KB/s) = 12

Twelve other dynamic requests could be served at the same time, if we could let mod_perl do only what it's best at: generating responses.

This very simple example shows us that we need only one-twelfth the number of children running, which means that we will need only one-twelfth of the memory.

But you know that nowadays scripts often return pages that are blown up with JavaScript and other code, which can easily make them 100 KB in size. Can you calculate what the download time for a file that size would be?

Furthermore, many users like to open multiple browser windows and do several things at once (e.g., download files and browse graphically heavy sites). So the speed of 7 KB/sec we assumed before may in reality be 5-10 times slower. This is not good for your server.

Considering the last example and taking into account all the other advantages that the proxy server provides, we hope that you are convinced that despite a small administration overhead, using a proxy is a good thing.

Of course, if you are on a very fast local area network (LAN) (which means that all your users are connected from this network and not from the outside), the big benefit of the proxy buffering the output and feeding a slow client is gone. You are probably better off sticking with a straight mod_perl server in this case.

Two proxy implementations are known to be widely used with mod_perl: the Squid proxy server and the mod_proxy Apache module. We'll discuss these in the next sections.



Library Navigation Links

Copyright © 2003 O'Reilly & Associates. All rights reserved.