11.3. Understanding an httpd.conf File

It's helpful to know the default configuration when you're called upon to correct the configuration of someone else's system. In this section we examine the values set in the default configuration on a Solaris 8 system. (The default Solaris 8 configuration file is listed in Appendix F, "Solaris httpd.conf File".) Here we focus on the directives that are actually used in the Solaris 8 configuration, and a few others that show important Apache features. There are some other directives that we don't discuss.

If you need additional information about any directive, there are many places to look. The full httpd.conf file contains many comments, which explain the purpose of each directive and are an excellent source of information. The Apache web site (http://www.apache.org) provides online documentation. Two excellent books on Apache configuration are Apache: The Definitive Guide, by Ben and Peter Laurie (O'Reilly), and Linux Apache Web Server Administration, by Charles Aulds (Sybex). However, you'll probably find more information about the httpd.conf file than you need for an average configuration right here in this chapter.

The httpd.conf file that comes with Solaris has 160 active configuration lines. To tackle that much information, the following sections organize the configuration directives into different groups. Note that the configuration file itself organizes directives by scope: global environment directives, main server directives, and virtual host directives. (Virtual hosts are explained later in this chapter.) Although that organization is great for httpd when it is processing the file, it's not so great for a human reading the file. Here, related directives are grouped by function to make the individual directives more understandable. Once you understand the individual directives, you will understand the entire configuration.
We start our look at the httpd.conf file with the directives that load dynamically loadable modules. These modules must be loaded before the directives they provide can be used in the configuration, so it makes sense to discuss loading the modules before we discuss the features they provide. Understanding dynamically loadable modules is a good place to start understanding Apache configuration.

11.3.1. Loading Dynamic Shared Objects

The two directives that appear most in the Solaris httpd.conf file are LoadModule and AddModule. Together, they make up more than 60 of the 160 active lines in the httpd.conf file. All of these lines configure the Dynamic Shared Object (DSO) modules used by the Apache web server.

Apache is composed of many software modules. Like kernel modules, DSO modules can be compiled into Apache or loaded at runtime. Running httpd with the -l command-line option lists all the modules compiled into Apache. The following example is from a Solaris 8 system:

    $ /usr/apache/bin/httpd -l
    Compiled-in modules:
      http_core.c
      mod_so.c

Some systems may have many modules compiled into the Apache daemon. Solaris and Red Hat systems are delivered with only the two modules shown in this listing compiled in: http_core.c and mod_so.c.
In addition to these statically linked modules, Solaris uses many dynamically loadable modules. The LoadModule and AddModule directives are used in the httpd.conf file to load DSOs. First, each module is identified by a LoadModule directive. For example, this line in the Solaris httpd.conf file identifies the module that tracks users through the use of cookies:

    LoadModule usertrack_module /usr/apache/libexec/mod_usertrack.so

The LoadModule directive is followed by the module name and the path of the shared object file.

Before a module can be used, it must be added to the list of modules that are available to Apache. The first step in building the new module list is to clear the old one. This is done with the ClearModuleList directive. ClearModuleList has no arguments or options. It occurs in the httpd.conf file after the last LoadModule directive and before the first AddModule directive.

The AddModule directive adds a module name to the module list. The module list must include all optional modules, both those compiled into the server and those that are dynamically loaded. On our sample Solaris system, that means that there is one more AddModule directive in the httpd.conf file than there are LoadModule directives. The extra AddModule directive handles mod_so.c, which is the only optional module compiled into Apache on our sample system.
Mostly, however, LoadModule and AddModule directives occur in pairs: there is one AddModule directive for every LoadModule directive. For example, the following AddModule directive in the Solaris httpd.conf file adds the usertrack_module defined by the LoadModule directive shown previously to the module list:

    AddModule mod_usertrack.c

The AddModule directive is followed by the name of the source file for the module being loaded. Notice that this is the name of the source file that produced the object module, not the module name seen in the LoadModule directive. This name is identical to the object filename except for the extension. In the LoadModule directive, which uses the shared object extension .so, the object filename is mod_usertrack.so. AddModule uses the source filename extension .c, so the module name is mod_usertrack.c.

Table 11-1 lists all the modules referenced by AddModule directives in the Solaris 8 httpd.conf file.

Table 11-1. DSO modules loaded in the Solaris configuration
If you decide to add modules to your configuration, do so very carefully. The order of the LoadModule and AddModule directives in the httpd.conf file is critical. Don't change things without knowing what you're doing. Before proceeding with a new installation, read the documentation that comes with your new module and the modules documentation found in the manual/mod directory of the Apache distribution. See the previously mentioned book Linux Apache Web Server Administration for detailed advice about adding new modules.

Once the DSOs are loaded, the directives that they provide can be used in the configuration file. Let's continue looking at the Solaris httpd.conf file by examining some of the basic configuration directives.

11.3.2. Basic Configuration Directives

This section covers six different directives. The directives as they appear in the sample configuration we created for our Solaris system are:

    ServerAdmin webmaster@www.wrotethebook.com
    ServerName www.wrotethebook.com
    UseCanonicalName On
    ServerRoot "/var/apache"
    ServerType standalone
    Port 80

Two of the basic directives, ServerAdmin and ServerName, were touched upon earlier in the chapter. ServerAdmin defines the email address of the web server administrator. This is set to a bogus value, you@your.host, in the default Solaris configuration. You should change this to the full email address of the real web administrator before starting the server.

ServerName defines the hostname returned to clients when they read data from this server. In the default Solaris configuration, the ServerName directive is commented out, which means that the "real" hostname is sent to clients. Thus, if the name assigned to the first network interface is crab.wrotethebook.com, then that is the name sent to clients. Many Apache experts suggest defining an explicit value for ServerName in order to document your configuration and to ensure that you get exactly the value you want.
Earlier, we set ServerName to www.wrotethebook.com, so that even though the web server is running on crab, the server will be known as www.wrotethebook.com during web interactions. Of course, www.wrotethebook.com must be a valid hostname configured in DNS. (See Chapter 8, "Configuring DNS", where www is defined as a nickname for crab in the wrotethebook.com zone file.)

A configuration directive related to ServerName is UseCanonicalName, which defines how httpd builds "self-referencing" URLs. A self-referencing URL contains the name of the server itself in the hostname portion of the URL. For example, on the server www.wrotethebook.com, a URL that starts with http://www.wrotethebook.com would be a self-referencing URL. The hostname in the URL should be a canonical name, which is a name that DNS can resolve to a valid IP address. When UseCanonicalName is set to on, as it is in the default Solaris configuration, the value in ServerName is used to identify the server in self-referencing URLs. For most configurations, leave it set to on. If it is set to off, the value that came in the query from the client is used.

The ServerRoot option defines the directory that contains important files used by httpd, including error files, log files, and the three configuration files: httpd.conf, srm.conf, and access.conf. In the Solaris configuration, ServerRoot points to /var/apache. This is surprising in that the Solaris httpd configuration files are actually located in /etc/apache, so clearly something else is at work. Solaris uses the -f option on the httpd command line to override the location of the httpd.conf file at runtime. httpd is started at boot time using the script /etc/init.d/apache. That script defines a variable named CONF_FILE that contains the value /etc/apache/httpd.conf. This variable is used with the httpd command that launches the web server, and it is this variable that defines the location of the configuration file on a Solaris system.
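Returning briefly to UseCanonicalName, the choice it controls can be pictured with a few lines of Python. This is only an illustrative sketch, not Apache's code: the function name and its parameters are hypothetical, and details such as port numbers are ignored.

```python
def self_referencing_url(server_name, client_host, use_canonical_name, path="/"):
    """Build a self-referencing URL (simplified sketch; ports are ignored).

    server_name        -- value of the ServerName directive
    client_host        -- hostname the client supplied in its request
    use_canonical_name -- True for "UseCanonicalName On", False for "Off"
    """
    # With UseCanonicalName on, ServerName identifies the server;
    # with it off, the name that came in from the client is used.
    host = server_name if use_canonical_name else client_host
    return "http://" + host + path


# A client that asked for crab.wrotethebook.com still sees www when the option is on.
print(self_referencing_url("www.wrotethebook.com", "crab.wrotethebook.com", True, "/books/"))
print(self_referencing_url("www.wrotethebook.com", "crab.wrotethebook.com", False, "/books/"))
```

With the option on, every generated URL carries the documented name of the server, which is why leaving it on is the usual advice.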
The ServerType option defines how the server is started. If the server starts from a startup script at boot time, the option is set to standalone. If the server is run on demand by inetd, the option is set to inetd. The default Solaris configuration sets ServerType to standalone, which is the best value; web servers are usually in high demand, so it is best to start them at boot time. It is possible, of course, for a user to set up a small, rarely used web site on a desktop workstation, in which case running the server from inetd may be desirable. But the web server you create for your network should be standalone.

Port defines the TCP port number used by the server. The standard port number is 80. On occasion, private web servers run on other port numbers. For example, Solaris runs the AnswerBook2 server on port 8888. Other popular alternative ports for special-purpose web sites are 8080 and 8000. If you change the port number, you must then tell your users the nonstandard port number. For example, http://jerboas.wrotethebook.com:8080 is a URL for a web site running on TCP port 8080 on host jerboas.wrotethebook.com.

When ServerType is set to inetd, it is usually desirable to set Port to something other than 80. The reason for this is that the ports under 1024 are "privileged" ports. If 80 is used, httpd must be run from inetd with the userid root. This is a potential security problem, as an intruder might be able to exploit the web site to get root access. Using port 80 is okay when ServerType is standalone because the initial httpd process does not provide direct client service. Instead it starts several other HTTP daemons, called the swarm, to provide client services. The daemons in the swarm do not run with root privilege.

11.3.3. Managing the Swarm

In the original web server design, the server would create separate processes to handle individual requests.
This placed a heavy load on the CPU when the server was busy and had a major negative impact on responsiveness. It was possible for the entire system to be overwhelmed by httpd processes. Apache uses a different approach. A swarm of server processes starts at boot time (the ps command earlier in the chapter shows several httpd processes running on the Solaris system), and all the processes in the swarm share the workload. If all the persistent httpd processes become busy, spare processes are started to share the work. Five directives in the Apache configuration control how the swarm of server child processes is managed. They are:
The User and Group directives define the UID and GID under which the swarm of httpd processes are run. When httpd starts at boot time, it runs as a root process, binds to port 80, and then starts a group of child processes that provide the actual web services. These child processes are the ones given the UID and GID defined in the file. The UID and GID should provide the least possible system privileges to the web server. On the Solaris system, this is the user nobody and the group nobody. The previous ps command output shows this clearly. One httpd process belongs to root and five other httpd processes belong to the user nobody.

An alternative to using nobody is to create a userid and groupid just for httpd. If you do this, set the file permissions granted to the new user account very carefully. The advantage of creating a special user and group for httpd is that you can use group permissions for added protection, and you won't be completely dependent on the world permissions granted to nobody.

11.3.4. Defining Where Things Are Stored

The DocumentRoot directive defines the directory that contains the web server documents. For security reasons, this is not the same directory that holds the configuration files. As we saw earlier, the Solaris setting for DocumentRoot is:

    DocumentRoot "/var/apache/htdocs"

To apply directives to a specific directory, create a container for those directives. Three of the httpd.conf directives used to create containers are Directory, Files, and Location.
Directories and files are easy to understand: they are parts of the Unix filesystem that every system administrator knows. Documents, on the other hand, are specific to the web server. The screenful of information that appears in response to a web query is a document; it can be made up of many files from different directories. The Location container provides an easy way to refer to a complex document as a single entity. We will see examples of Location and Files containers later in this chapter. Here we look at Directory containers.

The Solaris configuration defines a Directory container for the server's root directory and for the DocumentRoot:

    <Directory />
        Options FollowSymLinks
        AllowOverride None
    </Directory>

    <Directory "/var/apache/htdocs">
        Options Indexes FollowSymLinks
        AllowOverride None
        Order allow,deny
        Allow from all
    </Directory>

Each Directory container starts with a Directory directive and ends with a </Directory> tag. Both containers shown here enclose configuration statements that apply only to the named directory and its subdirectories. The purpose of the directives inside these containers is covered later in Section 11.4, "Web Server Security". For now, it is sufficient to understand that containers are used inside the httpd.conf file to limit the scope of various configuration directives.

The Alias directive and the ScriptAlias directive both map a URL path to a directory on the server. For example, the Solaris configuration contains the following three directives:

    Alias /icons/ "/var/apache/icons/"
    Alias /manuals/ "/usr/apache/htdocs/manual/"
    ScriptAlias /cgi-bin/ "/var/apache/cgi-bin/"

The first line maps the URL path /icons/ to the directory /var/apache/icons/. Thus a request for www.wrotethebook.com/icons/ is served from the directory /var/apache/icons/ on the server www.wrotethebook.com. The second directive maps the URL path /manuals/ to the directory /usr/apache/htdocs/manual/ on the same server.
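The prefix mapping performed by these three directives can be sketched in Python. This is a simplified model rather than Apache's implementation: the function name map_alias and the sample filenames are hypothetical, and the sketch simply takes the first prefix that matches.

```python
# URL-path prefixes and target directories copied from the three
# Solaris directives quoted above.
ALIAS_TABLE = [
    ("/icons/", "/var/apache/icons/"),
    ("/manuals/", "/usr/apache/htdocs/manual/"),
    ("/cgi-bin/", "/var/apache/cgi-bin/"),
]


def map_alias(url_path):
    """Return the filesystem path for an aliased URL path, or None if no alias applies."""
    for prefix, directory in ALIAS_TABLE:
        if url_path.startswith(prefix):
            # Replace the URL prefix with the directory it is mapped to.
            return directory + url_path[len(prefix):]
    return None


print(map_alias("/icons/folder.gif"))    # hypothetical icon file
print(map_alias("/manuals/index.html"))  # hypothetical manual page
print(map_alias("/books/"))              # no alias applies
```

A path that matches no alias falls through to the normal DocumentRoot handling.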
You may have several Alias directives to handle several different mappings, but you will typically have only one ScriptAlias directive. The ScriptAlias directive functions in exactly the same way as the Alias directive, except that the directory it points to contains executable CGI programs. Therefore, httpd grants this directory execution privileges. ScriptAlias is particularly important because it allows you to maintain executable web scripts in a directory separate from the DocumentRoot. CGI scripts are the single biggest security threat to your server; maintaining them separately allows you to have tighter control over who has access to the scripts.

The Solaris configuration has containers for the /var/apache/icons directory and the /var/apache/cgi-bin directory, but none for the /usr/apache/htdocs/manual directory. Just because a directory is defined inside the httpd.conf file does not mean that a Directory container must be created for that directory. The /var/apache/icons and the /var/apache/cgi-bin containers are shown here:

    <Directory "/var/apache/icons">
        Options Indexes MultiViews
        AllowOverride None
        Order allow,deny
        Allow from all
    </Directory>

    <Directory "/var/apache/cgi-bin">
        AllowOverride None
        Options None
        Order allow,deny
        Allow from all
    </Directory>

These containers enclose AllowOverride, Options, Order, and Allow statements -- all of which relate to security. Most of the directives found in containers have security implications, and have been placed in containers to provide special security settings for a file, document, or directory. All of the directives used in the containers shown above are covered in Section 11.4, "Web Server Security" later in this chapter.

The UserDir directive enables personal user web pages and points to the directory that contains the user pages. UserDir usually points to public_html, and it does in the Solaris configuration.
With this default setting, users create a directory named public_html in their home directories to hold their personal web pages. When a request comes in for www.wrotethebook.com/~sara, for example, it is mapped to the directory /export/home/sara/public_html on the server. An alternative is to define a full pathname on the UserDir directive line such as /export/home/userpages. Then the administrator creates the directory and allows each user to store personal pages in subdirectories of this directory, so that a request for www.wrotethebook.com/~sara will map to the directory /export/home/userpages/sara. The advantage of this approach is that it makes it easier for you to monitor the content of user pages. The disadvantage is that a separate user web directory tree must be created and protected separately, whereas a web folder within the user's home directory will inherit the protection of that user's home.

The PidFile and ScoreBoardFile directives define the paths of files that relate to process status. The PidFile is the file in which httpd stores its process ID, and the ScoreBoardFile is the file where httpd writes process status information.

The DirectoryIndex option defines the name of the file retrieved if the client's request does not include a filename. Our Solaris system has the following value for this option:

    DirectoryIndex index.html

Given the value defined for DocumentRoot and this value, if the server gets a request for http://www.wrotethebook.com, it gives the client the file /var/apache/htdocs/index.html. If it gets a request for http://www.wrotethebook.com/books/, it gives the client the file /var/apache/htdocs/books/index.html. The DocumentRoot is prepended to every request, and the DirectoryIndex is appended to any request that doesn't end in a filename.

Earlier in this chapter, we saw from an ls of /var/apache/htdocs that the directory contains a file named index.html. But what if it didn't? What would Apache send to the client?
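Before turning to that question, the prepend-and-append rule just described can be made concrete with a short Python sketch. It is a deliberately simplified model: the function name request_to_file and the file tcp.html are hypothetical, and a request for a directory without a trailing slash is not handled.

```python
DOCUMENT_ROOT = "/var/apache/htdocs"  # DocumentRoot from the Solaris configuration
DIRECTORY_INDEX = "index.html"        # DirectoryIndex from the Solaris configuration


def request_to_file(url_path):
    """Map the path portion of a URL to the file httpd would look for."""
    if url_path == "":
        url_path = "/"               # a URL with no path refers to the top of the site
    path = DOCUMENT_ROOT + url_path  # DocumentRoot is prepended to every request
    if path.endswith("/"):           # the request does not end in a filename
        path += DIRECTORY_INDEX      # so DirectoryIndex is appended
    return path


print(request_to_file(""))                 # http://www.wrotethebook.com
print(request_to_file("/books/"))          # http://www.wrotethebook.com/books/
print(request_to_file("/books/tcp.html"))  # hypothetical document
```

The first two calls reproduce the two examples given in the text.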
If the file index.html is not found in the directory, httpd sends the client a listing of the directory, if the configuration permits it. A directory listing is allowed if the Options directive in the Directory container for the directory contains the keyword Indexes. (More on Options later.) If a directory index is allowed, several different directives control how that directory listing is formatted.

11.3.5. Creating a Fancy Index

The keyword FancyIndexing is used on the IndexOptions directive line to enable a "fancy index" of the directory when Apache is forced to send the client a directory listing. When fancy indexing is enabled, httpd creates a directory list that includes graphics, links, and other advanced features. The Solaris configuration enables fancy indexing with the IndexOptions directive, and it contains about 20 extra lines to help configure the fancy index. Solaris uses the following directives to define the graphics and features used in the fancy directory listing:
11.3.6. Defining File Types

MIME file types and file extensions play a major role in helping the server determine how a file should be handled. Specifying MIME options is also a major part of the Solaris httpd.conf file. The directives involved are:
Another directive that is commonly used to process files based on the filename extension is the AddHandler directive. This directive maps a file handler to a file extension. A file handler is a program that knows how to process a specific file type. For example, the handler cgi-script is able to execute CGI files. The Solaris configuration does not define any optional handlers, so all the AddHandler directives are commented out.

11.3.7. Performance Tuning Directives

The KeepAlive directive enables the use of persistent connections. Without persistent connections, the client must make a new connection to the server for every link the user selects. Because HTTP runs over TCP, every connection requires a connection setup, adding time to every file retrieval. With persistent connections, the server waits to see if the client has additional requests before it closes the connection. Therefore, the client does not need to create a new connection to request a new document.

The KeepAliveTimeout defines the number of seconds the server holds a persistent connection open waiting to see if the client has additional requests. The Solaris configuration turns KeepAlive on and sets KeepAliveTimeout to 15 seconds.

MaxKeepAliveRequests defines the maximum number of requests that will be accepted on a "kept-alive" connection before a new TCP connection is required. Solaris sets this value to 100, which is the Apache default. Setting MaxKeepAliveRequests to 0 allows unlimited requests. 100 is a good value for this parameter: few users request 100 document transfers, so the value essentially creates a persistent connection for all reasonable cases. If the client does request more than 100 document transfers, it might indicate a problem with the client system, so requiring another connection request is probably a good idea.

Timeout defines the number of seconds the server waits for a transfer to complete.
The value needs to be large enough to handle the size of the files your site sends as well as the low performance of the modem connections of your clients. But if it is set too high, the server will hold open connections for clients that may have gone offline. The Solaris configuration has the Timeout set to 5 minutes (300 seconds), which is a very common setting.

BrowserMatch is a different type of tuning parameter: it reduces performance for compatibility's sake. The Solaris configuration contains the following five BrowserMatch directives:

    BrowserMatch "Mozilla/2" nokeepalive
    BrowserMatch "MSIE 4\.0b2;" nokeepalive downgrade-1.0 force-response-1.0
    BrowserMatch "RealPlayer 4\.0" force-response-1.0
    BrowserMatch "Java/1\.0" force-response-1.0
    BrowserMatch "JDK/1\.0" force-response-1.0

The BrowserMatch statements are used to present information in ways that are compatible with the capabilities of different web browsers. For example, a browser may be able to handle only HTTP 1.0, not HTTP 1.1. In this case, downgrade-1.0 is used on the BrowserMatch line to ensure that the server uses only HTTP 1.0 when dealing with that browser. In the Solaris configuration, keepalives are disabled for two browsers. One browser is offered only HTTP 1.0 during the connection, and responses are formatted to be compatible with HTTP 1.0 for four different browsers.

Don't fiddle with the BrowserMatch directives. These settings are shipped as defaults in the Apache distribution, and are set to handle the limitations of different browsers. These are tuning parameters, but they are used by the Apache developers to adjust to the limitations of older browsers.

HostnameLookups tells httpd whether or not it should log hostnames as well as IP addresses. The advantage of enabling hostname logging is that you get a more readable log. The disadvantage is that httpd has the added overhead of DNS name lookups. Setting this to off, as in the Solaris configuration, enhances server performance.
The HostnameLookups directive affects what is logged, but its major impact is on system performance, which is why we cover it under tuning parameters instead of logging directives.

11.3.8. Logging Configuration Directives

Log files provide a great deal of information about the web server. The following seven lines define the Apache logging configuration in the default Solaris 8 httpd.conf file:

    ErrorLog /var/apache/logs/error_log
    LogLevel warn
    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
    LogFormat "%h %l %u %t \"%r\" %>s %b" common
    LogFormat "%{Referer}i -> %U" referer
    LogFormat "%{User-agent}i" agent
    CustomLog /var/apache/logs/access_log common

ErrorLog defines the path of the error log file. Use the error log to track and correct failures. You should review the log at least once a day to check for problems. To keep a close eye on the file while you're logged in, use the tail command with the -f option:

    $ tail -l 1 -f /var/apache/logs/error_log

The tail command prints the tail end of a file; in the example, the file is /var/apache/logs/error_log, the path defined by the ErrorLog directive. The -l option is the lines option. It tells tail how many lines from the end of the file to print. In this case, -l 1 directs tail to print the (one) last line in the file. The -f option keeps the tail process running so that you will see each record as it is written to the file. This allows you to monitor the file in real time.

The LogLevel directive defines the type of events written to the error log. The Solaris configuration sets LogLevel to warn, which specifies that warnings and other more critical errors are to be written to the log. This is a safe setting for an error log because it logs a wide variety of operational errors. LogLevel has eight possible settings: debug, info, notice, warn, error, crit, alert, and emerg. The log levels are cumulative.
For example, warn provides warnings, errors, critical messages, alerts, and emergency messages; debug provides all types of logging, which causes the file to grow at a very rapid rate; emerg keeps the file small but notifies you only of disasters. warn is a good compromise between not enough detail and too much detail.

Just as important as reporting errors, the logs provide information about who is using the server, how much it is being used, and how well it is servicing the users. Web servers are used to distribute information; if no one wants or uses the information, you need to know it. The LogFormat and CustomLog directives do not configure the error log, but rather how server activity is logged.

11.3.8.1. Defining the log file format

The LogFormat directives define the format of log file entries. A LogFormat directive contains two things: the layout of a file entry and a label used in the httpd.conf file to identify the log entry. The layout of the entry is placed directly after the LogFormat keyword and is enclosed in quotes. The layout is defined using literals and variables.

Examining a sample LogFormat directive shows how the variables are used. The basic Apache log file conforms to the Common Log Format (CLF). CLF is a standard used by all web server vendors, and using this format means that the logs generated by Apache servers can be processed by any log analysis tool that conforms to the standard. The format of a standard CLF entry is clearly defined by the second LogFormat directive in the Solaris httpd.conf file:

    LogFormat "%h %l %u %t \"%r\" %>s %b" common

This LogFormat directive specifies exactly the information required for a CLF log entry. It does this using seven different LogFormat variables:
Apache log entries are not limited to the CLF format. The LogFormat directive lets you define what information is logged. A wide variety of information can be logged. The Solaris configuration contains three additional LogFormat directives that demonstrate some optional log formats. The three directives are:

    LogFormat "%{User-agent}i" agent
    LogFormat "%{Referer}i -> %U" referer
    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

All of these directives log the contents of HTTP headers. For example, the first directive logs the value received from the client in the User-agent header. User-agent is the user program that generates the document request; generally this is the name of a browser. The format that logs the header is:

    %{User-agent}i

This format works for any header: simply replace User-agent with the name of the header. The i indicates that this is an input header; output headers are indicated by an o. Apache can log the contents of any header records received or sent.

The second LogFormat directive logs the contents of the Referer header received from the client (%{Referer}i), the literal characters dash and greater-than sign (->), and the requested URL (%U). Referer is the name of the remote site that referred the client to your web site; %U is the document to which the site referred the client.

The last LogFormat directive starts with the CLF (%h %l %u %t \"%r\" %>s %b) and adds to that the values from the Referer header and the User-agent header. This format is labeled combined because it combines the CLF with other information; the other two formats are also aptly labeled as agent and referer. Yet none of these formats is actually used in the Solaris configuration. Simply creating a LogFormat is not enough to generate a log file; you must also add a matching CustomLog directive to map the format to a file, as explained later.

In the LogFormat directive, the layout of the log entry is enclosed in quotes.
The label that occurs after the closing quote is not part of the format. In the LogFormat directive that defines the CLF format, the label common is an arbitrary string used to tie the LogFormat directive to a CustomLog directive. In the Solaris configuration, that particular LogFormat is tied to the file /var/apache/logs/access_log defined by this line:

    CustomLog /var/apache/logs/access_log common

The label common binds the two directives together. Thus the CLF entries defined by this LogFormat directive are written to the file defined by this CustomLog directive.

In the Solaris configuration, the other CustomLog directives that create the agent, referer, and combined log files are commented out:

    #CustomLog /var/apache/logs/referer_log referer
    #CustomLog /var/apache/logs/agent_log agent
    #CustomLog /var/apache/logs/access_log combined

The referer_log stores the URL of the source page that linked to your web server. This helps you determine what sites are pointing to your web pages. Entries in the referer_log are defined by this line:

    LogFormat "%{Referer}i -> %U" referer

To create the log, uncomment this line:

    CustomLog /var/apache/logs/referer_log referer

The agent_log identifies the browsers that are used to access your site, and is defined by this LogFormat statement:

    LogFormat "%{User-agent}i" agent

To create the log, uncomment this line:

    CustomLog /var/apache/logs/agent_log agent

Lastly, the format for the expanded CLF log is defined by this line:

    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

To create a combined log, uncomment this line:

    CustomLog /var/apache/logs/access_log combined

and comment this line:

    #CustomLog /var/apache/logs/access_log common

These changes cause the combined log format to be used to build a log file named /var/apache/logs/access_log. This is the same log file that is used by the default common log format. To avoid duplicate log entries, turn off common logging when you turn on combined logging.
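Whichever of the two formats is written to access_log, each entry follows the layout string field by field. As a rough illustration, the Python sketch below splits one entry into its fields. The pattern, the function name parse_entry, and the sample log lines are all invented for this example, and the sketch assumes that quoted fields contain no embedded quote characters.

```python
import re

# Fields in layout order: %h %l %u %t "%r" %>s %b, optionally followed by
# the quoted Referer and User-Agent values that the combined format adds.
ENTRY_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?$'
)


def parse_entry(line):
    """Return a dict of fields for a common or combined entry, or None if it does not match."""
    match = ENTRY_PATTERN.match(line.strip())
    return match.groupdict() if match else None


# Invented sample entries, one in each format.
common_line = '172.16.12.2 - - [10/Oct/2000:13:55:36 -0400] "GET /index.html HTTP/1.0" 200 2326'
combined_line = common_line + ' "http://www.example.org/links.html" "Mozilla/4.0"'

print(parse_entry(common_line)["status"])
print(parse_entry(combined_line)["referer"])
```

For an entry in the common format, the referer and agent fields come back as None, which gives a simple way to tell the two formats apart.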
In effect, these changes switch the access_log file from using the common log format to logging the combined log entry. Each LogFormat statement and its associated CustomLog statement end with the same label. The label is an arbitrary name used to bind the format and the file together.

11.3.8.2. Using conditional logging

Apache also supports conditional logging to identify fields that are logged only when certain status codes are returned by the server. The status codes are listed in Table 11-2.

Table 11-2. Apache server status codes
To make a field conditional, put one or more status codes on the field in the LogFormat entry. If multiple status codes are used, separate them with commas. Assume that you want to log the browser name only if the browser requests a service that is not implemented in your server. Combine the Not Implemented (501) status code with the User-agent header in this manner:

    %501{User-agent}i

If this value appears in the LogFormat statement, the name of the browser is logged only when the status code is 501.

Place an exclamation mark in front of the status codes to log a field only when the status code is not one of the specified values. For example, to log the address of the site that referred the user to your web page only if the status code is not one of the good status codes, add the following to a LogFormat:

    %!200,302,304{Referer}i

This particular conditional log entry is very useful, as it tells you when a remote page has a stale link pointing to your web site. Combine these features with the common log format to create a more useful log entry. Here we modify the Solaris combined format to include conditional logging:

    LogFormat "%h %l %u %t \"%r\" %>s %b \"%!200,302,304{Referer}i\" \"%{User-Agent}i\"" combined

This entry provides all the data of the CLF and thus can be analyzed by standard tools. But it also provides the browser name and, when the user requests a stale link, the address of the remote site that references that link.

Although the Solaris configuration file contains 160 active lines, there are some interesting Apache features that it does not exploit. Before we move on to the important ongoing tasks of server security and server monitoring, the following sections provide a quick overview of three features not included in the default Solaris configuration: proxies and caching, multi-homed server configuration, and virtual hosts.

11.3.9. Proxy Servers and Caching

Servers that act as intermediaries between clients and web servers are called proxy servers. When firewalls are used, direct web access is often blocked. Instead, users connect to the proxy server through the local network, and the proxy server is trusted to connect to the remote web server. Proxy servers can maintain cached copies of remote servers' web pages. Caching improves performance by reducing the amount of traffic sent over the wide area network and by reducing the contention for popular web sites. The options that control caching behavior are:
All of these directives are commented out in the Solaris configuration. By default, the Solaris Apache server is not configured to be a proxy server. If you need to create a proxy server, refer to a book dedicated to Apache configuration such as Linux Apache Web Server Administration.

11.3.10. Multi-Homed Server Options

Web servers with more than one IP address are said to be multi-homed. A multi-homed web server needs to know which address it should listen to for incoming server requests. There are two configuration options to handle this:
The BindAddress and Listen directives are commented out of the Solaris configuration.

11.3.11. Defining Virtual Hosts

Some of the options commented out of the sample httpd.conf file are used if your server hosts multiple web sites. For example, to host web sites for fish.edu and mammals.com on the crab.wrotethebook.com server, add these lines to the httpd.conf file:

    <VirtualHost "www.fish.edu">
    DocumentRoot /var/apache/fish
    ServerName www.fish.edu
    </VirtualHost>

    <VirtualHost "www.mammals.com">
    DocumentRoot /var/apache/mammals
    ServerName www.mammals.com
    </VirtualHost>

Each VirtualHost container defines a hostname alias that your server responds to. For this to be valid, DNS must define the alias with a CNAME record. Our example requires CNAME records that assign crab.wrotethebook.com the aliases www.fish.edu and www.mammals.com. Because both aliases resolve to the same IP address, these are name-based virtual hosts, and Apache 1.3 also requires a NameVirtualHost directive naming that address before it will choose between them. When crab receives a server request addressed to one of these aliases, it uses the configuration parameters defined here to override its normal settings. Therefore, when it gets a request for www.fish.edu, it uses www.fish.edu as its ServerName value instead of its own server name, and /var/apache/fish as the DocumentRoot.

Copyright © 2002 O'Reilly & Associates. All rights reserved.