11.3.8. Logging Configuration Directives
Log files provide a great deal of information about the web server.
The following seven lines define the Apache logging configuration in
the default Solaris 8 httpd.conf file:
ErrorLog /var/apache/logs/error_log
LogLevel warn
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent
CustomLog /var/apache/logs/access_log common
ErrorLog defines the path of the error log file. Use the error log to
track and correct failures. You should review the log at least once a
day to check for problems. To keep a close eye on the file while
you're logged in, use the tail command with the -f option:
$ tail -n 1 -f /var/apache/logs/error_log
The tail command prints the tail end of a file; in the example, the
file is /var/apache/logs/error_log, the path set by the ErrorLog
directive. The -n option is the lines option. It tells tail how many
lines from the end of the file to print. In this case, -n 1 directs
tail to print only the last line in the file. The -f option keeps the
tail process running so that you will see each record as it is written
to the file. This allows you to monitor the file in real time.
The LogLevel directive defines the type of events written to the
error log. The Solaris configuration sets LogLevel to
warn, which specifies that warnings and other more
critical errors are to be written to the log. This is a safe setting
for an error log because it logs a wide variety of operational
errors. LogLevel has eight possible settings:
debug, info,
notice, warn,
error, crit,
alert, and emerg. The log
levels are cumulative. For example, warn provides
warnings, errors, critical messages, alerts, and emergency messages;
debug provides all types of logging, which causes
the file to grow at a very rapid rate; emerg keeps
the file small but notifies you only of disasters.
warn is a good compromise between not enough
detail and too much detail.
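The cumulative behavior of the log levels can be sketched in a few lines of Python. This is only an illustration of the ordering described above, not Apache's own code; the function name levels_logged is hypothetical.

```python
# Apache LogLevel settings, ordered from most verbose to most severe.
LEVELS = ["debug", "info", "notice", "warn", "error", "crit", "alert", "emerg"]

def levels_logged(log_level):
    """Return the message severities written when LogLevel is log_level."""
    # A setting logs its own severity plus every more severe one.
    return LEVELS[LEVELS.index(log_level):]

print(levels_logged("warn"))
```

Calling levels_logged("debug") returns all eight levels, while levels_logged("emerg") returns only emerg, which matches the trade-off between file size and detail.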
Just as important as reporting errors, the logs provide information
about who is using the server, how much it is being used, and how
well it is servicing the users. Web servers are used to distribute
information; if no one wants or uses the information, you need to
know it. The LogFormat and CustomLog directives do not configure the
error log, but rather how server activity is
logged.
11.3.8.1. Defining the log file format
The LogFormat directives define the format of log file entries. A
LogFormat directive contains two things: the layout of a file entry
and a label used in the httpd.conf file to identify that layout.
The layout of the entry is placed directly after the LogFormat
keyword and is enclosed in quotes. The layout is defined using
literals and variables.
Examining a sample LogFormat directive shows how the variables are
used. The basic Apache log file conforms to the Common Log Format
(CLF). CLF is a de facto standard supported by most web servers, and using
this format means that the logs generated by Apache servers can be
processed by any log analysis tool that conforms to the standard. The
format of a standard CLF entry is clearly defined by the second
LogFormat directive in the Solaris httpd.conf
file:
LogFormat "%h %l %u %t \"%r\" %>s %b" common
This LogFormat directive specifies exactly the information required
for a CLF log entry. It does this using seven different
LogFormat variables:
- %h: Logs the IP address of the client. If HostnameLookups is set to
  on, this is the client's fully qualified hostname. On the sample
  Solaris system, this would be the client's IP address because
  HostnameLookups is turned off to enhance server performance.
- %l: Logs the username used to log in to the client, if available.
  The name is retrieved using the ident protocol; however, most
  clients do not run identd and thus do not provide this information.
  Therefore, this field usually contains a hyphen to indicate a
  missing value. More generally, whenever the server does not have a
  value for a field, the log contains a hyphen in that field.
- %u: Logs the username used to access a password-protected web page.
  This should match a name you defined in the AuthUserFile file or the
  AuthDBMUserFile database you created on the server. (AuthUserFile
  and AuthDBMUserFile are covered in Section 11.4, "Web Server
  Security" of this chapter.) Most documents are not password
  protected, and therefore this field contains a hyphen in most log
  entries.
- %t: Logs the date and time the request was received.
- %r: Logs the first line of the client's request. This often contains
  the URL of the requested document. The \" characters in the
  LogFormat directive indicate that quotes should be inserted in the
  output. In the log file, the client's request will be enclosed in
  quotes.
- %>s: Logs the final status of the request. This is the three-digit
  response code that the server returned to the client. The > selects
  the status of the last request when the original request was
  internally redirected.
- %b: Logs the number of bytes contained in the document sent to the
  client, not counting the response headers.
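To see how the seven fields line up in an actual entry, the following Python sketch splits one CLF line into named fields. It is a minimal illustration, not a full log analyzer: the sample entry is invented, the names CLF_PATTERN and parse_clf are hypothetical, and the pattern assumes the request line contains no embedded quotes.

```python
import re

# One named group per CLF field, in the order used by the "common" format:
# %h %l %u %t "%r" %>s %b
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def parse_clf(line):
    """Split one CLF entry into a dict, or return None if it does not match."""
    match = CLF_PATTERN.match(line)
    return match.groupdict() if match else None

# Hypothetical log entry, invented for illustration.
sample = ('172.16.5.1 - - [10/Oct/2001:13:55:36 -0500] '
          '"GET /index.html HTTP/1.0" 200 2326')
entry = parse_clf(sample)
print(entry["host"], entry["status"], entry["bytes"])
```

The hyphens in the second and third fields of the sample are the missing %l and %u values described above.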
Apache log entries are not limited to the CLF format. The LogFormat
directive lets you define what information is logged. A wide variety
of information can be logged.
The Solaris configuration contains three additional LogFormat
directives that demonstrate some optional log formats. The three
directives are:
LogFormat "%{User-agent}i" agent
LogFormat "%{Referer}i -> %U" referer
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
All of these directives log the contents of HTTP headers. For
example, the first directive logs the value received from the client
in the User-agent header.
User-agent is the user program that generates the
document request; generally this is the name of a browser. The format
that logs the header is:
%{User-agent}i
This format works for any header: simply replace
User-agent with the name of the header. The
i indicates that this is an input header; output
headers are indicated by an o. Apache can log the
contents of any header records received or sent.
The second LogFormat directive logs the contents of the
Referer header received from the client
(%{Referer}i), the literal characters dash and
greater-than sign (->), and the requested URL path
(%U). Referer is the URL of the remote page that
referred the client to your web site; %U is the document
on your server to which that page referred the client.
The last LogFormat directive starts with the CLF (%h %l %u
%t \"%r\" %>s %b) and adds to that the values from
the Referer header and the
User-agent header. This format is labeled
combined because it combines the CLF with other
information; the other two formats are also aptly labeled as
agent and referer. Yet none of
these formats is actually used in the Solaris configuration. Simply
creating a LogFormat is not enough to generate a log file; you must
also add a matching CustomLog directive to map the format to a file,
as explained later.
In the LogFormat directive, the layout of the log entry is enclosed
in quotes. The label that occurs after the closing quote is not part
of the format. In the LogFormat directive that defines the CLF
format, the label common is an arbitrary string
used to tie the LogFormat directive to a CustomLog directive. In the
Solaris configuration, that particular LogFormat is tied to the file
/var/apache/logs/access_log defined by this
line:
CustomLog /var/apache/logs/access_log common
The label common binds the two directives
together. Thus the CLF entries defined by this LogFormat directive
are written to the file defined by this CustomLog directive.
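The way a label ties a format to a file can be sketched in Python. This is a simplified illustration, not how Apache parses its configuration: it handles only the two-argument forms of LogFormat and CustomLog shown in this section, and the function name map_formats_to_files is hypothetical.

```python
import shlex

def map_formats_to_files(conf_text):
    """Return {log file: layout} for each active CustomLog with a known label."""
    formats = {}       # label -> layout
    custom_logs = []   # (file, label) pairs
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue   # skip blank lines and commented-out directives
        words = shlex.split(line)  # honors the quotes and \" escapes
        if words[0] == "LogFormat" and len(words) == 3:
            formats[words[2]] = words[1]
        elif words[0] == "CustomLog" and len(words) == 3:
            custom_logs.append((words[1], words[2]))
    # Keep only the CustomLog lines whose label matches a LogFormat label.
    return {path: formats[label] for path, label in custom_logs
            if label in formats}

SAMPLE_CONF = r'''
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
CustomLog /var/apache/logs/access_log common
#CustomLog /var/apache/logs/referer_log referer
'''
result = map_formats_to_files(SAMPLE_CONF)
print(result)
```

Only the access_log appears in the result, because the referer format has no active CustomLog directive to map it to a file.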
In the Solaris configuration, the other CustomLog directives that
create the agent, referer, and
combined log files are commented out:
#CustomLog /var/apache/logs/referer_log referer
#CustomLog /var/apache/logs/agent_log agent
#CustomLog /var/apache/logs/access_log combined
The referer_log stores the URL of the source
page that linked to your web server. This helps you determine what
sites are pointing to your web pages. Entries in the
referer_log are defined by this line:
LogFormat "%{Referer}i -> %U" referer
To create the log, uncomment this line:
CustomLog /var/apache/logs/referer_log referer
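Once the referer_log exists, a short script can summarize which sites point to your pages. The Python sketch below assumes entries in the "%{Referer}i -> %U" layout shown above; the sample entries are invented, and the function name count_referring_sites is hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit

def count_referring_sites(lines):
    """Count referer_log entries per referring host."""
    counts = Counter()
    for line in lines:
        referer, sep, _target = line.strip().partition(" -> ")
        if not sep or referer == "-":
            continue  # malformed entry, or the client sent no Referer header
        host = urlsplit(referer).netloc
        if host:
            counts[host] += 1
    return counts

# Hypothetical referer_log entries, invented for illustration.
sample = [
    "http://www.example.com/links.html -> /index.html",
    "http://www.example.com/about.html -> /docs/intro.html",
    "http://search.example.org/results -> /index.html",
    "- -> /index.html",
]
counts = count_referring_sites(sample)
print(counts.most_common())
```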
The agent_log identifies the browsers that are
used to access your site, and is defined by this LogFormat statement:
LogFormat "%{User-agent}i" agent
To create the log, uncomment this line:
CustomLog /var/apache/logs/agent_log agent
Lastly, the format for the expanded CLF log is defined by this line:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
To create a combined log, uncomment this line:
CustomLog /var/apache/logs/access_log combined
and comment out this line:
#CustomLog /var/apache/logs/access_log common
These changes cause the combined log format to be
used to build a log file named
/var/apache/logs/access_log. This is the same
log file that is used by the default common log
format. To avoid duplicate log entries, turn off
common logging when you turn on
combined logging. In effect, these changes switch
the access_log file from using the
common log format to logging the
combined log entry.
Each LogFormat statement and its associated CustomLog statement end
with the same label. The label is an arbitrary name used to bind the
format and the file together.
11.3.8.2. Using conditional logging
Apache also
supports conditional logging to
identify fields that are logged only when certain status codes are
returned by the server. The status codes are listed in Table 11-2.
Table 11-2. Apache server status codes
Status code | Meaning
200: OK | A valid request
302: Found | The requested document temporarily resides at a different URL
304: Not Modified | The requested document has not been modified
400: Bad Request | An invalid request
401: Unauthorized | The request requires user authentication
403: Forbidden | The requested access is not allowed
404: Not Found | The requested document does not exist
500: Internal Server Error | An unspecified server error occurred
501: Not Implemented | The requested server feature is not available
502: Bad Gateway | The server, acting as a gateway or proxy, received an invalid response from the upstream server
503: Out of Resources (Service Unavailable) | The server has insufficient resources to honor the request
To make a field conditional, put one or more status codes on the
field in the LogFormat entry. If multiple status codes are used,
separate them with commas. Assume that you want to log the browser
name only if the browser requests a service that is not implemented
in your server. Combine the Not Implemented (501) status code with
the User-agent header in this manner:
%501{User-agent}i
If this value appears in the LogFormat statement, the name of the
browser is logged only when the status code is 501.
Place an exclamation mark in front of the status codes to specify
that you want to log a field only when the status code is not one of
the specified values. For example, to log the address of the
site that referred the user to your web page only if the status code
is not one of the good status codes, add the following to a
LogFormat:
%!200,302,304{Referer}i
This particular conditional log entry is very useful, as it tells you
when a remote page has a stale link pointing to your web site.
Combine these features with the common log format
to create a more useful log entry. Here we modify the Solaris
combined format to include conditional logging:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%!200,302,304{Referer}i\" \"%{User-Agent}i\"" combined
This entry provides all the data of the CLF and thus can be analyzed
by standard tools. But it also provides the browser name and, when
the user requests a stale link, it provides the address of the remote
site that references that link.
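The decision Apache makes for a conditional field can be modeled in Python. This is a sketch of the rule described above, not Apache's implementation; the function name conditional_field is hypothetical. When the condition is not met, Apache writes a hyphen in place of the value, and the sketch does the same.

```python
def conditional_field(condition, status, value):
    """Emulate a conditional field such as %501{...}i or %!200,302,304{...}i.

    condition is the text between % and {, for example "501" or
    "!200,302,304". Returns the value to log, or "-" when it is suppressed.
    """
    negate = condition.startswith("!")
    codes = {int(code) for code in condition.lstrip("!").split(",")}
    matched = status in codes
    logged = (not matched) if negate else matched
    return value if logged else "-"

# A stale link: status 404 is not in the "good" list, so the referer is logged.
print(conditional_field("!200,302,304", 404, "http://www.example.com/old.html"))
# A normal request: status 200 is in the list, so the field is a hyphen.
print(conditional_field("!200,302,304", 200, "http://www.example.com/old.html"))
```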
Despite the fact that the Solaris configuration file contains over
160 active lines, there are some interesting Apache features that the
Solaris configuration does not exploit. Before
we move on to the important ongoing tasks of server security and
server monitoring, the following sections provide a quick overview of
three features not included in the default Solaris configuration:
proxies and caching, multi-homed server configuration, and virtual
hosts.