home | O'Reilly's CD bookshelfs | FreeBSD | Linux | Cisco | Cisco Exam  


sendmail

sendmailSearch this book
Previous: 25.8 Pitfalls Chapter 26 Next: 26.2 Statistics
 

26. Logging and Statistics

The sendmail program can keep the system administrator up-to-date about many aspects of mail delivery and forwarding. It does this by logging its activities using the syslog (3) facility. It can also gather (or aid in gathering) statistics about what it is doing. Information about things like total message volume and site connectivity, for example, can help the administrator to make sendmail more efficient. Information about the SMTP dialog that was used to send the message can help the administrator to solve delivery problems.

In this chapter we cover these three important aspects of sendmail . First, we explain the use of the syslog (3) facility and illustrate several ways to tune its logging. Second, we show the built-in means that sendmail has for recording statistics and other information and how scripts can be used to extend that information. Third, we explore ways to tune the performance of sendmail .

26.1 Logging with syslog

Logging is the process of issuing one-line warnings that will be either displayed to a human, archived to a file, or both. The mechanism that sendmail uses to produce these warnings is called syslog (3). The sendmail program is concerned only with issuing its warnings. Once they are issued, the syslog facility takes over and disposes of them in a manner described in the file /etc/syslog.conf . Statements in this file determine whether a warning is written to a device (such as /dev/console ), appended to a file, forwarded to another host, or displayed on a logged-in user's screen.

In the following discussion of syslog and syslog.conf , we will describe the BSD 4.4 version. Some versions of UNIX, such as Ultrix, use the 4.2 version of syslog , but because syslog is public domain, we recommend upgrading and will not cover that old version here.

26.1.1 syslog(3)

The syslog (3) facility uses two items of information in determining how to handle messages: facility and level . The facility is the category of program issuing a message. The syslog facility can handle many categories, but only one, mail , is used by sendmail . The level is the degree of severity of the warnings. The sendmail program issues messages with syslog (3) at various levels depending on how serious the warning is.

When sendmail first starts to run, it opens its connection to the syslog facility with the following C language line:

openlog("sendmail", LOG_PID, LOG_MAIL);

This tells syslog three things:

  • All messages should be printed using sendmail as the name of the program doing the logging. This means that no matter what name is used to run sendmail (such as newaliases or smtpd ), the name that is logged will always be sendmail .

  • The LOG_PID tells syslog that the PID (process identification number) should be included when each message is written. This is necessary because sendmail forks often, and each parent and child will have a different PID. Because queue file identifiers are constructed from PIDs, this record helps to determine which invocation of sendmail created a particular queued file. The PID also allows messages from the daemon form of sendmail to be differentiated from others.

  • The facility for sendmail (and all mail-handling programs) is LOG_MAIL. We'll show why this is important when we discuss the syslog.conf file.

Just before sendmail issues a warning, it looks at the logging level defined by its LogLevel ( L ) option (see Section 34.8.33, LogLevel (L) ). If the severity of the warning is greater than the logging level, nothing is output. If the severity of the warning that it intends to issue is less than or equal to the logging level (lower is more serious), it issues that warning with a C language call like this:

syslog(

pri, msg

);

Here, pri is the syslog logging priority, and msg is the text of the warning message. Note that the LogLevel ( L ) option (see Section 34.8.33 ) level is different than the syslog priority. The former is used internally by sendmail to decide whether it should log a message. The latter is used by syslog to determine how it will dispose of the message (if it gets one).

The LogLevel ( L ) option sets a threshold at and below which sendmail will issue warnings. When the LogLevel ( L ) option has a zero value, nothing is ever issued. When the LogLevel ( L ) option has a low value, only critical warnings are issued. At higher values, less critical messages are also issued.

The syntax of the LogLevel ( L ) option and the kinds of information issued for each level are explained in Section 34.8.33 . For each level, all the information produced at lower levels is also issued. That is, setting the LogLevel ( L ) option to 9 causes messages for levels 1 through 8 also to be issued.

The relationship between the LogLevel ( L ) option logging levels and syslog priorities is shown in Table 26.1 .

Table 26.1: L Levels Versus syslog Priorities
Level Priority
1 LOG_CRIT and LOG_ALERT
2-8 LOG_NOTICE
9-10 LOG_INFO
11+ LOG_DEBUG

26.1.2 Tuning syslog.conf

Although all messages are emitted by sendmail using a single facility, that of syslog , they need not all arrive at the same place. The disposition of messages is tuned by the syslog.conf file.

The file syslog.conf (usually located in the /etc directory) contains routing commands for use by syslog . That file can be complex, because it is designed to handle messages from many programs other than sendmail , even messages from the kernel itself. Under SunOS the syslog.conf file is also complex because it is preprocessed by m4 (1) when it is read by syslog .

The file syslog.conf is composed of lines of text that each have the form:

facility.level           target

The facility is the type of program that may be producing a message. The facility called mail is the one that sendmail uses. For the complete list, see the online manual for syslog.conf (5).

The level indicates the severity at or above which messages should be handled. These levels correspond to the LogLevel ( L ) option levels shown in Table 26.1 of Section 26.1.1, "syslog(3)" . The complete list of syslog.conf levels used by sendmail is shown in Table 26.2 .

Table 26.2: syslog.conf Levels Used by sendmail
Level Meaning of Severity (Highest to Lowest)
alert Conditions requiring immediate correction
crit Critical conditions for which action may be deferred
err Other errors
warning Warning messages
notice Nonerrors that may require special handling
info Statistical and informational messages
debug Messages used only in debugging a program

The target is one of the four possibilities shown in Table 26.3 . It is the target and the preceding level that must be tuned for use by sendmail .

Table 26.3: syslog.conf Targets
Target Description
@ host Forward message to named host
/ file Append message to named file
user , user ,... Write to users' screens, if logged in
* Write to all logged-in users' screens

For example, the following syslog.conf line causes messages from "mail" (the facility ) that are at or above severity "info" (the level ) to be appended to the file /var/log/syslog (the target ):

  
facility
                  
target

  
-v
                      
-v

mail.info                /var/log/syslog
      
-^

      
level

A typical (albeit much simplified) /etc/syslog.conf file might look like this:

*.err;kern.debug;user.none          /dev/console
*.err;kern.debug;user.none          /var/adm/messages
auth.notice                         @authhost
mail.info                           /var/log/syslog
*.emerg;user.none                   *

Notice that there may be multiple facility.level pairs on the left, each separated from the others by semicolons. The first two lines handle messages for all facilities at level err , all kernel messages ( kern ) at level debug and above, and none of the levels ( none ) for the facility user . The first line sends those messages to the file /dev/console , the computer's screen. The second appends its messages to the file /var/adm/messages .

The third line sends authorization messages (such as repeated login failures) to the host named authhost .

The fourth line appends all messages printed by sendmail at level info and above (the LogLevel ( L ) option is level 10 and below) to the file /var/log/syslog .

The last line is an emergency broadcast facility. A message to any facility (the leftmost * ) at the highest level ( emerg ), except for the facility user (the .none ), will be written to the screen of all currently logged-in users (the target * ).

Finally, note that facilities may be listed together by using a comma:

mail,daemon.info

This causes the level info to be the level for both the facilities mail and daemon . Only the facility may be listed this way. The level may not, and (unfortunately) the target may not.

26.1.3 syslog's Output

When the LogLevel ( L ) option level is 9 or above (see Section 34.8.33 ), sendmail logs one line of information for each envelope sender and one line of information for each recipient delivery or deferral. As sendmail has evolved, these lines of logging information have grown more complex. Here, we discuss the lines produced by sendmail 8.8.4.

Each line of information logged looks something like this:



date host 

sendmail[

pid

]: 

qid

: 

what=value

, ...

Each line of output that syslog produces begins with five pieces of information. The date is the month, day, and time that the line of information was logged (note that the year is absent). The host is the name of the host that produced this information (note that this may differ from the name of the host on which the log files are kept). The sendmail is literal. Because of the LOG_PID argument that is given to openlog (3) by sendmail (see Section 26.1.1 ), the process ID of the invocation of sendmail that produced this information is included in square brackets. Finally, each line includes the qid queue identifier (see Section 23.2.1, "The Queue Identifier" ) that uniquely identifies each message on a given host.

This initial information is followed by a comma-separated list of what=value equates. Which equate appears in which line depends on whether the line documents the sender or the recipient and whether deliver succeed, failed, or was deferred. In Table 26.4 we list the possibilities in alphabetical order. Then, in the sections that follow, we describe the role that each plays.

Table 26.4: what= in syslog Output Lines
what= Description
class= Section 26.1.3.1, "class= the queue class" The queue class
delay= Section 26.1.3.2, "delay= total time to deliver" Total time to deliver
from= Section 26.1.3.3, "from= show envelope sender" Show envelope sender
mailer= Section 26.1.3.4, "mailer= the delivery agent used" The delivery agent used
msgid= Section 26.1.3.5, "msgid= the Message-ID: identifier" The Message-ID: identifier
nrcpts= Section 26.1.3.6, "nrcpts= the number of recipients" The number of recipients
pri= Section 26.1.3.7, "pri= the initial priority" The initial priority
proto= Section 26.1.3.8, "proto= the protocol used in transmission" The protocol used in transmission
relay= Section 26.1.3.9, "relay= the host that sent or accepted the message" The host that sent or accepted the message
size= Section 26.1.3.10, "size= the size of the message" The size of the message
stat= Section 26.1.3.11, "stat= status of delivery" Status of delivery
to= Section 26.1.3.12, "to= show final recipient" The final recipient
xdelay= Section 26.1.3.13, "xdelay= transaction" Transaction delay for this address only

26.1.3.1 class= the queue class

If the mail message contained a Precedence: header (see Section 35.8, "Precedence" ), the class= reflects sendmail 's interpretation of the keyword that follows that header. For example, given the configuration command

PPlist=-30

The following header will yield a class= value of -30 :

Precedence: list

If no Precedence: header is present in the message, the value shown for class= is zero. The class= is shown only for sender records.

26.1.3.2 delay= total time to deliver

A mail message can be delivered immediately, without ever having been queued, or it can be queued and retried over and over again until it either times out or succeeds. The delay= shows the total amount of time the message took to be delivered. This period of time starts when sendmail first receives the message and ends when the message is finally delivered or bounced. This interval is displayed with the delay= syslog line equate:

delay=

days+

HH:MM:SS

The time expression shows the time it took in hours ( HH ), minutes ( MM ), and seconds ( SS ) to handle delivery or rejection of the message. If the delay exceeds 24 hours, the time expression is prefixed with the number of days and a plus character. For example, the following message took 5 seconds to deliver or bounce:

delay=00:00:05

The following message took 4 days, 2 hours, 16 minutes, and 2 seconds to deliver or bounce:

delay=4+02:16:02

Note that the delay= equate is shown only for recipient records.

26.1.3.3 from= show envelope sender

The envelope sender may or may not appear in any of the sender headers. The from= syslog line equate shows the envelope sender:

from=

addr

The addr is the address of the envelope sender with any RFC822 commentary (see Section 35.3.4, "Comments in the Header Field" ) removed. This will usually be the address of an actual person, but it can also be postmaster or the value of the $n macro in the case of a bounced message. The from= equate is shown only for sender records.

26.1.3.4 mailer= the delivery agent used

The sendmail program does not perform the actual delivery of mail. Instead, it calls other programs (called mail delivery agents) to perform that service. The mailer= equate shows the symbolic name (see Section 30.1, "Configuration File Syntax" ) of the delivery agent that was used to perform delivery to the recipient"

mailer=

agent

A list of symbolic names assigned to delivery agents can be viewed with the -d0.15 debugging switch (see Section 37.5.4, -d0.15 ). The mailer= equate is shown only for recipient records.

26.1.3.5 msgid= the Message-ID: identifier

RFC822 requires that each email message have a unique worldwide identifier associated with it. That identifier is listed with the Message-ID: header (see Section 35.10.19, Message-ID: ) and often looks something like this:

Message-Id: <199605051122.VAA07923@here.us.edu>

The information inside, and including, the angle brackets is the message identifier. That identifier is what is listed with the msgid= equate:

msgid=<199605051122.VAA07923@here.us.edu>

If a mail message arrives without a Message-ID: header, and if your configuration file correctly includes a definition for that header, a new identifier will be created and listed with msgid= . If a Message-ID: header is absent, and if your configuration file incorrectly excludes a definition for that header, the msgid= equate will be excluded from the syslog report.

The msgid= equate is shown only for sender records.

26.1.3.6 nrcpts= the number of recipients

The nrcpts= equate shows the number of recipients after all aliasing has been performed. If the original message was addressed to root , if root was aliased like this:

root: bob, hans

and if bob 's ~/.forward file contained this:

\bob
|"/usr/ucb/vacation bob"

then the nrcpts= equate would show three recipients.

Note that nrcpts= is included only with the sender record and that record is emitted when the message is first processed. Any later changes in aliasing that may happen while the message is queued are not reported. Aliasing on remote machines (as would be the case with exploder mailing lists) is also not reported for obvious reasons.

26.1.3.7 pri= the initial priority

The pri= equate shows the initial priority assigned to the message (see Section 35.8 ). This value is calculated once when the message is first processed and changed each time the queued file is tried. This pri= equate shows the initial value.

The pri= syslog equate is displayed only for the sender.

26.1.3.8 proto= the protocol used in transmission

The $r macro (see Section 31.10.31, $r ) holds as its value the protocol that was used when a mail message was first received. That value is either SMTP, ESMTP, or internal or is a protocol assigned with the -p command-line switch (see Section 36.7.32, -p ). If $r lacks a value, this proto= equate is omitted. If $r has a value, the first 20 characters of that value are printed following the proto= in the syslog line:

proto=ESMTP

26.1.3.9 relay= the host that sent or accepted the message

When running as a daemon and listening for incoming connections, sendmail attempts to look up the true identity of connecting users and hosts. When it can find that information, it saves it in the $_ macro (see Section 31.10.1, $- ).

When transporting mail to other hosts, sendmail looks up the MX records for those hosts and connects to the MX records when they are available. If MX records are not available, sendmail connects to the hostname that is specified in the recipient address.

If the $_ information is available, that information appears following the relay= equate:

relay=root@other.site.edu [123.45.67.89]

If the sender is a local user, the login name and localhost will appear in the relay= equate:

relay=bob@localhost

Otherwise, the canonical name of the host that actually accepted delivery will appear here:

relay=mx.host.domain [123.45.67.89]

In summary, the relay= equate shows who really accepted or sent the message. The relay= syslog equate is included with both sender and recipient records.

26.1.3.10 size= the size of the message

The size of an incoming SMTP message is the number of bytes sent during the DATA phase (see Section 34.8.70.5, "Timeout.datainit" ), including end-of-line characters. The size of a message received via sendmail 's standard input is a count of the bytes received, including the newline characters. In both instances the size is displayed with the size= equate:

size=23

Note that this size is reported before sendmail adds or deletes any headers. Therefore for mail being relayed through a site, the size will usually be small coming in and somewhat larger going out.

The size= syslog equate is produced only for sender records.

26.1.3.11 stat= status of delivery

Whenever the delivery status of a mail message changes, sendmail logs the event and includes the status= to specify why the change happened. For example, a mail message may initially be queued because the recipient's host was down:

stat=queued

Later it might change again because it succeeded in being delivered:

stat=Sent (HAA03001 Message accepted for delivery)

In transmitting a mail message via SMTP the stat= will include the actual text that the other host printed when it accepted the mail message, as shown above. But in delivering locally, the stat= is more succinct:

stat=Sent

In the case of bounced mail the stat= will show the reason for failure:

stat=User unknown

The stat= syslog equate is included only in recipient records.

26.1.3.12 to= show final recipient

As each recipient is delivered to, deferred, or bounced, sendmail logs a line of information that includes the recipient address:

to=bob@here.us.edu

Each such address is that of a final recipient (from the point of the view of the local host) after all aliasing, list expansions, and processing of ~/.forward files.

26.1.3.13 xdelay= transaction

The xdelay= equate shows the total amount of time the message took to be transmitted during final delivery. This differs from delay= in that delay= is computed from when the message was originally received (and can be days), whereas this xdelay= shows how fast the receiving host was (and is usually seconds).

In the case of SMTP mail the xdelay= computation starts when sendmail successfully connects to the remote host. In the case of locally delivered mail the computation starts when sendmail executes the delivery agent. The computation ends when the dot is accepted at the close of the DATA SMTP phase or when the local delivery agent exits.

The form of the xdelay= looks like this:

xdelay=HH:MM:SS

The time expression shows the hours ( HH ), minutes ( MM ), and seconds ( SS ) it took to perform delivery via the final delivery agent. In the case of networked mail that interval can be long:

xdelay=00:41:05

But in the case of locally delivered mail this interval can seem instantaneous:

xdelay=00:00:00

Note that the xdelay= equate is shown only for recipient records.

26.1.4 Gathering Statistics from syslog

The log files that syslog creates provide a wealth of information that can be used to examine and tune the performance of sendmail . To illustrate, we will present a simple shell script for printing daily total message volume.

In the following discussion we will assume that sendmail logging is enabled (the LogLevel ( L ) option, see Section 34.8.33 , is nonzero) and that all syslog (8) messages for the facility mail at level LOG_INFO are being placed into the file /var/log/syslog .

26.1.4.1 message_volume.sh

Each mail message that sendmail receives for delivery (excluding those processed from the queue) causes sendmail to log a message like this:

date host sendmail[pid]: quid: from=sender, size=bytes, ...

That is, for each sender that is logged (the from= ), sendmail also logs the total received size of the message in bytes (the size= ).

By summing all the size= lines in a /var/log/syslog file, we can generate the total volume of all messages received for the period represented by that file. One way to generate such a total is shown in the following Bourne shell script:

#!/bin/sh
LOG=/var/log/syslog
TOTAL=`(echo 0;
        sed -e '/size=/!d' -e 's/.*size=//' -e 's/,.*/+/' $LOG;
        echo p;
       ) | dc`
echo Total characters sent: $TOTAL

The sed (1) selects only the lines in /var/log/syslog that contain the expression size= . [1] It then throws away all but the number immediately following each size= (the actual number of bytes of each message), and appends a + to each.

[1] If other programs also put size= expressions into the log file, you may also want to screen for "sendmail."

The entire sequence of processes is enclosed in parentheses. An echo statement first prints a zero. Then the list of + -suffixed sizes is printed. Finally, another echo prints a character p . The resulting combined output might look like this:

123+
456+
7890+
p

The leading 0 , the + suffixes, and the final p are commands for the dc (1) program, which adds up all the numbers (the + suffixes) and prints the total (the p ). That total is saved in the variable TOTAL for later use in the final echo statement. The output of this simple script might look something like this:

Total characters sent: 8469

More sophisticated scripts are possible, but the Bourne shell's lack of arrays suggest that perl (1) would provide a more powerful scripting environment. Most of the scripts that are available publicly are written in the perl scripting language.