Managing System Logs (Running Linux)

8.3. Managing System Logs

The syslogd utility logs various kinds of system activity, such as debugging output from sendmail and warnings printed by the kernel. syslogd runs as a daemon and is usually started in one of the rc files at boot time.

The file /etc/syslog.conf is used to control where syslogd records information. Such a file might look like the following:

*.info;*.notice    /var/log/messages 
mail.debug         /var/log/maillog 
*.warn             /var/log/syslog 
kern.emerg         /dev/console

The first field of each line lists the kinds of messages that should be logged, and the second field lists the location where they should be logged. The first field is of the format:

facility.level [; facility.level … ]

where facility is the system application or facility generating the message, and level is the severity of the message.

For example, facility can be mail (for the mail daemon), kern (for the kernel), user (for user programs), or auth (for authentication programs such as login or su). An asterisk in this field specifies all facilities.

level can be (in increasing severity): debug, info, notice, warning, err, crit, alert, or emerg. The older spelling warn, used in the sample file above, is accepted as a synonym for warning.

With the standard syslogd, a level selects messages of that severity and higher. In the previous /etc/syslog.conf, we therefore see that all messages of severity info or higher are logged to /var/log/messages (the *.notice entry is already covered by *.info), all messages from the mail daemon, down to debug level, are logged to /var/log/maillog, and all messages of severity warn or higher are logged to /var/log/syslog. Also, any emerg messages from the kernel are sent to the console (which is the current virtual console, or an xterm started with the -C option).
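As a rough illustration of how one facility.level selector is compared against a message, here is a minimal shell sketch. The function names level_rank and selector_matches are made up for this example, the ranking simply follows the order of the level list above, and the "this level or more severe" rule is assumed; the real syslogd supports more syntax than this.

```sh
# level_rank NAME: print the position of a level in the increasing-severity
# list (debug=0 ... emerg=7). "warn" is accepted for "warning".
# This numbering is for illustration only.
level_rank() {
    case $1 in
        debug)        echo 0 ;;
        info)         echo 1 ;;
        notice)       echo 2 ;;
        warning|warn) echo 3 ;;
        err)          echo 4 ;;
        crit)         echo 5 ;;
        alert)        echo 6 ;;
        emerg)        echo 7 ;;
        *)            return 1 ;;
    esac
}

# selector_matches SELECTOR FACILITY LEVEL: succeed if a single
# facility.level selector covers a message from FACILITY at LEVEL,
# assuming a level means "this severity or higher".
selector_matches() {
    sel_facility=${1%%.*}
    sel_level=${1#*.}
    [ "$sel_facility" = "*" ] || [ "$sel_facility" = "$2" ] || return 1
    want=$(level_rank "$sel_level") || return 1
    have=$(level_rank "$3") || return 1
    [ "$have" -ge "$want" ]
}

# Does the mail.debug line catch an err message from the mail daemon?
if selector_matches "mail.debug" mail err; then
    echo "mail.debug covers mail/err"
fi
```

Under these assumptions, kern.emerg would not cover a kernel message of level crit, while *.warn would cover a warning from any facility.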

The messages logged by syslogd usually include the date, an indication of what process or facility delivered the message, and the message itself--all on one line. For example, a kernel error message indicating a problem with data on an ext2fs filesystem might appear in the log files as:

Dec  1 21:03:35 loomer kernel: EXT2-fs error (device 3/2):
  ext2_check_blocks_bitmap: Wrong free blocks count in super block,
  stored = 27202, counted = 27853

Similarly, a use of the su command might be logged as:

Dec 11 15:31:51 loomer su: mdw on /dev/ttyp3
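The fields described above can be pulled apart with the shell's read builtin. The following is only a sketch: the function name split_log_line is made up for this example, and it assumes the "month day time host source: text" layout seen in the sample messages, which not every logged line follows.

```sh
# split_log_line LINE: print the timestamp, host, source, and message text
# of a syslog-style line, one item per output line.
split_log_line() {
    # read splits on runs of whitespace, so a padded day such as
    # "Dec  1" is handled; the last variable receives the rest of the line.
    printf '%s\n' "$1" | {
        read -r month day time host tag message
        echo "time: $month $day $time"
        echo "host: $host"
        echo "source: ${tag%:}"
        echo "message: $message"
    }
}

split_log_line "Dec 11 15:31:51 loomer su: mdw on /dev/ttyp3"
```

A function like this can be handy for quick filtering, for example to list only the lines whose source is su.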

Log files can be important in tracking down system problems. If a log file grows too large, you can delete it using rm; it will be recreated when syslogd starts up again.

Your system probably comes equipped with a running syslogd and an /etc/syslog.conf that does the right thing. However, it's important to know where your log files are and what programs they represent. If you need to log many messages (say, debugging messages from the kernel, which can be very verbose), you can edit syslog.conf and then tell syslogd to reread its configuration file with the command:

kill -HUP `cat /var/run/syslog.pid`

Note the use of backquotes to obtain the process ID of syslogd, contained in /var/run/syslog.pid.
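If the PID file is missing or stale, the one-line command above fails with a less than helpful message. A slightly more careful version might look like the following sketch. The function name reload_daemon is made up for this example, and the PID file location is the one given in the text; it may differ on your system.

```sh
# reload_daemon PIDFILE: send SIGHUP to the process whose ID is stored in
# PIDFILE, after checking that the file is readable and that the process
# can be signaled.
reload_daemon() {
    pidfile=$1
    if [ ! -r "$pidfile" ]; then
        echo "cannot read $pidfile" >&2
        return 1
    fi
    pid=$(cat "$pidfile")
    # kill -0 delivers no signal; it only checks that we could send one.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "cannot signal process $pid" >&2
        return 1
    fi
    kill -HUP "$pid"
}

# Typical use, as root, with the PID file named in the text:
#   reload_daemon /var/run/syslog.pid
```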

Other system logs might be available as well. These include:

/var/log/wtmp

This binary file records user logins and logouts; the last command reads it to list login history.

Note that the format of the wtmp and utmp files differs from system to system. Some programs may be compiled to expect one format and others another format. For this reason, commands that use the files may produce confusing or inaccurate information--especially if the files become corrupted by a program that writes information to them in the wrong format.

Log files can get quite large, and if disk space is tight, you have to keep them from filling up your partitions. Of course, you can delete the log files from time to time, but you may not want to do this, since the log files also contain information that can be valuable in crisis situations.

One option is to move the log file aside from time to time and compress the copy, so that the log file itself starts again at zero length. Here is a short shell script that does this for the log file /var/log/messages:

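#!/bin/sh
# Move the live log aside and create an empty file in its place.
mv /var/log/messages /var/log/messages-backup
cp /dev/null /var/log/messages

# syslogd still has the renamed file open; make it reopen its log files.
kill -HUP `cat /var/run/syslog.pid`

# Date string (MMDDYY) used as the suffix of the archived copy.
CURDATE=`date +"%m%d%y"`

mv /var/log/messages-backup /var/log/messages-$CURDATE
gzip /var/log/messages-$CURDATE

First, we move the log file to a different name and create an empty file under the original name by copying from /dev/null. Because syslogd keeps its log files open, it would otherwise go on writing to the renamed file, so we send it a hangup signal to make it reopen /var/log/messages. Further logging can then be done without problems while the next steps are done. Then, we compute a date string for the current date that is used as a suffix for the filename, rename the backup file, and finally compress it with gzip.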

You might want to run this small script from cron, but as it is presented here, it should not be run more than once a day--otherwise the compressed backup copy will be overwritten, because the filename reflects the date but not the time of day. If you want to run this script more often, you must use additional numbers to distinguish between the various copies.
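One way to distinguish several copies made on the same day is to add the time of day to the suffix. This is only one possible scheme, and the function name archive_name is made up for this example:

```sh
# archive_name FILE: print FILE followed by a date-and-time suffix in the
# form MMDDYY-HHMMSS, so that copies made on the same day get distinct names.
archive_name() {
    echo "$1-$(date +%m%d%y-%H%M%S)"
}

archive_name /var/log/messages
```

The result can be used in place of /var/log/messages-$CURDATE when renaming the backup copy.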

There are many more improvements that could be made here. For example, you might want to check the size of the log file first and only copy and compress it if this size exceeds a certain limit.
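The size check could be sketched as follows. The function name log_exceeds and the one-megabyte limit in the usage comment are assumptions made for this example, not recommended values.

```sh
# log_exceeds FILE LIMIT: succeed if FILE exists and is larger than
# LIMIT bytes.
log_exceeds() {
    [ -f "$1" ] || return 1
    # The arithmetic expansion strips any padding that wc may print.
    size=$(( $(wc -c < "$1") ))
    [ "$size" -gt "$2" ]
}

# Possible use at the top of the script:
#   log_exceeds /var/log/messages 1048576 || exit 0
```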

Even though this is already an improvement, your partition containing the log files will eventually get filled. You can solve this problem by keeping only a certain number of compressed log files (say, 10) around. When you have created as many log files as you want to have, you delete the oldest, and overwrite it with the next one to be copied. This principle is also called log rotation. Some distributions have scripts like savelog or logrotate that can do this automatically.
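To make the idea concrete, here is a minimal sketch of such a rotation. The function name rotate_log and the numbered naming scheme (FILE.1.gz for the newest copy) are assumptions made for this example; savelog and logrotate use their own conventions and do considerably more. As with the script above, a running syslogd would still have to be sent a hangup signal afterward so that it reopens the fresh log file.

```sh
# rotate_log FILE KEEP: keep at most KEEP compressed generations of FILE,
# named FILE.1.gz (newest) through FILE.KEEP.gz (oldest).
rotate_log() {
    file=$1
    keep=$2

    # Delete the oldest generation, then shift the others up by one.
    rm -f "$file.$keep.gz"
    i=$keep
    while [ "$i" -gt 1 ]; do
        prev=$((i - 1))
        if [ -f "$file.$prev.gz" ]; then
            mv "$file.$prev.gz" "$file.$i.gz"
        fi
        i=$prev
    done

    # Archive the current log as generation 1 and start an empty one.
    if [ -f "$file" ]; then
        mv "$file" "$file.1"
        cp /dev/null "$file"
        gzip "$file.1"
    fi
}

# Possible use, keeping ten old copies:
#   rotate_log /var/log/messages 10
```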

To finish this discussion, it should be noted that most recent distributions like SuSE, Debian, and Red Hat already have built-in cron scripts that manage your log files and are much more sophisticated than the small one presented here.