7.3 Check Headers with Rule Sets

Beginning with V8.10, sendmail provides the ability to screen selected headers with rule sets. This is described in detail in Section 25.5. In this section we show two more techniques for using header checks to reject spam messages:

Reject messages that have subjects which indicate that the message contains a virus.
Reject messages that have an illegally formed Received: header.

7.3.1 Virus Screening by Subject

Many messages that contain viruses, worms, or Trojan horses have distinctive subject lines, the text of which is usually reported in the news. When a new virus is discovered, it is often quicker to reject messages based on its reported subject line than it is to await the latest update of your favorite virus filter software. But this is only a temporary fix. Because legitimate email will often share the same subjects, it is best to only screen on the Subject: header between the time the virus is detected and announced, and the time your virus screening software is updated.

One way to screen by subject is to create a database of subject lines to reject, and then use that database in a subject checking rule set. Consider the following text file which contains one subject per line. The subject is to the left, the word REJECT is to the right, and the two are separated by one or more tab characters:

I Love You       REJECT
Visit Home Now!  REJECT

If you were to call this file /etc/mail/spamsubjects, you could turn it into a database map with commands like this:

# cd /etc/mail 
# makemap -t\<tab> hash spamsubjects < spamsubjects

Here, <tab> stands for a literal tab character typed at that position. The -t command-line switch tells makemap that each key is separated from its value by a tab only, rather than by the default of any whitespace (spaces or tabs). The backslash protects the tab from interpretation by your shell. We use that command-line switch because our keys can contain internal spaces.^[13]

^[13] Depending on your shell, you might have to prefix the tab with a control-V character to embed it into your command line.
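Because a literal tab is awkward to type and easy to lose, it can help to generate the file with printf and check its layout before building the map. The following is a minimal sketch under stated assumptions: WORKDIR is a hypothetical scratch directory standing in for /etc/mail, and the makemap step runs only where makemap is installed.

```shell
#!/bin/sh
# Sketch: create the subject list with literal tabs and verify its layout.
# WORKDIR is a hypothetical scratch directory; the text uses /etc/mail.
WORKDIR=$(mktemp -d)
SUBJECTS="$WORKDIR/spamsubjects"

# printf expands \t into a real tab, so nothing has to be typed by hand.
printf 'I Love You\tREJECT\n'      >  "$SUBJECTS"
printf 'Visit Home Now!\tREJECT\n' >> "$SUBJECTS"

# Count lines that do not have exactly two tab-separated fields.
bad_lines=$(awk -F '\t' 'NF != 2 { n++ } END { print n + 0 }' "$SUBJECTS")
echo "malformed lines: $bad_lines"

# Build the map only where makemap is installed.
if command -v makemap >/dev/null 2>&1; then
    TAB=$(printf '\t')
    makemap -t "$TAB" hash "$SUBJECTS" < "$SUBJECTS" || echo "makemap failed" >&2
fi
```

Passing the tab through a shell variable avoids the backslash and control-V quoting described above, at the cost of one extra line.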

Once this database is in place, it will be easy to update its contents whenever a new virus is announced. Because it is a database, you will be able to update it without having to restart sendmail. In fact, because the righthand side says REJECT, you simply have to change that word to OK to allow a header. This allows you to maintain a history of spam subjects for later review or reuse.
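Changing an entry from REJECT to OK can be scripted so that the source file keeps its history. The sketch below is one possible approach, not a sendmail facility: allow_subject and SUBJECTS are hypothetical names, and the sample data is written only when the file is empty.

```shell
#!/bin/sh
# Sketch: flip one subject from REJECT to OK and keep the entry as history.
# allow_subject and SUBJECTS are hypothetical names used for illustration.
SUBJECTS=${SUBJECTS:-$(mktemp)}
[ -s "$SUBJECTS" ] || printf 'I Love You\tREJECT\nVisit Home Now!\tREJECT\n' > "$SUBJECTS"

allow_subject() {
    # Rewrite the value of the matching key; all other lines pass through.
    awk -F '\t+' -v subj="$1" '
        BEGIN { OFS = "\t" }
        $1 == subj { $2 = "OK" }
        { print $1, $2 }' "$SUBJECTS" > "$SUBJECTS.new" &&
        mv "$SUBJECTS.new" "$SUBJECTS"
}

allow_subject "I Love You"

# Rebuild the map afterward, as in the text, where makemap is available:
#   makemap -t "$(printf '\t')" hash "$SUBJECTS" < "$SUBJECTS"
```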

The rules for the use of this database can be added to your mc configuration file like this:

LOCAL_CONFIG
Kspamsubjdb hash /etc/mail/spamsubjects
HSubject: $>ScreenSubject

LOCAL_RULESETS
SScreenSubject
R $*            $: $(spamsubjdb $&{currHeader} $: OK $) $1
R REJECT $*     $#error $: "553 Subject:" $1 ": Indicates virus, rejected"

Here, the LOCAL_CONFIG part defines a database map called spamsubjdb of type hash that will use the database file you created earlier. The second line under LOCAL_CONFIG declares the Subject: header, and says that the value of that header should be passed (the $> operator) through the ScreenSubject rule set.

In the LOCAL_RULESETS part of your mc file, the S configuration line defines the ScreenSubject rule set, which has just two rules.

The first rule matches the entire workspace (the $* operator) in the LHS. Its RHS then looks up the value of the Subject: header in the database map called spamsubjdb. If the literal text of the Subject: header's value is found in the database, the token from the right side of the database (the REJECT in our example) is returned. If it is not found, the default, OK, is returned (as indicated by the $: operator inside the lookup). Whichever token is returned, the original workspace is appended after it (the trailing $1 operator).

The second rule looks for the literal text REJECT in the workspace, followed by zero or more tokens (the $* operator). If the workspace begins with REJECT, the message is rejected. Otherwise, it is accepted.

The RHS of the second rule performs the rejection. The $#error instructs sendmail to reject the message, and the $: part defines the text of the error message that will be issued. For a subject value of I Love You, the following error will be produced during the SMTP exchange:

553 5.3.5 Subject: I Love You : Indicates virus, rejected

Note that when sendmail sees an SMTP code of 553 that is not followed by a DSN code, it will insert the appropriate DSN code, here the 5.3.5.
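The decision made by the two rules can be approximated outside of sendmail, which is handy for checking a subject list before it goes live. The sketch below is only a model, not how sendmail itself evaluates rules: screen_subject and SUBJECTS are hypothetical names, it reads the plain-text source file rather than the built map, and it folds case on the assumption that the map was built and declared without the -f switch.

```shell
#!/bin/sh
# Sketch: mimic the two ScreenSubject rules against the plain-text source file.
# screen_subject and SUBJECTS are hypothetical names, not part of sendmail.
SUBJECTS=${SUBJECTS:-$(mktemp)}
[ -s "$SUBJECTS" ] || printf 'I Love You\tREJECT\nVisit Home Now!\tREJECT\n' > "$SUBJECTS"

screen_subject() {
    # Rule 1: look up the subject; fall back to OK when no key matches.
    # Case is folded, assuming the map is used without the -f switch.
    verdict=$(awk -F '\t+' -v subj="$1" '
        tolower($1) == tolower(subj) { print $2; found = 1; exit }
        END { if (!found) print "OK" }' "$SUBJECTS")

    # Rule 2: only a REJECT verdict turns into an error.
    # sendmail would add the DSN code itself; it is left out here.
    if [ "$verdict" = "REJECT" ]; then
        echo "553 Subject: $1 : Indicates virus, rejected"
        return 1
    fi
    echo "accepted: $1"
}
```

Calling screen_subject "I Love You" prints the rejection text and returns a nonzero status, while a subject that is absent from the file, or one marked OK, is reported as accepted.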

Finally, we say again that you should reject email based on the subject only as a temporary measure. The likelihood that legitimate email will have an identical subject is very high. When erring, it is better to allow the occasional spam than it is to reject any legitimate email.

7.3.2 Check Validity of Received:

The Received: header traces the succession of hosts that an email message passes through. One technique used by spam messages is to create false Received: headers, both to mask the real identity of the original sending host and to divert blame to some innocent site. One form of bad Received: header that appears in spam messages looks like this:

Received: from ...............................................................
........................................................................
........................................................................
........................................................................
........................................................................!

This form of Received: header was popular with spam software for a few months, then fell out of favor. The following rule shows one way of dealing with such headers:

LOCAL_RULESETS
H*: $>ScreenForDots

SScreenForDots
R $+ .......... $*     $#error $: "553 Ten or more dots begin " $&{hdr_name} "header"

Here, the LOCAL_RULESETS part of your mc file begins with an unusual-looking H configuration command. The H* is special (Section 25.5.2) because it matches all headers. When sendmail screens headers, it first calls each rule set specified for a specific named header (as with Subject: in the previous section). If no rule set exists for a particular header name, sendmail next looks for the special definition H* and, if found, passes the header to that rule set. You can think of H* as specifying a default rule set.

The rule set named ScreenForDots has only a single rule. That rule is applied to the value part of any header that does not have its own rule set. The LHS matches a value made up of one or more tokens (the $+ operator, here the word from), followed by 10 dots, followed by zero or more arbitrary tokens (the $* operator).

Any header that has such a bad value will be rejected and the message bounced. The bounce will have the following text as its error, where the offending header was the Received: header shown earlier:

553 5.3.5 Ten or more dots begin Received header
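Before enabling a rule like this, it can be useful to see which headers in saved mail would trip it. The following sketch is a rough approximation written for that purpose, not a reimplementation of sendmail's token matching: find_dotted_headers is a hypothetical helper that reads one message on standard input, looks only at the first line of each header (continuation lines are ignored), and prints the name of any header whose value contains a run of ten or more dots.

```shell
#!/bin/sh
# Sketch: list the names of headers whose value holds ten or more dots in a row.
# find_dotted_headers is a hypothetical helper, not a sendmail tool.
find_dotted_headers() {
    awk '
        # A blank line ends the header block, so stop reading there.
        /^$/ { exit }
        # Header name, colon, some text, then ten literal dots.
        /^[^ \t:]+:.+\.\.\.\.\.\.\.\.\.\./ {
            sub(/:.*/, "")    # keep only the header name
            print
        }'
}

# Example use with a saved message (the filename is hypothetical):
#   find_dotted_headers < suspect-message.txt
```

Unlike the H* rule set, this check looks at every header, including those that have their own rule set, so it may report more than sendmail would reject.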

Remember that the techniques used by spam email senders change over time; the bad guys learn and adapt too. We handled the dots in the Received: header with a general rule set because the technique was transient (used for a brief period, then abandoned). The problem will doubtless appear again, perhaps in a different header, or when some poor sap downloads an old version of spamming software. By using a general-purpose rule set (the H* one), we anticipate the return of the technique in the future, possibly in a differently named header.