7.3 Check Headers with Rule Sets
Beginning with V8.10,
sendmail provides the ability to screen selected
headers with rule sets. This is described in detail in Section 25.5. In this section we show two more techniques
for using header checks to reject spam messages.
7.3.1 Virus Screening by Subject
Many messages that contain viruses, worms, or Trojan horses have
distinctive subject lines, the text of which is usually reported in
the news. When a new virus is discovered, it is often quicker to
reject messages based on its reported subject line than it is to
await the latest update of your favorite virus filter software. But
this is only a temporary fix. Because legitimate email will often
share the same subjects, it is best to only screen on the
Subject: header between the time the virus is
detected and announced, and the time your virus screening software is
updated.
One way to screen by subject is to create a database of subject lines
to reject, and then use that database in a subject checking rule set.
Consider the following text file which contains one subject per line.
The subject is to the left, the word REJECT is to the right, and the
two are separated by one or more tab characters:
I Love You	REJECT
Visit Home Now!	REJECT
If you were to call this file
/etc/mail/spamsubjects, you could turn it into a
database map with commands like this:
# cd /etc/mail
# makemap -t\ tab hash spamsubjects < spamsubjects
The -t command-line switch tells
makemap that the key and value pairs are
separated only by a tab, rather than by any run of spaces or tabs.
In the command above, the word tab stands for a literal tab
character, and the backslash protects that tab from interpretation
by your shell. We use that command-line switch because our keys can
contain internal spaces.
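To see why the separator matters, here is a minimal sketch in Python that models the two ways of splitting a source line. It is not makemap itself and ignores everything else makemap does; the function names parse_tab_separated and parse_whitespace_separated are hypothetical, chosen only for this illustration.

```python
def parse_tab_separated(text):
    """Return {key: value}, splitting each line at its first tab only."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, value = line.partition("\t")
        # Extra tabs between key and value are padding, so strip them.
        table[key] = value.strip("\t")
    return table


def parse_whitespace_separated(text):
    """Return {key: value}, splitting each line at its first space or tab."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, value = line.split(None, 1)
        table[key] = value
    return table


# The two sample subjects from the text, separated from REJECT by tabs.
source = "I Love You\tREJECT\nVisit Home Now!\t\tREJECT\n"
print(parse_tab_separated(source))
print(parse_whitespace_separated(source))
```

With tab-only splitting, the whole subject survives as the key. With splitting on any whitespace, the key is cut off at the first space, which is why the -t switch is needed for keys that contain spaces.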
Once this database is in place, it will be easy to update its
contents whenever a new virus is announced. Because it is a database,
you will be able to update it without having to restart
sendmail. In fact, because the righthand side
says REJECT, you simply have to change that word to OK to allow a
header. This allows you to maintain a history of spam subjects for
later review or reuse.
The rules for the use of this database can be added to your
mc configuration file like this:
LOCAL_CONFIG
Kspamsubjdb hash /etc/mail/spamsubjects
HSubject: $>ScreenSubject
LOCAL_RULESETS
SScreenSubject
R $*	$: $(spamsubjdb $&{currHeader} $: OK $) $1
R REJECT $*	$#error $: "553 Subject:" $1 ": Indicates virus, rejected"
Here, the LOCAL_CONFIG part defines a database map called
spamsubjdb of type hash that
will use the database file you created earlier. The second line under
LOCAL_CONFIG declares a check for the Subject: header, and says
that the value of that header should be passed (the
$> operator) through the
ScreenSubject rule set.
In the LOCAL_RULESETS part of your mc file the
S configuration line defines the
ScreenSubject rule set, which has just two rules.
The first rule matches the entire workspace (the
$* operator in the LHS) and looks up the header's
value (referenced in the RHS with the $& prefix) in the database map
called spamsubjdb. If the literal text of the
Subject: header's value is found
in the database, the token from the right side of the database, the
REJECT in our example, is returned. If it is not found, the default
that follows the $: operator inside the lookup,
here OK, is returned. Whichever token is
returned, the original subject value is appended to it (the trailing
$1 operator).
The second rule looks for the literal text REJECT in the workspace,
followed by zero or more tokens (the $* operator).
If the workspace begins with REJECT, the message is rejected;
otherwise, it is accepted.
The RHS of the second rule performs the rejection. The
$#error instructs sendmail to
reject the message. The $: part defines the text
of the error message that will be issued. For a subject value of
I Love You, the following error will be produced
during the SMTP exchange.
553 5.3.5 Subject: I Love You : Indicates virus, rejected
Note that when sendmail sees an SMTP code of 553
that is not followed by a DSN code, it will insert the appropriate
DSN code, here the 5.3.5.
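The two-rule logic can be modeled with a short Python sketch. This is only an illustration of the lookup-then-reject flow described above, not how sendmail evaluates rules. The names SPAM_SUBJECTS and screen_subject are hypothetical, the dictionary stands in for the spamsubjdb database map, and the model does not insert the DSN code that sendmail adds.

```python
# Stand-in for the spamsubjdb database map (assumed contents).
SPAM_SUBJECTS = {"I Love You": "REJECT", "Visit Home Now!": "REJECT"}


def screen_subject(subject, table=SPAM_SUBJECTS):
    """Return an error string if the subject is listed as REJECT, else None."""
    # Rule 1: look up the subject, defaulting to OK, and keep the
    # original subject after the verdict in the workspace.
    workspace = (table.get(subject, "OK"), subject)

    # Rule 2: reject only when the workspace begins with REJECT.
    verdict, original = workspace
    if verdict == "REJECT":
        return f"553 Subject: {original} : Indicates virus, rejected"
    return None  # no rule matched, so the header is accepted


print(screen_subject("I Love You"))
print(screen_subject("Lunch on Friday?"))
```

Changing a value in the table from REJECT to OK makes the second rule stop matching, which mirrors the way an entry can be disabled in the database without deleting it.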
Finally, we say again that you should reject email based on the
subject only as a temporary measure. The likelihood that legitimate
email will have an identical subject is very high. When erring, it is
better to allow the occasional spam than it is to reject any
legitimate email.
7.3.2 Check Validity of Received:
The Received: header traces the succession of
hosts that an email message passes through. One technique used by
spam messages is to create false Received: headers
both to mask the real identity of the original sending host, and to
divert blame to some innocent site. One form of bad
Received: header that appears in spam messages
looks like this:
Received: from ...............................................................
........................................................................
........................................................................
........................................................................
........................................................................!
This form of Received: header was popular with
spam software for a few months, then fell out of favor. The following
rule shows one way of dealing with such headers:
LOCAL_RULESETS
H*: $>ScreenForDots
SScreenForDots
R $+ .......... $*	$#error $: "553 Ten or more dots begin " $&{hdr_name} "header"
Here, the LOCAL_RULESETS part of your mc file
begins with an unusual-looking H configuration
command. The H* is special (Section 25.5.2) because it matches any header that has no rule set of its own. When
sendmail screens headers, it first calls each
rule set specified for a specific named header (as with
Subject: in the previous section). If no rule set
exists for a particular header name, sendmail
next looks for the special definition H* and, if
found, passes the header to that rule set. You can think of
H* as specifying a default rule set.
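The selection order just described can be modeled in a few lines of Python. This is a sketch under the assumption that the Subject: declaration from the previous section and the H* declaration shown here are both in the same configuration. The names HEADER_RULE_SETS, DEFAULT_RULE_SET, and rule_set_for are hypothetical.

```python
# Rule sets declared for specific named headers (HSubject: $>ScreenSubject).
HEADER_RULE_SETS = {"Subject": "ScreenSubject"}

# Rule set declared with H*, used when no named entry exists.
DEFAULT_RULE_SET = "ScreenForDots"


def rule_set_for(header_name):
    """Pick the header's own rule set if it has one, else the H* default."""
    return HEADER_RULE_SETS.get(header_name, DEFAULT_RULE_SET)


print(rule_set_for("Subject"))
print(rule_set_for("Received"))
```

In this model a Subject: header never reaches ScreenForDots, because the default applies only to headers that have no rule set of their own.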
The rule set named ScreenForDots has only a single
rule. That rule matches any value part of any header that does not
have its own rule set. The LHS checks for a value that contains one
or more tokens (the $+ operator), followed by 10 dots, followed by
zero or more arbitrary tokens (the $* operator).
Any header that has such a bad value will be rejected and the message
bounced. The bounce will have the following text as its error, where
the offending header was the Received: header
shown earlier:
553 5.3.5 Ten or more dots begin Received header
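The pattern match can be approximated in Python as follows. This is a rough model, not sendmail's tokenizer or rule matcher: it assumes that each dot is a separate token and that any other run of non-space characters is one token. The names tokenize and screen_for_dots are hypothetical, and the DSN code that sendmail inserts is omitted.

```python
import re


def tokenize(value):
    """Crude stand-in for tokenizing: each dot is a token of its own."""
    return re.findall(r"\.|[^\s.]+", value)


def screen_for_dots(hdr_name, value, run=10):
    """Return an error string if the value matches '$+ <run dots> $*'."""
    tokens = tokenize(value)
    # Start at index 1 because $+ needs at least one token before the dots.
    for start in range(1, len(tokens) - run + 1):
        if all(tok == "." for tok in tokens[start:start + run]):
            return f"553 Ten or more dots begin {hdr_name} header"
    return None  # no match, so the header is accepted


bad = "from " + "." * 63
good = "from mail.example.com by host.example.org"
print(screen_for_dots("Received", bad))
print(screen_for_dots("Received", good))
```

Ordinary hostnames contain dots too, but never ten in a row, so they pass through this model unharmed.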
Remember that the techniques used by spam email senders change over
time—the bad guys learn and adapt too. We solved the dots in
the Received: header with a general rule set
because it was transient (a spam technique used for a brief period,
then abandoned). The problem will doubtless appear again, perhaps in
a different header, or when some poor sap downloads an old version of
spamming software. But by solving it with a general-purpose rule set
(the H* one), we anticipate the return of the
technique in the future, possibly with a differently named
header.