25.5 Rules Check Header Contents
Recall that a header line declaration
looks like the following:
H?flags?name:field
Here, the H begins the line and tells
sendmail that a header definition follows. The
?flags?
expression causes sendmail to include the header
only if one of the flags is found in the
selected delivery agent's F=
equate. As you saw in the previous section, beginning with V8.10, a
macro name can replace the flags. The
name and a colon then follow.
Beginning with V8.10, sendmail allows the name
of a rule set to replace the field value.
That rule set declaration can come in two forms:
Hname: $> rule set
Hname: $>+ rule set        (don't strip comments)
Both forms basically say the same thing: if
sendmail finds a header
name already in a message it is processing, it
passes the existing header field to the
rule set indicated. The + in the second form tells
sendmail to leave intact (not strip)
parenthesized RFC2822 comments from the passed
field:
text (comments)
The $> in the earlier declaration passes just
text to the rule set, while
$>+ passes the unstripped text with RFC2822
comments intact.
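The difference between the two forms can be pictured with a small Python model. This is only a sketch of the idea, not sendmail's parser: the function names strip_comments and value_passed are hypothetical, and the model ignores quoting and sendmail's tokenizing, which can alter whitespace in the value that a rule set actually receives.

```python
def strip_comments(value: str) -> str:
    """Drop parenthesized comments (nesting allowed); a rough model only."""
    kept = []
    depth = 0
    for ch in value:
        if ch == "(":
            depth += 1          # entering a comment
        elif ch == ")" and depth > 0:
            depth -= 1          # leaving a comment
        elif depth == 0:
            kept.append(ch)     # outside any comment: keep the character
    # Collapse the whitespace left behind where comments were removed.
    return " ".join("".join(kept).split())


def value_passed(value: str, keep_comments: bool) -> str:
    """keep_comments=False models $>, keep_comments=True models $>+."""
    return value if keep_comments else strip_comments(value)


print(value_passed("text (comments)", keep_comments=False))
print(value_passed("text (comments)", keep_comments=True))
```

The first call prints only the text, and the second prints the text together with its parenthesized comment.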
If the rule set specified is not a legal rule set name, or if it is
missing, the following error will be printed and logged:
cf file name: line number: invalid rule set name: "bad name"
If the named rule set does not exist in the configuration file, the
effect is the same as if it did exist and had returned a legal value.
Rule sets called to process headers can return two possible rejection
values, a $#error or a
$#discard. If a $#error is
returned, the entire message is rejected. If a
$#discard is returned, the message is accepted,
then silently discarded. If anything else is returned, the message
and that header are both allowed. To illustrate, consider the
following code, which rejects spam messages whose
To: header contains unwanted usernames:
LOCAL_CONFIG
C{SpamUserNames} investor adult friend you ValuedCustomer Valued-Customer
HTo: $>ScreenTo
LOCAL_RULESETS
SScreenTo
R $* $={SpamUserNames} @ $* $#error $: "553 To: header rejected"
R $* $: OK
In the LOCAL_CONFIG part of your mc file, the
line beginning with C declares a class and assigns
values to that class. The class name is
{SpamUserNames} and the class contains as its
values six usernames that commonly appear as the user part of
addresses in the To: header.
The line beginning with H declares a
To: header and a rule set to handle that header.
The $> tells sendmail to
strip parenthesized RFC2822 comments from the address that followed
the To: in the message, and to pass that stripped
address to the ScreenTo rule set.
The LOCAL_RULESETS part of this mc file contains
a single rule set, the ScreenTo rule set, which
contains two rules. The first rule asks if the address in the
workspace has a user part that matches any of the names listed in the
class $={SpamUserNames}. If the address contains
an objectionable username, the entire message is rejected by
returning the error delivery agent with the
expression $#error.
The last rule (the $*) causes all other addresses
to return OK. Technically, the last rule is not
needed because, even in its absence, the original workspace will be
returned, and because that original workspace will contain neither
$#error nor $#discard, the
message will be allowed.
The $: part following the
$#error is required. It tells
sendmail how to reject the message. See the description of the error delivery agent for how this process works.
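The logic of the ScreenTo rule set, and the way its return value decides the fate of the message, can be modeled in a few lines of Python. This is a sketch under stated assumptions, not sendmail's matching engine: the function names screen_to and disposition are hypothetical, the delimiter characters used to split the address into tokens are an assumption made for illustration, and matching is treated as case-insensitive.

```python
import re

SPAM_USER_NAMES = {"investor", "adult", "friend", "you",
                   "valuedcustomer", "valued-customer"}

# Assumed delimiter set for this sketch; sendmail's real tokenizer is
# driven by its own operator characters and is more involved.
_TOKEN = re.compile(r"[.:%@!^/\[\]+<>]|[^.:%@!^/\[\]+<>\s]+")


def screen_to(address: str) -> str:
    """Model of: R $* $={SpamUserNames} @ $*  ->  $#error ..."""
    tokens = _TOKEN.findall(address.lower())
    for i, tok in enumerate(tokens[:-1]):
        # A class member immediately followed by "@" is a banned user part.
        if tok in SPAM_USER_NAMES and tokens[i + 1] == "@":
            return '$#error $: "553 To: header rejected"'
    return "OK"   # the second rule: everything else returns OK


def disposition(result: str) -> str:
    """Map a rule set's return value to what happens to the message."""
    if result.startswith("$#error"):
        return "reject"
    if result.startswith("$#discard"):
        return "discard"
    return "allow"


for addr in ("investor@example.com", "bob@example.com"):
    print(addr, "->", disposition(screen_to(addr)))
```

In this model a name such as reinvestor is allowed, because the whole token must be a member of the class, not merely contain one.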
25.5.1 Use $>+ to Include RFC2822 Comments
Some headers contain addresses, along with
other important information, that appears as RFC2822 commentary. The
Received: header is one such header:
Each parenthesized span in the following is RFC2822 commentary:
Received: from some.other.domain (root@some.other.domain [29.22.14.17])
by your.domain (8.12.4/8.12.4) with ESMTP id g5CMW6KF010979
for <you@your.domain>; Wed, 12 Jun 2002 16:32:09 -0600 (MDT)
Other headers, such as the Subject: header, do not
contain addresses:
Subject: Make money now (Adult Triple-X web site)
When screening such headers, it is important that they not be
interpreted as addresses, or information might be lost.
Consider the previous Subject:
header's value. If such a header were screened with
an H configuration file line like this:
HSubject: $>ScreenSubject
the rule set named ScreenSubject would be given
the following value to parse:
Make money now
Beginning with V8.10, sendmail offers the
$>+ operator to prevent parenthetical RFC2822
comments from being stripped out of headers that do not contain
addresses as values:
HSubject: $>+ScreenSubject
By using this new operator, the original subject is passed to the
ScreenSubject rule set in a form that is much more
intact:
Make money now(Adult Triple-X web site)
Note that because of the way sendmail splits up
addresses and pastes them back together, the space between the
now and the ( has been lost.
But this does not matter because of the way rule matching operates.
As a side benefit, the ${currHeader}
sendmail macro is filled with the
header's value, and so will contain the original
header value unchanged and quoted. The fact that it is quoted is
important because quoting prevents the value from being viewed by
sendmail as tokens.
Consider the need to screen out messages that contain the text
Adult Triple-X anywhere in the
Subject: header.
LOCAL_CONFIG
KRegxxx regex -a@MATCH Adult Triple-X
HSubject: $>+ScreenSubject
LOCAL_RULESETS
SScreenSubject
R$* $: $( Regxxx $&{currHeader} $)
R@MATCH $#error $@ 5.7.0 $: "553 pornographic subject"
Here, the LOCAL_CONFIG part of this mc file
contains two configuration commands. The first creates a regular
expression database map (regex) called
Regxxx. It says to return (the
-a) the value @MATCH if the
value looked up contains the text Adult Triple-X
surrounded by any other text.
The second declares a header with the H
configuration command. This tells sendmail to
pass the value of all Subject: headers to the rule
set named ScreenSubject. The addition of the
+ to the $> prevents
sendmail from stripping RFC2822 parenthetical
comments from the value.
The LOCAL_RULESETS part of this mc file contains
a single rule set, the ScreenSubject rule set,
which contains two rules. The first rule looks up the unaltered
Subject:'s value in the
${currHeader} sendmail macro
using the Regxxx database map. If the value in the
${currHeader} macro contains the text
Adult Triple-X anywhere in it, the first rule
returns the new workspace value @MATCH. If the
text Adult Triple-X is not found, the value of the
${currHeader} macro is returned as the workspace.
The second rule looks for a match by detecting a workspace that
contains only @MATCH. If there is a match, the
message is rejected with the error message "553
pornographic subject."
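The two-rule flow just described can be modeled in Python as follows. This is a hedged sketch rather than a description of sendmail's regex map: the names regxxx and screen_subject are hypothetical stand-ins, and the sketch matches case-sensitively as an assumption, whereas the case handling of the real map depends on how it is declared.

```python
import re

PATTERN = re.compile(r"Adult Triple-X")


def regxxx(value: str) -> str:
    """Stand-in for the Regxxx map: @MATCH on a hit, else the value itself."""
    return "@MATCH" if PATTERN.search(value) else value


def screen_subject(curr_header: str) -> str:
    workspace = regxxx(curr_header)          # first rule: look up ${currHeader}
    if workspace == "@MATCH":                # second rule: workspace is only @MATCH
        return '$#error $@ 5.7.0 $: "553 pornographic subject"'
    return workspace                         # anything else allows the header


print(screen_subject("Make money now (Adult Triple-X web site)"))
print(screen_subject("Lunch on Friday?"))
```

Because the lookup uses the unstripped header value, the phrase is found even though it appears only inside the parenthesized comment.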
25.5.1.1 Check the header's length
Sometimes it can be desirable to reject headers based on their
length. As we described in the previous section, when a header is
screened with $> or $>+,
the unaltered value of the header is stored in the
${currHeader} macro. At the same time, the length
of the header's value is also stored in the
${hdrlen} macro.
To illustrate one possible use for this macro, consider the following
extract from your mc file:
LOCAL_CONFIG
# V8.10 and above
Kcompute arith
HSubject: $>ScreenSubject
LOCAL_RULESETS
SScreenSubject
R$* $: $(compute l $@ 200 $@ $&{hdrlen} $)
RTRUE $#error $@ 5.7.0 $: "553 Subject too long"
The LOCAL_CONFIG part of this mc file contains
two configuration commands. The first declares an
arith-type database map
named compute. The second tells
sendmail to screen all
Subject: headers using the
ScreenSubject rule set.
The LOCAL_RULESETS part of this mc file contains
a single rule set, the ScreenSubject rule set,
which has two rules. The first rule uses the
compute database map to compare the value in the
${hdrlen} macro with the constant 200. The
l asks if 200 is less than the value in
${hdrlen}. If it is, this rule will return
TRUE in the workspace. Otherwise, it will return
FALSE.
The second rule says that if the first rule returned
TRUE (200 is less than the
header's length; that is, the header's
length is greater than 200), reject the message.
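The boundary behavior of this comparison is easy to get wrong, so here is a minimal Python model of it. The function names less_than and screen_subject_length are hypothetical, and the sketch only mirrors the "is 200 less than the length" test described above; it is not how sendmail implements its arith map.

```python
def less_than(a: int, b: int) -> str:
    """Model of the l comparison: TRUE when a is less than b."""
    return "TRUE" if a < b else "FALSE"


def screen_subject_length(hdrlen: int, limit: int = 200) -> str:
    workspace = less_than(limit, hdrlen)     # first rule: is limit < hdrlen?
    if workspace == "TRUE":                  # second rule: reject on TRUE
        return '$#error $@ 5.7.0 $: "553 Subject too long"'
    return workspace


for length in (199, 200, 201):
    print(length, "->", screen_subject_length(length))
```

A header of exactly 200 bytes is allowed in this model, and 201 is the first length that is rejected.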
25.5.2 H* a Default for All Headers
The previous two sections have shown it is
possible to screen specific headers for properties to accept or
reject. There will be times, however, when you might wish to screen
all headers that do not have their own rule sets. Using an
* in place of the header name provides just such a
mechanism:
H*: $>ScreenAll
The * tells sendmail to pass
all headers, except those that have their own H
configuration line rule set, to the ScreenAll rule
set. Use $>+ instead of
$>, if you want to prevent
sendmail from stripping RFC2822 parenthetical
comments from each header's value.
Consider a site that sends email only to mailing lists. On such a
site, it is desirable to prevent mail that is considered spam from
going out. One way to do this is to reject all mail that contains
addresses that are either in Cc: or
Bcc: headers (good addresses should only be in
To: headers). Such a site might have an
mc file that contains the following:
LOCAL_CONFIG
C{BannedRecipientHeaders} Cc Bcc
H*: $>CheckBanned
LOCAL_RULESETS
SCheckBanned
R $* $: $&{hdr_name}
R $={BannedRecipientHeaders} $#error $@ 5.7.0 $: "553 Banned recipient header"
The LOCAL_CONFIG part of this mc file contains
two configuration commands. The first declares a class called
BannedRecipientHeaders and assigns to that class a
list of header names that should be banned, those being the
Cc: or Bcc: headers with the
colon removed.
The second configuration command starts with the wildcard form of the
H configuration command. The *
in place of a header's name causes all headers,
other than those that have their own H
configuration commands, to be screened by the
CheckBanned rule set.
The LOCAL_RULESETS part of this mc file contains
a single rule set, the CheckBanned rule set, which
contains two rules. The first rule simply replaces the workspace with
the value in the ${hdr_name}
sendmail macro. That macro contains as its
current value the name of the header passed to this rule set.
The second rule checks, on its LHS, to see if the header name is one
of those listed in the class
$={BannedRecipientHeaders}. If the header is
found, the entire message is rejected.
Note that this example will also reject inbound mail that contains
Cc: or Bcc: headers. A better
design would include a test to be sure the message originated from
the local machine.
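The way H* acts as a fallback, and the way CheckBanned then tests the header's name, can be pictured with the following Python model. It is a sketch only: pick_rule_set and check_banned are hypothetical names, the table of header-specific rule sets is invented for illustration, and header names are compared case-insensitively as an assumption.

```python
BANNED_RECIPIENT_HEADERS = {"cc", "bcc"}

# Hypothetical example: one header with its own H line, plus the H* default.
SPECIFIC_RULE_SETS = {"subject": "ScreenSubject"}
DEFAULT_RULE_SET = "CheckBanned"


def pick_rule_set(hdr_name: str) -> str:
    """Headers with their own H line use it; all others fall to H*."""
    return SPECIFIC_RULE_SETS.get(hdr_name.lower(), DEFAULT_RULE_SET)


def check_banned(hdr_name: str) -> str:
    workspace = hdr_name                     # first rule: workspace = ${hdr_name}
    if workspace.lower() in BANNED_RECIPIENT_HEADERS:   # second rule
        return '$#error $@ 5.7.0 $: "553 Banned recipient header"'
    return workspace


for name in ("Subject", "To", "Cc"):
    rule_set = pick_rule_set(name)
    result = check_banned(name) if rule_set == "CheckBanned" else "(not screened here)"
    print(name, "->", rule_set, "->", result)
```

In this model only headers that lack their own rule set ever reach CheckBanned, which is why the Subject: header is never tested against the banned list.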
25.5.3 The check_eoh Rule Set
After all
headers have been processed by sendmail, a
couple of statistics become available that can be of use in screening
messages. One is the number of headers found. The other is the total
number of bytes in all the headers (including the names, colons,
whitespace, and values). If you should ever need this information,
you can process it by declaring a special rule set named
check_eoh. If that rule set exists, it will be
passed the number of headers, and the total number of bytes in all
the headers:
number of headers $| total bytes
If it exists, sendmail will call the
check_eoh rule set after all headers have
otherwise been processed.
Some users have been known to bury information in headers that should
not leave a security-conscious site. Clearly, it is not possible to
individually screen all possible headers. Instead, one approach might
simply be to reject messages that contain more than 25 headers or
more than 10000 bytes of headers. The following extract from a
site's mc file does just that:
LOCAL_CONFIG
Kcompute arith
LOCAL_RULESETS
Scheck_eoh
R $* $| $* $: $(compute l $@ 25 $@ $1 $) $| $2
R TRUE $| $* $#error $@ 5.7.0 $: "553 Too many headers"
R $* $| $* $: $(compute l $@ 10000 $@ $2 $)
R TRUE $#error $@ 5.7.0 $: "553 Too many header bytes"
The LOCAL_CONFIG part of this mc file declares
an arith database map (arith) named compute.
The LOCAL_RULESETS part of this mc file declares
the specially named rule set check_eoh, which has
four rules.
The first rule passes $1, the value to the left of
the $| in the workspace, to the
compute database map. A comparison is made to see
if 25 is less than that value. If it is, this rule will return
TRUE, followed by a $| and
$2, in the workspace. Otherwise, it will return
FALSE, followed by a $| and
$2.
The second rule checks to see if the comparison was true. If it was
(if 25 is less than the number of headers—that is, if the
number of headers is greater than 25), the message is rejected.
The third rule passes $2, the value to the right of the
$| in the workspace, to the
compute database map. A comparison is made to see
if 10000 is less than that value, which is the total
number of bytes in all the headers. If it is, this rule
will return TRUE. Otherwise, it will return
FALSE.
The fourth rule checks to see if the comparison was true. If it was
(if 10000 is less than the number of bytes; that is, if the
number of bytes is greater than 10000), the message is rejected.
Note that this example could wrongly reject inbound mail. A better
design would include a test to be sure the message originated from
the local network.
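The four rules can be traced with a short Python model that takes the same workspace the rule set is given, the header count and the byte total separated by $|. This is an illustrative sketch, not sendmail code: the function names are hypothetical, and the workspace is handled as a plain string rather than as tokens.

```python
MAX_HEADERS = 25
MAX_HEADER_BYTES = 10000


def less_than(a: int, b: int) -> str:
    """Model of the l comparison: TRUE when a is less than b."""
    return "TRUE" if a < b else "FALSE"


def check_eoh(workspace: str) -> str:
    left, right = workspace.split("$|")
    num_headers, total_bytes = int(left), int(right)

    # Rules 1 and 2: compare the header count, reject on TRUE.
    if less_than(MAX_HEADERS, num_headers) == "TRUE":
        return '$#error $@ 5.7.0 $: "553 Too many headers"'

    # Rules 3 and 4: compare the byte total, reject on TRUE.
    result = less_than(MAX_HEADER_BYTES, total_bytes)
    if result == "TRUE":
        return '$#error $@ 5.7.0 $: "553 Too many header bytes"'
    return result   # FALSE is left in the workspace; the message is allowed


for ws in ("25 $| 10000", "26 $| 512", "10 $| 10001"):
    print(ws, "->", check_eoh(ws))
```

Both limits are exclusive in this model: exactly 25 headers and exactly 10000 bytes are still accepted.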
25.5.3.1 Check for missing headers
The check_eoh rule set can also be used to detect
missing headers. Although the Message-Id: header is not
mandatory, its absence often indicates that a message is
spam. The following
extract from an mc file shows one way to detect
a missing header, and to reject a message based on that absence:
LOCAL_CONFIG
Kstorage macro
HMessage-Id: $>ScreenMessageId
LOCAL_RULESETS
SScreenMessageId
R $* $: $(storage {GotMessageId} $@ YES $) $1
Scheck_eoh
R $* $: < $&{GotMessageId} >
R $* $: $(storage {GotMessageId} $) $1
R < YES > $@ OK
R < > $#error $@ 5.7.0 $: 553 Missing Header
The LOCAL_CONFIG part of this mc file contains
two configuration commands. The first declares a
macro-type database map (macro) which is used to store a value into a
sendmail macro via a rule set. The second
configuration command causes the Message-Id:
header to be screened by the ScreenMessageId rule
set.
The LOCAL_RULESETS part of this mc file declares
two rule sets. The ScreenMessageId rule set has a
single rule which simply stores the literal value YES into the
${GotMessageId} macro. This means that the
Message-Id: header was found.
The check_eoh rule set, which contains four rules,
is called after all headers have been processed. The first rule
fetches the current value (the $& prefix)
found in the {GotMessageId} macro and places that
value (surrounded by angle brackets) into the workspace. If the
{GotMessageId} macro lacks a value (if no
Message-Id: header was found), the workspace will
contain angle brackets with nothing between them.
The second rule clears the value from the
${GotMessageId} macro so that it can be reused for
the next message that is processed by sendmail.
The third rule looks for a literal <YES> in
the workspace, which would appear if the
Message-Id: header had been found, and causes the
message to be accepted by returning a $@OK on the
RHS.
The last rule looks for nothing between the angle brackets, which means
there was no Message-Id: header in the message.
The $#error causes the message to be rejected with
the error line 553 5.7.0 Missing Header.
You probably should not use these rules as is because email that
originates internally might not have a Message-Id:
header and you will need to allow for such mail.
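The interplay between the two rule sets, one setting a flag while headers are screened and the other testing and then clearing it, can be modeled in Python as follows. This is a sketch under assumptions: the function names and the dictionary standing in for sendmail's macro storage are hypothetical, and header names are compared case-insensitively for illustration.

```python
def screen_message_id(macros: dict, value: str) -> str:
    """Model of ScreenMessageId: remember that the header was seen."""
    macros["GotMessageId"] = "YES"
    return value


def check_eoh(macros: dict) -> str:
    workspace = "<" + macros.get("GotMessageId", "") + ">"   # rule 1
    macros.pop("GotMessageId", None)    # rule 2: clear for the next message
    if workspace == "<YES>":            # rule 3: header was present
        return "OK"
    return "$#error $@ 5.7.0 $: 553 Missing Header"          # rule 4


def process_message(header_names, macros: dict) -> str:
    """Screen each header in turn, then run the end-of-headers check."""
    for name in header_names:
        if name.lower() == "message-id":
            screen_message_id(macros, "")
    return check_eoh(macros)


macros = {}
print(process_message(["From", "Message-Id", "Subject"], macros))
print(process_message(["From", "Subject"], macros))
```

The second message is rejected even though it is processed right after one that had the header, which shows why the flag must be cleared in check_eoh.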