22.2 Access Classes in Rules
Class macros are useful only in the LHS of rules. The
sendmail program offers two ways to use them:
- $=X
  The $= prefix causes sendmail
  to seek a match between the workspace and one of the words in a class
  list.
- $~X
  The $~ prefix causes sendmail
  to accept only a single token in the workspace that does not match
  any of the words in a class list.
22.2.1 Matching Any in a Class: $=
The list of words that forms a class is
searched by prefixing the class name with the characters
$=:
R$=X $@<$1>
In this rule, the expression $=X causes
sendmail to search a class for the word that is
in the current workspace. If sendmail finds that
the word has been defined, and if it finds that the word is
associated with the class $=X, only then is a
match made.
The matching word is made available for use in the RHS rewriting.
Because the value of $=X is not known ahead of
time, the matched word can be referenced in the RHS with the
$digit positional
operator.
Consider the following example. Two classes have been declared
elsewhere in the configuration file. The first,
$=w, contains all the possible names for the local
host:
Cw localhost mailhost server1 server2
The second, $=D, contains the domain names of the
two different networks on which this host sits:
CD internal.domain external.domain
If the object of a rule is to match any variation on the local
hostname at either of the domains and to rewrite the result as the
official hostname at the appropriate domain, the following rule can
be used:
R $=w . $=D $@ $w . $2 make any variations "official"
If the workspace contains the tokenized address
server1.external.domain,
sendmail first checks to see whether the word
server1 has been defined as part of the class
w. If it has, the dot in the rule and workspace
match each other, and then sendmail looks up
external.domain.
If both the host part and the domain part are found to be members of
their respective classes, the RHS of the rule is called to rewrite
the workspace. The $2 in the RHS corresponds
to the $=D in the LHS. The $=D
matches the external.domain from the workspace,
so that text is used to rewrite the new workspace.
Note that prior to V8, when sendmail looked up
the workspace to check for a match to a class, it looked up only a
single token. V8 sendmail allows multitoken
class matching.
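
The lookup described in this section can be approximated outside of sendmail. The following Python sketch is a toy model, not sendmail's actual code: it assumes a simplified tokenizer that splits only on dots, it assumes a hypothetical official hostname of server1 for $w, and the function names are invented for illustration. It models the rule R $=w . $=D $@ $w . $2, including V8-style multitoken class matching:

```python
# Toy model of: R $=w . $=D   $@ $w . $2
# Assumptions (not from sendmail itself): dot-only tokenizing, and a
# hypothetical official hostname standing in for the $w macro.

CLASS_W = {"localhost", "mailhost", "server1", "server2"}   # Cw ...
CLASS_D = {"internal.domain", "external.domain"}            # CD ...
OFFICIAL_NAME = "server1"                                   # hypothetical $w


def tokenize(address):
    """Split an address on dots, keeping each dot as its own token."""
    tokens = []
    for i, part in enumerate(address.split(".")):
        if i:
            tokens.append(".")
        if part:
            tokens.append(part)
    return tokens


def class_match_lengths(tokens, start, members):
    """Return each token count n for which tokens[start:start+n] is a class word."""
    return [n for n in range(1, len(tokens) - start + 1)
            if "".join(tokens[start:start + n]).lower() in members]


def rewrite(address):
    """Return the rewritten address, or None if the LHS does not match."""
    tokens = tokenize(address)
    for n1 in class_match_lengths(tokens, 0, CLASS_W):       # $=w becomes $1
        dot = n1
        if dot < len(tokens) and tokens[dot] == ".":         # literal dot
            for n2 in class_match_lengths(tokens, dot + 1, CLASS_D):
                if dot + 1 + n2 == len(tokens):              # $=D becomes $2
                    matched_domain = "".join(tokens[dot + 1:])
                    return f"{OFFICIAL_NAME}.{matched_domain}"
    return None


print(rewrite("mailhost.external.domain"))
```

In this model, as in the rule, the text that matched $=D is carried into the result, while the host part is replaced by the official name.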
22.2.2 Matching Any Token Not in a Class: $~
The $~ prefix is used to match any single token in
the workspace that is not in a class. It is used fewer than a dozen
times in a typical production configuration file, but when the need
for its properties arises, it can be very useful.
To illustrate, consider a network with three PC machines on it. The
PC machines cannot receive mail, whereas all the other machines on
the network can. If the list of PC hostnames is defined in the class
{PChosts}:
C{PChosts} pc1 pc2 pc3
a rule can be designed that will match any but a PC hostname:
R $* < @ $~{PChosts} > $@ $1 < @ $2 > filter out the PC hosts
Here the LHS looks for an address of the form:
"user" "<" "@" "not-a-PC" ">"
This matches only if the @ token is
not followed by one of the PC hosts listed in
class $={PChosts}. If the part of the workspace
that is tested against the list provided by $~ is
found in that list, the match fails.
Note that the $digit
positional operator in the RHS (the $2 in the
preceding example) references the part that matches
$~{PChosts}. That is, $2
references the token in the workspace that is not in the class
{PChosts}. If the workspace contains
ben<@philly>, the $2
references the philly.
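
The behavior of this rule can be sketched in Python. This is only an illustrative model under stated assumptions: the workspace is supplied as an already tokenized list, and the function name filter_pc_hosts is hypothetical. It returns the values that $1 and $2 would hold, or None when the LHS fails:

```python
# Toy model of: R $* < @ $~{PChosts} >   $@ $1 < @ $2 >
# The function name and the pre-tokenized input are assumptions made
# for illustration; this is not sendmail's own matching code.

PC_HOSTS = {"pc1", "pc2", "pc3"}   # C{PChosts} pc1 pc2 pc3


def filter_pc_hosts(tokens):
    """Return ($1, $2) if the LHS matches, otherwise None."""
    # The workspace must end with "<", "@", exactly one host token, ">".
    if len(tokens) < 4:
        return None
    if tokens[-4] != "<" or tokens[-3] != "@" or tokens[-1] != ">":
        return None
    host = tokens[-2]
    if host.lower() in PC_HOSTS:    # $~ fails when the token IS in the class
        return None
    user = tokens[:-4]              # $* takes whatever precedes the "<"
    return user, host


print(filter_pc_hosts(["ben", "<", "@", "philly", ">"]))
print(filter_pc_hosts(["ben", "<", "@", "pc2", ">"]))
```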
Also note that multitoken expressions in the workspace will not
match. That is, for multitoken expressions in the workspace,
$~ is not the opposite of
$=. To illustrate, consider this miniconfiguration
file:
V10
CX hostA.com
Stest
R $~X $@ no $1 is not in X
R $=X $@ yes $1 is in X
R $* $@ neither
Now feed a multitokened address through these rules in rule-testing
mode:
% /usr/sbin/sendmail -Cx.cf -bt
ADDRESS TEST MODE (ruleset 3 NOT automatically invoked)
Enter <ruleset> <address>
> test hostC.com
test input: hostC . com
test returns: neither
Here, the rule set returned neither because a
multitoken expression in the workspace should never be used with
$~. That is, $~ looks for a
workspace that is not a member of the class and, indeed,
hostC.com is not. But because
hostC.com is multi-tokened, $~
acts as though it is a member of the class, and so does not call the
RHS of the rule:
R $~X a multi-tokened workspace will never call the RHS
If you consider multitokens and $~ as illegal to
use together, this failure, although convoluted, makes sense.
Another way to think of this failure is by comparing the
$~ operator to the $- operator.
Neither will match more than a single token in the workspace. If the
$~ does not match a single token, the LHS does not
match, and the RHS is not called.
There are two ways to circumvent this problem. One alternative is to
make the $~ always look up only a single token:
R $~X $* $@ no $1 is not in X
Here, the $* will match the
.com. Then $~X will correctly
look up only the single token hostC, and correctly
not find it.
A second alternative is to invert the logic of the test, and use the
$= prefix only when multiple tokens are in the
workspace:
R $=X $@ yes $1 is in X
R $* $@ no $1 is not in X
Here, we first check to see if the multitokened workspace is in the
class $=X, and return yes if it
is. Otherwise, we know it is not in the class.
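
The difference between the original rule set and the inverted one can be modeled in a few lines of Python. This is a sketch under assumptions: the workspace is given as a token list, the function names are hypothetical, and only the return words (no, yes, neither) are modeled:

```python
# Toy model of the miniconfiguration file's rule set and of the
# inverted-logic alternative. Function names are hypothetical.

CLASS_X = {"hosta.com"}   # CX hostA.com (compared in lowercase)


def original_rules(tokens):
    # R $~X   $@ no      matches only a single token that is not in X
    if len(tokens) == 1 and tokens[0].lower() not in CLASS_X:
        return "no"
    # R $=X   $@ yes     may match a multitoken class word
    if "".join(tokens).lower() in CLASS_X:
        return "yes"
    # R $*    $@ neither
    return "neither"


def inverted_rules(tokens):
    # R $=X   $@ yes
    if "".join(tokens).lower() in CLASS_X:
        return "yes"
    # R $*    $@ no      anything else is, by elimination, not in X
    return "no"


workspace = ["hostC", ".", "com"]
print(original_rules(workspace), inverted_rules(workspace))
```

For the workspace hostC.com, the first function falls through to neither, as in the rule-testing session shown earlier, whereas the second returns no.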
22.2.3 Backup and Retry
Multitoken matching operators,
such as $+, always try to match the least that
they can (Section 18.6.2). Such a simple-minded
approach could lead to problems in matching (or not matching) classes
in the LHS. However, the ability of sendmail to
back up and retry alleviates this problem. For example, consider the
following five tokens in the workspace:
"A" "." "B" "." "C"
and consider the following LHS rule:
R $+ . $=X $*
Because the $+ tries to match the minimum, it
first matches only the A in the workspace. The
$=X then tries to match the B.
and then B.C to the class $=X.
If this match fails, sendmail backs up to the
$+ and tries again.
The next time through, the $+ matches
A. in the workspace, but that fails to match the
dot in the rule, so it backs up again and matches
A.B. The $=X tries to match the
C in the workspace. If C is not
in the class $=X, the entire LHS fails.
The ability of the sendmail program to back up
and retry LHS matches eliminates much of the ambiguity from rule
design. The multitoken matching operators try to match the minimum
but match more if necessary for the whole LHS to match.
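
The back-up-and-retry behavior can be illustrated with a small backtracking matcher in Python. This is a simplified sketch, not sendmail's implementation: it supports only $+, $*, $=X, and literal tokens, it assumes class words are stored in lowercase, and the function name match is hypothetical. Each operator tries its shortest candidate first and is asked for a longer one only when the rest of the LHS fails:

```python
# Toy backtracking matcher for an LHS made of $+, $*, $=name, and literals.
# A sketch of the idea only; names and structure are assumptions.

def match(pattern, tokens, classes):
    """Return the captured token groups ($1, $2, ...) or None on failure."""

    def walk(p, t):
        if p == len(pattern):
            # The whole LHS is used up; succeed only if the workspace is too.
            return [] if t == len(tokens) else None
        op = pattern[p]
        if op in ("$+", "$*"):
            shortest = 1 if op == "$+" else 0
            lengths = range(shortest, len(tokens) - t + 1)
        elif op.startswith("$="):
            members = classes[op[2:]]
            lengths = [n for n in range(1, len(tokens) - t + 1)
                       if "".join(tokens[t:t + n]).lower() in members]
        else:
            # A literal token must match exactly one workspace token.
            if t < len(tokens) and tokens[t].lower() == op.lower():
                return walk(p + 1, t + 1)
            return None
        for n in lengths:                 # minimum first
            rest = walk(p + 1, t + n)
            if rest is not None:
                return [tokens[t:t + n]] + rest
            # Otherwise back up and retry with the next longer candidate.
        return None

    return walk(0, 0)


workspace = ["A", ".", "B", ".", "C"]
lhs = ["$+", ".", "$=X", "$*"]
print(match(lhs, workspace, {"X": {"c"}}))
```

With only C in the class, the $+ in this model ends up holding A.B, which mirrors the sequence described above. With no matching word in the class at all, the function returns None, corresponding to the entire LHS failing.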
22.2.4 Class Name Hashing Algorithm
When comparing a
token in the workspace to a list of words in a class array,
sendmail tries to be as efficient as possible.
Instead of comparing the token to each word in the list, one by one,
it simply looks up the token in its internal string
pool. If the token is in the pool and if the pool listing
is marked as belonging to the class being sought, a match is found.
The comparison of tokens to entries in the string pool is
case-insensitive. Each token is converted to lowercase before the
comparison, and all strings in the string pool are stored in
lowercase.
Because strings are stored in the pool as text with a type, the same
string value can be used for different types with no conflict. For
example, the symbolic name of a delivery agent and a word as a class
macro's value can be identical, yet they will still
be separate entries in the string pool.
The sendmail program uses a simple hashing
algorithm to ensure that the token is compared to the fewest possible
strings in the string pool. In normal circumstances that algorithm
performs its job well. At sites with unusually large classes (perhaps
a few thousand hosts in a class of host aliases), it might be
necessary to tune the hashing algorithm. The code is in the file
stab.c with the sendmail
source. The number of hash buckets is set by the constant STABSIZE.
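
The idea of a typed, case-insensitive string pool with hash buckets can be sketched as follows. This Python model is an illustration only and makes several assumptions: the bucket count of 8 is arbitrary, the hash function is a toy that is not the one in stab.c, and the class and method names are hypothetical:

```python
# Toy string pool: entries are keyed by (lowercased name, type), and a
# class-type entry records which classes the word belongs to.
# The hash function and the bucket count below are assumptions.

TOY_STABSIZE = 8   # hypothetical; the real value is set by STABSIZE


class StringPool:
    def __init__(self, size=TOY_STABSIZE):
        self.buckets = [[] for _ in range(size)]

    def _bucket(self, name):
        # Toy hash (sum of character codes), NOT sendmail's algorithm.
        return self.buckets[sum(map(ord, name)) % len(self.buckets)]

    def find(self, name, kind):
        """Look up an entry by name and type; only one bucket is scanned."""
        name = name.lower()
        for entry in self._bucket(name):
            if entry["name"] == name and entry["kind"] == kind:
                return entry
        return None

    def enter(self, name, kind):
        """Find or create the entry for this name and type."""
        entry = self.find(name, kind)
        if entry is None:
            entry = {"name": name.lower(), "kind": kind, "classes": set()}
            self._bucket(entry["name"]).append(entry)
        return entry

    def in_class(self, token, cls):
        """True if the token is in the pool and marked as a member of cls."""
        entry = self.find(token, "class")
        return entry is not None and cls in entry["classes"]


pool = StringPool()
pool.enter("Server1", "class")["classes"].add("w")   # like: Cw Server1
pool.enter("server1", "mailer")                      # same text, other type
print(pool.in_class("SERVER1", "w"))
```

Because the type is part of the key, the class word and the delivery agent name coexist as separate entries even though their text is identical.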
As an alternative to very full classes, sendmail
offers database maps (Section 23.1). No information
is currently available contrasting the efficiency of the various
approaches.