18.7 The RHS
The purpose of the RHS in a rule is to rewrite the workspace. To make
this rewriting more versatile, sendmail offers
several special RHS operators. The complete list is shown in Table 18-2.
Table 18-2. RHS operators
Operator | Section | Description
$digit | Section 18.7.1 | Copy by position
$: | Section 18.7.2 | Rewrite once (when used as a prefix), or specify the user in a delivery-agent "triple," or specify the default value to return on a failed database-map lookup
$@ | Section 18.7.3 | Rewrite and return (when used as a prefix), or specify the host in a delivery-agent "triple," or specify an argument to pass in a database-map lookup or action
$>set | Section 18.7.4 | Rewrite through another rule set (such as a subroutine call that returns to the current position)
$# | Section 18.7.5 | Specify a delivery agent or choose an action, such as to reject or discard a recipient, sender, connection, or message
$[ $] | Section 18.7.6 | Canonicalize hostname
$( $) | Section 23.4 | Perform a lookup in an external database, file, or network service, or perform a change (such as dequoting), or store a value into a macro
$& | Section 21.5.3 | Delay conversion of a macro until runtime
18.7.1 Copy by Position: $digit
The
$digit operator in the
RHS is used to copy tokens matched by the LHS into the workspace. The
digit refers to the position of a wildcard
operator in the LHS:
R $+ @ $*    $2!$1
  $1   $2
Here, the $1 in the RHS indicates tokens matched
by the first wildcard operator in the LHS (in this case, the
$+), and the $2 in the RHS
indicates tokens matched by the second wildcard operator in the LHS
(the $*). In this example, if the workspace
contains A@B.C, it will be rewritten by the RHS as
follows (note that the order is defined by the RHS):
$* matches B.C so $2 copies it to workspace
! explicitly added to the workspace
$+ matches A so $1 adds it to workspace
The $digit copies all
the tokens matched by its corresponding wildcard operator. For the
$+ wildcard operator, only a single token
(A) is matched and copied with
$1. The ! is copied as is. For
the $* wildcard operator, three tokens are matched
(B.C), so $2 copies all three.
Thus, this rule rewrites A@B.C into
B.C!A.
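The copy-by-position behavior described above can be modeled in a few lines of Python. This is only a sketch and not how sendmail is implemented: the function names (tokenize, match, rewrite), the small set of separator characters, and the shortest-match-first strategy are assumptions made for illustration, and only the $* and $+ wildcards are handled.

```python
import re

def tokenize(text):
    """Split an address into tokens; '.', '@', '!', '<' and '>' separate
    tokens and are tokens themselves (a simplified separator set)."""
    return re.findall(r"[.@!<>]|[^.@!<>\s]+", text)

def match(pattern, tokens):
    """Return one list of matched tokens per wildcard, or None on failure.

    Only $* (zero or more tokens) and $+ (one or more tokens) are modeled;
    any other pattern token must equal the workspace token."""
    if not pattern:
        return [] if not tokens else None
    head, rest = pattern[0], pattern[1:]
    if head in ("$*", "$+"):
        start = 0 if head == "$*" else 1
        # Try the shortest possible match first, then extend it.
        for n in range(start, len(tokens) + 1):
            tail = match(rest, tokens[n:])
            if tail is not None:
                return [tokens[:n]] + tail
        return None
    if tokens and tokens[0].lower() == head.lower():
        return match(rest, tokens[1:])
    return None

def rewrite(rhs, groups):
    """Build the new workspace: $digit copies a whole group, the rest is literal."""
    out = []
    for tok in rhs:
        if re.fullmatch(r"\$[1-9]", tok):
            out.extend(groups[int(tok[1]) - 1])
        else:
            out.append(tok)
    return out

workspace = tokenize("A@B.C")
groups = match(["$+", "@", "$*"], workspace)
print(groups)                                        # [['A'], ['B', '.', 'C']]
print("".join(rewrite(["$2", "!", "$1"], groups)))   # B.C!A
```

Note that $2 copies all three tokens matched by the $*, which is why the result is B.C!A rather than B!A.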
Not all LHS operators need to be referenced with
a $digit in the RHS.
Consider the following:
R $* < $* > $*    <$2>
Here, only the middle LHS operator (the second one) is required to
rewrite the workspace. So only the $2 is needed in
the RHS ($1 and $3 are not
needed and are not present in the RHS).
Although macros appear to be operators in the LHS, they are not.
Recall that macros are expanded when the configuration file is read
(Section 18.2.1). As a consequence, although they
appear as $letter in
the configuration file, they are converted to tokens when that
configuration file is read. For example:
DAxxx
R $A @ $*    $1
Here, the macro A is defined to have the value
xxx. To the unwary, the $1
appears to indicate the $A.
But when the configuration file is read, the previous rule is
expanded into:
R xxx @ $*    $1
Clearly, the $1 refers to the
$* (because $digit references only operators,
and $A is a macro, not an operator). The
sendmail program is unable to detect a misreading of
this sort, because the rule is still valid. If the $1 were instead
$2 (in a mistaken attempt to reference the
$*), sendmail prints the
following error and skips that rule:
ruleset replacement number out of bounds
V8 sendmail catches these errors when the
configuration file is read. Earlier versions caught this error only
when the rule was actually used.
The digit of the
$digit must be in the
range one through nine. A $0 is meaningless and
causes sendmail to print the previous error
message and to skip that rule. Extra digits are considered tokens
rather than extensions of the
$digit. That is,
$11 is the RHS operator $1 and
the token 1, not a reference to the
11th LHS operator.
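The single-digit rule and the bounds check can be sketched as follows. The function name split_rhs_refs and the wording of the exception are hypothetical and do not reproduce sendmail's parser or its error text; the sketch handles only $digit and treats every other character as literal text.

```python
def split_rhs_refs(rhs, wildcard_count):
    """Split an RHS string into ('ref', n) and ('text', char) pieces."""
    pieces, i = [], 0
    while i < len(rhs):
        if rhs[i] == "$" and i + 1 < len(rhs) and rhs[i + 1] in "0123456789":
            n = int(rhs[i + 1])        # exactly one digit belongs to the operator
            if not 1 <= n <= wildcard_count:
                # Covers $0 as well as references past the last LHS wildcard.
                raise ValueError(f"${n} is out of bounds (toy message)")
            pieces.append(("ref", n))
            i += 2
        else:
            pieces.append(("text", rhs[i]))
            i += 1
    return pieces

print(split_rhs_refs("$11", 1))   # [('ref', 1), ('text', '1')]
```

With one wildcard in the LHS, both split_rhs_refs("$2", 1) and split_rhs_refs("$0", 1) raise ValueError in this model, mirroring the cases in which the rule is skipped.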
18.7.2 Rewrite Once Prefix: $:
Ordinarily, the RHS rewrites the
workspace as long as the workspace continues to match the LHS. This
looping behavior can be useful. Consider the need to strip extra
trailing dots off an address in the workspace:
R $* ..    $1.
Here, the $* matches any address that has two or
more trailing dots. The $1. in the RHS then strips
one of those two trailing dots when rewriting the workspace. For
example:
xxx . . . . .   becomes   xxx . . . .
xxx . . . .     becomes   xxx . . .
xxx . . .       becomes   xxx . .
xxx . .         becomes   xxx .
xxx .           match fails
Although this looping behavior of rules can be handy, for most rules
it can be dangerous. Consider the following example:
R $*    <$1>
The intention of this rule is to cause whatever is in the workspace
to become surrounded with angle brackets. But after the workspace is
rewritten, the LHS again checks for a match; and because the
$* matches anything, the match succeeds, the RHS
rewrites the workspace again, and again the LHS checks for a match:
xxx becomes < xxx >
< xxx > becomes < < xxx > >
< < xxx > > becomes < < < xxx > > >
and so on, until ...
sendmail prints: rewrite: expansion too long
In this case, sendmail catches the problem
because the workspace has become too large. It prints the preceding
error message and skips that and all further rules in the rule set.
If you are running sendmail in test mode, this
fatal error would also be printed:
== Ruleset 0 (0) status 65
Unfortunately, not all such endless looping produces a visible error
message. Consider the following example:
R $*    $1
Here is an LHS that matches anything and an RHS that rewrites the
workspace in such a way that the workspace never changes. For older
versions this causes sendmail to appear to hang
(as it processes the same rule over and over and over). Newer
versions of sendmail will catch such endless
looping and will print and log the following error:
Infinite loop in ruleset ruleset_name, rule rule_number
In this instance the original workspace is returned.
It is not always desirable (or even possible) to write
"loop-proof" rules. To prevent
looping, sendmail offers the
$: RHS prefix. By starting the RHS of a rule with
the $: operator, you are telling
sendmail to rewrite the workspace only once, at
most:
R $*    $: <$1>
Again the rule causes the contents of the workspace to be surrounded
by a pair of angle brackets. But here the $:
prefix prevents the LHS from checking for another match after the
rewrite.
Note that the $: prefix must begin the RHS to have
any effect. If it instead appears inside the RHS, its special meaning
is lost:
foo rewritten by $: $1 becomes foo
foo rewritten by $1 $: becomes foo $:
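The looping behavior and the effect of the $: prefix can be sketched in Python. This is a toy model under stated assumptions: each rule is a hand-written function rather than a parsed LHS and RHS, the names (strip_dot, add_brackets, apply_rule) and the 50-token limit are invented for the example, and the two failure cases are reported as exceptions, whereas sendmail prints an error and carries on as described above.

```python
def strip_dot(tokens):
    """Toy model of the rule  $* ..  ->  $1.  (returns None when the LHS fails)."""
    if tokens[-2:] == [".", "."]:
        return tokens[:-1]
    return None

def add_brackets(tokens):
    """Toy model of the rule  $*  ->  <$1>  (the LHS always matches)."""
    return ["<"] + tokens + [">"]

def apply_rule(rule, tokens, once=False, max_tokens=50):
    """Re-apply one rule while its LHS matches; once=True models a $: prefix."""
    while True:
        rewritten = rule(tokens)
        if rewritten is None:          # the LHS no longer matches
            return tokens
        if once:                       # $: prefix: never re-check the LHS
            return rewritten
        if len(rewritten) > max_tokens:
            raise OverflowError("workspace grew past the toy limit")
        if rewritten == tokens:
            raise RuntimeError("rule matches but never changes the workspace")
        tokens = rewritten

print(apply_rule(strip_dot, ["xxx", ".", ".", ".", ".", "."]))   # ['xxx', '.']
print(apply_rule(add_brackets, ["xxx"], once=True))              # ['<', 'xxx', '>']
```

Calling apply_rule(add_brackets, ["xxx"]) without once=True raises OverflowError in this model, which corresponds to the runaway angle-bracket example.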
18.7.3 Rewrite-and-Return Prefix: $@
The flow of rules
is such that each and every rule in a series of rules (a rule set) is
given a chance to match the workspace:
R xxx    yyy
R yyy    zzz
The first rule matches xxx in the workspace and
rewrites the workspace to contain yyy. The first
rule then tries to match the workspace again but, of course, fails.
The second rule then tries to match the workspace. Because the
workspace contains yyy, a match is found, and the
RHS rewrites the workspace to be zzz.
There will often be times when one rule in a series performs the
appropriate rewrite and no subsequent rules need to be called. In the
earlier example, suppose xxx should only become
yyy and that the second rule should not be called.
To solve problems such as this, sendmail offers
the $@ prefix for use in the RHS.
The $@ prefix tells sendmail
that the current rule is the last one that should be used in the
current rule set. If the LHS of the current rule matches, any rules
that follow (in the current rule set) are ignored:
R xxx    $@ yyy
R yyy    zzz
If the workspace contains anything other than xxx,
the first rule does not match, and the second rule is called. But if
the workspace contains xxx, the first rule matches
and rewrites the workspace. The $@ prefix for the
RHS of that rule prevents the second rule (and any subsequent rules
in that rule set) from being called.
Note that the $@ prefix also prevents looping: it
tells sendmail to rewrite only once and to skip
all further rules. Both $@ and $:
rewrite only once. The difference is that $@
does not proceed to the next
rule, whereas $: does.
The $@ operator must be used as a prefix because
it has special meaning only when it begins the RHS of a rule. If it
appears anywhere else inside the RHS it loses its special meaning:
foo rewritten by $@ $1 becomes foo
foo rewritten by $1 $@ becomes foo $@
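The difference between no prefix, $:, and $@ can be sketched as a tiny rule-set interpreter. It is a deliberately simplified model: the LHS and RHS are literal strings compared for equality, the function name run_ruleset is hypothetical, and there is no protection against a rule that rewrites the workspace to itself.

```python
def run_ruleset(rules, workspace):
    """Run (lhs, prefix, rhs) rules in order; prefix is '', '$:' or '$@'."""
    for lhs, prefix, rhs in rules:
        while workspace == lhs:        # a rule repeats while its LHS matches
            workspace = rhs
            if prefix == "$@":
                return workspace       # rewrite once, then leave the rule set
            if prefix == "$:":
                break                  # rewrite once, then go on to the next rule
    return workspace

plain = [("xxx", "", "yyy"), ("yyy", "", "zzz")]
ret   = [("xxx", "$@", "yyy"), ("yyy", "", "zzz")]
once  = [("xxx", "$:", "yyy"), ("yyy", "", "zzz")]

print(run_ruleset(plain, "xxx"))   # zzz
print(run_ruleset(ret, "xxx"))     # yyy
print(run_ruleset(once, "xxx"))    # zzz
```

In the $: case the first rule is applied only once, but the second rule still sees yyy and rewrites it, which is the behavior that $@ suppresses.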
18.7.4 Rewrite Through Another Rule Set: $>set
Rules are organized in sets that can be thought of as subroutines.
Occasionally, a series of rules can be common to two or more rule
sets. To make the configuration file more compact and somewhat
clearer, such common series of rules can be made into separate
subroutines.
The RHS $>set
operator tells sendmail to perform additional
rewriting using a secondary set of rules. The
set is the rule-set name or number of that
secondary set. If set is the name or
number of a nonexistent rule set, the effect is the same as if the
subroutine rules were never called (the workspace is unchanged).
If the set is numeric and is greater than
the maximum number of allowable rule sets,
sendmail prints the following error and skips
that rule:
bad ruleset bad_number (maximum max)
If the set is a name and the rule-set name
is undeclared, sendmail prints the following
error and skips that rule:
Unknown ruleset bad_name
Neither of these errors is caught when the configuration file is
read. They are caught only when mail is sent because a rule set name
can be a macro:
$> $&{SET}
The
$& prefix prevents the macro named
{SET} from being expanded when the configuration
file is read. Therefore, the name or number of the rule set cannot be
known until mail is sent.
The process of calling another set of rules proceeds in five stages:
- First: As usual, if the LHS matches the workspace, the RHS gets to rewrite the workspace.
- Second: The RHS ignores the $>set part and rewrites the rest as usual.
- Third: The part of the rewritten workspace following the $>set is then given to the set of rules specified by set. They either rewrite the workspace or do not.
- Fourth: The portion of the original RHS from the $>set to the end is replaced with the subroutine's rewriting, as though it had performed the subroutine's rewriting itself.
- Fifth: The LHS gets a crack at the new workspace as usual unless it is prevented by a $: or $@ prefix in the RHS.
For example, consider the following two sets of rules:
# first set
S21
R $*..    $:$>22 $1.    strip extra trailing dots
...etc.
# second set
S22
R $*..    $1.    strip trailing dots
Here, the first set of rules contains, among other things, a single
rule that removes extra dots from the end of an address. But because
other rule sets might also need extra dots stripped, a subroutine
(the second set of rules) is created to perform that task.
Note that the first rule strips one trailing dot from the workspace
and then calls rule set 22 (the
$>22), which then
strips any additional dots. The workspace, as rewritten by rule set
22, becomes the workspace yielded by the RHS in the first rule. The
$: prevents the LHS of the first rule from looking
for a match a second time.
Prior to V8.8 sendmail, the subroutine call had to
begin the RHS (immediately following any $@ or
$: prefix), and only a single subroutine
could be called. That is, the following causes rule set 22 to be called
but does not call 23:
$>22 xxx $>23 yyy
Instead of calling rule set 23, the $> operator
and the 23 are copied as is into the workspace,
and that workspace is passed to rule set 22:
xxx $> 23 yyy    passed to rule set 22
Beginning with V8.8
sendmail, subroutine calls can appear anywhere
inside the RHS, and there can be multiple subroutine calls. Consider
the same RHS as shown earlier:
$>22 xxx $>23 yyy
Beginning with V8.8 sendmail, rule set 23 is
called first and is given the workspace yyy to
rewrite. The workspace, as rewritten by rule set 23, is added to the
end of the xxx, and the combined result is passed
to rule set 22.
Under V8.8 sendmail, subroutine rule-set calls
are performed from right to left. The result (rewritten workspace) of
each call is appended to the RHS text to the left.
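The right-to-left evaluation order can be sketched as follows. The representation is an assumption made for the example: the RHS has already been expanded into a list of tokens, each $>set call appears as a ("call", number) tuple, and rule sets 22 and 23 are stand-in Python functions (one uppercases, one adds angle brackets) so that the order of calls is visible.

```python
def expand_calls(rhs_tokens, rulesets):
    """Evaluate every ('call', n) marker from right to left."""
    tokens = list(rhs_tokens)
    # Walk backward so the rightmost call is evaluated first.
    for i in range(len(tokens) - 1, -1, -1):
        tok = tokens[i]
        if isinstance(tok, tuple) and tok[0] == "call":
            argument = tokens[i + 1:]                   # everything to the right
            tokens[i:] = rulesets[tok[1]](argument)     # replace call + argument
    return tokens

order = []

def rule_set_22(tokens):
    order.append(22)
    return [t.upper() for t in tokens]

def rule_set_23(tokens):
    order.append(23)
    return ["<"] + tokens + [">"]

# Models the RHS:  $>22 xxx $>23 yyy
rhs = [("call", 22), "xxx", ("call", 23), "yyy"]
result = expand_calls(rhs, {22: rule_set_22, 23: rule_set_23})
print(order)    # [23, 22]
print(result)   # ['XXX', '<', 'YYY', '>']
```

Rule set 23 sees only yyy, while rule set 22 sees xxx followed by whatever rule set 23 returned.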
You should beware of one problem with all versions of
sendmail. When ordinary text immediately follows
the number of the rule set, that text is likely to be ignored. This
can be witnessed by using the -d21.3 debugging
switch.
Consider the following RHS:
$>3uucp.$1
Because sendmail parses the 3
and the uucp as a single token, the subroutine
call succeeds, but the uucp is lost. The
-d21.3 switch illustrates this problem:
-----callsubr 3uucp (3)    sees this
-----callsubr 3 (3)        but should have seen this
The 3uucp is interpreted as the number 3, so it is
accepted as a valid number despite the fact that
uucp was attached. Because the
uucp is a part of the number, it is not available
for comparison to the workspace and so is lost. The correct way to
write the previous RHS is:
$>3 uucp.$1
Note that the space between the 3 and the
uucp causes them to be viewed as two separate
tokens.
This problem can also arise with macros. Consider the following:
$>3$M
Here, the $M is expanded when the configuration
file is parsed. If the expanded value lacks a leading space, that
value (or the first token in it) is lost.
Note that operators that follow a rule-set number are correctly
recognized:
$>3$[$1$]
Here, the 3 is immediately followed by the
$[ operator. Because operators are token
separators, the call to rule set 3 will be correctly interpreted as:
-----callsubr 3 (3) good
But as a general rule, and just to be safe, the number of a
subroutine call should always be followed by a space.
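The token-boundary problem can be illustrated with a toy tokenizer. This is a model of the symptom only, not of sendmail's parser: the regular expression, the function names, and the choice to read the leading digits of the token (in the manner of C's atoi) are assumptions, and only numeric rule sets at the very start of the RHS are handled.

```python
import re

def tokenize_rhs(rhs):
    """Split into operators ($ plus one character), dots, and other runs."""
    return re.findall(r"\$.|\.|[^$.\s]+", rhs)

def call_target(rhs):
    """Return (rule-set number, tokens passed to it) for an RHS starting with $>."""
    tokens = tokenize_rhs(rhs)
    if not tokens or tokens[0] != "$>":
        raise ValueError("RHS does not start with $>")
    # Only the leading digits count; anything glued onto them is dropped.
    number = int(re.match(r"\d+", tokens[1]).group())
    return number, tokens[2:]

print(call_target("$>3uucp.$1"))    # (3, ['.', '$1'])
print(call_target("$>3 uucp.$1"))   # (3, ['uucp', '.', '$1'])
print(call_target("$>3$[$1$]"))     # (3, ['$[', '$1', '$]'])
```

In the first call the uucp never reaches the subroutine, whereas the space in the second call and the operator in the third both end the number cleanly.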
18.7.5 Return a Selection: $#
The
$# operator in the RHS is copied as is into the
workspace and functions as a flag advising
sendmail that an action has been selected. The
$# must be the first token copied into the
rewritten workspace for it to have this special meaning. If it
occupies any other position in the workspace, it loses its special
meaning:
$# local       selects delivery agent in the parse rule set 0
$# OK          accepts a message in the Local_check_mail rule set
xxx $# local   no special meaning
When it is used in the parse rule set 0 (Section 19.5) and localaddr rule set 5
(Section 19.6) (and occupies the first position in
the rewritten workspace), the $# operator tells
sendmail that the second token in the workspace
is the name of a delivery agent (here, local).
When used in the check_ rule sets (Section 7.3 and Section 7.1), subsequent
tokens in the workspace (here, OK) say how a
message should be handled.
Note that the $# operator can be prefixed with a
$@ or a $: without losing its
special meaning because those prefix operators are not copied to the
workspace:
$@ $# local    rewritten as    $# local
However, those prefix operators are not necessary because the
$# acts just like a $@ prefix.
It prevents the LHS from attempting to match again after the RHS
rewrite, and it causes any following rules (in that rule set) to be
skipped. When used in non-prefix roles in the
parse rule set 0 and localaddr
rule set 5, $@ and $: also act
like flags, conveying host and address information to
sendmail (Section 19.5).
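How a rewritten workspace might be read once $# is in first position can be sketched as follows. The function name parse_selection, the dictionary keys, and the sample addresses are hypothetical; the sketch only reflects the roles named in Table 18-2 ($@ introduces the host and $: the user of a delivery-agent triple) and is not sendmail's own parsing code.

```python
def parse_selection(tokens):
    """Split '$# selection $@ host $: user' into its parts.

    Returns None unless $# is the first token of the workspace."""
    if not tokens or tokens[0] != "$#":
        return None
    fields = {"selection": [], "host": [], "user": []}
    current = "selection"
    for tok in tokens[1:]:
        if tok == "$@":
            current = "host"
        elif tok == "$:":
            current = "user"
        else:
            fields[current].append(tok)
    return {name: "".join(parts) for name, parts in fields.items()}

print(parse_selection(["$#", "local", "$:", "bob"]))
# {'selection': 'local', 'host': '', 'user': 'bob'}
print(parse_selection(["xxx", "$#", "local"]))
# None
```

The second call returns None because the $# is not the first token, so it has no special meaning.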
18.7.6 Canonicalize Hostname: $[ and $]
Tokens that appear between a
$[ and $] pair of operators in
the RHS are considered to be the name of a host. That hostname is
looked up by using DNS and replaced with the
full canonical form of that name. If found, it is then copied to the
workspace, and the $[ and $]
are discarded.
For example, consider a rule that looks for a hostname in angle
brackets and (if found) rewrites it in canonical form:
R < $* >    $@ < $[ $1 $] >    canonicalize hostname
Such canonicalization is useful at sites where users frequently send
mail to machines using the short version of a
machine's name. The $[ tells
sendmail to view all the tokens that follow (up
to the $]) as a single hostname.
If the name cannot be canonicalized (perhaps because there is no such
host), the name is copied as is into the workspace. For configuration
file versions lower than 2, no indication is given that it could not be
canonicalized (more about this soon).
Note that if the $[ is omitted and the
$] is included, the $] loses
its special meaning and is copied as is into the workspace.
The hostname between the $[ and
$] can also be an IP address. By surrounding the
hostname with square brackets ([ and
]), you are telling sendmail
that it is really an IP address:
wash.dc.gov                    a hostname
[123.45.67.8]                  an IPv4 address
[IPv6:2002:c0a8:51d2::23f4]    an IPv6 address
When the IP address between the square brackets corresponds to a
known host, the address and the square brackets are replaced with
that host's canonical name. Note that when handling
IPv6 addresses, the IPv6: prefix must be present.
After the successful lookup of a known host, the entire expression
between $[ and $] will be
replaced with the new information.
If the version of the configuration file is 2 or
greater (as set with the V configuration command,
Section 17.5), a successful canonicalization has a
dot appended to the result:
myhost    becomes    myhost . domain .    success
nohost    becomes    nohost               failure
Note that a trailing dot is not legal in an address specification, so
subsequent rules (such as rule set 4) must
remove these added trailing dots.
Also, the K configuration command (Section 23.2) can be used to redefine (or eliminate) the
dot as the added character. For example:
Khost host -a.found
This causes sendmail to add the text
.found to a successfully canonicalized hostname
instead of the dot.
One difference between V8 sendmail and other
versions is the way it looks up names from between the
$[ and $] operators. The rules
for V8 sendmail are as follows:
- First: If the name contains at least one dot (.) anywhere within it, it is looked up as is; for example, host.com.
- Second: If that fails, it appends the default domain to the name (as defined in /etc/resolv.conf) and tries to look up the result; for example, host.com.foo.edu.
- Third: If that fails, each entry in the domain search path (as defined in /etc/resolv.conf) is appended to the original host; for example, host.com.edu.
- Fourth: If the original name did not have a dot in it, it is looked up as is; for example, host.
This approach allows names such as host.com to
first match an actual site, such as sendmail.com
(if that was intended), instead of wrongly matching a host in a local
department of your school. This is particularly important if you have
wildcard MX records for your site.
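The four lookup steps listed above can be written out as a function that produces the candidate names in the order they would be tried. The function name lookup_candidates and its parameters are hypothetical, and the default domain (foo.edu) and search path (edu) are taken from the examples in the list; the sketch only generates names and performs no DNS queries.

```python
def lookup_candidates(name, default_domain, search_path):
    """Return the names to try, in order, following the four steps."""
    has_dot = "." in name
    candidates = []
    if has_dot:                                    # first: dotted name as is
        candidates.append(name)
    candidates.append(f"{name}.{default_domain}")  # second: default domain
    for domain in search_path:                     # third: each search-path entry
        candidates.append(f"{name}.{domain}")
    if not has_dot:                                # fourth: undotted name as is
        candidates.append(name)
    return candidates

print(lookup_candidates("host.com", "foo.edu", ["edu"]))
# ['host.com', 'host.com.foo.edu', 'host.com.edu']
print(lookup_candidates("host", "foo.edu", ["edu"]))
# ['host.foo.edu', 'host.edu', 'host']
```

A dotted name is tried unmodified before anything is appended, whereas an undotted name is tried unmodified only as a last resort.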
18.7.6.1 An example of canonicalization
The following three-line configuration file can be used to observe
how sendmail canonicalizes hostnames:
V10
SCanon
R $*    $@ $[ $1 $]
If this file were called test.cf,
sendmail could be run in rule-testing mode with
a command such as the following:
% /usr/sbin/sendmail -Ctest.cf -bt
Thereafter, hostname canonicalization can be observed by specifying
the Canon rule set and a hostname. One such run of
tests might appear as follows:
ADDRESS TEST MODE (ruleset 3 NOT automatically invoked)
Enter <ruleset> <address>
> Canon wash
canon input: wash
canon returns: wash . dc . gov .
> Canon nohost
canon input: nohost
canon returns: nohost
>
Note that the known host named wash is rewritten
in canonicalized form (with a dot appended because the version of
this miniconfiguration file, the V10, is greater
than 2). The unknown host named nohost is
unchanged and has no dot appended.
18.7.6.2 Default in canonicalization: $:
IDA
and V8 sendmail both offer an alternative to
leaving the hostname unchanged when canonicalization fails with
$[ and $]. A default can be
used instead of the failed hostname by prefixing that default with a
$: operator:
$[ host $: default $]
The $: default must follow
the host (or square-brace-enclosed
address) and precede the $]. To illustrate its
use, consider the following rule:
R $*    $: $[ $1 $: $1.notfound $]
If the hostname $1 can be canonicalized, the
workspace becomes that canonicalized name. If it cannot, the
workspace becomes the original hostname with a
.notfound appended to it. If the
default part of the
$:default is omitted, a
failed canonicalization is rewritten as zero tokens.
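The three possible outcomes (success, failure without a default, failure with a default) can be sketched as follows. The dictionary known_hosts stands in for DNS, and the names canonify, found_suffix, and _NO_DEFAULT are invented for the example; a version 2 or higher configuration file is assumed, so a dot is appended on success.

```python
_NO_DEFAULT = object()   # marks "no $: part at all", as opposed to an empty default

def canonify(name, known_hosts, default=_NO_DEFAULT, found_suffix="."):
    """Model of  $[ name $: default $]  against a dictionary of known hosts."""
    canonical = known_hosts.get(name)
    if canonical is not None:
        return canonical + found_suffix   # success: canonical name plus suffix
    if default is _NO_DEFAULT:
        return name                       # failure, no $: given: name unchanged
    return default                        # failure with $: the default is used

known_hosts = {"wash": "wash.dc.gov"}

print(canonify("wash", known_hosts))                              # wash.dc.gov.
print(canonify("nohost", known_hosts))                            # nohost
print(canonify("nohost", known_hosts, default="nohost.notfound")) # nohost.notfound
print(repr(canonify("nohost", known_hosts, default="")))          # ''
```

The last call models a $: with nothing after it: the failed lookup yields an empty result rather than the original name.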
Because the $[ and $] operators
are implemented using the host dbtype (Section 23.4.3), you can modify the behavior of that dbtype
by adding a -T to it:
Khost host -T.tmp
Thereafter, whenever $[ and $]
find a temporary lookup failure, the suffix .tmp
is returned, and .notfound, in this example, is
returned only if the host truly does not exist.
18.7.7 Other Operators
Many other operators (depending on your version of
sendmail) can also be used in rules. Because of
their individual complexity, all of the following are detailed in
other chapters. We outline them here, however, for completeness.
- Class macros: Class macros are described in Section 22.2.1 and Section 22.2.2 of Chapter 22. Class macros can appear only in the LHS. They begin with the prefix $= to match a token in the workspace to one of many items in a class. The alternative prefix $~ causes a single token in the workspace to match if it does not appear in the list of items that are in the class.
- Conditionals: The conditional macro operator $? is rarely used in rules (Section 21.6). When it is used in rules, the result is often not what was intended. Its else part, the $| conditional operator, is used by the various rule sets (Section 7.1.4) to separate two differing pieces of information in the workspace.
- Database Maps: The database-map operators, $( and $), are used to look up tokens in various types of database files, plain files, and network services. They also provide access to internal services, such as dequoting or storing a value in a macro (see Chapter 23).