17. The Hub's Complex Rules
In this chapter we look at some of the rules that are needed to make a hub function. Until now we have focused on the client form of the configuration file. Since the role of the client is narrow (to forward all mail to the hub), its configuration file is simple. But a hub can be a very busy machine, receiving and sending mail for many client machines, and because its role is broad, its configuration file is complex. Fundamentally, all configuration files, simple and complex, tend to look pretty much the same. Both begin by selecting delivery agents using rule sets 3 and 0. Both then process recipient or sender addresses with rule sets 3, 1 or 2, R= or S=, then 4, but the hub's rules are more complex.
In this chapter we explore high points of the V8 configuration files. Along the way, we also mix in rules contributed by others to help illustrate difficult concepts.
17.1 Rule Set 3
Recall that all addresses are first processed by rule set 3. Its job is to find an address among other clutter and to normalize all addresses into a form that other rules can recognize.
17.1.1 Find the Address
Recall that addresses can legally assume two forms:
address ( comment )
comment < address >
In the first form, sendmail strips (and saves) the parenthesized comment, then gives the naked address to rule set 3. In the second form, sendmail passes the entire address, angle brackets and all, to rule set 3. The rules to strip the angle brackets look like this: [1]
S3
R$*              $: <$1>      Guarantee at least one <> pair
R$+ <$*>         <$2>         Remove everything before the last <
R<$*> $+         $: <$1>      Remove everything after the first >
R<>              $@ <@>       Null address to @
R<$*>            $: $1        Strip remaining <>
In the following, we discuss each of these rules individually.
17.1.1.1 At least one <> pair
To find the address in addresses of the form
comment < address >
we must use rules to search for the angle brackets. The first rule is:
R$* $: <$1> Guarantee at least one <> pair
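This rule places angle brackets around all addresses, even those that already have them. Note that the $: prefix in the RHS causes the rule to be applied only once; without it, the rule would match its own output and recurse forever.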
A side benefit of this rule is that it also surrounds an empty (null) address with angle brackets. This allows old versions of sendmail to detect null addresses without needing the newer mechanism that was introduced with V8.7 sendmail.
17.1.1.2 Strip to left of <
A common problem is that of finding the address when it is deeply nested in many pairs of angle brackets. Consider an address like this:
<<<<address>>>>
Such addresses are not common but do appear every now and then as a result of overzealous users or MUAs. Another problem address looks like this:
comment <phone> <address>
Here, just noting the outermost pair of angle brackets is not sufficient because the rightmost pair contains the address. The process of finding the rightmost innermost pair of angle brackets requires two rules:
R$+ <$*>         <$2>         Remove everything before the last <
R<$*> $+         $: <$1>      Remove everything after the first >
The first recursively discards everything (including angle brackets) to the left of the rightmost balanced pair of angle brackets. The second discards everything that follows the first > that remains.
The behavior of these two rules may not be obvious. To better understand them, first create a small configuration file (called x.cf) that includes the following two lines: [2]
R$+ <$*>         <$2>
R<$*> $+         $: <$1>
Then run sendmail in rule-testing mode with a command like this:
%
Enter a series of addresses, one at a time, to see how each is handled. Be as extreme as you want when nesting angle brackets:
>
If you want to see, step by step, how each rule works, run sendmail again, this time with debugging output enabled:
>
17.1.1.3 Handle null address
The fourth rule in rule set 3 is designed to convert a null (empty) address into the magic symbol <@>:
R<> $@ <@> Null address to @
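The $@ prefix in the RHS causes rule set 3 to return immediately with <@> as its result, so a null address is never seen by the rules that follow.
17.1.1.4 Remove remaining angle brackets
The last of our five preliminary rules simply removes the angle brackets from whatever remains: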
R<$*> $: $1 Strip remaining <>
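To make the behavior of the five preliminary rules easier to follow, here is a minimal Python sketch that emulates them. It is not how sendmail is implemented. The tokenizer is a simplification (only <, > and @ are treated as separate tokens), and the names tokenize, match, substitute and find_address are hypothetical helpers invented for this illustration. The sketch assumes that $* and $+ try the shortest match first, that a rule is reapplied until it fails unless its RHS starts with $:, and that $@ returns from the rule set at once.

```python
import re

def tokenize(text):
    # Simplified tokenizer (an assumption): <, > and @ are single tokens,
    # and whitespace separates everything else.
    return re.findall(r"[<>@]|[^<>@\s]+", text)

def match(pattern, tokens):
    """Return the tokens bound to each $* / $+ if the pattern matches the
    whole workspace, else None. Shorter matches are tried first."""
    if not pattern:
        return [] if not tokens else None
    head, rest = pattern[0], pattern[1:]
    if head in ("$*", "$+"):
        start = 0 if head == "$*" else 1   # $* may match nothing, $+ may not
        for n in range(start, len(tokens) + 1):
            tail = match(rest, tokens[n:])
            if tail is not None:
                return [tokens[:n]] + tail
        return None
    if tokens and tokens[0] == head:       # literal token must match exactly
        return match(rest, tokens[1:])
    return None

def substitute(rhs, bindings):
    # Replace $1, $2, ... with the tokens bound by the LHS.
    out = []
    for tok in rhs:
        if re.fullmatch(r"\$\d", tok):
            out.extend(bindings[int(tok[1]) - 1])
        else:
            out.append(tok)
    return out

# (LHS, RHS prefix, RHS) for the five rules shown at the start of rule set 3.
RULES = [
    (["$*"],                 "$:", ["<", "$1", ">"]),  # at least one <> pair
    (["$+", "<", "$*", ">"], "",   ["<", "$2", ">"]),  # drop text before last <
    (["<", "$*", ">", "$+"], "$:", ["<", "$1", ">"]),  # drop text after first >
    (["<", ">"],             "$@", ["<", "@", ">"]),   # null address to <@>
    (["<", "$*", ">"],       "$:", ["$1"]),            # strip remaining <>
]

def find_address(address):
    workspace = tokenize(address)
    for lhs, prefix, rhs in RULES:
        while True:
            bound = match(lhs, workspace)
            if bound is None:
                break                       # rule does not match: next rule
            workspace = substitute(rhs, bound)
            if prefix == "$@":
                return "".join(workspace)   # rewrite and leave the rule set
            if prefix == "$:":
                break                       # rewrite once only
    return "".join(workspace)
```

With this sketch, find_address("<<<<address>>>>") and find_address("comment <phone> <address>") both yield address, and find_address("") yields <@>.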
17.1.2 Normalize the Address
The rules that we have just looked at isolate the address from other possible information and leave it in its initial form, not surrounded by angle brackets. The rest of the rules in rule set 3 are designed to highlight the host part of any address. They assume that all addresses are composed of a user and a host part.
17.1.2.1 A rule to handle List:;
RFC822 allows addresses that name a list: a phrase followed by a colon and a semicolon. For example:
Undisclosed Recipients :;
The colon and semicolon are mandatory and may contain one or more addresses between them, which may themselves be lists. [3]
Rule set 3 needs to check for the presence of an empty list (one with no addresses between the colon and semicolon). The following rule does just that and marks the empty list with the magic token <@>:
R$* :; $@ $1 :; <@> Handle empty List:;
17.1.3 Internet Addresses
After lists have been disposed of, domain-type addresses need to be handled. Domain-type addresses are of the form user@host:
R$+ @ $+                  $: $1 <@$2>          Focus on host
R$+ < $+ @ $+ >           $1 $2 <@$3>          move gaze right
R$* < @ $* : $* > $*      $1 <@ $2$3> $4       strip colons
R$+ < @ $+ >              $@ $>96 $1<@$2>      localize and canonicalize
The first rule detects addresses of the form something @ something and rewrites them in such a way that the second something becomes the focused host part.
The second rule handles addresses with multiple @ characters by recursively moving the focus to the rightmost one.
The third rule recursively removes any colons from the resulting host part as a "sanity check." This is necessary because strange forms of route addresses may have bypassed earlier rules.
The fourth rule passes any addresses that have been successfully focused to rule set 96 (which will be discussed in Section 17.2, "Rule Set 96") so that the local host can be detected and the host part canonicalized. The result from rule set 96 is returned.
17.1.4 UUCP Addresses
UUCP addresses contain one or more exclamation points (such as lady!sonya!george). They fall into two categories: those that are delivered locally by uux(8) and those that are forwarded to another host. The rules to handle them look like this:
R$- ! $+          $@ $>96 $2 <@ $1.UUCP>    host!user uucp
R$+ . $- ! $+     $@ $>96 $3 <@ $1.$2>      Domain style uucp
R$+ ! $+          $@ $>96 $2 <@ $1.UUCP>    Bang path uucp
The first rule looks for a single token hostname followed by an exclamation point. A single token host always becomes the next host in line for delivery. The .UUCP suffix added in the RHS allows rule set 0 to recognize this address as one requiring uux(8) delivery.
The second rule looks for a dot in the hostname part of the address. A dot indicates the new-style, domain-based hostname, such as host.domain!user. Such names are assumed to have MX records pointing to service providers and are rewritten into the normal user@host.domain form.
The third rule catches any remaining addresses with exclamation points in them. The host to the left of the leftmost exclamation point is taken as the next hop in the UUCP path for delivery. A .UUCP suffix is added to that host, just as in the first rule.
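The three UUCP rules above can be approximated with a short Python sketch. It is an illustration under stated assumptions, not sendmail's own logic: a single token ($-) is approximated by a run of letters, digits, hyphens and underscores, the call to rule set 96 is left out (the function returns the focused address that would be handed to it), and the name focus_uucp is hypothetical.

```python
import re

# Rough stand-in for one sendmail token ($-); an assumption of this sketch.
TOKEN = r"[A-Za-z0-9_-]+"

def focus_uucp(address):
    """Return the address as the UUCP rules would focus it, before the
    result is passed to rule set 96 (which is not modeled here)."""
    # Rule 1: host!user where host is a single token -> user<@host.UUCP>
    m = re.fullmatch(rf"({TOKEN})!(.+)", address)
    if m:
        return f"{m.group(2)}<@{m.group(1)}.UUCP>"
    # Rule 2: host.domain!user -> user<@host.domain>
    m = re.fullmatch(rf"(.+?)\.({TOKEN})!(.+)", address)
    if m:
        return f"{m.group(3)}<@{m.group(1)}.{m.group(2)}>"
    # Rule 3: any other bang path; the part before the leftmost ! is the host
    m = re.fullmatch(r"(.+?)!(.+)", address)
    if m:
        return f"{m.group(2)}<@{m.group(1)}.UUCP>"
    return address  # no exclamation point: not a UUCP address
```

For example, focus_uucp("lady!sonya!george") gives sonya!george<@lady.UUCP>, and focus_uucp("host.domain!user") gives user<@host.domain>.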
All three rules exit (the leading $@ in the RHS) and return the result of rule set 96.
17.1.5 The % Hack
A common technique in mail debugging is to send mail to one host and have that host deliver it to another. Often, this is done by sending the mail to an address something like:
usr%second@first
Here, the intention is to send mail to first and from there to usr@second. This type of addressing is nonstandard. Essentially, it is a form of route addressing that relies on the % character. The rules that handle it look like this:
R$*%$*        $1 @ $2              Convert all % to @
R$*@$*@$*     $1 % $2 @ $3         Undo all but last @
R$*@$*        $@ $>96 $1 <@$2>     Focus on rightmost
Here, the first rule changes all the percent characters into @ characters.
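Taken in isolation, the net effect of the three % hack rules can be sketched in a few lines of Python. This is only an approximation built from the rule comments: it uses plain string operations rather than sendmail's token matching, it omits the call to rule set 96, and the function name percent_hack is hypothetical.

```python
def percent_hack(address):
    """Approximate the combined effect of the three % hack rules,
    stopping before the result would be passed to rule set 96."""
    # Rule 1: convert every % to @.
    work = address.replace("%", "@")
    if "@" not in work:
        return address              # nothing to focus on
    # Rule 2: turn every @ except the last one back into %.
    left, _, host = work.rpartition("@")
    left = left.replace("@", "%")
    # Rule 3: focus on the host that follows the one remaining @.
    return f"{left}<@{host}>"
```

Under these assumptions, percent_hack("usr%second@first") gives usr%second<@first>, and percent_hack("usr%second") gives usr<@second>, which shows how the second hop becomes the focused host once the first host has been removed.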