29.4 Rule Set 3
Rule set 3 is the first to process every address. It puts each into a form that simplifies the tasks of other rule sets. The most common method is to have rule set 3 focus an address (place angle brackets around the host part). Then later rules don't have to search for the host part, because it is already highlighted. For example, consider trying to spot the recipient host in this mess:
uuhost!user%host1%host2
Here, rule set 3 focuses on the recipient host, uuhost, rewriting the address as:
user%host1%host2<@uuhost.uucp>
Note that uuhost has been moved to the end, the ! has been removed, an @ now precedes the host, and .uucp has been appended.
29.4.1 A Special Case: From:<>
The first rule in a typical rule set 3 handles addresses that are composed of empty angle brackets. These represent the special case of an empty or nonexistent address. Empty addresses should be turned into the address of the pseudo-user that bounces mail, Mailer-Daemon:
# handle "from:<>" special case
R$*<>$*	$@<@>	empty becomes special
Here, empty angle brackets, no matter what surrounds them ($*), are rewritten as <@>.
29.4.2 Basic Textual Canonicalization
Addresses can be legally expressed in only four formats: [2]
address
address (full name)
<address>
full name <address>
When sendmail preprocesses an address that is in the second format, it removes (and saves for later use) the full name from within the parentheses. The last two formats, however, contain additional characters and information that are not discarded during preprocessing. As a consequence, rule set 3 must take on the job of discarding the unwanted information:
# basic textual canonicalization
R$*<$*<$*<$*>$*>$*>$*	$4	3-level <> nesting
R$*<$*<$*>$*>$*	$3	2-level <> nesting
R$*<$*>$*	$2	basic RFC821/822 parsing
Here, we discard everything outside of and including the innermost pair of angle brackets. Three rules are required to do this because of the minimal-matching nature of the LHS operators (see Section 8.7.2, "Minimal Matching"). Consider trying to de-nest a three-level workspace using only a rule like the third:
the workspace    A < B < C < D > C > B > A
$*               matches    A
<                matches    <
$+               matches    B < C < D
>                matches    >
$*               matches    C > B > A
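The match traced above can be imitated outside sendmail. The following is a minimal sketch, assuming that Python's lazy regular-expression groups are an acceptable character-level stand-in for sendmail's token-based minimal matching. The names RULES and apply_rules are illustrative only and are not part of sendmail.

```python
import re

# Character-level stand-ins for the three de-nesting rules.
# Assumption: lazy groups approximate sendmail's minimal matching of tokens.
RULES = [
    (re.compile(r"(.*?)<(.*?)<(.*?)<(.*?)>(.*?)>(.*?)>(.*?)"), 4),  # 3-level <> nesting
    (re.compile(r"(.*?)<(.*?)<(.*?)>(.*?)>(.*?)"), 3),              # 2-level <> nesting
    (re.compile(r"(.*?)<(.*?)>(.*?)"), 2),                          # basic RFC821/822 parsing
]


def apply_rules(workspace, rules):
    """Try each rule in order, repeating a rule for as long as it matches."""
    for pattern, keep in rules:
        while True:
            match = pattern.fullmatch(workspace)
            if match is None:
                break
            # The RHS keeps only one matched group and discards the rest.
            workspace = match.group(keep).strip()
    return workspace


nested = "A < B < C < D > C > B > A"
print(apply_rules(nested, RULES[2:]))  # third rule alone: B < C < D
print(apply_rules(nested, RULES))      # all three rules:  D
```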
Clearly, the result, B < C < D, is not the innermost address, and its angle brackets no longer balance:
Unbalanced '<'
John Halleck designed a clever alternative to the above traditional technique that is now included with V8 sendmail:
R$*	$: < $1 >	housekeeping <>
R$+ < $* >	< $2 >	strip excess on left
R< $* > $+	< $1 >	strip excess on right
R<>	$@ < @ >	MAIL FROM:<> case
R< $+ >	$: $1	remove housekeeping <>
Here, angle bracket pairs are stripped first from the left of an address, then from the right, and finally whatever is left must be the address.
29.4.3 Handling Routing Addresses
The sendmail program must be able to handle addresses that are in route address syntax. Such addresses are in the form @A,@B:user@C (which means that mail should be sent first to A, then from A to B, and finally from B to C). [3] The commas are converted to colons for easier design of subsequent rules. They must be converted back to commas by rule set 4. Rule set 3 uses a simple rule to convert all commas to colons:
# make sure list syntax is easy to parse
R@ $+ , $+	@ $1 : $2	change all "," to ":"
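The iterative nature of rules comes into play here. As long as there is an @ followed by anything ($+), then a comma, then anything ($+), the comma is changed to a colon. The next rule then focuses on the first host in the route:
R@ $+ : $+	$@ <@ $1> : $2	focus route-addr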
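The combined effect of these two rules on a route address can be sketched in Python. This is only an approximation on plain strings (sendmail works on tokens, and edge cases such as empty host names are ignored), and the function names to_colons and focus_route are hypothetical.

```python
def to_colons(route):
    """Imitate the comma rule: rewrite one comma per pass until none is left."""
    while route.startswith("@") and "," in route[1:]:
        # Minimal matching: $1 ends at the first comma after the leading @.
        head, tail = route[1:].split(",", 1)
        route = "@" + head + ":" + tail
    return route


def focus_route(route):
    """Imitate the focus rule: bracket the first host in the route, then stop."""
    if route.startswith("@") and ":" in route[1:]:
        host, rest = route[1:].split(":", 1)
        if host and rest:  # $+ needs at least one token on each side
            return "<@" + host + ">:" + rest
    return route  # not a route address: leave the workspace unchanged


route = to_colons("@A,@B:user@C")
print(route)               # @A:@B:user@C
print(focus_route(route))  # <@A>:@B:user@C
```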
Once that host has angle brackets placed around it (is focused), the job of rule set 3 ends, and it exits (the $@ prefix in the RHS).
29.4.4 Handling Specialty Addresses
A whole book is dedicated to the myriad forms of addressing that might face a site administrator: !%@:: A Directory of Electronic Mail Addressing & Networks by Donnalyn Frey and Rick Adams (O'Reilly & Associates, 1993). We won't duplicate that work here; rather, we point out that most such addresses are handled nicely by existing configuration files. Consider the format of a DECnet address:
host::user
One approach to handling such an address in rule set 3 is to convert it into the Internet user@host.domain form:
R$+ :: $+	$@ $2 @ $1.decnet
Here, we reverse the host and user parts and put them into Internet form.
This is a simple example of a special address problem from the many that can develop. In addition to DECnet, for example, your site may have to deal with Xerox Grapevine addresses, X.400 addresses, or UUCP addresses. The best way to handle such addresses is to copy what others have done.
29.4.5 Focusing for @ Syntax
The last few rules in our illustration of rule set 3 are used to process the Internet-style user@domain address:
# find focus for @ syntax addresses
R$+ @ $+	$: $1 <@ $2>	focus on domain
R$+ < $+ @ $+ >	$1 $2 <@ $3>	move gaze right
R$+ <@ $+ >	$@ $1 <@ $2>	already focused
For an address like something@something, the first rule focuses on all the tokens following the first @. For example, given the workspace
user@host1@host2
this first rewrite results in
user<@host1@host2>
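The second rule (move gaze right) then shifts the focus to the right, one @ at a time, until only the rightmost host is focused. For example,
user<@host1@host2@host3@host4>
becomes
user@host1@host2@host3<@host4>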
The third rule checks to see whether the workspace has been focused. If it has, it returns the focused workspace (the $@ prefix in the RHS).
Any address that has not been handled by rule set 3 is unchanged and probably not focused. Since rule set 0 expects all addresses to be focused so that it can select appropriate delivery agents, such unfocused addresses may bounce. Many configuration files allow local addresses (just a username) to be unfocused.
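As a rough illustration of the focusing rules in 29.4.5, the following Python sketch mimics their net effect on plain strings. It assumes a well-formed address with non-empty parts between @ signs, it treats anything that already contains an angle bracket as handled elsewhere, and the name focus_at is hypothetical rather than anything sendmail provides.

```python
def focus_at(address):
    """Approximate the three '@ syntax' rules on a plain string."""
    if "<" in address:
        # Assumption: an address that is already bracketed is left alone.
        return address
    # Rule 1 (focus on domain): bracket everything after the first @.
    user, at, host = address.partition("@")
    if not (at and user and host):
        return address  # no user@host form: the workspace stays unfocused
    # Rule 2 (move gaze right): shift one @host out of the focus per pass.
    while "@" in host:
        left, _, host = host.partition("@")
        user = user + "@" + left
    # Rule 3 (already focused): return the focused workspace.
    return user + "<@" + host + ">"


print(focus_at("user@host1@host2@host3@host4"))  # user@host1@host2@host3<@host4>
print(focus_at("user@host"))                     # user<@host>
print(focus_at("user"))                          # user
```

The last call shows the case described above: a bare username matches none of the rules and leaves the sketch unfocused.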