28.2 Tokenizing Rules
The sendmail program views the text that makes up rules and addresses as being composed of individual tokens. Rules are tokenized - divided up into individual parts - while the configuration file is being read and while they are being normalized. Addresses are tokenized at another time (as we'll show later), but the process is the same for both.
The text our.domain, for example, is composed of three tokens: our, a dot, and domain. These 10 characters are divided into tokens by the list of separation characters defined by the OperatorChars option (the o macro prior to V8.7):
Do.:%@!^=/[]                   prior to V8.7
O OperatorChars=.:%@!^/[]      V8.7 and above
When any of these separation characters is recognized in text, it is considered an individual token. Any leftover text is then combined into the remaining tokens:
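xxx@yyy;zzz becomes xxx @ yyy;zzz
Here @ is one of the separation characters but ; is not, so the text is divided into three tokens. In addition to the characters defined by the OperatorChars option, sendmail defines a list of tokenizing characters internally:
()<>,;\"\r\n
These two lists are combined into one master list that is used for all tokenizing. The above example, when divided by using this master list, becomes five tokens instead of just three:
xxx@yyy;zzz becomes xxx @ yyy ; zzz
In rules, quotation marks can be used to override the meaning of tokenizing characters defined in the master list. For example,
"xxx@yyy";zzz becomes "xxx@yyy" ; zzz
Here, three tokens are produced, because the @ appears inside quotation marks and so does not divide the text. Note that the quotation marks are retained as part of the token.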
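The splitting behavior described above can be sketched in a few lines of Python. This is only an illustration of the rules as stated in this section, not sendmail's actual parser. The names tokenize, OPERATOR_CHARS, INTERNAL_CHARS, and MASTER_LIST are hypothetical, and the internal list is assumed to use C-style escapes (a backslash, a double quote, a carriage return, and a newline).

```python
# Illustrative sketch only; sendmail's real tokenizer is written in C
# and handles many more cases than this.

OPERATOR_CHARS = ".:%@!^/[]"         # the V8.7 OperatorChars value shown above
INTERNAL_CHARS = '()<>,;\\"\r\n'     # assumption: C-style escapes expanded
MASTER_LIST = OPERATOR_CHARS + INTERNAL_CHARS


def tokenize(text, separators):
    """Split text so that each separator becomes its own token."""
    tokens = []
    current = ""
    in_quotes = False
    for ch in text:
        if ch == '"':
            # Quotation marks toggle quoting and are kept in the token.
            in_quotes = not in_quotes
            current += ch
        elif in_quotes:
            # Inside quotes, separators lose their special meaning.
            current += ch
        elif ch in separators:
            if current:
                tokens.append(current)
                current = ""
            tokens.append(ch)
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens


print(tokenize("xxx@yyy;zzz", OPERATOR_CHARS))   # OperatorChars list only
print(tokenize("xxx@yyy;zzz", MASTER_LIST))      # combined master list
print(tokenize('"xxx@yyy";zzz', MASTER_LIST))    # quoted @ does not split
```

Passing the separator list as an argument makes it easy to compare the three-token and five-token results for the same input.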
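Because the configuration file is read sequentially from start to finish, the OperatorChars option should be defined before any rules are declared, so that those rules are tokenized with the intended list. Written out one character at a time, the V8.7 list shown above is:
. : % @ ! ^ / [ ]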
28.2.1 $ Operators Are Tokens
As we progress into the details of rules, you will see that certain characters become operators when prefixed with a $ character. For tokenizing purposes, operators always divide one token from another, just as the characters in the master list do. For example:
xxx$*zzz becomes xxx $* zzz
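A minimal sketch of how $ operators might be folded into a tokenizer follows. It assumes, as a simplification, that an operator is always a $ plus the single character after it. The names tokenize_with_operators and SEPARATORS are hypothetical, and the code is not taken from sendmail.

```python
# Illustrative sketch only. Assumption: every operator is "$" followed by
# exactly one character; this simplification is made for the example.

SEPARATORS = ".:%@!^/[]" + '()<>,;\\"\r\n'   # master list, escapes expanded


def tokenize_with_operators(text, separators=SEPARATORS):
    """Split text on separators, treating "$x" pairs as single tokens."""
    tokens = []
    current = ""
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "$" and i + 1 < len(text):
            # An operator ends any token in progress and stands alone.
            if current:
                tokens.append(current)
                current = ""
            tokens.append(text[i:i + 2])
            i += 2
            continue
        if ch in separators:
            if current:
                tokens.append(current)
                current = ""
            tokens.append(ch)
        else:
            current += ch
        i += 1
    if current:
        tokens.append(current)
    return tokens


print(tokenize_with_operators("xxx$*zzz"))
```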
28.2.2 The Space Character Is Special
The space character is special for two reasons. First, although the space character is not in the master list, it always separates one token from another:
xxx zzz becomes xxx zzz
Second, although the space character separates tokens, it is not itself a token. That is, in the above example the seven characters on the left (counting the space in the middle) become two tokens of three letters each, not three tokens. Therefore the space character can be used inside the LHS or RHS of rules for improved clarity but does not itself become a token or change the meaning of the rule.
28.2.3 Pasting Addresses Back Together
After an address has passed through all the rules (and has been modified by rewriting), the tokens that form it are pasted back together to form a single string. The pasting process is very straightforward in that it mirrors the tokenizing process:
xxx @ yyy becomes xxx@yyy
The only exception to this straightforward pasting process occurs when two adjoining tokens are both simple text. Simple text is anything other than the separation characters (defined by the OperatorChars option and internally by sendmail) or the operators (characters prefixed with a $ character). When two tokens of simple text are pasted together, the character defined by the BlankSub option is inserted between them. In the following example, that character is a dot:
xxx yyy becomes xxx.yyy
Note that the improper use of a space character in the LHS or RHS of rules can lead to addresses that have a dot (or other character) inserted where one was not intended.
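The pasting rule can be sketched as follows. This is an illustration under stated assumptions rather than sendmail's own code: the names paste, is_simple_text, and SEPARATORS are hypothetical, the inserted character is a parameter (called blank_sub here) that defaults to a dot as in the example above, and any token beginning with $ is treated as an operator.

```python
# Illustrative sketch only; not sendmail's implementation.

SEPARATORS = set(".:%@!^/[]" + '()<>,;\\"\r\n')   # master list as a set


def is_simple_text(token):
    """Simple text is neither a separator character nor a $ operator."""
    return token not in SEPARATORS and not token.startswith("$")


def paste(tokens, blank_sub="."):
    """Join tokens, inserting blank_sub between adjoining simple-text tokens."""
    result = ""
    previous = None
    for token in tokens:
        if (previous is not None
                and is_simple_text(previous)
                and is_simple_text(token)):
            result += blank_sub
        result += token
        previous = token
    return result


print(paste(["xxx", "@", "yyy"]))   # separator between text: joined directly
print(paste(["xxx", "yyy"]))        # two simple-text tokens: dot inserted
print(paste("xxx yyy".split()))     # a stray space yields the same two tokens
```

The last call shows why a misplaced space matters: the space itself disappears during tokenizing, but the two tokens it created are rejoined with the substitute character.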