Use regular expressions
V8.9 and above
The regex type allows you to parse tokens in the
workspace using POSIX regular expressions. For information on how to
use regular expressions, see the online manuals
ed(1) and regexp(1). A
regex database-map type is declared like this:
Kname regex expression
The name is the symbolic name you will use
to reference this database map from inside the RHS of rule sets. The
expression is the literal text that
composes your regular expression. Here is a simple example:
Knumberedname regex ^[0-9]+<@(aol|msn).com.?>
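The intention here is for this regular expression to match any
address that has an all-numeric user part (the part before the
<@), and a domain part that is either
aol.com or (the | character)
msn.com. Note that an unescaped . in a regular expression matches
any single character; to match only a literal dot, write \. instead.
To make rules that use this type easier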
to write, you can add a -a switch to the
declaration:
Knumberedname regex -a.FOUND ^[0-9]+<@(aol|msn).com.?>
Here the -a database switch causes
.FOUND to be appended to any successful match.
Note that because of the way we have declared this database map,
nothing but the suffix will be returned on a successful match. To get
the original key returned you also need to use the
-m database switch.
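As a minimal sketch of how such a declaration might be referenced from the RHS of a rule, the following mini-configuration file combines -a with -m so that the original key is returned with the suffix appended. The rule-set name test is an arbitrary choice made for rule-testing mode, not something required by the regex type, and the LHS and RHS of the R line must be separated by tabs:

```
# Sketch only: the rule-set name "test" is illustrative
V10
Knumberedname regex -m -a.FOUND ^[0-9]+<@(aol|msn).com.?>
Stest
# Pass the whole workspace to the map as the key
R $*			$@ $(numberedname $1 $)
```

A later rule could then test for the trailing .FOUND in its LHS to decide how to handle the address.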
This regex type can use a number of switches to
good advantage. The complete list is shown in Table 23-24.
Table 23-24. The regex database-map type K command switches

Switch  See           Description
-a      -a            Append tag on successful match
-b      This section  Use basic, not extended, regular expression matching
-D      -D            Don't use this database map if DeliveryMode=defer
-d      This section  The delimiting string
-f      -f            Don't fold keys to lowercase, and cause the regular
                      expression to match in a case-sensitive manner
-m      -m            Suppress replacement on match
-n      This section  NOT; that is, invert the test
-q      -q            Don't strip quotes from key
-S      -S            Space replacement character
-s      This section  Substring to match and return
-T      -T            Suffix to append on temporary failure
-t      -t            Ignore temporary errors

Note that some additional explanation for a few of these switches is
provided in the sections that follow. Also, for an actual example of
the regex type, see the file
cf/cf/knecht.mc, which demonstrates a way to
deal with one type of spam email.
The -b regex database-map switch
The -b switch restricts the
regular expression to a more limited but faster form. If you are
using only simple regular expressions, of the kind
defined by ed(1), you can use this
-b switch to slightly speed up the process:
Kmatch regex -b -aLOCAL @localhost
Here, the search is for a workspace that contains the substring
@localhost. Because this is a very simple regular
expression, the -b switch is appropriate. If you
use -b on a complex match (such as the one in
the -n example),
you might see an error such as this:
configfile: line num: field (2) out of range, only 1 substring in pattern
The -d regex database-map switch
There might be times when you would prefer
some other character, operator, or token to replace the
$| that is returned when using the
-s switch. If so, you can specify a different one
with the -d database switch. Consider:
Kmatch regex -s2,3 -d+|+ -a.FOUND (\<a\>|\<b\>)@(\<bob\>|\<ted\>).(\<com\>|\<org\>)
Here we specify that the three characters +|+ will
replace the single operator $| in the returned
value:
> test a@bob.com
test input: a @ bob . com
test returns: bob+|+com . FOUND
Note that here the bob+|+com is a single token.
You can opt to have the original key returned. This is done by
specifying the -m database switch:
Kmatch regex -s2,3 -m -d+|+ -a.FOUND (\<a\>|\<b\>)@(\<bob\>|\<ted\>).(\<com\>|\<org\>)
Note that the -m switch overrides the presence of
the -s and -d switches:
> test a@bob.com
test input: a @ bob . com
test returns: a @ bob . com . FOUND
The -n regex database-map switch
The -n switch inverts the entire sense of the
regular expression lookup. It returns a successful match only if the
regular expression does not match. Consider:
Kmatch regex -m -n -a.FOUND (\<a\>|\<b\>)@(\<bob\>|\<ted\>).(\<com\>|\<org\>)
If you view the effect of this switch in rule-testing mode, you will
see that the result is inverted:
> test a@bob.com
test input: a @ bob . com
test returns: a @ bob . com
> test x@y.net
test input: x @ y . net
test returns: x @ y . net . FOUND
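One way such an inverted lookup might be put to use is to reject anything that fails to match the expression. The following is only a sketch: the rule-set name test, the DSN code, and the error text are illustrative assumptions, and the LHS and RHS of each R line are separated by tabs:

```
# Sketch only: rule-set name, DSN code, and error text are illustrative
V10
Kmatch regex -m -n -a.FOUND (\<a\>|\<b\>)@(\<bob\>|\<ted\>).(\<com\>|\<org\>)
Stest
# With -n, .FOUND is appended only when the expression does NOT match
R $*			$: $(match $1 $)
# Reject any address that now carries the .FOUND suffix
R $* . FOUND		$#error $@ 5.7.1 $: "550 Address not permitted"
```

Addresses that do match the expression are returned unchanged by the lookup, so they fall through the second rule untouched.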
The -s regex database-map switch
The -s database-map switch is used with the
regex type to specify a substring to match and
return. To illustrate, consider the following mini-configuration
file:
V10
Kmatch regex -s (\<bob\>|\<ted\>)
Stest
R $* $@ $(match $1 $)
The regular expression looks to match either the name
bob or ted, but no other names.
The -s says to return the substring actually
matched in the expression along with the key, the two separated from
each other by a $| operator. Now, observe this
mini-configuration file in rule-testing mode:
% /usr/sbin/sendmail -bt -Cdemo.cf
ADDRESS TEST MODE (ruleset 3 NOT automatically invoked)
Enter <ruleset> <address>
> test bob
test input: bob
test returns: bob $| bob
> test alice
test input: alice
test returns: alice
By adding a -a switch, which appends text to the
matched key:
Kmatch regex -s -a.FOUND (bob|ted)
we see that the matched key with -s is second:
> test bob
test input: bob
test returns: bob $| bob . FOUND
When multiple substrings can be matched, the -s
database switch can be used to specify which substring match to
return. Consider:
Kmatch regex -s2 -a.FOUND (\<a\>|\<b\>)@(\<bob\>|\<ted\>)
There are two substring searches here, first the
(\<a\>|\<b\>) choice, then the
(\<bob\>|\<ted\>) choice. Because the
-s has a 2 as its argument, the
second matched substring will be returned, not the first:
> test a@bob
test input: a @ bob
test returns: bob . FOUND
In more complex expressions it might be desirable to return multiple
substrings. To do that just list them following the
-s with each separated from the next by a comma:
Kmatch regex -s2,3 -a.FOUND (\<a\>|\<b\>)@(\<bob\>|\<ted\>).(\<com\>|\<org\>)
When multiple substrings are listed in this way, they are separated
by the $| operator when they are returned:
> test a@bob.com
test input: a @ bob . com
test returns: bob $| com . FOUND
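Because the substrings come back separated by the $| operator, a following rule can pick them apart by placing $| in its LHS. The sketch below assumes the same declaration as above; the rule-set name test and the rebuilt form are illustrative only, and the LHS and RHS of each R line are separated by tabs:

```
# Sketch only: the rule-set name and the rewritten form are illustrative
V10
Kmatch regex -s2,3 -a.FOUND (\<a\>|\<b\>)@(\<bob\>|\<ted\>).(\<com\>|\<org\>)
Stest
# Look up the workspace; a match returns substring 2, $|, substring 3, then .FOUND
R $*			$: $(match $1 $)
# Split the result at the $| operator and rejoin the two substrings with a dot
R $+ $| $+ . FOUND	$@ $1 . $2
```

With this arrangement, a workspace that does not match the expression should pass through both rules unchanged, because the second rule's LHS requires both the $| and the .FOUND suffix.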