25.3 Header Field Contents
The field
of the H configuration command can contain any
ASCII characters, including whitespace and newlines that result from
joining.
For most headers, however, those characters must obey the following
rules for grouping:
- Atom
-
In the header field, space characters
separate one item from another. Each space-delimited item is further
subdivided by specials (described next), into atoms:
smtp          ← an atom
foo@host      ← atom special atom
Babe Ruth     ← atom atom
An atom is the smallest unit in a header and
cannot contain any control characters. When the
field is an address, an atom is the same
thing as a token (see Chapter 18).
- Specials
-
The special characters are those used
to separate one component of an address from another. They are
internally defined as:
( ) < > @ , ; : \ " . [ ]
A special character can be made nonspecial by preceding it with a
backslash character. For example:
foo;fum       ← atom special atom
foo\;fum      ← one atom
The space and tab characters (also called linear-whitespace
characters) are also used to separate atoms and can be thought of as
specials.
- Quoted text
-
Quotation marks can be used to force multiple items to be treated as
a single atom. For example:
Babe Ruth     ← atom atom
"Babe Ruth"   ← a single atom
Quoted text can contain any characters except for an unescaped
quotation mark (") or an unescaped backslash character (\).
- Any text
-
Some headers, such as Subject:, impose minimal rules on the text in the
header field. For such headers, atoms,
specials, and quotes have no significance, and the entire field is
taken as arbitrary text.
The detailed requirements of each header name are covered at the end
of this chapter.
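The grouping rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described in this section, not the code sendmail itself uses. The function name tokenize and the (kind, text) pairs it returns are hypothetical, and comments in parentheses are not handled (each parenthesis is simply reported as a special).

```python
# The special characters listed in this section.
SPECIALS = set('()<>@,;:\\".[]')

def tokenize(field):
    """Split a header field into (kind, text) pairs of atoms and specials."""
    tokens = []
    atom = []
    i = 0

    def flush():
        # Emit the atom collected so far, if any.
        if atom:
            tokens.append(("atom", "".join(atom)))
            atom.clear()

    while i < len(field):
        ch = field[i]
        if ch == "\\" and i + 1 < len(field):
            # A backslash makes the next character nonspecial.
            atom.append(field[i:i + 2])
            i += 2
        elif ch == '"':
            # Quoted text is a single atom; an unterminated quote
            # runs to the end of the field.
            flush()
            j = i + 1
            while j < len(field) and field[j] != '"':
                j += 2 if field[j] == "\\" else 1
            tokens.append(("atom", field[i:j + 1]))
            i = j + 1
        elif ch in " \t":
            # Linear whitespace separates atoms.
            flush()
            i += 1
        elif ch in SPECIALS:
            flush()
            tokens.append(("special", ch))
            i += 1
        else:
            atom.append(ch)
            i += 1
    flush()
    return tokens
```

Under these assumptions, tokenize("foo@host") yields an atom, a special, and an atom, whereas tokenize('"Babe Ruth"') yields a single atom.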
25.3.1 Macros in the Header Field
Macros can appear in
any position in the field of a header
definition line. Such macros are not expanded (their values tested or
used) until mail is queued or delivered. For the meaning of each
macro name and a description of when each is given a value, see Chapter 21.
Only two macro prefixes can be used in the field
of header definitions:
- $
-
The $ prefix tells sendmail
to replace the macro's name with its value at that
place in the field definition.
- $?
-
The $? prefix tells sendmail
to perform conditional replacement of a macro's
value.
For example, the following header definition uses the
$ prefix to insert the value of the macro
x into the header field:
HFull-Name: $x
The macro $x
contains as its value the full name of the sender.
When the possibility exists that a macro will not have a value at the
time the header line is processed, the $?
conditional prefix (Section 21.6) can be used:
HReceived: $?sfrom $s $.by $j ($v/$V)
Here, the $? prefix and $.
operator cause the text:
from $s
to be inserted into the header field only if the
macro s has a value. $s can
contain as its value the name of the sending site.
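The effect of the two prefixes can be approximated with a short Python sketch. It is a simplification under stated assumptions, not sendmail's own expansion code: the function name expand is hypothetical, only single-character macro names are recognized, conditionals are assumed not to nest, and the macro values in the usage note are invented for illustration.

```python
import re

def expand(field, macros):
    """Expand $?x text $. conditionals, then plain $x references."""
    def conditional(match):
        name, text = match.group(1), match.group(2)
        # Keep the enclosed text only when macro `name` has a value.
        return text if macros.get(name) else ""

    # $?x ... $. : conditional replacement.
    field = re.sub(r"\$\?(\w)(.*?)\$\.", conditional, field, flags=re.S)
    # $x : replace the macro's name with its value (empty if unset).
    return re.sub(r"\$(\w)", lambda m: macros.get(m.group(1), ""), field)
```

With hypothetical values such as {"s": "remote.host", "j": "here.us.edu", "v": "8.12", "V": "Z1"}, the field of the Received: definition above expands to "from remote.host by here.us.edu (8.12/Z1)". When s has no value, the "from" text is omitted and the result begins with "by".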
25.3.2 Escape Character in the Header Field
Recall that the backslash escape
character (\) is used to deprive the special
characters of their special meaning. In the field
of header definitions the escape character can be used only inside
quoted strings (see Section 25.3.3), in domain literals (addresses
enclosed in square bracket pairs), or in comments (see Section 25.3.4).
Specifically, this means that the escape character
cannot be used within atoms. Therefore, the
following is not legal:
Full\ Name@domain     ← not legal
Instead, the atom to the left of the @ must be
isolated with quotation marks:
"Full Name"@domain    ← legal
25.3.3 Quoted Strings in the Header Field
Recall
that quotation marks (") force arbitrary text to
be viewed as a single atom. Arbitrary text is everything (including
joined lines) that begins with the first quotation mark and ends with
the final quotation mark. The following example illustrates two
quoted strings:
"Full Name"
"One long string carried over
    two lines by indenting the second"
 ↑
 whitespace
The quotation mark character can appear inside a quoted string only
if it is escaped by using a backslash:
"George Herman \"Babe\" Ruth"
Internally, sendmail does not check for balanced
quotation marks. If it finds the first but not the second, it takes
everything up to the end of the line as the quoted string.
When quotation marks are used in an H
configuration command, they must be balanced. Although
sendmail remains silent, unbalanced quotation
marks can cause serious problems when they are propagated to other
programs.
25.3.4 Comments in the Header Field
Comments consist of
text inside a header field that is
intended to give users additional information. Comments are saved
internally by sendmail when processing headers,
then are restored, but otherwise are not used. Beginning with V8.7
sendmail, the F=c delivery
agent flag can be used to prevent
restoration of the saved comments.
A comment begins with a left parenthesis and ends with a right
parenthesis. Comments can nest. The following lines illustrate a
nonnested comment and a comment nested inside another:
(this is a comment)
(text(this is a comment nested inside another)text)
Comments can be split over multiple lines by indenting:
(this is a comment
    split into two lines)
 ↑
 whitespace
A comment (even if nested) separates one atom from another just like
a space or a tab does. Therefore, the following produces two atoms
rather than one:
Bill(postmaster)Johnson
However, comments inside quoted strings are not special, so the
following produces a single atom:
"Bill(postmaster)Johnson"
A parenthesis that is not part of a nested comment can appear inside
a comment only if it is escaped with a backslash:
<root@host.domain> (The happy administrator ;-\))
                                              ↑
                                              note
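The comment rules in this section (nesting, quoting, and escaping) can be illustrated with a small Python sketch. It is not how sendmail saves and restores comments internally. The function name split_comments and its return value are hypothetical, and each outermost comment is replaced by a single space to reflect its role as a separator between atoms.

```python
def split_comments(field):
    """Return (text_without_comments, list_of_outermost_comments)."""
    text, comments, current = [], [], []
    depth = 0            # current comment nesting level
    in_quotes = False
    i = 0
    while i < len(field):
        ch = field[i]
        step = 2 if ch == "\\" and i + 1 < len(field) else 1
        chunk = field[i:i + step]
        i += step
        if step == 2:
            # An escaped character is copied as is and is never special.
            (current if depth else text).append(chunk)
        elif ch == '"' and depth == 0:
            in_quotes = not in_quotes
            text.append(ch)
        elif ch == "(" and not in_quotes:
            depth += 1
            current.append(ch)
        elif ch == ")" and depth > 0:
            depth -= 1
            current.append(ch)
            if depth == 0:
                comments.append("".join(current))
                current.clear()
                text.append(" ")   # a comment separates atoms like a space
        else:
            (current if depth else text).append(ch)
    if current:
        # An unterminated comment is kept as ordinary text.
        text.append("".join(current))
    return "".join(text), comments
```

Under these assumptions, Bill(postmaster)Johnson is reduced to the two atoms Bill and Johnson with one saved comment, whereas the quoted form "Bill(postmaster)Johnson" is returned unchanged with no comments.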
25.3.4.1 Balancing special characters
Many of the special characters that
are used in the header field and in
addresses need to appear in balanced pairs. Table 25-2 shows these characters and the characters
needed to balance them. Failure to maintain balance can lead to
failed mail. Note that only parentheses can be nested. None of the
other balanced pairs can nest.
Table 25-2. Balancing characters
Character | Balanced by
"         | "
(         | )
[         | ]
<         | >
You have already seen the quoted string and comments. The angle
brackets (< and >) are
used to specify a machine-readable address, such as
<gw@wash.dc.gov>. The square brackets
([ and ]) are used to specify a
direct Internet address (one that bypasses normal DNS name lookups)
such as [123.45.67.89].
The sendmail program gives warnings about
unbalanced characters only when it is attempting to extract an
address from a header definition, from the header line of a mail
message, or from the envelope. Beginning with V8.6, when
sendmail finds an unbalanced condition, it tries
to balance the offending characters as rationally as possible.
Regardless of whether it can balance them, it prints one of the
following warning messages:
Unbalanced ')'
Unbalanced '>'
Unbalanced '('
Unbalanced '<'
Unbalanced '"'
If it did not succeed in balancing them, the mail will probably
bounce.
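A rough check for unbalanced characters can be sketched in Python. This is an illustrative approximation only: it does not reproduce sendmail's actual balancing logic and makes no attempt to repair the field. The function name unbalanced is hypothetical, the message strings are copied from the list above, and square brackets are ignored because no warning for them appears in that list.

```python
def unbalanced(field):
    """Return warnings for unbalanced (), <>, and quotation marks."""
    warnings = []
    paren_depth = 0      # only parentheses can nest
    in_angle = False
    in_quotes = False
    i = 0
    while i < len(field):
        ch = field[i]
        if ch == "\\":
            i += 2       # skip the escaped character
            continue
        if ch == '"' and paren_depth == 0:
            in_quotes = not in_quotes
        elif in_quotes:
            pass         # nothing else is special inside a quoted string
        elif ch == "(":
            paren_depth += 1
        elif ch == ")":
            if paren_depth == 0:
                warnings.append("Unbalanced ')'")
            else:
                paren_depth -= 1
        elif paren_depth == 0 and ch == "<":
            if in_angle:
                warnings.append("Unbalanced '<'")
            in_angle = True
        elif paren_depth == 0 and ch == ">":
            if not in_angle:
                warnings.append("Unbalanced '>'")
            in_angle = False
        i += 1
    if in_quotes:
        warnings.append("Unbalanced '\"'")
    if paren_depth:
        warnings.append("Unbalanced '('")
    if in_angle:
        warnings.append("Unbalanced '<'")
    return warnings
```

For example, under these assumptions the address <gw@wash.dc.gov> produces no warnings, whereas the same address without its closing angle bracket produces the single warning Unbalanced '<'.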