26.3 Understanding Expressions
You are probably familiar with the kinds of expressions that a calculator interprets. Look at the following arithmetic expression:
2 + 4
"Two plus four" consists of several constants or
literal values and an operator. A calculator program must recognize, for instance, that 2 is a numeric constant and that the plus sign represents an operator, not to be interpreted as the literal + character.
An expression tells the computer how to produce a result. Although it is the sum of "two plus four" that we really want, we don't simply tell the computer to return a six. We instruct the computer to evaluate the expression and return a value.
An expression can be more complicated than 2+4; in fact, it might consist of multiple simple expressions, such as the following:
2 + 3 * 4
A calculator normally evaluates an expression from left to right. However, certain operators have precedence over others: that is, they will be performed first. Thus, the above expression will evaluate to 14 and not 20 because multiplication takes precedence over addition. Precedence can be overridden by placing the simple expression in parentheses. Thus, (2+3)*4 or "the sum of two plus three times four" will evaluate to 20. The parentheses are symbols that instruct the calculator to change the order in which the expression is evaluated.
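The precedence rules above can be checked in any language that follows the usual arithmetic conventions. The short Python sketch below is only an illustration; the variable names are ours and do not come from the original text.

```python
# Multiplication takes precedence over addition, so 3 * 4 is computed first.
without_parens = 2 + 3 * 4

# Parentheses override precedence: the sum 2 + 3 is computed first.
with_parens = (2 + 3) * 4

print(without_parens)  # 14
print(with_parens)     # 20
```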
A regular expression, by contrast, is descriptive of a pattern or sequence of characters. Concatenation is the basic operation implied in every regular expression. That is, a pattern matches adjacent characters. Each literal character is a regular expression that matches only that single character. A regular expression made up of several literal characters therefore describes the first character followed by the second, then followed by the third, and so on; in other words, it describes that exact string.
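To make concatenation concrete, the sketch below uses Python's `re` module with a hypothetical three-character pattern, `ABE`, chosen purely for illustration. Under that assumption, it shows that the pattern matches only where those three characters appear adjacent and in that order, and that matching is case-sensitive by default.

```python
import re

# Hypothetical pattern for illustration: three literal characters in sequence.
pattern = re.compile("ABE")

samples = ["ABE", "BABE", "A BE", "abe", "AEB"]

# search() succeeds if the pattern occurs anywhere in the string.
results = {text: bool(pattern.search(text)) for text in samples}

for text, matched in results.items():
    print(f"{text!r}: {matched}")
```

Only `ABE` and `BABE` contain the three characters side by side; a space between them, a different order, or a different case prevents the match.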
Programs such as grep that accept regular expressions must first parse the syntax of the regular expression to produce a pattern. They then read the input line by line, trying to match the pattern. An input line is a string, and to see if a string matches the pattern, a program compares the first character in the string to the first character of the pattern. If there is a match, it compares the second character in the string to the second character of the pattern. Whenever it fails to make a match, it goes back and compares the next character in the string to the first character of the pattern. Figure 26.1 illustrates this process of trying to match a pattern against an input line.
Figure 26.1: Interpreting a Regular Expression
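The character-by-character process described above can be sketched in a few lines of Python. This is a simplified illustration that handles literal characters only, not how any particular program is implemented; the function name `find_literal` and the sample pattern and input line are assumptions made for the example.

```python
def find_literal(pattern: str, line: str) -> int:
    """Return the index where pattern first occurs in line, or -1 if absent."""
    # Try each position in the line as a possible start of the match.
    for start in range(len(line) - len(pattern) + 1):
        matched = 0
        # Compare the pattern to the line one character at a time.
        while matched < len(pattern) and line[start + matched] == pattern[matched]:
            matched += 1
        if matched == len(pattern):
            return start
        # Mismatch: move to the next character of the line and begin
        # again with the first character of the pattern.
    return -1


print(find_literal("abe", "baby abe"))  # 5
print(find_literal("abe", "ABE"))       # -1
```

In the first call the comparison gets as far as "ab" at index 1 before failing on "y", then restarts from the next character of the line until the full pattern is found at index 5.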
A regular expression is not limited to literal characters. There is, for instance, a metacharacter - the dot (.) - that matches any single character.
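If you understand the difference between a literal character, which matches only itself, and a metacharacter such as the dot, which can stand for many different characters, you already understand the basic idea behind regular expressions.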
It should also be apparent that by use of metacharacters you can expand or limit the possible matches. You have more control over what is matched and what is not. In article 26.4, Bruce Barnett explains in detail how to use regular expression metacharacters.
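As a small illustration of how a metacharacter expands the set of possible matches, the Python sketch below compares a purely literal pattern with one that uses the dot. The patterns `ABE` and `A.E` and the word list are hypothetical, chosen only for the example.

```python
import re

# Hypothetical patterns: one purely literal, one using the dot metacharacter.
literal = re.compile("ABE")
wildcard = re.compile("A.E")  # A, then any single character, then E

words = ["ABE", "ACE", "ALE", "AE", "ABLE"]

literal_hits = [w for w in words if literal.search(w)]
wildcard_hits = [w for w in words if wildcard.search(w)]

print(literal_hits)   # ['ABE']
print(wildcard_hits)  # ['ABE', 'ACE', 'ALE']
```

The dot stands for exactly one character, so `AE` (nothing between A and E) and `ABLE` (two characters between them) are not matched.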
- from O'Reilly & Associates' sed & awk