9.3. Metacharacters
The following characters have special meaning in search patterns:
Character | Meaning |
. | Match any single character except newline. |
* | Match zero or more occurrences of the single character that immediately precedes it. The preceding character can also be a regular expression (e.g., since . (dot) means any character, .* means match any number of any character except newlines). |
^ | Match the beginning of the line or string. |
$ | Match the end of the line or string. |
[ ] | Match any one of the enclosed characters. A hyphen (-) indicates a range of consecutive characters. A circumflex (^) as the first character in the brackets reverses the sense: it matches any one character not in the list. A hyphen or close bracket (]) as the first character is treated as a member of the list. All other metacharacters are treated as members of the list. |
[^ ] | Match any one character except the enclosed characters. |
\{n,m\} | Match a range of occurrences of the single character that immediately precedes it. The preceding character can also be a regular expression. \{n\} matches exactly n occurrences, \{n,\} matches at least n occurrences, and \{n,m\} matches any number of occurrences between n and m, inclusive. |
{n,m} | Like \{n,m\}. Available in egrep by default and in gawk with the -Wre-interval option. |
\ | Turn off the special meaning of the character that follows. |
\(\) | Save the matched text enclosed between \( and \) in a special holding space. Up to nine patterns can be saved on a single line. They can be "replayed" in the same pattern or within substitutions by the escape sequences \1 to \9. |
\n | Reuse matched text stored in nth \( \). |
() | In egrep and gawk, save the matched text enclosed between ( and ) in a holding space to be replayed in substitutions by the escape sequences \1 to \9. |
\<\> | Match the beginning (\<) or end (\>) of a word. |
+ | Match one or more instances of the preceding regular expression. |
? | Match zero or one instance of the preceding regular expression. |
| | Match either the regular expression specified before or the one specified after. |
() | Group regular expressions. |
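The search metacharacters above can be tried out with a short shell sketch. It assumes a POSIX shell and a grep that accepts -E for the extended syntax; the sample text and the count_matches helper are invented for illustration and are not part of any utility.

```shell
# Sample input: one candidate string per line (hypothetical data).
sample='cat
caat
ct
concat
dog 42
the end'

# count_matches PATTERN [grep options]
# Hypothetical helper: print how many sample lines match PATTERN.
count_matches() {
    pattern=$1
    shift
    printf '%s\n' "$sample" | grep -c "$@" -e "$pattern"
}

count_matches '^c.*t$'        # ^, $, and .*: cat, caat, ct, concat
count_matches 'ca\{2\}t'      # basic-syntax interval: caat only
count_matches 'ca{1,2}t' -E   # extended-syntax interval: cat, caat, concat
count_matches '[0-9][0-9]*'   # bracket range: "dog 42"
count_matches '^[^c]'         # negated list: lines not starting with c
count_matches '\(.\)\1'       # saved text replayed in the pattern: caat
count_matches 'cat|dog' -E    # alternation: cat, concat, "dog 42"
```

Each call prints a line count rather than the matching lines, which makes it easy to compare the basic and extended forms of the same pattern side by side.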
Many utilities support POSIX character lists (also called character classes), which are useful for matching
non-ASCII characters in languages other than English. These lists are
recognized only inside a bracket expression ([ ]). A typical use is
[[:lower:]], which in English
is the same as [a-z].
The following table lists the POSIX character lists:
Notation | Action |
[:alnum:] | Alphanumeric characters |
[:alpha:] | Alphabetic characters, uppercase and lowercase |
[:blank:] | Printable whitespace: spaces and tabs but not control characters |
[:cntrl:] | Control characters, such as ^A through ^Z |
[:digit:] | Decimal digits |
[:graph:] | Printable characters, excluding whitespace |
[:lower:] | Lowercase alphabetic characters |
[:print:] | Printable characters, including whitespace but not control characters |
[:punct:] | Punctuation, a subclass of printable characters |
[:space:] | Whitespace, including spaces, tabs, and some control characters |
[:upper:] | Uppercase alphabetic characters |
[:xdigit:] | Hexadecimal digits |
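As a rough illustration of how these lists are written in practice, the sketch below wraps each one in an outer pair of brackets and anchors the pattern so that every character of the input must belong to the list. It assumes a POSIX shell and grep; the classify function and its labels are hypothetical, and LC_ALL=C is set only to keep the results independent of the local language settings.

```shell
# Use the C locale so the lists cover plain ASCII only (an assumption
# made here for predictable results).
LC_ALL=C
export LC_ALL

# classify STRING
# Hypothetical helper: report the first list that every character of
# STRING belongs to, testing the narrowest lists first.
classify() {
    if printf '%s\n' "$1" | grep -q '^[[:digit:]][[:digit:]]*$'; then
        echo digits
    elif printf '%s\n' "$1" | grep -q '^[[:xdigit:]][[:xdigit:]]*$'; then
        echo hex
    elif printf '%s\n' "$1" | grep -q '^[[:alpha:]][[:alpha:]]*$'; then
        echo alpha
    elif printf '%s\n' "$1" | grep -q '^[[:alnum:]][[:alnum:]]*$'; then
        echo alnum
    else
        echo other
    fi
}

classify 2001      # decimal digits only
classify beef      # letters that are all hexadecimal digits
classify hello     # letters, some outside a-f
classify linux9    # letters and digits mixed
classify 'a b'     # contains a space, so none of the lists above fit
```

Note the doubled brackets: the inner [:digit:] names the list, and the outer [ ] is the bracket expression that the list must appear in.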
The following characters have special meaning in replacement
patterns:
Character | Meaning |
\ | Turn off the special meaning of the character that follows. |
\n | Restore the text matched by the nth pattern previously saved by \( and \). n is a number from 1 to 9; saved patterns are numbered sequentially from left to right. |
& | Reuse the text matched by the search pattern as part of the replacement pattern. |
~ | Reuse the previous replacement pattern in the current replacement pattern. |
\e | End replacement pattern started by \L or \U. |
\E | End replacement pattern started by \L or \U. |
\l | Convert first character of replacement pattern to lowercase. |
\L | Convert replacement pattern to lowercase. |
\u | Convert first character of replacement pattern to uppercase. |
\U | Convert replacement pattern to uppercase. |
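The sketch below exercises three of the replacement metacharacters (\n, &, and the escaping backslash) with sed, assuming a POSIX shell and sed. The function names and sample strings are invented for illustration. The ~ and case-conversion sequences are left out because not every program supports them.

```shell
# swap_words STRING
# Hypothetical helper: save two space-separated words with \( \) and
# replay them in reverse order with \2 and \1.
swap_words() {
    printf '%s\n' "$1" | sed 's/^\([^ ]*\) \([^ ]*\)$/\2 \1/'
}

# bracket_digits STRING
# Hypothetical helper: & stands for whatever text the search pattern
# matched, so each run of digits is wrapped in angle brackets.
bracket_digits() {
    printf '%s\n' "$1" | sed 's/[0-9][0-9]*/<&>/g'
}

# literal_amp STRING
# Hypothetical helper: \& turns off the special meaning of &, so a
# plain ampersand is inserted instead of the matched text.
literal_amp() {
    printf '%s\n' "$1" | sed 's/and/\&/'
}

swap_words 'hello world'
bracket_digits 'port 80 and 443'
literal_amp 'salt and pepper'
```

Comparing bracket_digits with literal_amp shows the difference a single backslash makes: the unescaped & expands to the matched text, while \& is an ordinary character.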
Copyright © 2001 O'Reilly & Associates. All rights reserved.