Chapter 9. Pattern Matching
A number of Linux text-editing utilities let you search for and, in some cases, change text patterns rather than fixed strings. These utilities include the editing programs ed, ex, vi, and sed; the awk scripting language; and the commands grep and egrep. Text patterns (also called regular expressions) contain normal characters mixed with special characters (also called metacharacters).
Perl's regular expression support is so rich that it does not fit into the tables in this chapter; you can find a description in the O'Reilly books Perl in a Nutshell, Perl 5 Pocket Reference, or Programming Perl. The Emacs editor also provides regular expressions similar to those shown in this chapter.
ed and ex are hardly ever used as standalone, interactive editors nowadays. However, ed still appears as a batch processor invoked from shell scripts, and ex commands are often invoked within vi through the colon (:) command. We use vi in this chapter to refer to the regular expression features supported by both vi and the ex editor on which it is based.
sed and awk are widely used in shell scripts and elsewhere as filters to alter text.
This chapter presents the following information:
A thorough guide to pattern matching can be found in the Nutshell handbook Mastering Regular Expressions by Jeffrey E. F. Friedl.
9.1. Filenames Versus Patterns
Metacharacters used in pattern matching are different from those used for filename expansion. When you issue a command on the command line, special characters are seen first by the shell, then by the program; therefore, unquoted metacharacters are interpreted by the shell for filename expansion. The command:
$ grep [A-Z]* chap[12]
could, for example, be interpreted by the shell as:
$ grep Array.c Bug.c Comp.c chap1 chap2
and grep then would try to find the pattern "Array.c" in files Bug.c, Comp.c, chap1, and chap2. To bypass the shell and pass the special characters to grep, use quotes:
$ grep "[A-Z]*" chap[12]
Double quotes suffice in most cases, but single quotes are the safest bet.
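The effect of quoting can be checked without running a real search. The following is a minimal sketch, assuming a scratch directory populated with the five filenames from the expansion shown above; the file contents, the variable names, and the chap[12] wildcard are illustrative assumptions. It records the argument list the shell would hand to grep in each case, then runs the unquoted command once to show which pattern grep actually receives.

```shell
# Use the C locale so that [A-Z] matches uppercase ASCII letters only.
LC_ALL=C; export LC_ALL

# Work in a scratch directory so the demonstration touches no real files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Hypothetical files, named after the expansion shown in the text.
: > Array.c
printf 'Array.c is mentioned in Bug.c\n' > Bug.c
printf 'nothing relevant\n' > Comp.c
printf 'Chapter One\n' > chap1
printf 'chapter two\n' > chap2

# Unquoted: the shell expands both wildcards before grep ever starts.
set -- grep [A-Z]* chap[12]
unquoted_args="$*"

# Quoted: the pattern reaches grep untouched; only chap[12] is expanded.
set -- grep '[A-Z]*' chap[12]
quoted_args="$*"

echo "unquoted: $unquoted_args"
echo "quoted:   $quoted_args"

# Running the unquoted form makes grep search for the pattern "Array.c"
# in Bug.c, Comp.c, chap1, and chap2; -l lists the files that match.
matched_files=$(grep -l [A-Z]* chap[12])
echo "files matching the accidental pattern: $matched_files"
```

In this sketch the unquoted argument list begins with Array.c, so that filename becomes the pattern, which is the behavior described above. The single-quoted form keeps [A-Z]* intact as the pattern.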
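Note also that * and ? have subtly different meanings in pattern matching and filename expansion. In filename expansion, * matches any string of characters and ? matches any single character. In a pattern, * matches zero or more occurrences of the preceding regular expression, and ? (in egrep and awk) matches zero or one occurrence of the preceding regular expression.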
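The difference can be seen by testing the same strings against both kinds of matching. The sketch below is illustrative only: glob_match and regex_match are hypothetical helper names, not standard commands. The first relies on the shell's case statement, which follows the same wildcard rules as filename expansion; the second uses grep -E (equivalent to egrep) with -x so that the whole string must match the pattern.

```shell
# Hypothetical helper: does STRING match the shell wildcard PATTERN?
# The unquoted $2 in a case pattern is treated as a wildcard pattern.
glob_match() {
    case $1 in
        $2) return 0 ;;
        *)  return 1 ;;
    esac
}

# Hypothetical helper: does the whole STRING match the extended regex?
regex_match() {
    printf '%s\n' "$1" | grep -E -x -q -- "$2"
}

# * as a wildcard: "a" followed by any string of characters.
glob_match abc 'a*' && glob_star=yes || glob_star=no
# * in a pattern: zero or more occurrences of the preceding "a".
regex_match abc 'a*' && regex_star=yes || regex_star=no
regex_match aaa 'a*' && regex_star_run=yes || regex_star_run=no

# ? as a wildcard: exactly one character must follow the "a".
glob_match a 'a?' && glob_qmark=yes || glob_qmark=no
# ? in a pattern: the preceding "b" is optional.
regex_match a 'ab?' && regex_qmark=yes || regex_qmark=no

echo "wildcard a*  matches abc: $glob_star"
echo "pattern  a*  matches abc: $regex_star"
echo "pattern  a*  matches aaa: $regex_star_run"
echo "wildcard a?  matches a:   $glob_qmark"
echo "pattern  ab? matches a:   $regex_qmark"
```

The regular expression equivalent of the wildcard * is .* (any character, zero or more times), and the equivalent of the wildcard ? is a single period.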
Copyright © 2001 O'Reilly & Associates. All rights reserved.