UNIX Power Tools

Chapter 26: Regular Expressions (Pattern Matching)
 

26.10 Pattern Matching Quick Reference with Examples

Article 26.4 gives a tutorial introduction to regular expressions. This article is intended for those of you who just need a quick listing of regular expression syntax as a refresher from time to time. It also includes some simple examples. The characters in Table 26.6 have special meaning only in search patterns.

Table 26.6: Special Characters in Search Patterns
Pattern What Does it Match?
.

Match any single character except newline.

*

Match any number (or none) of the single character that immediately precedes it. The preceding character can also be a regular expression. For example, since . (dot) means any character, .* means "match any number of any character."

^

Match the following regular expression at the beginning of the line.

$

Match the preceding regular expression at the end of the line.

[ ]

Match any one of the enclosed characters.

A hyphen (-) indicates a range of consecutive characters. A caret (^) as the first character in the brackets reverses the sense: it matches any one character not in the list. A hyphen or a right square bracket (]) as the first character is treated as a member of the list. All other metacharacters are treated as members of the list.

\{n,m\}

Match a range of occurrences of the single character that immediately precedes it. The preceding character can also be a regular expression. \{n\} will match exactly n occurrences; \{n,\} will match at least n occurrences; and \{n,m\} will match any number of occurrences between n and m.

\

Turn off the special meaning of the character that follows.

\( \)

Save the pattern enclosed between \( and \) into a special holding space. Up to nine patterns can be saved on a single line. They can be "replayed" in substitutions by the escape sequences \1 to \9.

\< \>

Match characters at beginning (\<) or end (\>) of a word.

+

Match one or more instances of preceding regular expression.

?

Match zero or one instances of preceding regular expression.

|

Match either the regular expression before it or the one after it.

( )

Group the enclosed regular expressions so that an operator such as | or ? applies to the whole group.
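
Several of the characters in Table 26.6 can be tried out directly with grep. The following is a minimal sketch, assuming a POSIX shell and grep; the sample words and the variable names are invented for illustration and do not come from the article.

```shell
#!/bin/sh
# Sample input invented for this sketch: one word per line.
words='bag
bug
big
Bag
baag
bg
b.g'

# [aeiou] matches exactly one of the enclosed characters.
vowels=$(printf '%s\n' "$words" | grep '^b[aeiou]g$')

# A caret as the first character in the brackets reverses the sense.
not_vowels=$(printf '%s\n' "$words" | grep '^b[^aeiou]g$')

# * matches any number (or none) of the preceding character.
star=$(printf '%s\n' "$words" | grep '^ba*g$')

# \{2\} matches exactly two occurrences of the preceding character.
twice=$(printf '%s\n' "$words" | grep '^ba\{2\}g$')

# A backslash turns off the special meaning of the dot.
literal_dot=$(printf '%s\n' "$words" | grep '^b\.g$')

printf '%s\n' "$vowels" "$not_vowels" "$star" "$twice" "$literal_dot"
```

Each pattern is anchored with ^ and $ so that it has to account for the whole line, which makes it easier to see what a single metacharacter contributes.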

The characters in Table 26.7 have special meaning only in replacement patterns.

Table 26.7: Special Characters in Replacement Patterns
Pattern What Does it Do?
\

Turn off the special meaning of the character that follows.

\n

Restore the nth pattern previously saved by \( and \). n is a number from 1 to 9, with 1 starting on the left.

&

Re-use the search pattern as part of the replacement pattern.

~

Re-use the previous replacement pattern in the current replacement pattern.

\u

Convert first character of replacement pattern to uppercase.

\U

Convert replacement pattern to uppercase.

\l

Convert first character of replacement pattern to lowercase.

\L

Convert replacement pattern to lowercase.
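
The replacement characters \, \n, and & from Table 26.7 can be seen at work in a few sed substitutions. This is a sketch only, assuming a POSIX shell and sed; the input strings and variable names are made up for the example. The case-conversion escapes and ~ are left out here because not every sed supports them.

```shell
#!/bin/sh
# & stands for the whole text matched by the search pattern.
wrapped=$(printf 'chapter1\n' | sed 's/.*/mv & &.old/')

# \1 and \2 replay the first and second \( \) groups, counted from the left.
swapped=$(printf 'last,first\n' | sed 's/\([a-z]*\),\([a-z]*\)/\2 \1/')

# A backslash before & turns off its special meaning in the replacement.
literal=$(printf 'AT and T\n' | sed 's/ and /\&/')

printf '%s\n' "$wrapped" "$swapped" "$literal"
```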

26.10.1 Examples of Searching

When used with grep or egrep, regular expressions are surrounded by quotes. (If the pattern contains a $, you must use single quotes; e.g., 'pattern'.) When used with ed, ex, sed, and awk, regular expressions are usually surrounded by / (although any delimiter works). Table 26.8 has some example patterns.

Table 26.8: Search Pattern Examples
Pattern                           What Does it Match?
bag                               The string bag.
^bag                              bag at beginning of line.
bag$                              bag at end of line.
^bag$                             bag as the only word on line.
[Bb]ag                            Bag or bag.
b[aeiou]g                         Second letter is a vowel.
b[^aeiou]g                        Second letter is a consonant (or uppercase or symbol).
b.g                               Second letter is any character.
^...$                             Any line containing exactly three characters.
^\.                               Any line that begins with a . (dot).
^\.[a-z][a-z]                     Same, followed by two lowercase letters (e.g., troff requests).
^\.[a-z]\{2\}                     Same as previous, grep or sed only.
^[^.]                             Any line that doesn't begin with a . (dot).
bugs*                             bug, bugs, bugss, etc.
"word"                            A word in quotes.
"*word"*                          A word, with or without quotes.
[A-Z][A-Z]*                       One or more uppercase letters.
[A-Z]+                            Same, egrep or awk only.
[A-Z].*                           An uppercase letter, followed by zero or more characters.
[A-Z]*                            Zero or more uppercase letters.
[a-zA-Z]                          Any letter.
[^0-9A-Za-z]                      Any symbol (not a letter or a number).
[567]                             One of the numbers 5, 6, or 7.
egrep or awk pattern:
five|six|seven                    One of the words five, six, or seven.
80[23]?86                         One of the numbers 8086, 80286, or 80386.
compan(y|ies)                     One of the words company or companies.
ex or vi pattern:
\<the                             Words like theater or the.
the\>                             Words like breathe or the.
\<the\>                           The word the.
sed or grep pattern:
0\{5,\}                           Five or more zeros in a row.
[0-9]\{3\}-[0-9]\{2\}-[0-9]\{4\}  US social security number (nnn-nn-nnnn).
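
A few rows of Table 26.8 can be checked from the command line. The sketch below assumes a POSIX shell and uses grep -E as the stand-in for egrep; the sample input lines and variable names are invented for illustration.

```shell
#!/bin/sh
# Sample lines invented for this sketch.
chips='8086
80286
80386
80486'

# ? makes the preceding [23] optional (extended syntax, hence grep -E).
intel=$(printf '%s\n' "$chips" | grep -E '^80[23]?86$')

# ( ) groups the alternatives separated by | .
firms=$(printf 'company\ncompanies\ncompanion\n' | grep -E 'compan(y|ies)')

# Basic syntax: \{n\} counts repetitions of the preceding bracket expression.
# Single quotes keep the shell from interpreting $ and the backslashes.
ssn=$(printf '123-45-6789\n12-345-6789\n' |
    grep '^[0-9]\{3\}-[0-9]\{2\}-[0-9]\{4\}$')

printf '%s\n' "$intel" "$firms" "$ssn"
```

The patterns are single-quoted throughout, following the quoting advice at the start of this section.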

26.10.2 Examples of Searching and Replacing

The following examples show the metacharacters available to sed or ex. (ex commands begin with a colon.) Spaces in the commands are shown literally; a TAB is marked by tab.

Table 26.9: Search and Replace Commands
Command                               Result
s/.*/( & )/                           Redo the entire line, but add parentheses.
s/.*/mv & &.old/                      Change a wordlist into mv commands.
/^$/d                                 Delete blank lines.
:g/^$/d                               ex version of previous.
/^[ tab]*$/d                          Delete blank lines, plus lines containing only spaces or TABs.
:g/^[ tab]*$/d                        ex version of previous.
s/  */ /g                             Turn one or more spaces into one space.
:%s/  */ /g                           ex version of previous.
:s/[0-9]/Item &:/                     Turn a number into an item label (on the current line).
:s                                    Repeat the substitution on the first occurrence.
:&                                    Same.
:sg                                   Same, but for all occurrences on the line.
:&g                                   Same.
:%&g                                  Repeat the substitution globally.
:.,$s/Fortran/\U&/g                   Change word to uppercase, on current line to last line.
:%s/.*/\L&/                           Lowercase entire file.
:s/\<./\u&/g                          Uppercase first letter of each word on current line (useful for titles).
:%s/yes/No/g                          Globally change a word to No.
:%s/Yes/~/g                           Globally change a different word to No (previous replacement).
s/die or do/do or die/                Transpose words.
s/\([Dd]ie\) or \([Dd]o\)/\2 or \1/   Transpose, using hold buffers to preserve case.
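
Some of the sed commands in Table 26.9 can be run as one-liners. The following is a rough sketch, assuming a POSIX shell and sed; the input text and variable names are invented, and the TAB is passed in through a shell variable (an assumption of this example) because typing a literal TAB on a command line is awkward.

```shell
#!/bin/sh
# Hold a literal TAB character in a variable for use inside the brackets.
tab=$(printf '\t')

# Delete empty lines.
no_blank=$(printf 'one\n\ntwo\n' | sed '/^$/d')

# Delete lines that are empty or contain only spaces and TABs.
no_white=$(printf 'one\n \t \ntwo\n' | sed "/^[ ${tab}]*\$/d")

# Turn each run of one or more spaces into a single space.
squeezed=$(printf 'a   b  c\n' | sed 's/  */ /g')

# Transpose the two words; each saved word keeps its own capitalization.
swapped=$(printf 'Die or do\n' | sed 's/\([Dd]ie\) or \([Dd]o\)/\2 or \1/')

printf '%s\n' "$no_blank" "$no_white" "$squeezed" "$swapped"
```

The second command is double-quoted so that the shell expands the TAB variable, which is why the $ anchor has to be written as \$ there.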

- DG from O'Reilly & Associates' UNIX in a Nutshell (SVR4/Solaris)

