

Webmaster in a Nutshell

Chapter 15
Perl Quick Reference

15.13 Regular Expressions

Each character matches itself, unless it is one of the special characters + ? . * ^ $ ( ) [ ] { } | \. To match one of these characters literally, precede it with a \.
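
As a rough illustration of escaping, the sketch below compares an unescaped and an escaped period. The sample strings and variable names are made up for this example, and quotemeta() is a Perl built-in that this section does not otherwise cover.

```perl
use strict;
use warnings;

# Hypothetical sample text, chosen only for illustration.
my $price = 'Total: $3.50 (approx.)';

# Unescaped, "." matches any character, so the pattern also accepts "3x50".
my $loose = ('3x50' =~ /3.50/) ? 1 : 0;

# Escaped, "\." matches only a literal period.
my $strict = ('3x50' =~ /3\.50/) ? 1 : 0;

# "\$", "\(" and "\)" match the literal characters.
my $literal = ($price =~ /\$3\.50 \(approx\.\)/) ? 1 : 0;

# quotemeta() puts a backslash in front of every non-word character.
my $escaped = quotemeta('a+b');

print "$loose $strict $literal $escaped\n";
```

quotemeta() is convenient when the text to match comes from a variable rather than being typed into the pattern by hand.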

.

Matches any character except a newline. In single-line mode (see m//s) it also matches a newline.

(...)

Groups a series of pattern elements to a single element.

^

Matches the beginning of the target. In multiline mode (see m//m) also matches after every newline character.

$

Matches the end of the target, or just before a newline at the end of the target. In multiline mode also matches before every newline character.

[...]

Denotes a class of characters to match. [^...] negates the class.

(... | ... | ...)

Matches one of the alternatives.

(?# text)

Comment.

(?: regexp)

Like (regexp) but does not make back-references.

(?= regexp)

Zero width positive look-ahead assertion.

(?! regexp)

Zero width negative look-ahead assertion.

(? modifier)

Embedded pattern-match modifier. modifier can be one or more of i, m, s, or x.
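
The pattern elements listed above can be tried out with a short script. This is only a sketch: the sample strings and variable names are invented for illustration.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";

# "." stops at a newline unless the s modifier is given.
my ($no_s)   = $text =~ /(first.*)/;    # stops before the first newline
my ($with_s) = $text =~ /(first.*)/s;   # runs to the end of the string

# With the m modifier, "^" also matches after each embedded newline.
my @starts = $text =~ /^(\w+)/mg;

# A character class and a negated class.
my ($vowels)     = 'street' =~ /([aeiou]+)/;
my ($non_digits) = 'abc123' =~ /([^0-9]+)/;

# Alternation inside a non-capturing group: only (\d+) is captured.
my ($count) = '3 cats' =~ /(\d+) (?:cats|dogs)/;

# Look-ahead assertions test what follows without consuming it.
my ($before_bar) = 'foobar' =~ /(foo)(?=bar)/;
my $not_bar      = ('foobar' =~ /foo(?!bar)/) ? 1 : 0;

# Embedded modifier: (?i) switches on case-insensitive matching.
my $ci = ('HELLO' =~ /(?i)hello/) ? 1 : 0;

print join('|', $no_s, "@starts", $vowels, $non_digits,
                $count, $before_bar, $not_bar, $ci), "\n";
```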

Quantified subpatterns match as many times as possible. When followed by a ?, they match the minimum number of times. These are the quantifiers:

+

Matches the preceding pattern element one or more times.

?

Matches zero or one times.

*

Matches zero or more times.

{n,m}

Denotes the minimum n and maximum m match count. {n} means exactly n times; {n,} means at least n times.
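
The difference between maximal and minimal matching is easiest to see side by side. The following sketch uses made-up sample strings; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $html = '<b>bold</b> and <i>italic</i>';

# ".+" takes as much as it can, up to the last ">" in the string.
my ($greedy) = $html =~ /<(.+)>/;

# ".+?" takes as little as it can, up to the first ">".
my ($minimal) = $html =~ /<(.+?)>/;

# {n} matches exactly n times.
my ($year) = 'built in 1999' =~ /(\d{4})/;

# {n,m} prefers the maximum; {n,m}? prefers the minimum.
my ($max_run) = 'aaaaa' =~ /(a{2,3})/;
my ($min_run) = 'aaaaa' =~ /(a{2,3}?)/;

# "?" makes the preceding element optional.
my @spellings = grep { /^colou?r$/ } qw(color colour colouur);

print "$greedy\n$minimal\n$year $max_run $min_run @spellings\n";
```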

A \ escapes any special meaning of the following character if it is non-alphanumeric, but it turns most alphanumeric characters into something special:

\w

Matches an alphanumeric character, including _. \W matches a non-alphanumeric character.

\s

Matches whitespace. \S matches non-whitespace.

\d

Matches a digit. \D matches a non-digit.

\A

Matches the beginning of the string. \Z matches the end of the string, or just before a newline at the end. Neither is affected by multiline mode.

\b

Matches word boundaries. \B matches non-boundaries.

\G

Matches where the previous m//g search left off.

\n, \r, \f, \t, etc.

Have their usual meaning.

\w, \s, and \d

May be used within character classes. \b denotes a backspace in this context.
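
A few of the backslash sequences above in use. This is a minimal sketch with invented sample data, not an exhaustive demonstration.

```perl
use strict;
use warnings;

my $line = "id_42:\tJohn  Smith 2024";

my @words   = $line =~ /(\w+)/g;    # runs of word characters, "_" included
my @fields  = split /\s+/, $line;   # split on any run of whitespace
my @numbers = $line =~ /(\d+)/g;    # runs of digits

# \b only matches between a word character and a non-word character.
my $inside = ('concatenate' =~ /\bcat\b/) ? 1 : 0;
my $alone  = ('the cat sat' =~ /\bcat\b/) ? 1 : 0;

# \A ignores the m modifier, while ^ does not.
my $a_anchor = ("abc\ndef" =~ /\Adef/m) ? 1 : 0;
my $caret    = ("abc\ndef" =~ /^def/m)  ? 1 : 0;

# \G pins each m//g match to the point where the previous one ended,
# so the loop stops at the "b" instead of skipping over it.
my $input = 'aaba';
my @ends;
push @ends, pos($input) while $input =~ /\Ga/g;

# \d and \s inside a class; [\b] is a backspace, not a word boundary.
my ($mix)     = 'a1 2b' =~ /([\d\s]+)/;
my $backspace = ("\x08" =~ /[\b]/) ? 1 : 0;

print "@words | @fields | @numbers | @ends | $mix\n";
```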

Back-references:

\1...\9

Refer to matched subexpressions, grouped with ( ), inside the match.

\10 and up

Can also be used if the pattern contains that many subexpressions.

See also $1...$9, $+, $&, $`, and $' in Special Variables.
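
Back-references inside the pattern and the related special variables after the match might be used as follows. The sample strings and variable names are assumptions made for this sketch.

```perl
use strict;
use warnings;

# \1 inside the pattern must match the same text that group 1 matched.
my ($repeat) = 'this is is a test' =~ /\b(\w+) \1\b/;

# A closing quote that has to be the same character as the opening quote.
my ($quote, $inner) = q{say "hi" now} =~ /(["'])(.*?)\1/;

# After a successful match the captured text is available outside the pattern.
my ($key, $value, $last, $matched, $before, $after);
if ('set name=value; rest' =~ /(\w+)=(\w+)/) {
    ($key, $value) = ($1, $2);
    $last    = $+;    # text of the last group that matched
    $matched = $&;    # the whole matched text
    $before  = $`;    # everything before the match
    $after   = $';    # everything after the match
}

print "$repeat $inner $key $value $last [$matched] [$before] [$after]\n";
```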

With modifier x, whitespace in the pattern is ignored, so it can be used to lay out the pattern for readability.
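
As a sketch of the x modifier, the pattern below is spread over several lines. The date format and variable names are made up for this example; the # comments inside the pattern are a further feature of x that this section does not describe.

```perl
use strict;
use warnings;

# Whitespace inside the pattern is ignored under x, and "#" starts a comment.
my $date_re = qr/
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
/x;

my ($year, $month, $day) = '2001-02-03' =~ $date_re;

# The spaces in the pattern are not matched against the target.
my $ignored = ('ab' =~ / a b /x) ? 1 : 0;

# To match a real space under x, put it in a character class.
my $space = ('a b' =~ / a [ ] b /x) ? 1 : 0;

print "$year $month $day $ignored $space\n";
```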

