Mastering Regular Expressions, Second Edition
Full Index -- use your browser's find function to search.
\? 139 \<...\> 21,
25, 50, 131-132, 150 \<...\>,
in egrep 15 \<...\>, in
Emacs 100 \<...\>, mimicking in
Perl 341-342 \+ 139 \(...\) 135 `\+' history 87 \0 116-117 \1 136, 300, 303 \1, in
Perl 41 \A 111, 127-128 \A, in
Java 373 \A,
optimization 246 \a 114-115 \b 65, 114-115, 400 \b, backspace and word
boundary 44, 46 \b, in
Perl 286 \b\B 240 \C 328 \D 49, 119 \d 49, 119 \d, in
Perl 288 \e 79, 114-115 \E 290 \f 114-115 \f, introduced 44 \G 128-131, 212, 315-316,
362 \G, advanced
example 130 \G, in
Java 373 \G,
in .NET 402 \G, optimization 246 \kname (see
named capture) \l
290 \L...\E
290 \L...\E, inhibiting 292 \n 49, 114-115 \n, introduced 44 \n, machine-dependency 114 \N{LATIN SMALL LETTER SHARP
S} 290 \N{name}
290 \N{name}, inhibiting 292 \p{...} 119 \p{^...}
288 \p{all}
380 \p{All}
123 \p{All}, in
Perl 288 \p{Any} 123 \p{Any}, in
Perl 288 \p{Arrows} 122 \p{Assigned} 123-124 \p{Assigned}, in
Perl 288 \p{Basic_Latin} 122 \p{Box_Drawing} 122 \p{C} 120 \p{Cc} 121 \p{Cf} 121 \p{Cherokee} 120 \p{Close_Punctuation}
121 \p{Cn} 121,
123-124, 380, 401 \p{Co} 121 \p{Connector_Punctuation}
121 \p{Control}
121 \p{Currency}
122 \p{Currency_Symbol}
121 \p{Cyrillic}
120, 122 \p{Dash_Punctuation}
121 \p{Decimal_Digit_Number}
121 \p{Dingbats}
122 \p{Enclosing_Mark}
121 \p{Final_Punctuation}
121 \p{Format}
121 \p{Gujarati}
120 \p{Han}
120 \p{Hangul_Jamo}
122 \p{Hebrew} 120,
122 \p{Hiragana}
120 \p{InArrows}
122 \p{InBasic_Latin}
122 \p{InBox_Drawing}
122 \p{InCurrency}
122 \p{InCyrillic}
122 \p{InDingbats}
122 \p{InHangul_Jamo}
122 \p{InHebrew}
122 \p{Inherited}
122 \p{Initial_Punctuation}
121 \p{InKatakana}
122 \p{InTamil}
122 \p{InTibetan}
122 \p{IsCherokee}
120 \p{IsCommon}
122 \p{IsCyrillic}
120 \p{IsGujarati}
120 \p{IsHan}
120 \p{IsHebrew}
120 \p{IsHiragana}
120 \p{IsKatakana}
120 \p{IsLatin}
120 \p{IsThai}
120 \p{IsTibetan}
122 \p{Katakana}
120, 122 \p{L}
119-120, 131, 380, 390 \p{L&} 120-121,
123 \p{L&}, in
Perl 288 \p{Latin} 120 \p{Letter} 120, 288 \p{Letter_Number}
121 \p{Line_Separator}
121 \p{Ll} 121,
400 \p{Lm} 121,
400 \p{Lo} 121,
400 \p{Lowercase_Letter}
121 \p{Lt} 121,
400 \p{Lu} 121,
400 \p{M} 120,
125 \p{Mark}
120 \p{Math_Symbol}
121 \p{Mc}
121 \p{Me}
121 \p{Mn}
121 \p{Modifier_Letter}
121 \p{Modifier_Symbol}
121 \p{N} 120,
390 \p{Nd} 121, 380,
400 \p{Nl}
121 \p{No}
121 \p{Non_Spacing_Mark}
121 \p{Number}
120 \p{Open_Punctuation}
121 \p{Other}
120 \p{Other_Letter}
121 \p{Other_Number}
121 \p{Other_Punctuation}
121 \p{Other_Symbol}
121 \p{P}
120 \p{Paragraph_Separator}
121 \p{Pc} 121,
400 \p{Pd}
121 \p{Pe}
121 \p{Pf} 121,
400 \p{Pi} 121,
400 \p{Po}
121 \p{Private_Use}
121 \p{Ps}
121 \p{Punctuation}
120 \p{S}
120 \p{Sc}
121-122 \p{Separator} 120 \p{Sk} 121 \p{Sm} 121 \p{So} 121 \p{Space_Separator}
121 \p{Spacing_Combining_Mark}
121 \p{Symbol}
120 \p{Tamil}
122 \p{Thai}
120 \p{Tibetan}
122 \p{Titlecase_Letter}
121 \p{Unassigned}
121, 123 \p{Unassigned}, in
Perl 288 \p{Uppercase_Letter}
121 \p{Z} 119-120,
380, 400 \p{Zl}
121 \p{Zp}
121 \p{Zs}
121 \Q...\E
290 \Q...\E, inhibiting 292 \Q...\E, in
Java 373 \r 49, 114-115 \r, machine-dependency 114 \s 49, 119 \s, introduction 47 \s, in
Emacs 127 \s,
in Perl 288 \S 49, 56, 119 \t 49, 114-115 \t, introduced 44 \u 116, 290, 400 \U 116 \U...\E 290 \U...\E, inhibiting 292 \V 364 \v 114-115, 364 \W 49, 119 \w 49, 65, 119 \w, in
Emacs 127 \w,
many different interpretations
93 \w, in
Perl 288 \x 116, 400 \x, in
Perl 286 \X 107, 125 \z 111, 127-128, 316 \z, in
Java 373 \z,
optimization 246 \Z 111, 127-128 \Z, in
Java 373 \Z,
optimization 246 // 322 /c 129-130, 315 /e 319-321 /g 61, 130, 307, 311-312,
315, 319 /g, introduced 51 /g, with regex
object 354 /i 134 /i, introduced 47 /i, with
study 359 /m 134 /o 352-353 /o, with regex
object 354 /osmosis 293 /s 134 /x 134, 288 /x, introduced 72 /x, history 90 -Dr 363 -i as -y 86 -y old grep
86 <>
54 <>, and
$_ 79 !~ 309 $_ 79, 308, 311, 314, 318,
322, 353-354, 359 $_, in
.NET 418 $& 299-300 $&, checking
for 358 $&, mimicking 302, 357 $&, naughty 356 $&, in
.NET 418 $&, okay for
debugging 331 $&, pre-match
copy 355 $$ in .NET 418 $* 362 $ 111-112, 128 $, escaping 77 $, optimization 246 $, Perl
interpolation 289 $+ 300-301, 345 $+, example 202 $+, .NET 202 $+, in
.NET 418 $/ 35, 78 $' 300 $', checking
for 358 $',
mimicking 357 $', naughty 356 $', in
.NET 418 $',
okay for debugging 331 $', pre-match
copy 355 $` 300 $`, checking
for 358 $`,
mimicking 357 $`, naughty 356 $`, in
.NET 418 $`,
okay for debugging 331 $`, pre-match
copy 355 $0 300 $1 135-136, 300, 303 $1, introduced 41 $1, in
Java 388 $1,
in .NET 418 $1, in other
languages 136 $1, pre-match
copy 355 $ARGV 79 $HostnameRegex 76, 136, 303,
351 $HttpUrl 303,
305, 345, 351 $LevelN 330, 343 $^N 300-301, 344-346 ${name}
403 ${name~}
418 $NestedStuffRegex 339,
346 $^R 302,
327 $^W 297 % Perl interpolation
289 (?!) 240, 333,
335, 340-341 (?#...) 99, 134,
414 (?#...), in
Java 373 (?:...) (see
non-capturing parentheses) (...) (see
parentheses) (?i) (see: case-insensitive
mode; mode modifier) (?i:...) (see
mode-modified span) (?if
then|else)
(see conditional) (?m:...) (see
mode-modified span) (?m) (see: enhanced
line-anchor mode; mode modifier) (?n) 402 .*, introduced 55 .*, mechanics of
matching 152 .*, optimization 246 .*, warning
about 56 .NET 399-432 .NET, $+ 202 .NET, flavor
overview 91 .NET,
after-match data 136 .NET, benchmarking 236 .NET, JIT
404 .NET, line
anchors 128 .NET,
literal-text mode 135 .NET, MSIL
404 .NET, object
model 411 .NET, regex approach 96-97 .NET, regex
flavor 401 .NET, search-and-replace 408,
417-418 .NET, URL parsing
example 204 .NET,
version covered 91 .NET, word
boundaries 132 =~ 308-309, 318 =~, introduced 38 ? (see question
mark) ?...?
308 @+ 300, 302,
314 @"..."
102 @- 300, 302,
339 @ Perl
interpolation 289 [=...=] 126 [:<:] 92 [:...:]
125-126 [.....] 126 \p{...} in java.util.regex
380 ^ 111-112,
128 ^, optimization 245-246 ^Subject: example 94,
151-152, 154, 242, 244-245, 289 ^Subject:
example, in Java 95,
393 ^Subject: example, in
Perl 55 ^Subject:
example, in Perl
debugger 361 ^Subject: example, in
Python 97 ^Subject:
example, in VB.NET
96 {min,max}
20, 140 \0
116-117 $0
300 \1 136, 300,
303 \1, in
Perl 41 $1 135-136, 300, 303 $1, introduced 41 $1, in
Java 388 $1,
in .NET 418 $1, in other
languages 136 $1, pre-match
copy 355 8859-1
encoding 29, 87, 105, 107, 121 \A 111, 127-128 \A, in
Java 373 \A,
optimization 246 @ escaping 77 \a 114-115 issues overview encoding 105 after-match variables, in
Perl 299 after-match
variables, pre-match
copy 355 Aho,
Alfred 86, 180 \p{All} 123 \p{All}, in
Perl 288 \p{all} 380 all-in-one object model 369 alternation 138 alternation, and
backtracking 231 alternation, introduced 13-14 alternation, efficiency 222, 231 alternation, greedy 174-175 alternation, hand
tweaking 260-261 alternation, order
of 175-177, 223, 260 alternation, order of, for correctness 28, 189,
197 alternation, order of,
for efficiency 224 alternation, and
parentheses 13 analogy, backtracking, bread crumbs 158-159 analogy, backtracking, stacking dishes 159 analogy, ball
rolling 261 analogy,
building a car 31 analogy, charging
batteries 179 analogy,
engines 143-147 analogy, first come, first
served 153 analogy,
gas additive 150 analogy, learning regexes, Pascal 36 analogy, learning regexes, playing rummy 33 analogy, regex as a
language 5, 27 analogy, regex as filename
patterns 4 analogy,
regex-directed match (see
NFA) analogy, text-directed
match (see DFA) analogy, transmission 148-149, 228 analogy, transparencies (Perl's
local) 298 anchor
(also see: word boundaries; enhanced line-anchor mode) anchor, overview 127 anchor, caret
127 anchor, dollar 127 anchor, end-of-line
optimization 246 anchor, exposing 255 anchor, line
87, 111-112, 150 anchored(...)
362 anchored
`string' 362 AND class set operations
123-124 ANSI escape sequences
79 \p{Any}
123 \p{Any}, in
Perl 288 any character (see dot) Apache, org.apache.xerces.utils.regex
372 Apache, ORO 392-398 Apache, ORO, benchmark results 376 Apache, ORO, comparative description 374 Apache, Regexp, comparative description 375 Apache, Regexp, speed 376 appendReplacement()
388 appendTail()
389 $ARGV
79 \p{Arrows}
122 ASCII encoding 29,
105-106, 114, 121 Asian character
encoding 29 AssemblyName 429 \p{Assigned} 123-124 \p{Assigned}, in
Perl 288 asterisk (see star) atomic grouping (also see possessive
quantifiers) atomic grouping, introduced 137-138 atomic grouping, details 170-172 atomic grouping, for
efficiency 171-172, 259, 268-270 atomic grouping, essence 170-171 atomic grouping, example 198, 201, 213, 271, 330,
340-341, 346 AT&T Bell
Labs 86 auto-lookaheadification 403 automatic possessification
251 awk, after-match
data 136 awk, gensub 183 awk, history
87 awk, search-and-replace 99 awk, version
covered 91 awk, word boundaries 132 \b 65, 114-115, 400 \b, backspace and word
boundary 44, 46 \b, in
Perl 286 <B>...</B>
165-167 <B>...</B>,
unrolling 270 \b\B 240 backreferences 117, 135 backreferences, introduced with
egrep 20-22 backreferences, DFA 150, 182-183 backreferences, vs. octal
escape 406-407 backreferences, remembering
text 21 backspace (see \b) backtracking 163-177 backtracking, introduction 157-163 backtracking, and
alternation 231 backtracking, avoiding 171-172 backtracking, computing
count 227 backtracking, counting 222, 224 backtracking, detecting
excessive 249-250 backtracking, efficiency 179-180 backtracking, essence 168-169 backtracking, exponential
match 226 backtracking, global
view 228-232 backtracking, LIFO 159 backtracking, of
lookaround 173-174 backtracking, neverending
match 226 backtracking, non-match
example 160-161 backtracking, POSIX NFA
example 229 backtracking, saved
states 159 backtracking, simple
example 160 backtracking, simple lazy
example 161 balanced
constructs 328-331, 340-341, 430 balancing regex issues 186 Balling, Derek xxii Barwise, J. 85 base
character 107, 125 Basic
Regular Expressions 87-88 \p{Basic_Latin} 122 \b\B 240 beginOffset 396 benchmarking 232-239 benchmarking, comparative 248, 376-377 benchmarking, compile
caching 351 benchmarking, in
Java 234-236, 375-377 benchmarking, for naughty
variables 358 benchmarking, in
.NET 236, 404 benchmarking, with neverending
match 227 benchmarking, in
Perl 360 benchmarking,
pre-match copy 356 benchmarking, in
Python 237 benchmarking, in
Ruby 238 benchmarking,
in Tcl 239 Bennett, Mike xxi Berkeley 86 Better-Late-Than-Never 234-236,
375 <B>...</B>
165-167 <B>...</B>,
unrolling 270 blocks 122, 288, 380, 400 BLTN 235-236, 375 BOL 362 \p{Box_Drawing} 122 Boyer-Moore 244, 247 bracket expressions 125 BRE 87-88 bread-crumb analogy 158-159 Bulletin of Math. Biophysics
85 bump-along, introduction 148-149 bump-along, avoiding 210 bump-along, distrusting 215-218 bump-along, optimization 255 bump-along, in overall
processing 241 byte matching 328 /c 129-130, 315 /c, strings 102 \p{C} 120 \C 328 ¢ 122 C
comments, matching
272-276 C comments, unrolling 275-276 caching (also see regex
objects) caching, benchmarking 351 caching, compile
242-244 caching, in
Emacs 244 caching,
integrated 242 caching, in
Java 393 caching,
in .NET 426 caching, object-oriented 244 caching, procedural 243 caching, in Tcl
244 caching, unconditional 350 CANON_EQ (Pattern
flag) 108, 380 Capture 431 CaptureCollection
432 car analogy
83-84 caret anchor introduced
8 carriage return
109 case title 109 case folding 290, 292 case folding, inhibiting 292 CASE_INSENSITIVE (Pattern
flag) 95, 109, 380, 383 case-insensitive mode 109 case-insensitive mode, introduced 14-15 case-insensitive mode, egrep 14-15 case-insensitive mode, /i 47 case-insensitive mode, Ruby 109 case-insensitive mode, with
study 359 cast 294-295 \p{Cc} 121 \p{Cf} 121 character, base 125 character, classes 117 character, combining 107, 125, 288 character, combining, Inherited script
122 character, vs. combining
characters 107 character, control 116 character, initial character
discrimination 244-246, 249, 251-252, 257-259,
332, 361 character, machine-dependent
codes 114 character,
multiple code points
107 character, as opposed to
byte 29 character,
separating with split
322 character, shorthands 114-115 character class, introduced 9-10 character class, vs.
alternation 13 character
class, mechanics of
matching 149 character
class, negated, must match
character 11-12 character
class, negated, and
newline 118 character
class, negated, Tcl 111 character
class, positive
assertion 118 character
class, of POSIX bracket
expression 125 character
class, range 9,
118 character class, as separate
language 10 character
equivalent 126 CharacterIterator
372 charnames pragma
290 CharSequence
372, 390 CheckNaughtiness 358 \p{Cherokee} 120 Chinese text processing 29 chr 414 chunk limit, Java
ORO 395 chunk limit,
java.util.regex
391 chunk limit, Perl 323 class, vs. dot
118 class, elimination
optimization 249 class, initial class
discrimination 244-246, 249, 251-252, 257-259,
332, 361 class, and lazy
quantifiers 167 class,
set operations 123-125,
375 class, subtraction 124 Clemens, Sam 375 Click, Cliff xxii client VM 234, 236 clock clicks 239 \p{Close_Punctuation}
121 closures 339 \p{Cn} 121, 123-124, 380,
401 \p{Co}
121 code point, introduced 106 code point, beyond
U+FFFF 108 code
point, multiple
107 code point, unassigned in
block 122 coerce 294-295 cold
VM 235 collating
sequences 126 combining
character 107, 125, 288 combining character, Inherited
script 122 com.ibm.regex, comparative
description 372 com.ibm.regex, speed 377 commafying a number example
64-65 commafying a number example, introduced 59 commafying a number example, in
Java 393 commafying a number
example, without
lookbehind 67 COMMAND.COM 7 comments 99, 134 comments, in
Java 98 comments,
matching of C comments
272-276 comments, matching of Pascal
comments 265 comments,
in .NET regex 414 COMMENTS (Pattern
flag) 99, 218, 378, 380, 386 comments and free-spacing mode
110 Communications of the
ACM 85 compile() 383 compile, caching 242-244 compile, once
(/o) 352-353 compile, on-demand 351 compile, regex
404-405 compile() (Pattern
factory) 383 Compiled (.NET) 236, 402,
404, 414, 421-422, 429 Compilers -- Principles,
Techniques, and Tools 180 CompileToAssembly 427,
429 com.stevesoft.pat, comparative description 374 com.stevesoft.pat, speed 377 conditional 138-139 conditional, with embedded
regex 327, 335 conditional, in
Java 373 conditional,
mimicking with lookaround
139 conditional, in
.NET 403 Config
module 290, 299 conflicting
metacharacters 44-46 \p{Connector_Punctuation}
121 Constable, Robert
85 context, forcing 310 context, metacharacters 44-46 context, regex
use 189 continuation
lines 178, 186-187 continuation lines, unrolling 270 contorting an expression
294-295 \p{Control}
121 control characters
116 Conway, Damian
339 cooking for HTML 68,
408 correctness vs.
efficiency 223-224 www.cpan.org 358 CR 109, 382 Cruise,
Tom 51 crummy
analogy 158-159 CSV parsing
example, java.util.regex 218,
386 CSV parsing example, .NET 429 CSV
parsing example, ORO
397 CSV parsing example, Perl 212-219 CSV
parsing example, unrolling 271 currency, \p{Currency} 122 currency, \p{Currency_Symbol}
121 currency, \p{Sc} 121 currency, Unicode
block 121-122 \p{Currency} 122 \p{Currency_Symbol}
121 currentTimeMillis()
236 \p{Cyrillic}
120, 122 \d 49,
119 \d, in
Perl 288 \D 49, 119 Darth 197 dash in
character class 9 \p{Dash_Punctuation}
121 DBIx::DWIW
258 debugcolor
363 debugging
361-363 debugging, with embedded
code 331-332 debugging, regex
objects 305-306 debugging, run-time 362 \p{Decimal_Digit_Number}
121 default regex
308 define-key
100 delegate 417-418 delimited text 196-198 delimited text, standard
formula 196, 273 delimiter, with
shell 7 delimiter,
with substitution 319 Deterministic Finite Automaton (see
DFA) Devel::FindAmpersand
358 Devel::SawAmpersand
358 DFA, introduced 145, 155 DFA, acronym spelled
out 156 DFA, backreferences 150, 182-183 DFA, boring
157 DFA, compared with
NFA 224, 227 DFA,
efficiency 179 DFA, implementation
ease 182 DFA, lazy evaluation 181 DFA, longest-leftmost
match 177-179 DFA,
testing for 146-147 DFA, in theory, same as an
NFA 180 dialytika 108 \p{Dingbats} 122 dish-stacking analogy 159 dollar for Perl variable 37 dollar anchor 127 dollar anchor, introduced 8 dollar value example 24-25, 51-52,
167-170, 175, 194-195 DOS
7 dot 118 dot, introduced 11-12 dot, vs. character
class 118 dot, mechanics of matching 149 dot, Tcl
112 .NET 399-432 .NET, $+ 202 .NET, flavor
overview 91 .NET,
after-match data 136 .NET, benchmarking 236 .NET, JIT
404 .NET, line
anchors 128 .NET,
literal-text mode 135 .NET, MSIL
404 .NET, object
model 411 .NET, regex approach 96-97 .NET, regex
flavor 401 .NET, search-and-replace 408,
417-418 .NET, URL parsing
example 204 .NET,
version covered 91 .NET, word
boundaries 132 DOTALL (Pattern
flag) 380, 382 dot-matches-all mode 110-111 doubled-word example, description 1 doubled-word example, in
egrep 22 doubled-word
example, in Emacs
100 doubled-word example, in
Java 81 doubled-word
example, in Perl 35,
77-80 double-quoted string example, allowing escaped quotes 196 double-quoted string example, egrep 24 double-quoted string example, final
regex 263 double-quoted
string example, makudonarudo 165, 169,
228-232, 264 double-quoted string example, sobering example 222-228 double-quoted string example, unrolled 262, 268 double-word finder example, description 1 double-word finder example, in
egrep 22 double-word
finder example, in Emacs
100 double-word finder example, in
Java 81 double-word finder
example, in Perl 35,
77-80 -Dr
363 dragon book 180 DWIW (DBIx)
258 dynamic regex
327-331 dynamic regex, sanitizing 337 dynamic scope 295-299 dynamic scope, vs. lexical
scope 299 /e 319-321 \e 79, 114-115 \E 290 earliest match wins 148-149 EBCDIC 29 ECMAScript (.NET) 400, 402,
406-407, 415, 421 ed
85 efficiency, and
backtracking 179-180 efficiency, correctness 223-224 efficiency, Perl-specific
issues 347-363 efficiency, regex
objects 353-354 efficiency, unlimited
lookbehind 133 egrep, flavor
overview 91 egrep, introduced 6-8 egrep, metacharacter
discussion 8-22 egrep, after-match
data 136 egrep,
backreference support
150 egrep, case-insensitive
match 15 egrep,
doubled-word solution
22 egrep, example
use 14 egrep,
flavor summary 32 egrep, history 86-87 egrep, regex
implementation 182 egrep, version
covered 91 egrep, word
boundaries 132 electric
engine analogy 143-147 Emacs, flavor
overview 91 Emacs,
after-match data 136 Emacs, control
characters 116 Emacs,
re-search-forward
100 Emacs, search 100 Emacs, strings as
regexes 100 Emacs,
syntax class 127 Emacs, version
covered 91 Emacs,
word boundaries 132 email address example 70-73,
98 email address example, in
Java 98 email address
example, in VB.NET
99 embedded code, local 336 embedded code, my 338-339 embedded code, regex
construct 327, 331-335 embedded code, sanitizing 337 embedded string check optimization
247, 257 Embodiments of
Mind 85 Empty 426 \p{Enclosing_Mark}
121 encoding, introduced 29 encoding, issues
overview 105 encoding, ASCII 29, 105-106, 114, 121 encoding, Latin-1 29, 87, 105, 107,
121 encoding, UCS-2 106 encoding, UCS-4 106 encoding, UTF-16 106 encoding, UTF-8 106 end() 385 END block 358 endOffset 396 end-of-string anchor optimization
246 engine, introduced 27 engine, analogy 143-147 engine, hybrid
183, 239, 243 engine, implementation
ease 182 engine, testing type 146-147 engine, testing type, with neverending match 227 engine, type
comparison 156-157, 180-182 English module 357 English vs. regex 275 enhanced line-anchor mode
111-112 enhanced line-anchor mode, introduced 69 ERE 87-88 errata xxi Escape 427 escape, introduced 22 escape, term
defined 27 essence,
atomic grouping
170-171 essence, greediness,
laziness, and backtracking 168-169 essence, NFA (see
backtracking) eval
319 example, atomic
grouping 198, 201, 213, 271, 330, 340-341,
346 example, commafying a
number 64-65 example,
commafying a number, introduced 59 example, commafying a number, in Java 393 example, commafying a number, without lookbehind 67 example, CSV parsing, java.util.regex 218,
386 example, CSV parsing,
.NET 429 example, CSV parsing, ORO 397 example, CSV parsing, Perl 212-219 example, CSV parsing, unrolling 271 example, dollar
value 24-25, 51-52, 167-170, 175,
194-195 example, double-quoted
string, allowing escaped
quotes 196 example,
double-quoted string, egrep 24 example, double-quoted string, final regex 263 example, double-quoted string, makudonarudo 165, 169,
228-232, 264 example, double-quoted
string, sobering
example 222-228 example, double-quoted string, unrolled 262, 268 example, double-word finder, description 1 example, double-word finder, in egrep 22 example, double-word finder, in Emacs 100 example, double-word finder, in Java 81 example, double-word finder, in Perl 35, 77-80 example, email
address 70-73, 98 example, email address, in Java 98 example, email address, in VB.NET 99 example, filename 190-192 example, five
modifiers 316 example,
floating-point number
194 example, form
letter 50-51 example,
gr[ea]y 9 example, hostname 22, 73, 76, 98-99, 136-137,
203, 260, 267, 304, 306 example, hostname, egrep 25 example, hostname, Java 209 example, hostname, plucking from text 71-73,
205-208 example, hostname,
in a URL 74-77 example, hostname, validating 203-205 example, hostname, VB.NET 204 example, HTML, conversion from text 67-77 example, HTML, cooking 68, 408 example, HTML, encoding 408 example, HTML, <HR> 194 example, HTML, link 201-203 example, HTML, optional 139 example, HTML, paired
tags 165 example,
HTML, parsing 130, 315, 321 example, HTML, tag 9, 18-19, 26, 200-201, 326,
357 example, HTML, URL 74-77, 203, 205-208,
303 example, HTML, URL-encoding 320 example, IP 5,
187-189, 267, 311, 314, 348-349 example, Jeffs 61-64 example, lookahead 61-64 example, mail
processing 53-59 example, makudonarudo 165, 169,
228-232, 264 example, pathname 190-192 example, population 59 example, possessive
quantifiers 198, 201 example, postal
code 208-212 example,
regex overloading
341-345 example, stock
pricing 51-52, 167-168 example, stock pricing, with alternation 175 example, stock pricing, with atomic grouping 170 example, stock pricing, with possessive quantifier
169 example, temperature
conversion, in .NET
419 example, temperature
conversion, in Java
389 example, temperature
conversion, in Perl
37 example, temperature
conversion, Perl
one-liner 283 example,
text-to-HTML 67-77 example, this|that 132, 138, 243,
245-246, 252, 255, 260-261 example, unrolling the loop 270-271 example, URL
74-77, 201-204, 208, 260, 303-304, 306, 320 example,
URL, egrep 25 example, URL, Java 209 example, URL, plucking 205-208 example, username 73, 76, 98 example, username, plucking from text 71-73 example, username, in
a URL 74-77 example,
variable names 24 example, ZIP
code 208-212 exception, IllegalArgumentException
383, 388 exception, IllegalStateException
385 exception, IndexOutOfBoundsException
384-385, 388 exception, IOException 81 exception, NullPointerException
396 exception, PatternSyntaxException 381,
383 Explicit
(Option) 409 ExplicitCapture (.NET) 402,
414, 421 exponential match
222-228, 330, 340 exponential match, avoiding 264-265 exponential match, discovery 226-228 exponential match, explanation 226-228 exponential match, non-determinism 264 exponential match, short-circuiting 250 exponential match, solving with atomic
grouping 268 exponential
match, solving with possessive
quantifiers 268 expose
literal text 255 expression, context 294-295 expression, contorting 294-295 Extended Regular Expressions
87-88 \f
114-115 \f, introduced 44 Fahrenheit (see temperature
conversion example) failure, atomic
grouping 171-172 failure, forcing 240, 333, 335,
340-341 FF 109 file globs 4 file-check example 2, 36 filename, example 190-192 filename, patterns
(globs) 4 filename, prepending to
line 79 \p{Final_Punctuation}
121 find()
384 FindAmpersand
358 five modifiers example
316 Flanagan, David
xxii flavor, Perl 286-293 flavor, superficial chart, general 91 flavor, superficial chart, Perl 285, 287 flavor, superficial chart, POSIX 88 flavor, term
defined 27 flex version covered
91 floating
`string' 362 floating-point number example
194 forcing failure 240, 333,
335, 340-341 foreach vs. while vs.
if 320 form letter
example 50-51 \p{Format} 121 freeflowing regex 277-281 Friedl, Alfred 176 Friedl, brothers 33 Friedl, Fumie xxi Friedl, Fumie, birthday 11-12 Friedl, Liz 33 Friedl, Stephen xxii fully qualified name 295 functions related to regexes in Perl
285 \G 128-131, 212,
315-316, 362 \G, advanced
example 130 \G, in
Java 373 \G,
in .NET 402 \G, optimization 246 /g 61, 130, 307, 311-312,
315, 319 /g, introduced 51 /g, with regex
object 354 garbage
collection Java benchmarking 236 gas engine analogy 143-147 gensub 183 George, Kit xxii GetGroupNames
421-422 GetGroupNumbers
421-422 getMatch()
397 global vs. private Perl
variables 295 globs filename
4 GNU Java packages
374 GNU awk, after-match
data 136 GNU awk,
gensub 183 GNU awk, version
covered 91 GNU awk,
word boundaries 132 GNU egrep, after-match
data 136 GNU
egrep, backreference
support 150 GNU
egrep, doubled-word
solution 22 GNU
egrep, -i
bug 21 GNU
egrep, regex
implementation 182 GNU
egrep, word
boundaries 132 GNU Emacs (see Emacs) GNU grep, shortest-leftmost
match 183 GNU
grep, version
covered 91 GNU sed,
after-match data 136 GNU sed, version
covered 91 GNU sed,
word boundaries 132 gnu.regexp, comparative
description 374 gnu.regexp, speed 377 gnu.rex 374 Goldberger, Ray xxii Gosling, James 89 GPOS 362 gr[ea]y example 9 greedy, introduced 151 greedy, alternation 174-175 greedy, and
backtracking 162-177 greedy, deference to an
overall match 153, 274 greedy, essence 159, 168-169 greedy, favors
match 167-168 greedy, first come, first
served 153 greedy, global vs.
local 182 greedy, in Java
373 greedy, vs.
lazy 169, 256-257 greedy, localizing 225-226 greedy, quantifier 139-140 greedy, too
greedy 152 green
dragon 180 grep, flavor
overview 91 grep, as an
acronym 85 grep, history 86 grep, regex
flavor 86 grep,
version covered 91 grep, -y
option 86 grep in Perl 324 group(), java.util.regex 385 group(), ORO 396 Group object (.NET)
412 Group object (.NET), Capture 431 Group object (.NET), creating 423 Group object (.NET), Index 424 Group object (.NET), Length 424 Group object (.NET), Success 424 Group object (.NET), ToString 424 Group object (.NET), using 424 Group object (.NET), Value 424 GroupCollection 423,
432 groupCount()
385 grouping and capturing
20-22 GroupNameFromNumber
421-422 GroupNumberFromName
421-422 groups() ORO
397 Groups
Match object method 423 \p{Gujarati} 120 Gutierrez, David xxii \p{Han} 120 hand tweaking, alternation 260-261 hand tweaking, caveats 253 \p{Hangul_Jamo} 122 HASH(0x80f60ac) 257 \p{Hebrew} 120, 122 hex escape 116-117 hex escape, in
Java 373 hex escape,
in Perl 286 Hietaniemi, Jarkko xxii highlighting with ANSI escape
sequences 79 \p{Hiragana} 120 history, `\+' 87 history, AT&T Bell
Labs 86 history, awk 87 history, Berkeley 86 history, ed
trivia 86 history,
egrep 86-87 history, grep 86 history, lex 87 history, Perl
88-90, 308 history, of
regexes 85-91 history,
sed 87 history, underscore in
\w 89 history, /x 90 hostname example 22, 73, 76, 98-99,
136-137, 203, 260, 267, 304, 306 hostname example,
egrep 25 hostname example, Java 209 hostname
example, plucking from
text 71-73, 205-208 hostname
example, in a URL
74-77 hostname example, validating 203-205 hostname example, VB.NET 204 $HostnameRegex 76, 136, 303,
351 hot VM 235, 375 HTML, cooking
68, 408 HTML, matching
tag 200-201 HTML
example, conversion from
text 67-77 HTML
example, cooking 68,
408 HTML example, encoding 408 HTML
example, <HR> 194 HTML example, link 201-203 HTML
example, optional
139 HTML example, paired
tags 165 HTML example,
parsing 130, 315, 321 HTML example, tag 9, 18-19, 26, 200-201, 326,
357 HTML example, URL 74-77, 203, 205-208, 303 HTML example, URL-encoding 320 HTTP newlines 115 HTTP URL example 25, 74-77, 201-209,
260, 303-304, 306, 320 http://regex.info/ xxi, 7,
345, 372 $HttpUrl
303, 305, 345, 351 hybrid regex
engine 183, 239, 243 hyphen
in character class 9 -i as -y 86 /i 134 /i, introduced 47 /i, with
study 359 (?i) (see: case-insensitive
mode; mode modifier) IBM (Java package), comparative description 372 IBM (Java package), speed 377 identifier matching 24 if vs. while vs.
foreach 320 IgnoreCase
(.NET) 96, 99, 402, 413, 421 IgnorePatternWhitespace
(.NET) 99, 402, 413, 421 IllegalArgumentException 383,
388 IllegalStateException
385 implementation of engine
182 implicit
362 implicit anchor
optimization 246 Imports 407, 409,
428 \p{InArrows}
122 \p{InBasic_Latin}
122 \p{InBox_Drawing}
122 \p{InCurrency}
122 \p{InCyrillic}
122 Index, Group
object method 424 Index, Match object
method 423 IndexOutOfBoundsException
384-385, 388 \p{InDingbats} 122 indispensable TiVo 3 \p{InHangul_Jamo}
122 \p{InHebrew}
122 \p{Inherited}
122 initial class
discrimination 244-246, 249, 251-252, 257-259,
332, 361 \p{Initial_Punctuation}
121 \p{InKatakana}
122 \p{InTamil}
122 integrated handling
94-95 integrated handling, compile
caching 242 interpolation 288-289 interpolation, introduced 77 interpolation, caching 351 interpolation, mimicking 321 interpolation, in
PHP 103 INTERSECTION class set operations
124 interval 140 interval, introduced 20 interval, [X{0,0}] 140 \p{InTibetan} 122 IOException 81 IP example 5, 187-189, 267, 311, 314,
348-349 Iraq 11 Is vs. In 120,
122-123 Is vs. In, with java.util.regex
380 Is vs. In, in
.NET 401 Is vs.
In, in Perl
288 \p{IsCherokee}
120 \p{IsCommon}
122 \p{IsCyrillic}
120 \p{IsGujarati}
120 \p{IsHan}
120 \p{IsHebrew}
120 \p{IsHiragana}
120 \p{IsKatakana}
120 \p{IsLatin}
120 IsMatch (Regex object
method) 415 ISO-8859-1
encoding 29, 87, 105, 107, 121 \p{IsThai} 120 \p{IsTibetan} 122 Japanese, text
processing 29 “japhy” 246 Java 365-398 Java, benchmarking 234-236 Java, BLTN
235-236, 375 Java, choosing a regex
package 366 Java,
exposed mechanics 374 Java, fastest
package 377 Java,
JIT 235 Java, list of
packages 372 Java,
matching comments
272-276 Java, object
models 368-372 Java,
package flavor comparison
373 Java, “Perl5
flavors” 375 Java,
strings 102 Java, version
covered 91 Java, VM 234-236, 375 java.util.regex 95-96,
378-391 java.util.regex, after-match data 136 java.util.regex, code
example 383, 389 java.util.regex, comparative
description 372 java.util.regex, CSV
parsing 386 java.util.regex, dot
modes 111 java.util.regex, doubled-word
example 81 java.util.regex, line
anchors 128 java.util.regex, line
terminators 382 java.util.regex, match
modes 380 java.util.regex, object
model 381 java.util.regex, regex
flavor 378-381 java.util.regex, search-and-replace 387 java.util.regex, speed 377 java.util.regex, split 390 java.util.regex, URL parsing
example 209 java.util.regex, version
covered 91 java.util.regex, word
boundaries 132 Jeffs
example 61-64 JfriedlsRegexLibrary 428-429 JIT, Java
235 JIT, .NET 404 JRE 234 jregex comparative
description 374 \p{Katakana} 120,
122 keeping in sync
210-211 Keisler, H. J.
85 Kleene, Stephen
85 The Kleene
Symposium 85 Korean text
processing 29 Kunen,
K. 85 \p{L&} 120-121,
123 \p{L&}, in
Perl 288 \p{L} 119-120, 131, 380,
390 £ 122 \l 290 language, character
class 10, 13 language, identifiers 24 \p{Latin} 120 Latin-1 encoding 29, 87, 105, 107,
121 lazy 166-167 lazy, essence
159, 168-169 lazy, favors
match 167-168 lazy,
vs. greedy 169,
256-257 lazy, in
Java 373 lazy, optimization 249, 256 lazy, quantifier 140 lazy evaluation 181, 355 \L...\E 290 \L...\E, inhibiting 292 lc 290 lcfirst 290 leftmost match 177-179 Length, Group object
method 424 Length, Match object
method 423 length() ORO 396 length-cognizance optimization 245,
247 \p{Letter} 120,
288 \p{Letter_Number}
121 $LevelN 330,
343 lex 86 lex, $ 111 lex, dot 110 lex, history 87 lex, and trailing
context 182 lexer building 130, 315 lexical scope 299 LF 109, 382 Li,
Yadong xxii LIFO
backtracking 159 limit, backtracking 237 limit, recursion 249-250 line (also see string) line, anchor
optimization 246 line, vs.
string 55 line
anchor 111-112 line
anchor, mechanics of
matching 150 line
anchor, variety of
implementations 87 line
feed 109 LINE
SEPARATOR 109, 121, 382 line
terminators 108-109, 111, 128, 382 line terminators, with $ and
^ 111 \p{Line_Separator}
121 link, matching 201 link, matching, Java 204, 209 list context 294, 310-311 list context, forcing 310 literal string initial string
discrimination 244-246, 249, 251-252, 257-259, 332, 361 literal text, introduced 5 literal text, exposing 255 literal text, mechanics of
matching 149 literal
text, pre-check
optimization 244-246, 249, 251-252, 257-259,
332, 361 literal-text mode
112, 134-135, 290 literal-text mode, inhibiting 292 \p{Ll} 121, 400 \p{Lm} 121, 400 \p{Lo} 121, 400 local 296, 341 local, in embedded
code 336 local, vs.
my 297 locale 126 locale, overview 87 locale, \w 119 localizing 296-297 localtime 294, 319,
351 locking in regex literal
352 “A logical calculus of the ideas immanent in nervous
activity” 85 longest
match finding 334-335 longest-leftmost match 148,
177-179 lookahead
132 lookahead, introduced 60 lookahead, auto 403 lookahead, example 61-64 lookahead, in
Java 373 lookahead,
mimic atomic grouping
174 lookahead, mimic
optimizations 258-259 lookahead, negated, <B>...</B>
167 lookahead, positive vs.
negative 66 lookaround, introduced 59 lookaround, backtracking 173-174 lookaround, in
conditional 139 lookaround, and
DFAs 182 lookaround,
doesn't consume text
60 lookaround, mimicking class set
operations 124 lookaround, mimicking word
boundaries 132 lookaround, in
Perl 288 lookbehind 132 lookbehind, in
Java 373 lookbehind,
in .NET 402 lookbehind, in
Perl 288 lookbehind,
positive vs. negative
66 lookbehind, unlimited 402 lookingAt() 385 Lord, Tom 182 \p{Lowercase_Letter}
121 LS 109, 121, 382 \p{Lt} 121, 400 \p{Lu} 121, 400 Lunde, Ken xxii, 29 \p{M} 120, 125 /m 134 m/.../ introduced
38 machine-dependent character
codes 114 MacOS 114 mail
processing example 53-59 makudonarudo example 165,
169, 228-232, 264 \p{Mark} 120 match 306-318 match, actions
95 match, context 294-295, 309 match, context, list 294, 310-311 match, context, scalar 294, 310, 312-316 match, DFA vs.
NFA 224 match, efficiency 179 match, example with
backtracking 160 match, example without
backtracking 160 match, lazy
example 161 match,
leftmost-longest 335 match, longest
334-335 match, m/.../, introduced 38 match, mechanics (also see: greedy;
lazy) match, mechanics, .* 152 match, mechanics, greedy introduced 151 match, mechanics, anchors 150 match, mechanics, capturing parentheses 149 match, mechanics, character classes and dot
149 match, mechanics, consequences 156 match, mechanics, literal text 149 match, modes
109-112 match, modes, java.util.regex
380 match, negating 309 match, neverending 222-228, 330,
340 match, neverending, avoiding 264-265 match, neverending, discovery 226-228 match, neverending, explanation 226-228 match, neverending, non-determinism 264 match, neverending, short-circuiting 250 match, neverending, solving with atomic grouping
268 match, neverending, solving with possessive quantifiers
268 match, NFA vs.
DFA 156-157, 180-182 match, position (see pos) match, POSIX, in
Perl 335 match, shortest-leftmost 183 match, side
effects 317 match,
side effects, intertwined 43 match, side effects, Perl 40 match, speed
181 match, in a
string 27 match, tag-team 130 match, viewing
mechanics 331-332 Match.Empty
426 match()
393 Match (.NET)
Success 96 Match object
(.NET) 411 Match
object (.NET), Capture 431 Match object (.NET), creating 415, 423 Match object (.NET), Groups 423 Match object (.NET), Index 423 Match object (.NET), Length 423 Match object (.NET), NextMatch 423 Match object (.NET), Result 423 Match object (.NET), Success 421 Match object (.NET), Synchronized 424 Match object (.NET), ToString 422 Match object (.NET), using 421 Match object (.NET), Value 422 Match (Regex object
method) 415 “match rejected
by optimizer” 363 match
result object model 371 match
state object model 370 MatchCollection 416 matcher() (Pattern
method) 384 Matcher
object 384 Matcher
object, reusing
387 matches, unexpected 194-195 matches, viewing
all 332 matches()
(Pattern method) 384, 390 Matches (Regex object
method) 416 MatchEvaluator
417-418 matching, delimited
text 196-198 matching,
HTML tag 200 matching, longest-leftmost 177-179 MatchObject object (.NET)
creating 416 \p{Math_Symbol} 121 Maton, William xxii, 36 MBOL 362 \p{Mc} 121 McCloskey, Mike xxii McCulloch, Warren 85 \p{Me} 121 mechanics viewing 331-332 metacharacter, introduced 5 metacharacter, conflicting 44-46 metacharacter, differing
contexts 10 metacharacter, first-class 87, 92 metacharacter, vs.
metasequence 27 metasequence defined 27 mimic, $` 357 mimic, $' 357 mimic, $& 302, 357 mimic, atomic
grouping 174 mimic,
class set operations
124 mimic, conditional with
lookaround 139 mimic,
initial-character discrimination
optimization 258-259 mimic, named
capture 344-345 mimic,
POSIX matching 335 mimic, possessive
quantifiers 343-344 mimic, variable
interpolation 321 mimic, word
boundaries 66, 132, 341-342 minlen length
362 minus in character class
9 MSIL .NET 404 \p{Mn} 121 mode modifier 109, 133-135 mode-modified span 109, 134 modes introduced with egrep
14-15 \p{Modifier_Letter}
121 modifiers, combining 69 modifiers, example with
five 316 modifiers, /g 51 modifiers, /i 47 modifiers, “locking
in” 304-305 modifiers, notation 98 modifiers, /osmosis 293 modifiers, Perl
core 292-293 modifiers, with regex
object 304-305 \p{Modifier_Symbol}
121 Mui, Linda xxii multi-character quotes
165-166 Multiline
(.NET) 402, 413-414, 421 MULTILINE (Pattern
flag) 81, 380, 382 multiple-byte character encoding
29 MungeRegexLiteral
342-344, 346 my, binding 339 my, in embedded
code 338-339 my, vs.
local 297 MySQL, after-match
data 136 MySQL, DBIx::DWIW 258 MySQL, version
covered 91 MySQL,
word boundaries 132 \n 49, 114-115 \n, introduced 44 \n, machine-dependency 114 \p{N} 120, 390 (?n) 402 $^N 300-301, 344-346 named capture 137 named capture, mimicking 344-345 named capture, .NET 402 named
capture, with unnamed
capture 403 naughty
variables 356 naughty
variables, okay for
debugging 331 \p{Nd} 121, 380, 400 negated class, introduced 10-11 negated class, and lazy
quantifiers 167 negated
class, Tcl 111 negative lookahead (see lookahead,
negative) negative lookbehind (see
lookbehind, negative) NEL
109, 382, 400 nervous system
85 nested constructs, .NET 430 nested
constructs, Perl
328-331, 340-341 $NestedStuffRegex 339,
346 .NET 399-432 .NET, $+ 202 .NET, flavor
overview 91 .NET,
after-match data 136 .NET, benchmarking 236 .NET, JIT
404 .NET, line
anchors 128 .NET,
literal-text mode 135 .NET, MSIL
404 .NET, object
model 411 .NET, regex approach 96-97 .NET, regex
flavor 401 .NET, search-and-replace 408,
417-418 .NET, URL parsing
example 204 .NET,
version covered 91 .NET, word
boundaries 132 neurophysiologists early regex study
85 neverending match 222-228,
330, 340 neverending match, avoiding 264-265 neverending match, discovery 226-228 neverending match, explanation 226-228 neverending match, non-determinism 264 neverending match, short-circuiting 250 neverending match, solving with atomic
grouping 268 neverending
match, solving with possessive
quantifiers 268 New
Regex 96, 99, 410, 415 newline and HTTP 115 NEXT LINE 109, 382, 400 NextMatch (Match object
method) 423 NFA, first introduced 145 NFA, introduction 153 NFA, acronym spelled
out 156 NFA, and alternation 174-175 NFA, compared with
DFA 156-157, 180-182 NFA, control
benefits 155 NFA,
efficiency 179 NFA, essence (see backtracking) NFA, freeflowing
regex 277-281 NFA,
and greediness 162 NFA, implementation
ease 182 NFA, nondeterminism 265 NFA, nondeterminism, checkpoint 264 NFA, POSIX
efficiency 179 NFA,
testing for 146-147 NFA, theory
180 Nicholas, Ethan
xxii \p{Nl}
121 \N{LATIN SMALL LETTER SHARP
S} 290 \N{name}
290 \N{name}, inhibiting 292 \p{No} 121 no re 'debug' 361 no_match_vars 357 nomenclature 27 non-capturing parentheses 45, 136-137,
373 non-capturing parentheses (also
see parentheses) Nondeterministic Finite
Automaton (see NFA) None (.NET) 415, 421 nonillion 226 nonregular sets 180 \p{Non_Spacing_Mark}
121 “normal” 262-266 null 116 null, with dot
118 NullPointerException
396 \p{Number}
120 /o
352-353 /o, with regex
object 354 Obfuscated Perl
Contest 320 object
model, Java
368-372 object model, .NET 410-411 Object Oriented Perl
339 object-oriented handling
95-97 object-oriented handling, compile caching 244 octal escape 115, 117 octal escape, vs.
backreference 406-407 octal
escape, in Java
373 octal escape, in
Perl 286 on-demand
recompilation 351 oneself example 332,
334 \p{Open_Punctuation}
121 operators Perl list
285 optimization
239-252 optimization, automatic
possessification 251 optimization, BLTN 235-236, 375 optimization, with
bump-along 255 optimization, end-of-string
anchor 246 optimization, excessive
backtrack 249-250 optimization, hand
tweaking 252-261 optimization, implicit line
anchor 191 optimization, initial character
discrimination 244-246, 249, 251-252, 257-259,
332, 361 optimization, JIT 235, 404 optimization, lazy
evaluation 181 optimization, lazy
quantifier 249, 256 optimization, leading [.*] 246 optimization, literal-string
concatenation 247 optimization, need
cognizance 252 optimization, needless class
elimination 249 optimization, needless
parentheses 248 optimization, pre-check of required
character 244-246, 249, 251-252, 257-259, 332,
361 optimization, simple
repetition, discussed
247-248 optimization, small
quantifier equivalence 251-252 optimization, state
suppression 250-251 optimization, string/line
anchors 149, 181 optimization, super-linear
short-circuiting 250 Option (.NET) 409 Option (.NET), whitespace 18 Options (Regex object
method) 421 OR class set operations
123-124 Oram, Andy xxii,
5 ordered alternation
175-177 ordered alternation, pitfalls 176 org.apache.oro.text.regex
392-398 org.apache.oro.text.regex, benchmark results 376 org.apache.oro.text.regex, comparative description 374 org.apache.regexp, comparative
description 375 org.apache.regexp, speed 376 org.apache.xerces.utils.regex
372 ORO 392-398 ORO, benchmark
results 376 ORO, comparative description 374 osmosis 293 /osmosis 293 \p{Other} 120 \p{Other_Letter} 121 \p{Other_Number} 121 \p{Other_Punctuation}
121 \p{Other_Symbol}
121 our 295,
336 overload pragma
342 \p{...}
119 \p{P}
120 \p{^...}
288 \p{All}
123 \p{All}, in
Perl 288 \p{all} 380 panic: top_env 332 \p{Any} 123 \p{Any}, in
Perl 288 Papen,
Jeffrey xxii PARAGRAPH
SEPARATOR 109, 121, 382 \p{Paragraph_Separator}
121 parentheses, as
\(...\) 86 parentheses, and
alternation 13 parentheses, balanced 328-331, 340-341,
430 parentheses, balanced,
difficulty 193-194 parentheses, capturing 135-136, 300 parentheses, capturing, introduced with egrep
20-22 parentheses, capturing,
and DFAs 150, 182 parentheses, capturing, mechanics 149 parentheses, capturing, in Perl 41 parentheses, capturing
only 152 parentheses,
counting 21 parentheses, elimination
optimization 248 parentheses,
grouping-only (see
non-capturing parentheses) parentheses, limiting scope 18 parentheses, named
capture 137, 344-345, 402-403 parentheses, nested 328-331, 340-341, 430 parentheses, non-capturing 45, 136-137 parentheses, non-capturing, in Java 373 parentheses, non-participating 300 parentheses, with split, Java ORO 395 parentheses, with split, .NET 403, 420 parentheses, with split, Perl 326 \p{Arrows} 122 parsing regex 404 participate in match 139 Pascal 36, 59, 182 Pascal, matching comments
of 265 \p{Assigned} 123-124 \p{Assigned}, in
Perl 288 Pat (Java
Package), comparative
description 374 Pat (Java
Package), speed
377 patch 88 pathname example 190-192 Pattern, CANON_EQ 108, 380 Pattern, CASE_INSENSITIVE 95, 109,
380, 383 Pattern, COMMENTS 99, 218, 378, 380,
386 Pattern, compile() 383 Pattern, DOTALL 380, 382 Pattern, matcher() 384 Pattern, matches() 384, 390 Pattern, MULTILINE 81, 380,
382 Pattern, UNICODE_CASE 380,
383 Pattern, UNIX_LINES 380, 382 PatternSyntaxException 381,
383 \p{Basic_Latin}
122 \p{Box_Drawing}
122 \p{Pc} 121,
400 \p{C}
120 \p{Cc}
121 \p{Cf}
121 \p{Cherokee}
120 \p{Close_Punctuation}
121 \p{Cn} 121,
123-124, 380, 401 \p{Co} 121 \p{Connector_Punctuation}
121 \p{Control}
121 PCRE, lookbehind 132 PCRE, version
covered 91 \p{Currency} 122 \p{Currency_Symbol}
121 \p{Cyrillic}
120, 122 \p{Pd}
121 \p{Dash_Punctuation}
121 \p{Decimal_Digit_Number}
121 \p{Dingbats}
122 \p{Pe}
121 PeakWebhosting.com
xxii \p{Enclosing_Mark}
121 people, Aho,
Alfred 86, 180 people,
Balling, Derek xxii people, Barwise,
J. 85 people, Bennett, Mike xxi people, Clemens,
Sam 375 people, Click, Cliff xxii people, Constable,
Robert 85 people,
Conway, Damian 339 people, Cruise,
Tom 51 people, Flanagan, David xxii people, Friedl,
Alfred 176 people,
Friedl, brothers 33 people, Friedl,
Fumie xxi people,
Friedl, Fumie, birthday 11-12 people, Friedl,
Liz 33 people, Friedl, Stephen xxii people, George,
Kit xxii people, Goldberger, Ray xxii people, Gosling,
James 89 people, Gutierrez, David xxii people, Hietaniemi,
Jarkko xxii people,
Keisler, H. J. 85 people, Kleene,
Stephen 85 people,
Kunen, K. 85 people, Li,
Yadong xxii people,
Lord, Tom 182 people, Lunde,
Ken xxii, 29 people,
Maton, William xxii,
36 people, McCloskey,
Mike xxii people,
McCulloch, Warren 85 people, Mui,
Linda xxii people,
Nicholas, Ethan xxii people, Oram,
Andy xxii, 5 people,
Papen, Jeffrey xxii people, Perl
Porters 90 people,
Pinyan, Jeff 246 people, Pitts,
Walter 85 people,
Purcell, Shawn xxii people, Reed,
Jessamyn xxii people,
Reinhold, Mark xxii people, Rudkin,
Kristine xxii people,
Savarese, Daniel xxii people, Sethi,
Ravi 180 people, Spencer, Henry 88, 182-183,
243 people, Thompson,
Ken 85-86, 110 people,
Trapszo, Kasia xxii people, Tubby
264 people, Ullman,
Jeffrey 180 people,
Wall, Larry 88-90, 138,
363-364 people, Wilson,
Dean xxii people,
Woodward, Josh xxii people, Zawodny,
Jeremy xxii, 258 Perl,
$/ 35 Perl, flavor
overview 91, 287 Perl,
introduction 37-38 Perl, context (also see match,
context) Perl, context,
contorting 294 Perl, efficiency 347-363 Perl, greatest
weakness 286 Perl,
history 88-90, 308 Perl, in Java
375, 392 Perl, line
anchors 128 Perl,
modifiers 292-293 Perl, motto
348 Perl, option, -0 36 Perl, option, -c 361 Perl, option, -Dr 363 Perl, option, -e 36, 53, 361 Perl, option, -i 53 Perl, option, -M 361 Perl, option, -Mre=debug 363 Perl, option, -n 36 Perl, option, -p 53 Perl, option, -w 38, 296, 326,
361 Perl, regex
operators 285 Perl,
version covered 91 Perl, warnings
38 Perl, warnings, ($^W variable) 297 Perl, warnings, use warnings 326,
363 Perl Porters 90 Perl5Util 392, 396 perladmin 299 \p{Pf} 121, 400 \p{Final_Punctuation}
121 \p{Format}
121 \p{Gujarati}
120 \p{Han}
120 \p{Hangul_Jamo}
122 \p{Hebrew} 120,
122 \p{Hiragana}
120 PHP, after-match
data 136 PHP, line anchors 128 PHP, lookbehind 132 PHP, mode
modifiers 133 PHP,
strings 103 PHP, version
covered 91 PHP, word boundaries 132 \p{Pi} 121, 400 \p{InArrows} 122 \p{InBasic_Latin}
122 \p{InBox_Drawing}
122 \p{InCurrency}
122 \p{InCyrillic}
122 \p{InDingbats}
122 \p{InHangul_Jamo}
122 \p{InHebrew}
122 \p{Inherited}
122 \p{Initial_Punctuation}
121 \p{InKatakana}
122 \p{InTamil}
122 \p{InTibetan}
122 Pinyan, Jeff 246 \p{IsCherokee} 120 \p{IsCommon} 122 \p{IsCyrillic} 120 \p{IsGujarati} 120 \p{IsHan} 120 \p{IsHebrew} 120 \p{IsHiragana} 120 \p{IsKatakana} 120 \p{IsLatin} 120 \p{IsThai} 120 \p{IsTibetan} 122 Pitts, Walter 85 \p{Katakana} 120,
122 \p{L} 119-120,
131, 380, 390 \p{L&} 120-121,
123 \p{L&}, in
Perl 288 \p{Latin} 120 \p{Letter} 120, 288 \p{Letter_Number}
121 \p{Line_Separator}
121 \p{Ll} 121,
400 \p{Lm} 121,
400 \p{Lo} 121,
400 \p{Lowercase_Letter}
121 \p{Lt} 121,
400 \p{Lu} 121,
400 plus, as
\+ 139 plus,
introduced 18-20 plus, backtracking 162 plus, greedy
139 plus, lazy 140 plus, possessive 140 \p{M} 120, 125 \p{Mark} 120 \p{Math_Symbol} 121 \p{Mc} 121 \p{Me} 121 \p{Mn} 121 \p{Modifier_Letter}
121 \p{Modifier_Symbol}
121 \p{N} 120,
390 \p{Nd} 121, 380,
400 \p{Nl}
121 \p{No}
121 \p{Non_Spacing_Mark}
121 \p{Number}
120 \p{Po}
121 \p{Open_Punctuation}
121 population example
59 pos 128-131,
313-314, 316 pos (also see
\G) positive
lookahead (see lookahead, positive)
positive lookbehind (see
lookbehind, positive) POSIX, [:...:]
125 POSIX, [.....]
126 POSIX, Basic Regular
Expressions 87-88 POSIX, bracket
expressions 125 POSIX,
character class 125 POSIX, character class and
locale 126 POSIX,
character equivalent
126 POSIX, collating
sequences 126 POSIX,
dot 118 POSIX, empty
alternatives 138 POSIX, Extended Regular
Expressions 87-88 POSIX, superficial flavor
chart 88 POSIX, in Java 374 POSIX, locale
126 POSIX, locale, overview 87 POSIX, longest-leftmost
rule 177-179, 335 POSIX
NFA, backtracking
example 229 POSIX NFA,
testing for 146-147 possessive quantifiers 140,
172-173 possessive quantifiers, automatic 251 possessive quantifiers, for
efficiency 259, 268-270 possessive quantifiers, example 198, 201 possessive quantifiers, mimicking 343-344 possessive quantifiers, optimization 250-251 postal code example 208-212 postMatch() 397 \p{Other} 120 \p{Other_Letter} 121 \p{Other_Number} 121 \p{Other_Punctuation}
121 \p{Other_Symbol}
121 £ 122 \p{P} 120 \p{Paragraph_Separator}
121 \p{Pc} 121,
400 \p{Pd}
121 \p{Pe}
121 \p{Pf} 121,
400 \p{Pi} 121,
400 \p{Po}
121 \p{Private_Use}
121 \p{Ps}
121 \p{Punctuation}
120 pragma, charnames 290 pragma, overload 342 pragma, re 361, 363 pragma, strict 295, 336,
345 pragma, warnings 326, 363 pre-check of required character
244-246, 249, 251-252, 257-259, 361 pre-check of required
character, mimic
258-259 pre-check of required character, viewing 332 preMatch() 397 pre-match copy 355 prepending filename to line
79 price rounding example
51-52, 167-168 price rounding example, with alternation 175 price rounding example, with atomic
grouping 170 price rounding
example, with possessive
quantifier 169 Principles
of Compiler Design 180 printf 40 private vs. global Perl variables
295 \p{Private_Use}
121 procedural handling
95-97 procedural handling, compile
caching 243 procmail 94 procmail, version
covered 91 Programming
Perl 283, 286, 339 promote 294-295 properties 119-121, 123-124, 288,
380 \p{S}
120 PS 109, 121, 382 \p{Ps} 121 \p{Sc} 121-122 \p{Separator} 120 \p{Sk} 121 \p{Sm} 121 \p{So} 121 \p{Space_Separator}
121 \p{Spacing_Combining_Mark}
121 \p{Symbol}
120 \p{Tamil}
122 \p{Thai}
120 \p{Tibetan}
122 \p{Titlecase_Letter}
121 publication, Bulletin of Math.
Biophysics 85 publication, Communications of the
ACM 85 publication, Compilers -- Principles,
Techniques, and Tools 180 publication, Embodiments of
Mind 85 publication, The Kleene
Symposium 85 publication, “A logical calculus of the ideas
immanent in nervous activity” 85 publication, Object Oriented
Perl 339 publication, Principles of Compiler
Design 180 publication, Programming
Perl 283, 286, 339 publication, Regular Expression Search
Algorithm 85 publication, “The Role of Finite Automata in
the Development of Modern Computing Theory”
85 \p{Unassigned}
121, 123 \p{Unassigned}, in
Perl 288 \p{Punctuation} 120 \p{Uppercase_Letter}
121 Purcell, Shawn
xxii Python, after-match
data 136 Python, benchmarking 237 Python, line
anchors 128 Python,
mode modifiers 133 Python, regex
approach 97 Python,
strings 103-104 Python, version
covered 91 Python,
word boundaries 132 Python, \Z 111 \p{Z} 119-120, 380,
400 \p{Zl}
121 \p{Zp}
121 \p{Zs}
121 Qantas 11 \Q...\E 290 \Q...\E, inhibiting 292 \Q...\E, in
Java 373 qed 85 qed, introduced 76 quantifier (also see: plus; star;
question mark; interval; lazy; greedy; possessive quantifiers) quantifier, and
backtracking 162 quantifier, factor
out 255 quantifier,
grouping for 18 quantifier, multiple
levels 265 quantifier,
optimization 247, 249 quantifier, and
parentheses 18 quantifier, possessive
quantifiers 140, 172-173 quantifier, possessive quantifiers,
for efficiency 259,
268-270 quantifier, possessive
quantifiers, mimicking 343-344 quantifier, possessive quantifiers,
optimization 250-251 quantifier, possessive quantifiers,
automatic 251 quantifier, question mark, as \? 139 quantifier, question mark, introduced 17-18 quantifier, question mark, backtracking 160 quantifier, question mark, greedy 139 quantifier, question mark, lazy 140 quantifier, question mark, possessive 140 quantifier, smallest preceding
subexpression 29 question
mark, as \?
139 question mark, backtracking 160 question mark, greedy 139 question mark, lazy 140 question
mark, possessive
140
quoted string (see
double-quoted string example) quotes multi-character
165-166 r"..." 103 $^R 302, 327 \r 49, 114-115 \r, machine-dependency 114 re 361 re 'debug' 363 re pragma 361, 363 reality check 226-228 red dragon 180 Reed, Jessamyn xxii Reflection 429 regex, balancing
needs 186 regex, compile 179-180, 350 regex, default
308 regex, delimiters 291-292 regex, DFA (see DFA) regex, encapsulation (see regex
objects) regex, engine
analogy 143-147 regex,
vs. English 275 regex, frame of
mind 6 regex, freeflowing design 277-281 regex, history
85-91 regex, library 76, 207 regex, longest-leftmost
match 177-179 regex,
longest-leftmost match, shortest-leftmost 183 regex, mechanics 241-242 regex, NFA (see NFA) regex, nomenclature 27 regex, operands 288-292 regex, overloading 291, 328 regex, overloading, inhibiting 292 regex, overloading, problems 344 regex, subexpression, defined 29 regex
literal 288-292, 307 regex
literal, inhibiting
processing 292 regex
literal, locking in
352 regex literal, parsing
of 292 regex literal,
processing 350 regex literal, regex
objects 354 Regex
(.NET), CompileToAssembly 427,
429 Regex (.NET), creating, options 413-415 Regex (.NET), Escape 427 Regex (.NET), GetGroupNames
421-422 Regex (.NET), GetGroupNumbers
421-422 Regex (.NET), GroupNameFromNumber
421-422 Regex (.NET), GroupNumberFromName
421-422 Regex (.NET), IsMatch 407, 415,
425 Regex (.NET), Match 96, 408, 410, 415,
425 Regex (.NET), Matches 416, 425 Regex (.NET), object, creating 96, 410, 413-415 Regex (.NET), object, exceptions 413 Regex (.NET), object, using 96, 415 Regex (.NET), Options 421 Regex (.NET), Replace 408-409, 417-418,
425 Regex (.NET), RightToLeft 421 Regex (.NET), Split 419-420, 425 Regex (.NET), ToString 421 Regex (.NET), Unescape 427 regex objects 303-306 regex objects, efficiency 353-354 regex objects, /g 354 regex objects, match
modes 304-305 regex
objects, /o
354 regex objects, in regex
literal 354 regex
objects, viewing
305-306 regex overloading
292 regex overloading, example 341-345 http://regex.info/ xxii, 7,
345, 358 RegexCompilationInfo
429 regex-directed matching
153 regex-directed matching, and
backreferences 303 regex-directed matching, and
greediness 162 Regex.Escape 135 RegexOptions, Compiled 236, 402, 404, 414,
421-422, 429 RegexOptions, ECMAScript 400, 402,
406-407, 415, 421 RegexOptions, ExplicitCapture 402, 414,
421 RegexOptions, IgnoreCase 96, 99, 402, 413,
421 RegexOptions, IgnorePatternWhitespace 99,
402, 413, 421 RegexOptions, Multiline 402, 413-414,
421 RegexOptions, None 415, 421 RegexOptions, RightToLeft 402, 405-406,
414, 420-421, 423-424 RegexOptions, Singleline 402, 414,
421 Regexp (Java package), comparative description 375 Regexp (Java package), speed 376 regsub 100 regular expression origin of term
85 Regular Expression Search
Algorithm 85 regular
sets 85 Reinhold,
Mark xxii removing
whitespace 199-200 Replace (Regex object
method) 417-418 replaceAll 387 replaceFirst()
387-388 reproductive organs
5 required character
pre-check 244-246, 249, 251-252, 257-259, 332,
361 re-search-forward
100 reset()
387 Result (Match object
method) 423 RightToLeft (Regex
property) 421-422 RightToLeft
(.NET) 402, 405-406, 414, 420-421,
423-424 “The Role of Finite Automata in the Development of
Modern Computing Theory” 85 Ruby, $ and
^ 111 Ruby,
after-match data 136 Ruby, benchmarking 238 Ruby, \G 131 Ruby, line
anchors 128 Ruby,
mode modifiers 133 Ruby, version
covered 91 Ruby, word boundaries 132 Rudkin, Kristine xxii rule, earliest match
wins 148-149 rule,
standard quantifiers are greedy
151-153 rx
182 s/.../.../
50, 318-321 \S 49,
56, 119 \p{S}
120 \s 49,
119 \s, introduction 47 \s, in
Emacs 127 \s,
in Perl 288 (?s) (see: dot-matches-all
mode; mode modifier) /s 134 Savarese, Daniel xxii SawAmpersand 358 say what you mean 195, 274 SBOL 362 \p{Sc} 121-122 scalar context 294, 310,
312-316 scalar context, forcing 310 schaffkopf 33 scope lexical vs. dynamic 299 scripts 120-122, 288 search-and-replace, awk 99 search-and-replace, Java 387, 394 search-and-replace, .NET 408, 417-418 search-and-replace, Tcl 100 sed, after-match
data 136 sed, dot 110 sed, history
87 sed, version
covered 91 sed, word boundaries 132 \p{Separator} 120 server VM 234, 236, 375 Sethi, Ravi 180 shell 7 simple
quantifier optimization 247-248 single quotes delimiter 292,
319 Singleline
(.NET) 402, 414, 421 \p{Sk} 121 \p{Sm} 121 small quantifier equivalence
251-252 \p{So}
121 \p{Space_Separator}
121 \p{Spacing_Combining_Mark}
121 “special”
262-266 Spencer, Henry 88,
182-183, 243 split()
java.util.regex 390 split ORO 394-396 split, with capturing parentheses,
Java ORO 395 split, with capturing parentheses,
.NET 403, 420 split, with capturing parentheses,
Perl 326 split, chunk limit, Java ORO 395 split, chunk limit, java.util.regex
391 split, chunk limit, Perl 323 split, into
characters 322 split,
in Perl 321-326 split, trailing empty
items 324 split, whitespace 325 Split (Regex object
method) 419-420 standard
formula for matching delimited text 196 star, introduced 18-20 star, backtracking 162 star, greedy
139 star, lazy 140 star, possessive 140 start() 385 start-of-string anchor optimization
245-246, 255-256, 315 stclass
`list' 362 stock pricing example 51-52,
167-168 stock pricing example, with
alternation 175 stock pricing
example, with atomic
grouping 170 stock pricing
example, with possessive
quantifier 169 Strict (Option)
409 strict pragma
295, 336, 345 String, matches() 384 String, replaceAll 387 String, replaceFirst() 388 String, split() 390
string (also see
line) string, double-quoted (see double-quoted
string example) string, initial string discrimination
244-246, 249, 251-252, 257-259, 332, 361 string, vs.
line 55 string, match position (see
pos) string,
pos (see
pos) StringBuffer 388 strings, C#
102 strings, Emacs 100 strings, Java
102 strings, PHP 103 strings, Python 103-104 strings, as
regex 101-105, 305 strings, Tcl
104 strings, VB.NET 102 stripping whitespace 199-200 study 359-360 study, when not to
use 359 subexpression defined 29 substitute() 394 substitution, delimiter 319 substitution, s/.../.../
50, 318-321 substring initial
substring discrimination 244-246, 249, 251-252, 257-259, 332, 361 subtraction class set operations
124 Success, Group
object method 424 Success, Match object
method 421 Sun's regex
package (see java.util.regex) super-linear
(see neverending match) super-linear
short-circuiting 250 \p{Symbol} 120 Synchronized Match object
method 424 syntax
class Emacs 127 System.currentTimeMillis()
236 System.Reflection
429 System.Text.RegularExpressions
407, 409 \t 49,
114-115 \t, introduced 44 tag matching 200-201 tag-team matching 130, 315 \p{Tamil} 122 Tcl, [:<:] 92 Tcl, flavor
overview 91 Tcl, benchmarking 239 Tcl, dot
111-112 Tcl, hand-tweaking 243, 259 Tcl, line
anchors 112, 128 Tcl,
mode modifiers 133 Tcl, regex
implementation 182 Tcl, regsub 100 Tcl, search-and-replace 100 Tcl, strings
104 Tcl, version
covered 91 Tcl, word boundaries 132 temperature conversion example, in
.NET 419 temperature
conversion example, in
Java 389 temperature
conversion example, in
Perl 37 temperature
conversion example, Perl
one-liner 283 terminators (see line
terminators) testing engine
type 146-147 text-directed
matching 153 text-directed
matching, regex
appearance 162 text-to-HTML
example 67-77 \p{Thai} 120 theory of an NFA 180 There's more than one way to do
it 349 this|that
example 132, 138, 243, 245-246, 252, 255,
260-261 Thompson, Ken 85-86,
110 thread scheduling Java
benchmarking 236 \p{Tibetan} 122 tied variables 299 time() 232 time of day 26 Time::HiRes 232, 358,
360 Time.new
238 Timer()
237 title case 109 \p{Titlecase_Letter}
121 TiVo 3 tokenizer building 130, 315 toothpicks scattered 100 tortilla 126 ToString, Group object
method 424 ToString, Match object
method 422 ToString, Regex object
method 421 toString ORO 396 Traditional NFA testing for
146-147 trailing context
182 trailing context, optimizations 245-247 Trapszo, Kasia xxii Tubby 264 typographical conventions xix \u 116, 290, 400 \U 116 \U...\E 290 \U...\E, inhibiting 292 uc 290 U+C0B5 106 ucfirst 290 UCS-2 encoding 106 UCS-4 encoding 106 Ullman, Jeffrey 180 \p{Unassigned} 121,
123 \p{Unassigned}, in
Perl 288 unconditional
caching 350 underscore in
\w history 89 Unescape 427 Unicode, overview 106-108 Unicode, block
122 Unicode, block, Java 380 Unicode, block, .NET 400 Unicode, block, Perl 288 Unicode, categories (see Unicode,
properties) Unicode, character, combining 107, 122, 125,
288 Unicode, code point, introduced 106 Unicode, code point, beyond U+FFFF 108 Unicode, code point, multiple 107 Unicode, code point, unassigned in block 122 Unicode, combining
character 107, 122, 125, 288 Unicode, in
Java 380 Unicode,
line terminators 108-109,
111 Unicode, line terminators,
in Java 382 Unicode, loose
matching (see case-insensitive mode) Unicode, in
.NET 401 Unicode,
properties 119, 288 Unicode, properties, java.util.regex
380 Unicode, properties, list 120-121 Unicode, properties, \p{All} 123, 288 Unicode, properties, \p{Any} 123, 288 Unicode, properties, \p{Assigned} 123-124,
288 Unicode, properties, \p{Unassigned} 121, 123,
288 Unicode, script 120-122, 288 Unicode, support in
Java 373 Unicode,
Version 3.1 108, 380,
401 Unicode, Version
3.2 288 Unicode, whitespace and /x
288 UNICODE_CASE (Pattern
flag) 380, 383 UnicodeData.txt 290 unicore 290 UNIX_LINES (Pattern
flag) 380, 382 unmatch 152, 161, 163 unmatch, .* 165 unmatch, atomic
grouping 171 unrolling the
loop 261-276 unrolling the
loop, example
270-271 unrolling the loop, general
pattern 264 \p{Uppercase_Letter}
121 URL encoding 320 URL example 74-77, 201-204, 208, 260,
303-304, 306, 320 URL example, egrep 25 URL example, Java 209 URL
example, plucking
205-208 use
charnames 290 use Config 290, 299 use English 357 use overload 342 use re 361,
363 use re 'debug'
361, 363 use re
'eval' 337 use
strict 295, 336, 345 use Time::HiRes 358,
360 use
warnings 326, 363 username example 73, 76, 98 username example, plucking from
text 71-73 username
example, in a URL
74-77 using
System.Text.RegularExpressions 410 UTF-16 encoding 106 UTF-8 encoding 106 \V 364 \v 114-115, 364 Value, Group object
method 424 Value, Match object
method 422 variable names
example 24 variables,
after match, pre-match
copy 355 variables,
binding 339 variables, fully
qualified 295 variables, interpolation 344 variables, naughty 356 variables, tied 299 VB.NET (also see .NET) VB.NET, comments 99 VB.NET, regex
approach 96-97 VB.NET, strings 102 VB.NET, URL parsing
example 204 verbatim
strings 102 Version 7
regex 182 Version 8
regex 182 versions covered in
this book 91 vertical
tab 109 vertical tab,
in Perl \s
288 vi after-match data
136 Vietnamese text
processing 29 virtual
machine 234-236, 375 Visual
Studio .NET 428 VM 234, 236, 375 VM, warming up
235 void context 294 VT 109 \W 49, 119 $^W 297 \w 49, 65, 119 \w, in
Emacs 127 \w,
many different interpretations
93 \w, in
Perl 288 Wall,
Larry 88-90, 138, 363-364 warming up Java VM 235 warnings 296 warnings, temporarily turning
off 297 warnings
pragma 326, 363 while vs. foreach vs.
if 320 whitespace, allowing
optional 18 whitespace, removing 199-200 wildcards filename 4 Wilson, Dean xxii Woodward, Josh xxii word anchor mechanics of matching
150 word boundaries
131 word boundaries, \<...\>, egrep 15 word boundaries, introduced 15 word
boundaries, in Java
373 word boundaries, many
programs 132 word
boundaries, mimicking
66, 132, 341-342 word boundaries, in
Perl 132, 288 www.cpan.org 358 www.PeakWebhosting.com
xxii www.regex.info
358 \X 107,
125 \x 116,
400 \x, in
Perl 286 (?x) (see: comments and
free-spacing mode; mode modifier) /x 134, 288 /x, introduced 72 /x, history 90 Xerces
org.apache.xerces.utils.regex 372 -y old grep
86 ¥ 122 Yahoo! xxi, 74, 130, 190, 205, 207,
258, 314 \z 111,
127-128, 316 \z, in
Java 373 \z,
optimization 246 \Z 111, 127-128 \Z, in
Java 373 \Z,
optimization 246 \p{Z} 119-120, 380,
400 Zawodny, Jeremy xxii,
258 ZIP code example
208-212 \p{Zl}
121 \p{Zp}
121 \p{Zs} 121