6.4. Commenting Regular Expressions

6.4.2. Solution

You have several techniques at your disposal: choosing alternate delimiters to avoid so many backslashes, placing comments outside the pattern or inside it using the /x modifier, and building up patterns piecemeal in named variables.

6.4.3. Discussion

The sample code in Example 6-1 uses the first two techniques, and its initial comment describes the overall intent of the regular expression. For simple patterns, this may be all that is needed. More complex patterns, as in the example, require more documentation.

Example 6-1. resname
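As a stand-in illustration of those first two techniques (not the resname listing itself), here is a minimal sketch of a substitution written with brace delimiters and /x comments. The subroutine name annotate_hosts and the %addr_for table are hypothetical; the table stands in for a real name lookup so that the sketch runs without a network.

```perl
#!/usr/bin/perl
# annotate_hosts - append a bracketed address to each "foo.bar.com" style
# name in a string (sketch only; addresses come from a made-up table)
use strict;
use warnings;

# Hypothetical lookup table standing in for a real resolver call.
my %addr_for = ( 'www.example.com' => '192.0.2.10' );

sub annotate_hosts {
    my ($text) = @_;
    $text =~ s{
        (                   # capture the whole hostname
            (?:             # these parens are for grouping only
                (?! [-_] )  # a component may not start with dash or underscore
                [\w-] +     # hostname component
                \.          # and the domain dot
            ) +             # repeat that whole thing one or more times
            [A-Za-z]        # the last part must start with a letter
            [\w-] +         # and continue with word characters or dashes
        )                   # end of capture
    }{
        "$1 " .                     # the original name, plus a space
        ( exists $addr_for{$1}      # if the table knows the name
            ? "[$addr_for{$1}]"     #   show its address
            : "[???]" )             #   else mark it dubious
    }gex;   # g: every match, e: replacement is code, x: allow layout
    return $text;
}

print annotate_hosts("see www.example.com or ftp.nowhere.invalid"), "\n";
# prints: see www.example.com [192.0.2.10] or ftp.nowhere.invalid [???]
```

Because /e turns the replacement into code, the comments on the right-hand side are ordinary Perl comments, while those on the left-hand side are pattern comments enabled by /x.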
For aesthetics, the example uses alternate delimiters. When you split your match or substitution over multiple lines, using matching braces aids readability. A more common use of alternate delimiters is for patterns and replacements that themselves contain slashes, such as in s/\/\//\/..\//g. Alternate delimiters, as in s!//!/../!g or s{//}{/../}g, avoid escaping the non-delimiting slashes with backslashes, again improving legibility.

The /x pattern modifier makes Perl ignore whitespace in the pattern (outside a character class) and treat # characters and their following text as comments. The /e modifier changes the replacement portion from a string into code to run. Since it's code, you can put regular comments there, too.

To include literal whitespace or # characters in a pattern to which you've applied /x, escape them with a backslash:
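A minimal sketch of both escapes follows. The subroutine names expand_placeholders and is_two_words, and the "#name#" placeholder syntax, are assumptions made up for illustration.

```perl
use strict;
use warnings;

# Replace "#name#" placeholders with values from a caller-supplied hash.
sub expand_placeholders {
    my ($text, %value_for) = @_;
    $text =~ s{
        \#          # a literal pound sign; unescaped, it would begin a comment
        (\w+)       # the placeholder name
        \#          # the closing literal pound sign
    }{
        exists $value_for{$1} ? $value_for{$1} : "#$1#"   # leave unknown names alone
    }gex;
    return $text;
}

# True if the line is exactly two words separated by one space.
sub is_two_words {
    my ($line) = @_;
    return $line =~ m{
        ^ \w+       # first word
        \           # a literal space; unescaped, it would be ignored
        \w+ $       # second word
    }x ? 1 : 0;
}

print expand_placeholders("Dear #name#, your id is #id#", name => "Larry"), "\n";
# prints: Dear Larry, your id is #id#
```

A one-character class such as [ ] is another way to ask for a literal space, since /x leaves whitespace inside a character class alone.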
Remember that comments should explain what you're doing and why, not merely restate the code. Using "$i++ # add one to i" is apt to lose points in your programming course or at least get you talked about in substellar terms by your coworkers.

The last technique for rendering patterns more legible (and thus, more maintainable) is to place each semantic unit into a variable given an appropriate name. We use single quotes instead of doubles so backslashes don't get lost.
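A minimal sketch of such definitions follows. The names of the component variables are assumptions chosen for readability; the pieces are picked so that they assemble into the number pattern [-+]?\d+\.?\d* that appears later in this section.

```perl
use strict;
use warnings;

# Each semantic unit gets its own descriptive name. Single quotes keep
# the backslashes intact; in double quotes, "\d+" would lose its backslash.
my $optional_sign    = '[-+]?';
my $mandatory_digits = '\d+';
my $decimal_point    = '\.?';
my $optional_digits  = '\d*';

# Concatenate the pieces into one pattern string.
my $number = $optional_sign
           . $mandatory_digits
           . $decimal_point
           . $optional_digits;

print "$number\n";      # prints: [-+]?\d+\.?\d*
```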
Then use $number in further patterns:
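For instance, here is a sketch that assumes $number holds the pattern string [-+]?\d+\.?\d*; the sample line and the helper is_just_a_number are made up for illustration.

```perl
use strict;
use warnings;

# Assumed definition, assembled from single-quoted pieces.
my $number = '[-+]?' . '\d+' . '\.?' . '\d*';

my $line = "3.14 -42 +7.";

# Collect every number on the line.
my @all_numbers = $line =~ /$number/g;
print "all: @all_numbers\n";                        # prints: all: 3.14 -42 +7.

# Capture the first two numbers, separated by whitespace.
if (my ($first, $second) = $line =~ /($number)\s+($number)/) {
    print "got two numbers: $first and $second\n";  # 3.14 and -42
}

# Anchor the pattern to insist on a number and nothing else.
sub is_just_a_number {
    my ($text) = @_;
    return $text =~ /^$number$/ ? 1 : 0;
}
```

Note that the capturing parentheses live in the pattern that uses $number, not in $number itself.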
We can even combine all of these techniques:
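A sketch of the combination: brace delimiters, /x comments, and a named component. The definition of $number and the wrapper is_number_line are assumptions for illustration.

```perl
use strict;
use warnings;

# Assumed definition, assembled from single-quoted pieces.
my $number = '[-+]?' . '\d+' . '\.?' . '\d*';

# Check for a line of whitespace-separated numbers.
sub is_number_line {
    my ($line) = @_;
    return $line =~ m{
        ^ \s *          # optional leading whitespace
        $number         # at least one number
        (?:             # begin optional cluster
            \s +        # must have some separator
            $number     # then the next number
        ) *             # repeat at will
        \s * $          # optional trailing whitespace
    }x ? 1 : 0;
}

print is_number_line(" 1 2.5 -3 "), "\n";   # prints: 1
print is_number_line("1 two 3"), "\n";      # prints: 0
```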
Combining them is certainly a lot better than writing:

    /^\s*[-+]?\d+\.?\d*(?:\s+[-+]?\d+\.?\d*)*\s*/

Patterns that you put in variables should probably not contain capturing parentheses or backreferences, since a capture in one variable could change the numbering of those in others.

Clustering parentheses, that is, /(?:...)/ instead of /(...)/, are fine. Not only are they fine, they're necessary if you want to apply a quantifier to the whole variable. For example:
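A minimal sketch, with assumed component names, that wraps the assembled pattern in a noncapturing cluster and compares it with the ungrouped version. The eval guards against Perl versions that refuse to compile the ungrouped pattern followed by a plus.

```perl
use strict;
use warnings;

my $optional_sign    = '[-+]?';
my $mandatory_digits = '\d+';
my $decimal_point    = '\.?';
my $optional_digits  = '\d*';

my $ungrouped = $optional_sign . $mandatory_digits
              . $decimal_point . $optional_digits;
my $number    = "(?:$ungrouped)";   # noncapturing cluster around the whole thing

my $run = "1.5-2+3";                # three numbers run together

# With the cluster, the plus repeats the entire number.
my $grouped_ok = $run =~ /^$number+$/ ? 1 : 0;

# Without it, the plus attaches only to the trailing \d* piece.
my $ungrouped_ok = eval { $run =~ /^$ungrouped+$/ } ? 1 : 0;

print "grouped: $grouped_ok, ungrouped: $ungrouped_ok\n";
# prints: grouped: 1, ungrouped: 0
```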
Now you can say /$number+/ and have the plus apply to the whole number group. Without the grouping, the plus would have shown up right after the last star. Older versions of Perl reject that as a nested quantifier, and Perl 5.10 and later read *+ as a possessive star; neither is what was intended.

One more trick with clustering parentheses is that you can embed a modifier switch that applies only to that cluster. For example:

    $hex_digit = '(?i:[0-9a-f])';
    $hdr_line = '(?m:^[^:]*:.*)';

The qr// construct does this automatically using cluster parentheses, enabling any modifiers you specified and disabling any you didn't for that cluster:

    $hex_digit = qr/[0-9a-f]/i;
    $hdr_line = qr/^[^:]*:.*/m;
    print "hex digit is: $hex_digit\n";
    print "hdr line is: $hdr_line\n";

    hex digit is: (?i-xsm:[0-9a-f])
    hdr line is: (?m-xis:^[^:]*:.*)

Perl 5.14 and later print these more compactly, as (?^i:[0-9a-f]) and (?^m:^[^:]*:.*).

It's probably a good idea to use qr// in the first place:
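A sketch of the same number pattern built from qr// objects; the component names are assumptions. Each qr// value carries its own cluster, so a quantifier applied to $number covers the whole pattern without any hand-written (?:...).

```perl
use strict;
use warnings;

my $optional_sign    = qr/[-+]?/;
my $mandatory_digits = qr/\d+/;
my $decimal_point    = qr/\.?/;
my $optional_digits  = qr/\d*/;

# /x lets the components sit on separate lines.
my $number = qr{
    $optional_sign
    $mandatory_digits
    $decimal_point
    $optional_digits
}x;

# The plus applies to the whole number because qr// supplies the cluster.
my $matched = "1.5-2+3" =~ /^$number+$/ ? 1 : 0;
print "matched: $matched\n";    # prints: matched: 1
```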
Although the output can be a bit odd to read:
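To see this for yourself, print a compiled pattern that was built from other compiled patterns. This is a minimal sketch with made-up names; the exact text depends on your version of Perl, so no expected output is shown. Each nested qr// appears as its own cluster, written (?^:...) on Perl 5.14 and later and (?-xism:...) on older releases.

```perl
use strict;
use warnings;

my $sign    = qr/[-+]?/;
my $digits  = qr/\d+/;
my $integer = qr{ $sign $digits }x;

# Stringifying a qr// object yields its pattern text, clusters and all.
my $as_text = "$integer";
print "Integer is $as_text\n";

# Odd-looking or not, the text still works when interpolated as a pattern.
my $still_works = "-42" =~ /^$as_text$/ ? 1 : 0;
```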
6.4.4. See Also

The /x modifier in perlre(1) and Chapter 5 of Programming Perl; the "Comments Within a Regular Expression" section of Chapter 7 of Mastering Regular Expressions.
Copyright © 2003 O'Reilly & Associates. All rights reserved.