6.4. Commenting Regular Expressions

Problem
You want to make your complex regular expressions understandable and maintainable.

Solution
You have four techniques at your disposal: comments outside the pattern, comments inside the pattern with the /x modifier, comments inside the replacement part of a substitution with the /e modifier, and alternate delimiters.

Discussion
The piece of sample code in Example 6.1 uses all four techniques. The initial comment describes the overall intent of the regular expression. For relatively simple patterns, this may be all that is needed. More complex patterns, as in the example, will require more documentation.

Example 6.1: resname

    #!/usr/bin/perl -p
    # resname - change all "foo.bar.com" style names in the input stream
    # into "foo.bar.com [204.148.40.9]" (or whatever) instead

    use Socket;                 # load inet_ntoa
    s{                          #
        (                       # capture the hostname in $1
            (?:                 # these parens for grouping only
                (?! [-_] )      # lookahead for neither underscore nor dash
                [\w-] +         # hostname component
                \.              # and the domain dot
            ) +                 # now repeat that whole thing a bunch of times
            [A-Za-z]            # next must be a letter
            [\w-] +             # now trailing domain part
        )                       # end of $1 capture
    }{                          # replace with this:
        "$1 " .                 # the original bit, plus a space
            ( ($addr = gethostbyname($1))       # if we get an addr
                ? "[" . inet_ntoa($addr) . "]"  #     format it
                : "[???]"                       # else mark dubious
            )
    }gex;               # /g for global
                        # /e for execute
                        # /x for nice formatting
For aesthetics, the example uses alternate delimiters. When you split your match or substitution over multiple lines, matching braces help readability. Another common reason to use alternate delimiters is that your pattern or replacement contains slashes, each of which would otherwise need a backslash.
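A minimal sketch of the slash case, showing the same substitution written with the default delimiter and with two alternates. The sample path and the variable names are illustrative assumptions, not part of Example 6.1.

```perl
use strict;
use warnings;

# Hypothetical input: a path containing doubled slashes.
my $path = "usr//local//bin";

# Default delimiters: every slash in the pattern and the replacement
# has to be escaped with a backslash.
(my $with_slashes = $path) =~ s/\/\//\//g;

# Alternate delimiters: the same substitution with nothing to escape.
(my $with_braces = $path) =~ s{//}{/}g;
(my $with_bangs  = $path) =~ s!//!/!g;

print "$with_slashes\n$with_braces\n$with_bangs\n";
```

All three forms perform the same replacement; only the amount of backslashing differs.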
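Under /x, a # in the pattern starts a comment, so a literal pound sign has to be escaped with a backslash, as both pound signs are here:

    s/                  # replace
      \#                #   a pound sign
      (\w+)             #   the variable name
      \#                #   another pound sign
    /${$1}/xg;          # with the value of the global variable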
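To see that substitution run, here is a small sketch under stated assumptions: the package variables $name and $lang and the sample text are hypothetical. Because ${$1} is a symbolic reference, it only sees package (global) variables, and strict refs has to be relaxed around it. The comments inside the pattern avoid slashes and dollar signs, since the delimiter ends the pattern even inside a comment and variables are interpolated there.

```perl
use strict;
use warnings;

# Hypothetical globals for the substitution to look up.
our $name = "Larry";
our $lang = "Perl";

my $text = "#name# wrote #lang#";
{
    no strict 'refs';   # symbolic references are needed here
    $text =~ s/
        \#          # a literal pound sign, escaped so x keeps it
        (\w+)       # the variable name
        \#          # another literal pound sign
    /${$1}/xg;
}
print "$text\n";
```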
Remember that comments should explain the text, not just restate the code.
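The final technique is /e, which evaluates the replacement as a Perl expression rather than as a double-quoted string. Because the replacement is then code, it can carry comments of its own, as the replacement half of Example 6.1 does.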
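A short sketch of a commented replacement. The task (doubling every number) and the sample string are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my $text = "width=4 height=10";

# Hypothetical task: double every number in the string.
$text =~ s{
    (\d+)           # one or more digits, the first capture group
}{
    $1 * 2          # the replacement is code, so it can be commented
}gex;

print "$text\n";
```

The /x modifier covers the comment in the pattern half, and /e makes the replacement half ordinary Perl code, where comments are always allowed.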
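Doubling up the /e evaluates the replacement twice: the first pass builds a string such as '$name', and the second evaluates that string as Perl code, much like eval on a string. The substitution therefore works with any variable in scope, not just globals:

    s/                  # replace
      \#                #   a pound sign
      (\w+)             #   the variable name
      \#                #   another pound sign
    /'$' . $1/xeeg;     # with the value of *any* variable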
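The following sketch exercises /ee with lexical variables and then checks $@ after a lookup that fails. The variable names, the sample strings, and the deliberately undeclared name nosuch are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical lexicals: a symbolic reference cannot see these, but the
# string evaluation performed by the second e can.
my $name = "Larry";
my $lang = "Perl";

my $text = "#name# wrote #lang#";
$text =~ s/
    \#              # a pound sign
    (\w+)           # the variable name
    \#              # another pound sign
/'$' . $1/xeeg;     # first e builds '$name'; second e evaluates it
print "$text\n";

# A name with no matching variable: under strict the evaluated code
# fails to compile, and the error message is left in $@.
my $bad = "#nosuch#";
{
    no warnings 'uninitialized';    # the failed evaluation yields undef
    $bad =~ s/\#(\w+)\#/'$' . $1/eeg;
}
my $err = $@;
print "evaluation failed: $err" if $err;
```

Copying $@ into $err right away is a defensive choice, since any later eval would overwrite it.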
After a /ee substitution, you can check $@ for any error raised while evaluating the replacement.

See Also
The /x modifier in perlre(1)