1.16.3. Discussion
The substitution is straightforward. It removes leading whitespace
from the text of the here document. The /m
modifier lets the ^ character match at the start
of each line in the string, and the /g modifier
makes the pattern-matching engine repeat the substitution as often as
it can (i.e., for every line in the here document).
($definition = << 'FINIS') =~ s/^\s+//gm;
    The five varieties of camelids
    are the familiar camel, his friends
    the llama and the alpaca, and the
    rather less well-known guanaco
    and vicuña.
FINIS
Be warned: all patterns in this recipe use \s,
meaning one whitespace character, which will also match newlines.
This means they will remove any blank lines in your here document. If
you don't want this, replace \s with
[^\S\n] in the patterns.
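The difference is easy to see on a short string containing one blank line. The following is a minimal sketch; the variable names are illustrative and not part of the recipe.

```perl
use strict;
use warnings;

my $text = "    first line\n\n    second line\n";

# \s+ also matches newlines, so the blank line is swallowed along
# with the indentation of the line that follows it.
(my $greedy = $text) =~ s/^\s+//gm;

# [^\S\n]+ matches whitespace other than newline, so the blank line survives.
(my $careful = $text) =~ s/^[^\S\n]+//gm;

print $greedy;
print "----\n";
print $careful;
```

The first result has two lines and the second has three, with the blank line still between them.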
The substitution uses the property that the result of an assignment
can be used as the lefthand side of =~. This lets
us do it all in one line, but works only when assigning to a
variable. When you're using the here document directly, it would be
considered a constant value, and you wouldn't be able to modify it.
In fact, you can't change a here document's value
unless you first put it into a variable.
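On Perl 5.14 or later there is another option, which the recipe itself does not use: the /r modifier makes s/// return a modified copy instead of editing its operand in place, so it can be applied to a here document directly. A sketch, assuming such a Perl is available:

```perl
use strict;
use warnings;
use 5.014;   # /r (non-destructive substitution) first appeared in Perl 5.14

# Assign first, then modify: the parenthesized assignment yields the
# variable itself, so s/// edits $definition in place.
(my $definition = <<'FINIS') =~ s/^\s+//gm;
    The five varieties of camelids
    are the familiar camel, his friends
FINIS

# With /r, s/// returns a modified copy and leaves its operand alone,
# so no intermediate variable is needed.
my $copy = <<'FINIS' =~ s/^\s+//gmr;
    The five varieties of camelids
    are the familiar camel, his friends
FINIS

print $definition eq $copy ? "same\n" : "different\n";
```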
Not to worry, though, because there's an easy way around this,
particularly if you're going to do this a lot in the program. Just
write a subroutine:
sub fix {
    my $string = shift;
    $string =~ s/^\s+//gm;
    return $string;
}

print fix( << "END");
    My stuff goes here
END

# With function predeclaration, you can omit the parens:
print fix << "END";
    My stuff goes here
END
As with all here documents, you have to place this here document's
target (the token that marks its end, END in this
case) flush against the lefthand margin. To have the target indented
also, you'll have to put the same amount of whitespace in the quoted
string as you use to indent the token.
($quote = << '    FINIS') =~ s/^\s+//gm;
        ...we will have peace, when you and all your works have
        perished--and the works of your dark master to whom you would
        deliver us. You are a liar, Saruman, and a corrupter of men's
        hearts. --Theoden in /usr/src/perl/taint.c
    FINIS
$quote =~ s/\s+--/\n--/;    # move attribution to line of its own
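Perl 5.26 and later also offer an indented here-document form, written <<~, which lets the terminator be indented and removes the terminator's indentation from every line of the text. The recipe predates that syntax, so the following is only a supplementary sketch that assumes Perl 5.26 or newer. The quote_text subroutine is a hypothetical wrapper used to show the here document inside indented code.

```perl
use strict;
use warnings;
use 5.026;   # the <<~ form requires Perl 5.26 or later

sub quote_text {
    # The ~ allows the terminator to be indented and strips that
    # same indentation from each line of the body.
    my $quote = <<~'FINIS';
        ...we will have peace, when you and all your works have
        perished--and the works of your dark master
        FINIS
    return $quote;
}

print quote_text();
```

No substitution is needed here, and no whitespace has to be placed inside the quoted terminator.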
If you're doing this to strings that contain code you're building up
for an eval, or just text to print out, you might
not want to blindly strip all leading whitespace, because that would
destroy your indentation. Although eval wouldn't
care, your reader might.
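One simple way to keep relative indentation is to remove only as much whitespace as the first line carries. The outdent subroutine below is a hypothetical helper written for illustration, not part of the recipe, and it assumes the first line is the least indented one.

```perl
use strict;
use warnings;

# Hypothetical helper: strip the first line's indentation from every
# line, so deeper lines keep their indentation relative to it.
sub outdent {
    my $string = shift;
    my ($indent) = $string =~ /^([^\S\n]*)/;   # whitespace leading the first line
    $string =~ s/^\Q$indent\E//gm;             # remove exactly that much per line
    return $string;
}

my $code = outdent(<<'END_CODE');
    if ($ready) {
        launch();
    }
END_CODE

print $code;
```

The nested launch() line keeps its four extra spaces, while the outer lines end up flush left.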
Another embellishment is to mark embedded code with a special leading
string so that it stands out. For example, here we'll prepend each line
with @@@, properly indented:
if ($REMEMBER_THE_MAIN) {
    $perl_main_C = dequote << '    MAIN_INTERPRETER_LOOP';
        @@@ int
        @@@ runops( ) {
        @@@     SAVEI32(runlevel);
        @@@     runlevel++;
        @@@     while ( op = (*op->op_ppaddr)( ) ) ;
        @@@     TAINT_NOT;
        @@@     return 0;
        @@@ }
    MAIN_INTERPRETER_LOOP
    # add more code here if you want
}
Destroying indentation also gets you in trouble with poets.
sub dequote;
$poem = dequote <<EVER_ON_AND_ON;
       Now far ahead the Road has gone,
          And I must follow, if I can,
       Pursuing it with eager feet,
          Until it joins some larger way
       Where many paths and errands meet.
          And whither then? I cannot say.
                --Bilbo in /usr/src/perl/pp_ctl.c
EVER_ON_AND_ON
print "Here's your poem:\n\n$poem\n";
Here is its sample output:
Here's your poem:

Now far ahead the Road has gone,
   And I must follow, if I can,
Pursuing it with eager feet,
   Until it joins some larger way
Where many paths and errands meet.
   And whither then? I cannot say.
         --Bilbo in /usr/src/perl/pp_ctl.c
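The dequote function used in these examples first checks whether every
line begins with a common leading string and, if so, strips it. Otherwise,
it removes from each line the amount of leading whitespace found on the
first line.
sub dequote {
    local $_ = shift;
    my ($white, $leader);  # common whitespace and common leading string
    if (/^\s*(?:([^\w\s]+)(\s*).*\n)(?:\s*\1\2?.*\n)+$/) {
        ($white, $leader) = ($2, quotemeta($1));
    } else {
        ($white, $leader) = (/^(\s+)/, '');
    }
    s/^\s*?$leader(?:$white)?//gm;
    return $_;
}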
If that pattern makes your eyes glaze over, you could always break it
up and add comments by adding /x:
if (m{
        ^                       # start of the string
        \s *                    # 0 or more whitespace chars
        (?:                     # begin first non-remembered grouping
            (                   #   begin save buffer $1
                [^\w\s]         #     one character neither space nor word
                +               #     1 or more of such
            )                   #   end save buffer $1
            ( \s* )             #   put 0 or more white in buffer $2
            .* \n               #   match through the end of first line
        )                       # end of first grouping
        (?:                     # begin second non-remembered grouping
            \s *                #   0 or more whitespace chars
            \1                  #   whatever string is destined for $1
            \2 ?                #   what'll be in $2, but optionally
            .* \n               #   match through the end of the line
        ) +                     # now repeat that group idea 1 or more
        $                       # until the end of the string
    }x
   )
{
    ($white, $leader) = ($2, quotemeta($1));
} else {
    ($white, $leader) = (/^(\s+)/, '');
}
s{
    ^                       # start of each line (due to /m)
    \s *                    # any amount of leading whitespace
       ?                    #   but minimally matched
    $leader                 # our quoted, saved per-line leader
    (?:                     # begin unremembered grouping
        \Q$white\E          #   the same amount (\Q keeps /x from ignoring it)
    ) ?                     # optionalize in case EOL after leader
}{}xgm;
There, isn't that much easier to read? Well, maybe not; sometimes it
doesn't help to pepper your code with insipid comments that mirror
the code. This may be one of those cases.
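To see what dequote returns in each of its two cases, here is a small self-contained sketch. It repeats the function from this recipe so that it runs on its own; the here-document contents and variable names are shortened stand-ins chosen for illustration.

```perl
use strict;
use warnings;

# dequote as given in this recipe, repeated so the sketch is runnable
sub dequote {
    local $_ = shift;
    my ($white, $leader);  # common whitespace and common leading string
    if (/^\s*(?:([^\w\s]+)(\s*).*\n)(?:\s*\1\2?.*\n)+$/) {
        ($white, $leader) = ($2, quotemeta($1));
    } else {
        ($white, $leader) = (/^(\s+)/, '');
    }
    s/^\s*?$leader(?:$white)?//gm;
    return $_;
}

# Case 1: every line starts with the same leader, here @@@
my $c_code = dequote(<<'    END_C');
    @@@ int
    @@@ runops( ) {
    @@@     return 0;
    @@@ }
    END_C

# Case 2: no common leader, so only the first line's indentation goes
my $verse = dequote(<<'    END_VERSE');
    And whither then? I cannot say.
        --Bilbo
    END_VERSE

print $c_code, $verse;
```

In the first case the leader and the single space after it are removed, so the body of the C function keeps its own indentation. In the second, the attribution line stays indented relative to the verse.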