First, of course, the subpattern fred matches the
identical literal string. The next part of the pattern is the
.+, which matches any character except newline, at
least one time. But the plus quantifier is greedy; it prefers to
match as much as possible. So it immediately matches all of the rest
of the string, including the word night. (This may
surprise you, but the story isn't over yet.)
Now the subpattern barney would like to match, but
it can't, because we're at the end of the string. But since
the .+ could still be successful even if it
matched one fewer character, it reluctantly gives back the letter
t at the end of the string. (It's greedy,
but it wants the whole pattern to succeed even more than it wants to
match everything all by itself.)
The subpattern barney tries again to match, and
still can't. So the .+ gives back the letter
h and lets it try again. One character after
another, the .+ gives back what it matched until
finally it has given back everything from barney to the end of the string.
Now, finally, the subpattern barney can match, and
the overall match succeeds.
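The walkthrough above can be checked with a small sketch. The sample string is an assumption reconstructed from the description (it ends in night and has a space after fred), and the capturing parentheses are added only so we can see what the .+ is left holding once the backtracking is done.

```perl
use strict;
use warnings;

# Assumed sample string, reconstructed from the walkthrough above.
my $text = "fred and barney went bowling last night";

# The parentheses are added for illustration: they capture whatever
# the greedy .+ still holds after it has given characters back.
my ($greedy_middle) = $text =~ /fred(.+)barney/;

print "greedy .+ kept: '$greedy_middle'\n";
```

Although the .+ first grabbed everything through night, it ends up holding only the text between fred and barney, because that is the longest stretch that still lets barney match.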
For each of the greedy quantifiers, though, there's also a
non-greedy quantifier available. Instead of the plus
(+), we can use the non-greedy quantifier
+?, which matches one or more times (just as the
plus does), except that it prefers to match as few times as possible,
rather than as many as possible. Let's see how that new
quantifier works when the pattern is rewritten as
/fred.+?barney/.
Once again, fred matches right at the start. But
this time the next part of the pattern is .+?,
which would prefer to match no more than one character, so it matches
just the space after fred. The next subpattern is
barney, but that can't match here (since the
string at the current position begins with "and
barney..."). So the .+? reluctantly
matches the a and lets the rest of the pattern try
again. Once again, barney can't match, so
the .+? accepts the letter n
and so on. Once the .+? has matched five
characters, barney can match, and the pattern is a
success.
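To see the two quantifiers side by side, here is a minimal sketch. The string is hypothetical and deliberately contains barney twice, so that the greedy and non-greedy forms settle on different matches. The capturing parentheses are again added only for illustration.

```perl
use strict;
use warnings;

# Hypothetical string with two occurrences of barney.
my $text = "fred and barney went bowling with barney";

# Greedy: .+ keeps as much as it can, so it stops before the LAST barney.
my ($greedy) = $text =~ /fred(.+)barney/;

# Non-greedy: .+? keeps as little as it can, so it stops before the FIRST barney.
my ($lazy) = $text =~ /fred(.+?)barney/;

print "greedy:     '$greedy'\n";
print "non-greedy: '$lazy' (", length($lazy), " characters)\n";
```

The non-greedy capture is the five characters described above: a space, the letters a, n, and d, and another space.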
There was still some backtracking, but since the engine had to go
back and try again just a few times, it should be a big improvement
in speed. Well, it's an improvement if you'll generally
find barney near fred. If your
data often has fred near the start of the string
and barney only at the end, the greedy quantifier
might be a faster choice. In the end, the speed of the regular
expression depends upon the data.
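One way to find out which form suits your data is to time both. The following is only a sketch: it uses the standard Benchmark module, and the two test strings are hypothetical stand-ins for "barney near fred" and "barney only at the end". The regular expression engine's own optimizations may blur or even reverse the difference, so treat the numbers you see as a measurement of your setup rather than a rule.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical data: barney right after fred, then a long tail...
my $near = "fred and barney " . ("x" x 10_000);
# ...versus barney only at the very end of a long string.
my $far  = "fred " . ("x" x 10_000) . " barney";

for my $case ([near => $near], [far => $far]) {
    my ($label, $string) = @$case;
    print "--- barney $label fred ---\n";
    # A negative count means: run each sub for at least that many CPU seconds.
    cmpthese(-1, {
        greedy     => sub { $string =~ /fred.+barney/ },
        non_greedy => sub { $string =~ /fred.+?barney/ },
    });
}
```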
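Greediness matters in substitutions, too. Suppose a substitution that
uses a greedy star, such as s#<BOLD>(.*)</BOLD>#$1#g, meets this text:
I thought you said Fred and <BOLD>Velma</BOLD>, not <BOLD>Wilma</BOLD>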
In that case, the pattern would match from the first
<BOLD> to the last
</BOLD>, leaving intact the ones in the
middle of the line. Oops! Instead, we want a non-greedy quantifier.
The non-greedy form of star is *?, so the
substitution now looks like this:
s#<BOLD>(.*?)</BOLD>#$1#g;
And it does the right thing.
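Here is a short sketch that runs both substitutions on the sample line so the difference is visible. The greedy version is an assumption: it is simply the substitution above with *? replaced by a plain star.

```perl
use strict;
use warnings;

my $line = "I thought you said Fred and <BOLD>Velma</BOLD>, not <BOLD>Wilma</BOLD>";

# Assumed greedy form: .* runs from the first <BOLD> to the last </BOLD>.
(my $greedy = $line) =~ s#<BOLD>(.*)</BOLD>#$1#g;

# Non-greedy form from the text: each <BOLD>...</BOLD> pair is handled separately.
(my $lazy = $line) =~ s#<BOLD>(.*?)</BOLD>#$1#g;

print "greedy:     $greedy\n";
print "non-greedy: $lazy\n";
```

The greedy result still contains the inner </BOLD> and <BOLD>, while the non-greedy result has every tag removed.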