

Perl Cookbook

6.11. Testing for a Valid Pattern

6.11.3. Discussion

There's no limit to the number of invalid, uncompilable patterns. The user could mistakenly enter "<I\s*[^>", "*** GET RICH ***", or "+5-i". If you blindly use the proffered pattern in your program, it raises an exception, which is normally fatal.

The tiny program in Example 6-6 shows how to test a user-supplied pattern before using it.

Example 6-6. paragrep

  #!/usr/bin/perl
  # paragrep - trivial paragraph grepper
  die "usage: $0 pat [files]\n" unless @ARGV;
  $/ = '';
  $pat = shift;
  eval { "" =~ /$pat/; 1 }      or die "$0: Bad pattern $pat: $@\n";
  while (<>) {
      print "$ARGV $.: $_" if /$pat/o;
  }

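The /o modifier tells Perl to interpolate the variables in the pattern only once, so the pattern is compiled a single time and stays the same even if the variables' contents later change.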
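To make the effect of /o concrete, here is a small sketch. The loop and the array names are made up for illustration. Each match operator is tried twice, first with $pat set to 'a' and then with 'b'.

```perl
use strict;
use warnings;

my (@with_o, @without_o);
for my $pat ('a', 'b') {
    # With /o, the pattern is interpolated on the first pass only,
    # so this operator keeps matching /a/ even after $pat becomes 'b'.
    push @with_o,    ('b' =~ /$pat/o ? 1 : 0);
    # Without /o, the pattern follows the current value of $pat.
    push @without_o, ('b' =~ /$pat/  ? 1 : 0);
}
print "with /o:    @with_o\n";      # 0 0
print "without /o: @without_o\n";   # 0 1
```

In paragrep the pattern never changes after startup, so compiling it once is exactly what is wanted.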

You could encapsulate this in a function call that returns 1 if the block completes and 0 if not, as shown in the Solution. The simpler eval "/$pat/" would also work to trap the exception, but has two other problems. One is that any slashes (or whatever your chosen pattern delimiter is) in the string the user entered would end the pattern prematurely and raise an exception. More importantly, it would open a drastic security hole that you almost certainly want to avoid, because the string form of eval runs whatever Perl code the user's text happens to contain. Strings like this could ruin your day:

# single-quoted here, as it would arrive from the user, so nothing runs at assignment
$pat = q{You lose @{[ system('rm -rf *')]} big here};
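The function form mentioned above might look like the following minimal sketch. The name is_valid_pattern is an assumption made for illustration, and the sketch compiles the pattern with qr// instead of matching against an empty string. Because it uses the block form of eval, the user's text is only ever treated as a pattern and is never parsed as Perl code.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $pat compiles as a regular
# expression and 0 if compiling it raises an exception.
sub is_valid_pattern {
    my ($pat) = @_;
    # Block eval traps the exception raised by a bad pattern.
    return eval { my $compiled = qr/$pat/; 1 } ? 1 : 0;
}

for my $candidate ('<I\s*[^>', '*** GET RICH ***', '+5-i', 'colou?r') {
    printf "%-18s %s\n", $candidate,
        is_valid_pattern($candidate) ? "ok" : "invalid";
}
```

Of the four candidates, only the last one compiles.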

If you don't want to let the user provide a real pattern, you can always metaquote the string first:

$safe_pat = quotemeta($pat);
something( ) if /$safe_pat/;

Or, even easier, use:

something( ) if /\Q$pat/;

But if you're going to do that, why are you using pattern matching at all? In that case, a simple use of index would be enough. But sometimes you want a literal part and a regex part, such as:

something( ) if /^\s*\Q$pat\E\s*$/;
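The following sketch pulls those three options together: quotemeta, \Q...\E inside a larger pattern, and index for a plain substring test. The sample strings are invented for illustration.

```perl
use strict;
use warnings;

my $pat  = 'a.b*c';              # user input full of metacharacters
my $safe = quotemeta($pat);      # backslashes every non-word character
print "$safe\n";                 # a\.b\*c

my $line = "   a.b*c   ";

# Literal part (\Q...\E) surrounded by a real regex part.
my $anchored = ($line =~ /^\s*\Q$pat\E\s*$/) ? 1 : 0;

# A pure substring test needs no regex at all.
my $found = (index($line, $pat) >= 0) ? 1 : 0;

# Unquoted, the same text would match "aXbbbc"; quoted, it does not.
my $loose  = ("aXbbbc" =~ /$pat/)   ? 1 : 0;
my $strict = ("aXbbbc" =~ /\Q$pat/) ? 1 : 0;

print "anchored=$anchored found=$found loose=$loose strict=$strict\n";
```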

Letting the user supply a real pattern gives them power enough for many interesting and useful operations. This is a good thing. You just have to be slightly careful. Suppose they wanted to enter a case-insensitive pattern, but you didn't provide the program with an option like grep's -i option. By permitting full patterns, the user can enter an embedded /i modifier as (?i), as in /(?i)stuff/.
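As a small illustration, assume the user typed the pattern shown in $pat below. The sample lines are invented. The embedded (?i) makes the match case-insensitive without any support from the program itself.

```perl
use strict;
use warnings;

my $pat   = '(?i)stuff';     # user-supplied, with an embedded modifier
my @lines = ("Stuff happens", "STUFFED shirt", "nothing here");

# The program uses a plain /$pat/ with no /i of its own.
my @hits = grep { /$pat/ } @lines;
print "$_\n" for @hits;      # the first two lines
```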

What happens if the interpolated pattern expands to nothing? If $pat is the empty string, what does /$pat/ match—that is, what does a blank // match? It doesn't match the start of all possible strings. Surprisingly, matching the null pattern exhibits the dubiously useful semantics of reusing the previous successfully matched pattern. In practice, this is hard to make good use of in Perl.
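The sketch below shows the surprise in action, along with one possible guard. The function name and sample strings are assumptions for illustration. Wrapping the interpolated variable in (?:...) is just one way to keep the pattern from ever being empty.

```perl
use strict;
use warnings;

# Hypothetical demonstration of the empty-pattern rule.
sub demo_empty_pattern {
    my $pat = "";

    "hello world" =~ /wor/ or die "setup match failed";

    # $pat is empty, so Perl reuses /wor/, which "xyz" does not match.
    my $reused = ("xyz" =~ /$pat/) ? 1 : 0;

    # /(?:)/ is not an empty pattern, so it matches any string.
    my $guarded = ("xyz" =~ /(?:$pat)/) ? 1 : 0;

    return ($reused, $guarded);
}

my ($reused, $guarded) = demo_empty_pattern();
print "reused=$reused guarded=$guarded\n";   # reused=0 guarded=1
```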




Copyright © 2003 O'Reilly & Associates. All rights reserved.