

6.9. Matching Shell Globs as Regular Expressions

Problem

You want to allow users to specify matches using traditional shell wildcards, not full Perl regular expressions. Wildcards are easier to type than full regular expressions for simple cases.

Solution

Use the following subroutine to convert four shell wildcard characters into their regular expression equivalents; all other characters are quoted so that they match literally.

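sub glob2pat {
    my $globstr = shift;
    # Map each shell wildcard to its regular expression equivalent.
    my %patmap = (
        '*' => '.*',    # any run of characters, including none
        '?' => '.',     # any single character
        '[' => '[',     # start of a character class
        ']' => ']',     # end of a character class
    );
    # Translate wildcards; quote every other character so it matches literally.
    $globstr =~ s{(.)} { $patmap{$1} || "\Q$1" }ge;
    # Anchor at both ends so the whole string must match.
    return '^' . $globstr . '$';
}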
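
As a usage illustration, the following minimal sketch filters a list of names with a user-supplied glob. The helper name glob_grep and the sample filenames are assumptions made for this example, not part of the recipe; glob2pat is repeated only so the sketch runs on its own.

```perl
use strict;
use warnings;

# glob2pat as given in the Solution, repeated so this sketch is standalone.
sub glob2pat {
    my $globstr = shift;
    my %patmap = (
        '*' => '.*',
        '?' => '.',
        '[' => '[',
        ']' => ']',
    );
    $globstr =~ s{(.)} { $patmap{$1} || "\Q$1" }ge;
    return '^' . $globstr . '$';
}

# Hypothetical helper (not from the recipe): return the names matching a glob.
sub glob_grep {
    my ($glob, @names) = @_;
    my $pat = glob2pat($glob);
    my $re  = qr/$pat/;              # compile the pattern once
    return grep { $_ =~ $re } @names;
}

# Sample data invented for the example.
my @names   = qw(type1.c type2.h types.txt list.c README);
my @matches = glob_grep('type*.[ch]', @names);
print "@matches\n";                  # prints "type1.c type2.h"
```

Compiling the converted string with qr// once, rather than interpolating it on every comparison, is a design choice of the sketch; either approach gives the same matches.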

Discussion

A Perl pattern is not the same as a shell wildcard pattern. The shell's *.* is not a valid regular expression. Its meaning as a pattern would be /^.*\..*$/, which is admittedly much less fun to type.

The function given in the Solution makes these conversions for you, following the standard wildcard rules used by the glob built-in.

    Shell         Perl
    list.?        ^list\..$
    project.*     ^project\..*$
    *old          ^.*old$
    type*.[ch]    ^type.*\.[ch]$
    *.*           ^.*\..*$
    *             ^.*$

In the shell, the rules are different. The entire pattern is implicitly anchored at the ends. A question mark matches any single character, an asterisk matches any number of characters (including none), and brackets enclose a set of characters. Everything else matches itself.
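
The Shell-to-Perl conversions listed above can be checked mechanically. The script below is a small sketch that feeds each shell pattern to glob2pat and compares the result with the expected Perl pattern; the variable names are illustrative, and glob2pat is copied from the Solution so the script is self-contained.

```perl
use strict;
use warnings;

# glob2pat as given in the Solution.
sub glob2pat {
    my $globstr = shift;
    my %patmap = (
        '*' => '.*',
        '?' => '.',
        '[' => '[',
        ']' => ']',
    );
    $globstr =~ s{(.)} { $patmap{$1} || "\Q$1" }ge;
    return '^' . $globstr . '$';
}

# Each pair is (shell glob, expected Perl pattern), taken from the listing.
my @cases = (
    [ 'list.?',     '^list\..$'      ],
    [ 'project.*',  '^project\..*$'  ],
    [ '*old',       '^.*old$'        ],
    [ 'type*.[ch]', '^type.*\.[ch]$' ],
    [ '*.*',        '^.*\..*$'       ],
    [ '*',          '^.*$'           ],
);

my $failures = 0;
for my $case (@cases) {
    my ($glob, $expected) = @$case;
    my $got = glob2pat($glob);
    $failures++ if $got ne $expected;
    printf "%-12s %-18s %s\n", $glob, $got,
        ($got eq $expected ? 'ok' : 'MISMATCH');
}
print "failures: $failures\n";
```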

Most shells do more than simple one-directory globbing. For instance, you can say */* to mean "all the files in all the subdirectories of the current directory." Furthermore, most shells don't list files whose names begin with a period, unless you explicitly put that leading period into your glob pattern. Our glob2pat function doesn't do these things; if you need them, use the File::KGlob module from CPAN.
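
If only the leading-period convention is needed, it can be approximated with a thin wrapper around glob2pat. This is a rough sketch under stated assumptions, not a replacement for a real globbing module: the function name glob_match_nodot is hypothetical, it tests a single filename with no directory part, and it does nothing about multi-directory patterns such as */*.

```perl
use strict;
use warnings;

# glob2pat as given in the Solution.
sub glob2pat {
    my $globstr = shift;
    my %patmap = (
        '*' => '.*',
        '?' => '.',
        '[' => '[',
        ']' => ']',
    );
    $globstr =~ s{(.)} { $patmap{$1} || "\Q$1" }ge;
    return '^' . $globstr . '$';
}

# Hypothetical wrapper (not from the recipe): returns 1 if $name matches
# $glob, hiding names that begin with a period unless the glob itself
# begins with a literal period.
sub glob_match_nodot {
    my ($glob, $name) = @_;
    return 0 if $name =~ /^\./ && $glob !~ /^\./;
    my $pat = glob2pat($glob);
    return $name =~ /$pat/ ? 1 : 0;
}

# Sample names invented for the example.
for my $name (qw(notes.txt .profile)) {
    printf "%-10s * => %d   .* => %d\n",
        $name,
        glob_match_nodot('*',  $name),
        glob_match_nodot('.*', $name);
}
```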

See Also

Your system's csh(1) and ksh(1) manpages; the glob function in perlfunc(1) and Chapter 3 of Programming Perl; the documentation for the CPAN module Glob::DosGlob; the "I/O Operators" section of perlop(1) and the "Filename globbing operator" section of Chapter 2 of Programming Perl; we talk more about globbing in Recipe 9.6.