9.6. Globbing, or Getting a List of Filenames Matching a Pattern

Problem

You want to get a list of filenames matching a pattern, similar to MS-DOS's *.* and Unix's *.h (this is called globbing).

Solution

Perl provides globbing with the semantics of the Unix C shell through the glob keyword and < >:

@list = <*.c>;
@list = glob("*.c");
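Because glob follows C shell rules, a single pattern can cover several extensions through brace expansion. The following is only a sketch under stated assumptions: sources_in is a hypothetical helper name, and $dir is assumed to contain no whitespace or wildcard characters, since the built-in glob would treat those as pattern syntax.

```perl
use strict;
use warnings;

# sources_in() is a hypothetical helper, not a Perl built-in.
# It returns the paths of the *.c and *.h files in $dir, sorted by name.
# Assumes $dir contains no whitespace or glob metacharacters.
sub sources_in {
    my ($dir) = @_;
    # {c,h} is csh-style brace expansion: the pattern is tried as *.c, then *.h
    my @paths = glob("$dir/*.{c,h}");
    # Sort explicitly, because each brace alternative is expanded separately
    return sort @paths;
}
```

Calling sources_in(".") in list context would return entries such as ./main.c and ./util.h, if the current directory holds such files.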

You can also use readdir to extract the filenames manually:

opendir(DIR, $path)      or die "Can't open $path: $!";
@files = grep { /\.c$/ } readdir(DIR);
closedir(DIR);

The CPAN module File::KGlob does globbing without length limits:

use File::KGlob;

@files = glob("*.c");

Discussion

Perl's built-in glob and <WILDCARD> notation (not to be confused with <FILEHANDLE>) currently use an external program to get the list of filenames on most platforms. This program is csh on Unix,[2] and a program called dosglob.exe on Windows. On VMS and the Macintosh, file globs are done internally without an external program. Globs are supposed to give C shell semantics even on non-Unix systems to encourage portability. The use of the shell on Unix also makes the built-in glob inappropriate for setuid scripts.

[2] Usually. If tcsh is installed, Perl uses that because it's safer. If neither is installed, /bin/sh is used.

To get around this, you can roll your own selection mechanism using either the built-in opendir or CPAN's File::KGlob, neither of which uses external programs. File::KGlob provides Unix shell-like globbing semantics, whereas opendir lets you select files with Perl's regular expressions.

At its simplest, an opendir solution uses grep to filter the list returned by readdir:

@files = grep { /\.[ch]$/i } readdir(DH);

You could also do this with the DirHandle module:

use DirHandle;

$dh = DirHandle->new($path)   or die "Can't open $path : $!\n";
@files = grep { /\.[ch]$/i } $dh->read();
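Both snippets leave opening and closing the directory to the surrounding code. A self-contained sketch might look like the following. The name c_and_h_files is hypothetical, and the lexical directory handle assumes Perl 5.6 or later.

```perl
use strict;
use warnings;

# c_and_h_files() is a hypothetical helper name used for illustration.
# It returns the bare names (no directory part) of entries in $path
# whose names end in .c or .h, matched case-insensitively.
sub c_and_h_files {
    my ($path) = @_;
    opendir(my $dh, $path) or die "Can't open $path: $!";
    my @files = grep { /\.[ch]$/i } readdir($dh);   # filter with a regex
    closedir($dh);
    return sort @files;                             # readdir order is unspecified
}
```

The explicit sort is a design choice rather than a requirement: readdir returns entries in whatever order the filesystem provides, so sorting makes the result predictable.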

As always, the filenames returned don't include the directory. When you use the filename, you'll need to prepend the directory name:

opendir(DH, $dir)        or die "Couldn't open $dir for reading: $!";

@files = ();
while( defined ($file = readdir(DH)) ) {
    next unless $file =~ /\.[ch]$/i;        # match the name just read, not $_

    my $filename = "$dir/$file";
    push(@files, $filename) if -T $filename; # test the full path, not the bare name
}
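Two details are easy to get wrong in a loop like this: the pattern has to be applied to the variable that holds the name, and file tests need the full path unless the directory being read is the current one. A sketch that packages both points, using the hypothetical helper name text_sources and File::Spec to join the path portably:

```perl
use strict;
use warnings;
use File::Spec;

# text_sources() is a hypothetical helper name used for illustration.
# It returns full paths of the .c/.h entries in $dir that look like text.
sub text_sources {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Couldn't open $dir for reading: $!";
    my @files;
    while (defined(my $file = readdir($dh))) {
        next unless $file =~ /\.[ch]$/i;              # filter on the bare name
        my $path = File::Spec->catfile($dir, $file);  # prepend the directory
        push @files, $path if -T $path;               # file test on the full path
    }
    closedir($dh);
    return sort @files;
}
```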

The following example combines directory reading and filtering with the Schwartzian Transform from Chapter 4, Arrays, for efficiency. It sets @dirs to a sorted list of the subdirectories in a directory whose names are all numeric:

@dirs = map  { $_->[1] }                # extract pathnames
        sort { $a->[0] <=> $b->[0] }    # sort names numeric
        grep { -d $_->[1] }             # path is a dir
        map  { [ $_, "$path/$_" ] }     # form (name, path)
        grep { /^\d+$/ }                # just numerics
        readdir(DIR);                   # all files

Recipe 4.15 explains how to read these strange-looking constructs. As always, formatting and documenting your code can make it much easier to read and understand.
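For comparison, the same pipeline can be written top to bottom with one named intermediate list per stage, which some readers find easier to follow than reading the chain from the last line up. This is a sketch, not a replacement for the version above. The name numeric_subdirs and the intermediate variable names are hypothetical.

```perl
use strict;
use warnings;

# numeric_subdirs() is a hypothetical helper name used for illustration.
# It returns the paths of the subdirectories of $path whose names are
# all digits, sorted by numeric value.
sub numeric_subdirs {
    my ($path) = @_;
    opendir(my $dh, $path) or die "Can't open $path: $!";
    my @names = grep { /^\d+$/ } readdir($dh);         # just numerics
    closedir($dh);

    my @pairs  = map  { [ $_, "$path/$_" ] } @names;   # form [name, path]
    my @dirs   = grep { -d $_->[1] } @pairs;           # keep directories only
    my @sorted = sort { $a->[0] <=> $b->[0] } @dirs;   # sort by numeric name
    return map { $_->[1] } @sorted;                    # extract pathnames
}
```

Given subdirectories named 9, 10, and 100, a numeric sort returns them in that order, whereas a plain string sort would put 10 and 100 before 9.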

See Also

The opendir, readdir, closedir, grep, map, and sort functions in perlfunc(1) and in Chapter 3 of Programming Perl; documentation for the standard DirHandle module (also in Chapter 7 of Programming Perl); the "I/O Operators" section of perlop(1), and the "Filename Globbing Operator" section of Chapter 2 of Programming Perl; we talk more about globbing in Recipe 6.9; Recipe 9.7

