10.3 Using Pathnames and Filenames

When working with files and pathnames, you face a choice: how should you specify pathnames? Perl accepts either a slash or a backslash as a path delimiter.[ 1 ] The slash is the usual path delimiter on UNIX systems, while the backslash is the traditional MS-DOS path delimiter. The slash is also the path delimiter in URLs. The following strings all name the same directory, as far as Perl is concerned:[ 2 ]

[1] Actually, pathnames are just passed to the operating system, which accepts either a slash or a backslash.

[2] The only portable delimiter is the slash. Of course, if you're using drive letters, your script isn't really portable anyway.

"c:\\temp"    # backslash (escaped for double quoted string)
'c:\temp'     # backslash (single quoted string)
"c:/temp"     # slash - no escape needed

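As a quick check, the following minimal sketch compares the three spellings. The two backslash literals produce the same string, while the slash form is a different string that the operating system happens to treat as the same path. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $escaped = "c:\\temp";   # double-quoted: \\ yields one backslash
my $single  = 'c:\temp';    # single-quoted: \t is not an escape here
my $slashed = "c:/temp";    # slash needs no escaping in either quote style

# The two backslash literals are the same seven-character string.
print "backslash forms are identical\n" if $escaped eq $single;

# The slash form is a different string to Perl; it names the same
# directory only because the operating system accepts both delimiters.
(my $converted = $slashed) =~ tr{/}{\\};
print "slash form matches after conversion\n" if $converted eq $escaped;
```
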
There are tradeoffs associated with either approach. First we look at the backslash: if you use the backslash to delimit paths, you have compatibility problems with scripts that need to run on UNIX systems. You also need to remember to escape the backslash inside double-quoted strings (or use single-quoted strings, because they are not interpolated). Finally, you need to remember to use a slash if you're outputting URL paths.

If you decide to use a slash, you need to consider the following issues: although some Windows NT programs and utilities accept slashes as a delimiter, many do not. Traditionally, the slash is used to specify command-line options to MS-DOS programs, so many programs interpret slashes as command switches. Generally speaking, if your script is self-contained, you won't run into any difficulties using slashes. However, if you need to pass pathnames to external programs, you'll probably need to use backslashes (unless you know that the program you're using accepts slashes).

Our practice is to use slashes unless we're passing a path to an external program, in which case we use backslashes. If you're using one style of delimiter, you could easily switch to the other style by doing a simple substitution. You must exercise caution if you're writing code that parses a path to extract components; make sure that your code either regularizes paths to use the same delimiter, or that it handles both delimiters when extracting components.[ 3 ]

[3] Or consider using File::Basename, which does portable parsing of path components.
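
The two options above can be sketched as follows: regularize the delimiter with a simple substitution, or let File::Basename do the parsing. The helper names to_slashes and to_backslashes are hypothetical, invented for this illustration. The call to fileparse_set_fstype is an assumption made so that the sketch applies Win32 parsing rules on any platform.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse fileparse_set_fstype);

# Hypothetical helper: rewrite every backslash as a slash.
sub to_slashes {
    my ($path) = @_;
    $path =~ tr{\\}{/};
    return $path;
}

# Hypothetical helper: rewrite every slash as a backslash, for
# handing a path to an external program.
sub to_backslashes {
    my ($path) = @_;
    $path =~ tr{/}{\\};
    return $path;
}

# Approach 1: regularize first, then split on a single delimiter.
my $mixed = 'c:\temp/words.secret';
my @parts = split m{/}, to_slashes($mixed);
print "last component: $parts[-1]\n";

# Approach 2: let File::Basename parse using Win32 rules, under which
# both delimiters are recognized.
fileparse_set_fstype('MSWin32');
my ($name, $dir) = fileparse($mixed);
print "name: $name, dir: $dir\n";
```

Note that fileparse leaves the delimiters in the directory part as it found them, so you may still want to regularize the result before comparing paths.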

Another issue to consider is the use of long filenames versus the traditional MS-DOS 8.3 filename (a maximum of eight characters, followed by an optional extension of up to three characters). You'll find that some programs do not handle long filenames gracefully (particularly those with embedded spaces in them). In fact, if you're communicating with 16-bit programs (of either the Windows 3.x or DOS variety), the odds are very high that they won't understand long filenames.

To convert a long filename to an 8.3 filename, use the Win32::GetShortPathName [ 4 ] function:

[4] For a discussion of the Win32 extensions, see Appendix B, Libraries and Modules.

use Win32;
$longname = 'words.secret';
$shortname = Win32::GetShortPathName($longname);
   # $shortname has WORDS~1.SEC
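
Win32::GetShortPathName returns undef when the path does not exist, so it is worth checking the result before using it. The following is a minimal sketch of one way to do that. The wrapper name short_name_or_original is hypothetical, and falling back to the original name is a design choice made for this illustration, not something the Win32 extension does for you.

```perl
use strict;
use warnings;

# Hypothetical wrapper: return the 8.3 form of $path when it can be
# obtained, otherwise return $path unchanged.
sub short_name_or_original {
    my ($path) = @_;

    # The Win32 extensions exist only on Windows builds of Perl.
    return $path unless $^O eq 'MSWin32';

    require Win32;
    my $short = Win32::GetShortPathName($path);

    # No short name is available if the path does not exist.
    return $short if defined $short && length $short;
    return $path;
}

print short_name_or_original('words.secret'), "\n";
```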

Perl can also be used to open files using UNC (Universal Naming Convention) pathnames. A UNC path consists of two backslashes (or slashes) followed by a machine name and a share. The following example opens a file using a UNC pathname:

open(F, '//someserver/share/somefile') ||
  die "open: $!";
$cnt = 0;
while(<F>) {$cnt++;} # count the number of lines
close(F) || die "close: $!";
print "$cnt lines\n";

If you use backslashes, make sure that they're properly escaped:

open(F, "\\\\someserver\\share\\somefile") ||
  die "open: $!";
$cnt = 0;
while(<F>) {$cnt++;} # count the number of lines
close(F) || die "close: $!";
print "$cnt lines\n";
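
Because a UNC path may use either delimiter, code that picks it apart should accept both. The sketch below splits a UNC path into the machine name, the share, and whatever follows. The function name parse_unc is hypothetical, and the pattern is an assumption about what counts as a well-formed UNC path for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split a UNC path into machine name, share, and
# the remainder. Returns an empty list if $path is not a UNC path.
sub parse_unc {
    my ($path) = @_;

    # Two delimiters, a machine name, one delimiter, a share, and an
    # optional remainder; each delimiter may be a slash or a backslash.
    if ($path =~ m{^[\\/][\\/]([^\\/]+)[\\/]([^\\/]+)(?:[\\/](.*))?$}s) {
        return ($1, $2, defined $3 ? $3 : '');
    }
    return ();
}

for my $path ('//someserver/share/somefile',
              "\\\\someserver\\share\\somefile") {
    my ($machine, $share, $rest) = parse_unc($path);
    print "machine=$machine share=$share rest=$rest\n";
}
```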