7. File Access
I the heir of all the ages, in the foremost files of time.
- Alfred, Lord Tennyson, Locksley Hall

7.0. Introduction

Nothing is more central to data processing than the file. As with everything else in Perl, easy things are easy and hard things are possible. Common tasks (opening, reading data, writing data) use simple I/O functions and operators, whereas fancier functions do hard things like non-blocking I/O and file locking.

This chapter deals with the mechanics of file access: opening a file, telling subroutines which files to work with, locking files, and so on. Chapter 8, File Contents, deals with techniques for working with the contents of a file: reading, writing, shuffling lines, and other operations you can do once you have access to the file.
Here's Perl code for printing all lines in the file /usr/local/widgets/data that contain the word blue:

    open(INPUT, "< /usr/local/widgets/data")
        or die "Couldn't open /usr/local/widgets/data for reading: $!\n";

    while (<INPUT>) {
        print if /blue/;
    }
    close(INPUT);

Getting a Handle on the File
Central to Perl's file access is the filehandle, like INPUT in the preceding program. This is a symbol you use to represent the file when you read and write. Because filehandles aren't variables (they don't have a $, @, or % type marker on their names, for instance), you can't easily pass them to subroutines or store them in variables. To do that, use typeglob notation, which prefixes the filehandle's name with a *:

    $var = *STDIN;
    mysub($var, *LOGFILE);

When you store filehandles in variables like this, you don't use them directly. They're called indirect filehandles because they indirectly refer to the real filehandle. Two modules, IO::File (standard since 5.004) and FileHandle (standard since 5.000), can create anonymous filehandles.

When we use IO::File or IO::Handle in our examples, you could obtain identical results by using FileHandle instead, since it's now just a wrapper module.
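As a minimal sketch of the typeglob idiom, the following passes a filehandle to a subroutine both through a scalar and directly. The helper name log_line and the scratch file created with File::Temp are our own assumptions for illustration, not part of the original program.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: writes one tagged line to whatever filehandle it is given.
sub log_line {
    my ($fh, $msg) = @_;
    print $fh "[log] $msg\n";
}

# Scratch file so the example has somewhere to write.
my (undef, $path) = tempfile(UNLINK => 1);

open(LOGFILE, "> $path") or die "Can't write $path: $!";
my $var = *LOGFILE;              # store the typeglob in a scalar
log_line($var, "first");         # pass the scalar that holds the glob
log_line(*LOGFILE, "second");    # or pass the typeglob itself
close(LOGFILE) or die "Can't close $path: $!";

# Read the file back to see what the subroutine wrote.
open(CHECK, "< $path") or die "Can't read $path: $!";
my @lines = <CHECK>;
close(CHECK) or die "Can't close $path: $!";

print @lines;
```

Both calls write through the same underlying filehandle, so the two lines land in the same file in the order they were written.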
Here's how we'd write the INPUT program using IO::File in object-oriented notation:

    use IO::File;

    $input = IO::File->new("< /usr/local/widgets/data")
        or die "Couldn't open /usr/local/widgets/data for reading: $!\n";

    while (defined($line = $input->getline())) {
        chomp($line);
        STDOUT->print($line) if $line =~ /blue/;
    }
    $input->close();

As you see, it's much more readable to use filehandles directly. It's also a lot faster.
But here's a little secret for you: you can skip all that arrow and method-call business altogether. Unlike most objects, you don't actually have to use IO::File objects in an object-oriented way. They're essentially just anonymous filehandles, so you can use them anywhere you'd use a regular indirect filehandle.
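To make that concrete, here is a sketch that opens a file with IO::File but then reads and closes it with the ordinary line-input operator and close function. The sample data and the File::Temp scratch file are assumptions made so the example runs anywhere.

```perl
use strict;
use warnings;
use IO::File;
use File::Temp qw(tempfile);

# Build a small data file to search (contents invented for the demo).
my ($tmp, $path) = tempfile(UNLINK => 1);
print $tmp "red widget\nblue widget\ngreen widget\nblue gadget\n";
close($tmp) or die "Couldn't close $path: $!";

my $input = IO::File->new("< $path")
    or die "Couldn't open $path for reading: $!\n";

my @matches;
while (<$input>) {                 # line-input operator, no ->getline
    push @matches, $_ if /blue/;
}
close($input) or die "Couldn't close $path: $!";   # plain close, no ->close

print @matches;
```

The only object-oriented step left is the constructor call; everything after it treats $input as an indirect filehandle.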
Recipe 7.16 covers these modules.

Standard FileHandles

Every program starts out with three global filehandles already opened: STDIN, STDOUT, and STDERR. STDIN (standard input) is the default source of input, STDOUT (standard output) is the default destination for output, and STDERR (standard error) is the default place to send warnings and errors. For interactive programs, STDIN is the keyboard, and STDOUT and STDERR are the screen:

    while (<STDIN>) {                   # reads from STDIN
        unless (/\d/) {
            warn "No digit found.\n";   # writes to STDERR
        }
        print "Read: ", $_;             # writes to STDOUT
    }
    END { close(STDOUT) or die "couldn't close STDOUT: $!" }
Filehandles live in packages. That way, two packages can have filehandles with the same name and be separate, just as they can with subroutines and variables.
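A small sketch shows two packages each using a filehandle named OUT without interfering with one another. The package names Alpha and Beta, the subroutine names, and the scratch files are all hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# One scratch file per package.
my (undef, $path_a) = tempfile(UNLINK => 1);
my (undef, $path_b) = tempfile(UNLINK => 1);

package Alpha;    # OUT here means Alpha::OUT
sub start  { open(OUT, "> $path_a") or die "Can't write $path_a: $!" }
sub note   { print OUT "from Alpha\n" }
sub finish { close(OUT) or die "Can't close $path_a: $!" }

package Beta;     # OUT here means Beta::OUT, a different filehandle
sub start  { open(OUT, "> $path_b") or die "Can't write $path_b: $!" }
sub note   { print OUT "from Beta\n" }
sub finish { close(OUT) or die "Can't close $path_b: $!" }

package main;

# Hypothetical helper: return a file's entire contents as one string.
sub slurp {
    my ($file) = @_;
    open(IN, "< $file") or die "Can't read $file: $!";
    my $text = join '', <IN>;
    close(IN) or die "Can't close $file: $!";
    return $text;
}

Alpha::start();
Beta::start();     # opening Beta::OUT leaves Alpha::OUT open
Alpha::note();
Beta::note();
Alpha::finish();
Beta::finish();

my $alpha_text = slurp($path_a);
my $beta_text  = slurp($path_b);
print $alpha_text, $beta_text;
```

If both packages shared a single OUT, the second open would have implicitly closed the first file and both notes would have ended up in the second file.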
Files are accessed at the operating system level through numeric file descriptors. You can learn a filehandle's descriptor number using the fileno function.

I/O Operations
Perl's most common operations for file interaction are open, print, <FH> to read a record, and close.
The most important I/O function is open. To open /tmp/log for writing and associate it with the filehandle LOGFILE, say:

    open(LOGFILE, "> /tmp/log") or die "Can't write /tmp/log: $!";
The three most common access modes are < for reading, > for overwriting, and >> for appending.
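The difference between the three modes is easiest to see by running them against the same file. This sketch uses a File::Temp scratch file in place of /tmp/log, and the helper lines_in is a name of our own invention.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file standing in for /tmp/log (an assumption for the demo).
my (undef, $path) = tempfile(UNLINK => 1);

# Hypothetical helper: return every line currently in the file.
sub lines_in {
    my ($file) = @_;
    open(IN, "< $file") or die "Can't read $file: $!";        # < reads
    my @lines = <IN>;
    close(IN) or die "Can't close $file: $!";
    return @lines;
}

open(LOGFILE, "> $path") or die "Can't write $path: $!";      # > overwrites
print LOGFILE "first run\n";
close(LOGFILE) or die "Can't close $path: $!";

open(LOGFILE, ">> $path") or die "Can't append to $path: $!"; # >> appends
print LOGFILE "second run\n";
close(LOGFILE) or die "Can't close $path: $!";

my @appended = lines_in($path);       # both lines are present

open(LOGFILE, "> $path") or die "Can't write $path: $!";      # truncates again
print LOGFILE "fresh start\n";
close(LOGFILE) or die "Can't close $path: $!";

my @overwritten = lines_in($path);    # only the newest line survives

print scalar(@appended), " lines after appending, ",
      scalar(@overwritten), " after overwriting\n";
```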
When opening a file or making virtually any other system call,[1] checking the return value is indispensable. Not every open succeeds.
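Here is one way to check an open without dying, sketched under the assumption that you want to report the failure and carry on. The path is built inside a fresh temporary directory so that it is certain not to exist; its name is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# A path inside a brand-new empty directory cannot exist yet.
my $dir     = tempdir(CLEANUP => 1);
my $missing = "$dir/no-such-file";

my $error;
if (open(MISSING, "< $missing")) {
    close(MISSING) or die "Couldn't close $missing: $!";
} else {
    # $! holds the operating system's reason for the failure.
    $error = "Couldn't open $missing for reading: $!";
}

print "$error\n" if defined $error;
```

The exact wording that $! contributes depends on the operating system and locale, which is why the message is built from it rather than hardcoded.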
To read a record in Perl, use the circumfix operator <FH>.
Abstractly, files are simply streams of bytes. Each filehandle has associated with it a number representing the current byte position in the file, returned by the tell function.
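The byte position can be watched as it moves. The sketch below records the position before and after reading one record, then uses seek, the companion function for setting the position, to rewind. The sample data and scratch file are assumptions for the demo.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Three short records to read back (contents invented for the demo).
my ($tmp, $path) = tempfile(UNLINK => 1);
print $tmp "alpha\nbeta\ngamma\n";
close($tmp) or die "Couldn't close $path: $!";

open(FH, "< $path") or die "Couldn't open $path for reading: $!";

my $start       = tell(FH);   # position before any read
my $first       = <FH>;       # read one record
my $after_first = tell(FH);   # position has advanced past that record

seek(FH, $start, 0) or die "Couldn't seek in $path: $!";   # 0 = from start of file
my $again = <FH>;             # same record as before

close(FH) or die "Couldn't close $path: $!";

print "started at $start, moved to $after_first, reread: $again";
```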
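When you no longer have use for a filehandle, close it. You don't have to close every filehandle explicitly: Perl implicitly closes a filehandle when you reopen it, and closes any filehandles that are still open when your program exits.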
These implicit closes are for convenience, not stability, because they don't tell you whether the system call succeeded or failed. Not all closes succeed. Even a close on a read-only file can fail.

    close(FH) or die "FH didn't close: $!";
The prudent programmer even checks the close on STDOUT. Checking standard error, though, is probably of dubious value. After all, if STDERR fails to close, what are you planning to do about it?
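STDOUT is the default destination for output from the print, printf, and write functions. Change this with select, which takes the new default output filehandle and returns the previous one:

    $old_fh = select(LOGFILE);                  # switch to LOGFILE for output
    print "Countdown initiated ...\n";
    select($old_fh);                            # return to original output
    print "You have 30 seconds to reach minimum safety distance.\n";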
Some of Perl's special variables change the behavior of the currently selected output filehandle. Most important is $|, which controls output buffering for each filehandle.
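Because $| applies to whichever filehandle is currently selected, a common idiom is to select the handle, set the variable, and select the old handle again. The following sketch does that for a LOGFILE opened on a scratch file (the file is an assumption for the demo) and then checks the file's size before closing to show that the data has already been written out.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my (undef, $path) = tempfile(UNLINK => 1);
open(LOGFILE, "> $path") or die "Can't write $path: $!";

my $old_fh = select(LOGFILE);   # make LOGFILE the selected output handle
$| = 1;                         # turn on autoflush for LOGFILE only
select($old_fh);                # restore the previous default

print LOGFILE "Countdown initiated ...\n";

# With autoflush on, the line has reached the file before close is called.
my $size_before_close = -s $path;

close(LOGFILE) or die "Can't close $path: $!";

print "bytes on disk before close: $size_before_close\n";
```

Without the $| = 1 line, a message this short would normally sit in the output buffer until the close, and the size check would report an empty file.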
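Perl provides functions for buffered and unbuffered input and output. Although there are some exceptions, you shouldn't mix calls to buffered and unbuffered I/O functions. The following table shows the two sets of functions you should not mix. Functions on a particular row are only loosely associated; for instance, sysread doesn't have the same semantics as <FH>, but they are on the same row because they both read from a filehandle.

    Action          Buffered            Unbuffered
    opening         open, sysopen       sysopen
    closing         close               close
    input           <FH>, readline      sysread
    output          print               syswrite
    repositioning   seek, tell          sysseek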
Repositioning is addressed in Chapter 8, but we also use it in Recipe 7.10.
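For comparison with the buffered examples earlier in this introduction, here is a sketch that sticks to the unbuffered functions throughout: sysopen, syswrite, and sysread. The scratch file and the message text are assumptions for the demo, and the O_ constants come from the standard Fcntl module.

```perl
use strict;
use warnings;
use Fcntl;                        # supplies O_RDONLY, O_WRONLY, O_TRUNC
use File::Temp qw(tempfile);

my (undef, $path) = tempfile(UNLINK => 1);

# Write with syswrite only; no print on this handle.
sysopen(OUT, $path, O_WRONLY | O_TRUNC) or die "Can't write $path: $!";
my $msg     = "unbuffered hello\n";
my $written = syswrite(OUT, $msg, length $msg);
defined $written or die "syswrite to $path failed: $!";
close(OUT) or die "Can't close $path: $!";

# Read with sysread only; no <IN> on this handle.
sysopen(IN, $path, O_RDONLY) or die "Can't read $path: $!";
my $buf  = '';
my $read = sysread(IN, $buf, 1024);   # ask for up to 1024 bytes
defined $read or die "sysread from $path failed: $!";
close(IN) or die "Can't close $path: $!";

print "wrote $written bytes, read back: $buf";
```

Both syswrite and sysread return the number of bytes actually transferred, or undef on error, so their return values deserve the same checking as open and close.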