Filehandles (Programming Perl)

1.4. Filehandles

Unless you're using artificial intelligence to model a solipsistic philosopher, your program needs some way to communicate with the outside world. In lines 3 and 4 of our Average Example you'll see the word GRADES, which exemplifies another of Perl's data types, the filehandle. A filehandle is just a name you give to a file, device, socket, or pipe to help you remember which one you're talking about, and to hide some of the complexities of buffering and such. (Internally, filehandles are similar to streams from a language like C++ or I/O channels from BASIC.)

Filehandles make it easier for you to get input from and send output to many different places. Part of what makes Perl a good glue language is that it can talk to many files and processes at once. Having nice symbolic names for various external objects is just part of being a good glue language.[13]

[13] Some of the other things that make Perl a good glue language are: it's 8-bit clean, it's embeddable, and you can embed other things in it via extension modules. It's concise, and it "networks" easily. It's environmentally conscious, so to speak. You can invoke it in many different ways (as we saw earlier). But most of all, the language itself is not so rigidly structured that you can't get it to "flow" around your problem. It comes back to that TMTOWTDI thing again.

You create a filehandle and attach it to a file by using open. The open function takes at least two parameters: the filehandle and the filename you want to associate with it. Perl also gives you some predefined (and preopened) filehandles. STDIN is your program's normal input channel, while STDOUT is your program's normal output channel. And STDERR is an additional output channel that allows your program to make snide remarks off to the side while it transforms (or attempts to transform) your input into your output.[14]

[14] These filehandles are typically attached to your terminal, so you can type to your program and see its output, but they may also be attached to files (and such). Perl can give you these predefined handles because your operating system already provides them, one way or another. Under Unix, processes inherit standard input, output, and error from their parent process, typically a shell. One of the duties of a shell is to set up these I/O streams so that the child process doesn't need to worry about them.
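To see the two output channels side by side, here is a minimal sketch. The messages are invented for illustration; only the STDOUT and STDERR handles come from Perl itself.

```perl
use strict;
use warnings;

# Results belong on STDOUT, the normal output channel.
print STDOUT "average: 87.5\n";

# Side remarks belong on STDERR, so they don't get mixed into the results.
print STDERR "warning: one grade was missing\n";
```

If you save this under a hypothetical name such as remarks.pl and run it from a Unix shell as perl remarks.pl > results.txt, the shell redirects only standard output: the average lands in results.txt, while the warning still shows up on your terminal.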

Since you can use the open function to create filehandles for various purposes (input, output, piping), you need to be able to specify which behavior you want. As you might do on the command line, you simply add characters to the filename.

open(SESAME, "filename");               # read from existing file
open(SESAME, "<filename");              #   (same thing, explicitly)
open(SESAME, ">filename");              # create (or truncate) file and write to it
open(SESAME, ">>filename");             # append to existing file
open(SESAME, "| output-pipe-command");  # set up an output filter
open(SESAME, "input-pipe-command |");   # set up an input filter

As you can see, the name you pick for the filehandle is arbitrary. Once opened, the filehandle SESAME can be used to access the file or pipe until it is explicitly closed (with, you guessed it, close(SESAME)), or until the filehandle is attached to another file by a subsequent open on the same filehandle.[15]

[15] Opening an already opened filehandle implicitly closes the first file, making it inaccessible to the filehandle, and opens a different file. You must be careful that this is what you really want to do. Sometimes it happens accidentally, like when you say open($handle,$file), and $handle happens to contain a constant string. Be sure to set $handle to something unique, or you'll just open a new file on the same filehandle. Or you can leave $handle undefined, and Perl will fill it in for you.
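The following sketch strings several of these modes together and checks each open for failure, which the one-liners above leave out. The filename sesame_demo.txt and its contents are assumptions made up for the example. The last part relies on the behavior described in the footnote: an undefined $handle gets filled in by open.

```perl
use strict;
use warnings;

my $file = "sesame_demo.txt";    # hypothetical scratch file for this sketch

# ">" creates the file, or empties it if it already exists.
open(SESAME, ">$file")  or die "Can't create $file: $!";
print SESAME "first line\n";
close(SESAME)           or die "Can't close $file: $!";

# ">>" adds to the end instead of starting over.
open(SESAME, ">>$file") or die "Can't append to $file: $!";
print SESAME "second line\n";
close(SESAME)           or die "Can't close $file: $!";

# Leaving the handle variable undefined lets Perl fill it in,
# so there is no risk of reopening a handle that is already in use.
my $handle;
open($handle, "<$file") or die "Can't read $file: $!";
my @lines = <$handle>;           # list context: every remaining line
close($handle);

print "Read ", scalar(@lines), " lines from $file\n";
unlink($file);                   # tidy up the scratch file
```

The or die after each open is one common way to notice a failure; $! holds the operating system's reason. Without some check of that kind, a failed open goes unnoticed until a later read or write quietly does nothing.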

Once you've opened a filehandle for input, you can read a line using the line-reading operator, <>. This is also known as the angle operator because it's made of angle brackets. The angle operator encloses the filehandle (<SESAME>) you want to read lines from. The empty angle operator, <>, will read lines from all the files specified on the command line, or STDIN, if none were specified. (This is standard behavior for many filter programs.) An example using the STDIN filehandle to read an answer supplied by the user would look something like this:

print STDOUT "Enter a number: ";          # ask for a number
$number = <STDIN>;                        # input the number
print STDOUT "The number is $number.\n";  # print the number

Did you see what we just slipped by you? What's that STDOUT doing there in those print statements? Well, that's just one of the ways you can use an output filehandle. A filehandle may be supplied as the first argument to the print statement, and if present, tells the output where to go. In this case, the filehandle is redundant, because the output would have gone to STDOUT anyway. Much as STDIN is the default for input, STDOUT is the default for output. (In line 18 of our Average Example, we left it out to avoid confusing you up till now.)
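The same two ideas, a filehandle as the first argument to print and the angle operator for reading, work on handles you open yourself. This is only a sketch: the handle names OUT and IN, the file grades_demo.txt, and the data in it are all invented for the illustration.

```perl
use strict;
use warnings;

my $file = "grades_demo.txt";    # hypothetical scratch file

# With a filehandle as the first argument, print sends its output there.
# Note that there is no comma between the filehandle and the list to print.
open(OUT, ">$file") or die "Can't write $file: $!";
print OUT "alice 90\n";
print OUT "bob 85\n";
close(OUT)          or die "Can't close $file: $!";

# The angle operator hands back one line per call, newline included.
open(IN, "<$file")  or die "Can't read $file: $!";
my $count = 0;
while (my $line = <IN>) {
    $count++;
    print STDOUT "$count: $line";    # same as: print "$count: $line";
}
close(IN);
unlink($file);                       # remove the scratch file
```

The loop ends when the angle operator runs out of lines and returns an undefined value.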

If you try the previous example, you may notice that you get an extra blank line. This happens because the line-reading operation does not automatically remove the newline from your input line (your input would be, for example, "9\n"). For those times when you do want to remove the newline, Perl provides the chop and chomp functions. chop will indiscriminately remove (and return) the last character of the string, while chomp will only remove the end-of-record marker (generally, "\n") and return the number of characters so removed. You'll often see this idiom for inputting a single line:

chop($number = <STDIN>);    # input number and remove newline

which means the same thing as:

$number = <STDIN>;          # input number
chop($number);              # remove newline
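The difference between chop and chomp matters when the input does not end in a newline, as can happen with the last line of a file. A small sketch with made-up strings standing in for input lines, assuming the end-of-record marker has its default value of "\n":

```perl
use strict;
use warnings;

my $with_newline    = "9\n";    # what <STDIN> typically returns
my $without_newline = "9";      # e.g., a final line with no newline at the end

# chop removes the last character, whatever it is, and returns it.
my $chopped_ok = $with_newline;
chop($chopped_ok);                           # now "9"
my $chopped_bad = $without_newline;
my $lost = chop($chopped_bad);               # now "", and $lost is "9"

# chomp removes only a trailing newline and returns how many characters it removed.
my $chomped_ok = $with_newline;
my $removed = chomp($chomped_ok);            # now "9", and $removed is 1
my $chomped_same = $without_newline;
my $removed_none = chomp($chomped_same);     # still "9", and $removed_none is 0

print "chop:  '$chopped_ok' and '$chopped_bad'\n";
print "chomp: '$chomped_ok' and '$chomped_same'\n";
```

For that reason chomp is usually the safer choice when all you want is to drop the newline, and chomp($number = <STDIN>) works the same way as the chop idiom above.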