Recipe 7.12. Flushing Output (Perl Cookbook)

7.12. Flushing Output

Problem

When printing to a filehandle, output doesn't appear immediately. This is a problem in CGI scripts running on some programmer-hostile web servers where, if the web server sees warnings from Perl before it sees the (buffered) output of your script, it sends the browser an uninformative 500 Server Error. These buffering problems also arise with concurrent access to files by multiple programs and when talking with devices or sockets.

Solution

Disable buffering by setting the per-filehandle variable $| to a true value, customarily 1:

$old_fh = select(OUTPUT_HANDLE);
$| = 1;
select($old_fh);

Or, if you don't mind the expense, disable it by calling the autoflush method from the IO modules:

use IO::Handle;
OUTPUT_HANDLE->autoflush(1);

Discussion

In most stdio implementations, buffering varies with the type of output device. Disk files are block buffered, often with a buffer size of more than 2K. Pipes and sockets are often buffered with a buffer size between ½K and 2K. Serial devices, including terminals, modems, mice, and joysticks, are normally line-buffered; stdio sends the entire line out only when it gets the newline.

Perl's print function does not support truly unbuffered output, meaning a physical write for each individual character. Instead, it supports command buffering, in which one physical write is made after every separate output command. This isn't as hard on your system as no buffering at all, and it still gets the output where you want it, when you want it.
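One way to watch command buffering at work is to check a file's size on disk between prints. The following is a minimal sketch, not part of the original recipe; the scratch file from File::Temp and the variable names are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# scratch file for the demonstration (hypothetical; any writable path will do)
my ($fh, $path) = tempfile(UNLINK => 1);

print $fh "buffered";                   # sits in the handle's buffer for now
my $size_buffered = (stat $path)[7];    # nothing has reached the file yet

my $old_fh = select($fh);               # turn on command buffering for $fh;
$| = 1;                                 # this also flushes what is pending
select($old_fh);

print $fh " flushed";                   # each print now ends with a write
my $size_flushed = (stat $path)[7];     # both strings are in the file

print "before: $size_buffered bytes, after: $size_flushed bytes\n";
```

The sizes are read with stat rather than -s because -s returns a false value, not the number 0, for an empty file.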

Control output buffering through the $| special variable. Enable command buffering by setting it to a true value. It has no effect upon input; see Recipe 15.6 and Recipe 15.8 for unbuffered input. Set this variable to a false value to use default stdio buffering. Example 7.6 illustrates the difference.

Example 7.6: seeme

#!/usr/bin/perl -w
# seeme - demo stdio output buffering
$| = (@ARGV > 0);      # command buffered if arguments given
print "Now you don't see it...";
sleep 2;
print "now you do\n";

If you call this program with no arguments, STDOUT is not command buffered. Your terminal (console, window, telnet session, whatever) doesn't receive output until the entire line is completed, so you see nothing for two seconds and then get the full line "Now you don't see it...now you do". If you call the program with at least one argument, STDOUT is command buffered. That means you first see "Now you don't see it...", and then after two seconds you finally see "now you do".
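Because $| always refers to the currently selected output handle, setting it for one handle leaves the others alone. Here is a small sketch of that behavior; the temporary file standing in for a second handle and the variable names are assumptions of this example, not something the recipe requires.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($log) = tempfile(UNLINK => 1);   # hypothetical second output handle

my $stdout_before = $|;              # STDOUT is the selected handle by default

my $old_fh = select($log);
$| = 1;                              # affects $log only
my $log_flag = $|;
select($old_fh);

my $stdout_after = $|;               # STDOUT's own setting is unchanged

print "STDOUT: $stdout_before -> $stdout_after, log handle: $log_flag\n";
```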

The dubious quest for increasingly compact code has led programmers to use the return value of select, the filehandle that was previously selected, as part of the second select:

    select((select(OUTPUT_HANDLE), $| = 1)[0]);
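If the one-liner feels too clever, the same save, set, and restore sequence can be wrapped in a small subroutine. The name set_autoflush below is hypothetical; this is a sketch of one possible wrapper, not something Perl or the recipe provides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# set_autoflush(HANDLE, BOOL): hypothetical helper, not a Perl built-in.
# Sets $| for HANDLE and returns the handle's previous setting.
sub set_autoflush {
    my ($fh, $on) = @_;
    my $old_fh   = select($fh);     # make HANDLE current, remember the old one
    my $previous = $|;
    $| = $on ? 1 : 0;
    select($old_fh);                # put the original handle back
    return $previous;
}

my $was = set_autoflush(\*STDOUT, 1);
my $now = set_autoflush(\*STDOUT, 1);   # second call reports the value just set
print "STDOUT command buffering was $was, is now $now\n";
```

Returning the previous setting lets a caller restore it later, much as select returns the previously selected handle.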

There's another way. The FileHandle and IO modules provide a method called autoflush. Call it with true or false values (the default value is true) to control autoflushing on a particular output handle:

use FileHandle;

STDERR->autoflush;          # already unbuffered in stdio
$filehandle->autoflush(0);

If you're willing to accept the oddities of indirect object notation covered in Chapter 13, Classes, Objects, and Ties, you can even write something reasonably close to English:

use IO::Handle;
# assume REMOTE_CONN is an interactive socket handle,
# but DISK_FILE is a handle to a regular file.
autoflush REMOTE_CONN  1;           # unbuffer for clarity
autoflush DISK_FILE    0;           # buffer this for speed

This avoids the bizarre select business, and makes your code much more readable. Unfortunately, your program takes longer to compile because you're now including the IO::Handle module, so thousands and thousands of lines must first be read and compiled. Learn to manipulate $| directly, and you'll be happy.

To ensure that your output gets where you want it, when you want it, buffer flushing is important. It's particularly important with sockets, pipes, and devices, because you may be trying to do interactive I/O with these; more so, in fact, because you can't assume line-buffering. Consider the program in Example 7.7.

Example 7.7: getpcomidx

#!/usr/bin/perl
# getpcomidx - fetch www.perl.com's index.html document
use IO::Socket;
$sock = new IO::Socket::INET (PeerAddr => 'www.perl.com',
                              PeerPort => 'http(80)');
die "Couldn't create socket: $@" unless $sock;
# the library doesn't support $! setting; it uses $@

$sock->autoflush(1);

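# Mac *must* have \015\012\015\012 instead of \n\n here.
# It's a good idea for others, too, as that's the spec,
# but implementations are encouraged to accept "\cJ\cJ" too,
# and as far as we've seen, they do.
# The protocol name must be uppercase. HTTP/1.0 needs no Host header,
# and the server closes the connection when done, so getlines() returns.
$sock->print("GET /index.html HTTP/1.0\015\012\015\012");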
$document = join('', $sock->getlines());
print "DOC IS: $document\n";
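The effect of flushing on interactive I/O can also be seen without a network connection by using a pipe within a single process. This is a minimal sketch under that assumption; the names $reader and $writer are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Handle;

pipe(my $reader, my $writer) or die "pipe failed: $!";
$writer->autoflush(1);              # without this, the read below would hang

print $writer "ping\n";             # goes straight into the pipe
my $line = <$reader>;               # so a full line is already waiting
print "got: $line";
```

If the autoflush call is removed, the string stays in the writer's buffer, because a pipe is not line-buffered, and the read waits for data that never arrives.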

There's no way to control input buffering using any kind of flushing discussed so far. For that, you need to see Recipe 15.6 and Recipe 15.8.