16.4. Reading or Writing to Another Program

Problem

You want to run another program and either read its output or supply the program with input.

Solution

Use open with a pipe symbol at the beginning or end. To read from a program, put the pipe symbol at the end:

$pid = open(README, "program arguments |")  or die "Couldn't fork: $!\n";
while (<README>) {
    # ...
}
close(README)                               or die "Couldn't close: $!\n";

To write to the program, put the pipe at the beginning:

$pid = open(WRITEME, "| program arguments") or die "Couldn't fork: $!\n";
print WRITEME "data\n";
close(WRITEME)                              or die "Couldn't close: $!\n";

Discussion

In the case of reading, this is similar to using backticks, except you have a process ID and a filehandle. As with the backticks, open uses the shell if it sees shell-special characters in its argument, but it doesn't if there aren't any. This is usually a welcome convenience, because it lets the shell do filename wildcard expansion and I/O redirection, saving you the trouble.

However, sometimes this isn't desirable. Piped opens that include unchecked user data are unsafe when running in taint mode or in any situation where the input can't be trusted. Recipe 19.6 shows how to get the effect of a piped open without risking using the shell.
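One way to sidestep the shell is the list form of a piped open, sketched here on the assumption that you have Perl 5.8 or later on a Unix-like system. The command and each argument are passed as separate list elements, so no shell ever parses them. The $user_input value is a made-up stand-in for untrusted data, and echo is used only because it makes the result easy to see:

```perl
use strict;
use warnings;

# Hypothetical untrusted input; a shell would treat ';' and '*' specially.
my $user_input = 'hello; echo *';

# List form: "-|" means "read from the command's output". The command
# name and its argument are separate list elements handed straight to exec.
my $pid = open(my $fh, "-|", "echo", $user_input)
    or die "Couldn't fork: $!\n";
my $output = <$fh>;
close($fh) or die "Couldn't close: $! (wait status $?)\n";

print $output;    # the argument comes back verbatim, metacharacters and all
```

This is not necessarily the approach Recipe 19.6 takes; it is just one way to get the same effect.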

Notice how we specifically call close on the filehandle. When you use open to connect a filehandle to a child process, Perl remembers this and automatically waits for the child when you close the filehandle. If the child hasn't exited by then, Perl waits until it does. This can be a very, very long wait if your child doesn't exit:

$pid = open(F, "sleep 100000|");    # child goes to sleep
close(F);                           # and the parent goes to lala land

To avoid this, you can save the PID returned by open to kill your child, or use a manual pipe-fork-exec sequence as described in Recipe 16.10.
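A minimal sketch of the first option follows. It assumes a Unix-like system and a command with no shell metacharacters, so that the PID returned by open belongs to sleep itself rather than to an intermediate shell:

```perl
use strict;
use warnings;

my $pid = open(my $fh, "sleep 100000 |") or die "Couldn't fork: $!\n";

# ... later, we decide we no longer need the child ...
kill('TERM', $pid);        # ask the child to terminate
close($fh);                # returns promptly now, but false: the child died from a signal

my $signal = $? & 127;     # low 7 bits of the wait status hold the signal number
print "child $pid was terminated by signal $signal\n";
```

Because the child was killed rather than exiting normally, close returns false here, so this is one place where you would not want to die on a failed close.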

If you attempt to write to a process that has gone away, your process will receive a SIGPIPE. The default disposition for this signal is to kill your process, so the truly paranoid install a SIGPIPE handler just in case.
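As a rough illustration, the sketch below installs a handler that merely records the signal, then writes to a child that has already gone away. The true command (assumed to be in your PATH) stands in for any program that exits without reading its input, and the variable names are made up for the example:

```perl
use strict;
use warnings;
use IO::Handle;

my $got_sigpipe = 0;
$SIG{PIPE} = sub { $got_sigpipe = 1 };    # record the signal instead of dying

# "true" exits immediately without ever reading its STDIN.
open(my $fh, "|-", "true") or die "Couldn't fork: $!\n";
$fh->autoflush(1);                        # make each print a real write

# Keep writing until a write fails; by then the child has exited.
for my $attempt (1 .. 50) {
    last unless print $fh "data\n";
    select(undef, undef, undef, 0.1);     # pause 0.1 second between attempts
}
close($fh);

print "caught SIGPIPE; the child is gone\n" if $got_sigpipe;
```

With the handler in place, the failed print simply returns false with $! set to EPIPE, and the program carries on.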

If you want to run another program and be able to supply its STDIN yourself, a similar construct is used:

$pid = open(WRITEME, "| program args");
print WRITEME "hello\n";            # program will get hello\n on STDIN
close(WRITEME);                     # program will get EOF on STDIN

The leading pipe symbol in the filename argument to open tells Perl to start another process instead of opening a file. It connects the opened filehandle to the process's STDIN. Anything you write to the filehandle can be read by the program through its STDIN. When you close the filehandle, the opened process will get an end-of-file when it next tries to read from STDIN.

You can also use this technique to change your program's normal output path. For example, to automatically run everything through a pager, use something like:

$pager = $ENV{PAGER} || '/usr/bin/less';  # XXX: might not exist
open(STDOUT, "| $pager")                  or die "Can't fork a pager: $!\n";

Now, without changing the rest of your program, anything you print to standard output will go through the pager automatically.

As with opening a process for reading, text passed to the shell here is subject to shell metacharacter interpretation. To avoid the shell, use the same approach as for reading. As before, the parent should also be wary of close. If the parent closes the filehandle connecting it to the child, the parent will block while waiting for the child to finish. If the child doesn't finish, neither will the close. The workaround, as before, is either to kill the child process prematurely or to use the low-level pipe-fork-exec sequence.

When using piped opens, always check the return values of both open and close, not just open. That's because the return value from open does not indicate whether the command was successfully launched. With a piped open, you fork a child to execute the command. Assuming the system hasn't run out of processes, the fork immediately returns the PID of the child it just created.

By the time the child process tries to exec the command, it's a separately scheduled process. So if the command can't be found, there's effectively no way to communicate this back to the open function, because that function is in a different process!

Check the return value from close to see whether the command was successful. If the child process exits with non-zero status (which it will do if the command isn't found), close returns false and $? is set to the wait status of that process. You can interpret its contents as described in Recipe 16.19.
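The following sketch shows the check in action. The command name is hypothetical and assumed not to exist. The redirection is there both to hide the shell's complaint and to make sure the command goes through the shell. When no shell is involved, some versions of Perl may instead report the failed exec from open itself.

```perl
use strict;
use warnings;

# Hypothetical command name, assumed not to exist on this system.
my $cmd = "no-such-program-xyzzy 2>/dev/null";

# open succeeds: the fork worked and the shell was launched.
my $pid = open(my $fh, "$cmd |") or die "Couldn't fork: $!\n";
my @lines = <$fh>;                 # reads nothing; the command never ran

my $closed = close($fh);
my $exit   = $? >> 8;              # high byte of the wait status is the exit code
unless ($closed) {
    if ($!) { print "close failed: $!\n" }
    else    { print "command exited with status $exit\n" }
}
```

When close fails only because the child exited with non-zero status, $! is 0 and the details are in $?, which is why the sketch tests $! before deciding what to report.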

In the case of a pipe opened for writing, you should also install a SIGPIPE handler, since writing to a child that isn't there will trigger a SIGPIPE.

