Process Management (Learning Perl, 3rd Edition)

One of the best parts of being a programmer is launching someone else's code so that you don't have to write it yourself. It's time to learn how to manage your children[304] by launching other programs directly from Perl.

And like everything else in Perl, There's More Than One Way To Do It, with lots of overlap, variations, and special features. So if you don't like the first way, just read on for another page or two for a solution more to your liking.

Perl is very portable; most of the rest of this book doesn't need many notes saying that it works this way on Unix systems and that way on Windows and the other way on VMS. But when you're starting other programs on your machine, different programs are available on a Macintosh than you'll likely find on a Cray. The examples in this chapter are primarily Unix-based; if you have a non-Unix system, you can expect to see some differences.

14.1. The system Function

The simplest way to launch a child process in Perl to run a program is the system function. For example, to invoke the Unix date command from within Perl, it looks like:

system "date";

The child process runs the date command, which inherits Perl's standard input, standard output, and standard error. This mean that the normal short date-and-time string generated by date ends up wherever Perl's STDOUT was already going.

The parameter to the system function is generally whatever you'd normally type at the shell. So if it were a more complicated command, like "ls -l $HOME ", we'd just have put all that into the parameter:

system 'ls -l $HOME';

Note that we had to switch here from double quotes to single quotes, since $HOME is the shell's variable. Otherwise, the shell would never have seen the dollar sign, since that's also an indicator for Perl to interpolate. Alternatively, we could write:

system "ls -l \$HOME";

But that can get quickly unwieldly.

Now, the date command is output-only, but let's say it had been a chatty command, asking first "for which time zone do you want the time?"[305] That'll end up on standard output, and then the program will listen on standard input (inherited from Perl's STDIN) for the response. You'll see the question, and type in the answer (like "Zimbabwe time"), and then date will finish its duty.

[305]As far as we know, no one has made a date command that works like this.

While the child process is running, Perl is patiently waiting for it to finish. So if the date command took 37 seconds, then Perl is paused for those 37 seconds. You can use the shell's facility to launch a background process,[306] however:

[306]See what we mean about this depending upon your system? The Unix shell (/bin/sh) lets you use the ampersand on this kind of command to make a background process. If your non-Unix system doesn't support this way to launch a background process, then you can't do it this way, that's all.

system "long_running_command with parameters &";

Here, the shell gets launched, which then notices the ampersand at the end of the command line, causing the long_running_command to be made into a background process. And then the shell exits rather quickly, which Perl notices and moves on. In this case, the long_running_command is really a grandchild of the Perl process, to which Perl really has no direct access or knowledge.

When the command is "simple enough," no shell gets involved, so for the date and ls commands earlier, the requested command is launched directly by Perl, which searches the inherited PATH[307] to find the command, if necessary. But if there's anything weird in the string (such as shell metacharacters like the dollar sign, semicolon, or vertical bar), then the standard Bourne Shell (/bin/sh[308]) gets invoked to work through the complicated stuff. In that case, the shell is the child process, and the requested commands are grandchildren (or further offspring). For example, you can write an entire little shell script in the argument:

[307]The PATH can be changed by adjusting $ENV{'PATH'} at any time. Initially, this is the environment variable inherited from the parent process (usually the shell). Changing this value affects new child processes, but cannot affect any preceding parent processes. The PATH is the list of directories where executable programs (commands) are found, even on some non-Unix systems.

[308]Or whatever was determined when Perl was built. Practically always, this is just /bin/sh on Unix-like systems.

system 'for in *; do echo == $i ==; cat $i; done';

Here again, we're using single quotes, because the dollar signs here are meant for the shell and not for Perl. Double quotes would have permitted Perl to expand $i to its current Perl value, and not let the shell expand it to its own value.[309] By the way, that little shell script goes through all of the normal files in the current directory, printing out each one's name and contents; you can try it out yourself if you don't believe us.

[309]Of course, if you set $i = '$i', then it would work anyway, until a maintenance programmer came along and "fixed" that line out of existence.

14.1.1. Avoiding the Shell

The system operator may also be invoked with more than one argument,[310] in which case a shell doesn't get involved, no matter how complicated the text:

[310]Or with a parameter in the indirect-object slot, like system { 'fred' } 'barney';, which runs the program barney, but lies to it so it thinks that it's called 'fred'. See the perlfunc manpage.

my $tarfile = "something*wicked.tar";
my @dirs = qw(fred|flintstone <barney&rubble> betty );
system "tar", "cvf", $tarfile, @dirs;

In this case, the first parameter ("tar" here) gives the name of a command found in the normal PATH-searching way, while the remaining arguments are passed, one by one, directly to that command. Even if the arguments have shell-significant characters, such as the name in $tarfile or the directory names in @dirs, the shell never gets a chance to mangle the string. So that tar command will get precisely five parameters. Compare this with:

system "tar cvf $tarfile @dirs";  # Oops!

Here, we've now piped a bunch of stuff into a flintstone command and put it into the background, and opened betty for output.

And that's a bit scary,[311] especially if those variables are from user input -- such as from a web form or something. So, if you can arrange things so that you can use the multiple-argument version of system, you probably should use that way to launch your subprocess. (You'll have to give up the ability to have the shell do the work for you to set up I/O redirection, background processes, and the like, though. There's no such thing as a free launch.)

[311]Unless you're using taint checking and have done all the right things to prescan your data to ensure that the user isn't trying to pull a fast one on you.

Note that redundantly, a single argument invocation of system is nearly equivalent to the proper multiple-argument version of system:

system $command_line;
system "/bin/sh", "-c", $command_line;

But nobody writes the latter, unless you want things to be processed by a different shell, like the C-shell:

system "/bin/csh", "-fc", $command_line;

Even this is pretty rare, since the One True Shell[312] seems to have a lot more flexibility, especially for scripted items.

[312]That's /bin/sh, or whatever your Unix system has installed as the most Bourne-like shell. If you don't have a One True Shell, Perl figures out how to invoke some other command-line interpreter, with notable consequences -- noted, that is, in the documentation for that Perl port.

The return value of the system operator is based upon the exit status of the child command[313]. In Unix, an exit value of 0 means that everything is OK, and a non-zero exit value usually indicates that something went wrong:

[313]It's actually the "wait" status, which is the child exit code times 256, plus 128 if core was dumped, plus the signal number triggering termination, if any. But we rarely check the specifics of that, and a simple true/false value suffices for nearly all applications.

unless (system "date") {
  # Return was zero - meaning success
  print "We gave you a date, OK!\n";
}

Note that this is backward from the normal "true is good -- false is bad" strategy for most of the operators, so to write a typical "do this or die" style, we'll need to flip false and true. The easiest way is to simply prefix the system operator with a bang (the logical-not operator):

!system "rm -rf files_to_delete" or die "something went wrong";

In this case, including $! in the error message would not be appropriate, because the failure is most likely somewhere within the experience of the rm command, and it's not a system-call related error within Perl that $! can reveal.

Chapter 14. Process Management

Contents:

14.1. The system Function

14.1.1. Avoiding the Shell