Don't Need a Shell for Your Script? Don't Use One (Unix Power Tools, 3rd Edition)

36.3. Don't Need a Shell for Your Script? Don't Use One

If your Unix understands files that start with:

#!/interpreter/program

(and nearly all of them do by now) you don't have to use those lines to start a shell, such as #!/bin/sh. If your script is just starting a program like awk, Unix can start the program directly and save execution time. This is especially useful on small or overloaded computers, or when your script has to be called over and over (such as in a loop).

First, here are two scripts. Both scripts print the second word from each line of text files. One uses a shell; the other runs awk directly:

% cat with_sh
#!/bin/sh
awk '
{ print $2 }
' "$@"
% cat no_sh
#!/usr/bin/awk -f
{ print $2 }
% cat afile
one two three four five

Let's run both commands and time (Section 26.2) them. (This is running on a very slow machine. On faster systems, this difference may be harder to measure -- though the difference can still add up over time.)

% time with_sh afile
two
0.1u 0.2s 0:00 26%
% time no_sh afile
two
0.0u 0.1s 0:00 13%

The important point here is that when the kernel runs the program named on the interpreter line, it passes that program the script's filename as an argument. If the interpreter program reads a script file directly, as /bin/sh does, nothing special needs to be done. But a program like awk or sed requires the -f option to read its script from a file. This explains the seemingly odd syntax in the example above, a call to awk -f with no following filename: the kernel appends the script's own filename, so the script itself is the file that -f reads!
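To make the argument passing concrete, here is a minimal sketch that builds no_sh in a scratch directory and compares running it directly with typing the equivalent awk -f command by hand. The scratch directory and the variable names are assumptions for illustration, and the awk path is looked up with command -v because it varies between systems:

```shell
#!/bin/sh
# Sketch: running a "#!... -f" script is the same as calling awk -f on it.
tmpdir=$(mktemp -d)               # scratch directory (illustrative)
awk_path=$(command -v awk)        # assumption: awk is on PATH

# Write no_sh with this system's awk path on the interpreter line.
printf '#!%s -f\n{ print $2 }\n' "$awk_path" > "$tmpdir/no_sh"
chmod +x "$tmpdir/no_sh"
echo 'one two three four five' > "$tmpdir/afile"

# Run the script directly; the kernel starts awk with -f and the script's name.
direct=$("$tmpdir/no_sh" "$tmpdir/afile")

# Type out the same command by hand.
by_hand=$("$awk_path" -f "$tmpdir/no_sh" "$tmpdir/afile")

echo "direct:  $direct"
echo "by hand: $by_hand"
rm -rf "$tmpdir"
```

Both lines should show the same second word, since the two invocations hand awk the same arguments.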

One implication of this usage is that the interpreter program needs to understand # as a comment character; otherwise, the interpreter will try to act on the interpreter-selection line itself (and will probably reject it). (Fortunately, the shells, Perl, sed, and awk, among other programs, do recognize this comment character.)
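The same idea works for sed. The sketch below writes a small sed script whose first line is the interpreter line; sed treats that line as a comment and goes on to run the substitution. The script name upper_one and the scratch directory are hypothetical, and the sed path is looked up rather than assumed:

```shell
#!/bin/sh
# Sketch: a sed script run directly; sed skips the "#!" line as a comment.
tmpdir=$(mktemp -d)               # scratch directory (illustrative)
sed_path=$(command -v sed)        # assumption: sed is on PATH

# Line 1: interpreter line. Line 2: an ordinary comment. Line 3: the command.
printf '#!%s -f\n# change a leading one to ONE\ns/^one/ONE/\n' "$sed_path" > "$tmpdir/upper_one"
chmod +x "$tmpdir/upper_one"

result=$(echo 'one two three' | "$tmpdir/upper_one")
echo "$result"
rm -rf "$tmpdir"
```

If sed did not accept # as a comment, it would fail on the first line instead of printing the edited text.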

[One last comment: if you have GNU time or some other version that has a verbose mode, you can see that the major difference between the two invocations is in the page faults each requires. On a relatively speedy Pentium III/450 running Red Hat Linux, the version using a shell as the interpreter required more than twice the major page faults and more than three times as many minor page faults as the version calling awk directly. On a system, no matter how fast, that is using a large amount of virtual memory, these differences can be crucial. So opt for performance, and skip the shell when it's not needed. -- SJC]
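As a rough sketch of that comparison, the following wrapper prints only the page-fault lines from the verbose report. It assumes GNU time is installed as /usr/bin/time (the time built into most shells does not accept -v), and it falls back to a plain message when it is not. With no arguments it measures the true command, purely as a placeholder:

```shell
#!/bin/sh
# Sketch: show page-fault counts for a command using time's verbose mode.
# Assumption: GNU time lives at /usr/bin/time; adjust the path if needed.
[ $# -gt 0 ] || set -- true       # placeholder command when none is given

if /usr/bin/time -v true >/dev/null 2>&1; then
    # time writes its report to stderr; discard the command's own stdout.
    faults=$(/usr/bin/time -v "$@" 2>&1 >/dev/null | grep 'page faults')
else
    faults="verbose time not available on this system"
fi
echo "$faults"
```

Saved under a hypothetical name such as faults, it could be run once as faults with_sh afile and once as faults no_sh afile to compare the two reports; the counts will differ from system to system.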

--JP and SJC