

Unix Power Tools

35.19. Shell Script "Wrappers" for awk, sed, etc.

Although most scripts for most languages can execute directly (Section 36.3) without needing the Bourne shell, it's common to "wrap" other scripts in a shell script to take advantage of the shell's strengths. For instance, sed can't accept arbitrary text on its command line, only commands and filenames. So you can let the shell handle the command line (Section 35.20) and pass information to sed via shell variables, command substitution, and so on. Simply use correct quoting (Section 27.12) to pass information from the shell into the "wrapped" sed script:

|| Section 35.14

#!/bin/sh
# seder - cd to directory in first command-line argument ($1),
# read all files and substitute $2 with $3, write result to stdout
cd "$1" || exit
sed "s/$2/$3/g" *
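To see what the double quotes are doing, here is a minimal sketch that reduces seder to its substitution step. The function name subst and the sample text are made up for illustration; they aren't part of the original script. The shell expands $1 and $2 inside the double quotes before sed runs, so sed sees an ordinary s command:

```shell
#!/bin/sh
# subst - hypothetical helper, not from the original: replace $1 with $2
# in the text read from standard input
subst() {
    # Double quotes let the shell expand $1 and $2 before sed sees
    # the script; single quotes would pass them through unexpanded.
    sed "s/$1/$2/g"
}

# The shell turns the sed script into s/cat/dog/g
echo "the cat sat on the cat mat" | subst cat dog
```

Because the values are pasted straight into the sed script, a slash (or any other character that is special to sed) in either argument changes what the s command means. The same caveat applies to seder.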

In SunExpert magazine, in his article on awk (January, 1991), Peter Collinson suggests a stylization similar to this for awk programs in shell scripts (Section 35.2):

#!/bin/sh
awkprog='
/foo/{print $3}
/bar/{print $4}'

awk "$awkprog" "$@"

He argues that this is more intelligible in long pipelines because it separates the program from the command. For example:

grep foo $input | sed .... | awk "$awkprog" - | ...
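A filled-in version of that kind of pipeline might look like the sketch below. The awk program is the one from the listing above; the function name show_fields, the grep and sed steps, and the sample input are all assumptions chosen to make the example runnable:

```shell
#!/bin/sh
awkprog='
/foo/{print $3}
/bar/{print $4}'

# show_fields - hypothetical name; the grep and sed steps are stand-ins
# for whatever filtering and editing a real pipeline would do
show_fields() {
    grep -v '^#' | sed 's/two/2/' | awk "$awkprog" -
}

# Made-up sample input: a comment line plus three records
printf '%s\n' '# sample' 'foo one two three' \
    'bar one two three' 'baz one two three' |
    show_fields
```

The pipeline itself stays on one short line, and the awk program can be read (and changed) separately at the top of the script.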

Not everyone is thrilled by the "advantages" of writing awk this way, but it's true that there are disadvantages to writing awk the standard way.

Here's an even more complex variation:

<<\ Section 27.16

#!/bin/sh
temp=/tmp/awk.prog.$$
cat > $temp <<\END
/foo/{print $3}
/bar/{print $4}
END
awk -f $temp $1
rm -f $temp

This version makes it a bit easier to create complex programs dynamically. The final awk command becomes the equivalent of a shell eval (Section 27.8); it executes something that has been built up at runtime. The first strategy (program in shell variable) could also be massaged to work this way.
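As a rough sketch of the shell-variable strategy massaged to work this way, the script below assembles the awk program at runtime from pattern and field-number pairs. The helper name add_rule and the sample input are hypothetical, made up for this example:

```shell
#!/bin/sh
# add_rule - hypothetical helper: print one awk rule that matches
# pattern $1 and prints field number $2
add_rule() {
    # Single quotes keep the $ literal; printf fills in the two %s slots
    printf '/%s/{print $%s}\n' "$1" "$2"
}

# Build the program at runtime, one rule per pattern/field pair
awkprog=$(add_rule foo 3; add_rule bar 4)

# awk now runs a program that didn't exist when the script was written
printf '%s\n' 'foo a b c' 'bar a b c' | awk "$awkprog"
```

Here the rules are fixed in the script, but the same calls to add_rule could just as well be driven by command-line arguments or by the contents of a file.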

As another example, a program that I used once was really just one long pipeline, about 200 lines long, with huge awk scripts and sed scripts embedded in the middle. As a result, it was almost completely unintelligible. But if you start each program in the pipeline with a comment block and end it with a pipe, the result can be fairly easy to read. It's more direct than using big shell variables or temporary files, especially if there are several scripts.

#
# READ THE FILE AND DO XXX WITH awk:
#
awk '
   ...the indented awk program...
   ...
   ...
' |
#
# SORT BY THE FIRST FIELD, THEN BY XXX:
#
sort -k 1,1n -k 4r |
#
# MASSAGE THE LINES WITH sed AND XXX:
#
sed '
   ...

Multiline pipes like that one are uglier in the C shell because each line has to end with a backslash (\) (Section 27.13). Section 27.12 and Section 27.13 have more about quoting.
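For a version of the comment-block layout that actually runs under the Bourne shell, here is a small sketch. The function name report, the two-field "name size" input format, and the sample data are assumptions invented for this example; only the layout comes from the listing above:

```shell
#!/bin/sh
# report - hypothetical example: reads "name size" lines on stdin
report() {
#
# SWAP THE FIELDS WITH awk SO THE SIZE COMES FIRST:
#
awk '
    { print $2, $1 }
' |
#
# SORT NUMERICALLY BY THE FIRST FIELD, LARGEST FIRST:
#
sort -k 1,1nr |
#
# LABEL EACH LINE WITH sed:
#
sed '
    s/^/size /
'
}

# Made-up sample data
printf '%s\n' 'a.txt 10' 'b.txt 300' 'c.txt 25' | report
```

Each stage ends with a pipe, so the shell keeps reading past the comment lines to find the next command. No backslashes are needed.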

--ML and JP




Copyright © 2003 O'Reilly & Associates. All rights reserved.