Sed is easily seen as the flip side of interactive editing. A sed
procedure corresponds closely to the editing commands you would apply
manually, so sed limits you to the methods you use in a text editor.
Awk offers a more general computational model for processing a file.
A typical example of an awk program is one that transforms data into a
formatted report. The data might be a log file generated by a UNIX
program such as uucp, and the report might
summarize the data in a format useful to a system administrator.
Another example is a data processing application consisting of
separate data entry and data retrieval programs. Data entry is the
process of recording data in a structured way. Data retrieval is the
process of extracting data from a file and generating a report.
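
As a rough illustration of the report idea, the sketch below totals transfers per user and prints aligned columns. The three-field log layout ("user host bytes"), the sample data, and the function name summarize_log are all invented for this example; a real uucp log looks different.

```shell
# Hypothetical log: one transfer per line as "user host bytes".
# This is NOT the real uucp log format; it is made up for illustration.
summarize_log() {
    awk '
        { bytes[$1] += $3; count[$1]++ }   # accumulate totals keyed by user
        END {
            printf "%-8s %5s %8s\n", "USER", "XFERS", "BYTES"
            for (user in bytes)            # traversal order is unspecified
                printf "%-8s %5d %8d\n", user, count[user], bytes[user]
        }
    ' "$@"
}

# Sample run on three invented log lines.
printf '%s\n' 'alice hostA 1200' 'bob hostB 300' 'alice hostC 800' | summarize_log
```

Because awk does not guarantee the order of a for-in loop, the rows may come out in any order; piping them through sort is one way to get a stable report.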
The key to all of these operations is that the data has some kind of
structure. Let us illustrate this with the analogy of a bureau. A
bureau consists of multiple drawers, and each drawer has a certain set
of contents: socks in one drawer, underwear in another, and sweaters
in a third drawer. Sometimes drawers have compartments allowing
different kinds of things to be stored together. These are all
structures that determine where things go--when you are sorting
the laundry--and where things can be found--when you are
getting dressed. Awk allows you to use the structure of a text file
in writing the procedures for putting things in and taking things out.
Thus, the benefits of awk are best realized when the data has some
kind of structure. A text file can be loosely or tightly structured.
A chapter containing major and minor sections has some structure.
We'll look at a script that extracts section headings and numbers
them to produce an outline. A table consisting of tab-separated items
in columns might be considered very structured. You could use an awk
script to reorder columns of data, or even change columns into rows
and rows into columns.
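
Reordering columns can be sketched in a single awk rule. The example assumes a tab-separated table with at least three columns; the function name swap_columns and the sample rows are made up here.

```shell
# Swap the first two columns of a tab-separated table, keeping the third.
swap_columns() {
    awk '
        BEGIN { FS = "\t"; OFS = "\t" }   # read and write tab-separated fields
        { print $2, $1, $3 }              # emit the fields in the new order
    ' "$@"
}

# Sample run on two invented rows.
printf 'a\tb\tc\n1\t2\t3\n' | swap_columns
```

Setting OFS as well as FS matters: without it, print would join the reordered fields with spaces instead of tabs.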
Like sed scripts, awk scripts are typically invoked by means of a
shell wrapper. This is a shell script that usually contains the
command line that invokes awk as well as the script that awk
interprets. Simple one-line awk scripts can be entered from the
command line.
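
The two styles of invocation might look like this. The wrapper name nfields and the temporary directory are assumptions made so that the example is self-contained; in practice you would create the wrapper file in an editor and put it somewhere on your search path.

```shell
# Write a hypothetical wrapper script called "nfields" to a scratch directory.
dir=$(mktemp -d)
cat > "$dir/nfields" <<'EOF'
#!/bin/sh
# nfields: print the number of fields on each input line.
awk '{ print NF }' "$@"
EOF

# Run the wrapper. With chmod +x and a directory on PATH, it could be
# called by name like any other command.
printf 'one two three\nfour five\n' | sh "$dir/nfields"

# The same job as a one-line script typed directly on the command line.
printf 'one two three\nfour five\n' | awk '{ print NF }'
```

The wrapper passes "$@" through to awk, so any filenames given to it become awk's input files.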
Some of the things awk allows you to do are:
View a text file as a textual database made up of records and fields.
Use variables to manipulate the database.
Use arithmetic and string operators.
Use common programming constructs such as loops and conditionals.
Generate formatted reports.
Define functions.
Execute UNIX commands from a script.
Process the result of UNIX commands.
Process command-line arguments more gracefully.
Work more easily with multiple input streams.
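
A few of these features fit in one small script. The sketch below treats each line as a record with a name field and a count field, and uses a variable, arithmetic, a pattern acting as a conditional, and a user-defined function. The data layout and the names field_stats and max are invented for this illustration.

```shell
# Summarize "name count" records read from standard input or from files.
field_stats() {
    awk '
        function max(a, b) { return a > b ? a : b }   # user-defined function

        NF >= 2 {                                 # only records with two or more fields
            total += $2                           # arithmetic on the second field
            longest = max(longest, length($1))    # built-in string function
        }

        END { printf "records=%d total=%d longest=%d\n", NR, total, longest }
    ' "$@"
}

# Sample run on three invented records.
printf 'apple 3\nkiwi 10\nbanana 7\n' | field_stats
```

User-defined functions require a POSIX-style awk (the "new awk"); a very old awk may reject the function line.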
Because of these features, awk has the power and range to handle many
of the tasks that users might otherwise perform with shell scripts.
In this book, you'll see examples of a menu-based command generator,
an interactive spelling checker, and an index processing program, all
of which use the features outlined above.
The capabilities of awk extend the idea of text editing into
computation, making it possible to perform a variety of data
processing tasks, including analysis, extraction, and reporting of
data. These are, indeed, the most common uses of awk, but there are
also many unusual applications: awk has been used to write a Lisp
interpreter and even a compiler!