Initially, using sed and awk will seem like the long way
to accomplish a task. After several attempts you may conclude
that the task would have been easier to do manually.
Be patient. You not only have to learn how to use sed and awk but
you also need to learn to recognize situations where using them
pays off. As you become more proficient, you will solve problems
more quickly and solve a broader range of problems.
You will also begin to see opportunities to find general solutions to
specific problems. There is a way of looking at a problem so that you
see it as one instance of a class of problems. Then you can devise a
solution that can be reused in other situations.
\*[CHerrorhand]
"CHerrorhand" is the name given to the reference and "\*[" and
"]" are calling sequences that distinguish the reference from other
text. In a central file, the names used for cross references in the
document are defined as sqtroff strings. For
instance, "CHerrorhand" is defined to be "Chapter 16, Error Handling."
(The advantage of using a symbolic cross-referencing scheme like this,
instead of explicit referencing, is that if chapters are added or
deleted or reordered, only the central file needs to be edited to
reflect the new organization.) When the formatting software processes
the document, the references are properly resolved and expanded.
The problem we faced was that we had to use the same files to create
an online version of the book. Because our sqtroff
formatting software would not be used, we needed some way to expand
the cross references in the files. In other words, we did not want
files containing "\*[CHerrorhand]"; instead we wanted what
"CHerrorhand" referred to.
There were three possible ways to solve this problem:
1. Use a text editor to search for all references and replace
   each of them with the appropriate literal string.
2. Use sed to make the edits. This is similar to making the edits manually,
   only faster.
3. Use awk to write a program that (a) reads the central file to
   make a list of reference names and their definitions, (b) reads
   the document searching for the reference calling sequence,
   and (c) looks up the name of the reference on the list and
   replaces it with its definition.
The first method is obviously time-consuming (and not very
interesting!). The second method, using sed, has an advantage in that
it creates a tool to do the job. It is pretty simple to write a sed
script that looks for "\*[CHerrorhand]" and replaces it with
"Chapter 16, Error Handling", for instance. The same script can be
used to modify each of the files for the document. The disadvantage
is that the substitutions are hard-coded; that is, for each cross
reference, you need to write a command that makes the replacement.
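As a rough illustration of the sed approach, the sketch below expands the one
reference named in the text. The sample sentence and the variable name
"expanded" are made up for the demonstration; only the reference name and its
definition come from the discussion above. Note that the backslash, the
asterisk, and the opening bracket each have to be escaped in the pattern.

```sh
# Hypothetical input line containing the cross reference from the text.
# printf with %s passes the backslash through untouched.
expanded=$(printf '%s\n' 'See \*[CHerrorhand] for details.' |
    # One hard-coded substitution: \\ matches the backslash, \* the
    # asterisk, \[ the opening bracket; the closing ] needs no escape.
    sed 's/\\\*\[CHerrorhand]/Chapter 16, Error Handling/g')

echo "$expanded"
```

This prints "See Chapter 16, Error Handling for details." Every additional
cross reference would need its own s command of the same form, which is the
hard-coding drawback just described.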
The third method, using awk, builds a tool that works for
any cross reference that follows this syntax.
This script could be used to expand cross references in other books as
well. It spares you from having to compile a list of specific
substitutions. It is the most general solution of the three and
is designed for the greatest possible reuse as a tool.
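A minimal sketch of such an awk program follows. It rests on several
assumptions that are not taken from the text: the central file is assumed to
define each string on one line with the troff request ".ds name definition",
the file names refs.defs and chapter.txt are invented, and the name "CHintro"
with its definition is hypothetical. Treat it as one possible shape for steps
(a), (b), and (c), not as the script actually used.

```sh
# Assumed central file: one ".ds name definition" line per reference.
cat > refs.defs <<'EOF'
.ds CHerrorhand Chapter 16, Error Handling
.ds CHintro Chapter 1, Introduction
EOF

# Hypothetical document text using the calling sequence \*[name].
cat > chapter.txt <<'EOF'
Read \*[CHintro] first, then see \*[CHerrorhand].
EOF

expanded=$(awk '
# (a) First file: build the list of names and definitions.
NR == FNR {
    if ($1 == ".ds") {
        name = $2
        def = $0
        # Strip the request and the name, keeping the definition text.
        sub(/^\.ds[ \t]+[^ \t]+[ \t]+/, "", def)
        defs[name] = def
    }
    next
}
# (b) Later files: find each calling sequence on the line.
{
    out = ""
    line = $0
    while (match(line, /\\\*\[[^]]+]/)) {
        # Skip the three characters \*[ and drop the closing ].
        name = substr(line, RSTART + 3, RLENGTH - 4)
        # (c) Look up the name; leave unknown references unchanged.
        repl = (name in defs) ? defs[name] : substr(line, RSTART, RLENGTH)
        out = out substr(line, 1, RSTART - 1) repl
        line = substr(line, RSTART + RLENGTH)
    }
    print out line
}' refs.defs chapter.txt)

echo "$expanded"
```

With the two sample files above, this prints "Read Chapter 1, Introduction
first, then see Chapter 16, Error Handling." Adding a new cross reference
means adding a line to the central file; the program itself does not change.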
Part of solving a problem is knowing which tool to build. There are
times when a sed script is a better choice because the problem does
not lend itself to, or demand, a more complex awk script. You have to
keep in mind what kinds of applications are best suited for sed and
awk.