On a typewriter-like device (including a CRT), an em-dash
is typed as a pair of hyphens (--).
Similarly, a typesetter provides "curly" quotation marks (“ and ”)
as opposed to a typewriter's straight quotes (").
A peculiarity of troff
is that it generates the space before each word in the
font used at the beginning of that word. This means that when we
mix a constant-width font such as Courier within text, we get a
noticeably large space before each word, which can be distracting
for readers.
The solution for each of these problems is to preprocess troff input with sed. This is an application that shows sed in its role as a true stream editor, making edits in a pipeline that are never written back into a file.
We almost never invoke troff directly. Instead, we invoke it with a script that strings together a pipeline including the standard preprocessors (when appropriate) as well as doing this special preprocessing with sed.
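As a rough illustration of such a wrapper, the sketch below runs the sed cleanup first and pipes the result to the formatter. The names format_doc, cleanup.sed, and FORMATTER are hypothetical, not taken from our actual script, and a real version would add preprocessors such as tbl or eqn and whatever troff options the document needs.

```shell
#!/bin/sh
# Hypothetical wrapper (all names are assumptions for illustration):
# apply a sed cleanup script to the input files, then hand the edited
# stream to the formatter. The source files themselves are never modified.
format_doc() {
    script=$1      # path to the sed script, e.g. cleanup.sed
    shift          # remaining arguments are the troff source files
    sed -f "$script" "$@" | ${FORMATTER:-troff}
}
```

Setting FORMATTER=cat when calling format_doc is a convenient way to inspect exactly what the formatter would receive.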
The sed commands themselves are fairly simple.
The following command changes two consecutive dashes into an em-dash:
    s/--/\\(em/g
We double the backslashes in the replacement string for \(em, since the backslash has a special meaning to sed.
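To see the effect of the doubled backslash, the substitution can be tried on a line of sample text. This is a minimal sketch; the function name emdash and the sample sentence are made up for illustration.

```shell
#!/bin/sh
# Replace every pair of hyphens with troff's em-dash escape, \(em.
# Inside the single quotes, sed sees \\(em and emits one backslash.
emdash() {
    sed 's/--/\\(em/g'
}

printf '%s\n' 'It works--mostly--as expected.' | emdash
# prints: It works\(emmostly\(emas expected.
```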
However, there may be cases in which we don't want this substitution command to be applied. What if someone is using hyphens to draw a horizontal line? We can refine the script to exclude lines containing three or more consecutive hyphens. To do this, we use the ! address operator:
    /---/!s/--/\\(em/g
It may take a moment to penetrate this syntax. What's different is that we use a pattern address to restrict the lines that are affected by the substitute command, and we use ! to reverse the sense of the pattern match. It says, simply, "If you find a line containing three consecutive hyphens, don't apply the edit." On all other lines, the substitute command will be applied.
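A short test makes the behavior concrete. In this sketch (the function name emdash_safe and the sample lines are illustrative assumptions), the row of hyphens passes through untouched while the other lines are edited.

```shell
#!/bin/sh
# Apply the em-dash substitution only on lines that do NOT contain
# three consecutive hyphens; the ! negates the /---/ pattern address.
emdash_safe() {
    sed '/---/!s/--/\\(em/g'
}

printf '%s\n' 'one--two' '----------' 'a -- b' | emdash_safe
# prints:
# one\(emtwo
# ----------
# a \(em b
```

Note that the address applies to the whole line, so a genuine pair of hyphens that shares a line with a row of three or more is also left alone.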
Similarly, to deal with the font change problem, we can use sed
to search for all strings matching
To deal with the open and closed quote problem, the script needs to be more involved because there are many separate cases that must be accounted for. You need to make sed smart enough to change double quotes to open quotes only at the beginning of words and to change them to closed quotes only at the end of words. Such a script might look like the one below, which obviously could be shortened by judicious application of regular expression syntax, but it is shown in its long form for effect.
    s/^"/``/
    s/"$/''/
    s/"? /''? /g
    s/"?$/''?/
    s/ "/ ``/g
    s/" /'' /g
    s/ [TAB] "/ [TAB] ``/g
    s/" [TAB] /'' [TAB] /g
    s/")/'')/g
    s/"]/'']/g
    s/("/(``/g
    s/\["/\[``/g
    s/";/'';/g
    s/":/'':/g
    s/,"/,''/g
    s/",/'',/g
    s/\."/.\\\&''/g
    s/"\./''.\\\&/g
    s/"\\(em/''\\(em/g
    s/\\(em"/\\(em``/g
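To get a feel for how these rules interact, it helps to run a handful of them over a sample sentence. The sketch below uses only a subset of the commands, stored in a temporary file so that the backquotes and single quotes need no shell escaping. The function name fix_quotes and the sample text are assumptions made for illustration.

```shell
#!/bin/sh
# Write a subset of the quote-fixing rules to a temporary sed script.
# The quoted 'EOF' keeps the shell from interpreting backquotes.
script=$(mktemp)
cat > "$script" <<'EOF'
s/^"/``/
s/"$/''/
s/ "/ ``/g
s/" /'' /g
s/",/'',/g
s/"\./''.\\\&/g
EOF

fix_quotes() {
    sed -f "$script"
}

printf '%s\n' '"Stop," she said, and "go" later.' | fix_quotes
# prints: ``Stop,'' she said, and ``go'' later.
```

The order of the commands matters here: the opening quote at the start of the line is converted first, and the rule for a quote followed by a space then handles both closing quotes.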