21.2. Neatening Text with fmt
One of the problems
with fold is that it breaks text at an arbitrary
column position -- even if that position happens to be in the
middle of a word. It's a pretty primitive utility,
designed to keep long lines from printing off the edge of a line
printer page, and not much more.
fmt can do a better job because it thinks in terms
of language constructs like paragraphs.
fmt refills each paragraph, joining short lines as well as
breaking long ones. It assumes that paragraphs end at blank lines.
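To see the difference on a small scale, here is a minimal sketch that runs the same sentence through both utilities. The sample sentence and the 18-column width are made up for illustration, and the -w spelling of the width option is assumed to work with your fmt (it does with GNU fmt).

```shell
# A single long line; the text is invented for this example.
text='The quick brown fox jumps over the lazy dog near the riverbank.'

# fold cuts at exactly 18 columns, even in the middle of a word.
folded=$(printf '%s\n' "$text" | fold -w 18)

# fmt breaks only at whitespace, so every word stays whole.
filled=$(printf '%s\n' "$text" | fmt -w 18)

printf 'fold:\n%s\n\nfmt:\n%s\n' "$folded" "$filled"
```

In the fold output the first line ends partway through "fox"; in the fmt output no word is split.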
You can use fmt for things like neatening lines of
a mail message or a file that you're editing
with vi (Section 17.28). (Emacs has
its own built-in line-neatener.) It's also great for
shell programming and almost any place you have lines that are too
long or too short for your screen.
To make this discussion more concrete, let's imagine
that you have the following paragraph:
Most people take their Emo Phillips for granted. They figure, and not
without some truth, that he is a God-given right and any government that
considers itself a democracy would naturally provide
its citizens with this
sort of access. But what too many of this Gap-wearing,
Real World-watching generation fail to realize
is that our American
forefathers, under the tutelage of Zog, the wizened master sage from
Zeta-Reticuli, had to fight not only the godless and effete British
for our system of self-determined government, but also avoid the terrors
of hynpo-death from the dark and
unclean Draco-Repitilians.
To prepare this text for printing, you'd like to
have all the lines be about 60 characters wide and remove the extra
space in the lines. Although you could format this text by hand, GNU
fmt can do this for you with the following command
line:
% fmt -tuw 60 my_file
The -t option,
short for --tagged-paragraph, tells
fmt to preserve the paragraph's
initial indent but align the rest of the lines with the left margin
of the second line. The -u option, short for
--uniform-spacing, reduces the spacing to one space
between words and two spaces after sentences. The final option, -w,
sets the maximum width of the output in characters. Like most Unix commands,
fmt sends its output to
stdout. For our test paragraph,
fmt produced this:
Most people take their Emo Phillips for granted.
They figure, and not without some truth, that he is a
God-given right and any government that considers itself a
democracy would naturally provide its citizens with this
sort of access. But what too many of this Gap-wearing,
Real World-watching generation fail to realize is that
our American forefathers, under the tutelage of Zog,
the wizened master sage from Zeta-Reticuli, had to fight
not only the godless and effete British for our system of
self-determined government, but also avoid the terrors of
hynpo-death from the dark and unclean Draco-Repitilians.
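If you'd like to try these options without a file of your own, the following sketch builds a small ragged paragraph and checks the result. It assumes GNU fmt, since -t and -u mean different things (or don't exist) in other versions. The sample text and the variable names are invented for this example.

```shell
# Build a small, unevenly filled paragraph (sample text invented for this sketch).
# The first line is indented; the second line sets the margin for the rest.
sample=$(mktemp)
cat > "$sample" <<'EOF'
    Line-oriented   tools   rarely care about paragraphs,
so text that has been edited by hand
often ends up with ragged lines   and   stray runs of spaces that make it harder to read.
EOF

# -t keeps the first line's indent, -u normalizes spacing, -w caps the width.
neat=$(fmt -t -u -w 60 "$sample")
printf '%s\n' "$neat"

# Length of the longest output line, for comparison with the requested width.
longest=$(printf '%s\n' "$neat" |
  awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }')
printf 'longest line: %s characters\n' "$longest"

rm -f "$sample"
```

The longest line should come out at 60 characters or fewer, with the first line still indented.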
There is one
subtlety to fmt to be aware of:
fmt expects sentences to end with a period,
question mark, or exclamation point followed by two spaces (or by
the end of the line). If your
document doesn't follow this
convention, fmt can't
differentiate between sentences and abbreviations. This
"gotcha" comes up frequently on
Usenet.
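A short experiment makes the two-space rule visible. This sketch assumes GNU fmt and its -u option, and the sentence is made up for the example.

```shell
# More than one space follows "arrived." but only one follows "Dr." and "late."
# (the sentence is invented for this example).
input='Dr. Jones   arrived.   He was late. Nobody minded.'

# With -u, GNU fmt puts two spaces after what it takes to be a sentence end
# and one space everywhere else.
spaced=$(printf '%s\n' "$input" | fmt -u)
printf '%s\n' "$spaced"
```

With GNU fmt, only "arrived." should end up followed by two spaces. "late." is followed by a single space in the input, so fmt treats it the same way as the abbreviation "Dr." and leaves one space after it.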
WARNING:
On at least one version of Unix,
fmt is a disk initializer (disk formatter)
command. Don't run that command
accidentally! Check your online manual page and see the
fmt equivalents that follow.
There are a few different versions of fmt, some
fancier than others. In general, the program assumes the following:
- Paragraphs have blank lines between them.
- If a line is indented, the indentation should be preserved.
- The output lines should be about 70 characters wide. Some have a
  command-line option to let you set this. For example, fmt -132
  (or on some versions, fmt -l 132)
  would reformat your file to have lines with no more than 132
  characters on each.
- It reads files or standard input. Lines will be written to standard
  output.
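These assumptions are easy to check from the command line. The sketch below feeds two short paragraphs to fmt on standard input and captures what comes out on standard output. The sample words are invented, and the -w spelling of the width option is an assumption that holds for GNU fmt; your version may want one of the other forms mentioned above.

```shell
# Two short paragraphs separated by a blank line, fed to fmt on standard input.
result=$(printf 'one two\nthree four\n\nfive six\nseven eight\n' | fmt -w 70)

# Each paragraph is filled into a single line; the blank line between them stays.
printf '%s\n' "$result"
```

The lines within each paragraph are joined, but the blank line keeps the two paragraphs apart.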
The GNU
fmt is on the CD-ROM [see http://examples.oreilly.com/upt3]. There are also a couple of
free versions available. Many versions of fmt have
options for other structured data. The -p option (Section 21.4)
reformats program source code. (If your fmt
doesn't have -p, the recomment (Section 21.4)
script uses standard fmt with
sed to do the same thing.) The -s
option breaks long lines at whitespace but doesn't
join short lines to form longer ones.
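Here is a small sketch of the -s behavior, assuming GNU fmt. The input lines and the 20-column width are invented for the example.

```shell
# Two short lines followed by one long line (sample text invented for this sketch).
ragged=$(printf 'short\nlines\nthis line is much too long to fit in twenty columns\n' |
  fmt -s -w 20)

# With -s, the short lines are left alone and only the long line is split.
printf '%s\n' "$ragged"
```

The first two lines should come through unchanged, with the long line split across several lines below them. Check your manual page before relying on this, since -s does not mean the same thing in every version of fmt.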
Alternatively, you can make your own simple (and a little slower)
version (Section 21.3)
with sed and nroff. If you want
to get fancy (and use some nroff and/or
tbl coding), this approach will let you produce automatically
formatted text tables, bulleted lists, and much more.
--JP, TOR, and JJ
Copyright © 2003 O'Reilly & Associates. All rights reserved.