

UNIX Power Tools

Chapter 43: Printing

43.19 Removing Leading Tabs and Other Trivia

In article 43.18 we discussed several techniques for removing overstriking and underlining from nroff output. Of course, that's not the only problem you'll face when you're working with nroff. Here are some more postprocessing tricks for nroff files.

You may also want to remove escape sequences that produce formfeeds or trigger other printer functions. For example, you sometimes see the sequence ^[9 (ESC followed by 9) at the top of a formatted manual page. This escape sequence can be removed with the sed command:

s/^[9//g

In this command, ^[ stands for a single ESC character, not a caret followed by a bracket. The ESC character is entered in vi by typing CTRL-v (31.6) followed by the ESC key. In Emacs, use CTRL-q ESC (32.10). The number 9 is literal.
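If typing a literal ESC into a script is awkward, the shell can generate the character instead. The following is only a sketch, assuming a POSIX shell with printf; the function name strip_esc9 is made up for this illustration and does not come from the article.

```shell
# Build the ESC character (octal 033) with printf instead of typing it.
esc=$(printf '\033')

# strip_esc9 is a hypothetical helper name, used only for this example.
# It deletes every ESC-9 sequence from its standard input.
strip_esc9() {
    sed "s/${esc}9//g"
}

# Sample input: a line that starts with ESC 9, followed by ordinary text.
printf '\033%s\n' '9WHO(1)' | strip_esc9
```

The sample line comes out with the two-character escape sequence removed and the rest of the text untouched.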

The typical manual page also uses leading whitespace to establish the left margin and to indent most of the text. On closer inspection, you'll see that leading spaces precede headings (such as "NAME"), but a single TAB precedes each line of text. TABs may also appear unexpectedly within the text. Of course, using TABs wherever possible is a good idea on the whole; on a mechanical printer, and even on modern CRT displays, it's much quicker to print a TAB than to move the cursor over several spaces. However, the TABs can cause trouble if your printer (or terminal) isn't set correctly, or when you're trying to search for something in the text.

To eliminate the left margin and the unwanted TABs, use the following two sed commands:

s/^[ [TAB]]*//
s/[TAB]/ /g

Here, [TAB] stands for a literal TAB character. The first command deletes any run of spaces or TABs at the beginning of a line. The second command replaces each remaining TAB with a single space.
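The two substitutions can be tried from the command line without typing invisible characters. This is a minimal sketch, assuming a POSIX shell; the helper name strip_margin is hypothetical, and the TAB is generated with printf rather than typed.

```shell
# Build a literal TAB with printf so the commands contain no invisible characters.
tab=$(printf '\t')

# strip_margin is a hypothetical helper name, used only for this example.
# First expression: delete leading spaces and TABs.
# Second expression: replace each remaining TAB with one space.
strip_margin() {
    sed -e "s/^[ ${tab}]*//" -e "s/${tab}/ /g"
}

# Sample input: a heading indented with spaces, then a TAB-indented line
# that also has a TAB in the middle of the text.
printf '     NAME\n\twho - who\tis on the system?\n' | strip_margin
```

Both lines come out flush left, and the embedded TAB becomes a single space.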

Now, let's put all these pieces together, including the script to strip underlines and overstrikes (from article 43.18). Here's a script called sedman that incorporates all of these tricks.


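#!/bin/sed -f
# sedman - deformat nroff-formatted man page
# Remove overstrikes and underlines: any character followed by a backspace (^H)
s/.^H//g
# Remove the ESC-9 escape sequence (^[ is a literal ESC)
s/^[9//g
# Strip the left margin: leading spaces and TABs ([TAB] is a literal TAB)
s/^[ [TAB]]*//
# Replace each remaining TAB with a single space
s/[TAB]/ /g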

Running this script on a typical manual page produces a file that looks like this:

who                                                     who


NAME
who - who is on the system?

SYNOPSIS
who [-a] [-b] [-d] [-H] [-l] [-p] [-q] [-r] [-s] [-t] [-T]
[-u] [file]


who am i

DESCRIPTION
who can list the user's name, terminal line, login time,
elapsed time since activity occurred on the line, and the
...

This doesn't eliminate the unnecessary blank lines caused by paging. See articles 34.18, 25.11, and 25.10 for help with that.
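As a rough illustration of the kind of cleanup involved, one common sed idiom collapses each run of blank lines into a single blank line. It is not necessarily the method used in the articles cited above, and the function name squeeze_blank is hypothetical. It only treats truly empty lines as blank, so it works best after the left margin has already been stripped.

```shell
# squeeze_blank is a hypothetical helper name, used only for this example.
# On an empty line, N appends the next line to the pattern space.
# If that next line is also empty, D drops the first one and restarts the cycle,
# so only one blank line from each run survives.
squeeze_blank() {
    sed -e '/^$/N' -e '/\n$/D'
}

# Sample input: three blank lines between two lines of text.
printf 'NAME\n\n\n\nSYNOPSIS\n' | squeeze_blank
```

The three blank lines between the two headings are reduced to one.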

- DD, ML

