

UNIX Power Tools

Chapter 43: Printing
 

43.18 How nroff Makes Bold and Underline; How to Remove It

The UNIX formatter nroff produces output for line printers and CRT displays. To achieve such special effects as emboldening, it outputs the character followed by a backspace and then outputs the same character again. A sample of it viewed with a text editor or cat -v (25.7) might look like:

N^HN^HN^HNA^HA^HA^HAM^HM^HM^HME^HE^HE^HE

which emboldens the word "NAME". Each character is printed once and then overstruck three times. Similarly, underlining is achieved by outputting an underscore, a backspace, and then the character to be underlined. Some pagers, such as less (25.4), take advantage of overstruck text. But there are many times when it's necessary to strip these special effects; for example, if you want to grep through formatted man pages (as we do in article 50.3). There are a number of ways to get rid of these decorations. The easiest is to use a utility like col, colcrt, or ul:

  • With col, use the command:

    % col -b < nroffoutput > strippedoutput

    The -b option tells col to drop backspaces and print only the last character written to each column position, which removes the overstruck characters. col doesn't read from named files; it reads only standard input, so you need to feed it from a pipe or, as above, with the shell's < (13.1) file-redirection character. col is available on System V and BSD UNIX. Under System V, add the -x option to avoid changing spaces to TABs.

  • With colcrt, use a command like:

    % colcrt - nroffoutput > strippedoutput

    The - (dash) option (yes, that's an option) says "ignore underlining." If you omit it, colcrt tries to save underlining by putting dashes on a separate line beneath the underlined text. For example:

    Refer to Installing System V for information about
             ---------- ------ -
    installing optional software.

    colcrt is only available under BSD; in any case, col is probably preferable.

  • ul reads your TERM environment variable and tries to translate backspace sequences (underline and overstrike) into something your terminal can understand. It's used like this:

    % ul nroffoutput

    The -t term option lets you specify a terminal type; it overrides the TERM (5.10) variable. I think that ul is probably the least useful of these commands; it tries to be too intelligent, and doesn't always do what you want.
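
To try col -b without a real nroff run, you can fake a little overstruck text with printf. The following is only a sketch: the sample text and the variable names (sample, stripped) are made up for illustration, and it assumes a POSIX shell whose printf understands \b. Since col isn't installed everywhere, the sketch checks for it first.

```shell
# Hypothetical sample of nroff-style output: a bold "NAME" line
# (each letter overstruck three times) and an underlined "note" line.
sample=$(printf 'N\bN\bN\bNA\bA\bA\bAM\bM\bM\bME\bE\bE\bE\n_\bn_\bo_\bt_\be\n')

if command -v col >/dev/null 2>&1; then
    # col reads standard input only, so feed it through a pipe.
    stripped=$(printf '%s\n' "$sample" | col -b)
else
    # col is missing on this system; leave the result empty.
    stripped=
    echo "col is not installed; nothing stripped" >&2
fi

printf '%s\n' "$stripped"
```

Where col is available, this should print the two plain words NAME and note, one per line.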

Both col and colcrt attempt to handle "half linefeeds" (used to print superscripts and subscripts) reasonably. Many printers handle half linefeeds correctly, but most terminals can't deal with them.

Here's one other solution to the problem: a simple sed (34.24) script. The virtue of this solution is that you can elaborate on it, adding other features that you'd like, or integrating it into larger sed scripts. The following sed command removes the sequences for emboldening and underscoring:

s/.^H//g

It removes each backspace along with the character preceding it. In the case of underlining, "." matches the underscore; for emboldening, it matches the overstrike character. Because the g flag applies the substitution to every occurrence on the line, all of the overstrikes are removed, leaving a single character for each sequence. Note that ^H is the single character CTRL-h, not a caret followed by an H. If you're a vi user, enter this character by typing CTRL-v followed by CTRL-h (31.6). If you're an emacs user, type CTRL-q followed by CTRL-h (32.10).
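
If you'd rather not type a literal CTRL-h at all, one workaround is to let printf generate the backspace and keep it in a shell variable. This is a sketch under a few assumptions: the variable names (bs, sample, stripped) and the sample text are invented for the example, and the shell's printf must understand \b.

```shell
# Hold a literal backspace (CTRL-h) in a variable so it never has to be typed.
bs=$(printf '\b')

# Hypothetical sample of nroff-style output: a bold "NAME" line
# and an underlined "note" line.
sample=$(printf 'N\bN\bN\bNA\bA\bA\bAM\bM\bM\bME\bE\bE\bE\n_\bn_\bo_\bt_\be\n')

# Before: cat -v makes each backspace visible as ^H.
printf '%s\n' "$sample" | cat -v

# After: delete every "any character + backspace" pair on each line.
stripped=$(printf '%s\n' "$sample" | sed "s/.${bs}//g")
printf '%s\n' "$stripped"
```

The cat -v lines show the ^H sequences as in the sample at the top of this article; the last command prints plain NAME and note. The same sed expression can sit in front of grep in a pipeline, for instance sed "s/.${bs}//g" nroffoutput | grep pattern.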

- DD, ML

