Advanced Flow Control Commands (sed & awk, Second Edition)

6.4. Advanced Flow Control Commands

You have already seen several examples of changes in sed's normal flow control. In this section, we'll look at two commands that allow you to direct which portions of the script get executed and when. The branch (b) and test (t) commands transfer control in a script to a line containing a specified label. If no label is specified, control passes to the end of the script. The branch command transfers control unconditionally while the test command is a conditional transfer, occurring only if a substitute command has changed the current line.

A label is placed on a line by itself and begins with a colon:

:mylabel

There are no spaces permitted between the colon and the label. Spaces at the end of the line will be considered part of the label. When you specify the label in a branch or test command, a space is permitted between the command and the label itself:

b mylabel

Be sure you don't put a space after the label.
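As a quick illustration of both commands, here is a minimal sketch run from a POSIX shell. The sample input and the shell variable names out_b and out_t are made up for this example. The first command uses b with no label to skip the substitution on lines containing "keep"; the second uses the label again with the test command to repeat a substitution until it no longer changes the line.

```shell
# Unconditional branch: "b" with no label jumps to the end of the script,
# so lines matching /keep/ never reach the substitute command.
out_b=$(printf '%s\n' 'keep foo' 'change foo' | sed '/keep/b
s/foo/bar/')

# Conditional branch: "t again" jumps back to the label only if the
# preceding substitute command changed the line, which forms a loop.
out_t=$(printf '%s\n' 'a--b----c' | sed ':again
s/--/-/
t again')

printf '%s\n' "$out_b" "$out_t"
```

With this input, the first command should leave "keep foo" alone and turn "change foo" into "change bar", while the loop should reduce every run of hyphens to one, giving "a-b-c".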

6.4.3. One More Case

Remember Lenny? He was the fellow given the task of converting Scribe documents to troff. We had sent him the following script:

# Scribe font change script. 
s/@f1(\([^)]*\))/\\fB\1\\fR/g
/@f1(.*/{
N
s/@f1(\(.*\n[^)]*\))/\\fB\1\\fR/g
P
D
}

He sent the following mail after using the script:

Thank you so much!  You've not only fixed the script but shown me
where I was confused about the way it works.  I can repair the
conversion script so that it works with what you've done, but to be
optimal it should do two more things that I can't seem to get working
at all--maybe it's hopeless and I should be content with what's
there.  

First, I'd like to reduce multiple blank lines down to one.
Second, I'd like to make sed match the pattern over more than two
(say, even only three) lines.  

Thanks again.  

Lenny

The first request, to reduce a series of blank lines to one, has already been addressed in this chapter. The following four lines perform this function:

/^$/{
N
/^\n$/D
}
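To try these four lines, the following sketch feeds them a made-up sample containing three consecutive blank lines. The variable names squeeze and out are hypothetical, and the input deliberately does not end with a blank line, since the behavior of N at the end of input differs between sed implementations.

```shell
# The blank-line script from above, stored in a shell variable.
squeeze='/^$/{
N
/^\n$/D
}'

# Sample input: "a", three blank lines, then "b".
out=$(printf 'a\n\n\n\nb\n' | sed "$squeeze")

printf '%s\n' "$out"
```

The output should be "a", a single blank line, and "b".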

We want to look mainly at accomplishing the second request. Our previous font-change script created a two-line pattern space, tried to make the match across those lines, and then output the first line. The second line became the first line in the pattern space and control passed to the top of the script where another line was read in.
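The limitation can be seen with a small experiment. This is only a sketch: the sample text and the variable names old_script, two, and three are invented for illustration. It runs the earlier script on a font change that spans two lines and on one that spans three.

```shell
# The earlier two-line font change script, stored in a shell variable.
old_script='s/@f1(\([^)]*\))/\\fB\1\\fR/g
/@f1(.*/{
N
s/@f1(\(.*\n[^)]*\))/\\fB\1\\fR/g
P
D
}'

# A font change spanning two lines fits in the two-line pattern space.
two=$(printf '%s\n' 'see @f1(this spans' 'two lines) end' | sed "$old_script")

# A font change spanning three lines never fits, so nothing is replaced.
three=$(printf '%s\n' 'see @f1(this spans' 'three whole' 'lines) end' | sed "$old_script")

printf '%s\n' "$two" '---' "$three"
```

The two-line case should come out with \fB and \fR inserted, while the three-line case should pass through unchanged because the closing parenthesis is never in the pattern space together with the opening @f1(.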

We can use labels to set up a loop that reads multiple lines and makes it possible to match a pattern across multiple lines. The following script sets up two labels: begin at the top of the script and again near the bottom. Look at the improved script:

# Scribe font change script.  New and Improved.
:begin
/@f1(\([^)]*\))/{
s//\\fB\1\\fR/g
b begin
}
/@f1(.*/{
N
s/@f1(\([^)]*\n[^)]*\))/\\fB\1\\fR/g
t again
b begin
}
:again
P
D

Let's look more closely at this script, which has three parts. Beginning with the line that follows :begin, the first part attempts to match the font change syntax if it is found completely on one line. After making the substitution, the branch command transfers control back to the label begin. In other words, once we have made a match, we go back to the top and look for other possible matches, trying again the instruction that has already been applied, because there could be multiple occurrences on the line.

The second part attempts to match the pattern over multiple lines. The Next command builds a multiline pattern space. The substitute command then attempts to locate the pattern with an embedded newline. If it succeeds, the test command passes control to the line following the again label. If no substitution is made, control is passed to the line following the label begin so that we can read in another line. This loop goes into effect when we've matched the beginning sequence of a font change request but have not yet found the ending sequence. Sed will loop back and keep appending lines to the pattern space until a match is found.

The third part is the procedure following the label again. The first line in the pattern space is output and then deleted. As in the previous version of this script, we deal with multiple lines in succession. Control never reaches the bottom of the script; instead, the Delete command redirects it to the top of the script.
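The three parts can be exercised together with a short test run. The following is a sketch under stated assumptions: the sample input and the variable names new_script and out are made up, and every font change in the sample is closed before the end of input so that the behavior of N on the last line does not come into play.

```shell
# The improved font change script, stored in a shell variable.
new_script=':begin
/@f1(\([^)]*\))/{
s//\\fB\1\\fR/g
b begin
}
/@f1(.*/{
N
s/@f1(\([^)]*\n[^)]*\))/\\fB\1\\fR/g
t again
b begin
}
:again
P
D'

# Line 1 holds a complete font change; lines 2-4 hold one spanning three lines.
out=$(printf '%s\n' \
  'one @f1(bold) two' \
  'see @f1(this spans' \
  'three whole' \
  'lines) end' | sed "$new_script")

printf '%s\n' "$out"
```

The first line is handled by the first part of the script. For the second font change, the loop appends lines until the closing parenthesis arrives, and then P and D print the converted lines one at a time. The output should be:

```
one \fBbold\fR two
see \fBthis spans
three whole
lines\fR end
```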
