The Next command (N) differs from the next command (n), which outputs the
contents of the pattern space and then reads a new line of input. Unlike
N, the next command does not create a multiline pattern space.
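The difference can be seen in a minimal sketch like the one below. The four-line numeric input and the variable names are assumptions made for illustration; the same substitution is attempted after each command, and it can only succeed when the pattern space actually contains an embedded newline.

```shell
# Four numbered lines; an even count avoids the end-of-input edge case for N.
input=$(printf '1\n2\n3\n4')

# n prints the current line and replaces it with the next one,
# so the pattern space never holds an embedded newline and s finds nothing.
with_n=$(printf '%s\n' "$input" | sed -e 'n' -e 's/\n/+/')

# N appends the next line to the pattern space, so the embedded
# newline is there to be matched and replaced.
with_N=$(printf '%s\n' "$input" | sed -e 'N' -e 's/\n/+/')

printf 'after n:\n%s\n\nafter N:\n%s\n' "$with_n" "$with_N"
```

With n the input passes through unchanged; with N each pair of lines is joined by a plus sign.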
For our first example, let's suppose that we wanted to change "Owner
and Operator Guide" to "Installation Guide" but we found that it
appears in the file on two lines, splitting between "Operator" and
"Guide."
For instance, here are a few lines of sample text:
Consult Section 3.1 in the Owner and Operator
Guide for a description of the tape drives
available on your system.
The following script looks for "Operator" at the end of a line, reads
the next line of input and then makes the replacement.
/Operator$/{
N
s/Owner and Operator\nGuide/Installation Guide/
}
In this example, we know where the two lines split and where to
specify the embedded newline. When the script is run on the sample
file, it produces two lines of output, one of which combines the
first and second lines and is too long to show here. This happens
because the substitute command matches the embedded newline but does
not replace it. Unfortunately, you cannot portably use "\n" to
insert a newline in the replacement string (some implementations,
such as GNU sed, accept it as an extension). Instead, use
a backslash to escape a literal newline, as follows:
s/Owner and Operator\nGuide /Installation Guide\
/
This command restores the newline after "Installation Guide". It is
also necessary to match a space following "Guide" so the new
line won't begin with a space. Now we can show the output:
Consult Section 3.1 in the Installation Guide
for a description of the tape drives
available on your system.
Remember, you don't have to restore the newline, but if you don't,
the joined lines can become very long.
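To compare the two variants side by side, here is a small sketch that feeds the three sample lines to both versions of the script. Passing the script as a command-line argument and the variable names sample, joined, and restored are assumptions made for this illustration; the sed commands themselves are the ones shown above.

```shell
sample='Consult Section 3.1 in the Owner and Operator
Guide for a description of the tape drives
available on your system.'

# Version 1: match the embedded newline but do not put one back.
joined=$(printf '%s\n' "$sample" | sed '/Operator$/{
N
s/Owner and Operator\nGuide/Installation Guide/
}')

# Version 2: also consume the space after "Guide" and restore the
# newline with a backslash followed by a literal newline.
restored=$(printf '%s\n' "$sample" | sed '/Operator$/{
N
s/Owner and Operator\nGuide /Installation Guide\
/
}')

printf 'joined:\n%s\n\nrestored:\n%s\n' "$joined" "$restored"
```

The first version yields two output lines, the first of them long; the second yields three lines, with "for a description of the tape drives" starting a line of its own.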
What if there are other occurrences of "Owner and Operator Guide" that
break over multiple lines in different places? You could modify the
regular expression to look for a space or a newline between words, as
shown below:
/Owner/{
N
s/Owner *\n*and *\n*Operator *\n*Guide/Installation Guide/
}
Each asterisk makes the preceding space or newline optional, so the
words can be separated by either one. We have also changed the address
to match "Owner," the first word in the pattern, instead of the last.
This seems like hard work, though, and indeed there is a more general
way: we can read the next line into the pattern space and then use a
substitute command to remove the embedded newline, wherever it is.
s/Owner and Operator Guide/Installation Guide/
/Owner/{
N
s/ *\n/ /
s/Owner and Operator Guide */Installation Guide\
/
}
The first line matches "Owner and Operator Guide" when it appears on a
line by itself. (See the discussion after the example about why this
is necessary.) If we match the string "Owner," we read the next line
into the pattern space, and replace the embedded newline with a space.
Then we attempt to match the whole pattern and make the replacement
followed by a newline. This script will match "Owner and Operator
Guide" regardless of how it is broken across two lines. Here's our
expanded test file:
Consult Section 3.1 in the Owner and Operator
Guide for a description of the tape drives
available on your system.

Look in the Owner and Operator Guide shipped with your system.

Two manuals are provided including the Owner and
Operator Guide and the User Guide.

The Owner and Operator Guide is shipped with your system.
Running the above script on the sample file produces the following
result:
$ sed -f sedscr sample
Consult Section 3.1 in the Installation Guide
for a description of the tape drives
available on your system.

Look in the Installation Guide shipped with your system.

Two manuals are provided including the Installation Guide
and the User Guide.

The Installation Guide is shipped with your system.
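For readers who want to reproduce this run without creating files, the following sketch holds both the test text and the script in shell variables. The variable names are assumptions, and the sample is written with blank lines between its paragraphs, which the later discussion of the vanishing blank line relies on.

```shell
sample='Consult Section 3.1 in the Owner and Operator
Guide for a description of the tape drives
available on your system.

Look in the Owner and Operator Guide shipped with your system.

Two manuals are provided including the Owner and
Operator Guide and the User Guide.

The Owner and Operator Guide is shipped with your system.'

# Same commands as the script file above, passed as one argument.
sedscr='s/Owner and Operator Guide/Installation Guide/
/Owner/{
N
s/ *\n/ /
s/Owner and Operator Guide */Installation Guide\
/
}'

result=$(printf '%s\n' "$sample" | sed "$sedscr")
printf '%s\n' "$result"
```

Every occurrence of the title is replaced, whether it sits on one line or is split across two, and the blank lines between paragraphs survive.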
In this sample script, it might seem redundant to have two substitute
commands that match the pattern. The first one matches it when the
pattern is found already on one line and the second matches the
pattern after two lines have been read into the pattern space. Why
the first command is necessary is perhaps best demonstrated by
removing that command from the script and running it on the sample
file:
$ sed -f sedscr2 sample
Consult Section 3.1 in the Installation Guide
for a description of the tape drives
available on your system.

Look in the Installation Guide
shipped with your system.
Two manuals are provided including the Installation Guide
and the User Guide.
Do you see the two problems? The most obvious problem is that the
last line did not print. The last line matches "Owner," and when
N is executed there is no further input line to
read, so sed quits immediately, without even outputting the line.
(This is the traditional behavior; GNU sed, by default, prints the
pattern space before exiting.) To be safe, use the Next command as
follows:
$!N
This excludes the last line ($) from the Next command. Our original
script avoids the problem only incidentally: the first substitute
command rewrites "Owner and Operator Guide" on the last line, so the
line no longer matches "Owner" and the N command is never applied.
However, if the word "Owner" appeared by itself on the last line, we'd
have the same problem unless we used the "$!N" syntax.
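As an illustration of the safer form, the sketch below runs the multiline part of the script with "$!N" on a made-up three-line input whose last line contains "Owner" but not the full title. The input text and variable names are hypothetical; only the "$!N" address comes from the discussion above.

```shell
# Hypothetical input: the last line matches /Owner/ but has no next line.
input='Read the Owner and Operator
Guide first.
Then ask the Owner.'

result=$(printf '%s\n' "$input" | sed '/Owner/{
$!N
s/ *\n/ /
s/Owner and Operator Guide */Installation Guide\
/
}')

printf '%s\n' "$result"
```

Because N is skipped on the last line, that line is simply passed through, and the result does not depend on how a particular sed handles N at end of input.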
The second problem is a little less conspicuous. It has to do with
the occurrence of "Owner and Operator Guide" in the second paragraph.
In the input file, it is found on a line by itself:
Look in the Owner and Operator Guide shipped with your system.
In the output shown above, the blank line following "shipped with your
system." is missing. The reason is that this line matches
"Owner" and the next line, a blank line, is appended to the pattern
space. The first substitute command replaces the embedded newline with
a space, so the blank line has in effect vanished; the second then
inserts a newline after "Installation Guide," which is why "shipped
with your system." ends up on a line of its own. (If the following line
were not blank, the newline would still be removed and its text would
appear on the same line as "shipped with your system.") The best
solution is to avoid reading the next line when the pattern can be
matched on one line. That is why the first instruction attempts to
match the case where the string appears all on one line.
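The effect of that first instruction can be isolated in a short sketch. The three-line input (a title on one line, a blank line, and a made-up following paragraph) and the names unguarded and guarded are assumptions for this illustration.

```shell
input='Look in the Owner and Operator Guide shipped with your system.

Next paragraph.'

# Without the leading one-line substitution, N pulls in the blank line
# and the following substitutions swallow it.
unguarded=$(printf '%s\n' "$input" | sed '/Owner/{
N
s/ *\n/ /
s/Owner and Operator Guide */Installation Guide\
/
}')

# With it, the line no longer matches /Owner/, so N never runs here
# and the blank line is left alone.
guarded=$(printf '%s\n' "$input" | sed 's/Owner and Operator Guide/Installation Guide/
/Owner/{
N
s/ *\n/ /
s/Owner and Operator Guide */Installation Guide\
/
}')

printf 'unguarded:\n%s\n\nguarded:\n%s\n' "$unguarded" "$guarded"
```

In the unguarded run the blank line disappears and the sentence is split after "Installation Guide"; in the guarded run the sentence stays on one line and the blank line is preserved.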