34.15 Making Edits Across Line Boundaries

Most programs that use regular expressions (26.4) are able to match a pattern only on a single line of input. This makes it difficult to find or change a phrase, for instance, because it can start near the end of one line and finish near the beginning of the next line. Other patterns might be significant only when repeated on multiple lines.

sed has the ability to load more than one line into the pattern space. This allows you to match (and change) patterns that extend over multiple lines. In this article, we show how to create a multiline pattern space and manipulate its contents.

The multiline Next command, N, creates a multiline pattern space
by reading a new line of input and appending it to the
contents of the pattern space.
The original contents of the pattern space and the new input line
are separated by a newline.
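
One way to see the embedded newline for yourself is sed's l (list) command, which prints the pattern space in an unambiguous form. The following is only a minimal sketch: it assumes a POSIX-style sed and shell, and the two input lines and the variable name out are made up for illustration.

```shell
# Hypothetical two-line input. -n suppresses sed's default output, so only
# the l command prints anything.
out=$(printf '%s\n' 'one' 'two' | sed -n -e 'N' -e 'l')

# l shows the embedded newline as \n and marks the end of the pattern
# space with $.
printf '%s\n' "$out"    # prints: one\ntwo$
```

Both input lines appear in a single pattern space, joined by the newline that N inserted.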
The embedded newline character can be matched in patterns
by the escape sequence \n.

The Next command differs from the next command, n, which outputs the contents of the pattern space and then reads a new line of input. The next command does not create a multiline pattern space.

For our first example, let's suppose that we wanted to
change "Owner and Operator Guide" to "Installation Guide"
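but we found that it appears in the file on two lines,
splitting between "Operator" and "Guide". For instance, here are a few lines of sample text:

    Consult Section 3.1 in the Owner and Operator
    Guide for a description of the tape drives
    available on your system.

The following script looks for "Operator" at the end of a line, reads the next line of input, and then makes the replacement:

    /Operator$/{
    N
    s/Owner and Operator\nGuide/Installation Guide/
    }
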
In this example, we know where the two lines split and
where to specify the embedded newline.
When the script is run on the sample file, it produces
two lines of output, one of which combines
the first and second lines and is too long
to show here.
This happens because the substitute command matches
the embedded newline but does not replace it.
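
The effect is easy to reproduce. The sketch below keeps the script in a shell variable and feeds it the three sample lines. The variable names are ours, the line breaks in the sample are the ones we take the text to describe, and a POSIX-style sed is assumed.

```shell
# The script from the text, stored in a shell variable (name is illustrative).
script='/Operator$/{
N
s/Owner and Operator\nGuide/Installation Guide/
}'

# The three sample lines, fed to sed on standard input.
out=$(printf '%s\n' \
    'Consult Section 3.1 in the Owner and Operator' \
    'Guide for a description of the tape drives' \
    'available on your system.' | sed "$script")

printf '%s\n' "$out"

# Three lines went in, but only two come out: the substitution consumed
# the embedded newline without putting one back.
printf '%s\n' "$out" | wc -l
```

The first output line holds everything from "Consult" through "tape drives", which is the long line mentioned above.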
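Unfortunately, standard sed does not let you use \n to insert a newline in the replacement string. You must either use a backslash to escape a literal newline, as follows:

    s/Owner and Operator\nGuide /Installation Guide\
    /

or use the \( and \) grouping operators, together with \1, to keep the newline that was matched:

    s/Owner and Operator\(\n\)Guide /Installation Guide\1/

This command restores the newline after "Installation Guide". It is also necessary to match the blank space following "Guide" so that the new line won't begin with a space. Now we can show the output:

    Consult Section 3.1 in the Installation Guide
    for a description of the tape drives
    available on your system.

Remember, you don't have to replace the newline, but if you don't, it can make for some long lines.

What if there are other occurrences of "Owner and Operator Guide" that break over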
multiple lines in different places? You could
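change the address to match "Owner", the first word in the pattern instead of the last, and then modify the regular expression to look for a space or a newline between words:

    /Owner/{
    N
    s/Owner *\n*and *\n*Operator *\n*Guide/Installation Guide/
    }

The asterisk (*) indicates that the space or newline is optional. This is hard work, though, and there is a more general way: read the next line into the pattern space and then use a substitute command to remove the embedded newline, wherever it is:

    s/Owner and Operator Guide/Installation Guide/
    /Owner/{
    N
    s/ *\n/ /
    s/Owner and Operator Guide */Installation Guide\
    /
    }

The first line of the script matches "Owner and Operator Guide" when it appears all on one line. (Why this is necessary is discussed after the example.) If a line matches "Owner", we read the next line into the pattern space and replace the embedded newline with a space. Then we try to match the whole pattern and make the replacement, followed by a newline. This script matches "Owner and Operator Guide" regardless of how it is broken across two lines. Here is our expanded test file:

    Consult Section 3.1 in the Owner and Operator
    Guide for a description of the tape drives
    available on your system.

    Look in the Owner and Operator Guide shipped with your system.

    Two manuals are provided, including the Owner and
    Operator Guide and the User Guide.

    The Owner and Operator Guide is shipped with your system.

Running the above script on the sample file produces the following result:

    Consult Section 3.1 in the Installation Guide
    for a description of the tape drives
    available on your system.

    Look in the Installation Guide shipped with your system.

    Two manuals are provided, including the Installation Guide
    and the User Guide.

    The Installation Guide is shipped with your system.

In this sample script, it might seem redundant to have two substitute commands that match the pattern. The first command matches it when the pattern is found already on one line, and the second matches the pattern after two lines have been read into the pattern space. Why the first command is necessary is perhaps best demonstrated by removing that command from the script and running it on the sample file:

    Consult Section 3.1 in the Installation Guide
    for a description of the tape drives
    available on your system.

    Look in the Installation Guide
    shipped with your system.
    Two manuals are provided, including the Installation Guide
    and the User Guide.

Do you see the two problems?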
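
If you want to reproduce the comparison yourself, something like the following sketch may help. It assumes a POSIX-style sed and shell; the variable names (sample, joined, full, short) are ours, and the line breaks in the sample are the ones we take the text to describe. It runs the script with and without the first substitute command and prints both results for inspection.

```shell
# Sample text held in a shell variable (the line breaks are an assumption).
sample='Consult Section 3.1 in the Owner and Operator
Guide for a description of the tape drives
available on your system.

Look in the Owner and Operator Guide shipped with your system.

Two manuals are provided, including the Owner and
Operator Guide and the User Guide.

The Owner and Operator Guide is shipped with your system.'

# The block that joins two lines before substituting. The backslash at the
# end of the replacement escapes a literal newline.
joined='/Owner/{
N
s/ *\n/ /
s/Owner and Operator Guide */Installation Guide\
/
}'

# Full script: the one-line substitution first, then the joining block.
full=$(printf '%s\n' "$sample" |
    sed -e 's/Owner and Operator Guide/Installation Guide/' -e "$joined")

# The same script with the first substitute command removed.
short=$(printf '%s\n' "$sample" | sed -e "$joined")

printf '%s\n' '--- both commands ---' "$full"
printf '%s\n' '--- first command removed ---' "$short"
```

Some implementations, GNU sed among them, treat the final input line differently when N has nothing left to read, so the tail of the second result can vary from one system to another.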
The most obvious problem is that the last line
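did not print. The last line matches "Owner", and when N is executed there is no next line of input to read, so sed quits without printing the pattern space. To be safe, the Next command should be written as follows:

    $!N

It excludes the last line ($) from the Next command. In our full script, the first substitute command changes "Owner and Operator Guide" on the last line, so the line no longer matches "Owner" and N is never applied to it. If the word "Owner" appeared on the last line in some other context, however, we would have the same problem unless we used $!N.

The second problem is a little less conspicuous. It has
to do with the occurrence of "Owner and Operator Guide" in the second paragraph. In the input file, it is found on a single line:

    Look in the Owner and Operator Guide shipped with your system.

In the output shown above, the blank line following "shipped with your system." is missing. This line matches "Owner", so the next line, a blank line, is appended to the pattern space; the substitute command then removes the embedded newline, and the blank line in effect vanishes. The best solution is to avoid reading the next line when the pattern can be matched on one line, which is why the first command in the script tries the one-line case first.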
- from O'Reilly & Associates' sed & awk, Chapter 6