35.10 Splitting Files by Context: csplit
Let's look at search patterns first. Suppose you have an outline consisting of three main sections. You could create a separate file for each section by typing:
%
This command creates four new files (outline remains intact). csplit displays the character counts for each file. Note that the first file (xx00) contains any text up to but not including the first pattern, and that xx01 contains the first section, as you'd expect. This is why the naming scheme begins with 00. (Even if outline had begun immediately with a line matching the first pattern, csplit would still create xx00; it would simply be empty.)
If you don't want to save the text that occurs before a specified pattern, use a percent sign as the pattern delimiter:
%
The preliminary text file has been suppressed, and the created files now begin where the actual outline starts (the file numbering is off, however).
Let's make some further refinements. We'll use the -s option to suppress the display of the character counts, and we'll use the -f option to specify a file prefix other than the conventional xx:
%
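The effect of the two pattern delimiters and of the -s and -f options can be tried on a throwaway file. The following is only a sketch: the outline contents and the sect. prefix are made up for illustration, and it assumes a csplit that behaves like the GNU coreutils version. The periods in the patterns are escaped so that they match literally.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Hypothetical outline with one line of preliminary text.
printf '%s\n' 'Title page' \
    'I. First' 'a' 'II. Second' 'b' 'III. Third' 'c' > outline

# Slash delimiters: the text before the first match is saved in xx00.
csplit -s outline '/^I\./' '/^II\./' '/^III\./'
ls xx*        # xx00 xx01 xx02 xx03

# Percent delimiters on the first pattern: the preliminary text is
# skipped, so numbering starts with the first section. -s silences
# the counts and -f replaces the xx prefix.
csplit -s -f sect. outline '%^I\.%' '/^II\./' '/^III\./'
ls sect.*     # sect.00 sect.01 sect.02
```

With the percent form, Section I lands in sect.00 rather than sect.01, which is the off-by-one numbering mentioned above.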
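There's still a slight problem, though. In search patterns, a period is a metacharacter (26.10) that matches any single character, so a pattern containing an unescaped period can match more than the heading you intended. Putting a backslash before the period, and quoting the pattern so that the shell passes the backslash through, makes it match a literal period: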
%
You can also break a file at repeated occurrences of the same pattern. Let's say you have a file that describes 50 ways to cook a chicken, and you want each method stored in a separate file. Each section begins with headings WAY #1, WAY #2, and so on. To divide the file, use csplit's repeat argument:
%
This command splits the file at the first occurrence of WAY, and the number in braces tells csplit to repeat the split 49 more times. Note that a caret is used to match the beginning of the line and that the C shell requires quotes around the braces (9.5). The command has created 50 files:
%
Quite often, when you want to split a file repeatedly, you don't know or don't care how many files will be created; you just want to make sure that the necessary number of splits takes place. In this case, it makes sense to specify a repeat count that is slightly higher than what you need (maximum is 99). Unfortunately, if you tell csplit to create more files than it's able to, this produces an "out of range" error. Furthermore, when csplit encounters an error, it exits by removing any files it created along the way. (A bug, if you ask me.) This is where the -k option comes in. Specify -k to keep the files around, even when the "out of range" message occurs.
csplit allows you to break a file at some number of lines above or below a given search pattern. For example, to break a file at the line that is five lines below the one containing Sincerely, you could type:
%
This situation might arise if you have a series of business letters strung together in one file. Each letter begins differently, but each one begins five lines after the previous letter's Sincerely line. Here's another example, adapted from AT&T's UNIX User's Reference Manual:
%
The idea is that the file prog.c contains a group of C routines, and we want to place each one in a separate file (routine.00, routine.01, etc.). The first pattern uses
The csplit command takes line-number arguments in addition to patterns. You can say:
%
to create files split at some arbitrary line numbers. In that example, the new file xx00 will have lines 1-49 (49 lines total), xx01 will have lines 50-372 (323 lines total), xx02 will have lines 373-954 (582 lines total), and xx03 will hold the rest of stuff.
csplit works like split if you repeat the argument. The command:
%
breaks the list into 19 segments of 10 lines each. [5]
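The repeat, offset, and line-number arguments described in this section can be tried together on small scratch files. This is a sketch rather than a reconstruction of the commands above: the file contents, the cook., letter., and seg. prefixes, and the repeat counts are all invented for the demonstration, and it assumes a csplit that behaves like the GNU coreutils version. Because a piece ends just before the line number given (as with lines 1-49 for an argument of 50), the first piece of a repeated line-number split is one line shorter than the ones that follow.

```shell
# Scratch directory so the demo cannot clobber real files.
cd "$(mktemp -d)"

# 1. Pattern repeat with a deliberately generous count.
printf '%s\n' 'WAY #1' 'fry' 'WAY #2' 'roast' 'WAY #3' 'grill' > ways
# The %...% pattern skips to the first heading without creating a file.
# csplit then runs out of matches and exits non-zero, but -k leaves the
# pieces it has already written in place.
csplit -s -k -f cook. ways '%^WAY%' '/^WAY/' '{9}' || true
ls cook.*

# 2. Offset: start the next piece five lines below the matching line.
printf '%s\n' 'Dear A' 'text' 'Sincerely,' 1 2 3 4 'Dear B' 'text' > letters
csplit -s -f letter. letters '/Sincerely/+5'
head -n 1 letter.01

# 3. Line numbers: each piece ends just before the number given.
seq 1 40 > stuff
csplit -s stuff 10 25              # lines 1-9, 10-24, 25-40
csplit -s -f seg. stuff 10 '{2}'   # splits before lines 10, 20, 30
wc -l xx0? seg.*
```

When every piece really must have the same number of lines, split is the simpler tool.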