45.30 Grabbing Parts of a String

How can you parse (split, search) a string of text to find the last word, the second column, and so on? There are a lot of different ways. Pick the one that works best for you - or invent another one! (UNIX has lots of ways to work with strings of text.)

45.30.1 Matching with expr
The expr command (45.28) can grab part of a string with a regular expression. The example below is from a shell script whose last command-line argument is a filename. The two commands below use expr to grab the last argument and all arguments except the last one.
    last=`expr "$*" : '.* \(.*\)'`     # LAST ARGUMENT
    first=`expr "$*" : '\(.*\) .*'`    # ALL BUT LAST ARGUMENT

Let's look at the regular expression that gets the last word.
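The leading part of the expression, '.* ' (dot, star, space), matches as many characters as it can, followed by a space. That takes in every word up to and including the last space. The end of the expression, \(.*\), then matches the last word.

The regular expression that grabs the first words is the same as the previous one - but I've moved the \( \) pair, so now it grabs everything up to, but not including, the last space.

expr is great when you want to split a string into just two parts.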
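As a quick check of the two patterns above, here is a minimal sketch that fakes a command line with set. The three arguments (alpha, beta, report.txt) are made-up sample values, not taken from the original script, and the set -- form assumes a POSIX-style shell.

```shell
#!/bin/sh
# Hypothetical arguments standing in for a real command line.
set -- alpha beta report.txt

# The greedy ".* " swallows everything through the last space;
# \(.*\) then captures whatever follows it.
last=`expr "$*" : '.* \(.*\)'`

# Here \(.*\) captures everything before the last space.
first=`expr "$*" : '\(.*\) .*'`

echo "last:  $last"
echo "first: $first"
```

With these sample arguments, last should hold report.txt and first should hold alpha beta. If the script gets only one argument, there is no space for either pattern to match, so both variables should come back empty; a real script would probably want to test for that case.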
45.30.2 Using echo with awk, colrm, or cut

awk can split lines into words. But awk has a lot of overhead and can take some time to execute, especially on a busy system. The cut (35.14) and colrm (35.15) commands start more quickly than awk but they can't do as much.

All of those utilities are designed to handle multiple lines of text. You can tell awk to handle a single line with its pattern-matching operators and its NR variable. You can also run those utilities with a single line of text, fed to the standard input through a pipe from echo (8.6). For example, to get the third field from a colon-separated string:

    string="this:is:just:a:dummy:string"
    field3_awk=`echo "$string" | awk -F: '{print $3}'`
    field3_cut=`echo "$string" | cut -d: -f3`

Let's combine two echo commands. One sends text to awk, cut, or colrm through a pipe; the utility ignores all the text from columns 1-24, then prints columns 25 to the end of the variable text. The outer echo prints "The answer is" and that answer. Notice that the inner double quotes are escaped with backslashes to keep the Bourne shell from interpreting them before the inner echo runs:

    echo "The answer is `echo \"$text\" | awk '{print substr($0,25)}'`"
    echo "The answer is `echo \"$text\" | cut -c25-`"
    echo "The answer is `echo \"$text\" | colrm 1 24`"

45.30.3 Using set
The Bourne shell set (44.19) command can be used to parse a single-line string and store it in the command-line parameters (44.15).
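Here is a minimal sketch of that idea, assuming a POSIX-style shell where set -- is available. The sample strings and the variable names (string, oldifs, count, third, field3) are only for illustration. The second half is an assumption that goes a step beyond the text above: it temporarily changes the shell's IFS variable so the string is split on colons instead of whitespace.

```shell
#!/bin/sh
# Hypothetical sample string; any single line of words would do.
string="this is just a dummy string"

# Leave $string unquoted so the shell splits it into words.
# (Wildcard characters in the string would also be expanded here.)
set -- $string
count=$#
third=$3
echo "The string has $count words; the third is $third"

# Assumption: split on colons by changing IFS temporarily.
string="this:is:just:a:dummy:string"
oldifs=$IFS
IFS=:
set -- $string
IFS=$oldifs
field3=$3
echo "The third colon-separated field is $field3"
```

Because set replaces the script's own arguments, save anything you still need from the original command line before using this trick.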
45.30.4 Using sed

The UNIX sed (34.24) utility is good at parsing input that you may or may not be able to split into words otherwise, at finding a single line of text in a group and outputting it, and many other things. In this example, I want to get the percentage-used of the filesystem mounted on /home. That information is buried in the output of the df (24.9) command. On my system, df output looks like:

%

I want the number 99 from the line ending with /home.
The sed address / \/home$/ selects that line:

    usage=`df | sed -n '/ \/home$/s/.* \([0-9][0-9]*\)%.*/\1/p'`

Combining sed with eval (8.10) lets you set several shell variables at once from parts of the same line. Here's a command line that sets two shell variables from the df output:

    eval `df | sed -n '/ \/home$/s/^[^ ]* *\([0-9]*\) *\([0-9]*\).*/kb=\1 u=\2/p'`

The left-hand side of that substitution command has a regular expression that uses sed's escaped parenthesis operators. They grab the "kbytes" and "used" columns from the df output. The right-hand side outputs the two df values with Bourne shell variable-assignment commands to set the kb and u variables. After sed finishes, the resulting command line looks like this:

    eval kb=597759 u=534123

Now kb contains 597759 and u contains 534123.
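Since df output differs from system to system, the sed commands above can be tried against canned input first. In this sketch, sample_df is a hypothetical stand-in for df, and the lines it prints are invented sample data: only the 597759, 534123, and 99 figures come from the text, and the column layout is an assumption about what a BSD-style df prints.

```shell
#!/bin/sh
# sample_df is a hypothetical stand-in for df. Its output is invented
# sample data laid out like a BSD-style df listing.
sample_df() {
    cat <<'EOF'
Filesystem  kbytes    used   avail capacity  Mounted on
/dev/sd0a    15487    9874    4064    71%    /
/dev/sd0h   597759  534123    3860    99%    /home
EOF
}

# Grab the digits just before the % sign on the /home line.
usage=`sample_df | sed -n '/ \/home$/s/.* \([0-9][0-9]*\)%.*/\1/p'`

# Turn the second and third columns into shell assignments, then run them.
eval `sample_df | sed -n '/ \/home$/s/^[^ ]* *\([0-9]*\) *\([0-9]*\).*/kb=\1 u=\2/p'`

echo "usage=$usage kb=$kb u=$u"
```

Once the patterns behave on the sample, swapping sample_df for df should be the only change needed, provided the real df prints its columns in the same order.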