[Chapter 4] 4.4 Command Substitution

From the discussion so far, we've seen two ways of getting values into variables: by assignment statements and by the user supplying them as command-line arguments (positional parameters). There is another way: command substitution , which allows you to use the standard output of a command as if it were the value of a variable. You will soon see how powerful this feature is.

The syntax of command substitution is: [11]

$(UNIX command)

[11] Bourne and C shell users should note that the command substitution syntax of those shells, `UNIX command` (with backward quotes, a.k.a. grave accents), is also supported by the Korn shell for backward compatibility reasons. However, Korn shell documentation considers this syntax archaic. It is harder to read and less conducive to nesting.

The command inside the parentheses is run, and anything the command writes to standard output is returned as the value of the expression. These constructs can be nested, i.e., the UNIX command can itself contain command substitutions.
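As a minimal sketch of nesting, the following lines find the name of the directory that contains a file. The path is a made-up value used only for illustration; the backquote version is included to show why the older syntax is harder to nest.

```shell
# Hypothetical path, used only for illustration.
path=/usr/local/bin/ksh

# The inner substitution runs first: dirname yields /usr/local/bin,
# then basename reduces that to its last component.
parent=$(basename "$(dirname "$path")")
echo "$parent"        # prints: bin

# The same nesting with backquotes needs the inner pair escaped.
parent_old=`basename \`dirname $path\``
echo "$parent_old"    # prints: bin
```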

Here are some simple examples:

The value of $(pwd) is the current directory (same as the environment variable $PWD).
The value of $(ls) is the names of all files in the current directory, separated by NEWLINEs.
To find out detailed information about a command if you don't know where its file resides, type ls -l $(whence -p command). The -p option forces whence to do a pathname lookup and not consider keywords, built-ins, etc.
To get the contents of a file into a variable, you can use varname=$(< filename). $(cat filename) will do the same thing, but the shell catches the former as a built-in shorthand and runs it more efficiently.
If you want to edit (with emacs) every chapter of your book on the Korn shell that has the phrase "command substitution," assuming that your chapter files all begin with ch, you could type:
```
emacs $(grep -l 'command substitution' ch*)
```
The -l option to grep prints only the names of files that contain matches.
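To see what such a substitution would hand to the editor before actually launching it, you can capture the result in a variable first. The sketch below builds three throwaway chapter files in a scratch directory; the file contents and the variable names are assumptions made for the example.

```shell
# Scratch directory with three hypothetical chapter files.
workdir=$(mktemp -d)
echo 'All about command substitution.' > "$workdir/ch1"
echo 'All about pipelines.'            > "$workdir/ch2"
echo 'More on command substitution.'   > "$workdir/ch3"

# grep -l prints only the names of the files that contain a match.
matches=$(cd "$workdir" && grep -l 'command substitution' ch*)
echo "$matches"

# $(cat file) captures a file's contents; trailing newlines are dropped.
first=$(cat "$workdir/ch1")
echo "$first"

rm -r "$workdir"
```

Here matches holds the names ch1 and ch3, one per line, which is exactly the list emacs would receive as arguments.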

Command substitution, like variable expansion, is done within double quotes. Therefore, our rule from Chapter 1 and Chapter 3 about using single quotes for strings unless they contain variables will now be extended: "When in doubt, use single quotes, unless the string contains variables or command substitutions, in which case use double quotes."
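The difference between the two kinds of quotes is easy to check. In this small sketch (the variable names are arbitrary), the same text is assigned once in single quotes and once in double quotes:

```shell
# Single quotes: the substitution is left as literal text.
single='The directory is $(echo /tmp)'

# Double quotes: the command runs and its output is spliced in.
double="The directory is $(echo /tmp)"

echo "$single"    # prints: The directory is $(echo /tmp)
echo "$double"    # prints: The directory is /tmp
```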

You will undoubtedly think of many ways to use command substitution as you gain experience with the Korn shell. One that is a bit more complex than those mentioned previously relates to a customization task that we saw in Chapter 3 : personalizing your prompt string.

Recall that you can personalize your prompt string by assigning a value to the variable PS1 . If you are on a network of computers, and you use different machines from time to time, you may find it handy to have the name of the machine you're on in your prompt string. Most newer versions of UNIX have the command hostname (1), which prints the network name of the machine you are on to standard output. (If you do not have this command, you may have a similar one like gethostname .) This command enables you to get the machine name into your prompt string by putting a line like this in your .profile or environment file:

PS1="$(hostname) \$ "

(The second dollar sign must be preceded by a backslash so that the shell will take it literally.) For example, if your machine had the name coltrane , then this statement would set your prompt string to " coltrane $ ".
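If you are not sure that hostname exists on every machine you use, one possible approach is to fall back to another command. The sketch below assumes that uname -n is available as the fallback; the variable host is just an illustrative name.

```shell
# Use hostname if present; otherwise fall back to uname -n,
# which also prints the machine's network node name.
host=$(hostname 2>/dev/null || uname -n)

# The backslash keeps the second dollar sign literal.
PS1="$host \$ "
```

Note that the command substitution runs once, when the assignment is executed, so the prompt keeps that machine name for the rest of the session.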

Command substitution helps us with the solution to the next programming task, which relates to the album database in Task 4-1.

Task 4.4

The file used in Task 4-1 is actually a report derived from a bigger table of data about albums. This table consists of several columns, or fields , to which a user refers by names like "artist," "title," "year," etc. The columns are separated by vertical bars ( | , the same as the UNIX pipe character). To deal with individual columns in the table, field names need to be converted to field numbers.

Suppose there is a shell function called getfield that takes the field name as argument and writes the corresponding field number on the standard output. Use this routine to help extract a column from the data table.

The cut(1) utility is a natural fit for this task. cut is a data filter: it extracts columns from tabular data. [12] If you supply the numbers of columns you want to extract from the input, cut will print only those columns on the standard output. Columns can be character positions or (as is relevant in this example) fields that are separated by TAB characters or other delimiters.

[12] Some older BSD-derived systems don't have cut, but you can use awk instead. Whenever you see a command of the form:
cut -fN -dC filename
use this instead:
awk -FC '{print $N}' filename

Assume that the data table in our task is a file called albums and that it looks like this:

Coltrane, John|Giant Steps|Atlantic|1960|Ja
Coltrane, John|Coltrane Jazz|Atlantic|1960|Ja
Coltrane, John|My Favorite Things|Atlantic|1961|Ja
Coltrane, John|Coltrane Plays the Blues|Atlantic|1961|Ja
...

Here is how we would use cut to extract the fourth (year) column:

cut -f4 -d\| albums

The -d argument is used to specify the character used as field delimiter ( TAB is the default). The vertical bar must be backslash-escaped so that the shell doesn't try to interpret it as a pipe.
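You can try this on a small copy of the table. The following sketch writes the four sample rows to a temporary file (the variable names are arbitrary) and extracts the year column with cut and, for comparison, with the awk equivalent:

```shell
# Temporary copy of the sample rows from the albums table.
albums=$(mktemp)
cat > "$albums" <<'EOF'
Coltrane, John|Giant Steps|Atlantic|1960|Ja
Coltrane, John|Coltrane Jazz|Atlantic|1960|Ja
Coltrane, John|My Favorite Things|Atlantic|1961|Ja
Coltrane, John|Coltrane Plays the Blues|Atlantic|1961|Ja
EOF

# Quoting the bar works just as well as backslash-escaping it.
years_cut=$(cut -f4 -d'|' "$albums")
years_awk=$(awk -F'|' '{print $4}' "$albums")

echo "$years_cut"
rm -f "$albums"
```

Both variables end up holding the same four years, one per line.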

From this line of code and the getfield routine, we can easily derive the solution to the task. Assume that the first argument to getfield is the name of the field the user wants to extract. Then the solution is:

fieldname=$1
cut -f$(getfield $fieldname) -d\| albums
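The task only supposes that getfield exists. One hypothetical way to write it is a simple case statement, sketched below. The field names label and genre, the helper getcolumn, and the order of the columns are assumptions based on the sample rows, not part of the task; the functions use the name() form so that the sketch also runs in other Bourne-compatible shells.

```shell
# Hypothetical getfield: map a field name to its column number.
getfield() {
    case $1 in
        artist) echo 1 ;;
        title)  echo 2 ;;
        label)  echo 3 ;;
        year)   echo 4 ;;
        genre)  echo 5 ;;
        *)      echo "getfield: unknown field: $1" >&2
                return 1 ;;
    esac
}

# Extract one named column from a '|'-delimited file.
getcolumn() {
    fieldname=$1
    file=$2
    cut -f"$(getfield "$fieldname")" -d'|' "$file"
}

# One sample row in a temporary file.
sample=$(mktemp)
printf '%s\n' 'Coltrane, John|Giant Steps|Atlantic|1960|Ja' > "$sample"

title=$(getcolumn title "$sample")
year=$(getcolumn year "$sample")
echo "$title ($year)"    # prints: Giant Steps (1960)
rm -f "$sample"
```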

If we called this script with the argument year, the output would be:

1960
1960
1961
1961
...

Here's another small task that makes use of cut .

Task 4.5

Send a mail message to everyone who is currently logged in.

The command who (1) tells you who is logged in (as well as which terminal they're on and when they logged in). Its output looks like this:

billr      console      May 22 07:57
fred       tty02        May 22 08:31
bob        tty04        May 22 08:12

The fields are separated by spaces, not TABs. Since we need the first field, we can get away with using a space as the field separator in the cut command. (Otherwise we'd have to use the option to cut that uses character columns instead of fields.) To provide a space character as an argument on a command line, you can surround it by quotes:


$ who | cut -d' ' -f1

With the above who output, this command's output would look like this:

billr
fred
bob

This leads directly to a solution to the task. Just type:

$ mail $(who | cut -d' ' -f1)

The command mail billr fred bob will run and then you can type your message.
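A user who is logged in on several terminals appears more than once in the who output and would be listed more than once as a recipient. A possible refinement, sketched here with a hypothetical helper function and sample input in place of a live who, is to pass the names through sort -u:

```shell
# Reduce who-style output on standard input to unique login names.
logged_in_users() {
    cut -d' ' -f1 | sort -u
}

# Sample input in the who format; fred is logged in twice.
sample_who='billr      console      May 22 07:57
fred       tty02        May 22 08:31
bob        tty04        May 22 08:12
fred       tty05        May 22 09:02'

recipients=$(printf '%s\n' "$sample_who" | logged_in_users)

# Dry run: print the mail command instead of running it.
echo mail $recipients    # prints: mail billr bob fred
```

On a real system the equivalent command would be mail $(who | logged_in_users).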

Here is another task that shows how useful command pipelines can be in command substitution.

Task 4.6

The ls command gives you pattern-matching capability with wildcards, but it doesn't allow you to select files by modification date . Devise a mechanism that lets you do this.

This task was inspired by the feature of the VAX/VMS operating system that lets you specify files by date with BEFORE and SINCE parameters. We'll do this in a limited way now and add features in the next chapter.

Here is a function that allows you to list all files that were last modified on the date you give as argument. Once again, we choose a function for speed reasons. No pun is intended by the function's name:

function lsd {
    date=$1
    # Double quotes let the shell expand $date inside the pattern.
    ls -l | grep -i "^.\{41\}$date" | cut -c55-
}

This function depends on the column layout of the ls -l command. In particular, it depends on dates starting in column 42 and filenames starting in column 55. If this isn't the case in your version of UNIX, you will need to adjust the column numbers. [13]

[13] For example, ls -l on SunOS 4.1.x has dates starting in column 33 and filenames starting in column 46.

We use the grep search utility to match the date given as argument (in the form Mon DD, e.g., Jan 15 or Oct  6, the latter having two spaces) to the output of ls -l. This gives us a long listing of only those files whose dates match the argument. The -i option to grep allows you to use all lowercase letters in the month name, while the rather fancy argument means, "Match any line that begins with any 41 characters followed by the function argument." For example, typing lsd 'jan 15' causes grep to search for lines that begin with any 41 characters followed by jan 15 (or Jan 15). [14]

[14] Some older BSD-derived versions of UNIX (without System V extensions) do not support the \{N\} construct. For this example, use 41 periods in a row instead of .\{41\}.

The output of grep is piped through our ubiquitous friend cut to retrieve the filenames only. The argument to cut tells it to extract characters in column 55 through the end of the line.
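Because the real column positions vary between systems, it can help to test the grep and cut stages on lines whose layout you control. The sketch below uses a hypothetical helper, filter_by_date, that reads a listing from standard input, and two synthetic lines built to match the layout assumed in the text (dates in column 42, filenames in column 55).

```shell
# Keep lines whose date field (column 42 onward) starts with $1,
# then print everything from column 55 on (the filename).
filter_by_date() {
    grep -i "^.\{41\}$1" | cut -c55-
}

# Two synthetic ls -l lines: 41 columns of padded file details,
# a 12-character date and time, a space, then the filename.
listing=$(
    printf '%-41s%s %s\n' \
        '-rw-r--r--  1 billr  staff  1234' 'Jan 15 10:30' 'report.txt' \
        '-rw-r--r--  1 billr  staff   987' 'Oct  6 09:12' 'notes.txt'
)

jan=$(printf '%s\n' "$listing" | filter_by_date 'jan 15')
oct=$(printf '%s\n' "$listing" | filter_by_date 'oct  6')
echo "$jan"    # prints: report.txt
echo "$oct"    # prints: notes.txt
```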

With command substitution, you can use this function with any command that accepts filename arguments. For example, if you want to print all files in your current directory that were last modified today, and today is January 15th, you could type:

$ lp $(lsd 'jan 15')

The output of lsd is on multiple lines (one for each filename), but lp still receives each filename as a separate argument. The shell splits the result of command substitution at LINEFEEDs, because the environment variable IFS (see earlier in this chapter) contains LINEFEED by default.
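The splitting is easy to observe with a function that just counts its arguments. This is a minimal sketch that assumes IFS still has its default value; count_args and the filenames are made up for the example.

```shell
# Print the number of arguments received.
count_args() {
    echo $#
}

# Three names, one per line, as lsd would produce them.
files=$(printf '%s\n' a.txt b.txt c.txt)

unquoted=$(count_args $files)      # split at each LINEFEED
quoted=$(count_args "$files")      # double quotes suppress splitting

echo "$unquoted $quoted"    # prints: 3 1
```

Since the default IFS also contains the space and TAB characters, a filename that contains a space would be split into more than one argument in the same way.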