Shell Function Basics (Unix Power Tools, 3rd Edition)

29.11. Shell Function Basics

Most shells have aliases (Section 29.2). Almost all Bourne-type shells have functions, which are like aliases, but richer and more flexible. Here are four examples.

29.11.1. Simple Functions: ls with Options

Let's start with two aliases from Section 29.2, changed into shell functions: The la function includes "hidden" files in ls listings. The lf function labels the names as directories, executable files, and so on.

function la ( ) { ls -a "$@"; }
function lf ( ) { ls -F "$@"; }

The spaces and the semicolon (;) are important. You don't need them on some shells, but writing functions this way (or in the multiline format in later examples) is more portable.[92] The function keyword is a ksh and bash extension: it is optional in bash and zsh, and the original Bourne shell and strictly POSIX shells such as dash do not accept it. For the most portable form, drop the keyword and write la ( ) { ls -a "$@"; }. The "$@" (Section 35.20) is replaced by any arguments (other options, or directory and filenames) you pass to the function:

[92]A function is a Bourne shell list construct.

$ la -l somedir            ...runs ls -a -l somedir
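
To see why the quoted "$@" matters, here is a minimal sketch that compares it with an unquoted $*. The names count_args, pass_quoted, and pass_unquoted are made up for this illustration, and the functions are written without the function keyword so that they should also run in plain POSIX shells:

```shell
# Hypothetical helpers, not part of the original article.
count_args() { echo $#; }              # print how many arguments arrived

pass_quoted()   { count_args "$@"; }   # each argument is passed on intact
pass_unquoted() { count_args $*; }     # arguments are re-split at whitespace

pass_quoted   "my dir" notes.txt       # prints 2
pass_unquoted "my dir" notes.txt       # prints 3
```

With "$@", a directory name that contains a space reaches ls as one argument, which is what la and lf rely on.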

29.11.2. Functions with Loops: Internet Lookup

Go to http://examples.oreilly.com/upt3 for more information on: mx.sh

The mx function uses dig to look up the DNS MX (mail exchanger) record for a host, then sed (Section 34.1) to pull out the "ANSWER SECTION", which has the hostname or hostnames:

See also: for (Section 35.21)

function mx( ) {
# Look up mail exchanger for host(s)
for host
do
    echo "==== $host ===="
    dig "$host" mx in |
    sed -n '/^;; ANSWER SECTION:/,/^$/{
            s/^[^;].* //p
    }'
done
}

mx takes one or more hostname arguments; it runs dig and sed on each hostname. For example, the mail exchangers for oreilly.com are smtp2.oreilly.com and smtp.oreilly.com. The mail exchanger for hesketh.com is mail.hesketh.com:

$ mx oreilly.com hesketh.com
==== oreilly.com ====
smtp2.oreilly.com.
smtp.oreilly.com.
==== hesketh.com ====
mail.hesketh.com.

This example shows how to write a function with more than one line. In that style, with the ending curly brace on its own line, you don't need a semicolon after the last command. (The curly braces in the middle of the function are inside quotes, so they're passed to sed as part of its script.)
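
The sed filter in mx can be tried without a network connection by feeding it canned text. The sketch below rests on assumptions: answer_section and sample_dig_output are hypothetical names, and the sample text only imitates the general shape of dig output, using the reserved domain example.com rather than real DNS data.

```shell
# Hypothetical helper: the same sed script that mx uses, as its own function.
answer_section() {
    sed -n '/^;; ANSWER SECTION:/,/^$/{
            s/^[^;].* //p
    }'
}

# Made-up text in the general shape of dig output (not real DNS data).
sample_dig_output() {
    printf '%s\n' \
        ';; QUESTION SECTION:' \
        ';example.com.  IN  MX' \
        '' \
        ';; ANSWER SECTION:' \
        'example.com.  3600  IN  MX  10 mx1.example.com.' \
        'example.com.  3600  IN  MX  20 mx2.example.com.' \
        '' \
        ';; Query time: 1 msec'
}

sample_dig_output | answer_section     # prints the two mx hostnames
```

Only the lines from the "ANSWER SECTION" marker through the next empty line reach the s command. On each of those lines that doesn't start with a semicolon, it deletes everything through the last space, which leaves just the hostname.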

The mx function looks like a little shell program (Section 35.2). Shell functions have the same syntax as a shell script, except for the enclosing function name and curly braces. In fact, a shell function can be defined and used within a shell script (Section 35.30). But, as we've seen, it's also handy for interactive use.

29.11.3. Setting Current Shell Environment: The work Function

Like aliases, functions run in the current shell process -- not in a subprocess as shell scripts do. So they can change your shell's current directory, reset shell and environment variables, and do basically anything you could do at a shell prompt. (Section 24.3 has details.)
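
Here is a small sketch of that difference. The function name set_project and the variable PROJECT are invented for the example; the parenthesized call stands in for a shell script, since both run in a separate process:

```shell
# Hypothetical function: sets a variable and changes directory.
set_project() {
    PROJECT=$1
    cd / || return
}

PROJECT=none
( set_project alpha )                  # parentheses run the call in a subshell
echo "after subshell: $PROJECT"        # prints "after subshell: none"

set_project beta                       # a plain call runs in the current shell
echo "after direct call: $PROJECT, now in $PWD"   # prints "... beta, now in /"
```

The subshell's changes vanish when it exits, while the direct call leaves both the variable and the current directory changed.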

This next function is for a group of people who are all working on a project. A directory named /work has symbolic links (Section 10.4) named for each worker -- /work/ann, /work/joe, etc. -- and each link points to the directory where that person is working. Each worker makes a function named work that, by default, cds to her directory and summarizes it. If the person gives an argument to the function -- like work todo, for instance -- the function edits the file named .todo in that directory. This setup also lets people quickly find out where others in the group are working.

Go to http://examples.oreilly.com/upt3 for more information on: work.sh

Okay, I admit that I made this up as a demonstration for this article, as a way to show a lot of features in a small amount of space. Anyway, here's the function:

See also: if (Section 35.13), '...' (Section 28.14), wc (Section 16.6)

function work ( ) {
    local status=0
    if [ $# -eq 1 -a "$1" = todo ]
    then
        ${VISUAL-vi} /work/$USER/.todo
        status=$?  # return status from editor
    elif [ $# -ne 0 ]
    then
        echo "Usage: work [todo]" 1>&2
        status=1
    else
        cd /work/$USER
        echo "You're in your work directory `pwd`."
        echo "`ls | wc -w` files to edit."
        status=0
    fi
    return $status
}

There are three points I should make about this example. First, the local command defines a shell variable named status that's local to the function -- its value isn't available outside the function, so it can't conflict with a variable of the same name set elsewhere in the shell. I've also set the value to 0, but this isn't required. (In the original Korn shell, use the typeset command to set a local variable.) Second, when you run a function, the first argument you pass it is stored in $1, the second in $2, and so on (Section 35.20). Shell and environment variables set outside of the function, and nonlocal variables set within the function, are passed to and from the function. Finally, the return command returns a status (Section 35.12) to the calling shell. (Without return, the function returns the status from the last command in the function.) For a function you use interactively, like this one, you may not care about the status. But you also can use return in the middle of a function to end execution and return to the calling shell immediately.
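
The last two points can be tried in isolation. This is only a sketch: check_arg and its variable rc are hypothetical, and it assumes a shell that provides local (bash, zsh, and dash do).

```shell
# Hypothetical function: succeeds only when its single argument is "todo".
check_arg() {
    local rc          # visible only inside check_arg
    rc=0
    if [ $# -ne 1 ]
    then
        echo "Usage: check_arg word" 1>&2
        return 2      # leave the function immediately
    fi
    [ "$1" = todo ] || rc=1
    return $rc
}

if check_arg todo
then echo "todo accepted"
else echo "check_arg returned $?"
fi

# Prints 'unset' unless rc was already set in the calling shell.
echo "rc outside the function: '${rc-unset}'"
```

Because the status comes back through return, the function can be tested directly with if, just like any other command.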

29.11.4. Functions Calling Functions: Factorials

Okay, students, this example is "extra credit" ;-)... You can ignore this ramble unless you want some esoterica. (I'm actually not trying to waste your time. There are some useful bits of info in here about the internal workings of the shells.) Functions can call each other recursively. A called function can see its caller's local variables, but when it declares its own local copy of a variable, or runs in a subshell (as a function called inside command substitution does), its changes are not passed back to the calling function. When I say "recursion," I've gotta show the classic demonstration: a factorial function.[93]

[93]Factorial is the product of all integers from some nonnegative number through one. So the factorial of 6, written 6!, is 6 × 5 × 4 × 3 × 2 × 1 or 720. Also, zero factorial (0!) is defined as 1. In recursion, a function typically calls itself to get "the next value," then waits for that value to be returned and returns its answer to the function that called it. If you ask a function to calculate 6!, it will call itself and ask for 5!, then call itself and ask for 4!, and so on. This can be confusing if you haven't seen it before, but there's information about it in almost every computer science textbook on basic programming techniques. It is also worth mentioning that recursion is a pretty poor way to calculate factorials in most languages, namely, those that lack support for tail recursion.

The fac function calculates the factorial of the number passed in $1. It writes the result to standard output, for two reasons. First, doing so lets you type fac n at the command line (why you'd need to calculate a factorial very often, though, I'm not sure!). Second, if the shells' return command works like the Unix exit statuses (and I haven't checked all versions of all shells), the values are only eight bits -- so it's better to return a string, which lets us handle bigger integers. I could put in more error checking, but since this is all theoretical anyway, here's the simple version of fac:

Go to http://examples.oreilly.com/upt3 for more information on: fac.sh

function fac ( ) {
    if [ "$1" -gt 0 ]
    then echo $(($1 * `fac $(($1 - 1))`))
    else echo 1
    fi
}

Then you can play:

$ fac 0
1
$ fac 15
2004310016
$ fac 18
-898433024

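Oops: overflow. (The fac 15 result has already wrapped around, too: 15! is really 1307674368000. The shell that produced this output does 32-bit arithmetic; newer versions of bash typically use 64-bit integers.) Try zsh instead of bash or ksh; zsh built-in arithmetic seems to have more capacity: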

zsh$ fac 18
6402373705728000
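
Since the footnote points out that recursion is a poor way to calculate factorials, here is an iterative sketch for comparison. The name fac_iter is hypothetical, and the function assumes a shell with local and $((...)) arithmetic; it is subject to the same integer limits as fac, but it avoids starting one subshell per level of recursion.

```shell
# Hypothetical iterative version of fac; writes the result to standard output.
fac_iter() {
    local n result
    n=$1
    result=1
    while [ "$n" -gt 1 ]
    do
        result=$((result * n))     # multiply in the current value of n
        n=$((n - 1))               # then count down toward 1
    done
    echo "$result"
}

fac_iter 0     # prints 1
fac_iter 6     # prints 720
```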

You can do some simple tracing by typing set -x (Section 27.15) at a shell prompt. Then the shell will display the commands it executes. (This works best in bash because it puts one + character at the left edge of each line to show each level of recursion.) You also can add some tracing code that uses a local variable, level, to store the depth of recursion. The code echoes debugging messages that show the depth of recursion of each call. Note that because the "returned value" of each function is written to its standard output, these debugging messages have to be on the standard error! (To see what happens otherwise, remove the 1>&2 operator (Section 36.16).) Here's fac with debugging code:

See also: ${..-..} (Section 36.7)

fac ( ) {
    local level=${level-0}
    echo "debug: recursion level is $((level += 1)).  Doing fac of $1" 1>&2
    if [ "$1" -gt 0 ]
    then echo $(($1 * `fac $(($1 - 1))`))
    else echo 1
    fi
    echo "debug: leaving level $level." 1>&2
}

Let's run the code with tracing. Note that changes to the value of level at deeper levels don't affect the value at higher levels -- and that level isn't set at all in the top-level shell:

$ fac 3
debug: recursion level is 1.  Doing fac of 3
debug: recursion level is 2.  Doing fac of 2
debug: recursion level is 3.  Doing fac of 1
debug: recursion level is 4.  Doing fac of 0
debug: leaving level 4.
debug: leaving level 3.
debug: leaving level 2.
6
debug: leaving level 1.
$ echo $level
$
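
In fac, each level's change to level stays private for two reasons: every call declares its own local copy, and every recursive call runs inside backquotes, which means a subshell. The sketch below separates those cases. The names bump, bump_local, and caller are invented for the illustration, and it assumes a shell with local (bash, zsh, or dash, for example).

```shell
bump()       { depth=$((depth + 1)); }   # no local: changes the caller's variable
bump_local() { local depth; depth=99; }  # own local copy: caller is unaffected

caller() {
    local depth
    depth=1
    bump;        echo "after direct call: $depth"          # prints 2
    bump_local;  echo "after callee with local: $depth"    # still 2
    ( bump );    echo "after subshell: $depth"             # still 2
}

caller
```

Only the plain call to bump changes the caller's value. The other two calls leave it alone, which is why the debugging output from fac counts back down cleanly.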

29.11.5. Conclusion

The next two articles cover specifics about functions in particular shells, and Section 29.14 shows how to simulate functions in shells that don't have them.

Here's another overall note. Each shell has its own commands for working with functions, but in general, the typeset -f command lists the functions you've defined, and unset -f funcname deletes the definition of the function named funcname.
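
As a quick sketch of those two commands, the following defines a throwaway function (greet_demo, a made-up name), shows its definition where typeset is available, and then deletes it. The check with command -v is an addition for this example, because not every shell has typeset (dash doesn't).

```shell
greet_demo() { echo "hello"; }

# typeset is not available in every shell, so check before using it.
if command -v typeset >/dev/null 2>&1
then typeset -f greet_demo      # print the definition of greet_demo
fi

unset -f greet_demo             # delete the function
if command -v greet_demo >/dev/null 2>&1
then echo "greet_demo is still defined"
else echo "greet_demo is gone"
fi
```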

--JP and SJC