Arrays (Learning the Korn Shell, 2nd Edition)

6.4.1. Indexed Arrays

The Korn shell provides an indexed array facility that, while useful, is much more limited than analogous features in conventional programming languages. In particular, indexed arrays can be only one-dimensional (i.e., no arrays of arrays), and they are limited to 4096 elements.[86] Indices start at 0, so the maximum index value is 4095. Furthermore, an index may be any arithmetic expression: ksh automatically evaluates the expression to yield the actual index.

[86] 4096 is a minimum value in ksh93. Recent releases allow up to 64K elements.

There are three ways to assign values to elements of an array. The first is the most intuitive: you can use the standard shell variable assignment syntax with the array index in brackets ([]). For example:

nicknames[2]=bob
nicknames[3]=ed

These assignments put the values bob and ed into the elements of the array nicknames with indices 2 and 3, respectively. As with regular shell variables, values assigned to array elements are treated as character strings unless the assignment is preceded by let, or the array was declared to be numeric with one of the typeset options -i, -ui, -E, or -F. (Strictly speaking, the value assigned with let is still a string; it's just that with let, the shell evaluates the arithmetic expression being assigned to produce that string.)
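As a minimal sketch of these points, the following lines use an arithmetic expression as an index and contrast a plain assignment with let and with an array declared typeset -i. The names i, plain, evaluated, and counts are hypothetical, and printf is used rather than print so that the sketch also runs in shells that lack the print built-in.

```shell
i=1
nicknames[i+1]=bob          # the index expression i+1 evaluates to 2
nicknames[i+2]=ed           # the index expression i+2 evaluates to 3
printf '%s %s\n' "${nicknames[2]}" "${nicknames[3]}"    # prints: bob ed

plain[0]=2+3                # ordinary assignment: the string "2+3" is stored
let "evaluated[0]=2+3"      # let evaluates the expression first, so 5 is stored
typeset -i counts           # declare counts as integer-valued
counts[0]=2+3               # the integer attribute makes the shell evaluate 2+3
printf '%s %s %s\n' "${plain[0]}" "${evaluated[0]}" "${counts[0]}"    # prints: 2+3 5 5
```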

The second way to assign values to an array is with a variant of the set statement, which we saw in Chapter 3. The statement:

set -A aname val1 val2 val3 ...

creates the array aname (if it doesn't already exist) and assigns val1 to aname[0], val2 to aname[1], etc. As you would guess, this is more convenient for loading up an array with an initial set of values.

The third (recommended) way is to use the compound assignment form:

aname=(val1 val2 val3)

Starting with ksh93j, you may use the += operator to add values to an array:

aname+=(val4 val5 val6)
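A short sketch of the compound assignment and += forms together, assuming a shell recent enough to support += (ksh93j or later, per the text). It shows that the appended values continue at the next free index.

```shell
aname=(val1 val2 val3)          # creates aname[0] through aname[2]
aname+=(val4 val5 val6)         # appends aname[3] through aname[5]

printf '%s\n' "${aname[0]}"     # prints: val1
printf '%s\n' "${aname[3]}"     # prints: val4
printf '%s\n' "${#aname[*]}"    # prints: 6
```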

To extract a value from an array, use the syntax ${aname[i]}. For example, ${nicknames[2]} has the value "bob". The index i can be an arithmetic expression, as described above. If you use * or @ in place of the index, the value will be all elements, separated by spaces. Omitting the index ($nicknames) is the same as specifying index 0 (${nicknames[0]}).
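The extraction forms can be tried out with a small array. This is only an illustration; the array colors and the variables first, third, and all are hypothetical names.

```shell
colors=(red green blue)

first=$colors               # no index: the same as ${colors[0]}
third=${colors[1+1]}        # the index is an arithmetic expression, here 2
all="${colors[*]}"          # all elements, separated by spaces

printf '%s\n' "$first"      # prints: red
printf '%s\n' "$third"      # prints: blue
printf '%s\n' "$all"        # prints: red green blue
```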

Now we come to the somewhat unusual aspect of Korn shell arrays. Assume that the only values assigned to nicknames are the two we saw above. If you type print "${nicknames[*]}", you will see the output:

bob ed

In other words, nicknames[0] and nicknames[1] don't exist. Furthermore, if you were to type:

nicknames[9]=pete
nicknames[31]=ralph

and then type print "${nicknames[*]}", the output would look like this:

bob ed pete ralph

This is why we said "the elements of nicknames with indices 2 and 3" earlier, instead of "the 2nd and 3rd elements of nicknames". Any array elements with unassigned values just don't exist; if you try to access their values, you get null strings.
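The whole experiment can be run as one short script. The sketch below repeats the four assignments and then checks one of the unassigned elements; the variable all is a hypothetical name introduced for the illustration.

```shell
nicknames[2]=bob
nicknames[3]=ed
nicknames[9]=pete
nicknames[31]=ralph

all="${nicknames[*]}"
printf '%s\n' "$all"                  # prints: bob ed pete ralph

# Index 0 was never assigned, so its value is the null string.
printf '[%s]\n' "${nicknames[0]}"     # prints: []
```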

You can preserve whatever whitespace you put in your array elements by using "${aname[@]}" (with the double quotes) instead of ${aname[*]}, just as you can with "$@" instead of $* or "$*".
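To see the difference, count the words each form produces when one element contains a space. This is a sketch that assumes the default IFS; the array files and the counters unquoted and quoted are hypothetical names.

```shell
files=("my report.txt" "notes.txt")

unquoted=0
for f in ${files[*]}; do        # unquoted: the values are split on whitespace
    unquoted=$((unquoted + 1))
done

quoted=0
for f in "${files[@]}"; do      # quoted [@]: exactly one word per element
    quoted=$((quoted + 1))
done

printf '%s %s\n' "$unquoted" "$quoted"    # prints: 3 2
```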

The shell provides an operator that tells you how many elements an array has defined: ${#aname[*]}. Thus ${#nicknames[*]} has the value 4. Note that you need the [*] because the name of the array alone is interpreted as the 0th element. This means, for example, that ${#nicknames} equals the length of nicknames[0] (see Chapter 4). Since nicknames[0] doesn't exist, the value of ${#nicknames} is 0, the length of the null string.
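A brief sketch of the difference between the two forms, using the same four assignments as before (count, len0, and len2 are hypothetical names):

```shell
nicknames[2]=bob
nicknames[3]=ed
nicknames[9]=pete
nicknames[31]=ralph

count=${#nicknames[*]}      # number of elements that are defined
len0=${#nicknames}          # length of nicknames[0], which does not exist
len2=${#nicknames[2]}       # length of the string "bob"

printf '%s %s %s\n' "$count" "$len0" "$len2"    # prints: 4 0 3
```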

If you think of an array as a mapping from integers to values (i.e., put in a number, get out a value), you can see why arrays are "number-dominated" data structures. Because shell programming tasks are much more often oriented towards character strings and text than towards numbers, the shell's indexed array facility isn't as broadly useful as it might first appear.

Nevertheless, we can find useful things to do with indexed arrays. Here is a cleaner solution to Task 5-4, in which a user can select his or her terminal type (TERM environment variable) at login time. Recall that the "user-friendly" version of this code used select and a case statement:

print 'Select your terminal type:'
PS3='terminal? '
select term in \
    'Givalt GL35a' \
    'Tsoris T-2000' \
    'Shande 531' \
    'Vey VT99'
do
    case $REPLY in
        1 ) TERM=gl35a ;;
        2 ) TERM=t2000 ;;
        3 ) TERM=s531 ;;
        4 ) TERM=vt99 ;;
        * ) print "invalid." ;;
    esac
    if [[ -n $term ]]; then
        print "TERM is $TERM"
        export TERM
        break
    fi
done

We can eliminate the entire case construct by taking advantage of the fact that the select construct stores the user's numeric choice in the variable REPLY. We just need a line of code that stores all of the possibilities for TERM in an array, in an order that corresponds to the items in the select menu. Then we can use $REPLY to index the array. The resulting code is:

set -A termnames gl35a t2000 s531 vt99
print 'Select your terminal type:'
PS3='terminal? '
select term in \
    'Givalt GL35a' \
    'Tsoris T-2000' \
    'Shande 531' \
    'Vey VT99'
do
    if [[ -n $term ]]; then
        TERM=${termnames[REPLY-1]}
        print "TERM is $TERM"
        export TERM
        break
    fi
done

This code sets up the array termnames so that ${termnames[0]} is "gl35a", ${termnames[1]} is "t2000", etc. The line TERM=${termnames[REPLY-1]} essentially replaces the entire case construct by using REPLY to index the array.

Notice that the shell knows to interpret the text in an array index as an arithmetic expression, as if it were enclosed in (( and )), which in turn means that the variable need not be preceded by a dollar sign ($). We have to subtract 1 from the value of REPLY because array indices start at 0, while select menu item numbers start at 1.
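The indexing step can be tried without running the menu by setting REPLY by hand. The sketch below does that; the variable choice and the explicit range check are additions for illustration and are not part of the solution above.

```shell
termnames=(gl35a t2000 s531 vt99)

REPLY=2                              # pretend the user picked menu item 2
# No dollar sign is needed on REPLY inside the brackets or inside (( )).
if (( REPLY >= 1 && REPLY <= ${#termnames[*]} )); then
    choice=${termnames[REPLY-1]}     # item 2 maps to index 1
else
    choice=
fi
printf '%s\n' "$choice"              # prints: t2000
```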

Think about how you might use arrays to maintain the directory stack for pushd and popd. The arithmetic for loop might come in handy too.
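One possible starting point is sketched below; it is only an illustration, not the solution developed elsewhere in the book. The function names my_pushd, my_popd, and my_dirs and the variables dstack and dtop are hypothetical, chosen to avoid clashing with any built-in commands of the same purpose.

```shell
dtop=0                               # number of directories saved on the stack

my_pushd() {
    cd "$1" || return 1              # change directory first; give up on failure
    dstack[dtop]=$OLDPWD             # save the directory we just left
    dtop=$((dtop + 1))
}

my_popd() {
    if (( dtop == 0 )); then
        printf 'my_popd: stack empty\n' >&2
        return 1
    fi
    dtop=$((dtop - 1))               # the stale entry is simply overwritten later
    cd "${dstack[dtop]}" || return 1
}

my_dirs() {
    # Arithmetic for loop: list saved directories, most recent first.
    for (( i = dtop - 1; i >= 0; i-- )); do
        printf '%s\n' "${dstack[i]}"
    done
}

start=$PWD
my_pushd /                           # move to /, remembering $start
after_push=$PWD
depth=$dtop
my_popd                              # return to $start
after_pop=$PWD
```

After these calls, after_push is /, depth is 1, and after_pop is the directory the script started in.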
