11. A Flock of awks

In the previous four chapters, we have looked at POSIX awk, with only occasional reference to actual awk implementations that you would run. In this chapter, we focus on the different versions of awk that are available, what features they do or do not have, and how you can get them.

First, we'll look at the original V7 version of awk. The original awk lacks many of the features we've described, so this section mostly describes what's not there. Next, we'll look at the three versions whose source code is freely available. All of them have extensions to the POSIX standard. Those that are common to all three versions are discussed first. Finally, we look at three commercial versions of awk.

11.1 Original awk

In each of the sections that follow, we'll take a brief look at how the original awk differs from POSIX awk. Over the years, UNIX vendors have enhanced their versions of original awk; you may need to write small test programs to see exactly what features your old awk has or doesn't have.

11.1.1 Escape Sequences

The original V7 awk only had "\t", "\n", "\"", and, of course, "\\". Most UNIX vendors have added some or all of "\b" and "\r" and "\f".

11.1.2 Exponentiation

Exponentiation (using the ^, ^=, **, and **= operators) is not in old awk.

11.1.3 The C Conditional Expression

The three-argument conditional expression found in C, "expr1 ? expr2 : expr3", is not in old awk. You must resort to a plain old if-else statement.

11.1.4 Variables as Boolean Patterns

You cannot use the value of a variable as a Boolean pattern.

    flag { print "..." }

You must instead use a comparison expression.

    flag != 0 { print "..." }

11.1.5 Faking Dynamic Regular Expressions

The original awk made it difficult to use patterns dynamically because they had to be fixed when the script was interpreted. You can get around the problem of not being able to use a variable as a regular expression by importing a shell variable inside an awk program.
The value of the shell variable will be interpreted by awk as a constant. Here's an example:

    $

The first line of the script makes the variable assignment before awk is invoked. To get the shell to expand the variable inside the awk procedure, we enclose it within single, then double, quotation marks.[1] Thus, awk never sees the shell variable itself; it sees only the variable's value, which it evaluates as a constant string.
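The quoting trick described above can be sketched as follows. This is a minimal illustration rather than the book's own listing: the temporary data file, its contents, and the pattern `^C` are all hypothetical stand-ins.

```shell
#!/bin/sh
# Hypothetical sample data standing in for a lookup file.
datafile=$(mktemp)
printf '%s\n' \
    'BASIC Beginners All-purpose Symbolic Instruction Code' \
    'CICS Customer Information Control System' \
    'COBOL Common Business Oriented Language' > "$datafile"

# The pattern lives in a shell variable, not an awk variable.
search='^C'

# Close the single-quoted awk program, let the shell expand "$search"
# inside double quotes, then reopen the single quotes.
# The program text that awk actually receives is:  $1 ~ /^C/
result=$(awk '$1 ~ /'"$search"'/' "$datafile")

printf '%s\n' "$result"
rm -f "$datafile"
```

Because the shell pastes the value directly into the program text, a value containing a slash would end the regular expression early, so this approach suits simple patterns.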
Here's another version that makes use of the Bourne shell variable substitution feature. Using this feature gives us an easy way to specify a default value for the variable if, for instance, the user does not supply a command-line argument.

    search=$1
    awk '$1 ~ /'"${search:-.*}"'/' acronyms

The expression "${search:-.*}" tells the shell to use the value of search if it is defined; if not, use ".*" as the value. Here, ".*" is regular-expression syntax specifying any string of characters; therefore, all entries are printed if no argument is supplied on the command line. Because the whole thing is inside double quotes, the shell does not perform a wildcard expansion on ".*".

11.1.6 Control Flow

In POSIX awk, if a program has just a BEGIN procedure, and nothing else, awk will exit after executing that procedure. The original awk is different; it will execute the BEGIN procedure and then go on to process input, even if there are no pattern-action statements. You can force awk to exit by supplying /dev/null on the command line as a data file argument, or by using exit.

In addition, the BEGIN and END procedures, if present, have to be at the beginning and end of the program, respectively. Furthermore, you can only have one of each.

11.1.7 Field Separating

Field separating works the same in old awk as it does in modern awk, except that you can't use regular expressions as the field separator.

11.1.8 Arrays

There is no way in the original awk to delete an element from an array. The best thing you can do is assign the empty string to the unwanted array element, and then code your program to ignore array elements whose values are empty.

Along the same lines, in is not an operator in original awk; you cannot use if (item in array) to see if an item is present. Unfortunately, this forces you to loop through every item in an array to see if the index you want is present.
    for (item in array) {
        if (item == searchkey) {
            process array[item]
            break
        }
    }

11.1.9 The getline Function

The original V7 awk did not have getline. If your awk is really ancient, then getline may not work for you. Some vendors have the simplest form of getline, which reads the next record from the regular input stream, and sets $0, NF and NR (there is no FNR; see below). All of the other forms of getline are not available.

11.1.10 Functions

The original awk had only a limited number of built-in string functions. (See Table 11.1 and Table 11.3.)
Some built-in functions can be classified as arithmetic functions. Most of them take a numeric argument and return a numeric value. Table 11.2 summarizes these arithmetic functions.
One of the nicest facilities in awk, the ability to define your own functions, is also not available in original awk.

11.1.11 Built-In Variables

In original awk only the variables shown in Table 11.3 are built in.
OFMT does double duty, serving as the conversion format for the print statement, as well as for converting numbers to strings.
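The double duty of OFMT is easiest to see by contrast with a POSIX awk, which uses a separate variable, CONVFMT, for number-to-string conversion (a detail of the POSIX standard that this section does not cover). The following is a minimal sketch, assuming a POSIX-conforming awk is installed as `awk`; the variable names `x` and `s` are illustrative only.

```shell
#!/bin/sh
# Assumes a POSIX-conforming awk is available as "awk".
result=$(awk 'BEGIN {
    OFMT = "%.2f"   # format that print applies to numeric values
    x = 3.14159
    print x         # a number printed directly: formatted with OFMT
    s = x ""        # concatenation forces number-to-string conversion
    print s         # POSIX awk converts with CONVFMT, not OFMT
}')
printf '%s\n' "$result"
```

On a POSIX awk, the first line comes out as 3.14 and the second as 3.14159, because only print consults OFMT. On an original awk, where OFMT handles both jobs, both lines would be expected to show two decimal places.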