Quick Reference: awk (Unix Power Tools, 3rd Edition)

Since there are many flavors of awk, such as nawk and gawk (Section 18.11), this article tries to provide a usable reference for the most common elements of the language. Dialect differences, when they occur, are noted. With the exception of array subscripts, values in [brackets] are optional; don't type the [ or ].

20.10.7. Alphabetical Summary of Commands

The following alphabetical list of statements and functions includes all that are available in awk, nawk, or gawk. Unless otherwise mentioned, the statement or function is found in all versions. New statements and functions introduced with nawk are also found in gawk.

atan2

atan2(y,x)

Returns the arctangent of y/x in radians. (nawk)
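
As a quick check of the argument order, here is a minimal sketch, assuming a POSIX-compatible awk (nawk, gawk, or mawk) run from a Bourne-style shell. The shell variable pi is only an illustrative name.

```shell
# atan2(0, -1) is the angle of the point x = -1, y = 0, which is pi radians.
pi=$(awk 'BEGIN { printf "%.4f", atan2(0, -1) }')
printf '%s\n' "$pi"    # prints 3.1416
```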

break

Exit from a while, for, or do loop.

close

close(filename-expr)
close(command-expr)

Some implementations of awk allow only ten files and one pipe to be open simultaneously; modern versions allow more than one open pipe. To free these slots, nawk provides close, which closes a file or a pipe. close takes as an argument the same expression that opened the pipe or file. (nawk)
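
The following sketch shows close applied to an output pipe. It assumes a POSIX-compatible awk and a sort command on the PATH; the names cmd and out are illustrative only. Closing the pipe makes sort finish and emit its output before awk prints anything further.

```shell
out=$(awk 'BEGIN {
    cmd = "sort"
    print "pear"  | cmd
    print "apple" | cmd
    close(cmd)        # same expression that opened the pipe; sort finishes here
    print "done"
}')
printf '%s\n' "$out"
# prints:
# apple
# pear
# done
```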

continue

Begin next iteration of while, for, or do loop immediately.

cos

cos(x)

Return cosine of x (in radians). (nawk)

delete

delete array[element]

Delete element of array. (nawk)

do

do body while (expr)

Looping statement. Execute statements in body, then evaluate expr. If expr is true, execute body again. More than one command must be put inside braces ({}). (nawk)
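
Because expr is tested after the body, the body always runs at least once. A small sketch, assuming an awk that supports do (nawk, gawk, or mawk); the variable names are made up for the example.

```shell
out=$(awk 'BEGIN {
    n = 10
    do {
        n++           # runs once even though the test below is already false
    } while (n < 5)
    print n
}')
printf '%s\n' "$out"    # prints 11
```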

exit

exit [expr]

Do not execute remaining instructions and do not read new input. END procedure, if any, will be executed. The expr, if any, becomes awk's exit status (Section 34.12).
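
The interaction between exit, the END procedure, and the exit status can be sketched as follows. This assumes a POSIX-compatible awk; the marker word STOP and the shell variables out and status are illustrative, not part of awk.

```shell
status=0
out=$(printf 'one\nSTOP\ntwo\n' | awk '
    $1 == "STOP" { exit 2 }    # stop reading input; END still runs
    { print }
    END { print "END ran" }
') || status=$?
printf '%s\n' "$out"           # prints "one" and then "END ran"
echo "exit status: $status"    # prints "exit status: 2"
```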

exp

exp(arg)

Return e raised to the power arg (the natural exponential of arg).

for

for ([init-expr]; [test-expr]; [incr-expr]) command

C-language-style looping construct. Typically, init-expr assigns the initial value of a counter variable. test-expr is a relational expression that is evaluated each time before executing the command. When test-expr is false, the loop is exited. incr-expr is used to increment the counter variable after each pass. A series of commands must be put within braces ({}). For example:

for (i = 1; i <= 10; i++)
     printf "Element %d is %s.\n", i, array[i]

for

for (item in array) command

For each item in an associative array, do command. More than one command must be put inside braces ({}). Refer to each element of the array as array[item].
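
A common use of this form is tallying values. The sketch below assumes a POSIX-compatible awk; the array name count and the sample input are invented for the example. The order in which for (item in array) visits elements is not defined, so the output is piped through sort to make it predictable.

```shell
out=$(printf 'cat\ndog\ncat\n' | awk '
    { count[$1]++ }                                      # one tally per distinct word
    END { for (word in count) print word, count[word] }
' | sort)
printf '%s\n' "$out"
# prints:
# cat 2
# dog 1
```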

getline

getline [var][<file] or command | getline [var]

Read next line of input. Original awk does not support the syntax to open multiple input streams. The first form reads input from file, and the second form reads the standard output of a Unix command. Both forms read one line at a time; each time the statement is executed, it gets the next line of input. With no var, the line is assigned to $0 and parsed into fields, resetting NF. Plain getline also increments NR and FNR, and command | getline increments NR; getline <file changes neither. If var is specified, the line is assigned to var instead, and $0 and NF are not changed. getline is actually a function: it returns 1 if it reads a record successfully, 0 if end-of-file is encountered, and -1 if it is otherwise unsuccessful (for example, if file cannot be opened). (nawk)
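
Here is a minimal sketch of the command | getline var form, assuming a POSIX-compatible awk and an echo command on the PATH. The names cmd and line are illustrative. Testing the return value guards against end-of-file and errors.

```shell
out=$(awk 'BEGIN {
    cmd = "echo hello"
    if ((cmd | getline line) > 0)    # 1 means a record was read into line
        print "read: " line
    close(cmd)
}')
printf '%s\n' "$out"    # prints "read: hello"
```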

gsub

gsub(r,s[,t])

Globally substitute s for each match of the regular expression r in the string t. Return the number of substitutions. If t is not supplied, defaults to $0. (nawk)
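
For instance, with t omitted the substitution is applied to $0. This is a sketch assuming a POSIX-compatible awk; the sample input is made up.

```shell
out=$(echo 'a-b-c' | awk '{ n = gsub(/-/, "+"); print n, $0 }')
printf '%s\n' "$out"    # prints "2 a+b+c"
```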

if

if (condition) command [else command]

If condition is true, do command(s), otherwise do command(s) in else clause (if any). condition can be an expression that uses any of the relational operators <, <=, ==, != , >=, or >, as well as the pattern-matching operators ~ or !~ (e.g., if ($1 ~ /[Aa].*[Zz]/)). A series of commands must be put within braces ({}).

index

index(str,substr)

Return position of first substring substr in string str or 0 if not found.

int

int(arg)

Return the integer part of arg; any fractional part is truncated.

length

length(arg)

Return the length of arg.

log

log(arg)

Return the natural logarithm of arg.

match

match(s,r)

Function that matches the pattern, specified by the regular expression r, in the string s and returns either the position in s where the match begins or 0 if no occurrences are found. Sets the values of RSTART and RLENGTH. (nawk)
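
RSTART and RLENGTH are convenient to pass straight to substr to extract the matched text. A small sketch, assuming a POSIX-compatible awk; the sample string is invented.

```shell
out=$(awk 'BEGIN {
    s = "order 1234 shipped"
    if (match(s, /[0-9]+/))    # first run of digits
        print RSTART, RLENGTH, substr(s, RSTART, RLENGTH)
}')
printf '%s\n' "$out"    # prints "7 4 1234"
```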

next

Read the next input line and start a new cycle through the pattern/procedure statements.

print

print [args] [destination]

Print args on output, followed by a newline. args is usually one or more fields, but it may also be one or more of the predefined variables -- or arbitrary expressions. If no args are given, prints $0 (the current input record). Literal strings must be quoted. Fields are printed in the order they are listed. If separated by commas (,) in the argument list, they are separated in the output by the OFS character. If separated by spaces, they are concatenated in the output. destination is a Unix redirection or pipe expression (e.g., > file) that redirects the default standard output.
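
The difference between comma-separated and space-separated arguments can be seen in this sketch, which assumes a POSIX-compatible awk. The OFS value and the input line are chosen only for illustration.

```shell
out=$(echo 'alpha beta' | awk 'BEGIN { OFS = "-" } { print $1, $2; print $1 $2 }')
printf '%s\n' "$out"
# prints:
# alpha-beta     (comma: joined with OFS)
# alphabeta      (space: concatenated)
```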

printf

printf format [, expression(s)] [destination]

Formatted print statement. Fields or variables can be formatted according to instructions in the format argument. The number of expressions must match the number of format specifications in format. format follows the conventions of the C-language printf function. Here are a few of the most common formats:

%s

A string.

%d

A decimal number.

%n.mf

A floating-point number, where n is the minimum total field width (counting the decimal point) and m is the number of digits after the decimal point.

%[-]nc

n specifies minimum field length for format type c, while - left-justifies value in field; otherwise value is right-justified.

format can also contain embedded escape sequences: \n (newline) or \t (tab) are the most common. destination is a Unix redirection or pipe expression (e.g., > file) that redirects the default standard output.

For example, using the following script:

{printf "The sum on line %s is %d.\n", NR, $1+$2}

and the following input line:

5   5

produces this output, followed by a newline:

The sum on line 1 is 10.
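
Field widths and justification are easier to see with visible delimiters. The sketch below assumes a POSIX-compatible awk; the square brackets are ordinary characters in the format string, added only to show the padding.

```shell
out=$(awk 'BEGIN { printf "[%5d] [%-5s] [%6.2f]\n", 42, "ab", 3.14159 }')
printf '%s\n' "$out"    # prints "[   42] [ab   ] [  3.14]"
```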

rand

rand( )

Generate a random number greater than or equal to 0 and less than 1. This function returns the same series of numbers each time the script is executed, unless the random number generator is seeded using the srand( ) function. (nawk)

return

return [expr]

Used at end of user-defined functions to exit the function, returning value of expression expr, if any. (nawk)
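
A minimal sketch of a user-defined function that returns a value, assuming an awk with function support (nawk, gawk, or mawk). The function name max is hypothetical, chosen for the example.

```shell
out=$(awk '
    function max(a, b) {
        if (a > b)
            return a
        return b
    }
    BEGIN { print max(3, 7) }
')
printf '%s\n' "$out"    # prints 7
```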

sin

sin(x)

Return sine of x (in radians). (nawk)

split

split(string,array[,sep])

Split string into elements of array array[1], . . . ,array[n]. string is split at each occurrence of separator sep. (In nawk, the separator may be a regular expression.) If sep is not specified, FS is used. The number of array elements created is returned.
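
For example, a date string can be broken into its parts. This is a sketch assuming a POSIX-compatible awk; the array name part and the sample date are illustrative.

```shell
out=$(awk 'BEGIN {
    n = split("2024-01-15", part, "-")
    print n, part[1], part[2], part[3]
}')
printf '%s\n' "$out"    # prints "3 2024 01 15"
```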

sprintf

sprintf (format [, expression(s)])

Return the value of expression(s), using the specified format (see printf). Data is formatted but not printed.
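
Because the result is returned rather than printed, it can be stored and reused. A small sketch, assuming a POSIX-compatible awk; the variable name label is made up.

```shell
out=$(awk 'BEGIN {
    label = sprintf("%03d-%s", 7, "x")    # zero-padded number, then a string
    print label, length(label)
}')
printf '%s\n' "$out"    # prints "007-x 5"
```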

sqrt

sqrt(arg)

Return square root of arg.

srand

srand([expr])

Use expr to set a new seed for random number generator. Default is time of day. Returns the old seed. (nawk)
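
Seeding with a fixed value makes the sequence from rand repeatable, which is useful for testing. The sketch below assumes a POSIX-compatible awk; the seed 42 and the variable names are arbitrary. It checks that reseeding reproduces the same first value and that the value lies in the range from 0 up to, but not including, 1.

```shell
out=$(awk 'BEGIN {
    srand(42); a = rand()
    srand(42); b = rand()
    same    = (a == b)              # 1 if reseeding reproduced the value
    inrange = (a >= 0 && a < 1)     # 1 if 0 <= a < 1
    print same, inrange
}')
printf '%s\n' "$out"    # prints "1 1"
```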

sub

sub(r,s[,t])

Substitute s for first match of the regular expression r in the string t. Return 1 if successful; 0 otherwise. If t is not supplied, defaults to $0. (nawk)

substr

substr(string,m[,n])

Return substring of string, beginning at character position m and consisting of the next n characters. If n is omitted, include all characters to the end of string.
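
Character positions start at 1, as this sketch shows (assuming a POSIX-compatible awk; the sample string is invented).

```shell
out=$(awk 'BEGIN { s = "abcdef"; print substr(s, 2, 3), substr(s, 4) }')
printf '%s\n' "$out"    # prints "bcd def"
```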

system

system(command)

Function that executes the specified Unix command and returns its status (Section 34.12). The status of the command that is executed typically indicates its success (0) or failure (nonzero). The output of the command is not available for processing within the nawk script. Use command | getline to read the output of the command into the script. (nawk)
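
A sketch of testing a command's status from within a script, assuming a POSIX-compatible awk and the standard test utility. The path /no/such/dir/here is a made-up name assumed not to exist.

```shell
out=$(awk 'BEGIN {
    ok     = system("test -d /")                    # the root directory exists
    bad    = system("test -d /no/such/dir/here")    # assumed missing
    failed = (bad != 0)                             # 1 if the second test failed
    print ok, failed
}')
printf '%s\n' "$out"    # prints "0 1"
```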

tolower