awk(1)    HP-UX 11i Version 3: February 2007
NAME
awk - pattern-directed scanning and processing language

DESCRIPTION
awk scans each input file for lines that match any of a set of patterns specified literally in program or in one or more files specified as -f progfile. With each pattern there can be an associated action that is to be performed when a line in a file matches the pattern. Each line is matched against the pattern portion of every pattern-action statement, and the associated action is performed for each matched pattern. The file name - means the standard input. Any file of the form var=value is treated as an assignment, not a filename. An assignment is evaluated at the time it would have been opened if it were a filename, unless the -v option is used.

An input line is made up of fields separated by white space, or by regular expression FS. The fields are denoted $1, $2, ...; $0 refers to the entire line.

Options
awk recognizes the following options and arguments:

Statements
A pattern-action statement has the form:

    pattern { action }

A missing { action } means print the line; a missing pattern always matches. Pattern-action statements are separated by new-lines or semicolons.

An action is a sequence of statements. A statement can be one of the following:

    if(expression) statement [ else statement ]
    while(expression) statement
    for(expression;expression;expression) statement
    for(var in array) statement
    do statement while(expression)
    break
    continue
    {[statement ...]}
    expression                  # commonly var = expression
    print [expression-list] [ > expression]
    printf format [, expression-list] [ > expression]
    return [expression]
    next                        # skip remaining patterns on this input line.
    delete array [expression]   # delete an array element.
    exit [expression]           # exit immediately; status is expression.

Statements are terminated by semicolons, newlines or right braces. An empty expression-list stands for $0. String constants are quoted (""), with the usual C escapes recognized within. Expressions take on string or numeric values as appropriate, and are built using the operators +, -, *, /, %, ^ (exponentiation), and concatenation (indicated by a blank). The operators ++, --, +=, -=, *=, /=, %=, ^=, **=, >, >=, <, <=, ==, !=, "" (double quotes, string conversion operator), and ?: are also available in expressions. Variables can be scalars, array elements (denoted x[i]) or fields. Variables are initialized to the null string. Array subscripts can be any string, not necessarily numeric (this allows for a form of associative memory). Multiple subscripts such as [i,j,k] are permitted. The constituents are concatenated, separated by the value of SUBSEP.

The print statement prints its arguments on the standard output (or on a file if >file or >>file is present or on a pipe if |cmd is present), separated by the current output field separator, and terminated by the output record separator. file and cmd can be literal names or parenthesized expressions.
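
As a small sketch of the pipe form of print described above, the following command line swaps two fields and sends every output line through sort. The sample input and the choice of sort as the command are assumptions made for illustration only; they do not come from this manual page.

```shell
# Illustrative input (made up for this sketch): a name and a count per line.
# print joins $2 and $1 with the default output field separator (a blank)
# and writes each line to a pipe running the command named by "sort".
out=$(printf 'pear 3\napple 1\nfig 2\n' | awk '{ print $2, $1 | "sort" }')
printf '%s\n' "$out"
```

Every print in this program uses the same string "sort", so all lines go to a single sort process; its output appears once awk exits or the pipe is closed with close("sort").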
Identical string values in different statements denote the same open file.

The printf statement formats its expression list according to the format (see printf(3S)).

Built-In Functions
The built-in function close(expr) closes the file or pipe expr opened by a print or printf statement or a call to getline with the same string-valued expr. This function returns zero if successful; otherwise, it returns non-zero.

The customary functions exp, log, sqrt, sin, cos, atan2 are built in. Other built-in functions are:

The built-in function getline sets $0 to the next input record from the current input file; getline < file sets $0 to the next record from file. getline x sets variable x instead. Finally, cmd | getline pipes the output of cmd into getline; each call of getline returns the next line of output from cmd. In all cases, getline returns 1 for a successful input, 0 for end of file, and -1 for an error.

Patterns
Patterns are arbitrary Boolean combinations (with ! || &&) of regular expressions and relational expressions. awk supports Extended Regular Expressions as described in regexp(5). Isolated regular expressions in a pattern apply to the entire line. Regular expressions can also occur in relational expressions, using the operators ~ and !~. /re/ is a constant regular expression; any string (constant or variable) can be used as a regular expression, except in the position of an isolated regular expression in a pattern.

A pattern can consist of two patterns separated by a comma; in this case, the action is performed for all lines from an occurrence of the first pattern through an occurrence of the second.

A relational expression is one of the following:
where a relop is any of the six relational operators in C, and a matchop is either ~ (matches) or !~ (does not match). A conditional is an arithmetic expression, a relational expression, or a Boolean combination of the two.

The special patterns BEGIN and END can be used to capture control before the first input line is read and after the last. BEGIN and END do not combine with other patterns.

Special Characters
The following special escape sequences are recognized by awk in both regular expressions and strings:

Variable Names
Variable names with special meanings are:

Functions can be defined (at the position of a pattern-action statement) as follows:

    function foo(a, b, c) { ...; return x }

Parameters are passed by value if scalar, and by reference if array name. Functions can be called recursively. Parameters are local to the function; all other variables are global.

Note that if pattern-action statements are used in an HP-UX command line as an argument to the awk command, the pattern-action statement must be enclosed in single quotes to protect it from the shell. For example, to print lines longer than 72 characters, the pattern-action statement as used in a script (-f progfile command form) is:

    length > 72

The same pattern-action statement used as an argument to the awk command is quoted in this manner:

    awk 'length > 72'

EXTERNAL INFLUENCES
For information about the UNIX standard environment, see standards(5).

Environment Variables

In addition, all environment variables will be visible via the awk variable ENVIRON.

EXAMPLES
Print lines longer than 72 characters:

    length > 72

Print first two fields in opposite order:

    { print $2, $1 }

Same, with input fields separated by comma and/or blanks and tabs:

    BEGIN { FS = ",[ \t]*|[ \t]+" }
          { print $2, $1 }

Add up first column, print sum and average:

    { s += $1 }
    END { print "sum is", s, " average is", s/NR }

Print all lines between start/stop pairs:

    /start/, /stop/

Simulate echo command (see echo(1)):

    BEGIN { # Simulate echo(1)
        for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
        printf "\n"
        exit }

SEE ALSO
lex(1), sed(1), standards(5).

A. V. Aho, B. W. Kernighan, P. J. Weinberger: The AWK Programming Language, Addison-Wesley, 1988.