Appendix B Quick Reference for awk
The following alphabetical list of statements and functions includes
all that are available in POSIX awk, nawk, or gawk. See
Chapter 11, A Flock of awks, for extensions available in different
implementations.
- atan2()
atan2(y, x)
Return the arctangent of y/x in radians.
- break
Exit from a while, for, or do loop.
- close()
close(filename-expr)
close(command-expr)
In most implementations of awk, you can have only a limited number of
files and/or pipes open simultaneously. Therefore, awk provides a
close() function that allows you to close a file or a pipe. It takes
as an argument the same expression that opened the pipe or file. The
value of this expression must be identical, character by character, to
the string that opened the file or pipe; even whitespace is
significant.
- continue
Begin the next iteration of a while, for, or do loop.
- cos()
cos(x)
Return the cosine of x, where x is in radians.
- delete
delete array[element]
Delete element of an array.
- do
do body while (expr)
Looping statement. Execute the statements in body, then evaluate expr;
if it is true, execute body again.
- exit
exit [expr]
Exit from the script, reading no new input. The END rule, if it
exists, will be executed. An optional expr becomes awk's return value.
- exp()
exp(x)
Return the exponential of x (e ^ x).
- for
for (init-expr; test-expr; incr-expr) statement
C-style looping construct. init-expr assigns the initial value of the
counter variable. test-expr is a relational expression that is
evaluated each time before executing the statement. When test-expr is
false, the loop is exited. incr-expr is used to increment the counter
variable after each pass.
for (item in array) statement
Special loop designed for reading associative arrays. For each element
of the array, the statement is executed; the element can be referenced
by array[item].
- getline
Read next line of input.
getline [var] [<file]
command | getline [var]
The first form reads input from file, or from the current input if no
file is given, and the second form reads the output of command. Both
forms read one line at a time, and each time the statement is executed
it gets the next line of input. The line of input is assigned to $0
and is parsed into fields, setting NF. A plain getline also updates NR
and FNR, and the command form updates NR; reading from a named file
changes neither. If var is specified, the result is assigned to var
and $0 is not changed. getline is actually a function: it returns 1 if
it reads a record successfully, 0 if end-of-file is encountered, and
-1 if for some reason it is otherwise unsuccessful.
- gsub()
gsub(r, s, t)
Globally substitute s for each match of the regular expression r in
the string t. Return the number of substitutions. If t is not
supplied, defaults to $0.
- if
if (expr) statement1 [ else statement2 ]
Conditional statement. Evaluate expr and, if true, execute statement1;
if an else clause is supplied, execute statement2 if expr is false.
- index()
index(str, substr)
Return the position (starting at 1) of substr in str, or 0 if substr
does not occur in str.
- int()
int(x)
Return the integer value of x by truncating any digits following the
decimal point.
- length()
length(str)
Return the length of str, or the length of $0 if no argument is given.
- log()
log(x)
Return the natural logarithm (base e) of x.
- match()
match(s, r)
Match the pattern specified by the regular expression r in the string
s, and return either the position in s where the match begins, or 0 if
no occurrences are found. Sets the values of RSTART and RLENGTH to the
start and length of the match, respectively.
- next
Read next input line and begin executing script at first rule.
- print
print [ output-expr ] [ dest-expr ]
Evaluate the output-expr and direct it to standard output followed by
the value of ORS. Each output-expr is separated by the value of OFS.
dest-expr is an optional expression that directs the output to a file
or pipe. "> file" directs the output to a file, overwriting its
previous contents. ">> file" appends the output to a file, preserving
its previous contents. In both of these cases, the file will be
created if it does not already exist. "| command" directs the output
as the input to a system command.
- printf
printf (format-expr [, expr-list]) [ dest-expr ]
An alternative output statement borrowed from the C language. It has
the ability to produce formatted output. It can also be used to output
data without automatically producing a newline. format-expr is a
string of format specifications and constants; see the next section
for a list of format specifiers. expr-list is a list of arguments
corresponding to format specifiers. See the print statement for a
description of dest-expr.
- rand()
rand()
Generate a random number r such that 0 <= r < 1. This function returns
the same series of numbers each time the script is executed, unless
the random number generator is seeded using the srand() function.
- return
return [expr]
Used in user-defined functions to exit the function, returning the
value of expr.
- sin()
sin(x)
Return the sine of x, where x is in radians.
- split()
split(str, array, sep)
Parse the string str into elements of array using the field separator
sep, and return the number of elements in the array. The value of FS
is used if no field separator is specified. Array splitting works the
same as field splitting.
- sprintf()
sprintf (format-expr [, expr-list])
Function that returns a string formatted according to the printf
format specification. It formats data but does not output it.
format-expr is a string of format specifications and constants; see
the next section for a list of format specifiers. expr-list is a list
of arguments corresponding to format specifiers.
- sqrt()
sqrt(x)
Return the square root of x.
- srand()
srand(expr)
Use expr to set a new seed for the random number generator. If expr is
omitted, the time of day is used. The return value is the old seed.
- sub()
sub(r, s, t)
Substitute s for the first match of the regular expression r in the
string t. Return 1 if successful; 0 otherwise. If t is not supplied,
defaults to $0.
- substr()
substr(str, beg, len)
Return the substring of str that begins at position beg and continues
for at most len characters. If no length is given, use the rest of the
string.
- system()
system(command)
Function that executes the specified command and returns its status.
The status of the executed command typically indicates success or
failure. A value of 0 means that the command executed successfully. A
non-zero value, whether positive or negative, indicates a failure of
some sort. The documentation for the command you're running will give
you the details. The output of the command is not available for
processing within the awk script. Use "command | getline" to read the
output of a command into the script.
- tolower()
tolower(str)
Translate all uppercase characters in str to lowercase and return the
new string.
- toupper()
toupper(str)
Translate all lowercase characters in str to uppercase and return the
new string.
- while
while (expr) statement
Looping construct. While expr is true, execute statement.
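Several of the entries above are often used together. The following is a minimal sketch, not a definitive recipe, of reading a command's output with getline, testing its return value, picking the line apart with split(), match(), substr(), and gsub(), and then closing the pipe. The shell wrapper name list_demo, the awk variable names, and the echo command are assumptions chosen only for illustration.

```shell
# list_demo is a hypothetical wrapper; the awk program is the point of interest.
list_demo() {
    awk 'BEGIN {
        cmd = "echo one two three"        # illustrative command only

        # getline returns 1 for each record read, 0 at end-of-file, -1 on error,
        # so compare against 0 instead of testing for any nonzero value.
        while ((cmd | getline line) > 0) {
            n = split(line, word, " ")    # n is the number of elements created
            print n, word[1], word[n]

            if (match(line, /t[a-z]+/))   # sets RSTART and RLENGTH on success
                print RSTART, RLENGTH, substr(line, RSTART, RLENGTH)

            count = gsub(/o/, "0", line)  # returns the number of substitutions
            print count, line
        }
        close(cmd)                        # same string value that opened the pipe
    }'
}

list_demo
# prints:
# 3 one three
# 5 3 two
# 2 0ne tw0 three
```

Keeping the command in a variable is one way to make sure that close() receives exactly the same string that opened the pipe.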
A format expression can take three optional modifiers following "%"
and preceding the format specifier:
%-width.precision format-specifier
The width of the output field is a numeric value. When you specify a
field width, the contents of the field will be right-justified by
default. You must specify "-" to get left-justification. Thus, "%-20s"
outputs a string left-justified in a field 20 characters wide. If the
string is shorter than 20 characters, the field will be padded with
spaces to fill.
The precision modifier, used for decimal or floating-point values,
controls the number of digits that appear to the right of the decimal
point. For string formats, it controls the number of characters from
the string to print.
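As a small illustration of the width, "-", and precision modifiers described above, the sketch below prints each value between brackets so that the padding is visible. The wrapper name fmt_demo and the sample values are assumptions made for this example.

```shell
# fmt_demo is a hypothetical wrapper around a one-line awk program.
fmt_demo() {
    awk 'BEGIN {
        # left-justified in 10, right-justified in 10,
        # first 3 characters of a string, 2 digits after the decimal point in 8
        printf("[%-10s][%10s][%.3s][%8.2f]\n", "awk", "awk", "awkward", 3.14159)
    }'
}

fmt_demo
# prints: [awk       ][       awk][awk][    3.14]
```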
You can specify both the width and precision dynamically, via values
in the printf or sprintf argument list. You do this by specifying
asterisks, instead of specifying literal values.
    printf("%*.*g\n", 5, 3, myvar);
In this example, the width is 5, the precision is 3, and the value to
print will come from myvar. Older versions of nawk may not support
this.
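The dynamic form can be tried out with a sketch like the following, which assumes an awk that supports "*" in format expressions. The wrapper name star_demo and the value given to myvar are assumptions for illustration; the brackets only make the field boundaries visible.

```shell
# star_demo is a hypothetical wrapper; requires an awk that accepts "*" modifiers.
star_demo() {
    awk 'BEGIN {
        myvar = 3.14159
        printf("[%*.*g]\n", 5, 3, myvar)   # width 5, precision 3
        printf("[%-*s]\n", 8, "awk")       # width 8 taken from the argument list
    }'
}

star_demo
# prints:
# [ 3.14]
# [awk     ]
```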
Note that the default precision for the output of numeric values is
"%.6g". The default can be changed by setting the system variable
OFMT. This affects the precision used by the print statement when
outputting numbers. For instance, if you are using awk to write
reports that contain dollar values, you might prefer to change OFMT to
"%.2f".
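The effect of OFMT on print can be seen in a short sketch such as this one. The wrapper name ofmt_demo and the variable price are assumptions for illustration. Note that a value that is an exact integer is printed as an integer whatever OFMT contains.

```shell
# ofmt_demo is a hypothetical wrapper showing print before and after setting OFMT.
ofmt_demo() {
    awk 'BEGIN {
        price = 19.999
        print price          # formatted with the default "%.6g"
        OFMT = "%.2f"
        print price          # now formatted with "%.2f"
        print 17             # integral values are printed as integers
    }'
}

ofmt_demo
# prints:
# 19.999
# 20.00
# 17
```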
The format specifiers, shown in Table B.6, are used with printf and
sprintf statements.
Table B.6: Format Specifiers Used in printf
Character | Description
c | ASCII character.
d | Decimal integer.
i | Decimal integer. Added in POSIX.
e | Floating-point format ([-]d.precisione[+-]dd).
E | Floating-point format ([-]d.precisionE[+-]dd).
f | Floating-point format ([-]ddd.precision).
g | e or f conversion, whichever is shortest, with trailing zeros removed.
G | E or f conversion, whichever is shortest, with trailing zeros removed.
o | Unsigned octal value.
s | String.
x | Unsigned hexadecimal number. Uses a-f for 10 to 15.
X | Unsigned hexadecimal number. Uses A-F for 10 to 15.
% | Literal %.
Often, whatever format specifiers are available in the system's
sprintf(3) subroutine are available in awk.
The way printf and sprintf() do rounding will often depend upon the
system's C sprintf(3) subroutine. On many machines, sprintf rounding
is "unbiased," which means it doesn't always round a trailing ".5" up,
contrary to naive expectations. In unbiased rounding, ".5" rounds to
even, rather than always up, so 1.5 rounds to 2 but 4.5 rounds to 4.
The result is that if you are using a format that does rounding (e.g.,
"%.0f") you should check what your system does.
The following function does traditional rounding; it might be useful
if your awk's printf does unbiased rounding.
# round --- do normal rounding
# Arnold Robbins, arnold@gnu.ai.mit.edu
# Public Domain
function round(x,    ival, aval, fraction)
{
    ival = int(x)    # integer part, int() truncates
    # see if fractional part
    if (ival == x)   # no fraction
        return x
    if (x < 0) {
        aval = -x    # absolute value
        ival = int(aval)
        fraction = aval - ival
        if (fraction >= .5)
            return int(x) - 1   # -2.5 --> -3
        else
            return int(x)       # -2.3 --> -2
    } else {
        fraction = x - ival
        if (fraction >= .5)
            return ival + 1
        else
            return ival
    }
}
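One way to check what your system does is to print the same values with both round() and "%.0f" and compare the columns. The sketch below repeats the function above so that it is self-contained; the wrapper name round_demo and the sample values are assumptions for illustration. The third column depends on the system's sprintf(3), so no output is shown for it.

```shell
# round_demo is a hypothetical wrapper: column 2 is round(), column 3 is "%.0f".
round_demo() {
    printf '%s\n' 2.5 4.5 -2.5 -2.3 7 | awk '
    function round(x,    ival, aval, fraction)
    {
        ival = int(x)                   # integer part, int() truncates
        if (ival == x)                  # no fraction
            return x
        if (x < 0) {
            aval = -x                   # absolute value
            ival = int(aval)
            fraction = aval - ival
            if (fraction >= .5)
                return int(x) - 1       # -2.5 --> -3
            else
                return int(x)           # -2.3 --> -2
        } else {
            fraction = x - ival
            if (fraction >= .5)
                return ival + 1
            else
                return ival
        }
    }
    # third field shows whatever rounding the local printf performs
    { printf("%s %d %.0f\n", $1, round($1), $1) }'
}

round_demo
# the first two columns are:
# 2.5 3
# 4.5 5
# -2.5 -3
# -2.3 -2
# 7 7
```

If the third column shows 2 and 4 for 2.5 and 4.5, the local printf is doing unbiased rounding.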