11.10. Alphabetical Summary of Functions and Commands
The following alphabetical list of keywords and functions includes all
that are available in awk, nawk,
and gawk. nawk includes all old
awk functions and keywords, plus some additional
ones (marked as {N}).
gawk includes all nawk functions and
keywords, plus some additional ones (marked as {G}).
Items marked with {B}
are available in the Bell Labs awk.
Items that aren't marked with a symbol are available in all
versions.
atan2 | atan2(y, x)
Return the arctangent of y/x in radians.
{N}
| break | break
Exit from a while, for, or
do loop.
| close | close(filename-expr) close(command-expr)
In most implementations of awk, you can have only 10 files open
simultaneously and one pipe. Therefore, nawk provides a close
function that allows you to close a file or a pipe. It takes
as an argument the same expression that opened the pipe
or file. This expression must be identical, character
by character, to the one that opened the file or pipe; even whitespace
is significant.
{N}
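As a minimal sketch of the rule above, the following one-liner sends two lines to a sort pipe and then closes it. The variable name cmd is only illustrative; the point is that the string given to close() is the same one used to open the pipe.

```shell
# Sketch: close() on an output pipe makes awk wait for the command to finish.
out=$(awk 'BEGIN {
    cmd = "sort"              # the same string is used for print and for close
    print "pear"  | cmd
    print "apple" | cmd
    close(cmd)                # sort sees end-of-input and writes its result now
    print "done"
}')
echo "$out"
```

Because the pipe is closed before the final print, the sorted lines (apple, pear) appear ahead of done.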
| continue | continue
Begin next iteration of while, for,
or do loop.
| cos | cos(x)
Return the cosine of x, an angle in radians.
{N}
| delete | delete array[element] delete array
Delete element from array.
The brackets are typed literally.
The second form is a common extension, which deletes all
elements of the array at one shot.
{N}
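A short sketch of the first form, using a hypothetical array named seen, shows that a deleted element no longer passes the in membership test:

```shell
# Sketch: delete one element, then check membership and count what remains.
out=$(awk 'BEGIN {
    seen["a"] = 1
    seen["b"] = 1
    delete seen["a"]                  # remove a single element
    if ("a" in seen) {
        print "a is still present"
    } else {
        print "a was deleted"
    }
    n = 0
    for (k in seen) n++               # count the remaining elements
    print n
}')
echo "$out"
```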
| do | do statement while (expr)
Looping statement.
Execute statement,
then evaluate expr and, if true,
execute statement again.
A series of statements must be put within braces.
{N}
| exit | exit [expr]
Exit from script, reading no new input. The END procedure,
if it exists, will be executed. An optional expr
becomes awk's return value.
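The following sketch illustrates both points: the END procedure still runs after exit, and the expression becomes the exit status seen by the shell. The status value 3 is arbitrary.

```shell
# Sketch: exit from a main rule, let END run, and capture the exit status.
status=0
out=$(echo "x" | awk '{ exit 3 } END { print "cleanup" }') || status=$?
echo "$out (status $status)"
```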
| exp | exp(x)
Return the exponential of x
(e^x).
| fflush | fflush([output-expr])
Flush any buffers associated with open output file or pipe
output-expr.
{B}
gawk extends this function.
If no output-expr is supplied,
it flushes standard output.
If output-expr is the null
string (""), it flushes
all open files and pipes. {G}
| for | for (init-expr; test-expr; incr-expr) statement
C-style looping construct. init-expr assigns the
initial value of a counter variable. test-expr is
a relational expression that is evaluated each time before executing
the statement. When test-expr is
false, the loop is exited. incr-expr increments
the counter variable after each pass. All the expressions are
optional. A missing test-expr is considered to be true.
A series of statements must be put within braces.
| for | for (item in array) statement
Special loop designed for reading associative arrays. For each
element of the array, the statement is
executed; the element can be referenced by
array[item].
A series of statements must be put within braces.
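For instance, a typical use of this loop is tallying values and printing the totals at the end. In the sketch below the array name count is illustrative. The order in which for (item in array) visits elements is not specified, so the output is piped through sort to make it predictable.

```shell
# Sketch: count occurrences of the first field, then walk the array.
out=$(printf 'b\na\nb\n' | awk '
    { count[$1]++ }                                # tally each distinct value
    END { for (w in count) print w, count[w] }     # visit every element
' | sort)
echo "$out"
```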
| function | function name(parameter-list) { statements }
Create name as a user-defined function consisting of awk
statements that apply to the specified list of parameters.
No space is allowed between name and the left paren
when the function is called.
{N}
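As an illustration, here is a small hypothetical function named max. Adding 0 to each field forces a numeric comparison, which is an assumption about the intended behavior rather than something the function syntax requires.

```shell
# Sketch: define a function and call it with no space before the left paren.
out=$(printf '3 7\n10 2\n' | awk '
function max(a, b) {
    return (a > b) ? a : b
}
{ print max($1 + 0, $2 + 0) }
')
echo "$out"
```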
| getline | getline [var] [< file] or command | getline [var]
Read next line of input.
Original awk doesn't support the syntax
to open multiple input streams.
The first form reads input from file; the second form reads the output of command.
Both forms read one record at a time, and each time
the statement is executed, it gets the next record
of input. The record is assigned to $0
and is parsed into fields, setting NF,
NR and FNR.
If var is specified, the result is assigned
to var, and $0 and
NF aren't changed. Thus, if
the result is assigned to a variable, the
current record doesn't change.
getline is actually a function and returns 1 if it
reads a record successfully, 0 if end-of-file is
encountered, and -1 if it's
otherwise unsuccessful.
{N}
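The return values described above matter in loops. A minimal sketch of the command | getline var form follows; the command string and the variable names cmd, line, and n are illustrative. Testing for > 0 stops the loop on both end-of-file and error.

```shell
# Sketch: read every line produced by a command, one record per getline call.
out=$(awk 'BEGIN {
    cmd = "echo one; echo two"
    while ((cmd | getline line) > 0) {   # 1 = record read, 0 = EOF, -1 = error
        n++
        print n ": " line
    }
    close(cmd)
}')
echo "$out"
```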
| gensub | gensub(r, s, h [, t])
General substitution function. Substitute s for matches
of the regular expression r in the string
t. If h is a number, replace
the hth match. If it is
"g" or "G",
substitute globally. If t is not supplied,
$0 is used. Return the new string value. The
original t is not modified.
(Compare gsub and sub.)
{G}
| gsub | gsub(r, s [, t])
Globally substitute s for each match of the
regular expression r in the string t.
If t is not supplied,
defaults to $0.
Return the number of substitutions.
{N}
| if | if (condition) statement [else statement]
If condition is true, do statement(s); otherwise do
statement in the optional else clause.
The condition can be an expression using
any of the relational operators <,
<=,
==,
!=,
>=,
or
>,
as well as
the array membership operator in,
and
the pattern-matching operators ~
and !~
(e.g., if ($1 ~ /[Aa].*/)).
A series of statements must be put within braces.
Another if can directly follow an else
in order to produce a chain of tests or decisions.
| index | index(str, substr)
Return the position (starting at 1) of substr
in str,
or zero if substr is not present in str.
| int | int(x)
Return integer value of x by truncating any
fractional part.
| length | length([arg])
Return length of arg,
or the length of $0 if no argument.
| log | log(x)
Return the natural logarithm (base e)
of x.
| match | match(s, r)
Match the regular expression r against the string s and
return the position in s where the match
begins, or 0 if there is no match. Set
RSTART and RLENGTH to the start and
length of the match,
respectively.
{N}
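A common idiom, sketched here with made-up input text, combines match with substr to extract the matched text:

```shell
# Sketch: locate the first run of digits and pull it out with substr.
out=$(echo "order id=4821 shipped" | awk '{
    if (match($0, /[0-9]+/))
        print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
}')
echo "$out"
```

Here the digits start at character 10 and are 4 characters long, so the line printed is 10 4 4821.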
| next | next
Read next input line and start new cycle through pattern/procedures
statements.
| nextfile | nextfile
Stop processing the current input file and
start new cycle through pattern/procedures
statements,
beginning with the first record of the next file.
{B} {G}
| print | print [ output-expr[, ...]] [ dest-expr ]
Evaluate the output-expr and direct it to
standard output, followed by the value of ORS.
Each comma-separated output-expr is
separated in the output by the value of OFS.
With no output-expr, print $0.
Output Redirections
dest-expr is an
optional expression that directs the output to a file or pipe.
- > file
- Directs the output to a file,
overwriting its previous contents.
- >> file
- Appends the output to a file,
preserving its previous contents.
In both cases, the file is
created if it does not already exist.
- | command
- Directs the output as the input to a Unix command.
Be careful not to mix > and >>
for the same file.
Once a file has been opened with >, subsequent
output statements continue to append to the file until it is closed.
Remember to call close()
when you have finished with a file or pipe.
If you don't, eventually you will hit the system limit
on the number of simultaneously open files.
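To make the roles of OFS and ORS concrete, the sketch below sets both to illustrative values. Comma-separated expressions are joined with OFS, while expressions placed side by side are simply concatenated; each print ends with ORS.

```shell
# Sketch: the comma inserts OFS; juxtaposition concatenates; ORS ends each print.
out=$(echo "alpha beta gamma" | awk '
    BEGIN { OFS = ":"; ORS = ";\n" }
    { print $1, $3 }      # joined with OFS
    { print $1 $3 }       # plain concatenation, no OFS
')
echo "$out"
```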
| printf | printf(format [, expr-list ]) [ dest-expr ]
An alternative output statement borrowed from the C language. It can produce formatted output and also
output data without automatically producing a newline.
format is a string of format
specifications and constants.
expr-list is a list of
arguments corresponding to format specifiers.
See print for a description of
dest-expr.
format follows the conventions of the C-language
printf(3S) library function.
Here are a few of the most common formats:
%s | A string. |
%d | A decimal number. |
%n.mf | A floating-point number; n is the minimum total field width and m is the number of digits after the decimal point. |
%[-]nc | n specifies the minimum field width for format type c, while - left-justifies the value in the field; otherwise, the value is right-justified. |
Like any string,
format can also contain embedded escape sequences:
\n (newline) or \t (tab)
being the most common.
Spaces and literal text can be placed in the format argument
by quoting the entire argument.
If there are multiple expressions to be printed, there should be
multiple formats specified.
Example
Using the script:
{ printf("The sum on line %d is %.0f.\n", NR, $1+$2) }
The following input line:
5 5
produces this output, followed by a newline:
The sum on line 1 is 10.
| rand | rand()
Generate a random number between 0 and 1. This function returns the
same series of numbers each time the script is executed, unless the random
number generator is seeded using srand().
{N}
| return | return [expr]
Used within a user-defined function to exit the function,
returning value of expr.
The return value of a function is undefined if expr
is not provided.
{N}
| sin | sin(x)
Return the sine of x, an angle in radians.
{N}
| split | split(string, array [, sep])
Split string into elements of array
array[1],...,array[n].
The string
is split at each occurrence of separator sep.
If sep is
not specified, FS is used.
The number of array elements created is
returned.
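For example, a date string can be broken apart on hyphens. The array name part and the sample date are illustrative only.

```shell
# Sketch: split returns the element count and fills part[1]..part[n].
out=$(awk 'BEGIN {
    n = split("2003-07-15", part, "-")
    print n, part[1], part[3]
}')
echo "$out"
```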
| sprintf | sprintf(format [, expressions])
Return the formatted value of one or more expressions,
using the specified format
(see printf). Data is formatted but not printed.
{N}
| sqrt | sqrt(arg)
Return square root of arg.
| srand | srand([expr])
Use optional expr to set a new seed for the
random number generator.
Default is the time of day.
Return value is the old seed.
{N}
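A rough sketch of how srand and rand are commonly combined to simulate a die roll is shown below. The seed 42 is arbitrary; with a fixed seed the same awk implementation repeats the same sequence, but the actual numbers differ between implementations.

```shell
# Sketch: seed the generator, then scale rand() to an integer from 1 to 6.
roll=$(awk 'BEGIN {
    srand(42)                      # fixed seed, chosen only for illustration
    print int(rand() * 6) + 1      # rand() is in [0, 1), so this is 1..6
}')
echo "$roll"
```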
| strftime | strftime([format [,timestamp]])
Format timestamp according to format.
Return the formatted string.
The timestamp is a time-of-day value in
seconds since midnight, January 1, 1970, UTC.
The format string is similar to that of
sprintf.
(See the Example for systime.)
If timestamp is omitted, it defaults to the
current time.
If format is omitted, it defaults to a value
that produces output similar to that of date.
{G}
| sub | sub(r, s [, t])
Substitute s for first match of the
regular expression r in the string t.
If t is not supplied,
defaults to $0.
Return 1 if successful; 0 otherwise.
{N}
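The difference between sub and gsub can be seen by applying both to copies of the same string. The variable names s and t in this sketch are illustrative.

```shell
# Sketch: sub changes only the first match; gsub changes every match.
out=$(echo "a-b-c" | awk '{
    s = $0
    t = $0
    n1 = sub(/-/, "+", s)      # replaces the first hyphen only
    n2 = gsub(/-/, "+", t)     # replaces all hyphens
    print n1, s
    print n2, t
}')
echo "$out"
```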
| substr | substr(string, beg [, len])
Return substring of string at beginning
position beg and the characters that
follow to maximum specified length len. If
no length is given, use the rest of the string.
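substr is often paired with index. The sketch below, using a made-up address, splits a string around the first @ character:

```shell
# Sketch: find a separator with index, then take the text on either side.
out=$(awk 'BEGIN {
    s = "user@example.com"
    at = index(s, "@")                          # position of the separator
    print at, substr(s, 1, at - 1), substr(s, at + 1)
}')
echo "$out"
```

The second substr call omits len, so it returns the rest of the string.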
| system | system(command)
Function that executes the specified
command and returns its status. The status
of the executed command typically indicates success or failure. A
value of 0 means that the command executed successfully. A nonzero
value indicates a failure of some sort.
The documentation for the command you're running will give you the
details.
The output of the command is not available for processing
within the awk script.
Use command | getline
to read the output of a command into the script.
{N}
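As a small sketch of checking the returned status, the standard true and false utilities make convenient test commands. The exact nonzero value can vary, so the example tests only whether it differs from 0.

```shell
# Sketch: system returns the exit status of the command it runs.
out=$(awk 'BEGIN {
    ok  = system("true")       # true exits with status 0
    bad = system("false")      # false exits with a nonzero status
    print ok, (bad != 0)
}')
echo "$out"
```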
| systime | systime()
Return a
time-of-day value in
seconds since midnight, January 1, 1970, UTC.
{G}
Example
Log the start and end times of a data-processing program:
BEGIN {
    now = systime()
    mesg = strftime("Started at %m/%d/%Y %H:%M:%S", now)
    print mesg
}
# process data ...
END {
    now = systime()
    mesg = strftime("Ended at %m/%d/%Y %H:%M:%S", now)
    print mesg
}
| tolower | tolower(str)
Translate all uppercase characters
in str to lowercase and return the new string.
{N}
| toupper | toupper(str)
Translate all lowercase characters
in str to uppercase and return the new string.
{N}
| while | while (condition) statement
Do statement while condition is true
(see if for a
description of allowable conditions).
A series of statements must be put within braces.
|
11.10.1. printf Formats
Format specifiers for printf and sprintf
have the following form:
%[flag][width][.precision]letter
The control letter is required.
The format conversion control letters are as follows.
Character | Description |
c | ASCII character |
d | Decimal integer |
i | Decimal integer (added in POSIX) |
e | Floating-point format ([-]d.precisione[+-]dd) |
E | Floating-point format ([-]d.precisionE[+-]dd) |
f | Floating-point format ([-]ddd.precision) |
g | e or f conversion, whichever is shorter, with trailing zeros removed |
G | E or f conversion, whichever is shorter, with trailing zeros removed |
o | Unsigned octal value |
s | String |
x | Unsigned hexadecimal number; uses a-f for 10 to 15 |
X | Unsigned hexadecimal number; uses A-F for 10 to 15 |
% | Literal % |
The optional flag is one of the following.
Character | Description |
- | Left-justify the formatted value within the field. |
space | Prefix positive values with a space and negative values with a minus. |
+ | Always prefix numeric values with a sign, even if the value is positive. |
# | Use an alternate form: %o has a preceding 0; %x and %X are prefixed with 0x and 0X, respectively; %e, %E, and %f always have a decimal point in the result; and %g and %G do not have trailing zeros removed. |
0 | Pad output with zeros, not spaces. This happens only when the field width is wider than the converted result. |
The optional width is the minimum number of characters to
output.
The result will be padded to this size if it is smaller.
The 0 flag causes padding with zeros; otherwise,
padding is with spaces.
The precision is optional.
Its meaning varies by control letter,
as shown in this table.
Conversion | Precision Means |
%d, %i, %o, %u, %x, %X | The minimum number of digits to print |
%e, %E, %f | The number of digits to the right of the decimal point |
%g, %G | The maximum number of significant digits |
%s | The maximum number of characters to print |
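The flags, width, and precision described above can be tried out with a few printf calls. This is only a sketch with arbitrary sample values; the square brackets are there to make the padding visible.

```shell
# Sketch: common combinations of flag, width, and precision.
out=$(awk 'BEGIN {
    printf("[%5d]\n", 42)             # width 5, right-justified
    printf("[%-5d]\n", 42)            # - flag: left-justified
    printf("[%05d]\n", 42)            # 0 flag: pad with zeros
    printf("[%+d]\n", 42)             # + flag: always show the sign
    printf("[%8.3f]\n", 3.14159)      # width 8, 3 digits after the point
    printf("[%.3s]\n", "abcdef")      # precision limits string length
    printf("[%#o] [%#x]\n", 8, 255)   # # flag: alternate forms
}')
echo "$out"
```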
Copyright © 2003 O'Reilly & Associates. All rights reserved.