Chapter 13. The gawk Scripting Language
gawk is the GNU version of awk, a
powerful pattern-matching program for processing
text files that may be composed of fixed- or variable-length records
separated by some delimiter (by default, a newline character).
gawk may be used from the command
line or in gawk scripts. You should
normally be able to invoke this utility using either awk or gawk
on the shell command line. With gawk, you can:
-
Conveniently process a text file as though it were made up of records
and fields in a textual database.
-
Use variables to change those records and fields.
-
Execute shell commands from a script.
-
Perform arithmetic and string operations.
-
Use programming constructs such as loops and conditionals.
-
Define your own functions.
-
Process the result of shell commands.
-
Produce formatted reports.
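As a minimal sketch of several of these capabilities used together, the following treats three lines of made-up data as records with whitespace-separated fields, selects records with a conditional pattern, does arithmetic on a field, and prints a formatted report. The names, departments, and hours are assumptions for illustration only. The utility is called as awk here; gawk can be substituted.

```shell
# Hypothetical sample data: name, department, hours.
data='ann sales 12
bob ops 7
cy sales 5'

# For records whose second field is "sales", add up the third field.
# The END block runs after the last record and prints the report.
report=$(printf '%s\n' "$data" |
  awk '$2 == "sales" { total += $3; n++ }
       END { printf "%d sales records, %d hours\n", n, total }')
echo "$report"   # prints: 2 sales records, 17 hours
```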
For more information on gawk, see
sed & awk (O'Reilly) or
Effective awk Programming
(O'Reilly).
13.1. Command-Line Syntax
gawk's syntax has two
forms:
gawk [options] 'script' var=value file(s)
gawk [options] -f scriptfile var=value file(s)
You can specify a script directly on the command
line, or you can store a script in a scriptfile
and specify it with -f. Multiple
-f options are allowed; gawk concatenates the files in the order given. This feature is
useful for including libraries.
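The library use of multiple -f options might look like the following sketch. The file names lib.awk and main.awk and the function twice are hypothetical; the utility is called as awk, and gawk accepts the same command line.

```shell
tmp=$(mktemp -d)

# lib.awk: a hypothetical library file that defines one helper function.
cat > "$tmp/lib.awk" <<'EOF'
function twice(x) { return 2 * x }
EOF

# main.awk: the main program, which calls the helper from lib.awk.
cat > "$tmp/main.awk" <<'EOF'
{ print $1, twice($1) }
EOF

# Both files are read as one program, so main.awk can call twice().
out=$(printf '3\n10\n' | awk -f "$tmp/lib.awk" -f "$tmp/main.awk")
echo "$out"
rm -rf "$tmp"
```

With the two input lines shown, this should print "3 6" and "10 20".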
gawk operates on one or more input
files. If none are specified (or if - is specified), gawk reads from standard input.
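A short sketch of both cases, assuming a temporary file created just for the demonstration. The utility is called as awk; gawk behaves the same way here.

```shell
tmp=$(mktemp)
printf 'from file\n' > "$tmp"

# No file operand: the program reads standard input.
a=$(printf 'from stdin\n' | awk '{ print NR ": " $0 }')
echo "$a"

# "-" among the operands: standard input is read at that position,
# here after the named file.
b=$(printf 'from stdin\n' | awk '{ print NR ": " $0 }' "$tmp" -)
echo "$b"
rm -f "$tmp"
```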
Variables can be assigned a value on the
command line. The value assigned to a variable
can be a literal, a shell variable ($name), or a
command
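substitution (`cmd`), but the value is not available in the BEGIN statement. The assignment
takes effect only when gawk reaches that argument while working through the file list, just before it reads the input that follows.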
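The timing can be checked with a small sketch. The variable names max and n are hypothetical, and the utility is called as awk (gawk accepts the same arguments). The assignment is placed before "-", which stands for standard input.

```shell
max=2   # a shell variable whose value is passed to the script

# n is still empty in BEGIN; it is set by the time records are read.
out=$(printf 'a\nb\n' |
  awk 'BEGIN { print "BEGIN: n=[" n "]" }
       { print $0 ": n=[" n "]" }' n="$max" -)
echo "$out"
```

The first line of output should show an empty value, and the lines for records a and b should show n=[2].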
For example, to print the first three (colon-separated) fields of the
password file, use -F to set the
field separator to a colon:
gawk -F : '{print $1; print $2; print $3}' /etc/passwd
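To see what this program prints without depending on the local password file, the same script can be run on two made-up records in the same colon-separated layout. The account details are assumptions for illustration, and the utility is called as awk.

```shell
# Two hypothetical records in /etc/passwd format.
sample='root:x:0:0:root:/root:/bin/bash
ann:x:1000:1000:Ann:/home/ann:/bin/sh'

# Same program as above: each of the three fields goes on its own line.
out=$(printf '%s\n' "$sample" |
  awk -F : '{print $1; print $2; print $3}')
echo "$out"
```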
Numerous examples are shown later in Section 13.2.
13.1.1. Options
All
options exist in both traditional POSIX (one-letter) format and
GNU-style (long) format. Some recognized options
are:
- --
-
Treat all subsequent text as commands or filenames, not options.
- -f scriptfile, --file=scriptfile
-
Read gawk commands from
scriptfile instead of from the command line.
- -v var=value, --assign=var=value
-
Assign a value to variable
var. This allows assignment before the script
begins execution.
- -F c, --field-separator=c
-
Set the field separator to character c. This is
the same as setting the variable FS.
c may be a regular expression. Each input line,
or record, is divided into fields by whitespace (blanks or tabs) or
by some other user-definable field separator. Fields are referred to
by the variables $1, $2, ..., $n. $0 refers to the entire record.
- -W option
-
All -W options are specific to
gawk, as opposed to awk. An alternate syntax is --option (e.g.,
--compat).
option may be one of:
- compat, traditional
-
Behave exactly like traditional (non-GNU) awk.
- copyleft, copyright
-
Print copyleft notice and exit.
- dump-variables[=file]
-
Print the name, type, and value of all global variables to the
specified file, or to the file
awkvars.out in the current directory if no file
is specified.
- help, usage
-
Print syntax and list of options, then exit.
- lint[=fatal]
-
Warn about commands that might not port to other versions of
awk or that gawk considers problematic. When fatal is specified, warnings are treated as
fatal errors.
- lint-old
-
Like lint, but warns about constructs that are not portable to the
original version of awk used on Version 7
Unix.
- non-decimal-data
-
When reading data, interpret numbers beginning with 0 as octal,
and those beginning with 0x as hexadecimal. (To print nondecimal
numbers, use the printf command, as
print prints only string
representations of nondecimal numbers.)
- posix
-
Expect exact compatibility with POSIX; disable all gawk extensions as if traditional had been specified. Do not recognize
\x escape sequences, the ** and **= operators, the
keyword func, or single-tab field
separators. Disallow newlines after ? or
:, and disable the fflush
function.
- profile[=file]
-
Write a pretty-printed version of the script being executed to the
specified file, or to the file
awkprof.out in the current directory if no other
file is specified. When gawk is
invoked as pgawk and passed this
version of the program with the -f
option, it adds profile data to the file, inserting execution
counts to the left of each statement in the program.
- re-interval
-
Allow use of {n,m}
intervals in regular expressions.
- source=script
-
Treat script as gawk commands. Like the
'script'
argument, but lets you mix commands from files (using -f options) with commands on the gawk command line.
- version
-
Print version information and exit.
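Two of the one-letter options above can be sketched as follows, using made-up input. The utility is called as awk, since -v and -F are not gawk-specific. NF, the number of fields in the current record, is a built-in variable not described in this section.

```shell
# -v assigns before the script starts, so BEGIN already sees the value.
early=$(awk -v n=5 'BEGIN { print "n=" n }')
echo "$early"   # prints: n=5

# -F with a regular expression: split on either a colon or a comma.
# $0 is the whole record; NF is the number of fields found.
fields=$(printf 'alpha:beta,gamma\n' |
  awk -F '[:,]' '{ print NF; print $1; print $3; print $0 }')
echo "$fields"
```

The second command should report three fields, then print alpha, gamma, and the unmodified record.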
Copyright © 2003 O'Reilly & Associates. All rights reserved.