

11.3 Patterns and Procedures

awk scripts consist of patterns and procedures:

pattern { procedure }

Either part may be omitted, but not both. If pattern is missing, { procedure } is applied to all lines; if { procedure } is missing, each matched line is printed.
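The two defaults can be sketched as follows. This is a minimal illustration, assuming a POSIX-style awk on the PATH; the sample input and the shell variable names are made up for the example.

```shell
# Hypothetical three-line sample input.
input='alpha 1
beta 2
gamma 3'

# Pattern only: the missing procedure defaults to printing the matched line.
pattern_only=$(printf '%s\n' "$input" | awk '/^b/')

# Procedure only: the missing pattern means the procedure runs on every line.
procedure_only=$(printf '%s\n' "$input" | awk '{ print $2 }')

printf '%s\n' "$pattern_only" "$procedure_only"
```

The first command prints only the line that begins with b; the second prints the second field of all three lines.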

11.3.1 Patterns

A pattern can be any of the following:

/regular expression/
relational expression
pattern-matching expression
BEGIN
END

  • Expressions can be composed of quoted strings, numbers, operators, functions, defined variables, or any of the predefined variables described later in the section "Built-in Variables."

  • Regular expressions use the extended set of metacharacters and are described in Chapter 6, Pattern Matching.

  • ^ and $ refer to the beginning and end of a string (such as the fields), respectively, rather than the beginning and end of a line. In particular, these metacharacters will not match at a newline embedded in the middle of a string.

  • Relational expressions use the relational operators listed in the section "Operators" later in this chapter. For example, $2 > $1 selects lines for which the second field is greater than the first. Comparisons can be either string or numeric: awk compares numerically when both $1 and $2 look like numbers and compares them as strings otherwise, so the kind of comparison can change from one record to the next.

  • Pattern-matching expressions use the operators ~ (match) and !~ (don't match). See the section "Operators" later in this chapter.

  • The BEGIN pattern lets you specify procedures that take place before the first input line is processed. (Generally, you set global variables here.)

  • The END pattern lets you specify procedures that take place after the last input record is read.

  • In nawk, BEGIN and END patterns may appear multiple times. The procedures are merged as if there had been one large procedure.
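Several of the points above can be seen in one short run. The sketch below assumes a POSIX-style awk; the input lines and shell variable names are invented for illustration.

```shell
# Hypothetical input: some fields look like numbers, some do not.
input='2 10
10 9
10 9x
abc abd'

# Relational pattern: numeric when both fields look numeric, string otherwise.
# "2 10" passes numerically; "10 9x" and "abc abd" pass as string comparisons.
selected=$(printf '%s\n' "$input" | awk '$2 > $1')

# ^ and $ anchor to the string being tested (here field 2), not to the line.
anchored=$(printf '%s\n' "$input" | awk '$2 ~ /^9$/ { print $1 }')

# BEGIN runs before any input is read, END after the last record.
count=$(printf '%s\n' "$input" | awk 'BEGIN { n = 0 } { n++ } END { print n }')

printf '%s\n' "$selected" "$anchored" "$count"
```

Note that "10 9" is rejected because 9 is numerically smaller than 10, while "10 9x" is accepted because the string "9x" sorts after the string "10".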

Except for BEGIN and END, patterns can be combined with the Boolean operators || (or), && (and), and ! (not). A range of lines can also be specified using comma-separated patterns:

pattern,pattern
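As a rough illustration of combined and range patterns, assuming a POSIX-style awk and a made-up six-line input:

```shell
# Hypothetical input with START and STOP marker lines.
input='header
START
one
two
STOP
footer'

# Range: from the first line matching /START/ through the next matching /STOP/.
range=$(printf '%s\n' "$input" | awk '/START/,/STOP/')

# && and !: lines that contain "o" but do not contain "t".
both=$(printf '%s\n' "$input" | awk '/o/ && !/t/')

# ||: lines that begin with "h" or with "f".
either=$(printf '%s\n' "$input" | awk '/^h/ || /^f/')

printf '%s\n' "$range" "$both" "$either"
```

The range is inclusive, so both marker lines are printed along with the lines between them.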

11.3.2 Procedures

Procedures consist of one or more commands, functions, or variable assignments, separated by newlines or semicolons, and contained within curly braces. Commands fall into five groups:

  • Variable or array assignments

  • Printing commands

  • Built-in functions

  • Control-flow commands

  • User-defined functions (nawk only)
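A single small script can touch all five groups. The following is only a sketch, assuming an awk that supports user-defined functions (nawk or any POSIX awk); the function tag, the variable names, and the input are hypothetical.

```shell
report=$(printf '%s\n' 'apple 3' 'pear 5' 'apple 4' | awk '
# User-defined function (hypothetical helper).
function tag(name, n) { return name ": " n }
{
    # Array assignment: accumulate field 2 per distinct field 1.
    total[$1] += $2
    # Control flow using the built-in function length.
    if (length($1) > 4)
        longnames++
}
END {
    # Printing commands.
    print tag("apple", total["apple"])
    print "long names:", longnames
}')

printf '%s\n' "$report"
```

Statements inside the braces are separated by newlines here; semicolons would work equally well if the procedure were written on one line.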

11.3.3 Simple Pattern-Procedure Examples

  • Print first field of each line:

    { print $1 }

  • Print all lines that contain pattern:

    /pattern/

  • Print first field of lines that contain pattern:

    /pattern/ { print $1 }

  • Select records containing more than two fields:

    NF > 2

  • Treat each group of lines up to a blank line as one input record, with each line as a single field:

    BEGIN { FS = "\n"; RS = "" }

  • Print fields 2 and 3 in switched order, but only on lines whose first field matches the string "URGENT":

    $1 ~ /URGENT/ { print $3, $2 }

  • Count the lines that contain pattern and print the total:

    /pattern/ { ++x }
    END { print x }

  • Add numbers in second column and print total:

    { total += $2 }
    END { print "column total is", total }

  • Print lines that contain fewer than 20 characters:

    length($0) < 20

  • Print each line that begins with Name: and that contains exactly seven fields:

    NF == 7 && /^Name:/

  • Print the fields of each input record in reverse order, one per line:

    {
    	for (i = NF; i >= 1; i--)
    		print $i
    }
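The multiline-record setting from the examples above (FS = "\n"; RS = "") is easier to follow with input. This sketch assumes a POSIX-style awk; the two address-style records are invented for the example.

```shell
# Two hypothetical records separated by a blank line.
records=$(printf '%s\n' 'Name: Ann' 'Phone: 555-0100' '' 'Name: Bob' 'Phone: 555-0101' |
    awk 'BEGIN { FS = "\n"; RS = "" }
         # NR is the record number, NF the number of lines in the record.
         { print NR, NF, $1 }')

printf '%s\n' "$records"
```

Each blank-line-separated block is read as one record with two fields, so $1 is the whole first line of the block.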


UNIX in a Nutshell: System V Edition