

sed & awk

Chapter 10. The Bottom Drawer

This chapter is proof that not everything has its place. Some things just don't seem to fit, no matter how you organize them. This chapter is a collection of such things. It is tempting to label it "Advanced Topics," as if to explain its organization (or lack thereof), but some readers might feel they need to make more progress before reading it. We have therefore called it "The Bottom Drawer," thinking of the organization of a chest of drawers, with underwear, socks, and other day-to-day things in the top drawers and heavier things that are less frequently used, like sweaters, in the bottom drawers. All of it is equally accessible, but you have to bend over to get things in the bottom drawer. It requires a little more effort to get something, that's all.

In this chapter we cover a number of topics, including the following:

  • The getline function

  • The system() function

  • Directing output to files and pipes

  • Debugging awk scripts

10.1. The getline Function

The getline function is used to read another line of input. Not only can getline read from the regular input data stream, it can also handle input from files and pipes.

The getline function is similar to awk's next statement. While both cause the next input line to be read, the next statement passes control back to the top of the script. The getline function gets the next line without changing control in the script. Possible return values are:

1

If it was able to read a line.

0

If it encounters the end-of-file.

-1

If it encounters an error.

NOTE: Although getline is called a function and it does return a value, its syntax resembles a statement. Do not write getline(); its syntax does not permit parentheses.
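The three return values can be observed directly. The following is a minimal sketch, not part of the original examples: it feeds three lines to awk on standard input and counts them with getline inside a BEGIN procedure, then tries to read from a file that does not exist. The path /no/such/file is hypothetical and is assumed to be absent, and the variable names (line, status, count) are ours.

```shell
# Count three input lines with getline, then provoke an error.
# "/no/such/file" is a hypothetical path assumed not to exist.
result=$(printf 'one\ntwo\nthree\n' | awk 'BEGIN {
	# getline returns 1 for each line it reads, so the loop runs three times
	while ((status = getline line) > 0)
		count++
	# the loop ended at end-of-file, so status is now 0
	print count, status
	# reading from a file that cannot be opened returns -1
	status = (getline line < "/no/such/file")
	print status
}')
echo "$result"
```

The first line printed is the number of lines read followed by the final return value (0 at end-of-file); the second line is the error value.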

In the previous chapter, we used a manual page source file as an example. The -man macros typically place the text argument on the next line. Although the macro is the pattern that you use to find the line, it is actually the next line that you process. For instance, to extract the name of the command from the manpage, the following example matches the heading "Name," reads the next line, and prints the first field of it:

# getline.awk -- test getline function
/^\.SH "?Name"?/ { 
	getline # get next line
	print $1 # print $1 of new line.
}

The pattern matches any line with ".SH" followed by "Name," which might be enclosed in quotes. Once this line is matched, we use getline to read the next input line. When the new line is read, getline assigns it $0 and parses it into fields. The system variables NF, NR, and FNR are also set. Thus, the new line becomes the current line, and we are able to refer to "$1" and retrieve the first field. Note that the previous line is no longer available as $0. However, if necessary, you can assign the line read by getline to a variable and avoid changing $0, as we'll see shortly.

Here's an example that shows how the previous script works, printing out the first field of the line following ".SH Name."

$ awk -f getline.awk test
XSubImage
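If you need to keep the heading line, the variable form of getline leaves $0 alone. The following sketch assumes a two-line sample of manpage source (made up for this test) and uses variable names of our own choosing, nextline and word. Because the line is not split into fields, the sketch calls split() to get at its first word.

```shell
# The two input lines are made-up sample data.
result=$(printf '%s\n' '.SH "Name"' 'XSubImage \- create a subimage' |
awk '/^\.SH "?Name"?/ {
	getline nextline        # $0 and NF are left alone; NR and FNR advance
	split(nextline, word)   # split by hand, since $1 still refers to the heading
	print word[1]           # first word of the line that was read
	print $0                # the heading line is still the current record
}')
echo "$result"
```

The script prints the command name and then the heading line, showing that the heading is still available as $0 after the read.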

The sorter.awk program that we demonstrated at the end of Chapter 9, "Functions", could have used getline to read all the lines after the heading "Related Commands." We can test the return value of getline in a while loop to read a number of lines from the input. The following procedure replaces the first two procedures in the sorter program:

# Match "Related Commands" and collect them
/^\.SH "?Related Commands"?/ {
	print
	while (getline > 0)
		commandList = commandList $0
}

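The expression "getline > 0" will be true as long as getline successfully reads an input line. When it gets to the end-of-file, getline returns 0 and the loop is exited. An error, for which getline returns -1, also ends the loop, which is why the test is written as "> 0".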
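The loop above reads through to the end of the file. If the manpage had further sections after "Related Commands," you might want to stop at the next heading instead. The following is one possible sketch, under the assumption that every section begins with a .SH line; the sample input and the variable name line are ours, not part of the sorter program.

```shell
# Sample input is made up; the section after "Related Commands" must be skipped.
result=$(printf '%s\n' '.SH "Related Commands"' 'XGetImage,' 'XPutImage,' \
	'.SH "See Also"' 'not a command' |
awk '/^\.SH "?Related Commands"?/ {
	# read until end-of-file, an error, or the next .SH heading
	while ((getline line) > 0 && line !~ /^\.SH/)
		commandList = commandList (commandList == "" ? "" : " ") line
}
END { print commandList }')
echo "$result"
```

Reading into a variable keeps the test on the line just read separate from $0. Note that the heading that ends the loop is consumed by getline, so no pattern in the script gets a chance to match it.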

10.1.3. Reading Input from a Pipe

You can execute a command and pipe the output into getline. For example, look at the following expression:

"who am i" | getline

That expression sets "$0" to the output of the who am i command, for example:

dale       ttyC3        Jul 18 13:37

The line is parsed into fields and the system variable NF is set. Similarly, you can assign the result to a variable:

"who am i" | getline me

By assigning the output to a variable, you avoid setting $0 and NF, but the line is not split into fields.
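The difference between the two forms is easy to check. In the following sketch an echo command stands in for who am i so that the output is the same on any system; that substitution and the variable name cmd are our assumptions. The standard close() function is called between the two reads so that the command is run a second time.

```shell
result=$(awk 'BEGIN {
	cmd = "echo dale ttyC3 Jul 18 13:37"   # stand-in for "who am i"
	cmd | getline        # plain form: sets $0 and NF
	print NF, $1
	close(cmd)           # close the pipe so the command can be run again
	$0 = "untouched"     # give $0 a known value before the second read
	cmd | getline me     # variable form: sets only me
	print $0, NF         # $0 and NF are as we left them
	print me
}')
echo "$result"
```

After the second read, $0 still holds the value we assigned and NF is still 1, while me holds the whole line of command output.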

The following script is a fairly simple example of piping the output of a command to getline. It uses the output from the who am i command to get the user's name. It then looks up the name in /etc/passwd, printing out the fifth field of that file, the user's full name:

awk '# getname - print full name of user from /etc/passwd
BEGIN { "who am i" | getline
	name = $1
	FS = ":"
}
$1 == name { print $5 }
' /etc/passwd

The command is executed from the BEGIN procedure, and it provides us with the name of the user that will be used to find the user's entry in /etc/passwd. As explained above, who am i outputs a single line, which getline assigns to $0. $1, the first field of that output, is then assigned to name.

The field separator is set to a colon (:) to allow us to access individual fields in entries in the /etc/passwd file. Notice that FS is set after getline or else the parsing of the command's output would be affected.

Finally, the main procedure is designed to test that the first field matches name. If it does, the fifth field of the entry is printed. For instance, when Dale runs this script, it prints "Dale Dougherty."
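To try the logic without depending on who is logged in, you can feed the script passwd-style records on standard input. The sketch below makes two assumptions: echo dale stands in for who am i, and the two records are invented test data. It compares with == rather than the match operator, because a match would treat a short login name such as "da" as a pattern that is found inside "dale".

```shell
# Both passwd-style records are invented test data.
result=$(printf '%s\n' \
	'da:x:100:100:Dan Abbot:/home/da:/bin/sh' \
	'dale:x:101:100:Dale Dougherty:/home/dale:/bin/sh' |
awk 'BEGIN {
	"echo dale" | getline   # stand-in for "who am i"
	name = $1               # taken while FS is still whitespace
	FS = ":"                # from here on, split passwd-style records
}
$1 == name { print $5 }')
echo "$result"
```

Only the entry whose login name is exactly "dale" is selected.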

When the output of a command is piped to getline and it contains multiple lines, getline reads a line at a time. The first time getline is called it reads the first line of output. If you call it again, it reads the second line. To read all the lines of output, you must set up a loop that executes getline until there is no more output. For instance, the following example uses a while loop to read each line of output and assign it to the next element of the array, who_out:

while (("who" | getline) > 0)
	who_out[++i] = $0

Each time the getline function is called, it reads the next line of output. The who command, however, is executed only once.
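Here is a small, repeatable version of that loop. It is only a sketch: a string of three echo commands stands in for who so that the output is predictable, and the variable name cmd is ours. The loop tests for a return value greater than 0 so that an error cannot cause it to run forever.

```shell
result=$(awk 'BEGIN {
	cmd = "echo one; echo two; echo three"   # stand-in for "who"
	# the command is started once; each getline call fetches its next line
	while ((cmd | getline) > 0)
		who_out[++i] = $0
	close(cmd)
	print i, who_out[1], who_out[i]
}')
echo "$result"
```

The script prints the number of lines collected, followed by the first and last elements of the array.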

The next example looks for "@date" in a document and replaces it with today's date:

# subdate.awk -- replace @date with today's date
/@date/ {
	"date +'%a., %h %d, %Y'" | getline today
	gsub(/@date/, today)
}
{ print }

The date command, using its formatting options,[67] provides the date and getline assigns it to the variable today. The gsub() function replaces each instance of "@date" with today's date.

[67]Older versions of date don't support formatting options, particularly the one on SunOS 4.1.x systems; there you have to use /usr/5bin/date. Check your local documentation.

This script might be used to insert the date in a form letter:

To: Peabody
From: Sherman 
Date: @date

I am writing you on @date to 
remind you about our special offer.

All lines of the input file would be passed through as is, except the lines containing "@date", in which each occurrence of "@date" is replaced with today's date:

$ awk -f subdate.awk subdate.test
To: Peabody
From: Sherman
Date: Sun., May 05, 1996

I am writing you on Sun., May 05, 1996 to
remind you about our special offer.
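A variation on this script is to fetch the date once in a BEGIN procedure and check that the read succeeded. The following is a sketch under our own assumptions: the numeric date format, the variable name cmd, and the fallback text are all choices made for this example and are not part of subdate.awk.

```shell
result=$(printf '%s\n' 'Date: @date' 'I am writing you on @date to' |
awk 'BEGIN {
	cmd = "date +%Y-%m-%d"           # date format chosen for this sketch
	if ((cmd | getline today) <= 0)  # 0 or -1 means nothing was read
		today = "(date unavailable)"
	close(cmd)
}
/@date/ { gsub(/@date/, today) }
{ print }')
echo "$result"
```

Because the date is read before any input is processed, every line sees the same value and the pipe can be closed immediately.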



Copyright © 2003 O'Reilly & Associates. All rights reserved.