1.18. Program: psgrep

Many programs, including ps, netstat, lsof, ls -l, find -ls, and tcpdump, can produce more output than can be conveniently summarized. Logfiles also often grow too long to be easily viewed. You could send these through a filter like grep to pick out only certain lines, but regular expressions and complex logic don't mix well; just look at the hoops we jump through in Recipe 6.17.

What we'd really like is to make full queries on the program output or logfile. For example, to ask ps something like, "Show me all the processes that exceed 10K in size but which aren't running as the superuser." Or, "Which commands are running on pseudo-ttys?"

The psgrep program does this, and infinitely more, because the specified selection criteria are not mere regular expressions; they're full Perl code. Each criterion is applied in turn to every line of output. Only lines matching all arguments are output. The following is a list of things to find and how to find them.

Lines containing "sh" at the end of a word:

    % psgrep '/sh\b/'

Processes whose command names end in "sh":

    % psgrep 'command =~ /sh$/'

Processes running with a user ID below 10:

    % psgrep 'uid < 10'

Login shells with active ttys:

    % psgrep 'command =~ /^-/' 'tty ne "?"'

Processes running on pseudo-ttys:

    % psgrep 'tty =~ /^[p-t]/'

Non-superuser processes running detached:

    % psgrep 'uid && tty eq "?"'

Huge processes that aren't owned by the superuser:

    % psgrep 'size > 10 * 2**10' 'uid != 0'

The last call to psgrep produced the following output when run on our system. As one might expect, only netscape and its spawn qualified.
Example 1.6 shows the psgrep program.

Example 1.6: psgrep

    #!/usr/bin/perl -w
    # psgrep - print selected lines of ps output by
    #          compiling user queries into code

    use strict;

    # each field from the PS header
    my @fieldnames = qw(FLAGS UID PID PPID PRI NICE SIZE
                        RSS WCHAN STAT TTY TIME COMMAND);

    # determine the unpack format needed (hard-coded for Linux ps)
    my $fmt = cut2fmt(8, 14, 20, 26, 30, 34, 41, 47, 59, 63, 67, 72);

    my %fields;                         # where the data will store

    die <<Thanatos unless @ARGV;
    usage: $0 criterion ...
        Each criterion is a Perl expression involving:
         @fieldnames
        All criteria must be met for a line to be printed.
    Thanatos

    # Create function aliases for uid, size, UID, SIZE, etc.
    # Empty parens on closure args needed for void prototyping.
    for my $name (@fieldnames) {
        no strict 'refs';
        *$name = *{lc $name} = sub () { $fields{$name} };
    }

    my $code = "sub is_desirable { " . join(" and ", @ARGV) . " } ";
    unless (eval $code.1) {
        die "Error in code: $@\n\t$code\n";
    }

    open(PS, "ps wwaxl |")              || die "cannot fork: $!";
    print scalar <PS>;                  # emit header line
    while (<PS>) {
        @fields{@fieldnames} = trim(unpack($fmt, $_));
        print if is_desirable();        # line matches their criteria
    }
    close(PS)                           || die "ps failed!";

    # convert cut positions to unpack format
    sub cut2fmt {
        my(@positions) = @_;
        my $template  = '';
        my $lastpos   = 1;
        for my $place (@positions) {
            $template .= "A" . ($place - $lastpos) . " ";
            $lastpos   = $place;
        }
        $template .= "A*";
        return $template;
    }

    sub trim {
        my @strings = @_;
        for (@strings) {
            s/^\s+//;
            s/\s+$//;
        }
        return wantarray ? @strings : $strings[0];
    }

    # the following was used to determine column cut points.
    # sample input data follows
    #123456789012345678901234567890123456789012345678901234567890123456789012345
    #         1         2         3         4         5         6         7
    # Positioning:
    #       8     14    20    26  30  34     41    47          59  63  67   72
    #       |     |     |     |   |   |      |     |           |   |   |    |
    __END__
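To see how the cut positions turn into an unpack template, here is a minimal sketch that reuses cut2fmt and trim from Example 1.6 on a shortened list of positions. The sample record is made up for illustration; it is not real ps output.

```perl
use strict;
use warnings;

# cut2fmt and trim follow the versions in Example 1.6
sub cut2fmt {
    my @positions = @_;
    my $template  = '';
    my $lastpos   = 1;
    for my $place (@positions) {
        # each field is as wide as the gap between adjacent cut points
        $template .= "A" . ($place - $lastpos) . " ";
        $lastpos   = $place;
    }
    return $template . "A*";        # last field takes the rest of the line
}

sub trim {
    my @strings = @_;
    for (@strings) {
        s/^\s+//;
        s/\s+$//;
    }
    return wantarray ? @strings : $strings[0];
}

my $fmt = cut2fmt(8, 14, 20);       # yields "A7 A6 A6 A*"

# hypothetical right-aligned record with three numeric columns and a command
my $line  = sprintf("%7s%6s%6s %s", 100, 0, 1, "init [3]");

my @raw   = unpack($fmt, $line);    # "A" strips trailing blanks only
my @clean = trim(@raw);             # so leading blanks are removed here

print "template: $fmt\n";
print "raw:      [", join("|", @raw),   "]\n";
print "clean:    [", join("|", @clean), "]\n";
```

Because ps right-aligns its numeric columns, the unpacked fields keep their leading blanks, which is why the program passes them through trim before storing them.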
The psgrep program integrates many techniques presented throughout this book. Stripping strings of leading and trailing whitespace is found in Recipe 1.14. Converting cut marks into an
The multiline string in the here document passed to
The sample program input contained beneath
The real power and expressiveness in psgrep derive from Perl's use of string arguments not as mere strings but directly as Perl code. This is similar to the technique in Recipe 9.9, except that in psgrep, the user's arguments are wrapped with a routine called is_desirable. For example, the criterion uid < 10 produces this eval:

    eval "sub is_desirable { uid < 10 } " . 1;
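The mysterious ".1" at the end ensures that the whole eval returns true whenever the user's code compiles, so a false return value signals a compilation error.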
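The compile-once idea can be sketched in isolation. In the sketch below, compile_criteria is a hypothetical helper that is not part of psgrep, and the %fields record and the two accessors are stand-ins for the ones the real program builds. It shows that appending 1 gives the eval a true value on success, while a criterion that fails to compile makes the eval return false and leaves the message in $@.

```perl
use strict;
use warnings;

my %fields = (uid => 5, size => 2048);   # hypothetical current record
sub uid  () { $fields{uid}  }
sub size () { $fields{size} }

# compile_criteria is a hypothetical helper, not part of psgrep.
# It wraps the criteria in a named sub and compiles them exactly once.
sub compile_criteria {
    my ($name, @criteria) = @_;
    my $code = "sub $name { " . join(" and ", @criteria) . " } ";
    # The trailing 1 becomes the last statement of the evaluated code,
    # so a successful compile returns true and a failed one returns undef.
    return eval($code . 1) ? 1 : 0;
}

if (compile_criteria("is_small_user", "uid < 10", "size > 1024")) {
    print "compiled; record ",
          (is_small_user() ? "matches" : "does not match"), "\n";
}

# An incomplete expression fails to compile; $@ holds the reason.
unless (compile_criteria("is_broken", "uid <")) {
    print "rejected: $@";
}
```

Once the sub exists, calling it for each input line costs no more than any other function call; the string is never parsed again.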
Specifying arbitrary Perl code in a filter to select records is a breathtakingly powerful approach, but it's not entirely original. Perl owes much to the awk programming language, which is often used for such filtering. One problem with awk is that it can't easily treat input as fixed-size fields instead of fields separated by something. Another is that the fields are not mnemonically named: awk uses $1, $2, and so on.
The user criteria don't even have to be simple expressions. For example, this call initializes a variable, $id, to the user ID of nobody and then uses it in the comparison:

    % psgrep 'no strict "vars";
             BEGIN { $id = getpwnam("nobody") }
             uid == $id '
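The reason a BEGIN block is useful inside a criterion is that it runs when the criterion is compiled, not each time a line is tested. The following sketch illustrates that under a few assumptions: the $limit variable, the $begin_runs counter, and the three sample user IDs are invented for the demonstration, and a fixed number replaces the getpwnam lookup so the result doesn't depend on the local password file.

```perl
use strict;
use warnings;

our $begin_runs = 0;                 # hypothetical counter for this demo
my %fields = (uid => 0);             # stand-in for psgrep's record hash
sub uid () { $fields{uid} }

# A criterion with setup code: the BEGIN block sets $limit once.
my $criterion = 'no strict "vars"; '
              . 'BEGIN { $limit = 100; $main::begin_runs++ } '
              . 'uid < $limit';

eval "sub is_desirable { $criterion } 1"
    or die "Error in code: $@";

# Test several records; the BEGIN block is not run again.
for my $u (5, 50, 500) {
    $fields{uid} = $u;
    printf "uid %3d: %s\n", $u, is_desirable() ? "match" : "no match";
}
print "BEGIN block ran $begin_runs time(s)\n";
```

Any expensive lookup a criterion needs can be done this way, once, before the first input line is read.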
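How can we use unquoted words without even a dollar sign, like uid, command, and size, to represent the fields of each input record? The program assigns a closure to a typeglob for each field, which creates a function of that name. The functions are created under both the uppercase and lowercase names, so UID and uid work equally well.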
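Here is a small sketch of that accessor-generation loop on its own, using a shortened, hypothetical field list and made-up values. The expressions are compiled with a string eval after the loop has run, just as psgrep compiles the user's criteria, so the bare names are already known as functions taking no arguments.

```perl
use strict;
use warnings;

my @fieldnames = qw(UID SIZE);       # shortened, hypothetical field list
my %fields;

# Install one accessor per field under both its uppercase and
# lowercase name by assigning a closure to the typeglobs.
for my $name (@fieldnames) {
    no strict 'refs';
    *$name = *{lc $name} = sub () { $fields{$name} };
}

%fields = (UID => 42, SIZE => 2048); # made-up record

# Code compiled from here on can use the names as bare terms.
my $small_uid = eval 'uid < 100';
my $big       = eval 'SIZE > 1024 and size/2 > 1000';

print "uid < 100: ", ($small_uid ? "true" : "false"), "\n";
print "big:       ", ($big       ? "true" : "false"), "\n";
```

Each closure captures its own $name, so all the accessors share the one %fields hash but read different keys.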
One twist here not seen in those recipes is empty parentheses on the closure. These allowed us to use the function in an expression anywhere we'd use a single term, like a string or a numeric constant. They create a void prototype, so a field-accessing function such as uid accepts no arguments.

The version of psgrep demonstrated here expects the output from Red Hat Linux's ps. To port to other systems, look at which columns the headers begin at. This approach isn't relevant only to ps or only to Unix systems. It's a generic technique for filtering input records using Perl expressions, easily adapted to other record layouts. The input format could be in columns, space separated, comma separated, or the result of a pattern match with capturing parentheses.

The program could even be modified to handle a user-defined database with a small change to the selection functions. If you had an array of records as described in Recipe 11.9, you could let users specify arbitrary selection criteria, such as:

    sub id()         { $_->{ID}    }
    sub title()      { $_->{TITLE} }
    sub executive()  { title =~ /(?:vice-)?president/i }

    # user search criteria go in the grep clause
    @slowburners = grep { id < 10 && !executive } @employees;

For reasons of security and performance, this kind of power is seldom found in database engines like those described in Chapter 14, Database Access. SQL doesn't support this, but given Perl and a small bit of ingenuity, it's easy to roll it up on your own. The search engine at http://mox.perl.com/cgi-bin/MxScreen uses such a technique, but instead of output from ps, its records are Perl hashes loaded from a database.
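The record-filtering variant can be sketched end to end as follows. The @employees data and the select_records helper are hypothetical, invented for this illustration; the three accessors follow the ones shown above. The user's criterion arrives as a string, is compiled once into an anonymous sub, and is then applied to each record through grep, which sets $_ for the accessors.

```perl
use strict;
use warnings;

# Accessors read from the record currently in $_.
sub id        () { $_->{ID}    }
sub title     () { $_->{TITLE} }
sub executive () { title =~ /(?:vice-)?president/i }

# Made-up sample records for the demonstration.
my @employees = (
    { ID => 3,  TITLE => "President"      },
    { ID => 7,  TITLE => "Engineer"       },
    { ID => 9,  TITLE => "Vice-President" },
    { ID => 12, TITLE => "Janitor"        },
);

# select_records is a hypothetical helper: compile the criterion once,
# then test every record with the compiled code.
sub select_records {
    my ($criterion, @records) = @_;
    my $test = eval "sub { $criterion }"
        or die "Error in criterion: $@";
    return grep { $test->() } @records;
}

my @slowburners = select_records('id < 10 && !executive', @employees);
print join(", ", map { $_->{ID} } @slowburners), "\n";
```

Evaluating strings supplied by a user is as powerful as the text says, and as risky, so a sketch like this suits trusted local use rather than input from the network.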