

Perl Cookbook

8.8. Reading a Particular Line in a File

8.8.3. Discussion

Each strategy suits different circumstances. The linear access approach is easy to write and best for short files. The Tie::File module gives good performance regardless of the size of the file or which line you're reading (and, being pure Perl, it requires no external libraries). The DB_File mechanism has some initial overhead, but later accesses are faster than with linear access, so use it for long files that are read more than once and out of order.

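It is important to know whether you're counting lines from 0 or 1. The $. variable is 1 after the first line is read, so count from 1 when using linear access. The index mechanism works in terms of offsets, so count from 0. Tie::File and DB_File treat the file's records as an array indexed from 0, so count lines from 0. The print_line programs in this section take a line number counted from 1 on the command line; the Tie::File and DB_File versions subtract 1 before indexing the array.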
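The difference between the two conventions can be seen in a small sketch. The in-memory filehandle and the sample data are assumptions made for illustration, and the plain array merely stands in for the 0-based view of a file that Tie::File and DB_File present.

  #!/usr/bin/perl
  # count_demo - illustrative only: 1-based $. versus 0-based array index
  use strict;
  use warnings;

  my $data = "alpha\nbeta\ngamma\n";      # hypothetical file contents

  # Linear access: $. is 1 once the first line has been read,
  # so the second line is the one read when $. == 2.
  open(my $fh, "<", \$data)
      or die "Can't open in-memory file: $!\n";
  my $second_by_count;
  while (my $line = <$fh>) {
      if ($. == 2) {
          $second_by_count = $line;
          last;
      }
  }
  close($fh);

  # Array-style access: the same line lives at index 1.
  my @lines = split /^/, $data;           # keeps each line's newline
  my $second_by_index = $lines[2 - 1];

  print "same line\n" if $second_by_count eq $second_by_index;

Both lookups return the line "beta", one by asking for line 2 and the other by asking for element 1.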

Here are three different implementations of the same program, print_line. The program takes two arguments: a filename and a line number to extract.

The version in Example 8-1 simply reads lines until it finds the one it's looking for.

Example 8-1. print_line-v1

  #!/usr/bin/perl -w
  # print_line-v1 - linear style
  
  @ARGV == 2 or die "usage: print_line FILENAME LINE_NUMBER\n";
  
  ($filename, $line_number) = @ARGV;
  open(INFILE, "<", $filename)
    or die "Can't open $filename for reading: $!\n";
  while (<INFILE>) {
      $line = $_;
      last if $. == $line_number;
  }
  if ($. != $line_number) {
      die "Didn't find line $line_number in $filename\n";
  }
  print $line;

The Tie::File version is shown in Example 8-2.

Example 8-2. print_line-v2

  #!/usr/bin/perl -w
  # print_line-v2 - Tie::File style
  use Tie::File;
  use Fcntl;
  @ARGV == 2 or die "usage: print_line FILENAME LINE_NUMBER\n";
  ($filename, $line_number) = @ARGV;
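  # read-only is enough here, since we only print a line
  tie @lines, 'Tie::File', $filename, mode => O_RDONLY
      or die "Can't open $filename for reading: $!\n";
  # LINE_NUMBER counts from 1; @lines in numeric context is the line count
  if ($line_number < 1 || $line_number > @lines) {
      die "Didn't find line $line_number in $filename\n";
  }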
  print "$lines[$line_number-1]\n";

The DB_File version in Example 8-3 follows the same logic as Tie::File.

Example 8-3. print_line-v3

  #!/usr/bin/perl -w
  # print_line-v3 - DB_File style
  use DB_File;
  use Fcntl;
  
  @ARGV == 2 or die "usage: print_line FILENAME LINE_NUMBER\n";
  ($filename, $line_number) = @ARGV;
  $tie = tie(@lines, DB_File, $filename, O_RDWR, 0666, $DB_RECNO)
      or die "Cannot open file $filename: $!\n";
  
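  # LINE_NUMBER counts from 1, so the last valid value equals the record count
  unless ($line_number >= 1 && $line_number <= $tie->length) {
      die "Didn't find line $line_number in $filename\n";
  }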
  
  print $lines[$line_number-1];                        # easy, eh?

If you will be retrieving lines by number often and the file doesn't fit into memory, build a byte-address index to let you seek directly to the start of the line using the techniques in Recipe 8.27.
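As a rough sketch of the idea, and not the code from Recipe 8.27, the following records the byte offset at which each line starts and then seeks straight to a requested line. The names build_offsets and line_at are hypothetical, and the offsets are kept in an in-memory array for brevity; for a very large file you would more likely write them to a separate index file.

  #!/usr/bin/perl
  # index_sketch - illustrative byte-offset index; sub names are hypothetical
  use strict;
  use warnings;

  # Record the byte offset at which each line starts.
  sub build_offsets {
      my ($fh) = @_;
      my @offsets = (0);
      seek($fh, 0, 0) or die "Can't seek: $!\n";
      while (my $line = <$fh>) {
          push @offsets, tell($fh);
      }
      pop @offsets;       # last entry is end of file, not the start of a line
      return \@offsets;
  }

  # Fetch a line by its 1-based number by seeking directly to its offset.
  sub line_at {
      my ($fh, $offsets, $line_number) = @_;
      return if $line_number < 1 || $line_number > @$offsets;
      seek($fh, $offsets->[$line_number - 1], 0) or die "Can't seek: $!\n";
      return scalar <$fh>;
  }

  # Usage with hypothetical data held in an in-memory file.
  my $data = "alpha\nbeta\ngamma\n";
  open(my $fh, "<", \$data) or die "Can't open in-memory file: $!\n";
  my $offsets = build_offsets($fh);
  print line_at($fh, $offsets, 3);        # prints "gamma"

Building the index costs one pass over the file; after that, each lookup is a single seek and one line read, whatever the order of the requests.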




Copyright © 2003 O'Reilly & Associates. All rights reserved.