Perl & LWP

7.3. Individual Tokens

Now that you know the composition of the various types of tokens, let's see how to use HTML::TokeParser to write useful programs. Many problems are quite simple and require only one token at a time. Programs to solve these problems consist of a loop over all the tokens, with an if statement in the body of the loop identifying the interesting parts of the HTML:

use HTML::TokeParser;
my $stream = HTML::TokeParser->new($filename)
  || die "Couldn't read HTML file $filename: $!";
# For a string: HTML::TokeParser->new( \$string_of_html );

while (my $token = $stream->get_token) {
   if ($token->[0] eq 'T') { # text
     # process the text in $token->[1]

   } elsif ($token->[0] eq 'S') { # start-tag
     my($tagname, $attr) = @$token[1,2];
     # consider this start-tag...

   } elsif ($token->[0] eq 'E') { # end-tag
     my $tagname = $token->[1];
     # consider this end-tag
   }

   # ignoring comments, declarations, and PIs
}
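
As one concrete instance of this one-token-at-a-time pattern, the sketch below collects the href value of every a start-tag in a document. It is a minimal illustration rather than a definitive implementation: the sample HTML string and the extract_links name are hypothetical, chosen only for this example, and the parser is fed a reference to a string instead of a filename.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample document, used only for illustration.
my $html = '<p>See <a href="http://example.com/">this</a> '
         . 'and <a href="/about.html">that</a>.</p>';

# Return the href of every <a> start-tag, in document order.
# Takes a reference to a string of HTML.
sub extract_links {
    my ($html_ref) = @_;
    my $stream = HTML::TokeParser->new($html_ref)
        || die "Couldn't parse HTML string";

    my @links;
    while (my $token = $stream->get_token) {
        next unless $token->[0] eq 'S';        # only start-tags matter here
        my ($tagname, $attr) = @$token[1, 2];  # tag name, attribute hashref
        push @links, $attr->{href}
            if $tagname eq 'a' and defined $attr->{href};
    }
    return @links;
}

print "$_\n" for extract_links(\$html);
```

Because the parser reports tag and attribute names in lowercase, comparing against 'a' and looking up 'href' also matches markup written as <A HREF="...">. Text, end-tag, comment, declaration, and PI tokens are simply skipped by the next statement.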

Copyright © 2002 O'Reilly & Associates. All rights reserved.