Another feature of this module is that the declarations in the
internal subset are captured in lists accessible through the
XML::Grove object, so every entity or notation declaration is
available for your perusal. The following program puts the module to
work: it counts the elements and other nodes in a document, and then
prints a list of node types and their frequencies.
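First, we initialize the parser with the style "grove" (to tell
XML::Parser that it needs to use XML::Parser::Grove to process its
output). The NoExpand option asks XML::Parser to leave entity
references unexpanded, which is what lets the program count them
later: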
    use strict;
    use warnings;
    use XML::Parser;
    use XML::Parser::Grove;
    use XML::Grove;

    # parse the file named on the command line into a grove
    my $parser = XML::Parser->new( Style => 'grove', NoExpand => 1 );
    my $grove = $parser->parsefile( shift @ARGV );
Next, we access the contents of the grove by calling the
contents( ) method. This method returns a reference to a list
that holds the root element and any comments or PIs outside of it. A
subroutine called tabulate( ) counts each node and
descends recursively through the tree. Finally, the results are
printed:
    # tabulate elements and other nodes
    my %dist;
    foreach( @{$grove->contents} ) {
        tabulate( $_, \%dist );
    }

    # print each node type with its count, sorted by name
    print "\nNODES:\n\n";
    foreach( sort keys %dist ) {
        print "$_: " . $dist{$_} . "\n";
    }
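The report above is ordered by key name. If an ordering by frequency is more useful, a sort on the counts is one option. The following is a minimal sketch, not part of the original program: the report_lines( ) helper is hypothetical, and %dist is filled in by hand with made-up counts standing in for what tabulate( ) would collect.

```perl
use strict;
use warnings;

# Made-up counts standing in for what tabulate() would collect.
my %dist = (
    'element'        => 12,
    'text-node'      => 30,
    'comment'        => 2,
    'element (para)' => 7,
);

# Hypothetical helper: return the report lines, most frequent first.
# Ties are broken alphabetically so the output is stable.
sub report_lines {
    my ($table) = @_;
    my @keys = sort { $table->{$b} <=> $table->{$a} or $a cmp $b } keys %$table;
    my @lines;
    push @lines, "$_: $table->{$_}" for @keys;
    return @lines;
}

print "$_\n" for report_lines( \%dist );
```

Passing the table by reference mirrors the way tabulate( ) receives it, so the same helper could be handed \%dist from the real program.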
Here is the subroutine that handles each node in the tree. Since each
kind of node belongs to a different class, we can use ref( ) to
get the type. Attributes are not treated as nodes in this model, but
are available through the element class's method
attributes( ) as a reference to a hash. The call to
contents( ) allows the routine to continue
processing the element's children:
    # given a node and a table, find out what the node is, add to the count,
    # and recurse if necessary
    #
    sub tabulate {
        my( $node, $table ) = @_;
        my $type = ref( $node );
        if( $type eq 'XML::Grove::Element' ) {
            $table->{ 'element' }++;
            $table->{ 'element (' . $node->name . ')' }++;
            foreach( keys %{$node->attributes} ) {
                $table->{ "attribute ($_)" }++;
            }
            foreach( @{$node->contents} ) {
                tabulate( $_, $table );
            }
        } elsif( $type eq 'XML::Grove::Entity' ) {
            $table->{ 'entity-ref (' . $node->name . ')' }++;
        } elsif( $type eq 'XML::Grove::PI' ) {
            $table->{ 'PI (' . $node->target . ')' }++;
        } elsif( $type eq 'XML::Grove::Comment' ) {
            $table->{ 'comment' }++;
        } else {
            $table->{ 'text-node' }++;
        }
    }
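The ref( )-based dispatch in tabulate( ) can be tried out without a parser or an XML file. The following is a minimal sketch under stated assumptions: Demo::Element, Demo::Comment, and count_nodes( ) are hypothetical stand-ins invented for illustration, not part of XML::Grove, and plain strings play the role of text nodes. It applies the same test-the-class-then-recurse pattern to a hand-built tree.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an element node: a name plus a list of children.
package Demo::Element;
sub new {
    my ( $class, $name, @kids ) = @_;
    return bless { name => $name, contents => [@kids] }, $class;
}
sub name     { return $_[0]{name} }
sub contents { return $_[0]{contents} }

# Hypothetical stand-in for a comment node.
package Demo::Comment;
sub new { my ($class) = @_; return bless {}, $class }

package main;

# Count nodes by class; anything that is not one of the stand-in
# classes (here, a plain string) is counted as text.
sub count_nodes {
    my ( $node, $table ) = @_;
    my $type = ref($node);
    if ( $type eq 'Demo::Element' ) {
        $table->{ 'element (' . $node->name . ')' }++;
        count_nodes( $_, $table ) for @{ $node->contents };
    }
    elsif ( $type eq 'Demo::Comment' ) {
        $table->{'comment'}++;
    }
    else {
        $table->{'text-node'}++;
    }
    return $table;
}

# A small hand-built tree: one root, two paragraphs, two comments.
my $tree = Demo::Element->new( 'doc',
    Demo::Comment->new,
    Demo::Element->new( 'para', 'some text' ),
    Demo::Element->new( 'para', 'more text', Demo::Comment->new ),
);

my $counts = count_nodes( $tree, {} );
print "$_: $counts->{$_}\n" for sort keys %$counts;
```

Because ref( ) returns an empty string for an unblessed scalar, the final else branch catches the plain strings, much as the catch-all branch in tabulate( ) counts whatever is left over as text.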