17.5 Sample Specification Parser

The input specification parser is particular to an application domain. In this section, we look at the parser required for our toy object model specification, primarily to review how the AST library is used; the parsing code itself is quite trivial. For more involved parsing tasks, you can use a version of Berkeley yacc that has been hacked up to output Perl instead of C (available from http://ftp.sterling.com:/local/perl-byacc.tar.Z). I have successfully used this combination to produce IDL parsers for the CORBA specification.

The parser in Example 17.4 allows attributes to have additional annotations like this:

class Foo {
    int  id,  access=readonly, db_col_name=id, index=yes;
};

In the template, these attribute properties can be used just like "standard" properties such as attr_name and attr_type.
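As a rough illustration of how such a declaration breaks down into properties, the following self-contained sketch splits one attribute line into its type, its name, and its annotations. The helper name parse_attr_decl is hypothetical and is not part of the chapter's code; it only mirrors the splitting approach the parser takes.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the chapter): split one attribute
# declaration into its type, name, and key=value annotations.
sub parse_attr_decl {
    my ($decl) = @_;

    # Peel off the leading "type name" pair, e.g. "int id".
    $decl =~ s/^\s*(\w+)\s+(\w+)//
        or return;
    my %props = (attr_type => $1, attr_name => $2);

    # What remains is a comma-separated list of annotations
    # terminated by a semicolon.
    foreach my $piece (split /[,;]/, $decl) {
        if ($piece =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/) {
            $props{$1} = $2;
        }
    }
    return \%props;
}

my $props = parse_attr_decl("int  id,  access=readonly, db_col_name=id, index=yes;");
foreach my $key (sort keys %$props) {
    print "$key = $props->{$key}\n";
}
```

Each annotation ends up alongside attr_name and attr_type in the same property table, which is why a template can treat them uniformly.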

Example 17.4: OO_Schema.pm: The Specification Parser

package SchemaParser;
use Ast;
use Carp;
sub parse{
    my ($package, $filename) = @_;
    open (P, $filename) || die "Could not open $filename : $!";
    my $root = Ast->new("Root");
    eval {
        while (1) {
            get_line();
            next unless ($line =~ /^\s*class +(\w+)/);
            $c = Ast->new($1);
            $c->add_prop("class_name" => $1);
            $root->add_prop_list("class_list", $c);
            while (1) {
                get_line();
                last if $line =~ /^\s*}/;
                if ($line =~ s/^\s*(\w+)\s*(\w+)//) {
                    $a = Ast->new($2);  # node named after the attribute
                    $a->add_prop ("attr_name", $2);  # attribute name
                    $a->add_prop ("attr_type", $1);  # attribute type
                    $c->add_prop_list("attr_list", $a);
                }
                $curr_line = $line;
                while ($curr_line !~ /;/) {
                    get_line();
                    $curr_line .= $line;
                }
                @props = split (/[,;]/,$curr_line);
                foreach $prop (@props) {
                    if ($prop =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/) {
                         $a->add_prop($1, $2);
                    }
                }
            }
        }
    };
    # Control reaches here when get_line throws its "END OF FILE"
    # exception; any other exception is propagated to the caller.
    close(P);
    die $@ if ($@ && ($@  !~ /END OF FILE/));
    return $root;
}
sub get_line {
    while (defined($line = <P>)) {
        chomp $line;
        $line =~ s#//.*$##;          # remove comments
        return if $line !~ /^\s*$/;  # return if not white-space
    } 
    die "END OF FILE"; 
}
1;

The parse subroutine in Example 17.4 starts by creating a new AST root node. Whenever it encounters a class declaration, it creates a node for that class and adds it to the root's class_list property. Similarly, for each attribute, it creates a new node and adds it to the attr_list property of the AST node representing the class being examined.
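To make the shape of the resulting tree concrete, here is a minimal sketch that builds by hand the nodes the parser would create for class Foo and then walks them. MiniAst is a hypothetical stand-in written for this illustration; it models only the new, add_prop, and add_prop_list calls used in the listing and assumes each node is a blessed hash, which may differ from the real Ast library. The describe routine is likewise an assumption.

```perl
use strict;
use warnings;

# MiniAst: hypothetical stand-in for the Ast library, assuming a node
# is a blessed hash of properties.
package MiniAst;

sub new {
    my ($class, $name) = @_;
    return bless { node_name => $name }, $class;
}

sub add_prop {                      # scalar property: name => value
    my ($self, $name, $value) = @_;
    $self->{$name} = $value;
}

sub add_prop_list {                 # list property: append a child node
    my ($self, $name, $child) = @_;
    push @{ $self->{$name} }, $child;
}

package main;

# Build the tree for: class Foo { int id, access=readonly; };
my $root     = MiniAst->new("Root");
my $cls_node = MiniAst->new("Foo");
$cls_node->add_prop("class_name", "Foo");
$root->add_prop_list("class_list", $cls_node);

my $attr_node = MiniAst->new("id");
$attr_node->add_prop("attr_name", "id");
$attr_node->add_prop("attr_type", "int");
$attr_node->add_prop("access",    "readonly");
$cls_node->add_prop_list("attr_list", $attr_node);

# Walk the tree: the root holds class_list, each class holds attr_list.
sub describe {
    my ($tree) = @_;
    my @lines;
    foreach my $c (@{ $tree->{class_list} || [] }) {
        push @lines, "class $c->{class_name}";
        foreach my $attr (@{ $c->{attr_list} || [] }) {
            push @lines,
                "  $attr->{attr_type} $attr->{attr_name} (access=$attr->{access})";
        }
    }
    return @lines;
}

print "$_\n" for describe($root);
```

The walk reads the hash fields directly only because the stand-in makes that representation explicit; code using the real library would go through whatever accessors it provides.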

The procedure get_line throws an end-of-file exception when there is nothing more to read. This way, the caller can wrap multiple calls to get_line inside a single eval without having to check at each call site whether it has prematurely reached the end of input.
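The pattern can be tried in isolation with the sketch below. It is not the chapter's code: next_line is a hypothetical variant of get_line that takes the filehandle as an argument and returns the line, and the input is an in-memory string (an assumption made so the example needs no external file).

```perl
use strict;
use warnings;

# Hypothetical input held in memory so the sketch needs no file on disk.
my $text = "first\n\n// only a comment\nsecond // trailing comment\n";
open(my $in, '<', \$text) or die "Could not open in-memory input: $!";

# Return the next line that has content; throw at end of input.
sub next_line {
    my ($fh) = @_;
    while (defined(my $line = <$fh>)) {
        chomp $line;
        $line =~ s{\s*//.*$}{};          # strip comments
        return $line if $line =~ /\S/;   # skip blank lines
    }
    die "END OF FILE\n";
}

my @seen;
eval {
    while (1) {
        # No end-of-input test here; the exception ends the loop.
        push @seen, next_line($in);
    }
};
# Re-throw anything other than the expected end-of-input signal.
die $@ if $@ && $@ !~ /END OF FILE/;
close($in);

print scalar(@seen), " lines read\n";
```

Matching on the message text keeps the sketch close to the listing; a larger program might prefer an exception object so that an unrelated error containing the same words is not mistaken for end of input.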