Writing Your Own Pod Tools (Programming Perl)

26.3. Writing Your Own Pod Tools

Pod was designed first and foremost to be easy to write. As an added benefit, pod's simplicity also lends itself to writing simple tools for processing pod. If you're looking for pod directives, just set your input record separator to paragraph mode (perhaps with the -00 switch), and only pay attention to paragraphs that look poddish.

For example, here's a simple olpod program to produce a pod outline:

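#!/usr/bin/perl -00n
# olpod - outline pod
# -00 reads a paragraph at a time; -n loops over the input
next unless /^=head/;                    # keep only heading paragraphs
s/^=head(\d)\s+/ ' ' x ($1 * 4 - 4)/e;   # indent four spaces per level
s/\n+\z//;                               # trim the paragraph's trailing newlines
print "$_\n";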

If you run that on the current chapter of this book, you'll get something like this:

Plain Old Documentation
    Pod in a Nutshell
        Verbatim Paragraphs
        Pod Directives
        Pod Sequences
    Pod Translators and Modules
    Writing Your Own Pod Tools
    Pod Pitfalls
    Documenting Your Perl Programs
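
The same outline logic can also live inside an ordinary subroutine, by setting $/ to the empty string, which is what the -00 switch does. The following is only a sketch: pod_outline is a hypothetical helper name, and it reads from an in-memory string to keep the example self-contained.

```perl
use strict;
use warnings;

# pod_outline is a hypothetical helper, not part of any module.
# It returns one indented line per =headN paragraph found in $text.
sub pod_outline {
    my ($text) = @_;
    my @lines;
    local $/ = '';                                   # paragraph mode, like -00
    open my $fh, '<', \$text or die "can't read string: $!";
    while (my $para = <$fh>) {
        next unless $para =~ /^=head(\d)\s+(.*)/;    # heading paragraphs only
        push @lines, ' ' x (4 * ($1 - 1)) . $2;      # four spaces per level
    }
    close $fh;
    return @lines;
}

my $sample = "=head1 Plain Old Documentation\n\n"
           . "Some text.\n\n"
           . "=head2 Pod in a Nutshell\n\n"
           . "=head3 Verbatim Paragraphs\n\n";
print "$_\n" for pod_outline($sample);
```

Using local keeps the change to $/ confined to the subroutine, so the rest of a larger program still reads line by line.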

That pod outliner didn't really pay attention to whether it was in a valid pod block or not. Since pod and nonpod can intermingle in the same file, running general-purpose tools to search or analyze the whole file doesn't always make sense. But that's no problem, given how easy it is to write tools for pod. Here's a tool that is aware of the difference between pod and nonpod, and produces only the pod:

#!/usr/bin/perl -00
# catpod - cat out just the pods
while (<>) {
    if (! $inpod) { $inpod = /^=/;            }
    if ($inpod)   { $inpod = !/^=cut/; print; }
} continue {
    if (eof)      {  close ARGV; $inpod = ''; }
}
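
To see what the two if statements do, the same toggle can be run over a list of paragraphs held in memory. This is a sketch under assumptions: pod_only is a hypothetical name, and the sample paragraphs are made up for illustration.

```perl
use strict;
use warnings;

# pod_only is a hypothetical helper that applies catpod's toggle
# to a list of paragraphs and returns the ones catpod would print.
sub pod_only {
    my @kept;
    my $inpod = 0;
    for my $para (@_) {
        $inpod = ($para =~ /^=/ ? 1 : 0) unless $inpod;  # any directive opens a pod block
        next unless $inpod;
        push @kept, $para;                               # the =cut paragraph is kept too
        $inpod = 0 if $para =~ /^=cut/;                  # ...and then closes the block
    }
    return @kept;
}

my @paras = (
    "use strict;\n\n",
    "=head1 NAME\n\n",
    "catpod - demo\n\n",
    "=cut\n\n",
    "print 'code';\n\n",
);
print for pod_only(@paras);
```

Only the three paragraphs from =head1 through =cut are printed; the code before and after the pod block is skipped.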

You could use that program on another Perl program or module, then pipe the output along to another tool. For example, if you have the wc(1) program[2] to count lines, words, and characters, you could feed it catpod output to consider only pod in its counting:

% catpod MyModule.pm | wc

[2] And if you don't, get the Perl Power Tools version from the CPAN scripts directory.

There are plenty of places where pod allows you to write primitive tools trivially using plain, straightforward Perl. Now that you have catpod to use as a component, here's another tool to show just the indented code:

#!/usr/bin/perl -n00
# podlit - print the indented literal blocks from pod input
print if /^\s/;

What would you do with that? Well, you might want to do perl -wc checks on the code in the document, for one thing. Or maybe you want a flavor of grep(1)[3] that only looks at the code examples:

% catpod MyModule.pm | podlit | grep funcname

[3] And if you don't have grep, see previous footnote.

This tool-and-filter philosophy of interchangeable (and separately testable) parts is a sublimely simple and powerful approach to designing reusable software components. It's a form of laziness to just put together a minimal solution that gets the job done today, at least for certain kinds of jobs.

For other tasks, though, this minimalist approach can even be counterproductive. Sometimes it's more work to write a tool from scratch, sometimes less. For the tools we showed you earlier, Perl's native text-processing prowess makes it expedient to use brute force. But not everything works that way. As you play with pod, you might notice that although its directives are simple to parse, its sequences can get a little dicey. Although some, um, subcorrect translators don't accommodate this, sequences can nest within other sequences and can have variable-length delimiters.
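
A short experiment shows why sequences resist the brute-force treatment. The sketch below uses a hypothetical helper, naive_sequences, that pulls out sequences with one non-greedy regex, the way a quick tool might. It is deliberately wrong for nested sequences and for the doubled-bracket delimiter form.

```perl
use strict;
use warnings;

# naive_sequences is a hypothetical helper: it extracts "X<...>" sequences
# with a single non-greedy regex and returns them as strings.
sub naive_sequences {
    my ($text) = @_;
    my @found;
    while ($text =~ /([A-Z])<(.*?)>/g) {
        push @found, "$1<$2>";
    }
    return @found;
}

# A flat sequence comes through intact.
print "$_\n" for naive_sequences('See B<this> word.');

# Nesting: the match stops at the first ">", cutting the outer B<...> short.
print "$_\n" for naive_sequences('B<bold and I<italic> text>');

# Variable-length delimiters: the C<< ... >> form is cut short as well.
print "$_\n" for naive_sequences('C<< $a <=> $b >>');
```

The first call finds B<this> as expected. The second returns only B<bold and I<italic>, losing the tail of the outer sequence, and the third returns C<< $a <=>, stopping inside the comparison operator. Handling these cases correctly means tracking nesting depth and delimiter width, which is real parsing rather than pattern matching.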

Instead of writing all that parsing code on your own, laziness looks for another solution. The standard Pod::Parser module fits the bill. It's especially useful for complicated tasks, like those that require real parsing of the internal bits of the paragraphs, conversion into alternative output formats, and so on. It's easier to use the module for complicated cases, because the amount of code you end up writing is smaller. It's also better because the tricky parsing is already worked out for you. It's really the same principle as using catpod in a pipeline.

The Pod::Parser module takes an interesting approach to its job. It's an object-oriented module of a different flavor than most you've seen in this book. Its primary goal isn't so much to provide objects for direct manipulation as it is to provide a base class upon which other classes can be built.

You create your own class and inherit from Pod::Parser. Then you declare subroutines to serve as callback methods for your parent class's parser to invoke. It's a very different way of programming than the procedural programs given earlier. In a sense, it's more of a declarative programming style, because to get the job done, you simply register functions and let other entities invoke them for you. The program's tiresome logic is handled elsewhere. You just give some plug-and-play pieces.
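
The callback arrangement can be illustrated in miniature without Pod::Parser itself. In the sketch below, TinyPodDriver and HeadingLister are hypothetical classes invented for illustration. The driver is a toy that classifies paragraphs very crudely; it is not how Pod::Parser is implemented, and it only shows who calls whom.

```perl
use strict;
use warnings;

# TinyPodDriver is a toy base class, invented for this sketch.
# It owns the loop and calls back into whatever subclass it was built as.
package TinyPodDriver;

sub new { my ($class) = @_; return bless { out => [] }, $class }

sub parse_paragraphs {
    my ($self, @paras) = @_;
    for my $para (@paras) {
        if    ($para =~ /^=(\w+)\s*(.*)/s) { $self->command($1, $2) }
        elsif ($para =~ /^[ \t]/)          { $self->verbatim($para) }
        else                               { $self->textblock($para) }
    }
    return @{ $self->{out} };
}

# Default callbacks do nothing; subclasses override the ones they need.
sub command   { }
sub verbatim  { }
sub textblock { }

# HeadingLister is a hypothetical subclass that only cares about headings.
package HeadingLister;
our @ISA = ('TinyPodDriver');

sub command {
    my ($self, $name, $text) = @_;
    return unless $name =~ /^head\d$/;   # ignore =over, =item, =cut, ...
    $text =~ s/\s+\z//;                  # trim trailing newlines
    push @{ $self->{out} }, $text;
}

package main;

my @headings = HeadingLister->new->parse_paragraphs(
    "=head1 NAME\n\n",
    "catpod2 - demo\n\n",
    "    some code\n\n",
    "=head2 Options\n\n",
);
print "$_\n" for @headings;
```

The subclass never loops over input or decides what kind of paragraph it is looking at. It overrides one method, and the inherited driver invokes that method at the right moments.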

Here's a rewrite of the original catpod program given earlier, but this time it uses the Pod::Parser module to create our own subclass:

#!/usr/bin/perl
# catpod2, class and program

package catpod_parser;
use Pod::Parser;
@ISA = qw(Pod::Parser);
# Called for each pod directive; $command is its name without the "=".
sub command {
    my ($parser, $command, $paragraph, $line_num) = @_;
    my $out_fh = $parser->output_handle();
    # Make sure the paragraph ends with a blank line.
    $paragraph .= "\n" unless substr($paragraph, -1) eq "\n";
    $paragraph .= "\n" unless substr($paragraph, -2) eq "\n\n";
    print $out_fh "=$command $paragraph";
}

# Called for each verbatim (indented) paragraph; pass it through.
sub verbatim {
    my ($parser, $paragraph, $line_num) = @_;
    my $out_fh = $parser->output_handle();
    print $out_fh $paragraph;
}

# Called for each ordinary text paragraph; pass it through.
sub textblock {
    my ($parser, $paragraph, $line_num) = @_;
    my $out_fh = $parser->output_handle();
    print $out_fh $paragraph;
}

# Rebuild a sequence such as B<text> from its parts.
sub interior_sequence {
    my ($parser, $seq_command, $seq_argument) = @_;
    return "$seq_command<$seq_argument>";
}

if (!caller) {
    package main;
    my $parser = catpod_parser::->new();
    unshift @ARGV, '-' unless @ARGV;
    for (@ARGV) { $parser->parse_from_file($_); }
}
1;
__END__

=head1 NAME

docs describing the new catpod program here

As you see, it's a good bit longer and more complicated. It's also more extensible because all you have to do is plug in your own methods when you want your subclass to act differently than its base class.

The last bit at the end there, where it says !caller, checks whether the file is being used as a module or as a program. If it's being used as a program, then there is no caller. So it fires up its own parser (using the new method it inherited) and runs that parser on the command-line arguments. If no filenames were supplied, it assumes standard input, just as the previous version did.

Following the module code is an __END__ marker, a blank line without whitespace on it, and then the program/module's own pod documentation. This is an example of one file that's a program and a module and its own documentation. It's probably several other things as well.