Code Profiling Techniques (Practical mod

9.5.1. Profiling with Devel::DProf

Devel::DProf collects information on the execution time of a Perl script and of the subroutines in that script.

Let's take for example the diagnostics pragma and measure the impact of its usage on the program compilation and execution speed. This pragma extends the terse diagnostics normally emitted by both the Perl compiler and the Perl interpreter, augmenting them with the more verbose and endearing descriptions found in the perldiag manpage. We have claimed that this pragma should not be used on a production server. We are going to use Devel::DProf to explain our claim.

We will run a benchmark, once with diagnostics enabled and once disabled, on a subroutine called test_code( ).

The code inside the subroutine does either a lexical or a numeric comparison of two strings. It assigns one string to another if the condition tests true, but the condition is always false. To demonstrate the diagnostics pragma overhead, the comparison operator that we use in Example 9-7 is intentionally wrong. It should be a string comparison (eq), and we use a numeric one (= =).

Example 9-7. bench_diagnostics.pl

use Benchmark;
use diagnostics;
use strict;

my $count = 50000;

disable diagnostics;
my $t1 = timeit($count,\&test_code);

enable  diagnostics;
my $t2 = timeit($count,\&test_code);

print "Off: ",timestr($t1),"\n";
print "On : ",timestr($t2),"\n";

sub test_code {
    my ($a, $b) = qw(foo bar);
    my $c;
    if ($a =  = $b) {
        $c = $a;
    }
}

For only a few lines of code we get:

Off:  1 wallclock secs ( 0.81 usr +  0.00 sys =  0.81 CPU)
On : 13 wallclock secs (12.54 usr +  0.01 sys = 12.55 CPU)

With diagnostics enabled, the subroutine test_code( ) is 16 times slower (12.55/0.81: remember that we're talking in CPU time, not wallclock seconds) than with diagnostics disabled!

Now let's fix the comparison the way it should be, by replacing = = with eq, so we get:

my ($a, $b) = qw(foo bar);
my $c;
if ($a eq $b) {
    $c = $a;
}

and run the same benchmark again:

Off:  1 wallclock secs ( 0.57 usr +  0.00 sys =  0.57 CPU)
On :  1 wallclock secs ( 0.56 usr +  0.00 sys =  0.56 CPU)

Now there is no overhead at all. The diagnostics pragma slows things down only when warnings are generated.

After we have verified that using the diagnostics pragma might add a big overhead to execution runtime, let's use code profiling to understand why this happens. We use Devel::DProf to profile the code shown in Example 9-8.

Example 9-8. diagnostics.pl

use diagnostics;

test_code( );
sub test_code {
    my($a, $b) = qw(foo bar);
    my $c;
    if ($a =  = $b) {
        $c = $a;
    }
}

Run it with the profiler enabled, and then create the profiling statistics with the help of dprofpp:

panic% perl -d:DProf diagnostics.pl
panic% dprofpp

Total Elapsed Time = 0.342236 Seconds
  User+System Time = 0.335420 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 92.1   0.309  0.358      1   0.3089 0.3578  main::BEGIN
 14.9   0.050  0.039   3161   0.0000 0.0000  diagnostics::unescape
 2.98   0.010  0.010      2   0.0050 0.0050  diagnostics::BEGIN
 0.00   0.000 -0.000      2   0.0000      -  Exporter::import
 0.00   0.000 -0.000      2   0.0000      -  Exporter::export
 0.00   0.000 -0.000      1   0.0000      -  Config::BEGIN
 0.00   0.000 -0.000      1   0.0000      -  Config::TIEHASH
 0.00   0.000 -0.000      2   0.0000      -  Config::FETCH
 0.00   0.000 -0.000      1   0.0000      -  diagnostics::import
 0.00   0.000 -0.000      1   0.0000      -  main::test_code
 0.00   0.000 -0.000      2   0.0000      -  diagnostics::warn_trap
 0.00   0.000 -0.000      2   0.0000      -  diagnostics::splainthis
 0.00   0.000 -0.000      2   0.0000      -  diagnostics::transmo
 0.00   0.000 -0.000      2   0.0000      -  diagnostics::shorten
 0.00   0.000 -0.000      2   0.0000      -  diagnostics::autodescribe

It's not easy to see what is responsible for this enormous overhead, even if main::BEGINseems to be running most of the time. To get the full picture we must see the OPs tree, which shows us who calls whom, so we run:

panic% dprofpp -T

The output is:

main::BEGIN
  diagnostics::BEGIN
     Exporter::import
        Exporter::export
  diagnostics::BEGIN
     Config::BEGIN
     Config::TIEHASH
     Exporter::import
        Exporter::export
  Config::FETCH
  Config::FETCH
  diagnostics::unescape
  .....................
  3159 times [diagnostics::unescape] snipped
  .....................
  diagnostics::unescape
  diagnostics::import
diagnostics::warn_trap
  diagnostics::splainthis
     diagnostics::transmo
     diagnostics::shorten
     diagnostics::autodescribe
main::test_code
  diagnostics::warn_trap
     diagnostics::splainthis
        diagnostics::transmo
        diagnostics::shorten
        diagnostics::autodescribe
  diagnostics::warn_trap
     diagnostics::splainthis
        diagnostics::transmo
        diagnostics::shorten
       diagnostics::autodescribe

So we see that 2 executions of diagnostics::BEGIN and 3,161 of diagnostics::unescape are responsible for most of the running overhead.

If we comment out the diagnostics module, we get:

Total Elapsed Time = 0.079974 Seconds
  User+System Time = 0.059974 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 0.00   0.000 -0.000      1   0.0000      -  main::test_code

It is possible to profile code running under mod_perl with the Devel::DProf module, available on CPAN. However, you must have PerlChildExitHandler enabled during the mod_perl build process. When the server is started, Devel::DProf installs an END block to write the tmon.out file. This block will be called at server shutdown. Here is how to start and stop a server with the profiler enabled:

panic% setenv PERL5OPT -d:DProf
panic% httpd -X -d `pwd` &
... make some requests to the server here ...
panic% kill `cat logs/httpd.pid`
panic% unsetenv PERL5OPT
panic% dprofpp

The Devel::DProf package is a Perl code profiler. It will collect information on the execution time of a Perl script and of the subroutines in that script (remember that print( ) and map( ) are just like any other subroutines you write, but they come bundled with Perl!).

Another approach is to use Apache::DProf, which hooks Devel::DProf into mod_perl. The Apache::DProf module will run a Devel::DProf profiler inside the process and write the tmon.out file in the directory $ServerRoot/logs/dprof/$$ (make sure that it's writable by the server!) when the process is shut down (where $$ is the PID of the process). All it takes to activate this module is to modify httpd.conf.

You can test for a command-line switch in httpd.conf. For example, to test if the server was started with -DPERLDPROF, use:

<Location /perl>
     SetHandler perl-script
     PerlHandler Apache::Registry
     <IfDefine PERLDPROF>
         PerlModule Apache::DProf
     </IfDefine>
</Location>

And to activate profiling, use:

panic% httpd -X -DPERLDPROF &

Remember that any PerlHandler that was pulled in before Apache::DProf in the httpd.conf or startup.pl file will not have code-debugging information inserted. To run dprofpp, chdir to $ServerRoot/logs/dprof/$$[36] and run:

[36]Look up the ServerRoot directive's value in httpd.conf to figure out what your $ServerRoot is.

panic% dprofpp

Use the command-line options for dropfpp(1) if a nondefault output is desired, as explained in the dropfpp manpage. You might especially want to look at the -r switch to display wallclock times (more relevant in a web-serving environment) and the -l switch to sort by number of subroutine calls.

If you are running Perl 5.6.0 or higher, take a look at the new module Devel::Profiler (Version 0.04 as of this writing), which is supposed to be a drop-in replacement for Apache::DProf, with improved functionality and stability.

9.5. Code Profiling Techniques

9.5.1. Profiling with Devel::DProf

Example 9-7. bench_diagnostics.pl

Example 9-8. diagnostics.pl

9.5.2. Profiling with Devel::SmallProf

Example 9-9. table_gen.pl

Example 9-10. table_gen2.pl