Benchmark (Programming Perl)

32.2. Benchmark

use Benchmark qw(timethese cmpthese timeit countit timestr);

# You can always pass in code as strings:
timethese $count, {
    'Name1' => '...code1...',
    'Name2' => '...code2...',
};

# Or as subroutine references:
timethese $count, {
    'Name1' => sub { ...code1... },
    'Name2' => sub { ...code2... },
};

cmpthese $count, {
    'Name1' => '...code1...',
    'Name2' => '...code2...',
};

$t = timeit $count, '...code...';
print "$count loops of code took:", timestr($t), "\n";

$t = countit $time, '...code...';
$count = $t->iters;
print "$count loops of code took:", timestr($t), "\n";

The Benchmark module can help you determine which of several possible choices executes the fastest. The timethese function runs the specified code segments the number of times requested and reports back how long each segment took. You can get a nicely sorted comparison chart if you call cmpthese the same way.

Code segments may be given as subroutine references instead of strings (in fact, they must be if the code uses lexical variables from the calling scope), but the overhead of the subroutine call can influence the timings. If you don't ask for enough iterations to get a good timing, timethese emits a warning.
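As a rough illustration of the point about lexical variables, the sketch below times two closures that share a lexical array. The workload and the names @words, join_op, and concat are invented for the example. It also assumes two behaviors described in the Benchmark documentation but not covered above: a negative count asks for at least that many CPU seconds instead of a fixed number of iterations, and the 'none' style suppresses the report from timethese so that cmpthese can print the chart from the returned results.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# Lexical data (hypothetical workload). The closures below can see it;
# code passed as a string could not.
my @words = map { "word$_" } 1 .. 1_000;

# A negative count means "run each entry for at least this many CPU
# seconds", which makes the too-few-iterations warning unlikely.
my $results = timethese(-1, {
    join_op => sub { my $s = join '', @words; },
    concat  => sub { my $s = ''; $s .= $_ for @words; },
}, 'none');

# timethese returns a hash reference of Benchmark objects keyed by name;
# cmpthese accepts that hash reference and prints the comparison chart.
cmpthese($results);
```

Written as strings, the same two snippets would not see @words, so they would have to be rewritten to use a package variable instead.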

Lower-level interfaces are available that run just one piece of code, either for some number of iterations (timeit) or for some number of seconds (countit). These functions return Benchmark objects (see the online documentation for a description). With countit, you know the code will run long enough to avoid warnings, because you specified a minimum run time.
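The following sketch shows one way the two lower-level functions might be used, along with a couple of the accessor methods that the Benchmark documentation lists for the returned objects (iters and cpu_a). The sort workload and the variable names are made up for illustration, and the numbers printed will differ from machine to machine.

```perl
use strict;
use warnings;
use Benchmark qw(timeit countit timestr);

# Hypothetical workload, used only for illustration.
my $code = sub { my @sorted = sort { $a <=> $b } reverse 1 .. 500 };

# Fixed iteration count: how long do 2_000 calls take?
my $fixed = timeit(2_000, $code);
printf "%d iterations:%s\n", $fixed->iters, timestr($fixed);

# Fixed time budget: how many calls fit into about 1 CPU second?
my $timed = countit(1, $code);
my $cpu   = $timed->cpu_a;    # total CPU seconds, parent plus children
printf "%d iterations in %.2f CPU seconds\n", $timed->iters, $cpu;
printf "about %.0f iterations per second\n", $timed->iters / $cpu
    if $cpu > 0;
```

The countit form is usually the easier one to start with, since you don't have to guess an iteration count that is large enough to give a stable timing.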

To get the most out of the Benchmark module, you'll need a good bit of practice. It isn't usually enough to run a couple of different algorithms on the same data set, because the timings only reflect how well those algorithms did on that particular data set. To get a better feel for the general case, you'll need to run several sets of benchmarks, varying the data sets used.

For example, suppose you wanted to know the best way to get a copy of a string without the last two characters. You think of four ways to do so (there are, of course, several others): chop twice, copy and substitute, or use substr on either the left- or righthand side of an assignment. You test these algorithms on strings of length 2, 200, and 20_000:

use Benchmark qw/countit cmpthese/;
# Time one code string for at least 5 CPU seconds.
sub run($) { countit(5, @_) }
for $size (2, 200, 20_000) {
    $s = "." x $size;    # test string of the current length
    print "\nDATASIZE = $size\n";
    cmpthese {
        chop2   => run q{
            $t = $s; chop $t; chop $t;
        },
        subs    => run q{
            ($t = $s) =~ s/..\Z//s;
        },
        lsubstr => run q{
            $t = $s; substr($t, -2) = '';
        },
        rsubstr => run q{
            $t = substr($s, 0, length($s)-2);
        },
    };
}

which produces the following output:

DATASIZE = 2
            Rate    subs lsubstr   chop2 rsubstr
subs    181399/s      --    -15%    -46%    -53%
lsubstr 214655/s     18%      --    -37%    -44%
chop2   338477/s     87%     58%      --    -12%
rsubstr 384487/s    112%     79%     14%      --

DATASIZE = 200
            Rate    subs lsubstr rsubstr   chop2
subs    200967/s      --    -18%    -24%    -34%
lsubstr 246468/s     23%      --     -7%    -19%
rsubstr 264428/s     32%      7%      --    -13%
chop2   304818/s     52%     24%     15%      --

DATASIZE = 20000
          Rate rsubstr    subs lsubstr   chop2
rsubstr 5271/s      --    -42%    -43%    -45%
subs    9087/s     72%      --     -2%     -6%
lsubstr 9260/s     76%      2%      --     -4%
chop2   9660/s     83%      6%      4%      --

With small data sets, the "rsubstr" algorithm runs 14% faster than the "chop2" algorithm, but with large data sets it runs 45% slower. On empty data sets (not shown here), the substitution mechanism is the fastest. So there is often no best solution for all possible cases, and even these timings don't tell the whole story, since you're still at the mercy of your operating system and the C library Perl was built with. What's good for you may be bad for someone else. It takes a while to develop decent benchmarking skills. In the meantime, it helps to be a good liar.