home | O'Reilly's CD bookshelfs | FreeBSD | Linux | Cisco | Cisco Exam  


2.10. Generating Biased Random Numbers

Problem

You want to pick a random value where the probabilities of the values are not equal (the distribution is not even). You might be trying to randomly select a banner to display on a web page, given a set of relative weights saying how often each banner is to be displayed. Alternatively, you might want to simulate behavior according to a normal distribution (the bell curve).

Solution

If you want a random value distributed according to a specific function  - e.g., the Gaussian (Normal) distribution  - consult a statistics textbook to find the appropriate function or algorithm. This subroutine generates random numbers that are normally distributed, with a standard deviation of 1 and a mean of 0.

sub gaussian_rand {
    my ($u1, $u2);  # uniformly distributed random numbers
    my $w;          # variance, then a weight
    my ($g1, $g2);  # gaussian-distributed numbers

    do {
        $u1 = 2 * rand() - 1;
        $u2 = 2 * rand() - 1;
        $w = $u1*$u1 + $u2*$u2;
    } while ( $w >= 1 );

    $w = sqrt( (-2 * log($w))  / $w );
    $g2 = $u1 * $w;
    $g1 = $u2 * $w;
    # return both if wanted, else just one
    return wantarray ? ($g1, $g2) : $g1;
}

If you have a list of weights and values you want to randomly pick from, follow this two-step process: First, turn the weights into a probability distribution with weight_to_dist below, and then use the distribution to randomly pick a value with weighted_rand :

# weight_to_dist: takes a hash mapping key to weight and returns
# a hash mapping key to probability
sub weight_to_dist {
    my %weights = @_;
    my %dist    = ();
    my $total   = 0;
    my ($key, $weight);
    local $_;

    foreach (values %weights) {
        $total += $_;
    }

    while ( ($key, $weight) = each %weights ) {
        $dist{$key} = $weight/$total;
    }

    return %dist;
}

# weighted_rand: takes a hash mapping key to probability, and
# returns the corresponding element
sub weighted_rand {
    my %dist = @_;
    my ($key, $weight);

    while (1) {                     # to avoid floating point inaccuracies
        my $rand = rand;
        while ( ($key, $weight) = each %dist ) {
            return $key if ($rand -= $weight) < 0;
        }
    }
}

Discussion

The gaussian_rand function implements the polar Box Muller method for turning two independent uniformly distributed random numbers between 0 and 1 (such as rand returns) into two numbers with a mean of 0 and a standard deviation of 1 (i.e., a Gaussian distribution). To generate numbers with a different mean and standard deviation, multiply the output of gaussian_rand by the new standard deviation, and then add the new mean:

# gaussian_rand as above
$mean = 25;
$sdev = 2;
$salary = gaussian_rand() * $sdev + $mean;
printf("You have been hired at \$%.2f\n", $salary);

The weighted_rand function picks a random number between 0 and 1. It then uses the probabilities generated by weight_to_dist to see which element the random number corresponds to. Because of the vagaries of floating-point representation, the accumulated errors of representation might mean we don't find an element to return. This is why we wrap the code in a while to pick a new random number and try again.

In addition, the CPAN module Math::Random has functions to return random numbers from a variety of distributions.

See Also

The rand function in perlfunc (1) and Chapter 3 of Programming Perl ; Recipe 2.7 ; the documentation for the CPAN module Math::Random


Previous: 2.9. Making Numbers Even More Random Perl Cookbook Next: 2.11. Doing Trigonometry in Degrees, not Radians
2.9. Making Numbers Even More Random Book Index 2.11. Doing Trigonometry in Degrees, not Radians