Published on Perl.com
http://www.perl.com/pub/a/2002/11/27/classdbi.html
See this if you're having trouble printing code examples
Class::DBI
By Tony Bowden
Several articles on Perl.com, including the recent
Phrasebook
Design Pattern, have discussed the problems faced when writing
Perl code that interacts with a database. Terrence Brannon's
DBIx::Recordset
article attempted to show how code dealing with databases can be made
simpler, and more maintainable. In this article, I will try to show how
Class::DBI can make this easier still.
Class::DBI prizes laziness and simplicity. Its goal is to make simple
database interactions trivial, while leaving the hard ones possible.
For many simple applications, it replaces the need for writing SQL
entirely. On the other hand, it doesn't force you to build complex data
structures to specify a complex query; if you really need the power or
expressiveness of raw SQL, then it gets out of your way and lets you drop back
to that.
The easiest way to see Class::DBI in action is to build a simple
application with it. In this article, I'll build a tool for performing
analysis on my telephone bill.
Data::BT::PhoneBill
(available from CPAN), provides a simple interface to a phone bill
downloaded from the BT Web site. So, armed with this module, and a few
recent phonebills, let's store these details in a database, and see how
to extract useful information from them.
Class::DBI works on the basis that each table in your database has a
corresponding class. Although each class could set up its own connection
information, it's a better idea to encapsulate that connection in one
class, and have all the others inherit from that. So, we set up our
database, and create the base class for our application:
package My::PhoneBill::DBI;
use base 'Class::DBI';
__PACKAGE__->set_db('Main', 'dbi:mysql:phonebill', 'u/n', 'p/w');
1;
We simply inherit from Class::DBI and use the 'set_db ' method to set up
the connection information for our database. That's all we need in this
class for now, so next we set up our table for storing the phone call
information:
CREATE TABLE call (
callid MEDIUMINT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
number VARCHAR(20) NOT NULL,
destination VARCHAR(255) NOT NULL,
calldate DATE NOT NULL,
calltime TIME NOT NULL,
type VARCHAR(50) NOT NULL,
duration SMALLINT UNSIGNED NOT NULL,
cost FLOAT(8,1)
);
For this, we set up a corresponding class:
package My::PhoneBill::Call;
use base 'My::PhoneBill::DBI';
__PACKAGE__->table('call');
__PACKAGE__->columns(All =>
qw/callid number destination calldate calltime type duration cost/);
1;
We inherit our connection information from our base class, and then
specify what table we're dealing with, and what its columns are.
Now we have enough to populate the table.
So, we create a simple, "populate_phone_bill" script:
#!/usr/bin/perl
use Data::BT::PhoneBill;
use My::PhoneBill::Call;
my $file = shift or die "Need a phone bill file";
my $bill = Data::BT::PhoneBill->new($file) or die "Can't parse bill";
while (my $call = $bill->next_call) {
My::PhoneBill::Call->create({
number => $call->number,
calldate => $call->date,
calltime => $call->time,
destination => $call->destination,
duration => $call->duration,
type => $call->type,
cost => $call->cost,
});
}
The create() call runs the SQL to INSERT the row for each call. As we're
using Class::DBI , and have defined our primary key column to be
AUTO_INCREMENT , we don't need to specify a value for that column. On databases
that support sequences, we could also inform Class::DBI what sequence
should be used to provide the primary key.
Now that we have a table populated with calls, we can begin to run
queries against it. Let's write a simple script that reports on all the
calls to a specified number:
#!/usr/bin/perl
use My::PhoneBill::Call;
my $number = shift or die "Usage: $0 <number>";
my @calls = My::PhoneBill::Call->search(number => $number);
my $total_cost = 0;
foreach my $call (@calls) {
$total_cost += $call->cost;
printf "%s %s - %d secs, %.1f pence\n",
$call->calldate, $call->calltime, $call->duration, $call->cost;
}
printf "Total: %d calls, %d pence\n", scalar @calls, $total_cost;
Here we can see that Class::DBI provides a 'search ' method for us
to use. We supply a hash of column/value pairs, and we get back all the
records that matched. Each result is an instance of the Call class,
and each has an accessor method corresponding to each column. (It's also
a mutator method, so we can update that record, but we're only reporting
at this stage).
So, if we want to see how often we're calling the Speaking Clock, then
we run
> perl calls_to 123
2002-09-17 11:06:00 - 5 secs, 8.5 pence
2002-10-19 21:20:00 - 8 secs, 8.5 pence
Total: 2 calls, 17 pence
Similarly, if we want to see all the calls on a given date, then we could
have a 'calls_on ' script:
#!/usr/bin/perl
use My::PhoneBill::Call;
my $date = shift or die "Usage: $0 <date>";
my @calls = My::PhoneBill::Call->search(calldate => $date);
my $total_cost = 0;
foreach my $call (@calls) {
$total_cost += $call->cost;
printf "%s) %s - %d secs, %.1f pence\n",
$call->calltime, $call->number, $call->duration, $call->cost;
}
printf "Total: %d calls, %d pence\n", scalar @calls, $total_cost;
Running it gives:
> perl calls_on 2002-10-19
...
18:36:00) 028 9024 4222 - 41 secs, 4.2 pence
21:20:00) 123 - 8 secs, 8.5 pence
...
Total: 7 calls, 92 pence
As promised, we've written a database application without writing any
SQL. OK, so we haven't really done anything very complicated yet, but
even for this simplistic use Class::DBI makes our life much easier.
I used to have a good memory for phone numbers. But Nokia, Ericsson, et
al, have conspired against me. By giving my cell phone a built-in address
book, they ensured that the part of my brain responsible for remembering
10 or 11 digit numbers would gradually atrophy. Now, when I look at the
output of 'calls_on ', I have no idea who "028 9024 4222" represents. So,
let's build an address book that can store this information, and then
change our reports to use it.
The first thing we should do is arrange our information a little better.
We'll take the number and destination columns, and move them to a
"recipient" table, to which we'll add a name column. "Destination" doesn't
make as much sense when associated with the number, rather than the call,
so we'll rename it to "location".
CREATE TABLE recipient (
recipid MEDIUMINT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
number VARCHAR(20) NOT NULL,
location VARCHAR(255) NOT NULL,
name VARCHAR(255),
KEY (number)
);
And then we create the relevant class for this table:
package My::PhoneBill::Recipient;
use base 'My::PhoneBill::DBI';
__PACKAGE__->table('recipient');
__PACKAGE__->columns(All => qw/recipid number location name/);
1;
We also need to modify the Call table:
CREATE TABLE call (
callid MEDIUMINT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
* recipient MEDIUMINT UNSIGNED NOT NULL,
calldate DATE NOT NULL,
calltime TIME NOT NULL,
type VARCHAR(50) NOT NULL,
duration SMALLINT UNSIGNED NOT NULL,
cost FLOAT(8,1),
* KEY (recipient)
);
and its associated class:
package My::PhoneBill::Call;
use base 'My::PhoneBill::DBI';
__PACKAGE__->table('call');
__PACKAGE__->columns(All =>
* qw/callid recipient calldate calltime type duration cost/);
1;
Then we can modify our script that populates the database:
#!/usr/bin/perl
use Data::BT::PhoneBill;
use My::PhoneBill::Call;
*use My::PhoneBill::Recipient;
my $file = shift or die "Need a phone bill file";
my $bill = Data::BT::PhoneBill->new($file) or die "Can't parse bill";
*while (my $call = $bill->next_call) {
* my $recipient = My::PhoneBill::Recipient->find_or_create({
* number => $call->number,
* location => $call->destination,
* });
* My::PhoneBill::Call->create({
* recipient => $recipient->id,
calldate => $call->date,
calltime => $call->time,
duration => $call->duration,
type => $call->type,
cost => $call->cost,
});
}
This time we need to create the Recipient first, so we can link to it
from the Call . But we don't want to create a new Recipient for each call -
if we've ever rang this person before, there'll already be an entry
in the recipient table: so we use find_or_create to give us back the
existing entry if it's there, or else create a new one.
With the table repopulated we can return to our reporting scripts.
Our calls_on script now fails as we can can't ask a call for its
'number'. So, we change it to:
#!/usr/bin/perl
use My::PhoneBill::Call;
my $date = shift or die "Usage: $0 <date>";
my @calls = My::PhoneBill::Call->search(calldate => $date);
my $total_cost = 0;
foreach my $call (@calls) {
$total_cost += $call->cost;
printf "%s) %s - %d secs, %.1f pence\n",
* $call->calltime, $call->recipient, $call->duration, $call->cost;
}
printf "Total: %d calls, %d pence\n", scalar @calls, $total_cost;
However, running it doesn't really give us what we want:
> perl calls_on 2002-10-19
...
18:36:00) 67 - 41 secs, 4.2 pence
21:20:00) 47 - 8 secs, 8.5 pence
...
Total: 7 calls, 92 pence
Instead of getting the phone number, we now get the ID from the
recipient table, which is just an auto-incrementing value.
To turn this into a sensible value, we add the following line to the
Call class:
__PACKAGE__->has_a(recipient => 'My::PhoneBill::Recipient');
This tells it that the recipient method doesn't just return a simple
value, but that that value should be automatically turned into an instance
of the Recipient class.
Of course, calls_on still won't be correct:
> perl calls_on 2002-10-19
...
18:36:00) My::PhoneBill::Recipient=HASH(0x835b6b8) - 41 secs, 4.2 pence
21:20:00) My::PhoneBill::Recipient=HASH(0x835a210) - 8 secs, 8.5 pence
...
Total: 7 calls, 92 pence
But now all it takes is a small tweak to the script:
printf "%s) %s - %d secs, %.1f pence\n",
$call->calltime, $call->recipient->number, $call->duration, $call->cost;
And now everything is working perfectly again:
> perl calls_on 2002-10-19
...
18:36:00) 028 9024 4222 - 41 secs, 4.2 pence
21:20:00) 123 - 8 secs, 8.5 pence
...
Total: 7 calls, 92 pence
The calls_to script is slightly trickier, as the initial search is now
on the recipient rather than the call.
So, we change the initial search to:
my ($recipient) = My::PhoneBill::Recipient->search(number => $number)
or die "No calls to $number\n";
Then we need to get all the calls to that recipient. For this, we need to
inform Recipient of the relationship with calls. Unlike the has_a we
just set up in the Call class, the recipient table doesn't store any value
concerned with the call table that could be inflated on demand. Instead
we need to add a has_many declaration to Recipient:
__PACKAGE__->has_many(calls => 'My::PhoneBill::Call');
This creates a new method calls for each Recipient object, returning
a list of Call objects whose recipient method is our primary key.
So, having found our recipient in the calls_to script, we can simply
ask:
my @calls = $recipient->calls;
And now the script works just as before:
#!/usr/bin/perl
use My::PhoneBill::Recipient;
my $number = shift or die "Usage: $0 <number>";
my ($recipient) = My::PhoneBill::Recipient->search(number => $number)
or die "No calls to $number\n";
my @calls = $recipient->calls;
my $total_cost = 0;
foreach my $call (@calls) {
$total_cost += $call->cost;
printf "%s %s - %d secs, %.1f pence\n",
$call->calldate, $call->calltime, $call->duration, $call->cost;
}
printf "Total: %d calls, %d pence\n", scalar @calls, $total_cost;
And now we can get the old results again:
> perl calls_to 123
2002-09-17 11:06:00 - 5 secs, 8.5 pence
2002-10-19 21:20:00 - 8 secs, 8.5 pence
Total: 2 calls, 17 pence
Next we need a script to give a name to a number in our address book:
#!/usr/bin/perl
use My::PhoneBill::Recipient;
my($number, $name) = @ARGV;
die "Usage $0 <number> <name>\n" unless $number and $name;
my $recip = My::PhoneBill::Recipient->find_or_create({number => $number});
my $old_name = $recip->name;
$recip->name($name);
$recip->commit;
if ($old_name) {
print "OK. $number changed from $old_name to $name\n";
} else {
print "OK. $number is $name\n";
}
This lets us associate a number to a name:
> perl add_phone_number 123 "Speaking Clock"
OK. 123 is Speaking Clock
And now with a tiny change to our calls_on script we can output the name
where known:
printf "%s) %s - %d secs, %.1f pence\n",
$call->calltime, $call->recipient->name || $call->recipient->number,
$call->duration, $call->cost;
> perl calls_on 2002-10-19
...
18:36:00) 028 9024 4222 - 41 secs, 4.2 pence
21:20:00) Speaking Clock - 8 secs, 8.5 pence
...
Total: 7 calls, 92 pence
To allow the calls_to script to work if we give it either a name or a
number, we can use:
my $recipient = My::PhoneBill::Recipient->search(name => $number)
|| My::PhoneBill::Recipient->search(number => $number)
|| die "No calls to $number\n";
However, as we've called search in scalar context, rather than the
normal list context, we get back an interator rather than an individual
Recipient object. As one name may map to multiple numbers, we need to
iterate over each of these in turn:
my @calls;
while (my $recip = $recipient->next) {
push @calls, $recip->calls;
}
(Printing individual totals for each number is left as an exercise for
the reader.)
> perl calls_to "Speaking Clock"
2002-09-17 11:06:00 - 5 secs, 8.5 pence
2002-10-19 21:20:00 - 8 secs, 8.5 pence
Total: 2 calls, 17 pence
Sometimes the information we store in the database can be used to work
with non-Class::DBI modules. For example, if we wanted to format the
dates of our calls differently, we might like to turn them into
Date::Simple objects. Again, Class::DBI makes this easy.
In the Call class, we again use has_a to declare this relationship:
__PACKAGE__->has_a(recipient => 'My::PhoneBill::Recipient');
*__PACKAGE__->has_a(calldate => 'Date::Simple');
Now, when we fetch the calldate it is automatically inflated to a
Date::Simple object. So, we can change the output of calls_to to print
the date in a nicer format:
printf "%s %s - %d secs, %.1f pence\n",
$call->calldate->format("%d %b"), $call->calltime,
$call->duration, $call->cost;
> perl calls_to "Speaking Clock"
17 Sep 11:06:00 - 5 secs, 8.5 pence
19 Oct 21:20:00 - 8 secs, 8.5 pence
Total: 2 calls, 17 pence
Class::DBI assumes that any non-Class::DBI class is inflated through a
new method, and can be deflated through stringification. As both of
these are true for Date::Simple , we didn't need to do anything more. If
this is not the case - for example, if you wanted to use Time::Piece instead
of Date::Simple - you need to tell Class::DBI how to do the inflating
and deflating as the value goes in and comes out of the database.
__PACKAGE__->has_a(calldate => 'Time::Piece',
inflate => sub { Time::Piece->strptime(shift, "%Y-%m-%d") },
deflate => 'ymd'
);
Deflating a Time::Piece object back to an ISO date string suitable for
MySQL is quite simple: you can just call its ymd() method. Thus we can
specify this as a string. Inflating is more difficult, as it requires a
two argument call to strptime() . Thus we need to specify this as a
subroutine reference. When inflating, this is called with the value from
the database as its one and only argument. Thus, we can pass that to
Time::Piece 's strptime method, specifying the format to instantiate
from.
Using Time::Piece instead of Date::Time requires one further change to
to our output script:
printf "%s %s - %d secs, %.1f pence\n",
* $call->calldate->strftime("%d %b"), $call->calltime,
$call->duration, $call->cost;
BT provide a service that allows you to save money on 10 numbers of your
choice. So it would be useful if we were able to find out which numbers
we spend the most money calling. We'll assume that any number we've only
ever called once isn't worth adding to our list, as it was probably a
one-off call. Thus we want the 10 numbers with the highest spend that
have had more than one call.
As we said earlier, Class::DBI doesn't attempt to provide a syntax to
express any arbitrary SQL, so there's no way of asking directly for
this information. Instead, we'll try a simple approach.
Firstly we'll add a method to the Recipient class to tell us the total
we've spent on calls to that number:
use List::Util 'sum';
sub total_spend {
my $self = shift;
return sum map $_->cost, $self->calls;
}
Then we can create a top_ten script:
#!/usr/bin/perl
use My::PhoneBill::Recipient;
my @recipients = My::PhoneBill::Recipient->retrieve_all;
my @regulars = grep $_->calls > 1, @recipients;
my @sorted = sort { $b->total_spend <=> $a->total_spend } @regulars;
foreach my $recip (@sorted[0 .. 9]) {
printf "%s - %d calls = %d pence\n",
$recip->name || $recip->number,
scalar $recip->calls,
$recip->total_spend;
}
This is very slow, however, once you have more than a few hundred calls
stored in your database. This is mainly because we're sorting based on a
method call. Replacing the sort above with a Schwartzian Transform
speeds everything up significantly:
my @sorted = map $_->[0],
sort { $b->[1] <=> $a->[1] }
map [ $_, $_->total_spend ], @regulars;
Now, until the database gets significantly bigger, this is probably fast
enough - especially as you probably won't be running the script very
often.
However, if this isn't enough, then we can always drop back to SQL. Of
course, when we need to optimize for speed, we usually lose something
else - in this case, portability. In this example, we're using MySQL, so I
would add the relevant MySQL-specific query to Recipient.pm:
__PACKAGE__->set_sql(top_ten => qq{
SELECT recipient.recipid,
SUM(call.cost) AS spend,
COUNT(call.callid) AS calls
FROM recipient, call
WHERE recipient.recipid = call.recipient
GROUP BY recipient.recipid
HAVING calls > 1
ORDER BY spend DESC
LIMIT 10
});
Then we can set up a method that returns the relevant objects for these:
sub top_ten {
my $class = shift;
my $sth = $class->sql_top_ten;
$sth->execute;
return $class->sth_to_objects($sth);
}
Any SQL set up using set_sql can be retrieved as a readily prepared DBI
statement handle by prepending its name with sql_ - thus we fetch back
this top_ten using my $sth = $class->sql_top_ten ;
Although we could happily execute this and then call any of the
traditional DBI commands (fetchrow_array etc.), we can also take a
shortcut. As one of the columns being returned from our query is the
primary key of the Recipient, we can simply feed the results into the
underlying method in Class::DBI that allows searches to return a list
of objects, sth_to_objects .
So, our script becomes simply:
foreach my $recip (My::PhoneBill::Recipient->top_ten) {
printf "%s - %d calls = %d pence\n",
$recip->name || $recip->number,
scalar $recip->calls,
$recip->total_spend;
}
As we have seen, Class::DBI makes it possible to perform many of the
common database tasks completely trivially - without writing a single
line of SQL. But, when you really need it, it's fairly easy to
write the SQL that you need and execute it.
Return to Related Articles from the O'Reilly Network .
Perl.com Compilation Copyright © 1998-2003 O'Reilly & Associates, Inc. |