7.6. Monitoring Server HealthFor our final set of examples, we'll take a look at several ways to monitor the health of an SQL server. This sort of routine monitoring is similar in nature to the network service monitoring we saw in Chapter 5, "TCP/IP Name Services". 7.6.1. Space MonitoringTo get technical for a moment, database servers are places to hold stuff. If you run out of space to hold stuff, this is known as either "a bad thing" or "a very bad thing." As a result, programs that help us monitor the amount of space allocated and used on a server are very useful indeed. Let's look at a DBI program designed to look at the space situation on a Sybase server. Here's a snippet of output from a program that shows graphically how space is used in each database on the server. Each section shows a bar chart of the percentage of allocated data and log space in use in a database. In the following chart, d stands for data space and l stands for log space. For each bar the percentage of space used and the total available space is indicated: |ddddddd |15.23%/5MB hpotter--------| | | |0.90%/5MB |ddddddd |15.23%/5MB dumbledore-----| | | |1.52%/5MB |dddddddd |16.48%/5MB hgranger-------| | | |1.52%/5MB |ddddddd |15.23%/5MB rweasley-------| | |l |3.40%/5MB |ddddddddddddddddddddddddddd |54.39%/2MB hagrid---------| | |- no log | Here's how we generated this output: use DBI; $admin = 'sa'; print "Enter passwd for $admin: "; chomp($pw = <STDIN>); $pages = 2; # data is stored in 2k pages # connect to the server $dbh = DBI->connect('dbi:Sybase:',$admin,$pw); die "Unable to connect: $DBI::errstr\n" unless (defined $dbh); # get the name of the databases on the server $sth = $dbh->prepare(q{SELECT name from sysdatabases}) or die "Unable to prepare sysdatabases query: ".$dbh->errstr."\n"; $sth->execute or die "Unable to execute sysdatabases query: ".$dbh->errstr."\n"; while ($aref = $sth->fetchrow_arrayref) { push(@dbs, $aref->[0]); } $sth->finish; # retrieve stats for each of the databases foreach $db (@dbs) { # get and total the size column from all non-log segments $size = &querysum(qq{SELECT size FROM master.dbo.sysusages WHERE dbid = db_id(\'$db\') AND segmap != 4}); # get and total the size column for the log segment $logsize = &querysum(qq{SELECT size FROM master.dbo.sysusages WHERE dbid = db_id(\'$db\') AND segmap = 4}); # change to the database and retrieve usage stats $dbh->do(q{use $db}) or die "Unable to change to $db: ".$dbh->errstr."\n"; # we used the reserved_pgs function to return the number of pages # used by both the data (doampg) and index (ioampg) part # of the database $used=&querysum(q{SELECT reserved_pgs(id,doampg)+reserved_pgs(id,ioampg) FROM sysindexes WHERE id != 8}); # same, except this time we look at the log usage $logused=&querysum(q{SELECT reserved_pgs(id, doampg) FROM sysindexes WHERE id=8}); # show this information graphically &graph($db,$size,$logsize,$used,$logused); } $dbh->disconnect; # prepare/exec a given single-column SELECT query, return # the sum of the results sub querysum { my($query) = shift; my($sth,$aref,$sum); $sth = $dbh->prepare($query) or die "Unable to prepare $query: ".$dbh->errstr."\n"; $sth->execute or die "Unable to exec $query: ".$dbh->errstr."\n"; while ($aref=$sth->fetchrow_arrayref) { $sum += $aref->[0]; } $sth->finish; $sum; } # print out nice chart given database name, size, log size, # and usage info sub graph { my($dbname,$size,$logsize,$used,$logused) = @_; # line for data space usage print ' 'x15 . '|'.'d'x (50 *($used/$size)) . ' 'x (50-(50*($used/$size))) . '|'; # percentage used and total M for data space printf("%.2f",($used/$size*100)); print "%/". (($size * $pages)/1024)."MB\n"; print $dbname.'-'x(14-length($dbname)).'-|'.(' 'x 49)."|\n"; if (defined $logsize) { # line for log space usage print ' 'x15 . '|' . 'l'x (50 *($logused/$logsize)) . ' 'x (50-(50*($logused/$logsize))) . '|'; # percentage used and total M for log space printf("%.2f",($logused/$logsize*100)); print "%/". (($logsize * $pages)/1024)."MB\n"; } else { # some databases do not have separate log space print ' 'x15 . "|- no log".(' 'x 41)."|\n"; } print "\n"; } SQL-savvy folks will probably wonder why a special subroutine (querysum) is being invoked to total the contents of a single column instead of using the perfectly good SUM calculation operator in SQL. querysum( ) is just meant to be an example of the sort of manipulation on the fly one can do from Perl. A Perl subroutine is probably more appropriate for more complex jobs. For instance, if we need to keep separate running totals based on a regular expression, that's probably best done from Perl rather than asking the server to perform the tabulation (even if it could). 7.6.2. Monitoring the CPU Health of a SQL ServerFor the final example of this chapter, we'll use DBI to show us a minute-by-minute status line display of the CPU health of a SQL server. Just to make it more interesting, we'll monitor two separate servers simultaneously from the same script. We'll comment on this code in a moment: use DBI; $syadmin = "sa"; print "Sybase admin passwd: "; chomp($sypw = <STDIN>); $msadmin = "sa"; print "MS-SQL admin passwd: "; chomp($mspw = <STDIN>); # connect to Sybase server $sydbh = DBI->connect("dbi:Sybase:server=SYBASE",$syadmin,$sypw); die "Unable to connect to sybase server: $DBI::errstr\n" unless (defined $sydbh); # turn on the ChopBlanks option to remove trailing whitespace in columns $sydbh->{ChopBlanks} = 1; # connect to MS-SQL server (handy that we can use DBD::Sybase for this!) $msdbh = DBI->connect("dbi:Sybase:server=MSSQL",$msadmin,$mspw); die "Unable to connect to mssql server: $DBI::errstr\n" unless (defined $msdbh); # turn on the ChopBlanks option to remove trailing whitespace in columns $msdbh->{ChopBlanks} = 1; $|=1; # turn off STDOUT IO buffering # initialize the signal handler so we can cleanup nicely $SIG{INT} = sub {$byebye = 1;}; # infinitely loop unless our interrupt flag has been set while (1) { last if ($byebye); # run the stored procedure sp_monitor $systh = $sydbh->prepare(q{sp_monitor}) or die "Unable to prepare sy sp_monitor:".$sydbh->errstr."\n"; $systh->execute or die "Unable to execute sy sp_monitor:".$sydbh->errstr."\n"; # loop to retrieve the lines from the output we need. # We know we have all of it when we see the cpu_busy information while($href = $systh->fetchrow_hashref or $systh->{syb_more_results}) { # got what we needed, stop asking last if (defined $href->{cpu_busy}); } $systh->finish; # substitute out everything but the % number from # the values we receive for (keys %{$href}) { $href->{$_} =~ s/.*-(\d+%)/\1/; } # collect all the data we need into a single line $info = "Sybase: (".$href->{cpu_busy}." CPU), ". "(".$href->{io_busy}." IO), ". "(".$href->{idle}." idle) "; # ok, now let's do it all over again for the second server (MS-SQL) $mssth = $msdbh->prepare(q{sp_monitor}) or die "Unable to prepare ms sp_monitor:".$msdbh->errstr."\n"; $mssth->execute or die "Unable to execute ms sp_monitor:".$msdbh->errstr."\n"; while($href = $mssth->fetchrow_hashref or $mssth->{syb_more_results}) { # got what we needed, stop asking last if (defined $href->{cpu_busy}); } $mssth->finish; # substitute out everything but the % number from # the values we receive for (keys %{$href}) { $href->{$_} =~ s/.*-(\d+%)/\1/; } $info .= "MSSQL: (" . $href->{'cpu_busy'}." CPU), ". "(".$href->{'io_busy'}." IO), ". "(".$href->{'idle'}." idle)"; print " "x78,"\r"; print $info,"\r"; sleep(5) unless ($byebye); } # only end up here if we've broken out of the loop thanks to an interrupt $sydbh->disconnect; $msdbh->disconnect; This script keeps this line on your screen, which is refreshed every five seconds: Sybase: (33% CPU), (33% IO), (0% idle) MSSQL: (0% CPU), (0% IO), (100% idle) The heart of this program is the stored procedure sp_monitor that exists both on Sybase and MS-SQL. sp_monitor's output looks like this: last_run current_run seconds --------------------------- --------------------------- ----------- Aug 3 1998 12:05AM Aug 3 1998 12:05AM 1 cpu_busy io_busy idle ------------------------- ------------------------- ------------------------- 0(0)-0% 0(0)-0% 40335(0)-0% packets_received packets_sent packet_errors ------------------------- ------------------------- ------------------------- 1648(0) 1635(0) 0(0) total_read total_write total_errors connections ------------------- ------------------- ------------------- ------------------ 391(0) 180(0) 0(0) 11(0) Unfortunately, sp_monitor exposes a nonportable Sybase-ism that was carried over to MS-SQL: multiple result sets. Each of the lines returned comes back as a separate result set. DBD::Sybase handles this by setting a special statement attribute. That's why you see this test: while($href = $systh->fetchrow_hashref or $systh->{syb_more_results}) { and why we exit this loop early once we've seen the columns we're looking for: # got what we needed, stop asking last if (defined $href->{cpu_busy}); The program itself loops forever until it receives an interrupt signal (most likely from the user pressing Ctrl-C). When we receive this signal, we do the safest thing possible in a signal handler and set an exit flag. This is the technique recommended by the perlipc manpage for safe signal handling. Receiving the INT signal will set a flag that punts us out of the loop on the next iteration. Catching this signal allows the program to nicely close its database handles before shuffling off this mortal coil. This small program just scratches the surface of the sort of server monitoring we can do. It would be easy to take the results we get from sp_monitor and graph them over time to get a better notion of how our server is being used. Let creeping featurism be your muse. Copyright © 2001 O'Reilly & Associates. All rights reserved. |
|