12.4. Veritas Disk Check
The Veritas Volume Manager is a package that allows you to manipulate
disks and their partitions. It gives you the ability to add and
remove mirrors, work with RAID arrays, and resize partitions, to name
a few things. Although Veritas is a specialized and expensive package
that is usually found at large data centers, don't assume that
you can skip this section. The point isn't to show you how to
monitor Veritas, but to show you how you can provide meaningful traps
using a typical status program. You should be able to extract the
ideas from the script we present here and use them within your own
context.
Veritas Volume Manager (vxvm) comes with a utility
called vxprint. This program displays records from
the Volume Manager configuration and shows the status of each of your
local disks. If there is an error, such as a bad disk or broken
mirror, this command will report it. For a healthy root volume,
rootvol (/), the output of vxprint looks like this:
$ vxprint -h rootvol
Disk group: rootdg
TY NAME         ASSOC       KSTATE   LENGTH  PLOFFS  STATE   TUTIL0  PUTIL0
v  rootvol      root        ENABLED  922320  -       ACTIVE  -       -
pl rootvol-01   rootvol     ENABLED  922320  -       ACTIVE  -       -
sd rootdisk-B0  rootvol-01  ENABLED  1       0       -       -       Block0
sd rootdisk-02  rootvol-01  ENABLED  922319  1       -       -       -
pl rootvol-02   rootvol     ENABLED  922320  -       ACTIVE  -       -
sd disk01-01    rootvol-02  ENABLED  922320  0       -       -       -
The KSTATE (kernel state) and
STATE columns give us a behind-the-scenes look at
our disks, mirrors, etc. Without explaining the output in detail, a
KSTATE of ENABLED is a good
sign; a STATE of ACTIVE or -
indicates that there are no problems. We can take this output and
pipe it into a script that sends SNMP traps when errors are
encountered. We can send different traps of an appropriate severity,
based on the type of error that vxprint reported.
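The rule just described can be sketched in a few lines of Perl. This is only an illustration: record_is_healthy is a hypothetical helper, not part of the chapter's script, and it applies the rule literally (KSTATE must be ENABLED; STATE must be ACTIVE or -), which is stricter than a production check is likely to be. The sample lines are taken from the output above, except for the NEEDSYNC line, which is made up to show a failure.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: decide whether one line of vxprint output
# describes a healthy record. Returns 1 for healthy, 0 otherwise.
sub record_is_healthy {
    my ($line) = @_;

    # Fields appear in the order given by the vxprint header line.
    my ($ty, $name, $assoc, $kstate, $length, $ploffs, $state) =
        split ' ', $line;

    # Nothing to judge on blank lines, the header, or the
    # "Disk group:" banner.
    return 1 if !defined $ty || $ty eq 'TY' || $ty eq 'Disk';

    # The kernel state must be ENABLED.
    return 0 unless defined $kstate && $kstate eq 'ENABLED';

    # Treat a missing STATE column the same as "-".
    $state = '-' unless defined $state;
    return ($state eq 'ACTIVE' || $state eq '-') ? 1 : 0;
}

my @sample = (
    'v rootvol root ENABLED 922320 - ACTIVE - -',
    'pl rootvol-02 rootvol ENABLED 922320 - NEEDSYNC - -',  # invented failure
);
for my $line (@sample) {
    my $name = (split ' ', $line)[1];
    printf "%-12s %s\n", $name, record_is_healthy($line) ? 'ok' : 'PROBLEM';
}
```

Splitting on ' ' rather than /\s+/ makes Perl skip any leading whitespace, so an indented record still yields the record type as the first field.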
Here's a script that runs vxprint and
analyzes the results:
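#!/usr/local/bin/perl -w
$VXPRINT_LOC = "/usr/sbin/vxprint";
$HOSTNAME = `/bin/uname -n`; chop $HOSTNAME;
$DEBUG = 0;                # 0 = quiet, 1 = messages, 2 = echo vxprint output
$SHOW_STATE_ACTIVE = 0;
# Process command-line switches; stop at the first non-switch argument
while (@ARGV && $ARGV[0] =~ /^-/)
{
    if ($ARGV[0] eq "-debug") { shift; $DEBUG = $ARGV[0]; }
    elsif ($ARGV[0] eq "-state_active") { $SHOW_STATE_ACTIVE = 1; }
    shift;
}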
####################################################################
########################### Begin Main ###########################
####################################################################
&get_vxprint; # Get it, process it, and send traps if errors found!
####################################################################
######################## Begin SubRoutines #######################
####################################################################
sub get_vxprint
{
    open(VXPRINT,"$VXPRINT_LOC |") || die "Can't Open $VXPRINT_LOC";
    while($VXLINE=<VXPRINT>)
    {
        print $VXLINE unless ($DEBUG < 2);
        if ($VXLINE ne "\n")
        {
            &is_a_disk_group_name;
            &split_vxprint_output;
            if (($TY ne "TY") &&
                ($TY ne "Disk") &&
                ($TY ne "dg") &&
                ($TY ne "dm"))
            {
                if (($SHOW_STATE_ACTIVE) && ($STATE eq "ACTIVE"))
                {
                    print "ACTIVE: $VXLINE";
                }
                if (($STATE ne "ACTIVE") &&
                    ($STATE ne "DISABLED") &&
                    ($STATE ne "SYNC") &&
                    ($STATE ne "CLEAN") &&
                    ($STATE ne "SPARE") &&
                    ($STATE ne "-") &&
                    ($STATE ne ""))
                {
                    &send_error_msgs;
                }
                elsif (($KSTATE ne "ENABLED") &&
                       ($KSTATE ne "DISABLED") &&
                       ($KSTATE ne "-") &&
                       ($KSTATE ne ""))
                {
                    &send_error_msgs;
                }
            } # end if (($TY
        } # end if ($VXLINE
    } # end while($VXLINE
} # end sub get_vxprint
sub is_a_disk_group_name
{
    if ($VXLINE =~ /^Disk\sgroup\:\s(\w+)\n/)
    {
        $DISK_GROUP = $1;
        print "Found Disk Group :$1:\n" if ($DEBUG);
        return 1;
    }
}
sub split_vxprint_output
{
    ($TY, $NAME, $ASSOC, $KSTATE,
     $LENGTH, $PLOFFS, $STATE, $TUTIL0,
     $PUTIL0) = split(/\s+/,$VXLINE);
    if ($DEBUG) {
        print "SPLIT: $TY $NAME $ASSOC $KSTATE ";
        print "$LENGTH $PLOFFS $STATE $TUTIL0 $PUTIL0:\n";
    }
}
sub send_snmp_trap
{
    $SNMP_TRAP_LOC = "/opt/OV/bin/snmptrap";
    $SNMP_COMM_NAME = "public";
    $SNMP_TRAP_HOST = "nms";
    $SNMP_ENTERPRISE_ID = ".1.3.6.1.4.1.2789.2500";
    $SNMP_GEN_TRAP = "6";
    $SNMP_SPECIFIC_TRAP = "1000";
    chop($SNMP_TIME_STAMP = "1" . `date +%H%S`);
    $SNMP_EVENT_IDENT_ONE = ".1.3.6.1.4.1.2789.2500.1000.1";
    $SNMP_EVENT_VTYPE_ONE = "octetstringascii";
    $SNMP_EVENT_VAR_ONE = "$HOSTNAME";
    $SNMP_EVENT_IDENT_TWO = ".1.3.6.1.4.1.2789.2500.1000.2";
    $SNMP_EVENT_VTYPE_TWO = "octetstringascii";
    $SNMP_EVENT_VAR_TWO = "$NAME";
    $SNMP_EVENT_IDENT_THREE = ".1.3.6.1.4.1.2789.2500.1000.3";
    $SNMP_EVENT_VTYPE_THREE = "octetstringascii";
    $SNMP_EVENT_VAR_THREE = "$STATE";
    $SNMP_EVENT_IDENT_FOUR = ".1.3.6.1.4.1.2789.2500.1000.4";
    $SNMP_EVENT_VTYPE_FOUR = "octetstringascii";
    $SNMP_EVENT_VAR_FOUR = "$DISK_GROUP";
    $SNMP_TRAP = "$SNMP_TRAP_LOC -c $SNMP_COMM_NAME $SNMP_TRAP_HOST
    $SNMP_ENTERPRISE_ID \"\" $SNMP_GEN_TRAP $SNMP_SPECIFIC_TRAP $SNMP_TIME_STAMP
    $SNMP_EVENT_IDENT_ONE $SNMP_EVENT_VTYPE_ONE \"$SNMP_EVENT_VAR_ONE\"
    $SNMP_EVENT_IDENT_TWO $SNMP_EVENT_VTYPE_TWO \"$SNMP_EVENT_VAR_TWO\"
    $SNMP_EVENT_IDENT_THREE $SNMP_EVENT_VTYPE_THREE \"$SNMP_EVENT_VAR_THREE\"
    $SNMP_EVENT_IDENT_FOUR $SNMP_EVENT_VTYPE_FOUR \"$SNMP_EVENT_VAR_FOUR\"";
    # The string above spans several lines; join them with spaces so the
    # shell sees a single command rather than one command per line
    $SNMP_TRAP =~ s/\s*\n\s*/ /g;

    # Sending a trap using Net-SNMP
    #
    #system "/usr/local/bin/snmptrap $SNMP_TRAP_HOST $SNMP_COMM_NAME
    #$SNMP_ENTERPRISE_ID '' $SNMP_GEN_TRAP $SNMP_SPECIFIC_TRAP ''
    #$SNMP_EVENT_IDENT_ONE s \"$SNMP_EVENT_VAR_ONE\"
    #$SNMP_EVENT_IDENT_TWO s \"$SNMP_EVENT_VAR_TWO\"
    #$SNMP_EVENT_IDENT_THREE s \"$SNMP_EVENT_VAR_THREE\"
    #$SNMP_EVENT_IDENT_FOUR s \"$SNMP_EVENT_VAR_FOUR\"";

    # Sending a trap using Perl
    #
    #use SNMP_util "0.54"; # This will load the BER and SNMP_Session for us
    #snmptrap("$SNMP_COMM_NAME\@$SNMP_TRAP_HOST:162", "$SNMP_ENTERPRISE_ID",
    #mylocalhostname, $SNMP_GEN_TRAP, $SNMP_SPECIFIC_TRAP,
    #"$SNMP_EVENT_IDENT_ONE", "string", "$SNMP_EVENT_VAR_ONE",
    #"$SNMP_EVENT_IDENT_TWO", "string", "$SNMP_EVENT_VAR_TWO",
    #"$SNMP_EVENT_IDENT_THREE", "string", "$SNMP_EVENT_VAR_THREE",
    #"$SNMP_EVENT_IDENT_FOUR", "string", "$SNMP_EVENT_VAR_FOUR");

    # Sending a trap using OpenView's snmptrap (using VARs from above)
    #
    $SEND_SNMP_TRAP = system($SNMP_TRAP);   # nonzero result means failure
    if($SEND_SNMP_TRAP) {
        print "Problem Running SnmpTrap with Result ";
        print ":$SEND_SNMP_TRAP: :$SNMP_TRAP:\n";
    }
} # end sub send_snmp_trap
sub send_error_msgs
{
    $TY =~ s/^v/Volume/;
    $TY =~ s/^pl/Plex/;
    $TY =~ s/^sd/SubDisk/;
    print "VXfs Problem: Host:[$HOSTNAME] State:[$STATE] DiskGroup:[$DISK_GROUP]
    Type:[$TY] FileSystem:[$NAME] Assoc:[$ASSOC] Kstate:[$KSTATE]\n"
        if ($DEBUG);
    &send_snmp_trap;
}
Knowing what the output of
vxprint should look like, we can formulate Perl
statements that decide when to generate a trap. That task makes
up most of the get_vxprint subroutine. We also
know what kinds of error states vxprint reports. The script
ignores the records for healthy disks and acts only on the
rest. For example, if the STATE field
contains NEEDSYNC, the disk mirrors are probably
not synchronized and the volume needs some sort of attention. The
script doesn't handle this particular case explicitly, but it
is caught by the default case: any STATE that isn't on the
script's list of acceptable values generates a trap.
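If you want particular states to raise their own traps rather than fall through to the default, a lookup table is one way to do it. The following is a minimal sketch, not part of the script above: trap_for_state is a hypothetical helper, and the specific-trap number 1001 is invented for illustration (only 1000 appears in the script). The list of acceptable states is copied from get_vxprint.

```perl
use strict;
use warnings;

# States the chapter's script accepts; no trap is sent for these.
my %ACCEPTABLE = map { $_ => 1 }
    ('ACTIVE', 'DISABLED', 'SYNC', 'CLEAN', 'SPARE', '-', '');

# Hypothetical specific-trap numbers for states we want to single out.
# 1001 is an invented value; extend this table to suit your site.
my %TRAP_FOR_STATE = (
    NEEDSYNC => 1001,    # mirrors out of sync: needs attention
);
my $DEFAULT_TRAP = 1000; # the script's catch-all trap

# Return the specific-trap number for a STATE value, or undef if the
# state is acceptable and no trap should be sent.
sub trap_for_state {
    my ($state) = @_;
    $state = '' unless defined $state;
    return if $ACCEPTABLE{$state};
    return exists $TRAP_FOR_STATE{$state}
        ? $TRAP_FOR_STATE{$state}
        : $DEFAULT_TRAP;
}

for my $state ('ACTIVE', 'NEEDSYNC', 'SOMETHING_ELSE') {
    my $trap = trap_for_state($state);
    printf "%-15s %s\n", $state, defined $trap ? "trap $trap" : 'no trap';
}
```

Keeping the mapping in a hash means that adding a new state is a one-line change, and anything you haven't anticipated still produces the catch-all trap.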
The actual mechanism for sending the trap is tied up in a large
number of variables. Basically, though, we use any of the trap
utilities we've discussed; the enterprise ID is
.1.3.6.1.4.1.2789.2500; the specific trap ID is
1000; and we include four variable bindings,
which report the hostname, the volume name, the volume's state,
and the disk group.
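Stripped of the variables, the trap amounts to a short argument list. The sketch below builds that list for a Net-SNMP style snmptrap command, assuming the current Net-SNMP syntax for SNMPv1 traps (-v 1 -c community host enterprise agent generic specific uptime, followed by OID/type/value triples); check the manual page for your version before relying on it. build_trap_command is a hypothetical helper, and the sample values passed to it are made up. The community name and destination host are the ones used in the script.

```perl
use strict;
use warnings;

my $ENTERPRISE = '.1.3.6.1.4.1.2789.2500';
my $SPECIFIC   = 1000;

# Build the argument list for an SNMPv1 trap carrying four string
# variable bindings: hostname, record name, state, and disk group.
sub build_trap_command {
    my ($host, $name, $state, $disk_group) = @_;
    my @values = ($host, $name, $state, $disk_group);

    # Empty strings for the agent address and uptime ask snmptrap to
    # fill in its own defaults; 6 is the enterprise-specific generic trap.
    my @cmd = ('snmptrap', '-v', '1', '-c', 'public', 'nms',
               $ENTERPRISE, '', 6, $SPECIFIC, '');

    # Bindings .1000.1 through .1000.4, each sent as a string ("s").
    for my $i (0 .. $#values) {
        push @cmd, "$ENTERPRISE.$SPECIFIC." . ($i + 1), 's', $values[$i];
    }
    return @cmd;
}

my @cmd = build_trap_command('myhost', 'rootvol-02', 'NEEDSYNC', 'rootdg');
print join(' ', map { length($_) ? $_ : '""' } @cmd), "\n";

# To actually send the trap, pass the list to system(). The list form
# bypasses the shell, so the values need no quoting:
# system(@cmd) == 0 or warn "snmptrap failed: $?\n";
```

Passing a list to system() instead of a single string avoids quoting problems if a volume or disk-group name ever contains a character that is special to the shell.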
As with the previous script, it's a simple matter to run this
script periodically and watch the results on whatever
network-management software you're using. It's also easy
to see how you could develop similar scripts that generate reports
from other status programs.
Copyright © 2002 O'Reilly & Associates. All rights reserved.