12.4. Veritas Disk CheckThe Veritas Volume Manager is a package that allows you to manipulate disks and their partitions. It gives you the ability to add and remove mirrors, work with RAID arrays, and resize partitions, to name a few things. Although Veritas is a specialized and expensive package that is usually found at large data centers, don't assume that you can skip this section. The point isn't to show you how to monitor Veritas, but to show you how you can provide meaningful traps using a typical status program. You should be able to extract the ideas from the script we present here and use them within your own context. Veritas Volume Manager (vxvm) comes with a utility called vxprint. This program displays records from the Volume Manager configuration and shows the status of each of your local disks. If there is an error, such as a bad disk or broken mirror, this command will report it. A healthy vxprint on the rootvol (/) looks like this:The KSTATE (kernel state) and STATE columns give us a behind-the-scenes look at our disks, mirrors, etc. Without explaining the output in detail, a KSTATE of ENABLED is a good sign; a STATE of ACTIVE or - indicates that there are no problems. We can take this output and pipe it into a script that sends SNMP traps when errors are encountered. We can send different traps of an appropriate severity, based on the type of error that vxprint reported. Here's a script that runs vxprint and analyzes the results:$ vxprint -h rootvol Disk group: rootdg TY NAME ASSOC KSTATE LENGTH PLOFFS STATE TUTIL0 PUTIL0 v rootvol root ENABLED 922320 - ACTIVE - - pl rootvol-01 rootvol ENABLED 922320 - ACTIVE - - sd rootdisk-B0 rootvol-01 ENABLED 1 0 - - Block0 sd rootdisk-02 rootvol-01 ENABLED 922319 1 - - - pl rootvol-02 rootvol ENABLED 922320 - ACTIVE - - sd disk01-01 rootvol-02 ENABLED 922320 0 - - - Knowing what the output of vxprint should look like, we can formulate Perl statements that figure out when to generate a trap. That task makes up most of the get_vxprint subroutine. We also know what types of error messages will be produced. Our script tries to ignore all the information from the healthy disks and sort the error messages. For example, if the STATE field contains NEEDSYNC, the disk mirrors are probably not synchronized and the volume needs some sort of attention. The script doesn't handle this particular case explicitly, but it is caught with the default entry. The actual mechanism for sending the trap is tied up in a large number of variables. Basically, though, we use any of the trap utilities we've discussed; the enterprise ID is .1.3.6.1.4.1.2789.2500 ; the specific trap ID is 1000 ; and we include four variable bindings, which report the hostname, the volume name, the volume's state, and the disk group. As with the previous script, it's a simple matter to run this script periodically and watch the results on whatever network-management software you're using. It's also easy to see how you could develop similar scripts that generate reports from other status programs.#!/usr/local/bin/perl -wc $VXPRINT_LOC = "/usr/sbin/vxprint"; $HOSTNAME = `/bin/uname -n`; chop $HOSTNAME; while ($ARGV[0] =~ /^-/) { if ($ARGV[0] eq "-debug") { shift; $DEBUG = $ARGV[0]; } elsif ($ARGV[0] eq "-state_active") { $SHOW_STATE_ACTIVE = 1; } shift; } #################################################################### ########################### Begin Main ########################### #################################################################### &get_vxprint; # Get it, process it, and send traps if errors found! #################################################################### ######################## Begin SubRoutines ####################### #################################################################### sub get_vxprint { open(VXPRINT,"$VXPRINT_LOC |") || die "Can't Open $VXPRINT_LOC"; while($VXLINE=<VXPRINT>) { print $VXLINE unless ($DEBUG < 2); if ($VXLINE ne "\n") { &is_a_disk_group_name; &split_vxprint_output; if (($TY ne "TY") && ($TY ne "Disk") && ($TY ne "dg") && ($TY ne "dm")) { if (($SHOW_STATE_ACTIVE) && ($STATE eq "ACTIVE")) { print "ACTIVE: $VXLINE"; } if (($STATE ne "ACTIVE") && ($STATE ne "DISABLED") && ($STATE ne "SYNC") && ($STATE ne "CLEAN") && ($STATE ne "SPARE") && ($STATE ne "-") && ($STATE ne "")) { &send_error_msgs; } elsif (($KSTATE ne "ENABLED") && ($KSTATE ne "DISABLED") && ($KSTATE ne "-") && ($KSTATE ne "")) { &send_error_msgs; } } # end if (($TY } # end if ($VXLINE } # end while($VXLINE } # end sub get_vxprint sub is_a_disk_group_name { if ($VXLINE =~ /^Disk\sgroup\:\s(\w+)\n/) { $DISK_GROUP = $1; print "Found Disk Group :$1:\n" unless (!($DEBUG)); return 1; } } sub split_vxprint_output { ($TY, $NAME, $ASSOC, $KSTATE, $LENGTH, $PLOFFS, $STATE, $TUTIL0, $PUTIL0) = split(/\s+/,$VXLINE); if ($DEBUG) { print "SPLIT: $TY $NAME $ASSOC $KSTATE "; print "$LENGTH $PLOFFS $STATE $TUTIL0 $PUTIL0:\n"; } } sub send_snmp_trap { $SNMP_TRAP_LOC = "/opt/OV/bin/snmptrap"; $SNMP_COMM_NAME = "public"; $SNMP_TRAP_HOST = "nms"; $SNMP_ENTERPRISE_ID = ".1.3.6.1.4.1.2789.2500"; $SNMP_GEN_TRAP = "6"; $SNMP_SPECIFIC_TRAP = "1000"; chop($SNMP_TIME_STAMP = "1" . `date +%H%S`); $SNMP_EVENT_IDENT_ONE = ".1.3.6.1.4.1.2789.2500.1000.1"; $SNMP_EVENT_VTYPE_ONE = "octetstringascii"; $SNMP_EVENT_VAR_ONE = "$HOSTNAME"; $SNMP_EVENT_IDENT_TWO = ".1.3.6.1.4.1.2789.2500.1000.2"; $SNMP_EVENT_VTYPE_TWO = "octetstringascii"; $SNMP_EVENT_VAR_TWO = "$NAME"; $SNMP_EVENT_IDENT_THREE = ".1.3.6.1.4.1.2789.2500.1000.3"; $SNMP_EVENT_VTYPE_THREE = "octetstringascii"; $SNMP_EVENT_VAR_THREE = "$STATE"; $SNMP_EVENT_IDENT_FOUR = ".1.3.6.1.4.1.2789.2500.1000.4"; $SNMP_EVENT_VTYPE_FOUR = "octetstringascii"; $SNMP_EVENT_VAR_FOUR = "$DISK_GROUP"; $SNMP_TRAP = "$SNMP_TRAP_LOC \-c $SNMP_COMM_NAME $SNMP_TRAP_HOST $SNMP_ENTERPRISE_ID \"\" $SNMP_GEN_TRAP $SNMP_SPECIFIC_TRAP $SNMP_TIME_STAMP $SNMP_EVENT_IDENT_ONE $SNMP_EVENT_VTYPE_ONE \"$SNMP_EVENT_VAR_ONE\" $SNMP_EVENT_IDENT_TWO $SNMP_EVENT_VTYPE_TWO \"$SNMP_EVENT_VAR_TWO\" $SNMP_EVENT_IDENT_THREE $SNMP_EVENT_VTYPE_THREE \"$SNMP_EVENT_VAR_THREE\" $SNMP_EVENT_IDENT_FOUR $SNMP_EVENT_VTYPE_FOUR \"$SNMP_EVENT_VAR_FOUR\""; # Sending a trap using Net-SNMP # #system "/usr/local/bin/snmptrap $SNMP_TRAP_HOST $SNMP_COMM_NAME #$SNMP_ENTERPRISE_ID '' $SNMP_GEN_TRAP $SNMP_SPECIFIC_TRAP '' #$SNMP_EVENT_IDENT_ONE s \"$SNMP_EVENT_VAR_ONE\" #$SNMP_EVENT_IDENT_TWO s \"$SNMP_EVENT_VAR_TWO\" #$SNMP_EVENT_IDENT_THREE s \"$SNMP_EVENT_VAR_THREE\" #$SNMP_EVENT_IDENT_FOUR s \"$SNMP_EVENT_VAR_FOUR\""; # Sending a trap using Perl # #use SNMP_util "0.54"; # This will load the BER and SNMP_Session for us #snmptrap("$SNMP_COMM_NAME\@$SNMP_TRAP_HOST:162", "$SNMP_ENTERPRISE_ID", #mylocalhostname, $SNMP_GEN_TRAP, $SNMP_SPECIFIC_TRAP, #"$SNMP_EVENT_IDENT_ONE", "string", "$SNMP_EVENT_VAR_ONE", #"$SNMP_EVENT_IDENT_TWO", "string", "$SNMP_EVENT_VAR_TWO", #"$SNMP_EVENT_IDENT_THREE", "string", "$SNMP_EVENT_VAR_THREE", #"$SNMP_EVENT_IDENT_FOUR", "string", "$SNMP_EVENT_VAR_FOUR"); # Sending a trap using OpenView's snmptrap (using VARs from above) # if($SEND_SNMP_TRAP) { print "Problem Running SnmpTrap with Result "; print ":$SEND_SNMP_TRAP: :$SNMP_TRAP:\n"; } sub send_error_msgs { $TY =~ s/^v/Volume/; $TY =~ s/^pl/Plex/; $TY =~ s/^sd/SubDisk/; print "VXfs Problem: Host:[$HOSTNAME] State:[$STATE] DiskGroup:[$DISK_GROUP] Type:[$TY] FileSystem:[$NAME] Assoc:[$ASSOC] Kstate:[$KSTATE]\n" unless (!($DEBUG)); &send_snmp_trap; } Copyright © 2002 O'Reilly & Associates. All rights reserved. |
|