10.2. Receiving Traps
Let's
start by discussing how to deal with incoming traps. Handling
incoming traps is the responsibility of the NMS. Some NMSs do as
little as display the incoming traps to standard output
(stdout). However, an NMS typically has
the ability to react to SNMP traps it receives. For example, when an
NMS receives a linkDown trap from a router, it
might respond to the event by paging the contact person, displaying a
pop-up message on a management console, or forwarding the event to
another NMS. This procedure is streamlined in commercial packages but
can still be achieved with freely available open source programs.
10.2.1. HP OpenView
OpenView
uses three pieces of software to receive and interpret traps:
ovtrapd, xnmtrap, and xnmevents.
OpenView's main trap-handling daemon
is called ovtrapd. This program listens for
traps generated by devices on the network and hands them off to the
Postmaster daemon (pmd). In turn,
pmd triggers what OpenView calls an event.
Events can be configured to perform actions such as sending a
pop-up window to NNM users or forwarding the event to other NMSs, or
to do nothing at all. The configuration process uses
xnmtrap, the Event Configurations GUI. The
xnmevents program displays the events that have
arrived, sorting them into user-configurable categories.
OpenView
keeps a history of all the traps it has received; to retrieve that
history, use the command $OV_BIN/ovdumpevents.
Older versions of OpenView kept an event logging file in
$OV_LOG/trapd.log. By default, this file rolls
over after it grows to 4 MB. It is then renamed
trapd.log.old and a new
trapd.log file is started. If you are having
problems with traps, either because you don't know whether they
are reaching the NMS or because your NMS is being bombarded by too
many events, you can use tail -f to watch
trapd.log so you can see the traps as they
arrive. (You can also use ovdumpevents to create
a new file.) To learn more about the format of this file, refer to
OpenView's manual pages for trapd.conf(4)
and ovdumpevents(1M).
It might be helpful to define what exactly an OpenView event is.
Think of it as a small record, similar to a database record. This
record defines which trap OpenView should watch out for. It further
defines what sort of action (send an email, page someone, etc.), if
any, should be performed.
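To make the "small record" idea concrete, here is a minimal sketch of such a record as a Perl hash. The field names (name, oid, severity, category, log_message, action) are assumptions chosen for illustration; they are not OpenView's internal file format. The values come from the OV_Node_Down event discussed in this section.

```perl
use strict;
use warnings;

# Hypothetical event record; the field names are illustrative only.
my %event = (
    name        => 'OV_Node_Down',
    oid         => '.1.3.6.1.4.1.11.2.17.1.0.58916865',
    severity    => 'Warning',
    category    => 'Status Events',
    log_message => 'Node down',
    action      => undef,    # no automatic action configured
);

# Summarize what the record tells the NMS to do.
sub describe_event {
    my ($ev) = @_;
    my $action = defined $ev->{action} ? "run '$ev->{action}'" : 'take no action';
    return "On $ev->{name}: log \"$ev->{log_message}\" as $ev->{severity} "
         . "in $ev->{category} and $action";
}

print describe_event(\%event), "\n";
```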
10.2.2. Using NNM's Event Configurations
OpenView
uses an internal definition file to determine how to react to
particular situations. This definition file is maintained by the
xnmtrap program. We can start
xnmtrap by using the menu item "Options →
Event Configurations" (on the NNM GUI) or by giving the
command $OV_BIN/xnmtrap. In the Enterprise
Identification window, scroll down and click on the enterprise name
"OpenView .1.3.6.1.4.1.11.2.17.1." This displays a list
in the Event Identification window. Scroll down in this list until
you reach "OV_Node_Down." Double-click on this event to
bring up the Event Configurator (Figure 10-1).
Figure 10-1. OpenView Event Configurator -- OV_Node_Down
Figure 10-1 shows the
OV_Node_Down event in the Event Configurator.
When this event gets triggered, it inserts an entry containing the
message "Node down," with a severity level of
"Warning," into the Status Events category. OpenView
likes to have a leading 0 (zero) in the Event Object Identifier,
which indicates that this is an event or trap -- there is no way
to change this value yourself. The number before the 0 is the
enterprise OID; the number after the 0 is the specific trap number,
in this case 58916865. [42] Later we will use these numbers as
parameters when generating our own traps.
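The layout of the Event Object Identifier can be illustrated with a few lines of Perl. This is only a sketch: split_event_oid is a hypothetical helper, and the only assumption it makes is the one stated above, namely that the identifier has the form enterprise OID, then 0, then the specific trap number.

```perl
use strict;
use warnings;

# Split an Event Object Identifier of the form <enterprise>.0.<specific>
# into its enterprise OID and specific trap number.
sub split_event_oid {
    my ($event_oid) = @_;
    # The 0 must be the next-to-last component of the identifier.
    if ($event_oid =~ /^(.+)\.0\.(\d+)$/) {
        return ($1, $2);
    }
    return;    # not in the expected form
}

my ($enterprise, $specific) =
    split_event_oid('.1.3.6.1.4.1.11.2.17.1.0.58916865');
print "enterprise: $enterprise\n";
print "specific:   $specific\n";
```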
10.2.2.3. Forwarding events and event severities
The
"Forward Event" radio button, once checked, allows you to
forward an event to other NMSs. This feature is useful if you have
multiple NMSs or a distributed network-management architecture. Say
that you are based in Atlanta, but your network has a management
station in New York in addition to the one on your desk. You
don't want to receive all of New York's events, but you
would like the node_down information forwarded
to you. On New York's NMS, you could click "Forward
Event" and insert the IP address of your NMS in Atlanta. When
New York receives a node_down event, it will
forward the event to Atlanta.
The Severity drop-down list assigns
a severity level to the event. OpenView supports six severity levels:
Unknown, Normal, Warning, Minor, Major, and Critical. The severity
levels are color-coded to make identification easier; Table 10-1 shows the color associated with each severity
level. The levels are listed in order of increasing severity. For
example, an event with a severity level of Minor has a higher
precedence than an event with a severity of Warning.
Table 10-1. OpenView Severity Levels
Severity | Color
Unknown | Blue
Normal | Green
Warning | Cyan
Minor | Yellow
Major | Orange
Critical | Red
The colors are used
both on OpenView's maps and in the Event Categories. Parent
objects, which represent the starting point for a network, are
displayed in the color of the highest severity level associated with
any object underneath them.[45] For example, if an object represents a
network with 250 nodes and one of those nodes is down (a Critical
severity), the object will be colored red, regardless of how many
nodes are up and functioning normally. The term for how OpenView
displays colors in relation to objects is status source;
it is explained in more detail in Chapter 6, "Configuring Your NMS".
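The "highest severity wins" rule is easy to express in code. The following is a simplified sketch, not OpenView's implementation: it takes the severity order and colors from Table 10-1 and assumes that a parent simply takes the highest severity of its children (highest_severity is a hypothetical helper that returns Unknown for an empty list).

```perl
use strict;
use warnings;

# Severity levels in increasing order, with the colors from Table 10-1.
my @levels = qw(Unknown Normal Warning Minor Major Critical);
my %rank;
@rank{@levels} = (0 .. $#levels);
my %color = (
    Unknown => 'Blue',   Normal => 'Green',  Warning  => 'Cyan',
    Minor   => 'Yellow', Major  => 'Orange', Critical => 'Red',
);

# Return the highest severity among a list of child severities.
sub highest_severity {
    my @severities = @_;
    my $top = 'Unknown';
    for my $sev (@severities) {
        $top = $sev if $rank{$sev} > $rank{$top};
    }
    return $top;
}

# A network of 250 nodes: 249 are Normal, one is down (Critical).
my @children = (('Normal') x 249, 'Critical');
my $status   = highest_severity(@children);
print "Parent status: $status ($color{$status})\n";
```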
10.2.2.4. Log messages, notifications, and automatic actions
Returning
to Figure 10-1, the Event Log Message and Popup
Notification fields are similar, but serve different purposes. The
Event Log Message is displayed when you view the Event Categories and
select a category from the drop-down list. The Popup Notification,
which is optional, displays its message in a window that appears on
any server running OpenView's NNM. Figure 10-2
shows a typical pop-up message. The event name,
delme in this case, appears in the title bar.
The time and date at which the event occurred are followed by the
event message, "Popup Message Here." To create a pop-up
message like this, insert "Popup Message Here" in the
Popup Notification section of the Event Configurator. Every time the
event is called, a pop-up will appear.
Figure 10-2. OpenView pop-up message
The last section of the Event
Configurator is the Command for Automatic Action. The automatic
action allows you to specify a Unix command or script to execute when
OpenView receives an event. You can run multiple commands by
separating them with a semicolon, much as you would in a Unix shell.
When configuring an automatic action, remember that
rsh can be very useful. I like to use
rsh sunserver1 audioplay -v50
/opt/local/sounds/siren.au, which causes a siren audio file
to play. The automatic action can range from touching a file to
opening a trouble ticket.
In
each Event Log Message, Popup Notification, and Command for Automatic
Action, special variables can help you identify the values from your
traps or events. These variables provide the user with additional
information about the event. Here are some of the variables you can
use; the online help has a complete list:
- $1: Print the first passed attribute (i.e., the value of the first
variable binding) from the trap.
- $2: Print the second passed attribute.
- $n: Print the nth attribute as a value string. n must
be in the range of 1-99.
- $*: Print all the attributes as [seq] name (type).
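To see how this kind of substitution behaves, here is a rough Perl sketch that expands $1 through $99 and $* in a message template. It only approximates the behavior described above and is not how OpenView itself does it; expand_template, the binding keys (name, type, value), and the sample bindings are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Expand $1..$99 and $* in a message template.
# Each binding is a hash with hypothetical keys: name, type, value.
sub expand_template {
    my ($template, @bindings) = @_;

    my $lookup = sub {
        my ($var) = @_;
        if ($var eq '*') {
            # Approximation of the "[seq] name (type)" listing.
            return join ' ', map {
                sprintf('[%d] %s (%s)', $_ + 1,
                        $bindings[$_]{name}, $bindings[$_]{type})
            } 0 .. $#bindings;
        }
        # A number beyond the last binding expands to nothing.
        return $var <= @bindings ? $bindings[$var - 1]{value} : '';
    };

    $template =~ s/\$(\*|[1-9][0-9]?)/$lookup->($1)/ge;
    return $template;
}

# Sample bindings, invented for this example.
my @bindings = (
    { name => 'ifIndex', type => 'Integer',     value => 2 },
    { name => 'ifDescr', type => 'OctetString', value => 'Serial0' },
);
print expand_template('Link down on $2 (index $1)', @bindings), "\n";
```

Note the single quotes around the template: they keep Perl from interpolating $1 and $2 before the function sees them.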
WARNING:
Before you start running scripts for an
event, find out the average number of traps you are likely to receive
for that event. This is especially important for
OV_Node_Down. If you write a script that opens a
trouble ticket whenever a node goes down, you could end up with
hundreds of tickets by the end of the day. Monitoring your network
will make you painfully aware of how much your network
"flaps," or goes up and down. Even if the network goes
down for a second, for whatever reason, you'll get a trap,
which will in turn generate an event, which might register a new
ticket, send you a page, etc. The last thing you want is "The
Network That Cried Down!" You and other people on your staff
will start ignoring all the false warnings and may miss any serious
problems that arise. One way to estimate how frequently you will
receive events is to log events in a file ("Log only").
After a week or so, inspect the log file to see how many events
accumulated (i.e., the number of traps received). This is by no means
scientific, but it will give you an idea of what you can expect.
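Counting what accumulated can be automated with a short script. The sketch below tallies events by name. It assumes a made-up log format of one event per line ("epoch-seconds node event-name"); this is not the real trapd.log layout, so the parsing would have to be adapted to whatever your log actually contains. The sample lines are invented.

```perl
use strict;
use warnings;

# Count events per event name from log lines.
# ASSUMPTION: each line is "<epoch-seconds> <node> <event-name>"; this is a
# made-up format for illustration, not the real trapd.log layout.
sub count_events {
    my @lines = @_;
    my %count;
    for my $line (@lines) {
        my ($time, $node, $event) = split ' ', $line;
        next unless defined $event;    # skip malformed lines
        $count{$event}++;
    }
    return %count;
}

# Invented sample data.
my @log = (
    '1019945264 router1 OV_Node_Down',
    '1019945270 router1 OV_Node_Up',
    '1019948901 switch7 OV_Node_Down',
);
my %count = count_events(@log);
printf "%-14s %d\n", $_, $count{$_} for sort keys %count;
```

Dividing each count by the number of days covered by the log gives a rough daily rate for that event.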
10.2.4. The Event Categories Display
The Event Categories
window (Figure 10-4) is displayed on the
user's screen when NNM is started. It provides a very brief
summary of what's happening on your network; if it is set up
appropriately, you can tell at a glance whether there are any
problems you should be worrying about.
Figure 10-4. OpenView Event Categories
If the window gets
closed during an OpenView session, you can restart it using the
"Fault → Events" menu item or
by issuing the command $OV_BIN/xnmevents. The menu
displays all the event categories, including any categories you have
created. Two categories are special: the Error category is the
default category used when an event is associated with a category
that cannot be found; the All category is a placeholder for all
events and cannot be configured by the Event Configurator. The window
shows you the highest severity level of any event in each event
category.
The box to the left of
Status Events is cyan (a light blue), showing that the highest
unacknowledged severity in the Status Events category is Warning.
Clicking on that box displays an alarm browser that lists all the
events received in the category. A nice feature of the Event
Categories display is the ability to restore a browser's state
or reload events from the trapd.log and
trapd.log.old files. Reloading events is useful
if you find that you need to restore messages you deleted in the
past.
TIP:
Newer versions of OpenView extend the abilities of Event Categories
by keeping a common database of acknowledged and unacknowledged
events. Thus, when a user acknowledges an event, all other users see
this event updated.
At the bottom of Figure 10-4, the phrase "[Read-Only]" means
that you don't have write access to Event Categories. If this
phrase isn't present, you have write access. OpenView keeps
track of events on a per-user basis, using a special database located
in $OV_LOG/xnmevents.username. [46] With write access, you
have the ability to update this file whenever you exit. By default,
you have write access to your own event category database, unless
someone has already started the database by starting a session with
your username. There may be only one write-access Event Categories
per user, with the first one getting write access and all others
getting read-only privileges.
10.2.5. The Alarm Browser
Figure 10-5 shows the
alarm browser for the Status Events category. In it we see a single
Warning event, which is causing the Status Events category to show
cyan.
Figure 10-5. OpenView alarm browser
The color of the Status Events box is determined by the
highest-precedence event in the category. Therefore, the color
won't change until either you acknowledge the
highest-precedence event or an event arrives with an even higher
precedence. Clicking in the far left column (Ack) acknowledges the
message[47] and sets the severity to 0.
The
Actions menu in the alarm browser allows you to acknowledge,
deacknowledge, or delete some or all events. You can even change the
severity of an event. Keep in mind that this does
not change the severity of the event on other
Event Categories sessions that are running. For example, if one user
changes the severity of an event from Critical to Normal, the event
will remain Critical for other users. The View menu lets you define
filters, which allow you to include or discard messages that match
the filter.
When
configuring events, keep in mind that you may receive more traps than
you want. When this happens, you have two choices. First, you can go
to the agent and turn off trap generation, if the agent supports
this. Second, you can configure your trap view to ignore these traps.
We saw how to do this earlier: you can set the event to "Log
only" or try excluding the device from the Event Sources list.
If bandwidth is a concern, you should investigate why the agent is
sending out so many traps before trying to mask the problem.
10.2.6. Creating Events Within OpenView
OpenView gives you the option of
creating additional (private) events. Private events are just like
regular events, except that they belong to your private-enterprise
subtree, rather than to a public MIB. To create your own events,
launch the Event Configuration window from the Options menu of NNM.
You will see a list of all currently loaded events (Figure 10-6).
Figure 10-6. OpenView's Event Configuration
The window is divided into two panes. The top pane displays the
Enterprise Identification, which is the leftmost part of an OID.
Clicking on an enterprise ID displays all the events belonging to
that enterprise in the lower pane. To add your own enterprise ID,
select "Edit Add Enterprise Identification"
and insert your enterprise name and a registered enterprise
ID.[48] Now you're ready to create private events. Click on
the enterprise name you just created; the enterprise ID you've
associated with this name will be used to form the OID for the new
event. Click "Edit → Add → Event"; then type
the Event Name for your new event, making sure to use Enterprise
Specific (the default) for the event type. Insert an Event Object
Identifier. This identifier can be any number that hasn't
already been assigned to an event in the currently selected
enterprise. Finally, click "OK" and save the event
configuration (using "File → Save").
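The one rule to remember here is that an event number must be unique within its enterprise. As a rough sketch of that rule only (it says nothing about how OpenView stores its configuration), the following Perl keeps events in a nested hash and refuses duplicates. The add_event helper, the enterprise OID, and both event names are placeholders.

```perl
use strict;
use warnings;

# Events per enterprise: enterprise OID => { event number => event name }.
my %config;

# Add an event; refuse a number already used within the same enterprise.
sub add_event {
    my ($enterprise, $number, $name) = @_;
    return 0 if exists $config{$enterprise}{$number};
    $config{$enterprise}{$number} = $name;
    return 1;
}

# The enterprise OID and both event names are placeholders.
my $ent = '.1.3.6.1.4.1.2789';
print add_event($ent, 1, 'Example_Disk_Full') ? "added\n" : "duplicate\n";
print add_event($ent, 1, 'Example_Fan_Fail')  ? "added\n" : "duplicate\n";
```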
To copy an existing event, click on the event you wish to copy and
select "Edit → Copy Event"; you'll see a new
window with the event you selected. From this point on, the process
is the same.
Traps with "no format"
are traps for which nothing has been defined in the Event
Configuration window. There are two ways to solve this problem: you
can either create the necessary events on your own or you can load a
MIB that contains the necessary trap definitions, as discussed in
Chapter 6, "Configuring Your NMS". "No format" traps
are frequently traps defined in a vendor-specific MIB that
hasn't been loaded. Loading the appropriate MIB often fixes the
problem by defining the vendor's traps and their associated
names, IDs, comments, severity levels, etc.
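In other words, a "no format" trap is simply a failed lookup. The Perl sketch below shows the idea with a one-entry table keyed on enterprise OID and specific trap number. The table layout and the event_name helper are assumptions for illustration; the OV_Node_Down entry and the second trap's numbers are taken from examples in this section.

```perl
use strict;
use warnings;

# Configured events, keyed by "<enterprise>/<specific>".
# The single entry is the OV_Node_Down example; a real
# configuration would hold many more.
my %events = (
    '.1.3.6.1.4.1.11.2.17.1/58916865' => 'OV_Node_Down',
);

# Return the event name for a trap, or "no format" if nothing is defined.
sub event_name {
    my ($enterprise, $specific) = @_;
    my $key = "$enterprise/$specific";
    return exists $events{$key} ? $events{$key} : 'no format';
}

print event_name('.1.3.6.1.4.1.11.2.17.1', 58916865), "\n";   # defined event
print event_name('.1.3.6.1.4.1.2789.2500', 5247), "\n";       # nothing defined
```

Creating the event by hand or loading the vendor's MIB both amount to adding the missing entry to the table.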
TIP:
Before loading a MIB, review the types of
traps the MIB supports. You will find that most traps you load come,
by default, in LOGONLY mode. This means that you will
not be notified when the traps come in. After
you load the MIB you may want to edit the events it defines,
specifying the local configuration that best fits your
site.
10.2.7. Monitoring Traps with Perl
If
you can't afford an expensive package like OpenView, you can
use the Perl language to write your own monitoring and logging
utility. You get what you pay for, since you will have to write
almost everything from scratch. But you'll learn a lot and
probably get a better appreciation for the finer points of network
management. One of the most elementary, but effective, programs to
receive traps is in a distribution of SNMP Support for Perl 5,
written by Simon Leinen. Here's a modified version of
Simon's program:
#!/usr/local/bin/perl
use strict;
use SNMP_Session "0.60";
use BER;
use Socket;

# Listen for incoming SNMPv1 traps and print each one as it arrives.
my $session = SNMPv1_Session->open_trap_session ( );
while (my ($trap, $sender, $sender_port) = $session->receive_trap ( ))
{
    chomp (my $DATE=`/bin/date \'+%a %b %e %T\'`);
    print STDERR "$DATE - " . inet_ntoa($sender) . " - port: $sender_port\n";
    print_trap ($session, $trap);
}
1;

# Decode a trap PDU and print its fields and variable bindings.
sub print_trap ($$) {
    my ($this, $trap) = @_;
    my ($community, $ent, $agent, $gen, $spec, $dt, @bindings) =
        $this->decode_trap_request ($trap);
    print " community:\t".$community."\n";
    print " enterprise:\t".BER::pretty_oid ($ent)."\n";
    print " agent addr:\t".inet_ntoa ($agent)."\n";
    print " generic ID:\t$gen\n";
    print " specific ID:\t$spec\n";
    print " uptime:\t".BER::pretty_uptime_value ($dt)."\n";
    my $prefix = " bindings:\t";
    # Each binding is a BER-encoded (OID, value) pair.
    foreach my $encoded_pair (@bindings)
    {
        my ($oid, $value) = decode_by_template ($encoded_pair, "%{%O%@");
        #next unless defined $oid;
        print $prefix.BER::pretty_oid ($oid)." => ".pretty_print ($value)."\n";
        $prefix = " ";
    }
}
This
program displays traps as they are received from different devices in
the network. Here's some output, showing two traps:
Mon Apr 28 22:07:44 - 10.123.46.26 - port: 63968
community: public
enterprise: 1.3.6.1.4.1.2789.2500
agent addr: 10.123.46.26
generic ID: 6
specific ID: 5247
uptime: 0:00:00
bindings: 1.3.6.1.4.1.2789.2500.1234 => 14264026886
Mon Apr 28 22:09:46 - 172.16.51.25 - port: 63970
community: public
enterprise: 1.3.6.1.4.1.2789.2500
agent addr: 172.16.253.2
generic ID: 6
specific ID: 5247
uptime: 0:00:00
bindings: 1.3.6.1.4.1.2789.2500.2468 => Hot Swap Now In Sync
The output format is the same
for both traps. The first line shows the date and time at which the
trap was received, together with the IP address and source port of the
device that sent the trap. Most of the remaining output items should be familiar to
you. The bindings output item lists the variable
bindings that were sent in the trap PDU. In the example above, each
trap contained one variable binding. The object ID is in numeric
form, which isn't particularly friendly. If a trap has more
than one variable binding, this program displays each binding, one
after another.
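If numeric object IDs bother you, one simple remedy is a lookup table that swaps the longest known OID prefix for a name. The sketch below shows the idea. The friendly_oid helper and both names in the table are made up for illustration; a real table would be built from the MIBs that define the objects.

```perl
use strict;
use warnings;

# Hypothetical names for OID prefixes; a real table would be built
# from the MIBs that define these objects.
my %oid_names = (
    '1.3.6.1.4.1.2789'      => 'exampleEnterprise',
    '1.3.6.1.4.1.2789.2500' => 'exampleTraps',
);

# Replace the longest known prefix of a numeric OID with its name.
sub friendly_oid {
    my ($oid) = @_;
    for my $prefix (sort { length($b) <=> length($a) } keys %oid_names) {
        return $oid_names{$prefix} if $oid eq $prefix;
        my $len = length $prefix;
        # Match only on a component boundary (prefix followed by a dot).
        if (substr($oid, 0, $len + 1) eq "$prefix.") {
            return $oid_names{$prefix} . substr($oid, $len);
        }
    }
    return $oid;    # no known prefix; leave the OID numeric
}

print friendly_oid('1.3.6.1.4.1.2789.2500.1234'), "\n";
```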
An ad hoc monitoring system can be fashioned by using this Perl
script to collect traps and some other program to inspect the traps
as they are received. Once the traps are parsed, the possibilities
are endless. You can write user-defined rules that watch for
significant traps and, when triggered, send an email alert, update an
event database, send a message to a pager, etc. These kinds of
solutions work well if you're in a business with little or no
budget for commercially available NMS software or if you're on
a small network and don't need a heavyweight management
tool.
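As a starting point for such rules, here is a minimal sketch of a rule table in Perl. Each rule pairs a test on a decoded trap with an action. The trap fields (enterprise, agent, generic, specific), the rule names, and the actions are all assumptions for illustration; here the actions only push a string onto a list, where a real script might send email or page someone.

```perl
use strict;
use warnings;

# Each rule pairs a test on a decoded trap with an action to run.
# The trap fields and both actions are hypothetical.
my @alerts;    # stands in for an outbound notification queue
my @rules = (
    {
        name   => 'specific trap 5247',
        match  => sub { $_[0]{enterprise} eq '1.3.6.1.4.1.2789.2500'
                        && $_[0]{specific} == 5247 },
        action => sub { push @alerts, "ALERT: specific trap 5247 from $_[0]{agent}" },
    },
    {
        name   => 'linkDown',
        match  => sub { $_[0]{generic} == 2 },    # generic trap 2 is linkDown
        action => sub { push @alerts, "ALERT: linkDown from $_[0]{agent}" },
    },
);

# Run every matching rule against a trap; return how many fired.
sub apply_rules {
    my ($trap) = @_;
    my $fired = 0;
    for my $rule (@rules) {
        next unless $rule->{match}->($trap);
        $rule->{action}->($trap);
        $fired++;
    }
    return $fired;
}

# A decoded trap resembling the first one in the sample output.
my %trap = (
    enterprise => '1.3.6.1.4.1.2789.2500',
    agent      => '10.123.46.26',
    generic    => 6,
    specific   => 5247,
);
apply_rules(\%trap);
print "$_\n" for @alerts;
```

Keeping the rules in a data structure rather than in a chain of if statements means new rules can be added without touching the loop that applies them.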
Copyright © 2002 O'Reilly & Associates. All rights reserved.