Chapter 11. Extensible SNMP Agents
There will come a time when you want to extend an agent's functionality. Extending an agent usually means adding or changing the MIBs the agent supports. Many agents that claim to support SNMP cover only a minimal number of somewhat useless MIBs -- obviously a frustrating situation for someone who is planning on doing lots of automated network management. Upgrading your software to a newer version of SNMP, say Version 2 or 3, won't help; you won't get any more information out of a device than if you were using SNMPv1. The newer versions of SNMP add features to the protocol (such as additional security or more sophisticated options for retrieving and setting values), but the information that's available from any device is defined in the agent's MIBs, which are independent of the protocol itself.
When you are faced with an agent's limitations, you can turn to extensible agents. These programs, or extensions to existing programs, allow you to extend a particular agent's MIB and retrieve values from an external source (a script, program, or file). In some cases, data can be returned as if it were coming from the agent itself. Most of the time you will not see a difference between the agent's native MIBs and your extensible ones. Many extensible agents give you the ability to read files, run programs, and return their results; they can even return tables of information. Some agents have configurable options that allow you to run external programs and have preset functions, such as disk-space checkers, built in.
We don't make a distinction between existing agents that can be extended and agents that exist purely to support extensions. We'll call them both "extensible agents."
The OpenView, Net-SNMP, and SystemEDGE agents are all examples of extensible agents. OpenView provides a separate extensible agent that allows you to extend the master agent (snmpdm); requests for the extensible agent won't work unless the master agent is running. You can start and stop the extensible agent without disturbing the master agent. To customize the extensible agent you define new objects using the ASN.1 format, as specified by the SMI. The Net-SNMP agent takes an alternate approach. It doesn't make a distinction between the master agent and the extensible agent; there's only one agent to worry about. You can use ASN.1 to define new objects (as with the OpenView extensible agent), but there's also a facility for adding extensions without writing any ASN.1, making this agent significantly more accessible for the novice administrator. SystemEDGE is similar to Net-SNMP in that there is only one agent to worry about. Of the three agents discussed in this chapter, it is the easiest to extend. Figure 11-1 compares the design strategies of the OpenView, Net-SNMP, and SystemEDGE agents.
Figure 11-1. Architecture of extensible agents
All three agents have fairly comprehensive configuration options and all allow you to extend the local agent without heavy programming. You may need to write some scripts or a few short C programs, but with the sample programs here and the thousands more that are on the Internet, nonprogrammers can still get a lot done.
See Chapter 1, "What Is SNMP?" for a list of a few web sites that have links to commercial and free SNMP software.
We'll start with the Net-SNMP agent, since it is the simplest, then move to SystemEDGE. We'll round out the discussion with OpenView's extensible agent. Be sure to see Chapter 5, "Network-Management Software" for information on where to obtain these agents.
When you install the Net-SNMP package, it creates a sample snmpd.conf configuration file called EXAMPLE.conf in the source directory. This file contains some great examples that demonstrate how to extend your agent. Read through it to see the types of things you can and can't do. We will touch on only a few of Net-SNMP's features: checking for any number of running processes (proc), executing a command that returns a single line of output (exec), executing a command that returns multiple lines of output (exec), and checking disk-space utilization (disk).
The main Net-SNMP configuration file can be found at $NET_SNMP_HOME/share/snmp/snmpd.conf, where $NET_SNMP_HOME is the directory in which you installed Net-SNMP. Here is the configuration file that we will use for the remainder of this section:
    # Filename: $NET_SNMP_HOME/share/snmp/snmpd.conf

    # Check for processes running
    # Items in here will appear in the ucdavis.procTable
    proc sendmail 10 1
    proc httpd

    # Return the value from the executed program with a passed parm.
    # Items in here will appear in the ucdavis.extTable
    exec FileCheck /opt/local/shell_scripts/filecheck.sh /tmp/vxprint.error

    # Multiline return from the command
    # This needs its own OID
    # I have used a subset of my registered enterprise ID (2789) within the OID
    exec .1.3.6.1.4.1.2021.2789.51 FancyCheck /opt/local/shell_scripts/fancycheck.sh /core

    # Check disks for their mins
    disk / 100000

Whenever you make changes to the Net-SNMP agent's configuration file, you can have it reread the configuration by sending the process an HUP signal:
    $ ps -ef | grep snmpd
    root 12345 1 0 Nov 16 ? 2:35 /usr/local/bin/snmpd
    $ kill -HUP 12345

Now let's look at the file itself. The first proc command says to check for the process sendmail. The numbers 10 and 1 define how many sendmail processes we want running at any given time (a maximum of 10 and a minimum of 1). The second proc command says that we want at least one httpd process running. To see what effect these commands have on our agent, let's look at an snmpwalk of ucdavis.procTable (.1.3.6.1.4.1.2021.2):
    $ snmpwalk sunserver2 public .1.3.6.1.4.1.2021.2
    enterprises.ucdavis.procTable.prEntry.prIndex.1 = 1
    enterprises.ucdavis.procTable.prEntry.prIndex.2 = 2
    enterprises.ucdavis.procTable.prEntry.prNames.1 = "sendmail"
    enterprises.ucdavis.procTable.prEntry.prNames.2 = "httpd"
    enterprises.ucdavis.procTable.prEntry.prMin.1 = 1
    enterprises.ucdavis.procTable.prEntry.prMin.2 = 0
    enterprises.ucdavis.procTable.prEntry.prMax.1 = 10
    enterprises.ucdavis.procTable.prEntry.prMax.2 = 0
    enterprises.ucdavis.procTable.prEntry.prCount.1 = 1
    enterprises.ucdavis.procTable.prEntry.prCount.2 = 6
    enterprises.ucdavis.procTable.prEntry.prErrorFlag.1 = 0
    enterprises.ucdavis.procTable.prEntry.prErrorFlag.2 = 0
    enterprises.ucdavis.procTable.prEntry.prErrMessage.1 = ""
    enterprises.ucdavis.procTable.prEntry.prErrMessage.2 = ""
    enterprises.ucdavis.procTable.prEntry.prErrFix.1 = 0
    enterprises.ucdavis.procTable.prEntry.prErrFix.2 = 0

The agent returns the contents of the procTable. In this table, the sendmail and httpd process entries occupy instances 1 and 2. prMin and prMax are the minimum and maximum numbers we set for the sendmail and httpd processes. (When prMin and prMax are both 0, as they are for httpd, the agent requires at least one running process and places no upper limit on the count.) The prCount value gives us the number of processes currently running: it looks like we have one sendmail process and six httpd processes. To see what happens when the number of processes falls outside the range we specified, let's kill all six httpd processes and look at the procTable again (instead of listing the whole table, we'll show only the entry that describes the httpd process):
    $ snmpwalk sunserver2 public .1.3.6.1.4.1.2021.2
    enterprises.ucdavis.procTable.prEntry.prIndex.1 = 1
    enterprises.ucdavis.procTable.prEntry.prNames.1 = "httpd"
    enterprises.ucdavis.procTable.prEntry.prMin.1 = 0
    enterprises.ucdavis.procTable.prEntry.prMax.1 = 0
    enterprises.ucdavis.procTable.prEntry.prCount.1 = 0
    enterprises.ucdavis.procTable.prEntry.prErrorFlag.1 = 1
    enterprises.ucdavis.procTable.prEntry.prErrMessage.1 = "No httpd process running."
    enterprises.ucdavis.procTable.prEntry.prErrFix.1 = 0

We had six httpd processes running and now, per prCount, we have none. The prErrMessage reports the problem, and the prErrorFlag has changed from 0 to 1, indicating that something is wrong. This flag makes it easy to poll the agent, using the techniques discussed in Chapter 9, "Polling and Thresholds", and see that the httpd processes have stopped. Let's try a variation on this theme. If we set prMin to indicate that we want more than six httpd processes running, then restart httpd, our prErrMessage is:
    enterprises.ucdavis.procTable.prEntry.prErrMessage.1 = "Too few httpd running (# = 0)"

The next command in the configuration file is exec; this command allows us to execute any program and return the program's results and exit value to the agent. This is helpful when you already have a program you would like to use in conjunction with the agent. We've written a simple shell script called filecheck.sh that checks whether the file that's passed to it on the command line exists. If the file exists, it returns a 0 (zero); otherwise, it returns a 1 (one):
    #!/bin/sh
    # FileName: /opt/local/shell_scripts/filecheck.sh
    # Exit 0 if the file named by the first argument exists, 1 otherwise.
    if [ -f "$1" ]; then
        exit 0
    fi
    exit 1

Our configuration file uses filecheck.sh to check for the existence of the file /tmp/vxprint.error. Once you have the filecheck.sh script in place, you can see the results it returns by walking ucdavis.extTable (.1.3.6.1.4.1.2021.8):
    $ snmpwalk sunserver2 public .1.3.6.1.4.1.2021.8
    enterprises.ucdavis.extTable.extEntry.extIndex.1 = 1
    enterprises.ucdavis.extTable.extEntry.extNames.1 = "FileCheck"
    enterprises.ucdavis.extTable.extEntry.extCommand.1 = "/opt/local/shell_scripts/filecheck.sh /tmp/vxprint.error"
    enterprises.ucdavis.extTable.extEntry.extResult.1 = 0
    enterprises.ucdavis.extTable.extEntry.extOutput.1 = ""
    enterprises.ucdavis.extTable.extEntry.extErrFix.1 = 0

The first argument to the exec command in the configuration file is a label that identifies the command so we can easily recognize it in the extTable. In our case we used FileCheck -- that's not a particularly good name, because we might want to check the existence of several files, but we could have named it anything we deemed useful. Whatever name you choose is returned as the value of the extTable.extEntry.extNames.1 object. Because the file /tmp/vxprint.error exists, filecheck.sh returns a 0, which appears in the table as the value of extTable.extEntry.extResult.1. You can also have the agent return a line of output from the program. Change filecheck.sh to perform an ls -la on the file if it exists:
    #!/bin/sh
    # FileName: /opt/local/shell_scripts/filecheck.sh
    # If the file exists, print its long listing and exit 0; otherwise exit 1.
    if [ -f "$1" ]; then
        ls -la "$1"
        exit 0
    fi
    exit 1

When we poll the agent, we see the output from the script in the extOutput value the agent returns:
    enterprises.ucdavis.extTable.extEntry.extOutput.1 = " 16 -rw-r--r-- 1 root other 2476 Feb 3 17:13 /tmp/vxprint.error."

This simple trick works only if the script returns a single line of output. If your script returns more than one line of output, insert an OID in front of the string name in the exec command.
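Before choosing between the two forms of exec, it can help to check how many lines a script actually prints. The following is a minimal sketch, not part of Net-SNMP: the function name exec_form is hypothetical, and it simply counts the lines it reads on standard input.

```shell
#!/bin/sh
# exec_form (hypothetical helper): read a script's output on stdin and
# print "single" if it is at most one line, "multi" otherwise.
exec_form() {
    # awk's NR in the END block is the number of input lines read,
    # including a final line that lacks a trailing newline.
    lines=$(awk 'END { print NR }')
    if [ "$lines" -gt 1 ]; then
        echo "multi"    # needs the exec form that starts with an OID
    else
        echo "single"   # the plain exec form is enough
    fi
}
```

For example, piping a script's output into the function, as in `/opt/local/shell_scripts/filecheck.sh /tmp/vxprint.error | exec_form`, reports which form that script would need.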
Here's the next command from our snmpd.conf file:
    exec .1.3.6.1.4.1.2021.2789.51 FancyCheck /opt/local/shell_scripts/fancycheck.sh /core

This command runs the program fancycheck.sh, with the identifying string FancyCheck. We won't bother to list fancycheck.sh; it's just like filecheck.sh, except that it adds a check to determine the file type. The OID identifies where in the MIB tree the agent will place the result of running the command. It needs to be in the ucdavis enterprise (.1.3.6.1.4.1.2021). We recommend that you follow the ucdavis enterprise ID with your own enterprise number, to prevent collisions with objects defined by other sources and avoid overwriting one of ucdavis's subtrees. Follow your enterprise number with another number to identify this particular command. In this case, our enterprise ID is 2789 and we assign the arbitrary number 51 to this command. Thus, the complete OID is .1.3.6.1.4.1.2021.2789.51.
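The OID construction described above (ucdavis subtree, then your enterprise number, then a number for the command) can be sketched as a small shell function. The function name ext_oid and the variable UCDAVIS_OID are hypothetical, introduced only for illustration; the sketch assumes the ucdavis subtree is .1.3.6.1.4.1.2021.

```shell
#!/bin/sh
# Assumption: the ucdavis enterprise subtree under which exec OIDs are placed.
UCDAVIS_OID=".1.3.6.1.4.1.2021"

# ext_oid ENTERPRISE COMMAND_ID (hypothetical helper): print the OID to put
# in front of the label on an exec line.
ext_oid() {
    # Reject missing or non-numeric arguments before building the OID.
    for n in "$1" "$2"; do
        case "$n" in
            ''|*[!0-9]*)
                echo "ext_oid: arguments must be non-negative integers" >&2
                return 1
                ;;
        esac
    done
    echo "$UCDAVIS_OID.$1.$2"
}
```

With the values used in this chapter, `ext_oid 2789 51` prints the OID that appears on the FancyCheck line.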
Here are the results from walking the .1.3.6.1.4.1.2021.2789.51 subtree:
    $ snmpwalk sunserver2 public .1.3.6.1.4.1.2021.2789.51
    enterprises.ucdavis.2789.51.1.1 = 1
    enterprises.ucdavis.2789.51.2.1 = "FancyCheck"
    enterprises.ucdavis.2789.51.3.1 = "/opt/local/shell_scripts/fancycheck.sh /core"
    ucdavis.2789.51.100.1 = 0
    ucdavis.2789.51.101.1 = "-rw-r--r-- 1 root other 346708 Feb 14 16:30 /core."
    ucdavis.2789.51.101.2 = "/core:..ELF 32-bit MSB core file SPARC Version 1, from 'httpd'."
    ucdavis.2789.51.102.1 = 0

Notice that we have a few additional lines in our output. 2789.51.100.1 is the exit number, 2789.51.101.1 and 2789.51.101.2 are the output from the command, and 2789.51.102.1 is the errorFix value. These values can be useful when you are trying to debug your new extension. (Unfortunately, snmpwalk can give you only the numeric OID, not the human-readable name, because snmpwalk doesn't know what 2789.51.x is.)
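Because snmpwalk prints only numeric sub-identifiers for this subtree, a small filter can make the output easier to read while debugging. This is a sketch under an assumption: it supposes the relocated subtree uses the same column numbers as extEntry in UCD-SNMP-MIB (1 index, 2 name, 3 command, 100 result, 101 output, 102 errFix). The function name label_exec_walk is hypothetical.

```shell
#!/bin/sh
# label_exec_walk (hypothetical helper): read snmpwalk output for a relocated
# exec subtree on stdin and prefix each line with a readable column label.
# Assumption: column numbers mirror extEntry in UCD-SNMP-MIB.
label_exec_walk() {
    awk '
    NF == 0 { next }                      # skip blank lines
    {
        n = split($1, part, /\./)         # split the OID on dots
        col = part[n - 1]                 # second-to-last number is the column
        if      (col == 1)   label = "index"
        else if (col == 2)   label = "name"
        else if (col == 3)   label = "command"
        else if (col == 100) label = "result"
        else if (col == 101) label = "output"
        else if (col == 102) label = "errFix"
        else                 label = "unknown"
        print label ": " $0
    }'
}
```

A typical use would be to pipe the walk of your own subtree through the filter, for example `snmpwalk sunserver2 public .1.3.6.1.4.1.2021.2789.51 | label_exec_walk`.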
The last task for Net-SNMP's extensible agent is to perform some disk-space monitoring. This is a great option that lets you check the availability of disk space and return multiple (useful) values. The disk option takes a filesystem mount point followed by a number. Here is what our entry looks like in snmpd.conf:
    # Check disks for their mins
    disk / 100000

The definition of the disk option from UCD-SNMP-MIB.txt is "Minimum space required on the disk (in kBytes) before the errors are triggered." Let's first take a look on sunserver2 to see what the common df program returns:
    $ df -k /
    Filesystem            kbytes    used   avail capacity  Mounted on
    /dev/dsk/c0t0d0s0     432839   93449  296110    24%    /

To see what SNMP has to say about the disk space on our server, run snmpwalk against the ucdavis.diskTable object (.1.3.6.1.4.1.2021.9). This returns virtually the same information as the df command:
    $ snmpwalk sunserver2 public .1.3.6.1.4.1.2021.9
    enterprises.ucdavis.diskTable.dskEntry.dskIndex.1 = 1
    enterprises.ucdavis.diskTable.dskEntry.dskPath.1 = "/" Hex: 2F
    enterprises.ucdavis.diskTable.dskEntry.dskDevice.1 = "/dev/dsk/c0t0d0s0"
    enterprises.ucdavis.diskTable.dskEntry.dskMinimum.1 = 100000
    enterprises.ucdavis.diskTable.dskEntry.dskMinPercent.1 = -1
    enterprises.ucdavis.diskTable.dskEntry.dskTotal.1 = 432839
    enterprises.ucdavis.diskTable.dskEntry.dskAvail.1 = 296110
    enterprises.ucdavis.diskTable.dskEntry.dskUsed.1 = 93449
    enterprises.ucdavis.diskTable.dskEntry.dskPercent.1 = 24
    enterprises.ucdavis.diskTable.dskEntry.dskErrorFlag.1 = 0
    enterprises.ucdavis.diskTable.dskEntry.dskErrorMsg.1 = ""

As you can see, the Net-SNMP agent has many customizable features that allow you to tailor your monitoring without having to write your own object definitions. Be sure to review $NET_SNMP_HOME/share/snmp/mibs/UCD-SNMP-MIB.txt for complete definitions of all Net-SNMP's variables. While we touched on only a few customizable options here, you will find many other useful options in the EXAMPLE.conf file that comes with the Net-SNMP package.
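As a rough illustration of the disk threshold discussed in this section, the comparison between available space and the configured minimum can be reproduced from df output. This is only a sketch of the idea, not the agent's actual implementation: the function names avail_kb and disk_error_flag are hypothetical, and the sketch assumes df -k prints each filesystem on a single line with the available kilobytes in the fourth column.

```shell
#!/bin/sh
# avail_kb (hypothetical helper): read `df -k` output on stdin and print the
# "avail" column (field 4) of the last data line.
# Assumption: one line per filesystem, no wrapped device names.
avail_kb() {
    awk 'NR > 1 { avail = $4 } END { print avail }'
}

# disk_error_flag AVAIL_KB MIN_KB (hypothetical helper): print 1 when the
# available space is below the configured minimum, 0 otherwise.
disk_error_flag() {
    if [ "$1" -lt "$2" ]; then
        echo 1
    else
        echo 0
    fi
}
```

For the root filesystem and the 100000-kilobyte minimum used in this chapter, something like `disk_error_flag "$(df -k / | avail_kb)" 100000` would print the flag for the local machine.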
Copyright © 2002 O'Reilly & Associates. All rights reserved.