Fault	Attribute	MIB	Source
Link Down or Link Up trap from any DS1, DS3, or Ethernet interface. Raises major and normal alarms respectively.	IfTable: IfIndex, ifType, ifAdminStatus, ifOperStatus	IF-MIB	SNMP trap; Link Down trap is cleared by one or more Link Up traps for the same interface.
Cold Start trap from the device. Raises warning alarm.	ColdStart trap	SNMPv2-MIB	SNMP trap.
Warm Start trap from the device. Raises warning alarm.	WarmStart trap	SNMPv2-MIB	SNMP trap.
Authentication Failure trap from the device. Raises major alarm.	AuthenticationFailure trap	SNMPv2-MIB	SNMP trap.
Card OIR trap from the device. Raises warning alarm and performs discovery on the affected device.	cefcFRUInserted trap cefcFRURemoved trap	CISCO-ENTITY-FRU-CONTROL-MIB	SNMP trap.
Card inserted or removed in the device. Raises normal alarms.	alarmDirectory: entPhysicalContainedIn trap	ENTITY-MIB	Internal.
Environment Monitoring Traps from the device. Raises critical alarm for the shutdown trap, and major alarm for all the other traps.	EnvMonShutdownNotification trap EnvMonVoltageNotification trap EnvMonTemperatureNotification trap EnvMonFanNotification trap EnvMonRedundantSupplyNotification trap	CISCO-ENVMON-MIB	SNMP trap.
Loss or re-establishment of communication with device ¹. Raises major and normal alarms respectively.	Not applicable		Internal. Communication lost alarm cleared by the communication established alarm.
Device or card commissioned or decommissioned ². Raises informational alarm in both cases.	Not applicable		Internal.
Server disk usage above the major threshold. Raises major alarm.	Not applicable		Internal. Cleared when disk usage is below the major threshold.
Server disk usage above the critical threshold. Raises critical alarm.	Not applicable		Internal. Cleared when disk usage is below the critical threshold.
Graceful Shutdown operation was interrupted. Raises major alarm.	Not applicable		Internal.
Accept Traffic operation was interrupted. Raises major alarm.	Not applicable		Internal.

¹See the "Overview of Presence Polling and Loss of Communication with a Device" section.
²See the "Overview of the Commission/Decommission Function for a Chassis" section.

Overview of Presence Polling and Loss of Communication with a Device

You can detect communication loss with a managed device by using presence polling. Loss of communication can occur for various reasons:

Network delays.
Problem with the communication link between EMS and the device, but the device may still be operating properly.
The device is overloaded, resulting in slow or no response.
The device has a problem and is unable to respond to presence polling.

Presence Polling Retries

When Cisco UGM first detects loss of communication to a managed device, it does not immediately transition the device to the errored state but retries presence polling. Select the number of retries as described in the "Setting Number of Retries Before Loss of Communication" section.

Presence Polling Intervals

Presence polling uses an interval specified in the "Setting Presence Polling Intervals for Devices in Normal and Errored States" section. If all the communication attempts prove unsuccessful, the device transitions to the errored state. An internal alarm event (communicationLost) with a Major severity level is raised against the affected device.

The default presence polling intervals are:

900 seconds during the normal state

915 seconds during the errored state

Duration of Communication Loss

When communication is re-established, the device returns to a normal state, and an internal alarm event (communicationEstablished) with a Normal severity level is raised against the affected device.

If communication is restored after the duration specified in the "Setting Loss of Communication Duration" section, Cisco UGM discovers the device's subcomponents to detect any card inventory changes that may have occurred during the loss of communication.

If communication is restored within the specified duration, Cisco UGM transitions the device to the normal state.
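
The retry, errored-state, and rediscovery behavior described above can be sketched as a small state machine. This is only an illustrative model, not Cisco UGM code: the class name `PresenceTracker`, the `rediscover-subcomponents` marker, and the choice to time the outage from the moment communicationLost is raised are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional

NORMAL, ERRORED = "normal", "errored"


@dataclass
class PresenceTracker:
    """Hypothetical model of one device's presence-polling state."""
    retries: int = 1                  # documented default
    loss_duration_s: int = 15 * 60    # documented default (15 minutes)
    state: str = NORMAL
    failed_polls: int = 0
    lost_at: Optional[float] = None   # assumption: outage timed from the alarm
    events: List[str] = field(default_factory=list)

    def on_poll(self, reachable: bool, now: float) -> None:
        """Record one presence-poll result taken at time `now` (seconds)."""
        if not reachable:
            if self.state == NORMAL:
                self.failed_polls += 1
                # First failure plus `retries` further failures exhausts all attempts.
                if self.failed_polls > self.retries:
                    self.state = ERRORED
                    self.lost_at = now
                    self.events.append("communicationLost")
            return

        self.failed_polls = 0
        if self.state == ERRORED:
            outage_s = now - self.lost_at
            self.state = NORMAL
            self.lost_at = None
            self.events.append("communicationEstablished")
            if outage_s > self.loss_duration_s:
                # Long outage: rediscover subcomponents to catch card changes.
                self.events.append("rediscover-subcomponents")


tracker = PresenceTracker()
tracker.on_poll(False, now=0)     # first failure, one retry still pending
tracker.on_poll(False, now=900)   # retry failed: device becomes errored
tracker.on_poll(True, now=1815)   # restored 915 s later, longer than 900 s
print(tracker.events)
```

With the default settings, the last poll in this walk-through restores communication after the loss duration has elapsed, so the model records a rediscovery in addition to the two alarm events.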

Setting Presence Polling Intervals for Devices in Normal and Errored States

Step 1 In Map View, choose ASEMSConfig > EMS > Settings.

Step 2 Enter the interval at which a device should be polled in the normal state.

The interval should be an integer value that is 300 or larger (representing seconds). The default is 900 seconds.

Note This value depends on the total number of managed devices in your network. You may need to change this value a few times in order to determine the optimum setting for your network.

Step 3 Enter the interval at which a device should be polled in the errored state.

The interval should be an integer value that is 300 or larger (representing seconds). The default is 915 seconds.

Note This value depends on the total number of managed devices in your network.

Do not enter the same value as for devices in the normal state. A different value avoids overlapping polling intervals for normal and errored states.

Step 4 Click Apply.
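
The constraints on the two intervals can be checked before you enter them. The following is a minimal sketch of such a check; the function name `validate_polling_intervals` is hypothetical and is not part of Cisco UGM.

```python
MIN_INTERVAL_S = 300  # documented minimum for both intervals, in seconds


def validate_polling_intervals(normal_s, errored_s):
    """Return a list of problems with the proposed intervals (empty if none)."""
    problems = []
    for label, value in (("normal", normal_s), ("errored", errored_s)):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{label}-state interval must be an integer")
        elif value < MIN_INTERVAL_S:
            problems.append(f"{label}-state interval must be {MIN_INTERVAL_S} or larger")
    if normal_s == errored_s:
        problems.append("normal-state and errored-state intervals should differ")
    return problems


print(validate_polling_intervals(900, 915))  # the documented defaults
print(validate_polling_intervals(900, 900))
```

An empty list means the values satisfy the documented rules; whether they suit your network still depends on the number of managed devices.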

Setting Number of Retries Before Loss of Communication

When Cisco UGM first detects loss of communication to a managed device, it does not immediately transition the device to the errored state, but retries presence polling by using the polling interval specified in the "Setting Presence Polling Intervals for Devices in Normal and Errored States" section. If these communication attempts are unsuccessful, the device transitions to the errored state.

Step 1 In Map View, select ASEMSConfig > EMS > Settings.

Step 2 Enter the number of times Cisco UGM tries to re-establish connectivity before transitioning the device into the errored state.

The number entered should be an integer value that is 0 or larger. A value of 0 disables retries; the default is 1.

Note A large value causes a delay before loss of communication with a device is detected.

Step 3 Click Apply.
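
To get a rough feel for the delay that retries add, the arithmetic can be sketched as follows. This assumes that each retry waits one full normal-state polling interval, which the text implies but does not state precisely; the function name is hypothetical.

```python
def added_detection_delay_s(retries, normal_interval_s=900):
    """Seconds between the first failed poll and the errored state (assumed model)."""
    if retries < 0:
        raise ValueError("retries must be 0 or larger")
    return retries * normal_interval_s


for retries in (0, 1, 3):
    print(retries, "retries:", added_detection_delay_s(retries), "seconds")
```

Under this assumption, the default of one retry with the default 900-second interval adds 900 seconds before the device is marked as errored.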

Setting Loss of Communication Duration

Step 1 In Map View, choose ASEMSConfig > EMS > Settings.

Step 2 Enter a time interval for which communication must be lost in order to start discovery.

The interval should be an integer value that is 15 or larger (representing minutes). The default is 15 minutes.

Note A large value can cause card inventory changes to go undetected.

If communication is restored after this interval, Cisco UGM initiates discovery of the device's subcomponents to detect any card inventory changes that may have occurred during the loss of communication.

If communication is restored within this interval, Cisco UGM transitions the device to the normal state.

Step 3 Click Apply.

Overview of the Event Browser

You can start the Event Browser from the Launchpad or from the pop-up menu for the individual object within Map Viewer.

With the Event Browser, you can perform these tasks:

Query (filter) events
Sort events
Acknowledge events
Clear events
Start services on events

You can see all events, regardless of your access privilege. In the Event Browser window, you can check the Ack (acknowledge) box next to an event to tell other users that you plan to deal with that event. When you resolve the event, click the Clear box to inform other users that it is resolved.

Note Only the most severe alarm event against an object appears next to its icon within Map Viewer.

You can view additional alarm details by using the Event Browser. For more information, refer to the Cisco Element Management Framework User's Guide.

Using the Event Browser

Step 1 In the Map Viewer, note the color-coded status dots, which indicate alarm events raised against the objects.

See the "Overview of Alarm Events" section for an explanation of the colors.

Step 2 Right-click the object whose list of alarm events you want to view and choose Tools > Open Event Browser.

Using the Query Editor

If you do not want to view all events in the system, set up a query by using the Query Editor to view only specific events.

The criteria that you use to specify a query are on individual tabs. The Event Browser is updated with only those events that match the query criteria. A progress bar indicates that Cisco UGM is querying events and the window is being updated.

Caution Any changes that you make to a query are not stored when you exit the Event Browser.

If you have specified different queries, you can open more than one Event Browser session at a time.

For details about the Query Editor, refer to the Cisco Element Management Framework User's Guide.

To access the Query Editor from the Event Browser, choose Edit > Query Setup.

Overview of Alarm Events

In the Map Viewer tree, you can see raised alarm events by the presence of colored dots next to tree objects in the left pane and by colored annotations against the object icons in the right pane.

The dots are color coded to reflect the following severity levels (highest to lowest): critical, major, minor, warning, informational, and normal.

The defined color coding is:

Red = Critical
Orange = Major
Yellow = Minor
Cyan = Warning
White = Informational
Green = Normal (no events)
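
The color coding above, together with the rule that only the most severe alarm event appears next to an object's icon, can be expressed as a lookup. This is an illustrative sketch; the names are hypothetical, and the position of warning in the ordering is inferred from the color list.

```python
# Highest to lowest severity; "warning" is placed per the color list (assumption).
SEVERITY_ORDER = ["critical", "major", "minor", "warning", "informational", "normal"]

SEVERITY_COLOR = {
    "critical": "Red",
    "major": "Orange",
    "minor": "Yellow",
    "warning": "Cyan",
    "informational": "White",
    "normal": "Green",
}


def display_color(active_severities):
    """Return the dot color for an object, given its uncleared alarm severities."""
    if not active_severities:
        return SEVERITY_COLOR["normal"]  # no events
    worst = min(active_severities, key=SEVERITY_ORDER.index)
    return SEVERITY_COLOR[worst]


print(display_color(["warning", "major", "informational"]))
print(display_color([]))
```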

A device or card object can be in either commissioned or decommissioned state within Cisco UGM.

If an object is in a commissioned state, alarm events against that object are propagated to the physical tree in the Map Viewer and appear in the parent objects to the region level.

For decommissioned objects, alarm events are not propagated up to the physical tree in the Map Viewer.

For details on commissioning and decommissioning objects, see the "Overview of the Commission/Decommission Function for a Chassis" section.

The following table describes Cisco UGM alarm events, their severity, and an explanation of each event.

Table 9-2: Cisco UGM Alarm Events

Alarm Event	Alarm Severity	Explanation
ciscoColdStart	Warning	You started the device object from a power-off state. Note Clear this event manually.
ciscoWarmStart	Warning	You restarted the device object from an on state. Note Clear this event manually.
ciscoLinkDown	Major	A DS1 or Ethernet interface is down.
ciscoLinkUp	Normal	A DS1 or Ethernet interface is up.
ciscoAuthenticationFailure	Major	The device received an SNMP message that was improperly authenticated.
cardInserted	Warning	You inserted a new card in the device; Cisco UGM initiates discovery on the device.
cardRemoved	Warning	You removed a card from the device; Cisco UGM initiates discovery on the device.
Card inserted in slot	Informational	You inserted a new card in the device; Cisco UGM completes discovery on the device.
Card removed in slot	Informational	You removed a card from the device; Cisco UGM completes discovery on the device.
envMonShutdown	Critical	A critical environmental condition is detected and a device shutdown is imminent.
envMonVoltage	Major	A voltage threshold was exceeded on the device.
envMonTemperature	Major	A temperature threshold was exceeded on the device.
envMonFan	Major	The fan on the device has failed.
envMonRedundantSupply	Major	The power supply on the device has failed.
communicationLost	Major	Cisco UGM lost SNMP connectivity with the device.
communicationEstablished	Normal	Cisco UGM established SNMP connectivity with the device.
entityDecommisioned	Informational	Device or card object has been decommissioned.
entityCommissioned	Informational	Device or card object has been commissioned.
fileSysAboveMajor	Major	Server disk usage is over the user-defined major threshold¹.
fileSysAboveCritical	Critical	Server disk usage is over the user-defined critical threshold².
fileSysBelowMajor	Normal	Server disk usage is below the user-defined major threshold.
fileSysBelowCritical	Normal	Server disk usage is below the user-defined critical threshold.
gracefulShutdownInterrupted	Major	During a Graceful Shutdown operation, loss of communication with the device occurred or it was decommissioned. Note Clear this event manually.
acceptTrafficInterrupted	Major	During an Accept Traffic operation, loss of communication with the device occurred or it was decommissioned. Note Clear this event manually.

¹For details on changing this threshold, see the "Example: Sample Configuration File for Fault Management" section.
²For details on changing this threshold, see the "Example: Sample Configuration File for Fault Management" section.

Clearing Alarm Events

If you manually clear an alarm event for an object in the Event Browser, that object appears in the Map Viewer with an alarm notification reflecting the next highest alarm present for that object. This change in alarm severity appears in the Map Viewer, even if the fault condition has not actually been corrected.

Cisco UGM does not generate all alarm events again, even if the alarm conditions are still present; therefore, be cautious in clearing alarm events.

Step 1 In the Map Viewer, note the color-coded status dots, which indicate alarm events raised against the objects.

See the "Overview of Alarm Events" section.

Step 2 Right-click the object whose list of alarm events you want to view and choose Tools > Open Event Browser.

You can acknowledge and clear individual alarm events by clicking the appropriate box next to each event.

Overview of Trap Forwarding

Cisco UGM monitors UDP port 162 for all SNMPv1 and SNMPv2c traps sent from all managed devices configured to send traps to it, and then forwards them to the specified host destinations.

Cisco UGM forwards SNMP v1 and v2 traps to multiple remote hosts, but SNMP v2 traps are forwarded as SNMP v1 traps.

For each remote host, configure a list of trap specifiers that identify specific SNMP traps (consisting of Enterprise ID, Generic ID, and Specific ID).

Cisco UGM maintains a list of host destinations that you define. Also define specific SNMP traps for each host destination.

Enter a wildcard (*) for any field of a trap specifier.

Add new remote hosts or new trap specifiers by using the Trap Forwarding Deployment Wizard.

Update existing remote hosts or trap specifier fields by using the Trap Forwarding Properties Dialog.

Delete existing remote hosts or trap specifiers from the Map Viewer.

Click Accept Saved Setting (in the Trap Forwarding Properties Dialog box) for trap forwarding changes to take effect.
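
How a trap specifier with wildcards selects traps for a host can be sketched as follows. This is a simplified model rather than the Cisco UGM implementation: the host names are hypothetical, and exact string comparison of each field is an assumption.

```python
from typing import Dict, List, NamedTuple


class TrapSpecifier(NamedTuple):
    """One specifier; any field may be the wildcard "*"."""
    enterprise_id: str
    generic_id: str
    specific_id: str


def matches(spec, enterprise_id, generic_id, specific_id):
    """True if every specifier field is "*" or equals the trap's value."""
    trap = (enterprise_id, str(generic_id), str(specific_id))
    return all(want == "*" or want == got for want, got in zip(spec, trap))


def destinations_for(hosts: Dict[str, List[TrapSpecifier]],
                     enterprise_id, generic_id, specific_id):
    """Return the hosts that have at least one matching specifier."""
    return [host for host, specs in hosts.items()
            if any(matches(s, enterprise_id, generic_id, specific_id) for s in specs)]


# Hypothetical host destinations and their trap specifiers.
HOSTS = {
    "nms-a.example.net": [TrapSpecifier("1.3.6.1.4.1.9.1.313", "2", "*")],
    "nms-b.example.net": [TrapSpecifier("*", "*", "*")],
}

print(destinations_for(HOSTS, "1.3.6.1.4.1.9.1.313", 2, 0))
print(destinations_for(HOSTS, "1.3.6.1.4.1.9.1.274", 3, 0))
```

In this model, the first host receives only Link Down traps from one enterprise ID, while the second host receives every trap.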

Specifying New Trap Forwarding Hosts

By using the Trap Forwarding Deployment Wizard, you can:

Specify host destinations and traps to be forwarded.
Deploy host destinations and traps.

Note The default is no trap forwarding.

Step 1 Choose ASEMSConfig > TrapForwarding > Deploy Trap Forwarding Hosts.

Step 2 Follow the instructions provided by the Deployment Wizard.

Step 3 In the Map Viewer window, choose ASEMSConfig > Trap Forwarding > Trap Forwarding Properties.

Step 4 To enable trap forwarding, click Accept Saved Setting.

Specifying New Trap Specifiers for a Trap Forwarding Host

Step 1 From the Map Viewer, open ASEMSConfig.

Step 2 Expand the Trap Forwarding tree by clicking on the + (plus) sign.

Step 3 Open the Trap Specifiers Deployment Wizard.

Step 4 Right-click the host destination for which you wish to add a new trap specifier and select Deploy Trap Specifiers.

Step 5 Follow the instructions provided by the Deployment Wizard.

Step 6 In the Map Viewer, choose ASEMSConfig > Trap Forwarding > Trap Forwarding Properties.

Step 7 To update trap forwarding, click Accept Saved Setting.

The trap forwarding action triggered reflects any changes made (and saved) in this dialog box. Any previously specified trap forwarding action is replaced.

Changing Previously Specified Trap Forwarding Data

Step 1 In the Map Viewer, choose ASEMSConfig > Trap Forwarding > Trap Forwarding Properties.

Step 2 Enter your changes.

Step 3 Click the Save icon from the dialog toolbar, or choose File > Save.

Step 4 To update trap forwarding, click Accept Saved Setting.

The trap forwarding action triggered reflects any changes made (and saved) in this dialog. Any previously specified trap forwarding action is replaced.

Removing Previously Specified Trap Forwarding Data

Step 1 From the Map Viewer, open ASEMSConfig.

Step 2 Expand the Trap Forwarding tree by clicking the + (plus) sign.

Step 3 Expand any listed host destination by clicking the + (plus) sign.

Step 4 Right-click the object to be deleted (a host destination, or a specific trap specifier for a given host destination) and choose Deployment > Delete Objects.

Step 5 In the Map Viewer, choose ASEMSConfig > Trap Forwarding > Trap Forwarding Properties.

Step 6 To update trap forwarding, click Accept Saved Setting.

The trap forwarding action triggered reflects any changes made (and saved) in this dialog. Any previously specified trap forwarding action is replaced.

Tip To deactivate or disable all trap forwarding, you must delete all host destinations and click Accept Saved Setting.

To resume trap forwarding, re-enter the host destinations.

See the "Specifying New Trap Forwarding Hosts" section.

Example: Cisco UGM Trap Mapping Tables

Table 9-3: Cisco AS5350 Trap Mapping

Class Mapping	Enterprise	Generic ID	Severity	Color
ciscoColdStart	1.3.6.1.4.1.9.1.313	0	warning	Cyan
ciscoWarmStart	1.3.6.1.4.1.9.1.313	1	warning	Cyan
ciscoLinkDown	1.3.6.1.4.1.9.1.313	2	major	Orange
ciscoLinkUp	1.3.6.1.4.1.9.1.313	3	normal	Green
ciscoAuthenticationFailure	1.3.6.1.4.1.9.1.313	4	major	Orange

Table 9-4: Cisco AS5400 Trap Mapping

Class Mapping	Enterprise	Generic ID	Severity	Color
ciscoColdStart	1.3.6.1.4.1.9.1.274	0	warning	Cyan
ciscoWarmStart	1.3.6.1.4.1.9.1.274	1	warning	Cyan
ciscoLinkDown	1.3.6.1.4.1.9.1.274	2	major	Orange
ciscoLinkUp	1.3.6.1.4.1.9.1.274	3	normal	Green
ciscoAuthenticationFailure	1.3.6.1.4.1.9.1.274	4	major	Orange

Table 9-5: Cisco AS5800 Trap Mapping

Class Mapping	Enterprise	Generic ID	Severity	Color
ciscoColdStart	1.3.6.1.4.1.9.1.188	0	warning	Cyan
ciscoWarmStart	1.3.6.1.4.1.9.1.188	1	warning	Cyan
ciscoLinkDown	1.3.6.1.4.1.9.1.188	2	major	Orange
ciscoLinkUp	1.3.6.1.4.1.9.1.188	3	normal	Green
ciscoAuthenticationFailure	1.3.6.1.4.1.9.1.188	4	major	Orange

Table 9-6: Cisco AS5850 Trap Mapping

Class Mapping	Enterprise	Generic ID	Severity	Color
ciscoColdStart	1.3.6.1.4.1.9.1.308	0	warning	Cyan
ciscoWarmStart	1.3.6.1.4.1.9.1.308	1	warning	Cyan
ciscoLinkDown	1.3.6.1.4.1.9.1.308	2	major	Orange
ciscoLinkUp	1.3.6.1.4.1.9.1.308	3	normal	Green
ciscoAuthenticationFailure	1.3.6.1.4.1.9.1.308	4	major	Orange
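
Because the four tables differ only in the Enterprise value, they can be folded into two lookups. The sketch below copies the values from Tables 9-3 through 9-6; the function name `classify_trap` is hypothetical.

```python
# Enterprise OID per platform, from Tables 9-3 through 9-6.
PLATFORM_ENTERPRISE = {
    "AS5350": "1.3.6.1.4.1.9.1.313",
    "AS5400": "1.3.6.1.4.1.9.1.274",
    "AS5800": "1.3.6.1.4.1.9.1.188",
    "AS5850": "1.3.6.1.4.1.9.1.308",
}

# Generic ID -> (class mapping, severity, color); identical on all four platforms.
GENERIC_TRAPS = {
    0: ("ciscoColdStart", "warning", "Cyan"),
    1: ("ciscoWarmStart", "warning", "Cyan"),
    2: ("ciscoLinkDown", "major", "Orange"),
    3: ("ciscoLinkUp", "normal", "Green"),
    4: ("ciscoAuthenticationFailure", "major", "Orange"),
}


def classify_trap(enterprise, generic_id):
    """Return (platform, class, severity, color), or None if not in the tables."""
    platform = next(
        (name for name, oid in PLATFORM_ENTERPRISE.items() if oid == enterprise),
        None,
    )
    if platform is None or generic_id not in GENERIC_TRAPS:
        return None
    return (platform,) + GENERIC_TRAPS[generic_id]


print(classify_trap("1.3.6.1.4.1.9.1.188", 2))
```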

Overview of the Commission/Decommission Function for a Chassis

About Commissioning a Chassis

Commission a device to return it to a normal (commissioned) state within the EMS.

When you commission a device, Cisco UGM starts discovery on the device to resolve any card inventory changes that may have occurred while it was in the decommissioned state. When discovery is completed, the device returns to the normal or errored state depending on whether commissioning was successful.

Note When a device is commissioned, all its subcomponents (cards and ports) also transition into the commissioned state.

About Decommissioning a Chassis

With Cisco UGM, you can decommission a device from any state. You can decommission a device due to one of these causes:

The device was manually deployed.
You decommissioned the device to suspend reporting alarm events when the device was rebooted or undergoing maintenance.

When you decommission a device, no actual changes are made to the device, which still sends traps to Cisco UGM. However, the resulting alarm events are not reported and do not initiate any actions or status changes. Presence and performance polling are also suspended, and Cisco UGM does not allow any configuration changes or software and firmware image downloads for the device.

Note When a chassis is decommissioned, all its subcomponents (cards and ports) also transition into the decommissioned state.

Overview of the Commission/Decommission Function for a Card

About Commissioning a Card

Commission a card to return it to a normal (commissioned) state within the system.

When you commission a card, Cisco UGM reconciles its status with that of the actual card on the device. When this is completed, the card returns to either the normal or errored state. If the card was removed from the device, the corresponding card object is deleted.

Note When a parent device is commissioned, all its subcomponents (cards and ports) also transition into the commissioned state. Likewise, when a card is commissioned, all its ports are also commissioned.

About Decommissioning a Card

You can decommission a card from any state due to one of these causes:

The parent device containing the card was decommissioned.
You decommissioned the card to suspend reporting alarm events when the card was rebooted or undergoing maintenance.

When you decommission a card, no actual changes are made to the card, which still sends traps to Cisco UGM. However, the resulting alarm events are not reported and do not initiate any actions or status changes.

When a parent device is decommissioned, all its subcomponents (cards and ports) also transition into the decommissioned state. Likewise, when a card is decommissioned, all its ports are also decommissioned.

Commissioning and Decommissioning a Device or Card

Step 1 Right-click the device or card object that you want to commission or decommission.

Step 2 For a device, choose AS5xxx object > Chassis > Chassis Commissioning.

For a card, choose Card object > Card Commissioning.

Step 3 Click Commission or Decommission.

Tip Decommissioned devices appear as shaded icons in the right-hand pane of the Map Viewer.

Overview of Exporting Alarm Events

With Cisco UGM, you can capture and export all alarm data to an ASCII text file; this file can then be examined locally by an external system or retrieved by an external system by using File Transfer Protocol (FTP). The external system is responsible for parsing the contents of this file.

Exporting SNMP traps consists of capturing traps from managed devices and writing them to a text file.

Note You cannot forward internally generated Cisco UGM alarm events through SNMP; you can export these alarm events by writing them to the ASCII text file.

You can access the Alarm File Export function to schedule alarm data export, specify where the exported data is to be stored, how and when the file ages, and also specify a string to delimit exported data.

Exporting Alarm Events to a File

Step 1 From the Map Viewer, choose ASEMSConfig > File Export > Open File Export Properties > Alarm.

Step 2 In the Export Type field, select Continuous.

Step 3 Enter a storage path for the file.

Step 4 Select an action to be performed when file aging occurs:

none: Disables aging; the File Age and Aging Directory fields are ignored.
delete: Deletes the aged file from the disk.
move: Moves the aged file into the Aging Directory.
moveTarCompress: Compresses the aged file, and then adds it to the FileExport.tar file, which is created in the Aging Directory if it does not already exist.

Step 5 Specify the maximum size (in KBytes) of a file before the selected aging action begins. Export then continues to the newly created file.

Step 6 Specify where the file is moved to (or moveTarCompressed to) when aging occurs.

If you enter a non-existent directory path, it is automatically created.
This field does not apply to the delete aging action.
The directory string that you enter must end with a trailing / (forward slash).
If the Action field is set to moveTarCompress, a tar file named FileExport.tar is created in the Aging Directory for the aged files.

Step 7 Click Save. Cisco UGM then:

Saves the user-specified data from this dialog.
Validates the changes and, if they are valid, applies them to the system.
Generates an Action Report containing the results of this action.

Example: Alarm Data Export Format and Sample

Alarm export data is formatted as follows:

<Date>|<Time>|<DataType>|<AlarmName>|<AlarmSeverity>|<AffectedObject>|

Sample:

2000/09/08|08:32:59 EDT|InternalAlarm|communicationEstablished|normal|Physical:/Kanata/AS5350-1|
2000/09/08|08:33:05 EDT|InternalAlarm|communicationEstablished|normal|Physical:/Kanata/AS5400-1|
2000/09/08|08:33:06 EDT|InternalAlarm|communicationEstablished|normal|Physical:/Kanata/AS5800-1|
2000/09/08|08:37:53 EDT|InternalAlarm|fileSysBelowMajor|normal|:/|
2000/09/08|08:37:53 EDT|InternalAlarm|fileSysBelowCritical|normal|:/|
2000/09/08|10:17:45 EDT|SNMPv1|envMonRedundantSupply|major|Physical:/Kanata/AS5800-1|
2000/09/08|10:18:41 EDT|SNMPv1|ciscoLinkUp|normal|Physical:/Kanata/AS5800-1|
2000/09/08|10:18:41 EDT|SNMPv1|ciscoLinkUp|normal|Physical:/Kanata/AS5800-1|
2000/09/10|14:36:45 EDT|SNMPv1|cardInserted|warning|Physical:/Kanata/AS5350-1|
2000/09/10|14:37:06 EDT|SNMPv1|ciscoLinkUp|normal|Physical:/Kanata/AS5350-1|
2000/09/10|14:57:28 EDT|SNMPv1|ciscoLinkUp|normal|Physical:/Kanata/AS5350-1|

2000/09/11|17:58:32 EDT|SNMPv1|ciscoLinkUp|normal|Physical:/Kanata/AS5800-1|
2000/09/11|17:58:35 EDT|SNMPv1|ciscoLinkUp|normal|Physical:/Kanata/AS5800-1|
2000/09/11|18:10:18 EDT|SNMPv1|ciscoLinkDown|major|Physical:/Kanata/AS5800-1|
2000/09/11|18:11:20 EDT|SNMPv1|ciscoLinkUp|normal|Physical:/Kanata/AS5800-1|
2000/09/11|18:15:07 EDT|InternalAlarm|entityCommissioned|informational|Physical:/Kanata/AS5400-1|
2000/09/11|18:23:19 EDT|SNMPv1|envMonRedundantSupply|major|Physical:/Kanata/AS5800-1|
2000/09/11|18:23:59 EDT|SNMPv1|ciscoLinkUp|normal|Physical:/Kanata/AS5800-1|
2000/09/11|18:24:00 EDT|SNMPv1|ciscoLinkUp|normal|Physical:/Kanata/AS5800-1|
2000/09/12|10:20:23 EDT|SNMPv1|ciscoLinkDown|major|Physical:/Kanata/AS5800-1|
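
Because the external system is responsible for parsing the export file, a minimal parser for one record might look like the sketch below. It assumes that each record occupies a single line, that the delimiter is the | character shown in the sample (the delimiter string is configurable), and that field values never contain the delimiter. The names `AlarmRecord` and `parse_alarm_record` are hypothetical.

```python
from typing import NamedTuple


class AlarmRecord(NamedTuple):
    date: str
    time: str             # includes the time zone, for example "18:10:18 EDT"
    data_type: str        # "InternalAlarm" or "SNMPv1" in the sample
    alarm_name: str
    severity: str
    affected_object: str


def parse_alarm_record(line, delimiter="|"):
    """Split one exported record into its six fields."""
    fields = line.strip().split(delimiter)
    # Each record ends with a trailing delimiter, which yields one empty field.
    if fields and fields[-1] == "":
        fields.pop()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, found {len(fields)}: {line!r}")
    return AlarmRecord(*fields)


record = parse_alarm_record(
    "2000/09/11|18:10:18 EDT|SNMPv1|ciscoLinkDown|major|Physical:/Kanata/AS5800-1|"
)
print(record.alarm_name, record.severity, record.affected_object)
```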

Example: Sample Configuration File for Fault Management

You can view and edit some Cisco UGM attributes by changing a configuration file in ASCII format; the file is located at:

<CEMFROOT>/config/ASMainCtrl/ASMainCtrlUserData.ini

Sample of the ASMainCtrlUserData.ini file showing items relevant to fault management in Cisco UGM:

; ===================================================
; Configurable controller settings.
; ===================================================

; This section defines settings for file-system monitoring:
; * MajorThreshold    : If file-system usage exceeds this percentage,
;                       major alarm is raised.
; * CriticalThreshold : If file-system usage exceeds this percentage,
;                       critical alarm is raised.
; * MonitoringInterval: How often each file-system is checked in
;                       minutes. If the value is 0, self-monitoring
;                       is disabled for all file-systems.
;
; - Threshold percentages must be integer values > 0 and < 100.
; - MonitoringInterval must be integer value >= 0.
;
[SelfMonitor]
MajorThreshold = 90
CriticalThreshold = 95
MonitoringInterval = 10
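
Before editing the thresholds, you might check proposed values against the rules stated in the file's comments. The sketch below reads a [SelfMonitor] section with Python's standard configparser module; it assumes that the relevant part of the file is compatible with that parser and that every comment line starts with a semicolon. The function name `load_self_monitor` is hypothetical.

```python
import configparser

SAMPLE = """\
; Configurable controller settings.
[SelfMonitor]
MajorThreshold = 90
CriticalThreshold = 95
MonitoringInterval = 10
"""


def load_self_monitor(text):
    """Parse and validate the [SelfMonitor] settings from INI-formatted text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["SelfMonitor"]
    major = section.getint("MajorThreshold")
    critical = section.getint("CriticalThreshold")
    interval = section.getint("MonitoringInterval")

    for name, value in (("MajorThreshold", major), ("CriticalThreshold", critical)):
        if not 0 < value < 100:
            raise ValueError(f"{name} must be an integer > 0 and < 100")
    if interval < 0:
        raise ValueError("MonitoringInterval must be an integer >= 0")

    return {
        "major": major,
        "critical": critical,
        "interval_min": interval,
        "monitoring_enabled": interval != 0,  # 0 disables self-monitoring
    }


print(load_self_monitor(SAMPLE))
```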
