Miscellaneous Features

The High-Priority (Hi-Pri) login feature helps ensure that network floods do not cause outages.

Introduction

The network flooding control feature is an enhancement to switch software that prevents network traffic floods from aborting the processor card CPU, and thus keeps user traffic flowing. This ensures that multiple node failures no longer occur because of overload conditions in the network. The secondary changes provide a high-priority console login that lets you view and correct flooding problems, and build tolerance into the communication between nodes and between hubs and feeders so that user traffic continues to flow when this communication breaks down.

Problem Description for Which Network Flooding Control Enhancement Provides a Solution

A problem occurred that resulted in a flood of network messages being sent to most of the network. The problem was caused by a combination of factors on the BPX. A firmware bug caused a standby BXM to loop real traffic back to the bus. An active BXM card in a Y-cable hot-standby pair had its firmware upgraded. In doing so, a card switchover occurred that correctly put the card in a standby state but caused all traffic to loop back toward the bus without being blocked. Software had programmed the networking channels on the card so that traffic destined to leave the card would be sent back to the active card. The BPX crosspoint architecture allows standby cards to loop their traffic back to themselves as well as send it to the true destination. The BXM normally rejects cells not destined for the slot they are in, except in the case of a hot standby. As a result of these steps, a loop was formed where traffic would loop continuously on the standby card but would also be sent to the active card for transmission out the trunks. This caused a very high rate of duplicate network messages to many nodes downstream from the trunks on this BXM card.

This network traffic flooding resulted in overloading the processor cards at multiple nodes. This overload exhausted critical resources within the processor cards which caused the nodes to abort. The resulting aborts caused CC switchovers but then these processor cards aborted as well. The second abort resulted in derouting all connections at each node.

One factor prolonging the outage was the difficulty in locating and disabling the source of the traffic flood. Aborts continued to occur as the flooding continued. The user interface at each node was unavailable as the nodes serviced the overload of traffic and aborted. This made isolation of the flood difficult. (Physically removing trunk cards at nodes ultimately isolated the source of the flood.)

An additional factor that prolonged the outage was the inability of the routing mechanism to quickly route so many connections at once. The ineffectiveness of the single threaded routing and its back-off collision mechanism led to an unacceptably slow rate of routing (and restoring the user traffic flow). (Manual intervention to shut off routing at key nodes reduced the collision rate and allowed the routing mechanism to efficiently restore all connections.)

This network flooding control enhancement is meant to solve the above problem with the following requirements:

Note that the BXM firmware is expected to be upgraded (carefully) to "W" or beyond to prevent the known flood from recurring.

Configuring the High-Priority Login Mode Feature

You do not need to configure anything to get the functionality of the network flooding control feature enhancement or the high-priority login feature.

Using the High-Priority Login Feature

A flood of network traffic can lead to a node becoming unreachable from other nodes in the network. The high-priority login feature allows you to log in at the console port and execute a small set of commands. You log in as follows:

At this point you may detect excessive network messages using the nwstats command or see excessive network handler processing using the dspprf command.

To lessen the CPU use of the network handler task and allow lower priority tasks to execute, you can use the cnfnhparm command to decrease the loop count before the network handler task suspends processing.

If the source of the traffic flood cannot be quickly located and shut off, you can disable LMI error detection using the addfdrlp command on the hub and at all connected feeders. After the network returns to its normal state, you can re-enable LMI at the hub and feeder nodes using the delfdrlp command. You can see the loopback state of the feeder trunk LMI using the dspnode command.

Functional Description

Software Loop Prevention

The network channel programming on the BPX now blocks trunk channels that loop incoming traffic back to the same trunk. This eliminates the possibility that undetected hardware loopbacks create a flood of traffic on the trunk.

For each node in the network there exists one channel on each BPX trunk to receive control traffic for that node and forward it to the one trunk that transmits the traffic for that node. To avoid looping back traffic that unexpectedly arrives on the transmit trunk, a CLP object was set for that channel. On BNI cards the firmware interpreted this to turn off the receive part of the channel. The BXM firmware does not have this functionality. The software now sets the receive VPI/VCI to 0/0. This has the same effect on BXM firmware as the CLP object had on BNI firmware (the receive part of the channel is turned off). The BXM firmware does not sink cells with VPI/VCI equal to 0/0.

Duplicate Coerced Message Dumping

The network message handler checks for duplicate network messages without sequence numbers (coerced messages) received within a short time of each other. Coerced messages received within one second of each other are considered duplicates. When duplicates are detected, they are quickly discarded without acknowledgment. This limits the remaining flow to other parts of the software to one coerced message per second. Floods of network messages that use sequence numbers appear as messages with duplicate sequence numbers, and such duplicates are already handled efficiently. The nwstats screen shows "Dropped flooding msgs."
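The one-second duplicate window described above can be illustrated with a minimal Python sketch. This is a model for explanation only, not the switch software: the class name, the message key, and the millisecond timestamps are all hypothetical, and the sketch assumes that duplicates are recognized by an identical key.

```python
class CoercedDuplicateFilter:
    """Hypothetical model: drop coerced messages repeated within a time window."""

    def __init__(self, window_ms=1000):
        self.window_ms = window_ms
        self.last_accepted_ms = {}  # message key -> time the key was last accepted
        self.dropped = 0            # plays the role of a "Dropped flooding msgs" count

    def accept(self, key, now_ms):
        """Return True if the message should be processed, False if it is dumped."""
        last = self.last_accepted_ms.get(key)
        if last is not None and now_ms - last < self.window_ms:
            self.dropped += 1       # duplicate inside the window: discard, no ack
            return False
        self.last_accepted_ms[key] = now_ms
        return True


# Simulated flood: the same coerced message arrives every 100 ms for 3 seconds.
flood_filter = CoercedDuplicateFilter()
arrivals_ms = range(0, 3000, 100)
accepted = [t for t in arrivals_ms
            if flood_filter.accept("coerced-msg-from-node-7", t)]
print(accepted, flood_filter.dropped)  # [0, 1000, 2000] 27
```

Of the 30 arrivals in this simulation, only one per second reaches the rest of the model, which matches the rate limit stated in the text.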

Network Message Read Limit

A configurable limit is added to the network handler to control the number of cells that may be read from the SAR receive queue before giving up the CPU to lower priority tasks. This limits the CPU usage of this high-priority task even when floods of network traffic are present. The limit is controlled with the cnfnhparm command.

The setting of this parameter to a low number may lead to the dropping of network traffic, possibly resulting in comm breaks, comm fails, or background test failures.
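The effect of the read limit can be sketched as a bounded service loop. The following Python model is illustrative only; the function name, the `read_limit` parameter, and the use of a deque to stand in for the SAR receive queue are assumptions, not the actual implementation.

```python
from collections import deque


def service_receive_queue(receive_queue, read_limit):
    """Hypothetical model: read at most read_limit cells, then give up the CPU."""
    handled = []
    while receive_queue and len(handled) < read_limit:
        handled.append(receive_queue.popleft())
    # Returning here stands in for suspending the network handler task so
    # that lower priority tasks get a chance to run.
    return handled


receive_queue = deque(range(10))  # ten cells waiting in the queue
first_pass = service_receive_queue(receive_queue, read_limit=4)
print(first_pass, len(receive_queue))  # [0, 1, 2, 3] 6
```

In this model, a low limit leaves more cells waiting after each pass. If cells arrive faster than they are drained, a real queue of finite size eventually overflows, which is the traffic-dropping risk noted above.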

High-Priority Console Login

A special high-priority console user login is created to allow you to log in and execute some commands on the node even during periods of node congestion. The console login executes at high priority before the user logs in. When you log in as "StrataCom" and the first command typed is the new hipri command, the user task stays in high-priority mode. If you log in using another account, or use "StrataCom" but do not enter the hipri command first, the user task reverts to the normal (lower) priority.

The following error message is displayed when the hipri command is used by a non-Cisco login.

The following error message is displayed when hipri is not the first command immediately after login.

The following error message is displayed when you try to use the hipri command from a port other than the control port.

The high-priority user task executes above all tasks but the resource handler. This allows this feature to execute even in cases of network message flooding, connection routing, extreme CommBus usage, and so on. Notification is given when high-priority mode is in use by the "High-Priority!" string on the dsplog screen. A sample screen is shown in Example 4 under the "dsplog" section on page 14-77.
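The rules that decide whether a console session stays in high-priority mode can be summarized in a short Python sketch. The function and constant names are hypothetical, and the sketch models only the decision described in this section (StrataCom account, hipri as the first command, control port), not the actual task scheduler.

```python
HIGH_PRIORITY = "high"
NORMAL_PRIORITY = "normal"


def priority_after_first_command(account, first_command, on_control_port):
    """Hypothetical model of the login rules described in this section."""
    if (account == "StrataCom"
            and first_command == "hipri"
            and on_control_port):
        return HIGH_PRIORITY
    # Any other account, any other first command, or any other port:
    # the user task reverts to the normal (lower) priority.
    return NORMAL_PRIORITY


print(priority_after_first_command("StrataCom", "hipri", True))    # high
print(priority_after_first_command("StrataCom", "dspnode", True))  # normal
```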

Only a subset of the user commands is allowed to run during a high-priority login. Because of the high priority of this task, some commands may not work correctly or may affect other features in the system. For that reason, the list of commands is limited, and all other commands are blocked at the command line. The following message is displayed when an invalid command is attempted in high-priority mode:

Table 17-1 lists the commands allowed in high-priority mode for the StrataCom user level:

Table 17-1 High-Priority Mode StrataCom User Level Commands

addfdrlp	bye	cbstats	cbtrace	ccb	cnw
cnfnhparm	dcb	dcct	delfdrlp	dlcon	dm
dncd	dspalms	dnib	dnw	dspalms	dspcd
dspcderrs	dspcds	dsplog	dspnds	dspnode
dspnw	dspprf	dspprfhist	dspqs	dspsust	dspswlog
dsptrkerrs	dsptrks	dsptrkstats	dsptrkutl	dspusertask	dspusertasks
dvc	help or "?"	killuser	logoutuser	nwstats	nwtrace
off1	off2	off3	on1	on2	on3
pm	resetcd	resetsys	runrev	stopjob	switchcc
vt	"." (history)

ARP Table Expansion

The ARP cache table size has been increased to provide more efficient management of IP to Ethernet (MAC) addresses and prevent processor overloads from excessive ARP messages.

Address Resolution Protocol (ARP) is used by IP hosts on an Ethernet LAN to determine the Ethernet (MAC) addresses of fellow hosts. Using Ethernet broadcast packets, the protocol maps an IP address to an Ethernet address. To assist in maintaining the mappings, an ARP cache is usually resident on each IP host. By eavesdropping on ARP messages, each IP host can build its ARP cache quickly and efficiently.

When large numbers of IP hosts are resident on the same physical Ethernet, a high volume of ARP broadcast messages can be normal. Each new translation of IP address to Ethernet address is placed in a local ARP cache entry on the BPX node. Previously, this ARP cache had a size limit of four entries. When a large number of ARP translations exist on the Ethernet, existing cache entries must be bumped to make room for new ones, and the ARP cache can begin to thrash.

Increasing the table size to 16 entries improves the performance of the processor when more than 4 physical devices are on the same LAN segment. ARP broadcasts are minimized as are updates to the ARP cache. This is expected to address the large number of Cisco WAN Manager workstations that a node can support.
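The thrashing effect of a small cache can be seen in a simple simulation. The Python sketch below is illustrative only: it assumes least-recently-used replacement, which the text does not specify, and the host addresses and lookup pattern are invented for the example.

```python
from collections import OrderedDict


def count_cache_misses(capacity, lookups):
    """Count lookups that miss a fixed-size cache (LRU replacement assumed)."""
    cache = OrderedDict()
    misses = 0
    for ip in lookups:
        if ip in cache:
            cache.move_to_end(ip)          # mark as most recently used
        else:
            misses += 1                    # a miss would trigger an ARP request
            cache[ip] = "mac-for-" + ip    # placeholder for the resolved address
            if len(cache) > capacity:
                cache.popitem(last=False)  # bump the least recently used entry
    return misses


hosts = ["10.0.0.%d" % i for i in range(1, 9)]  # 8 hypothetical hosts on the LAN
lookups = hosts * 5                             # each host contacted 5 times in turn
print(count_cache_misses(4, lookups))   # 40
print(count_cache_misses(16, lookups))  # 8
```

With 8 hosts contacted in turn, the 4-entry cache in this model misses on every lookup, while the 16-entry cache misses only once per host.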

Comm Fail Tolerance

The trunk keep-alive mechanism, also known as the Comm Fail test, allows you to select whether or not connections are derouted on keep-alive time-outs. Previously when the Comm Fail test failed, all connections on the trunk were derouted affecting user traffic. This test runs in addition to the physical line alarm mechanism.

In the event of a network flood, the network handler will inevitably end up dropping numerous network messages. Among these will be messages for the comm break and comm fail tests, leading to a failure of the tests and the declaration of comm breaks with other nodes and comm fails on its trunks.

To provide more tolerance to a flood of network messages, the Comm Fail test functions so that the default for physical trunks is to leave connections routed in spite of a failure detected by the Comm Fail test. Network alarms and log events are still generated for Comm Fail failures, but connections are not derouted.

In the case of virtual trunks, the Comm Fail test may be the only indication that a virtual trunk crossing an ATM cloud is not passing traffic. For this reason, virtual trunks must continue to deroute connections on Comm Fail failures.

Control of whether Comm Fail test failures cause deroutes on physical trunks is provided by the cnfnodeparm command. A new parameter Reroute on Comm Fail indicates whether connections should be derouted on failures. If enabled, a Comm Fail test failure on any local trunk results in all nodes rerouting the connections they own that are currently on that trunk. If this is not enabled, a Comm Fail test failure will not result in the rerouting of the connections. A comm fail on a virtual trunk will always result in the rerouting of all the connections on the trunk, regardless of the setting of the enable flag.

Regardless of the Reroute on Comm Fail parameter setting, a trunk that fails the Comm Fail test is still declared as failed. Route-op still runs and will consider this trunk unusable for network traffic. Network clock routing also considers the trunk unusable for clocking and builds a route around this trunk. These operations continue to work as in releases previous to Release 9.2.
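The outcome of a Comm Fail test failure, as described above, can be summarized in a short Python sketch. The function name and the returned field names are hypothetical; only the decision logic is taken from the text.

```python
def handle_comm_fail(is_virtual_trunk, reroute_on_comm_fail):
    """Hypothetical model of what happens when a trunk fails the Comm Fail test."""
    # Virtual trunks always deroute; physical trunks follow the
    # Reroute on Comm Fail parameter.
    deroute = True if is_virtual_trunk else reroute_on_comm_fail
    return {
        "trunk_declared_failed": True,  # always, regardless of the parameter
        "alarm_and_log_event": True,    # alarms and log events are still generated
        "deroute_connections": deroute,
    }


print(handle_comm_fail(is_virtual_trunk=False, reroute_on_comm_fail=False))
print(handle_comm_fail(is_virtual_trunk=True, reroute_on_comm_fail=False))
```

The first call models a physical trunk with the parameter disabled: the trunk is declared failed and an alarm is raised, but connections stay routed. The second call models a virtual trunk, where connections are derouted regardless of the parameter.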

LMI Failure Prevention-Manual Command

A manual command is added to IGX and AXIS feeder software to allow control over the endpoint connection status. If the BPX cannot communicate LMI messages with its feeders, then the LMI status at the feeders must be maintained to keep the connections "active" to their external devices.

If the BPX hub is flooded with network messages, then LMI/ILMI communication with its feeders may be interrupted. LMI normally runs a keep-alive between the hub node and feeder node. If the keep-alive fails, then the other end changes the status of all connections to failed. If the outage is only due to a network message flood, then it is desirable to override this mechanism to keep the connection status as active.
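The override described above can be pictured with a minimal Python sketch. This is a simplified model under stated assumptions: the function name and the `feeder_loopback` flag are hypothetical, and real LMI status handling involves more states than the two shown here.

```python
def reported_connection_status(keepalive_ok, feeder_loopback):
    """Hypothetical model of connection status seen across the feeder trunk."""
    if feeder_loopback:
        # Manual override in effect: keep-alive failures are ignored and
        # connections stay active toward their external devices.
        return "active"
    return "active" if keepalive_ok else "failed"


# During a flood the keep-alive fails; without the override, connections fail.
print(reported_connection_status(keepalive_ok=False, feeder_loopback=False))  # failed
print(reported_connection_status(keepalive_ok=False, feeder_loopback=True))   # active
```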

The BPX and IGX software now includes the addfdrlp and delfdrlp commands. On the BPX hub with attached feeders, the addfdrlp command clears any communication failures on the specified feeder and sends messages to the remote nodes (the routing nodes for the other end of the feeder connections) informing them of this clearing. In addition, the BPX no longer sends any status updates to the feeder, yet it continues to acknowledge any feeder LMI messages received. The dspnode command indicates loopbacks on feeders.

where: slot is the slot number for the feeder trunk and port is the port number for the feeder trunk

The BPX command delfdrlp restores the BPX's feeder LMI protocol to the normal state and triggers an update of connection status toward the feeder.

where: slot is the slot number for the feeder trunk and port is the port number for the feeder trunk

The following log messages occur as a result of using the feeder loopback commands.

On the IPX/IGX feeder, the addfdrlp command clears any communication failure on the feeder to the routing node (hub). It also clears any ingress (coming from the routing node) A-bit failures. In addition, the feeder does not send the routing node any status updates but continues to acknowledge any routing node LMI messages received.

where: slot is the slot number for the feeder trunk and port is the port number for the feeder trunk

The IGX command delfdrlp restores the routing node's LMI protocol to the normal state and triggers an update of connection status toward the routing node.

where: slot is the slot number for the feeder trunk and port is the port number for the feeder trunk

cnfnodeparm Screen

Figure 17-1 is a sample cnfnodeparm command screen. More than one screen is needed to show all the parameters for this command.
