|
Table Of Contents
General Model of Problem Solving
Maintaining Network Information
Third Party Troubleshooting Tools
Using General Diagnostic Commands
Configuring Online Diagnostics
Troubleshooting Overview
This chapter gives a brief overview of the various areas that might require troubleshooting. This chapter includes the following sections:
• General Model of Problem Solving
• Maintaining Network Information
• Network and System Management
• Third Party Troubleshooting Tools
• Using General Diagnostic Commands
Basic troubleshooting processes, such as troubleshooting Ethernet connections, that are not specific to the Cisco ONS 15540 are not described in this document. This information is found online in other troubleshooting guides such as the Cisco IOS Internetwork Troubleshooting Guide.
General Model of Problem Solving
When troubleshooting a network environment, define the specific symptoms, identify all potential problems that could be causing the symptoms, and then systematically eliminate each potential problem (from most likely to least likely) until the symptoms disappear.
Figure 1-1 illustrates the general problem-solving model. This process is not a rigid outline for troubleshooting an internetwork. It is a foundation on which you can build a problem-solving process for your environment.
Figure 1-1 General Model of Problem Solving
The following steps detail the problem-solving process outlined in Figure 1-1:
Step 1 Analyze the network problem and create a clear problem statement. Define symptoms and potential causes.
Step 2 Gather the facts you need to help isolate possible causes.
Step 3 Consider possible causes based on the facts you gathered.
Step 4 Create an action plan based on those causes. Begin with the most likely problem and devise a plan in which you manipulate only one variable.
Step 5 Implement the action plan, performing each step carefully while testing to see whether the symptom disappears.
Step 6 Analyze the results to determine whether the problem is resolved.
Step 7 Terminate the process if the process is resolved.
Step 8 Create an action plan based on the next most probable cause on your list if the problem is not resolved. Return to Step 4 and repeat the process until the problem is solved.
Make sure that you undo anything you changed while implementing your action plan. Remember that you want to change only one variable at a time.
Note If you exhaust all the common causes and actions (either those outlined in this publication or others that you have identified in your environment), contact customer service. See Appendix A, "Technical Support," for additional information.
Maintaining Network Information
Maintaining the following details about your network helps with troubleshooting your system:
•Maintain an accurate physical and logical map of your internetwork that outlines the physical location of all of the devices on the network and how they are connected, as well as a logical map of network addresses, network numbers, and subnetworks.
•List all network protocols implemented in your network as well as a list of the network numbers, subnetworks, zones, and areas that are associated with them.
•Note which protocols are being routed and what the correct, up-to-date configuration information is for each protocol.
•Document all the points of contact to external networks, including any connections to the Internet. For each external network connection, note what routing protocol is being used.
•Document normal network behavior and performance so that you can compare current problems with a baseline.
Initial Troubleshooting
Before you start the troubleshooting process, confirm that the network and client connections were designed correctly using the information in the Cisco ONS 15540 ESP Planning and Design Guide and the interfaces were configured correctly using the information in the
Cisco ONS 15540 ESP Configuration Guide and Command Reference.Next confirm the integrity of the hardware and its installation by performing the following:
•Reseat the cable.
•Clean the cable, connectors, couplers, and attenuators.
•Confirm that the Tx and Rx fiber optic connections are not mixed.
•Reseat the transponder modules and mux/demux modules to the optical backplane.
•Confirm all modules and motherboards are completely seated or the captive screws are tightened securely to completely mate the optical fiber connectors to the backplane.
•Check the signal level at each input and output to check for too much or too little attenuation.
•Verify that the mux/demux modules and transponder modules are in their proper slots.
•Verify that you are using the proper east and west mux/demux motherboard.
Network and System Management
This section describes the network management tools available for the Cisco ONS 15540.
CiscoWorks 2000 supports a suit of network management applications of which the following are supported on the Cisco ONS 15540:•CiscoView
•DFM (Device Fault Manager)
CiscoView
CiscoView is a device management application providing dynamic status, monitoring, and configuration information for a range of Cisco internetworking products including the
Cisco ONS 15540. CiscoView displays a physical view of a device chassis, with color-coding of modules and ports for at-a-glance status. Monitoring capabilities display performance and other statistics. Configuration capabilities allow changes to devices if security privileges are granted.Cisco ONS 15540 is supported by Embedded CiscoView and server based CiscoView. Online help for CiscoView is available for the server based CiscoView.
DFM
DFM reports faults that occur on Cisco devices, often identifying fault conditions before users of network services realize that the condition exists. DFM analysis technology differs from the traditional rules-based approach to event analysis. DFM analysis uses a top-down approach that starts by identifying the fault conditions that affect managed systems and are important to identify and analyze. Each fault condition causes a set of symptoms—a problem signature—that occurs within the faulty element and in related elements. DFM creates a causality mapping between the fault conditions and the symptoms. After the fault conditions and their symptoms are identified, this information is coded in the analysis model.
Because the event information necessary to diagnose fault conditions is present in the analysis model, DFM monitors only the events necessary to diagnose the condition. DFM simplifies event analysis: there are no rules to write and the analysis model guarantees that critical fault conditions are always identified.
DFM can operate as an independent management system or can integrate with existing management applications to add fault management to the functionality already in place.
For troubleshooting information relating to security implementations and information about configuring and using TACACS+, refer to the Cisco IOS Security Configuration Guide and
Cisco IOS Security Command Reference.Third Party Troubleshooting Tools
In many situations, third-party troubleshooting tools can be helpful. For example, attaching an optical analyzer to a network is less intrusive than using the debug commands, which are processor intensive.
Here are some typical third-party tools used for troubleshooting internetworks:
•Optical cleaning kit—Keeps your optical cable connections clean. This should be in every tool kit that has anything to do with optical equipment. Several problems you encounter will typically be associated with dirty cables.
•Optical power meter—Measures the optical power coming from and going into a piece of equipment. This is the standard operating procedure for installing and troubleshooting optical equipment. Your optical power meter must be able to measure signals at 850 nm, 1310 nm, and 1550 nm.
Note Optical power meters need to be recalibrated once per year.
•TDR (time domain reflectometer)—Locates open circuits, short circuits, crimps, kinks, sharp bends, impedance mismatches, and other defects in metallic cables. A TDR reflects a signal off the end of the cable. Opens, shorts, and other problems reflect back the signal at different amplitudes, depending on the problem. A TDR measures the time it takes for the signal to reflect and calculates the distance to a fault in the cable. TDRs can also measure the length of a cable, and some TDRs can calculate the rate of propagation based on a configured cable length.
•OTDR (optical time domain reflector/reflectometer)—Checks end-to-end loss and detects fiber breaks, splice points in the optical fiber, and fiber attenuation. This tool is essential for initial network startup and later troubleshooting fiber breaks.
•BERT (bit error rate tester)—Tests OC-3, OC-12, and OC-48 ports for end connectivity of the wavelength if the client equipment is not yet available. BERT usually has a built-in power meter to test optical power of the circuit.
•Fiber microscope—Checks the fiber interface for dirt or anything else that could degrade the optical connection.
•Patch cables— Loops back the trunk side. You should keep an assortment of multimode and single-mode patch cables with you, including 1550 nm SM trunk side cables with MU-to-SC interfaces and SC-to-SC coupler. Use attenuators as needed.
•Fixed attenuators—Adds fixed attenuation levels to connections. Five attenuators with 5 dB at
1310 nm and five with 10 dB at 1310 nm, are a good start.•Spectrum analyzer—Views the channel spectrum or analyzes light according to wavelength. It is useful when you suspect channel cross talk and for certifying equipment and performing periodic laser tests for stability.
•Network monitors—Tracks packets crossing a network, providing an accurate picture of network activity. Network monitors do not decode the contents of frames. They are useful for creating a baseline of normal performance. Monitors collect information such as packet sizes, the number of packets, error packets, overall usage of a connection, the number of hosts and their MAC addresses, and details about communications between hosts and other devices. This data can be used to create profiles of LAN traffic and assist in locating traffic overloads, planning for network expansion, detecting intruders, and distributing traffic more efficiently.
Using General Diagnostic Commands
You can use the show, debug, ping, and traceroute commands to monitor and troubleshoot your internetwork.
show Commands
You can use the show commands to perform many functions such as the following:
•Monitors the behavior of your Cisco ONS 15540 during initial installation
•Monitors normal network operation
•Isolates problem interfaces, nodes, media, or applications
•Determines when a network is congested
•Determines the status of servers, clients, or other neighbors
Table 1-1 lists some of the most commonly used show commands:
For more information about show commands, refer to the
Cisco ONS 15540 ESP Configuration Guide and Command Reference and the
Cisco IOS Configuration Fundamentals Command Reference publication.debug Commands
The debug privileged EXEC commands provide information about the traffic on (or not seen) on an interface, error messages generated by nodes on the network, protocol-specific diagnostic packets and cells, and other useful troubleshooting data.
Caution Be careful when using debug commands. Many of these commands are processor intensive and can cause serious network problems (such as degraded performance or loss of connectivity) if they are enabled on an already heavily loaded system. When you finish using a debug command, remember to disable it with its specific no debug command (or use the no debug all command to turn off all debugging).
In many situations, third-party diagnostic tools can be more useful and less intrusive than using debug commands. See the "Third Party Troubleshooting Tools" section.
ping Commands
To check host reachability and network connectivity, use the ping user EXEC or privileged EXEC command. This command can be used to confirm basic network connectivity on IP networks.
For IP, the ping command sends ICMP (Internet Control Message Protocol) echo messages. If a station receives an ICMP echo message, it sends an ICMP echo reply message back to the source.
Using the extended command mode of the ping privileged EXEC command, you can specify the supported IP header options, which allow the Cisco ONS 15540 to perform a more extensive range of test options. To enter ping extended command mode, enter the ping command at the command prompt followed by a return.
To see how the command works under normal conditions, use the ping command when the network is functioning properly. When you are troubleshooting, you can then see the difference between normal and abnormal operation.
For detailed information about using the ping and extended ping commands, refer to the
Cisco IOS Configuration Fundamentals Command Reference publication.traceroute Command
The traceroute user EXEC command discovers the routes packets follow when traveling to their destinations. With the traceroute privileged EXEC command, the supported IP header options are specified, and the Cisco ONS 15540 can perform a more extensive range of test options.
The traceroute command works by using the error message generated by a Cisco ONS 15540 when a datagram exceeds its TTL (Time-To-Live) value. First, probe datagrams are sent with a TTL value of one. This causes the first Cisco ONS 15540 to discard the probe datagrams and send back
time exceeded
error messages. The traceroute command then sends several probes, and displays the round-trip time for each. After every third probe, the TTL increases by one.Each outgoing packet can result in one of two error messages. A
time exceeded
error message indicates that an intermediate Cisco ONS 15540 has seen and discarded the probe. Aport unreachable
error message indicates that the destination node has received the probe and discarded it because it could not deliver the packet to an application. If the timer goes off before a response comes in, the traceroute command displays an asterisk (*).The traceroute command terminates when the destination responds, when the maximum TTL is exceeded, or when the user interrupts the traceroute command with the escape sequence.
To see how the command works under normal conditions, use the traceroute command when the network is functioning properly. When you are troubleshooting, you can then see the difference between normal and abnormal operation.
For detailed information about using the traceroute command, refer to the Cisco ONS 15540 ESP Configuration Guide and Command Reference. For additional information on using debug commands refer to the Cisco IOS Debug Command Reference.
Online Diagnostics
This section describes the online diagnostics available for troubleshooting your Cisco ONS 15540. Online diagnostics provide the following types of tests:
•Accessibility tests between the processor and the modules.
•OIR (online insertion and removal) diagnostic tests.
The Cisco ONS 15540 displays an error message on the console when it detects a hardware failure or problem.
Note Online diagnostic tests only run on the active processor.
Accessibility Test
The accessibility tests ensure connectivity, at a configurable interval, between the following:
•Mux/demux modules
•Mux/demux motherboards
•Transponder modules
•Line card motherboards
•Active processor card
•Standby processor card, if it is present
OIR Test
OIR tests check the functioning of the processor and interfaces on a per-port basis. The processor performs these tests when the system boots up and when you insert a module or motherboard into a slot. The OIR test sends a packet to the interface loopback and expects to receive it within a certain time period. If the packet does not reach the port within the expected time period, or the received packet is corrupted, an error is registered and the port is changed to an administratively down state. Packets that are 1000 bytes in size are used in the test.
Configuring Online Diagnostics
To configure online diagnostics, use the following global configuration commands:
Examples
The following example shows how to enable all online diagnostic tests:
Switch# diag online
The following example shows how to enable online diagnostic tests for the components in slot 3:
Switch# diag online slot 3
The following example shows how to enable debugging for online diagnostics:
Switch# debug diag online
Displaying the Online Diagnostics Configuration and Results
To display the online diagnostics configuration and results, use the following EXEC command:
Command Purposeshow diag online [detail | slot slot]
Displays information about the online diagnostic tests and the test results.
Example
The following example shows how to display detailed access test information:
Switch# show diag online
---------------------------------------------------
Online Diagnostics was DISABLED at 0 minutes
This information is the LAST status before disabling
----------------------------------------------------
Online Diagnostics Current Summary Information
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
On ACTIVE CPU card Slot: 6
CPU Uptime: 19 hours, 58 minutes
Slot CardType Enabled Bootup/ Periodic Previous
Insertion Background Failures
tests tests
~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~ ~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~
0/*/* Mx-DMx-Mthrbd Yes Pass Pass No
1/*/* Mx-DMx-Mthrbd Yes Pass Pass No
2/*/* XpndrMotherboard Yes Pass Pass No
2/ 0/* NPlugXpndrMonitor Yes Pass Pass No
2/ 1/* NPlugXpndrMonitor Yes Pass Pass No
2/ 2/* NPlugXpndrMonitor Yes Pass Pass No
3/*/* XpndrMotherboard Yes Pass Pass No
3/ 0/* NPlugXpndrMonitor Yes Pass Pass No
3/ 1/* NPlugXpndrMonitor Yes Pass Pass No
3/ 2/* NPlugXpndrMonitor Yes Pass Pass No
3/ 3/* NPlugXpndrMonitor Yes Pass Pass No
4/*/* XpndrMotherboard Yes Pass Pass No
4/ 0/* NPlugXpndrMonitor Yes Pass Pass No
6/*/* Queens CPU Yes Pass Pass No
Example
The following example shows how to display diagnostic test status and details:
Switch# show diag online details
Online Diagnostics Detailed Information
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
On ACTIVE CPU card Slot: 6
CPU Uptime: 19 hours, 58 minutes
___________________________________________________________________________
Slot[0]:Mx-DMx-Mthrbd
Enabled: Yes
Online Insertion Tests
Slot CardType TestType Status LastRunTime LastFailTime
~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~~~ ~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~~~
0/*/* Mx-DMx-Mthrbd lrcAccess Pass 0 minutes never
idpromAccess Pass
Online Background Tests
Slot CardType TestType Status LastRunTime LastFailTime
~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~~~ ~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~~~
0/*/* Mx-DMx-Mthrbd lrcAccess Pass19 hours, 58 never
idpromAccess Pass
___________________________________________________________________________
Slot[1]:Mx-DMx-Mthrbd
Enabled: Yes
Online Insertion Tests
Slot CardType TestType Status LastRunTime LastFailTime
~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~~~ ~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~~~
1/*/* Mx-DMx-Mthrbd lrcAccess Pass 0 minutes never
idpromAccess Pass
Online Background Tests
Slot CardType TestType Status LastRunTime LastFailTime
~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~~~ ~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~~~
1/*/* Mx-DMx-Mthrbd lrcAccess Pass19 hours, 58 never
idpromAccess Pass
___________________________________________________________________________
Slot[2]:XpndrMotherboard
Enabled: Yes
Online Insertion Tests
Slot CardType TestType Status LastRunTime LastFailTime
~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~~~ ~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~~~
2/*/* XpndrMotherboard lrcAccess Pass 0 minutes never
idpromAccess Pass
2/ 0/* NPlugXpndrMonitor scAccess Pass 0 minutes never
idpromAcces Pass
2/ 1/* NPlugXpndrMonitor scAccess Pass 0 minutes never
idpromAcces Pass
Slot CardType TestType Status LastRunTime LastFailTime
~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~~~ ~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~~~
2/ 2/* NPlugXpndrMonitor scAccess Pass 0 minutes never
idpromAcces Pass
Online Background Tests
Slot CardType TestType Status LastRunTime LastFailTime
~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~~~ ~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~~~
2/*/* XpndrMotherboard lrcAccess Pass19 hours, 58 never
idpromAccess Pass
2/ 0/* NPlugXpndrMonitor scAccess Pass19 hours, 58 never
idpromAcces Pass
2/ 1/* NPlugXpndrMonitor scAccess Pass19 hours, 58 never
idpromAcces Pass
2/ 2/* NPlugXpndrMonitor scAccess Pass19 hours, 58 never
idpromAcces Pass
___________________________________________________________________________
Slot[3]:XpndrMotherboard
Enabled: Yes
Online Insertion Tests
Slot CardType TestType Status LastRunTime LastFailTime
~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~~~ ~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~~~
3/*/* XpndrMotherboard lrcAccess Pass 0 minutes never
idpromAccess Pass
3/ 0/* NPlugXpndrMonitor scAccess Pass 0 minutes never
idpromAcces Pass
3/ 1/* NPlugXpndrMonitor scAccess Pass 5 minutes never
idpromAcces Pass
3/ 2/* NPlugXpndrMonitor scAccess Pass 0 minutes never
idpromAcces Pass
3/ 3/* NPlugXpndrMonitor scAccess Pass 0 minutes never
idpromAcces Pass
Online Background Tests
Slot CardType TestType Status LastRunTime LastFailTime
~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~~~ ~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~~~
3/*/* XpndrMotherboard lrcAccess Pass19 hours, 58 never
idpromAccess Pass
3/ 0/* NPlugXpndrMonitor scAccess Pass19 hours, 58 never
idpromAcces Pass
3/ 1/* NPlugXpndrMonitor scAccess Pass19 hours, 58 never
idpromAcces Pass
3/ 2/* NPlugXpndrMonitor scAccess Pass19 hours, 58 never
idpromAcces Pass
3/ 3/* NPlugXpndrMonitor scAccess Pass19 hours, 58 never
idpromAcces Pass
___________________________________________________________________________
Slot[4]:XpndrMotherboard
Enabled: Yes
Slot CardType TestType Status LastRunTime LastFailTime
~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~~~ ~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~~~
Online Insertion Tests
Slot CardType TestType Status LastRunTime LastFailTime
~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~~~ ~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~~~
4/*/* XpndrMotherboard lrcAccess Pass 0 minutes never
idpromAccess Pass
4/ 0/* NPlugXpndrMonitor scAccess Pass 0 minutes never
idpromAcces Pass
Online Background Tests
Slot CardType TestType Status LastRunTime LastFailTime
~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~~~ ~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~~~
4/*/* XpndrMotherboard lrcAccess Pass19 hours, 58 never
idpromAccess Pass
4/ 0/* NPlugXpndrMonitor scAccess Pass19 hours, 58 never
idpromAcces Pass
___________________________________________________________________________
Slot[6]:Queens CPU
Enabled: Yes
Online Insertion Tests
Slot CardType TestType Status LastRunTime LastFailTime
~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~~~ ~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~~~
6/*/* Queens CPU srcStatus Pass 0 minutes never
PCIAccess Pass
PCMCIAAccess Pass
Online Background Tests
Slot CardType TestType Status LastRunTime LastFailTime
~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~~~ ~~~~~~ ~~~~~~~~~~~ ~~~~~~~~~~~~~
6/*/* Queens CPU srcStatus Pass19 hours, 58 never
PCIAccess Pass
PCMCIAAccess Pass
Posted: Wed Mar 22 04:56:05 PST 2006
All contents are Copyright © 1992--2006 Cisco Systems, Inc. All rights reserved.
Important Notices and Privacy Statement.