The chapter reviews basic corrective maintenance procedures for the VCO/4K system. It covers the hierarchy of possible causes for system malfunctions and the diagnostic tools available. It also includes system reset procedures for nonredundant and redundant systems. The chapter discusses Cisco Systems' repair-by-replacement policy and the concept of field replaceable units. The chapter ends with flowcharts that aid in resolving host communications issues.
Fault isolation involves identifying a problem, analyzing its cause, and applying the appropriate solution. VCO/4K systems incorporate extensive error messaging and logging facilities that help in the identification process. Problems often have multiple causes, each of which must be identified, analyzed, tested, and confirmed. Repair attempts that simply replace components on a hit-or-miss basis usually mask rather than resolve the actual causes of system malfunctions.
Because the VCO/4K functions as a server to a host computer, fault isolation must also take into account the state of the host computer and its application software at the time a fault is discovered. Troubleshooting thus requires knowledge of the host computer system, the diagnostic capabilities of the application software, the error logging and diagnostic capabilities of the VCO/4K, and basic telephone network test and service procedures.
The "Maintenance Aids" section briefly describes the system log of error and status messages, available through the VCO/4K system administration menus. Daily review of system logs reveals clues to possible system problems. However, a reliable picture of overall system performance requires a history of logs collected over time.
Cisco recommends that you keep the daily printed output of error and status logs for a month. The logs record all system error and status messages output by the system. They provide an excellent history of performance problems and maintenance activities requiring system reinitialization.
Note: To ensure that a continuous hard-copy record of the system error log is always available, Cisco recommends not turning off (deselecting or powering off) the system printer except for maintenance purposes. You can also write system log files to either floppy or hard disk for later use, depending on the File System Configuration screen selections you make (refer to the Cisco VCO/4K System Administrator's Guide for more information).
When performance monitoring indicates a system problem, you should compare the symptoms of the problem against a possible hierarchy of causes. Except for the human error factor, the causes of a system malfunction are either external or internal.
External causes of malfunctions include:
Internal causes of malfunctions include:
VCO/4K software includes diagnostic tools to help isolate the possible causes of a problem. These tools include: error logs, status LEDs, alarm conditions, and administrative maintenance and diagnostic routines.
The diagnostic tools must be complemented by diagnostic routines incorporated into the host application software. The VCO/4K command set includes support for the development of host-controlled diagnostics, including the ability to remove ports from service, monitor card status, and initiate alarms. Thus, the host can trigger events in the VCO/4K that can have the effect of placing portions of the system out of service. Replacing cards and performing other corrective maintenance procedures does not cure a fault caused by the host application.
The most likely cause of a system malfunction remains the human factor. Failure to follow recommended procedures for installing, programming, and maintaining the system results in problems that can sometimes be very difficult to trace.
The VCO/4K is a system of integrated components. Its operation depends on office data entered into the system database. It is coupled to external CO facilities through a main distribution frame that should be carefully mapped and updated as changes are made to the system configuration.
The technical documentation set contains information about, and organizational tools for, installing and maintaining a VCO/4K system. Technicians responsible for maintaining the system should be thoroughly familiar with the following documents:
Technicians should also obtain copies of the documentation set for the host computer system and its application software package. Knowledge of communication protocols and the I/O interface to the VCO/4K is also important.
This section presents a hierarchy of external causes of malfunctions. Causes appear in the order they are most likely to occur.
When individual CO interface circuits fail, calls are blocked from obtaining service or completing a connection to the terminating number. Traffic reports log the loss of service.
When a block of interface circuits fails, the problem is usually the failure of a VCO/4K interface card. The exception to this general rule is the failure of a digital span, which causes the loss of up to 24 channels. A digital span can be lost at the channel bank, the digital switch, or at its interface point with the VCO/4K.
Because the VCO/4K acts as a peripheral device connected to the host, any hardware or software problems occurring at the host translate into problems with the VCO/4K. Such problems can manifest themselves in the following ways:
The host application must be able to generate its own error messages. This is particularly true whenever the host issues a command to trigger an alarm on the VCO/4K. Such alarms are usually the result of a failure (in call processing or communications) detected by the host application software. A detailed error message should indicate why the alarm was triggered so that you can quickly isolate and remedy the cause.
Problems with peripheral equipment can cause the following operational failures in a VCO/4K system:
The principal causes of problems related to peripheral equipment are improper installation, improper cabling, and/or loss of setup parameters. The Cisco VCO/4K Hardware Installation Guide specifies the cabling and setup parameters required for interface with the VCO/4K. Users must enter peripheral operating parameters in the system database through the Peripheral Configuration screen (refer to the Cisco VCO/4K System Administrator's Guide for instructions). These parameters must match the setup parameters defined at the peripheral (refer to the OEM documentation supplied with the peripheral for setup instructions).
VDTs usually experience keyboard and monitor problems because of frequent use. Printer mechanisms wear out over time, and modems can be damaged by line surges over power or CO connections.
Loss of input power to the power entry module results in failure of the VCO/4K. Intermittent power surges and sags, as well as induced noise, can produce the following problems:
This section presents a hierarchy of internal causes of malfunctions. Causes appear in the order they are most likely to occur.
Problems with the database can result in the following:
Tracing database problems requires a very detailed examination of database entries across all of the individual menus associated with a potential problem.
Bus errors can occur as follows:
These occurrences display error messages identifying the affected bus and cards.
Intermittent bus errors can be the result of:
Persistent bus errors can be a sign of:
CPU, memory, and peripheral interface problems can be traced to the Combined Controller Assembly (CPU and SWI). Combined Controller Assembly problems can be caused by improper jumper settings on the card, card failure, or bus faults. Problems associated with the Combined Controller include:
If the Combined Controller fails to establish communications with the NBC during initialization, the CPU performs a Phase 4 reboot. A message on the system console indicates that a reboot is beginning. If this series of events recurs, there could be a problem with SWI and NBC3 communications.
Mass storage problems are associated with read/write operations from or to the floppy or hard disk drive. The mass storage complex includes the Combined Controller, which houses the floppy drive, and the Storage/Control I/O module, where the hard drive is installed.
Mass storage problems cause the following events to occur:
Causes of hardware failures on individual circuit cards can be:
VCO/4K circuit cards include one or more PROMs. The PROMs contain coded firmware that interacts with the VCO/4K system software to control operation. Refer to the Cisco VCO/4K Card Technical Descriptions for the locations of PROMs on VCO/4K circuit cards. The system software release notes list the firmware revision levels required on all circuit cards. The system does not function properly without the correct firmware.
If you experience system problems after loading new system software or after replacing a circuit card, check for firmware compatibility. Always refer to the configuration information contained in the release notes. Obtain the correct firmware PROMs from Cisco Systems and install them on all affected circuit cards, including those held as spares.
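The compatibility check described above amounts to comparing the firmware revision installed on each card against the revision the release notes require. The following Python sketch illustrates that comparison for a hand-compiled inventory. The card types, slot labels, and revision strings are hypothetical placeholders, not values taken from any Cisco release notes.

```python
def firmware_mismatches(required, inventory):
    """List every card whose installed firmware differs from the required revision.

    required:  dict mapping card type -> revision required by the release notes
    inventory: list of (card_id, card_type, installed_revision); include spares
    Returns a list of (card_id, card_type, installed_revision, required_revision).
    """
    mismatches = []
    for card_id, card_type, installed in inventory:
        needed = required.get(card_type)
        if needed is None:
            # Card type missing from the required-revision table: flag for manual review.
            mismatches.append((card_id, card_type, installed, "unknown"))
        elif installed != needed:
            mismatches.append((card_id, card_type, installed, needed))
    return mismatches


# Hypothetical data for illustration only.
required = {"CARD-A": "2.01", "CARD-B": "1.10"}
inventory = [
    ("slot-3", "CARD-A", "2.01"),
    ("slot-4", "CARD-B", "1.08"),
    ("spare-1", "CARD-A", "1.95"),  # spares are checked like installed cards
]

for row in firmware_mismatches(required, inventory):
    print(row)
```

Keeping spares in the same inventory as installed cards helps ensure that a replacement card does not reintroduce an incompatible firmware revision.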
The VCO/4K provides diagnostic tools to facilitate fault isolation. These tools consist of error and status logs, status LEDs, alarm conditions, and diagnostic test routines run from the system administration console.
The role of error and status logs in the fault isolation process is described in the "System Log" section. Remote maintenance access to log files allows Cisco technical support and/or administrators of multisystem installations to quickly review the recent performance of a system.
Note: During periods of high traffic volume, remote maintenance by way of a modem might not be desirable. Modem access can overload the Combined Controller, causing calls to be dropped or lost.
Note: The operational status of LEDs on peripheral and specialized telecommunications equipment varies according to manufacturer. Review OEM manuals for detailed information.
VCO/4K systems support an alarm condition scheme consistent with the alarm requirements described in Bellcore specification OTGR: Network Maintenance: Network Element.
System-wide alarm conditions are divided into four severity levels: fatal, critical, major, and minor. Fatal alarms cause a system switchover (in redundant systems) or a system reset (in nonredundant systems).
Critical, major, and minor alarm conditions require action to resolve the problem. Recovery from a major alarm may require component replacement and a controller reset, thus placing the system out of service. Minor alarms might require software and/or hardware changes before the condition is eliminated and the alarm is reset.
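The severity scheme above can be summarized in a few lines of Python. This is only an illustrative model of the behavior described in this section, not VCO/4K software; the names `Severity` and `automatic_response` are invented for the sketch.

```python
from enum import Enum


class Severity(Enum):
    FATAL = "fatal"
    CRITICAL = "critical"
    MAJOR = "major"
    MINOR = "minor"


def automatic_response(severity, redundant):
    """Return the automatic system response to an alarm, or None.

    Fatal alarms trigger a switchover (redundant systems) or a reset
    (nonredundant systems). All other severities return None, meaning
    that someone must act to resolve the problem.
    """
    if severity is Severity.FATAL:
        return "switchover" if redundant else "reset"
    return None


print(automatic_response(Severity.FATAL, redundant=True))
print(automatic_response(Severity.FATAL, redundant=False))
print(automatic_response(Severity.MAJOR, redundant=True))
```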
The host can set two additional auxiliary alarms by sending a Set/Reset Host Alarms ($C0 03) command. Refer to the Cisco VCO/4K Standard Programming Reference and Cisco VCO/4K Extended Programming Reference.
Alarm condition indicators appear:
Note: The Audible Cutoff (Y/N) option on the System Alarms Display screen disables the Major Alarm LED indicator on the AAC as well as the external audible alarms. It does not clear the alarm condition.
The Cisco VCO/4K Card Technical Descriptions and Cisco VCO/4K Mechanical Assemblies describe major and minor alarm conditions for individual circuit cards and subsystems.
The following system administration screens provide indications of system alarms:
The system log file provides information on general alarm conditions. It contains combination messages with both ALM and FRM prefix codes to indicate alarm conditions. These messages are written to the log file only at the initial occurrence of the alarm condition; similarly, messages are generated only for the clearing of the last occurrence of the alarm. In addition to these messages, an optional periodic alarm report can be written to the log file five minutes after system initialization and at 30-minute intervals thereafter. This option is activated or deactivated in the System Features screen (refer to the Cisco VCO/4K System Administrator's Guide).
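When reviewing a log file, it can help to know when the optional periodic alarm reports should appear. The Python sketch below computes the expected times from the two intervals stated above (five minutes after initialization, then every 30 minutes). The function name and the example initialization time are assumptions made for illustration.

```python
from datetime import datetime, timedelta

FIRST_REPORT_DELAY = timedelta(minutes=5)   # first report after initialization
REPORT_INTERVAL = timedelta(minutes=30)     # spacing of later reports


def expected_report_times(initialized_at, until):
    """Return the expected periodic alarm report times up to and including `until`."""
    times = []
    t = initialized_at + FIRST_REPORT_DELAY
    while t <= until:
        times.append(t)
        t += REPORT_INTERVAL
    return times


# Hypothetical initialization time for illustration only.
init = datetime(2002, 9, 28, 8, 0)
for t in expected_report_times(init, init + timedelta(hours=1, minutes=30)):
    print(t.strftime("%H:%M"))
```

For an initialization at 08:00 and a review window ending at 09:30, the sketch lists 08:05, 08:35, and 09:05. A report that is missing or offset from this schedule might indicate that the system reinitialized in the interim, which is worth confirming against the other log entries.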
The Cisco VCO/4K System Administrator's Guide discusses administration screens that display alarm conditions and system log file alarm messages.
The Alarm Condition ($F0) report notifies the host of alarms. This report provides the same level of information to the host as the System Alarms Display provides to the system administrator. Alarm codes within the report map to the same ALM alarm messages that appear on the System Alarms Display and in logfile messages. Refer to the Cisco VCO/4K Standard Programming Reference and Cisco VCO/4K Extended Programming Reference for a description of the $F0 report.
The Diagnostics menu offers the following options:
For a complete description of these functions and usage instructions, refer to the Cisco VCO/4K System Administrator's Guide.
The Cisco VCO/4K System Administrator's Guide and the Cisco VCO/4K Hardware Installation Guide provide detailed procedures for booting the system from hard or floppy disk. The following sections describe the maintenance implications of a system reset.
Resets are not required to service the Combined Controller (where the floppy drive resides) and/or the Storage/Control I/O module (where the hard drive resides), or to replace an NBC3.
An enhanced redundancy feature enables the standby controller to process the new SETUP redundancy information. Both the active and standby controllers consistently track all ports in a stable or setup state, as well as conference calls.
A standby controller can be serviced while the active controller maintains system operation.
Note: To avoid an inadvertent reset or switchover between controllers, set the Select switch on the AAC to the active controller side, not to the AUTO position. Return the Select switch to the AUTO position after you have completed servicing one side.
Automatic synchronization utilities copy and restore files from the active to the standby controller prior to restoring the standby controller to service. You can reboot standby controllers from hard disk or floppy disk without disrupting system operation.
The following service circuit and trunk cards require a software download from hard disk prior to being brought into service:
During initial system power-up (cold reset), the software downloads are broadcast simultaneously to each card type. The system is restored to operation after all downloads have been completed. If an individual downloadable circuit card is removed and replaced, the download is sent to that card alone once its power-up sequence completes, and the card is activated only after the download finishes.
The Cisco repair-by-replacement policy provides maximum system availability with minimum downtime. The technician can remove and replace field-replaceable units (FRUs) to bring the system back to normal operation as quickly as possible. Components removed from service can be returned to the factory for quick turnaround repair.
To maintain maximum system availability, Cisco encourages the customer to purchase spares of critical components to have on hand when a component failure is isolated and replacement is required. The Cisco VCO/4K Hardware Planning Guide lists spare components available from Cisco Systems.
The Cisco VCO/4K Hardware Planning Guide lists spare components that can be replaced in the field by trained technicians. It lists the recommended spares for the VCO/4K system. Items not in the list can only be serviced or replaced by the factory or by Cisco Systems field engineers.
For more information about troubleshooting, refer to the Cisco VCO/4K Troubleshooting Guide.
Posted: Sat Sep 28 16:48:43 PDT 2002
All contents are Copyright © 1992--2002 Cisco Systems, Inc. All rights reserved.