Developing UCP Network Topology

A fault-tolerant system is one that operates continuously with acceptable results even in the event of unexpected failures. A fault-tolerant system must be able to:

Detect a fault
Perform the normal operation and obtain acceptable results after one or more system components have failed
Report the problem

Fault tolerance in UCP:

Relies on the redundancy (duplicate instances) of UCP services. Client services must ensure that backup services are selected in the event of a service failure.
Requires two instances of the same service running simultaneously.

The NOC services (such as DNS, the ActiveWeb Information Broker, the mother cache, and the directory store) cover a number of POPs. A failure in any of these components affects a much larger portion of the network. For this reason, Cisco recommends installing high-end Sun workgroup systems or small-to-medium-sized Sun Ultra Enterprise systems at the central or NOC locations.

Load Balancing for Improved Performance and Reliability

The UCP architecture permits two instances of the same service to coexist so loads can be balanced for more efficient processing.

When trying to reach an external service, UCP components configured for load balancing (such as the Protocol Gateway Service) automatically adjust the load by accessing two external service components. UCP maintains a hashed table of entries sent to various configured external service components. It periodically ranks the external components based on response times, the number of milliseconds that packets wait in a queue to access the external service, the number of timeouts generated from an external service, and so on.

The external service with the highest rank becomes the primary external service. As this service slows down, the client program will shift the load to the other service.
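The ranking described above could be sketched as follows. This is a minimal illustration, not UCP's actual algorithm: the source does not say how the measurements are combined, so the `ServiceStats` fields, the `TIMEOUT_PENALTY_MS` weight, the additive cost formula, and the service names are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class ServiceStats:
    """Measurements for one external service (hypothetical structure)."""
    name: str
    avg_response_ms: float    # mean response time
    avg_queue_wait_ms: float  # mean time packets wait in the queue
    timeouts: int             # timeouts seen since the last ranking


# Assumed weight: one timeout counts as much as this many milliseconds.
TIMEOUT_PENALTY_MS = 1000.0


def cost(stats: ServiceStats) -> float:
    """Combine the measurements into one number; lower is better."""
    return (stats.avg_response_ms
            + stats.avg_queue_wait_ms
            + TIMEOUT_PENALTY_MS * stats.timeouts)


def rank_services(services):
    """Return the services ordered best first; entry 0 is the primary."""
    return sorted(services, key=cost)


services = [
    ServiceStats("acs-a", avg_response_ms=40.0, avg_queue_wait_ms=5.0, timeouts=0),
    ServiceStats("acs-b", avg_response_ms=25.0, avg_queue_wait_ms=3.0, timeouts=2),
]
primary = rank_services(services)[0]
print(primary.name)  # acs-a: acs-b is faster but is penalized for its timeouts
```

Re-running `rank_services` periodically with fresh measurements reproduces the shift described above: when the primary slows down enough, the other service moves to the top of the list.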

The following sections discuss redundancy on a service-by-service basis.

Control Adapter

The Control Adapter (CA) on each system can be configured to watch the services that are running, restart any service whose failure it detects, and report the restart to the NCC via an exception event. If repeated restarts fail, the CA reports a restart failure using an exception event.
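The watch-and-restart behavior of the CA can be pictured with a short sketch. It is only an illustration under stated assumptions: the `supervise` function, its callback parameters, the event names, and the limit of three restart attempts are hypothetical and do not come from the UCP software.

```python
from typing import Callable

MAX_RESTARTS = 3  # assumed limit; the source does not give a number


def supervise(name: str,
              is_running: Callable[[], bool],
              restart: Callable[[], bool],
              report: Callable[[str, str], None],
              max_restarts: int = MAX_RESTARTS) -> bool:
    """Check one service and restart it if it is down.

    Returns True if the service is running when the function returns.
    `report` stands in for sending an exception event to the NCC.
    """
    if is_running():
        return True
    for attempt in range(1, max_restarts + 1):
        if restart():
            report("restart", f"{name} restarted on attempt {attempt}")
            return True
    report("restart_failure",
           f"{name} could not be restarted after {max_restarts} attempts")
    return False


# Usage with stand-in callbacks: the first restart fails, the second succeeds.
events = []
outcomes = iter([False, True])
supervise("pgs",
          is_running=lambda: False,
          restart=lambda: next(outcomes),
          report=lambda kind, text: events.append((kind, text)))
print(events)
```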

Protocol Gateway Service

Cisco recommends that two or more instances of the Protocol Gateway Service (PGS) be running at all times. When a fault occurs, the PGS is first expected to respond gracefully and report the problem; if the PGS fails, a dependent component or subsystem must be able to use the backup PGS without any interruption or loss of data by UCP.

To withstand hardware failure, PGS redundancy is required.

CiscoSecure Access Control Server Authentication and Authorization

Cisco recommends that two instances of the CiscoSecure Access Control Server (ACS) be running at all times. Each PGS is configured to recognize each instance of a CiscoSecure ACS. The PGS records the processing time of each CiscoSecure ACS instance for each packet. A failure in one instance causes the PGS to use the other instances of CiscoSecure ACS.

Dynamic Host Configuration Protocol

Cisco recommends that two or more instances of the Dynamic Host Configuration Protocol (DHCP) service be configured on a local POP. Each instance will control a subset of all available IP addresses represented in one or more IP pools. Each instance of a DHCP server must support all possible user type pools.

Note: IP pools in the two DHCP services should not include the same IP address range.
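One way to verify the rule in the note before deploying a configuration is to check the planned pools for overlap. The sketch below uses Python's standard `ipaddress` module; the server names, the address ranges, and the `find_overlaps` helper are hypothetical examples, not part of UCP.

```python
import ipaddress
from itertools import combinations


def ip_range(first: str, last: str):
    """Convert a (first, last) address pair to a pair of integers."""
    lo = int(ipaddress.ip_address(first))
    hi = int(ipaddress.ip_address(last))
    if lo > hi:
        raise ValueError(f"range {first}-{last} is reversed")
    return lo, hi


def ranges_overlap(r1, r2) -> bool:
    """Two inclusive ranges overlap if each starts before the other ends."""
    return r1[0] <= r2[1] and r2[0] <= r1[1]


def find_overlaps(pools):
    """pools maps a DHCP server name to a list of (first, last) address pairs.

    Returns a list of (server_a, range_a, server_b, range_b) conflicts
    between pools that belong to different servers.
    """
    conflicts = []
    for (srv_a, ranges_a), (srv_b, ranges_b) in combinations(pools.items(), 2):
        for ra in ranges_a:
            for rb in ranges_b:
                if ranges_overlap(ip_range(*ra), ip_range(*rb)):
                    conflicts.append((srv_a, ra, srv_b, rb))
    return conflicts


# Made-up plan: each server controls a disjoint half of the same subnet.
plan = {
    "dhcp-1": [("10.0.0.10", "10.0.0.127")],
    "dhcp-2": [("10.0.0.128", "10.0.0.250")],
}
print(find_overlaps(plan))  # [] because the two ranges do not intersect
```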

ActiveWeb Information Broker

Multiple instances of the ActiveWeb Information Broker (IB) cannot be used with the same UCP servers and services. There should be two IB servers for one or more POPs on the network: one dedicated to processing heartbeat events, and the other to processing the remaining event types for all publishers and subscribers in the network.

Active Software does not support redundancy for the IB. Therefore, a hardware failure in the system will halt traffic on the IB until another broker replaces the down server. Cisco recommends either:

Using a reliable hardware platform
Configuring another broker system with the same configuration parameters as the working one and replacing the down server with this second one

Because UCP services and servers can be configured with the IP address of only one IB, the replacement system should have the same IP address, and should be connected to the same subnet as the down server. The impact on UCP operation is the loss of events in the IB volatile memory and any events sent by publishers while the down server is replaced with another.

Mother Cache

Two or more instances of the mother cache are required to support fault tolerance. All instances of mother cache services subscribe to all event types that contain mother cache updates, but only one instance is configured to publish update events to the local caches. The backup mother caches monitor the primary mother cache; if the primary mother cache fails, one of the backup servers reconfigures itself to publish events and assumes the role of the primary mother cache.
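The takeover rule for mother caches can be illustrated with a small model. This is a simplified sketch, not the UCP mechanism: in the description above each backup monitors the primary itself, whereas this example centralizes the decision in one hypothetical `elect_primary` function, and the choice of the first live backup in list order is an assumption.

```python
class MotherCache:
    """Toy model of one mother cache instance (hypothetical)."""

    def __init__(self, name: str, publishes: bool = False):
        self.name = name
        self.publishes = publishes  # only the primary publishes updates
        self.alive = True


def elect_primary(caches):
    """Make sure exactly one live cache publishes and return it.

    If the current primary is down, the first live backup in list order
    is reconfigured to publish. Returns None if no cache is alive.
    """
    for cache in caches:
        if not cache.alive:
            cache.publishes = False  # a failed instance cannot publish
    live = [c for c in caches if c.alive]
    for cache in live:
        if cache.publishes:
            return cache  # the primary is healthy; nothing to do
    if not live:
        return None
    live[0].publishes = True
    return live[0]


caches = [MotherCache("mc-1", publishes=True), MotherCache("mc-2"), MotherCache("mc-3")]
print(elect_primary(caches).name)  # mc-1
caches[0].alive = False            # simulate a failure of the primary
print(elect_primary(caches).name)  # mc-2
```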

Local (POP-Level) Caches

At a local POP, you can configure UCP to run two cache services. To do so, you must configure all instances of UCP caches to subscribe to the same event types. All services that interact with a cache outside the information bus must be configured with a list of cache services at UCP installation time. If the primary cache fails, a timeout on the client component (the CiscoSecure ACS translator) causes the client to resend the request to the next cache service in line.
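The client-side failover described above amounts to walking an ordered list of cache services until one answers. A minimal sketch follows, assuming a hypothetical `send` callable that raises Python's built-in `TimeoutError` when a cache does not respond; the cache names and the request string are invented for the example.

```python
from typing import Callable, Sequence


def query_with_failover(caches: Sequence[str],
                        send: Callable[[str, str], str],
                        request: str) -> str:
    """Send the request to each configured cache in order.

    A timeout moves on to the next cache in the list; if every cache
    times out, the last timeout is re-raised with a summary message.
    """
    last_error = None
    for cache in caches:
        try:
            return send(cache, request)
        except TimeoutError as exc:
            last_error = exc
    raise TimeoutError(f"no cache service answered; tried {list(caches)}") from last_error


# Stand-in transport: the first cache never answers, the second one does.
def fake_send(cache: str, request: str) -> str:
    if cache == "cache-1":
        raise TimeoutError(f"{cache} timed out")
    return f"{cache} handled {request}"


print(query_with_failover(["cache-1", "cache-2"], fake_send, "lookup user42"))
```

The list passed as `caches` plays the role of the cache list configured at UCP installation time.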

Domain Name System

UCP uses a primary and secondary DNS server; both subscribe to the same events, allowing each server to be a backup for the other.

Illustrated Network Topologies

The network topology diagrams on the following pages are provided to illustrate environments into which UCP can be integrated:

Figure 2-1: LANs with Redundancy (ADSL Environment)

Figure 2-2: LANs without Redundancy (ADSL Environment)

Figure 2-3: LANs with Redundancy (Dial Environment)

Figure 2-4: LANs without Redundancy (Dial Environment)
