The two data center architecture is based on a standard Serviceguard configuration
with half of the nodes in one data center and the other half in another data center.
Nodes can be located in separate data centers in the same building, or even in
separate buildings, within the limits of FibreChannel technology. Configurations
with two data centers have the following requirements:

There must be an equal number of nodes (1 or 2) in each data center.
In order to maintain cluster quorum after the loss of an entire data center, you
must configure dual cluster lock disks (one in each data center). Since cluster
lock disks are only supported for up to 4 nodes, the cluster can contain only 2 or
4 nodes. The Serviceguard Quorum Server cannot be used in place of dual cluster
lock disks, because the Quorum Server must reside in a third data center.
Therefore, a three data center cluster is a preferable solution if dual cluster
lock disks cannot be used, or if the cluster must have more than 4 nodes.
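For illustration only, the dual cluster lock might be defined in the cluster
configuration ASCII file roughly as follows; the volume group names, device files,
and node names are hypothetical and must match your own configuration:

    FIRST_CLUSTER_LOCK_VG    /dev/vglock1        # lock volume group in data center 1
    SECOND_CLUSTER_LOCK_VG   /dev/vglock2        # lock volume group in data center 2

    NODE_NAME   node1_dc1
      FIRST_CLUSTER_LOCK_PV    /dev/dsk/c4t0d0   # path to the lock disk in data center 1
      SECOND_CLUSTER_LOCK_PV   /dev/dsk/c6t0d0   # path to the lock disk in data center 2

    NODE_NAME   node1_dc2
      FIRST_CLUSTER_LOCK_PV    /dev/dsk/c4t1d0
      SECOND_CLUSTER_LOCK_PV   /dev/dsk/c6t1d0

Run cmcheckconf against the edited file before applying it with cmapplyconf.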
When using dual cluster lock disks, there is a chance of split brain syndrome
(where the nodes in each data center form two separate clusters, each with exactly
one half of the cluster nodes) if all communication between the two data centers
is lost while all nodes remain running. The Serviceguard Quorum Server prevents
the possibility of split brain, but the Quorum Server must reside in a third site.
A three data center cluster is therefore the preferable solution for preventing
split brain, and the only solution if dual cluster lock disks cannot be used or if
the cluster must have more than 4 nodes.
Two data center configurations are not supported if SONET is used for the cluster
interconnects between the primary data centers.

To protect against the possibility of a split cluster, which is inherent when using
dual cluster locks, at least two (three preferred) independent paths between the
two data centers must be used for heartbeat and cluster lock I/O. Specifically, the
path from the first data center to the cluster lock at the second data center must
be different from the path from the second data center to the cluster lock at the
first data center. Preferably, at least one of the paths for heartbeat traffic
should be different from each of the paths for cluster lock I/O.

No routing is allowed for the networks between data centers.
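As a hypothetical sketch of the redundant heartbeat portion of this requirement
(interface names and addresses are examples only), each node can carry two
heartbeat subnets routed over physically separate inter-site links, provided the
volume manager in use allows multiple heartbeat networks (CVM 3.5, discussed
below, does not):

    NODE_NAME   node1_dc1
      NETWORK_INTERFACE  lan1
        HEARTBEAT_IP     192.168.1.11      # heartbeat subnet on inter-site path A
      NETWORK_INTERFACE  lan2
        HEARTBEAT_IP     192.168.2.11      # heartbeat subnet on inter-site path B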
MirrorDisk/UX mirroring for LVM and VxVM mirroring are supported for clusters of
2 or 4 nodes. However, the dual cluster lock devices can only be configured in
LVM Volume Groups.
There can be separate networking and FibreChannel links between the two data
centers, or both networking and FibreChannel can go over DWDM links between the
two data centers. See the section below, “Network and Data Replication Links
Between the Data Centers”, for more details.

CVM 3.5 and CVM 4.1 mirroring is supported for Serviceguard and Extended Cluster
for RAC clusters. However, the dual cluster lock devices must still be configured
in LVM Volume Groups. Since cluster lock disks are only supported for up to 4
nodes, the cluster can contain only 2 or 4 nodes.

MirrorDisk/UX mirroring for Shared LVM volume groups is supported for Extended
Cluster for RAC clusters containing 2 nodes.
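For example (the volume group name below is hypothetical), a shared LVM volume
group used by Extended Cluster for RAC is marked cluster-aware and then activated
in shared mode on each node:

    vgchange -c y /dev/vg_rac     # mark the volume group as cluster-aware (cluster must be running)
    vgchange -a s /dev/vg_rac     # activate the volume group in shared mode on each node

In practice the shared activation is normally performed by the package control
scripts rather than by hand.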
FibreChannel Direct Fabric Attach (DFA) is recommended over FibreChannel
Arbitrated Loop configurations, due to the superior performance of DFA, especially
as the distance increases. Therefore, FibreChannel switches are preferred over
FibreChannel hubs.

Any combination of the following FibreChannel-capable disk arrays may be used:
HP StorageWorks Virtual Arrays, HP StorageWorks Disk Array XP, Enterprise Virtual
Arrays (EVA), or EMC Symmetrix Disk Arrays. Refer to the HP Configuration Guide
(available through your HP representative) for a list of supported FibreChannel
hardware.
Application data must be mirrored between the primary data centers. If
MirrorDisk/UX is used, Mirror Write Cache (MWC) must be the consistency recovery
policy defined for all mirrored logical volumes. This allows resynchronization of
only the stale extents after a node crash, rather than requiring a full
resynchronization. For SLVM (concurrently activated) volume groups, Mirror Write
Cache must not be defined as the consistency recovery policy for mirrored logical
volumes (that is, NOMWC must be used). This means that a full resynchronization
may be required for shared volume group mirrors after a node crash, which can have
a significant impact on recovery time. To ensure that the mirror copies reside in
different data centers, it is recommended to configure physical volume groups for
the disk devices in each data center and to use a PVG-strict allocation policy for
all mirrored logical volumes.
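The following sketch illustrates one way this could look; the volume group, device
file, and physical volume group names are hypothetical. The /etc/lvmpvg file groups
the disks by data center:

    VG  /dev/vgdata
    PVG pvg_dc1
    /dev/dsk/c4t0d1
    /dev/dsk/c4t0d2
    PVG pvg_dc2
    /dev/dsk/c6t0d1
    /dev/dsk/c6t0d2

Mirrored logical volumes can then be created with PVG-strict allocation and the
appropriate consistency recovery policy:

    # Exclusive (non-shared) volume group: Mirror Write Cache (MWC)
    lvcreate -L 1024 -m 1 -s g -M y -c y -n lvdata /dev/vgdata

    # Shared (SLVM) volume group: NOMWC (MWC off, consistency recovery on)
    lvcreate -L 1024 -m 1 -s g -M n -c y -n lvrac /dev/vgrac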
Due to the maximum of 3 images (1 original image plus two mirror copies) allowed
by MirrorDisk/UX, if JBODs are used for application data, only one data center can
contain JBODs, while the other data center must contain disk arrays with hardware
mirroring. Note that having three mirror copies will affect performance on disk
writes. VxVM and CVM 3.5 mirroring do not have a limit on the number of mirror
copies, so it is possible to have JBODs in both data centers; however, increasing
the number of mirror copies may adversely affect performance on disk writes.
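As a hypothetical example (the logical volume name is illustrative), a second
MirrorDisk/UX mirror copy, giving three images in total, can be added to an
existing mirrored logical volume as follows:

    lvextend -m 2 /dev/vgdata/lvdata    # raise the number of mirror copies to 2 (3 images total)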
Veritas Volume Manager (VxVM) mirroring is supported for distances of up to 100
kilometers for clusters of up to 16 nodes. However, on some supported versions of
HP-UX, VxVM mirroring is limited to 10 kilometers for 16-node clusters. Ensure
that the mirror copies reside in different data centers and that the Dirty Region
Logging (DRL) feature is used. RAID 5 mirrors are not supported. It is important
to note that VxVM can only perform a full resynchronization (that is, it cannot
perform an incremental resynchronization) when recovering from the failure of a
mirror copy or loss of connectivity to a data center. This can have a significant
impact on performance and availability of the cluster if the disk groups are
large.
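For illustration (the disk group, disk, and volume names are hypothetical), a VxVM
volume mirrored across the two data centers with DRL enabled might be created like
this:

    # one plex on a disk in each data center, with a dirty region log
    vxassist -g dgdata make voldata 50g layout=mirror nmirror=2 logtype=drl dc1disk01 dc2disk01

    vxprint -g dgdata -ht voldata       # verify the plexes and the DRL log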
Veritas CVM version 3.5 mirroring is supported for Serviceguard, Serviceguard OPS
Edition, or Serviceguard Extension for RAC (SGeRAC) clusters for distances up to
10 kilometers for 2, 4, 6, or 8 node clusters, and up to 100 kilometers for 2 node
clusters. Since CVM 3.5 does not support multiple heartbeats and allows only one
heartbeat network to be defined for the cluster, you must make the heartbeat
network highly available by using a standby LAN to provide redundancy for it. The
heartbeat subnet should be a dedicated network, to ensure that other network
traffic does not saturate the heartbeat network. The CVM mirror detachment policy
must be set to “Global”.

CVM 4.1 supports multiple heartbeat subnets. For clusters using Veritas CVM 3.5,
only a single heartbeat subnet is supported, so both Primary and Standby LANs must
be configured for the heartbeat subnet on all nodes. For SGeRAC clusters, it is
recommended to have an additional network for Oracle RAC cache fusion traffic. It
is acceptable to use a single Standby network to provide backup for both the
heartbeat network and the RAC cache fusion network; however, it can only provide
failover capability for one of these networks at a time.
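A hypothetical node entry from the cluster configuration file for the CVM 3.5 case
might look roughly like this (interface names and addresses are examples only);
lan1 carries the single dedicated heartbeat subnet, lan2 is its standby, and lan3
carries RAC cache fusion traffic:

    NODE_NAME   node1_dc1
      NETWORK_INTERFACE  lan1
        HEARTBEAT_IP     192.168.1.11      # dedicated heartbeat subnet
      NETWORK_INTERFACE  lan2              # standby LAN for the heartbeat subnet (no IP)
      NETWORK_INTERFACE  lan3
        STATIONARY_IP    10.10.20.11       # Oracle RAC cache fusion traffic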
Serviceguard Extension for Faster Failover (SGeFF) is not supported in a two data
center architecture, because SGeFF requires a two-node cluster and the use of a
Quorum Server, which must reside in a third data center. For more detailed
information on SGeFF, refer to the Serviceguard Extension for Faster Failover
Release Notes and the “Optimizing Failover Time in a Serviceguard Environment”
white paper.
Two Data Center FibreChannel Implementations
In a two data center configuration, shown in Figure 2-1 “Two Data Centers with
FibreChannel Hubs”, it is required to use a cluster lock disk, which is only
supported for up to 4 nodes. This configuration can be implemented using any
HP-supported FibreChannel devices. Disks must be available from all nodes using
redundant links; not all links are shown in Figure 2-1 “Two Data Centers with
FibreChannel Hubs”.

The two cluster lock disks should be located on separate FibreChannel loops to
guard against a single point of failure. The lock disks can also be used as data
disks, and they too must be connected to all nodes using redundant links.

Nodes can connect to disks in the same data center using short-wave ports, and
hubs can connect between data centers using long-wave ports. This gives a maximum
distance of 10 kilometers between data centers, making it possible to locate the
data centers in different buildings.
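As a general verification step, the standard HP-UX I/O scan commands can be run on
each node to confirm that the disks, including both cluster lock disks, are
visible over the redundant links:

    ioscan -fnC fc       # list the Fibre Channel HBAs and their hardware paths
    ioscan -fnC disk     # list the disk devices visible through those paths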
Advantages and Disadvantages of a Two Data Center Architecture
The advantages of a two data center architecture are:

- Only two data centers are needed, meaning less space and less coordination between operations staff.
- No arbitrator nodes are needed.
- All systems are connected to both copies of the data, so if a primary disk fails but the primary system stays up, availability is greater because there is no package failover.

The disadvantages of a two data center architecture are:

- There is a slight chance of split brain syndrome. Because there are two cluster lock disks, split brain would occur if all communication between the two data centers were lost at the same time that each data center lost access to the cluster lock disk at the other site while retaining access to its own. The chances are slight; however, these events happening at the same time would result in split brain syndrome and probable data inconsistency. Planning different physical routes for both the network and data connections, or adequately protecting the physical routes, greatly reduces the possibility of split brain syndrome.
- Software mirroring increases CPU overhead.
- The cluster must be either two or four nodes with cluster lock disks. Larger clusters are not supported due to cluster lock requirements.
- Although this is a low cost solution, it does require some additional cost: FibreChannel links are required for both local and remote connectivity, and all systems must be connected to multiple copies of the data and to both cluster lock disks.