Chapter 1 Arbitration for Data Integrity in Serviceguard Clusters

To Arbitrate or Not to Arbitrate



Arbitration is not always used to determine cluster membership. Some cluster software products rely exclusively on the use of multiple cluster membership communication links (heartbeats). These algorithms are described in the following sections.

No Arbitration—Multiple Paths

Some approaches do not use arbitration, but instead rely on multiple membership paths to ensure that the heartbeat or essential intra-cluster communication remains unbroken. In this approach, the event of a node failing entirely is considered more likely than the event of several LAN paths all failing at the same time. Such systems assume that a loss of communication means a node failure, and packages are allowed to fail over when a loss of heartbeat is detected.

This model is illustrated in Figure 1-1 and Figure 1-2. In Figure 1-1, three separate LAN failures would be required to break communication between the cluster nodes. This assumes that the hubs are separately powered and that other HA design criteria are met.

Figure 1-1 Multiple Heartbeat Failures


In Figure 1-2, on the other hand, a single node failure would result in the loss of heartbeat communication. In the no-arbitration model, the cluster manager would interpret the loss of heartbeat as a failure of node 1, and the cluster could therefore re-form with packages failing over from node 1 to node 2.

Figure 1-2 Single Node Failure
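
The decision rule in the no-arbitration model can be sketched in a few lines. This is only an illustration of the reasoning above, not the code of any cluster product; the names `HeartbeatPath` and `peer_presumed_failed` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatPath:
    name: str    # hypothetical label for a membership path, e.g. "lan0"
    alive: bool  # True while heartbeats are still arriving on this path


def peer_presumed_failed(paths):
    """No-arbitration rule: presume the peer node has failed only when
    every configured heartbeat path has gone silent."""
    return not any(p.alive for p in paths)


# Two of three LAN paths are down, one still carries heartbeat:
# the peer is still considered alive, so no failover takes place.
degraded = [
    HeartbeatPath("lan0", False),
    HeartbeatPath("lan1", False),
    HeartbeatPath("lan2", True),
]

# Node 1 has halted, so heartbeat stops on every path at once:
# the peer is presumed failed and its packages may fail over.
silent = [HeartbeatPath(name, False) for name in ("lan0", "lan1", "lan2")]

print(peer_presumed_failed(degraded))
print(peer_presumed_failed(silent))
```

Note that the rule sees only silence on the paths. It cannot tell a halted node from a node that is running but unreachable.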


No Arbitration—Multiple Media

It is possible to define multiple membership paths for intra-cluster communication that employ different types of communication from node to node. One path could use conventional LAN links, while a second path might employ a disk.

This model is illustrated in Figure 1-3. A LAN connection and a disk link together provide redundant membership communication.

Figure 1-3 Multiple Paths with Different Media


Note that the configuration could be expanded to include multiple disk links plus multiple LAN links, as in Figure 1-4. Such a configuration would require the loss of at least four links for the heartbeat to be lost.

Figure 1-4 Additional Multiple Paths with Different Media
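
The link arithmetic can be made concrete with a small sketch. It assumes two LAN links and two disk links, which is one reading of the configuration described above. The link names and the per-link failure probability are illustrative values, not measurements, and the probability calculation assumes the links fail independently.

```python
import math


def heartbeat_lost(links_up):
    """Heartbeat is lost only when no link of any medium is still up."""
    return not any(links_up.values())


def min_failures_to_lose_heartbeat(links_up):
    """With fully redundant links, every configured link must fail."""
    return len(links_up)


def all_links_fail_probability(per_link_failure_prob):
    """Chance that all links are down at once, ASSUMING independent failures."""
    return math.prod(per_link_failure_prob)


# Hypothetical layout: two LAN links plus two disk links.
links = {"lan0": True, "lan1": True, "disk0": True, "disk1": True}
print(min_failures_to_lose_heartbeat(links))

# Losing three of the four links still leaves heartbeat intact.
three_down = {"lan0": False, "lan1": False, "disk0": False, "disk1": True}
print(heartbeat_lost(three_down))

# Illustrative only: each link is down 1% of the time.
print(all_links_fail_probability([0.01] * len(links)))
```

The independence assumption is the weak point of this estimate. Links that share a power source, hub, or cable route can fail together, which is why the separately powered hubs mentioned earlier matter.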


No Arbitration—Risks

Intra-cluster communication is very unlikely to be lost in the above configurations, but it remains possible for the heartbeat to disappear while both nodes are still running. In this scenario, known as split brain, each node assumes that the other has failed, and the result can be data corruption.
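
A toy model makes the failure mode visible. It applies the no-arbitration rule (silent heartbeat means the peer has failed) to a two-node cluster. The function `active_owners` and the node names are hypothetical and only illustrate the reasoning.

```python
def active_owners(node_up, any_path_alive, primary="node1"):
    """Return the set of nodes running the package under the no-arbitration rule.

    node_up        -- dict mapping node name to True if that node is running
    any_path_alive -- True if at least one heartbeat path still works
    primary        -- node that normally runs the package
    """
    owners = set()
    for node, up in node_up.items():
        if not up:
            continue  # a halted node runs nothing
        if node == primary:
            owners.add(node)  # the primary keeps running its package
        elif not any_path_alive:
            # Silent heartbeat is read as "primary failed", so this node takes over.
            owners.add(node)
    return owners


# Genuine node failure: node 2 takes over, which is the intended behavior.
print(active_owners({"node1": False, "node2": True}, any_path_alive=False))

# Every heartbeat path fails while both nodes keep running: both nodes now
# run the package against the same data. This is split brain.
print(active_owners({"node1": True, "node2": True}, any_path_alive=False))
```

In both cases node 2 observes exactly the same thing, a silent heartbeat, so heartbeat alone gives it no basis for distinguishing them.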

The risk of split brain syndrome is considerably greater with extended distance clusters and disaster tolerant solutions in which nodes are located in different data centers at some distance from each other. For these types of solution, some form of arbitration is essential.

The unlikely but possible scenario of split brain can be definitively avoided with an arbitration device. In other words, the risk of data corruption can be eliminated. The HP Serviceguard family of clustering software includes this level of protection.
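
The idea of an arbitration device can be sketched as a tie-breaker that grants ownership to exactly one claimant. This is a generic illustration under simplifying assumptions (a single arbiter that every surviving node can reach, and claims handled one at a time). It is not a description of how Serviceguard implements arbitration, and the names `Arbiter` and `resolve_partition` are hypothetical.

```python
class Arbiter:
    """Hypothetical tie-breaker: the first node to claim it wins."""

    def __init__(self):
        self._holder = None

    def claim(self, node):
        """Grant the claim if the arbiter is free or already held by this node."""
        if self._holder is None:
            self._holder = node
        return self._holder == node


def resolve_partition(survivors, arbiter):
    """Each surviving node claims the arbiter; only the winner may run packages.
    A node whose claim is refused must halt instead of taking over."""
    return {node for node in survivors if arbiter.claim(node)}


# Heartbeat is lost while both nodes are still running. Without arbitration
# both would run the package. With it, exactly one node is allowed to continue.
owners = resolve_partition(["node1", "node2"], Arbiter())
print(owners)
```

The important property is that the set of owners never contains more than one node, whichever node happens to claim the arbiter first.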

© Hewlett-Packard Development Company, L.P.