Managing Serviceguard Fifteenth Edition > Chapter 8 Troubleshooting Your Cluster

Troubleshooting Approaches


The following sections offer a few suggestions for troubleshooting by reviewing the state of the running system and by examining cluster status data, log files, and configuration files. Topics include:

  • Reviewing Package IP Addresses

  • Reviewing the System Log File

  • Reviewing Configuration Files

  • Reviewing the Package Control Script

  • Using cmquerycl and cmcheckconf

  • Using cmscancl and cmviewconf

  • Reviewing the LAN Configuration

NOTE: HP recommends you use Serviceguard Manager as a convenient way to observe the status of a cluster and the properties of cluster objects: from the System Management Homepage (SMH), select the cluster you need to troubleshoot.

Reviewing Package IP Addresses

You can use the netstat -in command to examine the LAN configuration. If the command is executed on ftsys9 after ftsys10 has been halted, it shows that the package IP addresses are assigned to lan0 on ftsys9 along with the primary LAN IP address.

ftsys9> netstat -in
IPv4:
Name      Mtu  Network     Address        Ipkts   Ierrs Opkts   Oerrs Coll
ni0#      0    none        none           0       0     0       0     0
ni1*      0    none        none           0       0     0       0     0
lo0       4608 127         127.0.0.1      10114   0     10      0     0
lan0      1500 15.13.168   15.13.171.14   959269  0     33069   0     0
lan0:1    1500 15.13.168   15.13.171.23   959269  0     33069   0     0
lan0:2    1500 15.13.168   15.13.171.20   959269  0     33069   0     0
lan1*     1500 none        none           418623  0     55822   0     0
IPv6:
Name      Mtu  Address/Prefix  Ipkts   Opkts
lan1*     1500 none            0       0
lo0       4136 ::1/128         10690   10690
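
On a node that carries several packages, it can help to pull only the alias entries out of this output. The following is a minimal sketch, assuming that package IP addresses appear as colon-suffixed aliases (lan0:1, lan0:2) as in the sample above and that the address is the fourth field of the IPv4 section. The function name list_alias_ips is hypothetical.

```shell
# list_alias_ips: read `netstat -in` output on stdin and print
# "interface address" for each IP alias (names such as lan0:1).
# Assumption: aliases carry the package addresses, as in the sample
# above. The IPv6 section has a different layout and is skipped.
list_alias_ips() {
    awk '
        $1 == "IPv6:" { in_v6 = 1 }    # ignore everything from here on
        !in_v6 && $1 ~ /^lan[0-9]+:[0-9]+$/ { print $1, $4 }
    '
}
```

A typical invocation would be netstat -in | list_alias_ips, run on the node that currently holds the packages.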

Reviewing the System Log File

Messages from the Cluster Manager and Package Manager are written to the system log file, which by default is /var/adm/syslog/syslog.log. Package-related messages are also written to the package log file, which by default is located in the package directory. You can use a text editor such as vi, or the more command, to view the log files for historical information on your cluster.

It is always a good idea to review the syslog.log file on each of the nodes in the cluster when troubleshooting cluster problems.

This log provides information on the following:

  • Commands executed and their outcome.

  • Major cluster events which may, or may not, be errors.

  • Cluster status information.

NOTE: Many other products running on HP-UX in addition to Serviceguard use the syslog.log file to save messages. The HP-UX System Administrator’s Guide provides additional information on using the system log.
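
Because syslog.log is shared with other products, a filter that keeps only Serviceguard messages can make the file easier to read. The following is a minimal sketch. The tag list (cmcld, cmlvmd, cmclconfd, CM-CMD, and CM-<package>) is taken from the sample entries in this section and is an assumption about what is relevant on your system; the function name sg_messages is hypothetical.

```shell
# sg_messages: read syslog lines on stdin and keep only those whose
# tag looks like a Serviceguard component.
# Assumption: the tags below, taken from the sample entries in this
# section, cover the messages you need. Extend the list as required.
sg_messages() {
    grep -E ' (cmcld|cmlvmd|cmclconfd|CM-[A-Za-z0-9_]+)\[[0-9]+\]: '
}
```

For example, sg_messages < /var/adm/syslog/syslog.log | more pages through the Serviceguard entries on the local node. Repeat on each node in the cluster.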

Sample System Log Entries

The following entries from the file /var/adm/syslog/syslog.log show a package that failed to run because of a problem in the pkg5_run script. For details, examine the pkg5_run.log file.

Dec 14 14:33:48 star04 cmcld[2048]: Starting cluster management protocols.
Dec 14 14:33:48 star04 cmcld[2048]: Attempting to form a new cluster
Dec 14 14:33:53 star04 cmcld[2048]: 3 nodes have formed a new cluster
Dec 14 14:33:53 star04 cmcld[2048]: The new active cluster membership is:
    star04(id=1) , star05(id=2), star06(id=3)
Dec 14 17:33:53 star04 cmlvmd[2049]: Clvmd initialized successfully.
Dec 14 14:34:44 star04 CM-CMD[2054]: cmrunpkg -v pkg5
Dec 14 14:34:44 star04 cmcld[2048]: Request from node star04 to start
    package pkg5 on node star04.
Dec 14 14:34:44 star04 cmcld[2048]: Executing '/etc/cmcluster/pkg5/pkg5_run
    start' for package pkg5.
Dec 14 14:34:45 star04 LVM[2066]: vgchange -a n /dev/vg02
Dec 14 14:34:45 star04 cmcld[2048]: Package pkg5 run script exited with
    NO_RESTART.
Dec 14 14:34:45 star04 cmcld[2048]: Examine the file
    /etc/cmcluster/pkg5/pkg5_run.log for more details.
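
To find entries like these in a long log, you can search for the cmcld message that reports the run script's exit. The following is a minimal sketch, assuming the message wording matches the sample above and that each message occupies a single line in the real file (the sample is wrapped for display). The function name failed_run_scripts is hypothetical.

```shell
# failed_run_scripts: read syslog lines on stdin and print the
# timestamp and package name for each cmcld message of the form
# "Package <name> run script exited with ...".
# Assumption: the wording matches the sample entry above.
failed_run_scripts() {
    awk '
        /cmcld\[[0-9]+\]: Package .* run script exited with/ {
            for (i = 1; i < NF; i++)
                if ($i == "Package") { print $1, $2, $3, $(i + 1); break }
        }
    '
}
```

Treat each match as a pointer to the package's own log file, which holds the details of why the script exited.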

The following entries show a package starting successfully:

Dec 14 14:39:27 star04 CM-CMD[2096]: cmruncl
Dec 14 14:39:27 star04 cmcld[2098]: Starting cluster management protocols.
Dec 14 14:39:27 star04 cmcld[2098]: Attempting to form a new cluster
Dec 14 14:39:27 star04 cmclconfd[2097]: Command execution message
Dec 14 14:39:33 star04 cmcld[2098]: 3 nodes have formed a new cluster
Dec 14 14:39:33 star04 cmcld[2098]: The new active cluster membership is:
   star04(id=1), star05(id=2), star06(id=3)
Dec 14 17:39:33 star04 cmlvmd[2099]: Clvmd initialized successfully.
Dec 14 14:39:34 star04 cmcld[2098]: Executing '/etc/cmcluster/pkg4/pkg4_run
   start' for package pkg4.
Dec 14 14:39:34 star04 LVM[2107]: vgchange /dev/vg01
Dec 14 14:39:35 star04 CM-pkg4[2124]: cmmodnet -a -i 15.13.168.0 15.13.168.4
Dec 14 14:39:36 star04 CM-pkg4[2127]: cmrunserv Service4 /vg01/MyPing 127.0.0.1
    >>/dev/null
Dec 14 14:39:36 star04 cmcld[2098]: Started package pkg4 on node star04.

Reviewing Object Manager Log Files

The Serviceguard Object Manager daemon cmomd logs messages to the file /var/opt/cmom/cmomd.log. You can review these messages using the cmreadlog command, as follows:

cmreadlog /var/opt/cmom/cmomd.log

Messages from cmomd include information about the processes that request data from the Object Manager, including type of data, timestamp, etc.

Reviewing Serviceguard Manager Log Files

From the System Management Homepage (SMH), click Tools, select Serviceguard Manager, select the cluster you are interested in, and then choose View -> Operation Log.

Reviewing the System Multi-node Package Files

If you are running Veritas Cluster Volume Manager and you have problems starting the cluster, check the log file for the system multi-node package. For Cluster Volume Manager (CVM) 3.5, the file is VxVM-CVM-pkg.log. For CVM 4.1 and later, the file is SG-CFS-pkg.log.

Reviewing Configuration Files

Review the following ASCII configuration files:

  • Cluster configuration file.

  • Package configuration files.

Ensure that the files are complete and correct according to your configuration planning worksheets.

Reviewing the Package Control Script

Ensure that the package control script is found on all nodes where the package can run and that the file is identical on all nodes. Ensure that the script is executable on all nodes. Ensure that the name of the control script appears in the package configuration file, and ensure that all services named in the package configuration file also appear in the package control script.
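
One way to confirm that the script is identical everywhere is to compare checksums collected from each node. The following is a minimal sketch. The function name same_checksum is hypothetical, and the collection loop in the comment assumes that remsh access between nodes is configured and uses the pkg5 path from the sample log entries purely as an illustration.

```shell
# same_checksum: read "node checksum" pairs on stdin, one per line.
# Exit status is 0 if every node reports the same checksum as the
# first line, 1 otherwise; each node that differs is printed.
#
# Hypothetical way to collect the pairs (assumes remsh works):
#   for n in ftsys9 ftsys10; do
#       echo "$n $(remsh "$n" cksum /etc/cmcluster/pkg5/pkg5_run | awk '{print $1}')"
#   done | same_checksum
same_checksum() {
    awk '
        NR == 1   { ref = $2 }
        $2 != ref { print $1 " differs"; bad = 1 }
        END       { exit (bad ? 1 : 0) }
    '
}
```

A matching checksum does not show that the script is executable. Check that separately on each node, for example with test -x.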

Information about the starting and halting of each package is found in the package’s control script log, which records all run and halt activity of the package control script. By default, the log is found at /etc/cmcluster/<package_name>/control_script.log, but another location may have been specified in the script_log_file parameter of the package configuration file. If you have written separate run and halt scripts for a legacy package, each script will have its own log.

Using the cmcheckconf Command

You can use cmcheckconf to troubleshoot your cluster, just as you used it to verify the configuration.

The following example shows the commands used to verify the existing cluster configuration on ftsys9 and ftsys10:

 cmquerycl -v -C /etc/cmcluster/verify.ascii -n ftsys9 -n ftsys10 
 cmcheckconf -v -C /etc/cmcluster/verify.ascii 
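
If you run this pair of commands often, you can wrap them in a small function. The following is a minimal sketch, assuming that both commands return a non-zero exit status on failure. The function name verify_cluster is hypothetical; the options are the ones used in the example above.

```shell
# verify_cluster: write the current cluster configuration for the
# named nodes to an ASCII file, then check that file. Stops at the
# first command that fails.
# Usage: verify_cluster /etc/cmcluster/verify.ascii ftsys9 ftsys10
# Assumption: both commands exit non-zero when they fail.
verify_cluster() {
    ascii_file=$1
    shift
    node_args=""
    for node in "$@"; do
        node_args="$node_args -n $node"
    done
    # $node_args is deliberately unquoted so that it splits into words.
    cmquerycl -v -C "$ascii_file" $node_args &&
        cmcheckconf -v -C "$ascii_file"
}
```

Point the function at a scratch file such as verify.ascii rather than your working cluster configuration file, because cmquerycl writes to the file named with -C.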

The cmcheckconf command checks:

  • The network addresses and connections.

  • The cluster lock disk connectivity.

  • The validity of configuration parameters of the cluster and packages for:

    • The uniqueness of names.

    • The existence and permission of scripts.

It does not check:

  • The correct setup of the power circuits.

  • The correctness of the package configuration script.

Using the cmscancl Command

The command cmscancl displays information about all the nodes in a cluster in a structured report that allows you to compare such items as IP addresses or subnets, physical volume names for disks, and other node-specific items for all nodes in the cluster. cmscancl actually runs several different HP-UX commands on all nodes and gathers the output into a report on the node where you run the command.

To run the cmscancl command, the root user on the cluster nodes must have the .rhosts file configured to allow the command to complete successfully. Without that, the command can only collect information on the local node, rather than all cluster nodes.

The following are the types of configuration data that cmscancl displays for each node:

Table 8-1 Data Displayed by the cmscancl Command

Description                              Source of Data
LAN device configuration and status      lanscan command
network status and interfaces            netstat command
file systems                             mount command
LVM configuration                        /etc/lvmtab file
LVM physical volume group data           /etc/lvmpvg file
link level connectivity for all links    linkloop command
binary configuration file                cmviewconf command


Using the cmviewconf Command

cmviewconf allows you to examine the binary cluster configuration file, even when the cluster is not running. The command displays the content of this file on the node where you run the command.

Reviewing the LAN Configuration

The following networking commands can be used to diagnose problems:

  • netstat -in can be used to examine the LAN configuration. This command lists all IP addresses assigned to each LAN interface card.

  • lanscan can also be used to examine the LAN configuration. This command lists the MAC addresses and status of all LAN interface cards on the node.

  • arp -a can be used to check the arp tables.

  • landiag is useful to display, diagnose, and reset LAN card information.

  • linkloop verifies the communication between LAN cards at the MAC address level. For example, if you enter

     linkloop -i4 0x08000993AB72 

    you should see the following message:

    Link Connectivity to LAN station: 0x08000993AB72  OK

  • cmscancl can be used to verify that primary and standby LANs are on the same bridged net.

  • cmviewcl -v shows the status of primary and standby LANs.

Use these commands on all nodes.
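
When there are several remote LAN cards to test, the linkloop check can be repeated in a loop. The following is a minimal sketch, assuming that a successful test is recognizable by the text OK in the output, as in the message shown above. The wording of failure output is not relied on. The function name check_links is hypothetical.

```shell
# check_links: run linkloop for each MAC address given, using the
# first argument as the value for -i (4 in the example above), and
# print "<mac> OK" or "<mac> FAILED" for each address.
# Returns 0 only if every address answered.
# Assumption: success output contains "OK", as in the sample message.
check_links() {
    ifnum=$1
    shift
    rc=0
    for mac in "$@"; do
        if linkloop -i "$ifnum" "$mac" 2>&1 | grep -q 'OK'; then
            echo "$mac OK"
        else
            echo "$mac FAILED"
            rc=1
        fi
    done
    return $rc
}
```

For example, check_links 4 0x08000993AB72 repeats the test shown above. The MAC addresses to pass in can be read from lanscan output on the other nodes.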

© Hewlett-Packard Development Company, L.P.