Sun Microsystems

Chapter 3

Determining Cluster Validity

This chapter describes how to verify whether a group of nodes forms a cluster and whether that cluster is functioning correctly. Verify that the cluster is functioning correctly before you perform maintenance tasks or change the cluster configuration, and verify it again when the maintenance tasks are complete.

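This chapter is divided into the following sections:

  • Defining Minimum Criteria for a Cluster Running Highly Available Services

  • Verifying Services on Peer Nodes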

Defining Minimum Criteria for a Cluster Running Highly Available Services

A Foundation Services cluster can run the following highly available services: Reliable NFS and the Reliable Boot Service. For information about highly available services, see the Netra High Availability Suite Foundation Services 2.1 6/03 Overview.

A highly available cluster has the following features:

If your cluster has diskless nodes, the Reliable Boot Service must be running on the master node and the vice-master node.

Verifying Services on Peer Nodes

When performing administration tasks, regularly verify that your cluster is running correctly by performing the procedures described in this section.

To Verify That the Cluster Has a Master Node and a Vice-Master Node

  1. Log in to a master-eligible node as superuser.

  2. Type:

    # nhcmmstat -c all

    The nhcmmstat command displays information in the console window about all of the peer nodes. The information includes the role of each node. The peer nodes must include a master node and a vice-master node. For more information, see the nhcmmstat(1M) man page.

    • If there is a master node but no vice-master node, reboot the second master-eligible node:

      # init 6

      Verify that the second master-eligible node has become the vice-master node:

      # nhcmmstat -c all

      If the second master-eligible node does not become the vice-master node, see the Netra High Availability Suite Foundation Services 2.1 6/03 Troubleshooting Guide.

    • If there is neither a master node nor a vice-master node, you do not have a highly available cluster. Verify your cluster configuration by examining the nhfs.conf file and the cluster_nodes_table file for configuration errors.

      For more information, see the nhfs.conf(4) and cluster_nodes_table(4) man pages.

    • If there are two master nodes, you have a split-brain scenario. To investigate the cause of the split brain, see the Netra High Availability Suite Foundation Services 2.1 6/03 Troubleshooting Guide.
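
The outcomes of this procedure can be summarized as a small decision function. The following sketch is illustrative only and is not part of the Foundation Services. classify_roles is a hypothetical helper: you pass it the number of master nodes and the number of vice-master nodes that you counted in the nhcmmstat output. It does not parse the nhcmmstat output, and any combination that this procedure does not describe is reported as unexpected.

```shell
#!/bin/sh
# classify_roles MASTERS VICEMASTERS
# Hypothetical helper, not a Foundation Services command.
# MASTERS and VICEMASTERS are counts that you read from the
# "nhcmmstat -c all" output yourself; nothing is parsed here.
classify_roles() {
    masters=$1
    vicemasters=$2
    if [ "$masters" -ge 2 ]; then
        echo "SPLIT BRAIN: see the Troubleshooting Guide"
    elif [ "$masters" -eq 1 ] && [ "$vicemasters" -eq 1 ]; then
        echo "OK: master node and vice-master node present"
    elif [ "$masters" -eq 1 ] && [ "$vicemasters" -eq 0 ]; then
        echo "NO VICE-MASTER: reboot the second master-eligible node"
    elif [ "$masters" -eq 0 ] && [ "$vicemasters" -eq 0 ]; then
        echo "NOT HIGHLY AVAILABLE: check nhfs.conf and cluster_nodes_table"
    else
        # Combinations that the procedure does not cover.
        echo "UNEXPECTED: see the Troubleshooting Guide"
    fi
}
```

For example, after counting one master node and no vice-master node, classify_roles 1 0 reports that the second master-eligible node should be rebooted.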

To Verify That an nhcmmd Daemon Is Running on Each Peer Node

  1. Log in to a peer node.

  2. Verify that an nhcmmd daemon is running on the node:

    # pgrep -x nhcmmd

    • If a process identifier is returned, the daemon is running.

    • If a process identifier is not returned, the daemon is not running.

    To investigate the cause of daemon failure, see the Netra High Availability Suite Foundation Services 2.1 6/03 Troubleshooting Guide.

  3. Repeat Step 1 and Step 2 on each peer node.
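
If you repeat this check often, you can wrap the pgrep test in a function that prints an explicit result instead of a bare process identifier. The following is a minimal sketch, not a supplied tool. daemon_status and the PGREP variable are hypothetical names introduced for illustration. PGREP exists only so that the command can be substituted when you try the function outside a cluster.

```shell
#!/bin/sh
# PGREP is a hypothetical override hook; by default the real pgrep is used.
PGREP=${PGREP:-pgrep}

# daemon_status NAME
# Prints "NAME: running" if a process with exactly that name exists
# on the local node, otherwise "NAME: not running".
daemon_status() {
    if "$PGREP" -x "$1" >/dev/null 2>&1; then
        echo "$1: running"
    else
        echo "$1: not running"
    fi
}
```

Run daemon_status nhcmmd on each peer node in turn. The function reports only on the node where it runs.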

To Verify That the Cluster Has a Redundant Ethernet Network

  1. Log in to a peer node as superuser.

  2. Verify that the peer nodes are communicating through a network:

    # nhadm check starting

    If any peer node is not accessible from any other peer node, the nhadm command displays an error message in the console window.

  3. Search the system log files for this message:

    [ifcheck] Interface interface-name used for cgtp has failed

    This message is created by the nhcmmd daemon if the peer nodes are not communicating through a redundant network.

    If the redundant network fails, examine the card, cable, and route table associated with the link. Investigate the system log files for relevant error messages.
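
The log search in Step 3 can be done with grep. The following sketch assumes only the message text shown in Step 3. find_cgtp_failures is a hypothetical helper, and the log files to search are passed as arguments because their location depends on how system logging is configured on your nodes.

```shell
#!/bin/sh
# find_cgtp_failures FILE...
# Prints every line in the given log files that contains the ifcheck
# failure message, whatever the interface name is.
# Exit status is 0 if at least one line matched, 1 if none matched.
find_cgtp_failures() {
    grep -E '\[ifcheck\] Interface .+ used for cgtp has failed' "$@"
}
```

An exit status of 1 means that the message was not found in the files that you searched. It does not prove that the redundant network is healthy, so still check the result of the nhadm command in Step 2.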

To Verify That the Master Node and Vice-Master Node Are Synchronized

  1. Log in to the master node as superuser.

  2. Test whether the vice-master node is synchronized with the master node:

    # /usr/opt/SUNWesm/sbin/scmadm -S -M

    • If the scmadm command reaches the replicating state, the vice-master node is synchronized with the master node.

    • If the scmadm command does not reach the replicating state, the vice-master node is not synchronized with the master node.

  3. If the master node and vice-master node are not synchronized, check whether the RNFS.EnableSync parameter is set to FALSE in the nhfs.conf file.

    If the RNFS.EnableSync parameter is set to FALSE and if you want to trigger synchronization:

    1. Trigger synchronization:

      # nhenablesync

      For information about the nhenablesync command, see the nhenablesync(1M) man page.

    2. Repeat Step 2.

    If the RNFS.EnableSync parameter is not set to FALSE but the vice-master node remains unsynchronized, see the Netra High Availability Suite Foundation Services 2.1 6/03 Troubleshooting Guide.

    For more information about the scmadm command, see the scmadm(1M) man page. For more information about the RNFS.EnableSync parameter, see the nhfs.conf(4) man page.
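
The parameter check in Step 3 can be scripted. The following is a sketch under stated assumptions, not a supplied tool. It assumes that nhfs.conf holds one name=value setting per line, that lines starting with # are comments, and that the value is compared without regard to case. Confirm these assumptions against the nhfs.conf(4) man page. get_enable_sync and sync_disabled are hypothetical helper names, and the path to nhfs.conf is passed as an argument.

```shell
#!/bin/sh
# get_enable_sync FILE
# Prints the value of RNFS.EnableSync in FILE, or nothing if it is not set.
# ASSUMPTION: one "name=value" setting per line, "#" starts a comment.
# If the parameter appears more than once, the last occurrence is printed.
get_enable_sync() {
    sed -n 's/^[[:space:]]*RNFS\.EnableSync[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*$/\1/p' "$1" |
        tail -n 1
}

# sync_disabled FILE
# Exit status 0 if RNFS.EnableSync is set to FALSE in FILE, 1 otherwise.
# ASSUMPTION: the comparison ignores case.
sync_disabled() {
    value=$(get_enable_sync "$1" | tr '[:lower:]' '[:upper:]')
    [ "$value" = "FALSE" ]
}
```

If sync_disabled succeeds for your nhfs.conf file and you want to trigger synchronization, run nhenablesync as described in Step 3. Otherwise, follow the Troubleshooting Guide reference above.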
