The Vice-Master Node Remains Unsynchronized After Startup

After the master node and vice-master node start, the data on the master node is copied to the vice-master node, which synchronizes the two nodes. If the master node and vice-master node are not synchronized after startup, perform the following procedure.
# nhcmmstat -c all
The nhcmmstat tool displays information about the roles of the peer nodes. The peer nodes should include a master node and a vice-master node. For more information about nhcmmstat, see the nhcmmstat(1M) man page.
If your cluster has a valid master node and vice-master node, go to Step 2.
If your cluster has no master node or vice-master node, you do not have a cluster. Verify your cluster configuration by examining the nhfs.conf and cluster_nodes_table files for configuration errors.
If your cluster has a master node but no vice-master node, reboot the master-eligible node that is not master:
# init 6
Verify that the second master-eligible node has become the vice-master node:
# nhcmmstat -c all
Confirm that the master node and vice-master node are unsynchronized:
# /usr/opt/SUNWesm/sbin/scmadm -S -M
If the scmadm tool does not reach the replicating state, the master node and vice-master node are unsynchronized. For more information, see the nhscmadm(1M) man page.
Determine whether an nhcrfsd daemon is running on each master-eligible node:
# pgrep -x nhcrfsd
If a process identifier is returned, the nhcrfsd daemon is running. Go to Step 5.
If a process identifier is not returned, the nhcrfsd daemon is not running. Perform the procedure in To Recover From Daemon Failure.
On the master node and vice-master node, verify that the mount point is set correctly.
The mount point is set by the RNFS.Share property in the /etc/opt/SUNWcgha/nhfs.conf file. If the mount point is set correctly, the usr, root, and swap parameters in the RNFS.Share property have the following access permissions, respectively: ro, rw, and rw.
For each node, confirm that the IP address of the cgtp0 interface is specified in the /etc/hosts file.
If you cannot resolve this problem, contact your customer support center.
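The daemon check and the /etc/hosts check in this procedure can be scripted on each node. The following is a minimal sketch, not part of the product: the helper names daemon_running and hosts_has_ip are hypothetical, and the script assumes a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk rather than the older /usr/bin/awk). It does not look up the cgtp0 address for you; you supply the address that you expect to find.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical, not product commands.

# Return 0 if a process with exactly this name is running on the local node.
# The procedure uses "pgrep -x nhcrfsd" for the same purpose.
daemon_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# Return 0 if the hosts file ($2, default /etc/hosts) contains an
# uncommented entry whose first field is the address given in $1.
hosts_has_ip() {
    awk -v ip="$1" '
        { sub(/#.*/, "") }            # ignore comments
        $1 == ip { found = 1 }        # first field is the IP address
        END { exit found ? 0 : 1 }
    ' "${2:-/etc/hosts}"
}
```

For example, `daemon_running nhcrfsd || echo "nhcrfsd is not running"` reports a missing daemon, and `hosts_has_ip ADDRESS`, where ADDRESS is the cgtp0 address of the node, returns a nonzero status if the entry is absent from /etc/hosts.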
When a monitored daemon fails, the Daemon Monitor triggers a recovery response. The recovery response is often to restart the failed daemon. If the daemon fails to restart correctly, the Daemon Monitor reboots the node. The failure of a monitored daemon is the most common cause of a node reboot.
If the system recovers correctly, the daemon core and error message might be the only evidence of the failure. You must take the failure seriously even though the system has recovered.
For a list of recovery responses made by the Daemon Monitor, see the nhpmd(1M) man page.
For information about how to recover from the failure of a monitored daemon, see To Recover From Daemon Failure.
Table 4-2 summarizes some causes of daemon failure during the startup of master-eligible nodes.
Table 4-2 Causes of Daemon Failure at Startup of Master-Eligible Nodes
| Failed Daemon | Possible Causes at Startup |
|---|---|
| nhcrfsd | One of the following files on the master node contains errors: /etc/vfstab, cluster_nodes_table, or nhfs.conf. |
| | The local file system of the failing node is mounted or unmounted incorrectly. |
| | The network interface of the failing node is incorrectly configured. |
| nhcmmd | One of the following files on the master node contains errors: cluster_nodes_table or nhfs.conf. |
| | The cgtp0 interface of the failing node is incorrectly configured. |
| | The cgtp0 interface of the failing node could not be initialized. |
| | The failing node cannot connect to the nhprobed daemon. |
| | The failing node cannot access the /etc/services file. |
| | The failing node cannot write to the cluster_nodes_table file when it is to be elected as master node. |
| nhprobed | The failing node cannot obtain information about the network interfaces. |
| | The failing node cannot access the /etc/services file. |
| | The failing node cannot create the required threads, sockets, or pipe. |
| nhwdtd | The failing node does not have a required platform-specific plugin for the nhwdtd daemon. |
| | The failing node does not have a platform-specific package for hardware watchdog support. |
| | Platform-specific hardware watchdog does not work on the failing node. |
| in.dhcpd | A datastore location does not exist. |
| | A datastore location is not mounted on the failing node. |
| | The failing node cannot find the dhcptab file in the datastore. |
| nhnsmd | The nhfs.conf file on the master node contains errors. |
The following procedure describes what to do if the Node Management Agent (NMA) exits during the startup of the master-eligible nodes.
Confirm that the Java Dynamic Management Kit connector has an allocated server port number.
If the server port number is already allocated, go to Step 2.
If the server port number is not already allocated, do the following:
In the nma.properties file on each peer node, allocate a port number for the Java Dynamic Management Kit connector.
Ensure that the port number is unique.
# /etc/opt/SUNWcgha/init.d/nma stop
If the NMA fails to restart, see NMA Not Restarted After Failure.
Examine the system log files for the following messages:
If the log files contain the following message, confirm that the /etc/services file contains an entry for the cmm-api.
CMM statistics (JNI). Unable to access CMM statistics (can't access cmm-api service port number).
If the log files contain the following message, correct the /etc/netconfig configuration.
CMM statistics (JNI). Unable to access CMM statistics (can't access tcp netconfig).
If the log files contain the following message, an RPC error occurred during an access to the CMM statistics.
CMM statistics (JNI) Failed to get stats from CMM :[rpc return code]
Use the RPC return code to diagnose and correct the problem.
If the log files contain the following message, a call to the CMM succeeded from an RPC point of view. However, the CMM internals were unable to return valid statistics.
CMM statistics (JNI) Failed to get stats from CMM : [CMM status]
Check the status of the nhcmmd daemon and its processes.
If the log files contain the following message, RPC failed while attempting to access CMM statistics.
CMM statistics (JNI) rpc call failed
Correct the RPC configuration.
If the log files contain the following message, CGTP is unavailable:
KSTAT (JNI). Unable to launch CGTP. CGTP statistics not available.
Confirm that the redundant network is available and that the network configuration is correct.
Restart the NMA on all nodes:
# /etc/opt/SUNWcgha/init.d/nma stop
If the NMA fails to restart, see NMA Not Restarted After Failure.
If you cannot resolve this problem, contact your customer support center.
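The log examination in this procedure can be partly automated. The following is a minimal sketch under stated assumptions: nma_hints and service_defined are hypothetical helper names, the message fragments are taken from the messages listed in the procedure, and a POSIX awk is assumed (on Solaris, nawk or /usr/xpg4/bin/awk). The two "Failed to get stats from CMM" variants differ only in the trailing code, so the sketch gives one combined hint for both.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not product commands.

# Read log text on standard input and print the corrective action that the
# procedure gives for each recognized NMA message.
nma_hints() {
    while IFS= read -r line; do
        case $line in
            *"can't access cmm-api service port number"*)
                echo "Confirm that /etc/services contains an entry for cmm-api." ;;
            *"can't access tcp netconfig"*)
                echo "Correct the /etc/netconfig configuration." ;;
            *"Failed to get stats from CMM"*)
                echo "Use the reported RPC return code or CMM status; check the nhcmmd daemon." ;;
            *"CMM statistics (JNI) rpc call failed"*)
                echo "Correct the RPC configuration." ;;
            *"Unable to launch CGTP"*)
                echo "Confirm the redundant network and the network configuration." ;;
        esac
    done
}

# Return 0 if the services file ($2, default /etc/services) has an
# uncommented entry whose service name is $1.
service_defined() {
    awk -v name="$1" '
        { sub(/#.*/, "") }            # ignore comments
        $1 == name { found = 1 }      # first field is the service name
        END { exit found ? 0 : 1 }
    ' "${2:-/etc/services}"
}
```

A possible invocation is `nma_hints < /var/adm/messages`, where the log path is an assumption about where your system log is written. `service_defined cmm-api` returns a nonzero status if the cmm-api entry is missing from /etc/services.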