    # halt
    ok>
Alternatively, type the following command:
    # Control-]
    telnet> send brk
    Type 'go' to resume
    ok>
The ok prompt is returned.
Boot in single user mode:
    ok> boot -s
    #
Search the messages displayed on the console of the failing node for an indication of the problem.
The error messages should indicate the cause of the problem. Use the error messages to identify the failing daemon or failing service. For a summary of error messages and their possible causes, see Appendix A, Error Messages.
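A quick way to collect the relevant console messages is to scan the system log for each Foundation Services daemon named in Table 5-1. This is a minimal sketch, assuming the standard Solaris log file `/var/adm/messages`; the daemon names come from the table later in this section.

```shell
# Scan the system log for recent messages from each monitored daemon.
# /var/adm/messages is the standard Solaris system log; adjust LOG if
# your installation logs elsewhere.
LOG=${LOG:-/var/adm/messages}

for daemon in nhcmmd nhprobed nhwdtd; do
    echo "--- messages mentioning $daemon ---"
    grep "$daemon" "$LOG" 2>/dev/null | tail -5
done
```

Compare any matches against the error-message summaries in Appendix A, Error Messages.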
If the error is a configuration error, the following message is displayed:
    Error in configuration
The text following the message should indicate the type of configuration error. Verify that the configuration of the nhfs.conf file for the node is consistent with the information in the nhfs.conf(4) man page.
Confirm that the /etc/opt/SUNWcgha/not_configured file does not exist on the failing node.
If the file does not exist, go to Step 4.
If the file exists, delete it and reboot the node:
    # init 6
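The check and cleanup above can be sketched as a single guarded step. The marker path `/etc/opt/SUNWcgha/not_configured` is taken from the text; running `init 6` requires root privileges on the node.

```shell
# If the not_configured marker exists, remove it and reboot the node;
# otherwise continue with the next step of the procedure.
NOT_CONFIGURED=/etc/opt/SUNWcgha/not_configured

if [ -f "$NOT_CONFIGURED" ]; then
    echo "removing $NOT_CONFIGURED and rebooting"
    rm "$NOT_CONFIGURED"
    init 6          # reboot the node (requires root)
else
    echo "not_configured is absent -- go to Step 4"
fi
```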
Confirm that the cluster_nodes_table file on the master node contains an entry for the failing node.
If the file contains an entry for the failing node, go to Step 5.
If the file does not contain an entry for the failing node, verify the installation and configuration.
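This check can be run on the master node as follows. The node name is a placeholder, and the location of cluster_nodes_table under `/etc/opt/SUNWcgha` is an assumption; adjust both for your installation.

```shell
# Check whether the failing node has an entry in cluster_nodes_table
# on the master node. NODE_NAME and TABLE are illustrative defaults.
NODE_NAME=${NODE_NAME:-netra-diskless-1}
TABLE=${TABLE:-/etc/opt/SUNWcgha/cluster_nodes_table}   # assumed path

if grep -w "$NODE_NAME" "$TABLE" >/dev/null 2>&1; then
    echo "$NODE_NAME found in $TABLE -- go to Step 5"
else
    echo "$NODE_NAME not in $TABLE -- verify installation and configuration"
fi
```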
If you cannot resolve this problem, contact your customer support center.
If a failover occurs during the boot or reboot of a diskless node, the DHCP files can be corrupted. If this problem occurs, see A Diskless Node Does Not Reboot After Failover.
Dataless nodes boot from a local disk and run customer applications locally. Dataless nodes access the Foundation Services through the cluster network and send data to the master node.
This section describes what to do if the Solaris operating system or the Foundation Services do not start on a dataless node. Only use this section when you have a running cluster that contains a master node and a vice-master node.
If the Solaris operating system does not start on a dataless node, use the error messages and the Solaris documentation set to resolve the problem. If the Foundation Services do not start on a dataless node, perform the following procedure.
Stop the continuous reboot cycle if such a cycle is running.
For information, see Step 1 of To Investigate Why the Foundation Services Do Not Start on a Diskless Node.
Search the messages on the console of the failing node for an indication of the problem.
For information, see Step 2 of To Investigate Why the Foundation Services Do Not Start on a Diskless Node.
Confirm that the /etc/opt/SUNWcgha/not_configured file does not exist on the failing node.
For information, see Step 3 of To Investigate Why the Foundation Services Do Not Start on a Diskless Node.
Confirm that the cluster_nodes_table file on the master node contains an entry for the failing node.
For information, see Step 4 of To Investigate Why the Foundation Services Do Not Start on a Diskless Node.
If you cannot resolve this problem, contact your customer support center.
When a monitored daemon fails, the Daemon Monitor triggers a recovery response. The recovery response is often to restart the failed daemon. If the daemon fails to restart correctly, the Daemon Monitor reboots the node. The failure of a monitored daemon is the most common cause of a node reboot.
If the system recovers correctly, the daemon core file and error message might be the only evidence of the failure. Take the failure seriously even though the system has recovered.
For a list of recovery responses made by the Daemon Monitor, see the nhpmd(1M) man page.
For information about how to recover from the failure of a monitored daemon, see To Recover From Daemon Failure.
Table 5-1 summarizes some causes of daemon failure during the startup of diskless nodes and dataless nodes.
Table 5-1 Causes of Daemon Failure on Diskless Nodes and Dataless Nodes at Startup
Failed Daemon | Possible Causes at Startup
---|---
nhcmmd | One of the following files on the master node contains errors: cluster_nodes_table or nhfs.conf.
&nbsp; | The cgtp0 interface of the failing node is configured incorrectly.
&nbsp; | The cgtp0 interface of the failing node could not be initialized.
&nbsp; | The failing node cannot connect to the nhprobed daemon.
&nbsp; | The failing node cannot access the /etc/services file.
&nbsp; | The failing node exceeded the time-out value.
nhprobed | The failing node cannot obtain information about the network interfaces.
&nbsp; | The failing node cannot access the /etc/services file.
&nbsp; | The failing node cannot create the required threads, sockets, or pipe.
nhwdtd | The failing node does not have a required platform-specific plugin for the nhwdtd daemon.
&nbsp; | The failing node does not have a platform-specific package for hardware watchdog support.
&nbsp; | Platform-specific hardware watchdog support does not work on the failing node.