Chapter 7
Recovering From Node Reboot at Runtime

For information about the causes of node reboot at runtime, see A Monitored Daemon Fails Causing a Node to Reboot at Runtime.

A Monitored Daemon Fails Causing a Node to Reboot at Runtime

When a monitored daemon fails, the Daemon Monitor triggers a recovery response. The recovery response is often to restart the failed daemon. If the daemon fails to restart correctly, the Daemon Monitor reboots the node. The failure of a monitored daemon is the most common cause of a node reboot.

If the system recovers correctly, the daemon core and error message might be the only evidence of the failure. You must take the failure seriously even though the system has recovered.

For a list of recovery responses made by the Daemon Monitor, see the nhpmd(1M) man page. For a summary of the causes of daemon failure during startup, see A Monitored Daemon Fails Causing a Master-Eligible Node to Reboot at Startup and A Monitored Daemon Fails Causing a Diskless Node or Dataless Node to Reboot at Startup.

Table 7-1 and Table 7-2 summarize the events that can cause a monitored daemon to fail at runtime. To recover from daemon failure, perform the procedure in To Recover From Daemon Failure.

Table 7-1 Possible Causes of Daemon Failure at Runtime
Table 7-2 Causes of Daemon Failure on Master-Eligible Nodes During Failover or Switchover
[ID 615790 local0.notice] "rpc" Failed to stay up.
For information about which nametag launches which daemon, see the nhpmd(1M) man page.
Identify the cause of the daemon failure.
Use the information obtained in Step 1, Step 2, Table 7-1, and Table 7-2.
Fix the underlying problem, if necessary.
Confirm that the recovery procedure has been carried out by searching the system log files for local0 information.
If your system log file is not configured for local0 information, reconfigure it.
For information, see the Netra High Availability Suite Foundation Services 2.1 6/03 Cluster Administration Guide.
If local0 information is logged to a file, search the file for the string "nhpmd".
Lines containing the string "nhpmd" describe the recovery response performed by the Daemon Monitor.
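
The log search in the last step can be scripted. The following is a minimal sketch, not part of the Foundation Services software. It assumes that local0 messages are logged to a plain-text file whose path you supply, and that failure messages follow the form of the sample message shown earlier in this procedure, with the nametag in double quotes before "Failed to stay up". The function names nhpmd_lines, failed_nametag, and scan_log are hypothetical names chosen for illustration.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Matches the quoted nametag in a message such as:
#   [ID 615790 local0.notice] "rpc" Failed to stay up.
# Assumption: other failure messages follow the same quoted-nametag form.
_FAILED_RE = re.compile(r'"([^"]+)"\s+Failed to stay up')


def nhpmd_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines that contain the string "nhpmd", without newlines."""
    return [line.rstrip("\n") for line in lines if "nhpmd" in line]


def failed_nametag(line: str) -> Optional[str]:
    """Return the nametag from a 'Failed to stay up' message, or None."""
    match = _FAILED_RE.search(line)
    return match.group(1) if match else None


def scan_log(path: str) -> List[Tuple[str, Optional[str]]]:
    """Read the log file at 'path' (supplied by the caller) and pair each
    nhpmd line with the nametag it reports as failed, if any."""
    with open(path, errors="replace") as log:
        return [(line, failed_nametag(line)) for line in nhpmd_lines(log)]
```

For example, calling scan_log with the path of the file that receives local0 messages on your system returns each "nhpmd" line together with the failed nametag, which you can then look up in the nhpmd(1M) man page to find the daemon it launches.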