Sun Microsystems

Chapter 4

Recovering From Startup Problems on Master-Eligible Nodes

If you have installed the Foundation Services, but are unable to start up the master-eligible nodes, see the following sections:

A Master-Eligible Node Does Not Boot

If a master-eligible node does not boot after installation, the cause could be one of the following problems:

  • Incorrect hardware configuration

  • Incorrect Solaris operating system configuration

  • Incorrect Foundation Services configuration

If the Solaris operating system does not start on a master-eligible node, use the error messages and the Solaris documentation set to resolve the problem. If the Foundation Services do not start on a master-eligible node, perform the following procedure.

Procedure: To Investigate Why the Foundation Services Do Not Start on a Master-Eligible Node

  1. Stop the continuous reboot cycle if such a cycle is running:

    1. Access the console of the failing node.

    2. Type the following command:

      # halt
      ok>

      Alternatively, if you are connected to the console through telnet, press Control-] to reach the telnet prompt, and then send a break:

      # Control-]
      telnet> send brk
      Type  'go' to resume
      ok>

      The ok prompt is returned.

    3. Boot the node in single-user mode to get a superuser shell:

      ok> boot -s
      #

  2. Search the error messages on the console of the failing node for an indication of the problem.

    The error messages should indicate the cause of the error. For a summary of error messages and their possible causes, see Appendix A, Error Messages.

    If the error is a configuration error, the following message is displayed:

    Error in configuration

    The text following the message should indicate the type of configuration error. Verify that the nhfs.conf file for the node is consistent with the information in the nhfs.conf(4) man page.

  3. Confirm that the /etc/opt/SUNWcgha/not_configured file does not exist on the failing node.

    • If the file does not exist, go to Step 4.

    • If the file exists, delete it and reboot the node:

      # init 6

  4. If the Watchdog Timer is enabled, confirm that it is configured correctly.

    1. On the console of the failing node, get the ok prompt, as described in Step 1.

    2. Confirm that the nhfs.conf file contains the parameter WATCHDOG.NhasWatchdog=true.

      If WATCHDOG.NhasWatchdog=false, go to Step 5.

    3. Confirm that you have installed the correct hardware watchdog packages on the node.

      For information about the required packages and patches, see the Netra High Availability Suite Foundation Services 2.1 6/03 README.

    4. Confirm that the value of the WATCHDOG.OsTimeout parameter in the nhfs.conf file is not too low.

      If you suspect that the WATCHDOG.OsTimeout parameter is too low, increase the value of the parameter.

    5. Reboot the node:

      ok> boot
      #

  5. If your hardware includes an OpenBoot™ PROM diag-switch, confirm that it is set to false:

    1. On the console of the failing node, get the ok prompt, as described in Step 1.

    2. Display the current value of the diag-switch? variable:

      ok> printenv diag-switch?

    • If the diag-switch is set to false, go to Step 6.

    • If the diag-switch is set to true, set it to false and reboot the node:

      ok> setenv diag-switch? false
      ok> boot
      #

  6. If you cannot resolve this problem, contact your customer support center.
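
The file checks in Step 3 and Step 4 can be scripted from the single-user shell reached in Step 1. The following is a minimal sketch, not part of the Foundation Services: the function names get_param and check_node are hypothetical, the location of nhfs.conf is an assumption (the procedure does not state it), and the parser assumes plain name=value lines with no surrounding spaces, as in WATCHDOG.NhasWatchdog=true. It requires a POSIX-compatible shell.

```shell
#!/bin/sh
# Sketch of the file checks in Step 3 and Step 4. Run as superuser.
# Both paths can be overridden from the environment.
NOT_CONFIGURED=${NOT_CONFIGURED:-/etc/opt/SUNWcgha/not_configured}
# Assumption: nhfs.conf lives next to not_configured. Adjust if it does not.
NHFS_CONF=${NHFS_CONF:-/etc/opt/SUNWcgha/nhfs.conf}

# get_param FILE NAME
# Print the value of the last "NAME=value" line in FILE.
# Prints nothing if FILE is unreadable or NAME is absent.
get_param() {
    [ -r "$1" ] || return 0
    while IFS='=' read -r key value; do
        case $key in
            "$2") printf '%s\n' "$value" ;;
        esac
    done < "$1" | tail -1
}

# check_node
# Report on Step 3 and Step 4. Returns 1 if the not_configured
# file is present, 0 otherwise.
check_node() {
    status=0
    if [ -f "$NOT_CONFIGURED" ]; then
        echo "Step 3: $NOT_CONFIGURED exists; delete it and reboot with init 6"
        status=1
    else
        echo "Step 3: $NOT_CONFIGURED does not exist"
    fi

    watchdog=$(get_param "$NHFS_CONF" WATCHDOG.NhasWatchdog)
    if [ "$watchdog" = "true" ]; then
        timeout=$(get_param "$NHFS_CONF" WATCHDOG.OsTimeout)
        echo "Step 4: watchdog enabled, WATCHDOG.OsTimeout=${timeout:-unset}"
    else
        echo "Step 4: WATCHDOG.NhasWatchdog=${watchdog:-unset}; go to Step 5"
    fi
    return $status
}
```

Calling check_node after loading these definitions prints one line for each step. The script only reports what it finds. Deleting the not_configured file, changing WATCHDOG.OsTimeout, and rebooting are left to the operator, as in the procedure.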
