Sun Microsystems
Chapter 5

Recovering From Startup Problems on Diskless Nodes and Dataless Nodes

If you have started the master-eligible nodes but are unable to start up the diskless nodes or dataless nodes, see the following sections:

A Diskless Node Does Not Boot at Startup

Diskless nodes boot using the Solaris Dynamic Host Configuration Protocol (DHCP) servers provided by the Reliable Boot Service. To boot diskless nodes you must have a cluster containing at least one master-eligible node running the Foundation Services. When you are booting diskless nodes, you can use the snoop utility to see the parameters transmitted by the DHCP server to the diskless node.

This section describes what to do when the Solaris operating system or Foundation Services do not start on a diskless node.

To Investigate Why the Solaris Operating System Does Not Start on a Diskless Node

Use this procedure when the Solaris operating system does not start on a diskless node.

  1. Confirm that the spanning tree protocol is disabled.

    For Cisco 29x0 switches, do the following:

    1. Telnet to the Ethernet switch.

    2. Type the following command:

      # enable
      Password <user-password>

    3. Type the following command:

      # show run

    4. Search the output on the console for the following line:

      no spanning-tree vlan <vlanid>

      If the display contains this line, the spanning tree is disabled.

    • If the spanning tree is disabled, go to Step 2.

    • If the spanning tree is not disabled, disable it.

      For information, see the Netra High Availability Suite Foundation Services 2.1 6/03 Hardware Guide.
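The search in Step 1 can also be run against a saved copy of the `show run` output. The following is a minimal sketch, not part of the Foundation Services: the function name `spanning_tree_disabled` and the idea of saving the output to a file are assumptions made for illustration. It matches only the exact line shown above and does not handle other forms, such as VLAN ranges.

```shell
# Sketch (hypothetical helper): report whether a saved copy of the switch's
# "show run" output contains the line "no spanning-tree vlan <vlanid>".
# Usage: spanning_tree_disabled FILE VLANID
# Exit status is 0 if the line is present, non-zero otherwise.
spanning_tree_disabled() {
    config_file=$1
    vlan_id=$2
    # Strip carriage returns left by a telnet capture, then match the whole
    # line so that VLAN 1 does not match VLAN 10.
    tr -d '\r' < "$config_file" |
        grep "^[[:space:]]*no spanning-tree vlan ${vlan_id}[[:space:]]*$" > /dev/null
}
```

For example, `spanning_tree_disabled show_run.txt 10 && echo disabled` prints `disabled` only when the line for VLAN 10 is found. The file name `show_run.txt` is a placeholder.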

  2. Confirm that the DHCP configuration is correct.

    1. Access the consoles of the master node and vice-master node.

    2. On each console, confirm that the /etc/inet/dhcpsvc.conf file exists and has the correct attributes:

      # nhadm check configuration
      DAEMON_ENABLED=TRUE
      RUN_MODE=server
      RESOURCE=SUNWnhrbs
      PATH=/SUNWcgha/remote/var/dhcp
      CONVER=1
      INTERFACE=nic0,nic1

      The RESOURCE parameter must be set to RESOURCE=SUNWnhrbs. By default, this parameter is set to RESOURCE=SUNWfiles.

      The PATH parameter must point to a directory in a replicated file system. By default, the directory is PATH=/SUNWcgha/remote/var/dhcp.

      If the file does not have the correct attributes, do the following:

      • Edit or create the /etc/inet/dhcpsvc.conf file, setting the attributes as stated previously.

      • Stop and restart the DHCP daemon:

        # /etc/rc3.d/HA.S34dhcp stop
        # /etc/rc3.d/HA.S34dhcp start
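The attribute comparison in this substep can be scripted. The sketch below is an illustration only: the function name `check_dhcpsvc_conf` is hypothetical, and the expected values are taken from the listing above. The INTERFACE attribute is not checked because it is assumed to be site-specific.

```shell
# Sketch (hypothetical helper): compare a dhcpsvc.conf file with the
# attribute values listed in this procedure.
# Usage: check_dhcpsvc_conf FILE [DHCP_PATH]
# Prints one line per attribute that is missing or different.
# Exit status is 0 only if every checked attribute matches.
check_dhcpsvc_conf() {
    conf_file=$1
    # Default container directory from this procedure; override if yours differs.
    dhcp_path=${2:-/SUNWcgha/remote/var/dhcp}
    status=0
    for expected in \
        "DAEMON_ENABLED=TRUE" \
        "RUN_MODE=server" \
        "RESOURCE=SUNWnhrbs" \
        "PATH=${dhcp_path}" \
        "CONVER=1"
    do
        # -F: fixed string, -x: the whole line must match.
        if ! grep -Fx "$expected" "$conf_file" > /dev/null 2>&1; then
            echo "missing or different: $expected"
            status=1
        fi
    done
    return $status
}
```

Run it on each of the master node and vice-master node, for example `check_dhcpsvc_conf /etc/inet/dhcpsvc.conf`. A file that still has the default `RESOURCE=SUNWfiles` is reported as `missing or different: RESOURCE=SUNWnhrbs`.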

    3. If you are installing your cluster manually, confirm that the path contains DHCP container files with the following names:

      SUNWnhrbs1_10_x_1_0, SUNWnhrbs1_10_x_2_0, and SUNWnhrbs1_dhcptab, where x is the domain identity.

      You do not need to perform this step if you are installing your cluster using the nhinstall tool.

      If the DHCP container files do not have the specified names, regenerate them, taking care to use the correct values for subnet1 and subnet2. For information, see "To Configure DHCP for a Diskless Node" in the Netra High Availability Suite Foundation Services 2.1 6/03 Custom Installation Guide.
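The container file names follow directly from the domain identity, so the check can be sketched as a small script. The function names `expected_dhcp_containers` and `missing_dhcp_containers` are hypothetical, and the naming pattern is the one given in this substep.

```shell
# Sketch (hypothetical helpers): list the DHCP container file names expected
# for a domain identity, and report which ones are absent from a directory.

# Usage: expected_dhcp_containers DOMAIN_ID
expected_dhcp_containers() {
    domain_id=$1
    echo "SUNWnhrbs1_10_${domain_id}_1_0"
    echo "SUNWnhrbs1_10_${domain_id}_2_0"
    echo "SUNWnhrbs1_dhcptab"
}

# Usage: missing_dhcp_containers DIR DOMAIN_ID
# Prints the name of each expected container file that is not in DIR.
missing_dhcp_containers() {
    dir=$1
    expected_dhcp_containers "$2" | while read -r name; do
        [ -f "$dir/$name" ] || echo "$name"
    done
}
```

For example, `missing_dhcp_containers /SUNWcgha/remote/var/dhcp 250` prints nothing when all three files exist for a domain identity of 250. The value 250 is only a placeholder.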

    4. If you are using a static address assignment, confirm that the MAC address or client ID of the diskless node is configured correctly.

      Refer to the DHCP table on the master node.

  3. On the console of the diskless node, confirm that the following OpenBoot PROM parameter is set:

    boot-device net:dhcp,,,,,5 net2:dhcp,,,,,5

  4. Confirm that the vendor type of the diskless node is recognized by the master node.

    1. Access the console of the master node.

    2. Type the following command:

      # snoop -v -d nic0 ether mac-adr-of-dl-node | grep -i dhcp

      or, at the OpenBoot PROM ok prompt on the console of the diskless node:

      ok> dev
      ok> .properties
      => property

      The vendor type of the diskless node is returned as a string.

    3. Search for the same string in the DHCP table on the master node.

  5. On the console of the master node, confirm that the directory /tftpboot is present.

    If this directory is not present, the following error messages are written to the system log files:

    Timeout waiting for BOOTP/DHCP reply. Retrying...
    TFTP Error Access violation

    If the /tftpboot directory is not present on the vice-master node, the diskless node does not boot after a switchover. To set up the /tftpboot directory on the vice-master node, see the Netra High Availability Suite Foundation Services 2.1 6/03 Custom Installation Guide.

  6. On the console of the master node, confirm that the following directory contains a file for each diskless node interface: diskless_file_system/root/diskless_nodeid/etc/

    The files could be named as follows:

    hostname.hme0
    hostname.hme1

    If the directory does not contain a file for an interface, the interface cannot be configured.
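The per-interface check in Step 6 can be sketched as follows. The function name `missing_hostname_files` is hypothetical, and the interface names are whatever your diskless node uses (hme0 and hme1 in the example above).

```shell
# Sketch (hypothetical helper): report which hostname.<interface> files are
# absent from a diskless node's etc directory on the master node.
# Usage: missing_hostname_files ETC_DIR INTERFACE...
# Prints the name of each missing file; prints nothing if all are present.
missing_hostname_files() {
    etc_dir=$1
    shift
    for iface in "$@"; do
        [ -f "$etc_dir/hostname.$iface" ] || echo "hostname.$iface"
    done
}
```

For example, `missing_hostname_files diskless_file_system/root/diskless_nodeid/etc hme0 hme1`, with the directory placeholders replaced by your own paths, lists any interface that cannot be configured because its file is missing.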

  7. Examine the access permissions of root, swap, and usr in the nhfs.conf file on the master node.

    If your cluster was installed manually or by the nhinstall tool, confirm that the following access permissions are set:

    share -F nfs -o rw,root=diskless_node_id-nic0:diskless_node_id-nic1:
    		diskless_node_id-cgtp0 /export/root/diskless_node_id
    share -F nfs -o rw,root=diskless_node_id-nic0:diskless_node_id-nic1:
    		diskless_node_id-cgtp0 /export/swap/diskless_node_id
    share -F nfs -o ro /export/exec/Solaris_X_sparc.all/usr

    where X is the version of the Solaris operating system installed on the cluster.
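To compare the entries for a given node, it can help to generate the expected lines and compare them with the file by eye or with diff. The sketch below is an illustration under stated assumptions: the function name `expected_share_lines` is hypothetical, each entry is written on a single line, and the swap entry is assumed to list the same three host names (nic0, nic1, cgtp0) as the root entry.

```shell
# Sketch (hypothetical helper): print the share entries expected for one
# diskless node, following the pattern shown in this step.
# Usage: expected_share_lines DISKLESS_NODE_ID SOLARIS_VERSION
expected_share_lines() {
    node=$1
    solaris_version=$2
    # Assumption: root and swap grant root access to the same three host names.
    root_hosts="${node}-nic0:${node}-nic1:${node}-cgtp0"
    echo "share -F nfs -o rw,root=${root_hosts} /export/root/${node}"
    echo "share -F nfs -o rw,root=${root_hosts} /export/swap/${node}"
    echo "share -F nfs -o ro /export/exec/Solaris_${solaris_version}_sparc.all/usr"
}
```

Call the function with the identity of the diskless node and the Solaris version of the cluster, then check that each printed entry appears in the file on the master node.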

  8. If you cannot resolve this problem, contact your customer support center.
