4    Cluster Administration

This chapter discusses the following topics:

  Configuring a NetRAIN virtual interface for a cluster LAN interconnect (Section 4.1)
  Tuning the LAN interconnect (Section 4.2)
  Obtaining network adapter configuration information (Section 4.3)
  Monitoring LAN interconnect activity (Section 4.4)
  Migrating from Memory Channel to LAN (Section 4.5)
  Migrating from LAN to Memory Channel (Section 4.6)
  Troubleshooting (Section 4.7)

4.1    Configuring a NetRAIN Virtual Interface for a Cluster LAN Interconnect

If you did not configure the cluster interconnect to use a redundant array of independent network adapters (NetRAIN) virtual interface during cluster installation, you can do so afterwards. However, the requirements and rules for configuring a NetRAIN virtual interface for use in a cluster interconnect differ from those documented in the Tru64 UNIX Network Administration: Connections manual.

Unlike a typical NetRAIN virtual device, a NetRAIN device for the cluster interconnect is set up completely within the ics_ll_tcp kernel subsystem in /etc/sysconfigtab and not in /etc/rc.config. This allows the interconnect to be established very early in the boot path, when it is needed by cluster components to establish membership and transfer I/O.

Caution

Never change the attributes of a member's cluster interconnect NetRAIN device outside of its /etc/sysconfigtab file (that is, by using an ifconfig command or the SysMan Station, or by defining it in the /etc/rc.config file and restarting the network). Doing so will put the NetRAIN device outside of cluster control and may cause the member system to be removed from the cluster. See Section 4.7.4 for more information.

To configure a NetRAIN interface for a cluster interconnect after cluster installation, perform the following steps on each member:

  1. To eliminate the LAN interconnect as a single point of failure, one or more Ethernet switches are required for the cluster interconnect (two are required for a no-single-point-of-failure (NSPOF) LAN interconnect configuration), in addition to redundant Ethernet adapters on the member configured as a NetRAIN set. If you must install additional network hardware, halt and turn off the member system. Install the network cards on the member and cable each to different switches, as recommended in Section 2.1. Turn on the switches and reboot the member. If you do not need to install additional hardware, you can skip this step.

  2. Use the ifconfig -a command to determine the names of the Ethernet adapters to be used in the NetRAIN set.

  3. If you intend to configure an existing NetRAIN set for a cluster interconnect (for example, one previously configured for an external network), you must first undo its current configuration:

    1. Use the rcmgr delete command to delete the NRDEV_x, NRCONFIG_x, NETDEV_x, and IFCONFIG_x variables associated with the device from the member's /etc/rc.config file.

    2. Use the rcmgr set command to decrement the NR_DEVICES and NUM_NETCONFIG variables. Both commands are shown in the example that follows.
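
    For example, assuming the member's only NetRAIN set was recorded as NRDEV_0 and configured as network device 1, with NR_DEVICES set to 1 and NUM_NETCONFIG set to 2 before the change (these indices and counts are hypothetical; use the values from your member's /etc/rc.config file):

    # rcmgr delete NRDEV_0
    # rcmgr delete NRCONFIG_0
    # rcmgr delete NETDEV_1
    # rcmgr delete IFCONFIG_1
    # rcmgr set NR_DEVICES 0
    # rcmgr set NUM_NETCONFIG 1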

  4. Edit the /etc/sysconfigtab file to add the new adapter. For example, change:

    ics_ll_tcp:
     
    ics_tcp_adapter0 = ee0
     
    

    to:

    ics_ll_tcp:
     
    ics_tcp_adapter0 = nr0
    ics_tcp_nr0[0] = ee0
    ics_tcp_nr0[1] = ee1
     
    

  5. Reboot the member. The member is now using the NetRAIN virtual interface as its physical cluster interconnect.

  6. Use the ifconfig command to verify that the NetRAIN device is now defined with the CLUIF flag. For example:

    # ifconfig nr0
    nr0: flags=1000c63<UP,BROADCAST,NOTRAILERS,RUNNING,MULTICAST,SIMPLEX,CLUIF>
         NetRAIN Attached Interfaces: ( ee0 ee1 ) Active Interface: ( ee0 )
        inet 10.1.0.2 netmask ffffff00 broadcast 10.1.0.255 ipmtu 1500 
    

  7. Repeat this procedure for each remaining member.
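
You can also confirm that the kernel has adopted the new interconnect settings by querying the ics_ll_tcp subsystem directly. The following is a sketch, with output matching the preceding example:

# sysconfig -q ics_ll_tcp ics_tcp_adapter0
ics_ll_tcp:
ics_tcp_adapter0 = nr0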

4.2    Tuning the LAN Interconnect

This section provides guidelines for tuning the LAN interconnect.

Caution

Do not tune a NetRAIN virtual interface being used for a cluster interconnect using those mechanisms used for other NetRAIN devices (including ifconfig, niffconfig, and niffd command options or netrain or ics_ll_tcp kernel subsystem attributes). Doing so is likely to disrupt cluster operation. The cluster software ensures that the NetRAIN device for the cluster interconnect is tuned for optimal cluster operation.

4.2.1    Improving Cluster Interconnect Performance by Setting Its ipmtu Value

Some applications may see a performance benefit if you set the IP maximum transfer unit (ipmtu) for the cluster interconnect virtual interface (ics0) on each member to the same value used by its physical interface (membern-icstcp0). The recommended value depends on the type of cluster interconnect in use.

To view the current ipmtu settings for the virtual and physical cluster interconnect devices, use the following command:

# ifconfig -a
ee0: flags=1000c63<UP,BROADCAST,NOTRAILERS,RUNNING,MULTICAST,SIMPLEX,CLUIF>
     inet 10.1.0.100 netmask ffffff00 broadcast 10.1.0.255 ipmtu 1500 
 
ics0: flags=1100063<UP,BROADCAST,NOTRAILERS,RUNNING,NOCHECKSUM,CLUIF>
     inet 10.0.0.1 netmask ffffff00 broadcast 10.0.0.255 ipmtu 7000 
 

Because this cluster member uses the ee0 Ethernet device as its physical cluster interconnect device, you would change the ipmtu for its virtual cluster interconnect device (ics0) from 7000 to 1500 to match.

To set the ipmtu value for the ics0 virtual device, perform the following procedure:

  1. Add the following line to the /etc/inet.local file on each member, supplying an ipmtu value:

    ifconfig ics0 ipmtu value
     
    

  2. Restart the network on each member by using the rcinet restart command. Both steps are illustrated in the following example.
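
For example, to match the 1500-byte ipmtu of the ee0 physical interconnect device shown earlier (a sketch; use the value that is appropriate for your interconnect type), append the line to /etc/inet.local and restart the network:

# echo "ifconfig ics0 ipmtu 1500" >> /etc/inet.local
# rcinet restart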

4.3    Obtaining Network Adapter Configuration Information

To display information from the datalink driver for a network adapter, such as its name, speed, and operating mode, use the SysMan Station or the hwmgr -get attr -cat network command. In the following example, tu2 is the client network adapter, running at 10 Mb/s in half-duplex mode, and ee0 and ee1 form a NetRAIN virtual interface configured as the LAN interconnect, running at 100 Mb/s in full-duplex mode:

# hwmgr -get attr -cat network | grep -E 'name|speed|duplex'
  name = tu2
  media_speed = 10
  full_duplex = 0
  user_name = (null) (settable)
  name = ee0
  media_speed = 100
  full_duplex = 1
  user_name = (null) (settable)
  name = ee1
  media_speed = 100
  full_duplex = 1
  user_name = (null) (settable)
 

4.4    Monitoring LAN Interconnect Activity

Use the netstat command to monitor the traffic across the LAN interconnect. For example:

# netstat -acdnots -I nr0
nr0 Ethernet counters at Mon Apr 30 14:15:15 2001
 
           65535 seconds since last zeroed
      3408205675 bytes received
      4050893586 bytes sent
         7013551 data blocks received
         6926304 data blocks sent
         7578066 multicast bytes received
          115546 multicast blocks received
         3182180 multicast bytes sent
           51014 multicast blocks sent
               0 blocks sent, initially deferred
               0 blocks sent, single collision
               0 blocks sent, multiple collisions
               0 send failures
               0 collision detect check failure
               0 receive failures
               0 unrecognized frame destination
               0 data overruns
               0 system buffer unavailable
               0 user buffer unavailable
nr0: access filter is disabled
 

Use the ifconfig -a and niffconfig -v commands to monitor the status of the active and inactive adapters in a NetRAIN virtual interface.

# ifconfig -a
ee0: flags=1000c63<UP,BROADCAST,NOTRAILERS,RUNNING,MULTICAST,SIMPLEX,CLUIF>
     NetRAIN Virtual Interface: nr0 
     NetRAIN Attached Interfaces: ( ee1 ee0 ) Active Interface: ( ee1 )
 
ee1: flags=1000c63<UP,BROADCAST,NOTRAILERS,RUNNING,MULTICAST,SIMPLEX,CLUIF>
     NetRAIN Virtual Interface: nr0 
     NetRAIN Attached Interfaces: ( ee1 ee0 ) Active Interface: ( ee1 )
 
ics0: flags=1100063<UP,BROADCAST,NOTRAILERS,RUNNING,NOCHECKSUM,CLUIF>
     inet 10.0.0.200 netmask ffffff00 broadcast 10.0.0.255 ipmtu 1500 
 
lo0: flags=100c89<UP,LOOPBACK,NOARP,MULTICAST,SIMPLEX,NOCHECKSUM>
     inet 127.0.0.1 netmask ff000000 ipmtu 4096 
 
nr0: flags=1000c63<UP,BROADCAST,NOTRAILERS,RUNNING,MULTICAST,SIMPLEX,CLUIF>
     NetRAIN Attached Interfaces: ( ee1 ee0 ) Active Interface: ( ee1 )
     inet 10.1.0.2 netmask ffffff00 broadcast 10.1.0.255 ipmtu 1500 
 
sl0: flags=10<POINTOPOINT>
 
tu0: flags=c63<UP,BROADCAST,NOTRAILERS,RUNNING,MULTICAST,SIMPLEX>
     inet 16.140.112.176 netmask ffffff00 broadcast 16.140.112.255 ipmtu 1500 
 
tun0: flags=80<NOARP>
 

# niffconfig -v
Interface:   ee1, description: NetRAIN internal, status:      UP, event:   ALERT, state: GREEN
         t1: 3, dt: 2, t2: 10, time to dead: 3, current_interval: 3, next time: 1
Interface:   nr0, description: NetRAIN internal, status:      UP, event:   ALERT, state: GREEN
         t1: 3, dt: 2, t2: 10, time to dead: 3, current_interval: 3, next time: 1
Interface:   ee0, description: NetRAIN internal, status:      UP, event:   ALERT, state: GREEN
         t1: 3, dt: 2, t2: 10, time to dead: 3, current_interval: 3, next time: 2
Interface:   tu0, description:                 , status:      UP, event:   ALERT, state: GREEN
         t1: 20, dt: 5, t2: 60, time to dead: 30, current_interval: 20, next time: 20
 

4.5    Migrating from Memory Channel to LAN

This section discusses how to migrate a cluster that uses Memory Channel as its cluster interconnect to a LAN interconnect.

Replacing a Memory Channel interconnect with a LAN interconnect requires some cluster downtime and interruption of service.

Note

If you are performing a rolling upgrade (as described in the Cluster Installation manual) from TruCluster Server Version 5.1 to TruCluster Server Version 5.1A and intend to replace the Memory Channel with a LAN interconnect, plan on installing the LAN hardware on each member during the roll. Doing so allows you to avoid performing steps 1 through 4 in the following procedure.

To prepare to migrate an existing cluster using the Memory Channel interconnect to using a LAN interconnect, perform the following procedure for each cluster member:

  1. Halt and turn off the cluster member.

  2. Install the network adapters. Configure any required switches or hubs.

  3. Turn on the cluster member.

  4. Boot the member over Memory Channel into the existing cluster.

At this point, you can configure the newly installed Ethernet hardware as a conventional private subnet shared by all cluster members. You can verify that the hardware is configured properly and operates correctly before setting it up as a LAN interconnect. Do not use the rcmgr command or statically edit the /etc/rc.config file to permanently set up this network. Because this test network must not survive the reboot of the cluster over the LAN interconnect, use ifconfig commands on each member to set it up, as shown in the following example.
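
For example, assuming ee0 is the newly installed adapter on each of two members and 10.2.0.0 (netmask 255.255.255.0) is an otherwise unused test subnet (both the adapter name and the addresses are hypothetical):

# ifconfig ee0 10.2.0.1 netmask 255.255.255.0 up    # on the first member
# ifconfig ee0 10.2.0.2 netmask 255.255.255.0 up    # on the second member
# ping 10.2.0.2                                     # from the first member

Because these addresses are assigned with ifconfig only, they do not survive a reboot, which is what this procedure requires.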

To configure the LAN interconnect, perform the following steps:

  1. On each member, make backup copies of the member-specific /etc/sysconfigtab and /etc/rc.config files.

  2. On each member, inspect the member-specific /etc/rc.config file, paying special attention to the NETDEV_x and NRDEV_x configuration variables. Because the network adapters used for the LAN interconnect must be configured very early in the boot process, they are defined in /etc/sysconfigtab (see the next step) and must not be defined in /etc/rc.config. This also applies to NetRAIN devices. Decide whether you are configuring new devices or reconfiguring old devices for the LAN interconnect. If the latter, you must edit the NRDEV_x, NRCONFIG_x, NETDEV_x, IFCONFIG_x, NR_DEVICES, and NUM_NETCONFIG variables so that the same network device names do not appear both in the /etc/rc.config file and in the ics_ll_tcp stanza of the /etc/sysconfigtab file. A sketch of one way to inspect these variables follows this step.
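
    For example, you can inspect the variables with the rcmgr get command (the indices and device names shown here are hypothetical; check each variable recorded in your member's file):

    # rcmgr get NR_DEVICES
    1
    # rcmgr get NRDEV_0
    ee0,ee1
    # rcmgr get NETDEV_1
    nr0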

  3. On each member, set the clubase kernel attribute cluster_interconnect to tcp and the following ics_ll_tcp kernel attributes as appropriate for the member's network configuration. For example:

    clubase:
    cluster_interconnect = tcp
    #
    ics_ll_tcp:
    ics_tcp_adapter0 = nr0
    ics_tcp_nr0[0] = ee0
    ics_tcp_nr0[1] = ee1
    ics_tcp_inetaddr0 = 10.1.0.1
    ics_tcp_netmask0 = 255.255.255.0
     
    

    For a cluster that was rolled to TruCluster Server Version 5.1A from TruCluster Server Version 5.1, also edit the cluster_node_inter_name attribute of the clubase kernel subsystem. For example:

    clubase:
    cluster_node_inter_name = pepicelli-ics0
     
    

  4. Edit the clusterwide /etc/hosts file so that it contains the IP name and IP address of the cluster interconnect low-level TCP interfaces. For example:

    127.0.0.1           localhost
    16.140.112.238      pepicelli.zk3.dec.com       pepicelli
    16.120.112.209      deli.zk3.dec.com            deli
    10.0.0.1            pepicelli-ics0
    10.1.0.1            member1-icstcp0
    10.0.0.2            pepperoni-ics0               
    10.1.0.2            member2-icstcp0
    16.140.112.176      pepperoni.zk3.dec.com       pepperoni
     
    

  5. For a cluster that was rolled to TruCluster Server Version 5.1A from TruCluster Server Version 5.1, edit the clusterwide /etc/hosts.equiv file and the clusterwide /.rhosts file, changing the mc0 entries to ics0 entries. For example, change:

    deli.zk3.dec.com
    pepicelli-mc0
    pepperoni-mc0
     
    

    to:

    deli.zk3.dec.com
    pepicelli-ics0
    member1-icstcp0
    pepperoni-ics0
    member2-icstcp0
     
    

  6. For a cluster that was rolled to TruCluster Server Version 5.1A from TruCluster Server Version 5.1, use the rcmgr set command to change the CLUSTER_NET variable in the /etc/rc.config file on each member. For example:

    # rcmgr get CLUSTER_NET
    pepicelli-mc0
    # rcmgr set CLUSTER_NET pepicelli-ics0
     
    

  7. Halt all cluster members.

  8. Boot all cluster members, one at a time.

4.6    Migrating from LAN to Memory Channel

This section discusses how to migrate a cluster that uses a LAN interconnect as its cluster interconnect to Memory Channel.

To configure the Memory Channel, perform the following steps:

  1. On each member, make a backup copy of the member-specific /etc/sysconfigtab file.

  2. On each member, set the clubase kernel attribute cluster_interconnect to mct in the member's /etc/sysconfigtab file, as shown in the example that follows.
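
    For example, the clubase stanza in /etc/sysconfigtab would contain:

    clubase:
    cluster_interconnect = mct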

  3. Halt all cluster members.

  4. If Memory Channel hardware is installed in the cluster, reboot all cluster members one at a time.

    If Memory Channel hardware is not yet installed in the cluster:

    1. Power off all members.

    2. Install and configure Memory Channel adapters, cables, and hubs as described in the Cluster Hardware Configuration manual.

    3. When the Memory Channel hardware has been properly set up, reboot all cluster members one at a time.

4.7    Troubleshooting

This section discusses the following problems that can occur due to a misconfigured LAN interconnect and how you can resolve them:

  Booting member joins the cluster but appears to hang before reaching multi-user mode (Section 4.7.1)
  Booting member hangs while trying to join the cluster (Section 4.7.2)
  Booting member panics with an "ics_broadcast_setup" message (Section 4.7.3)
  Booting member displays an "ifconfig ioctl (SIOCIFADD): Function not implemented: nr0" message (Section 4.7.4)
  Many broadcast errors on booting, or booting a new member panics an existing member (Section 4.7.5)
  Cannot manually fail over devices in a NetRAIN virtual interface (Section 4.7.6)
  Applications unable to map to a port (Section 4.7.7)

4.7.1    Booting Member Joins Cluster But Appears to Hang Before Reaching Multi-User Mode

If a new member appears to hang at boot time sometime after joining the cluster, the speed or operational mode of the booting member's LAN interconnect adapter is probably inconsistent with that of the LAN interconnect. This problem can result from the adapter failing to autonegotiate properly, from improper hardware settings, or from faulty Ethernet hardware. To determine whether this problem exists, pay close attention to console messages of the following form on the booting member:

ee0: Parallel Detection, 10 Mbps half duplex
ee0: Autonegotiated, 100 Mbps full duplex
 

For a cluster interconnect running at 100 Mb/s in full-duplex mode, the first message may indicate a problem. The second message indicates that autonegotiation has completed successfully.

The autonegotiation behavior of the Ethernet adapters and switches that are configured in the interconnect can also cause unexpected hangs at boot time. Make sure that the speed and duplex mode of each adapter match the settings of the switch port to which it is connected: either allow both the adapter and the switch port to autonegotiate, or explicitly set both to the same speed and duplex mode.

4.7.2    Booting Member Hangs While Trying to Join Cluster

If a new member hangs at boot time while trying to join the cluster, the new member might be disconnected from the cluster interconnect; for example, its adapter might not be properly cabled to the LAN interconnect, or the interconnect attributes in its /etc/sysconfigtab file might be incorrect.

One of the following messages is typically displayed on the console:

CNX MGR: cannot form: quorum disk is in use.  Unable to establish contact
         with members using disk.
 

or

CNX MGR: Node pepperoni id 2 incarn 0xa3a71 attempting to form or join cluster deli
 

Perform the following steps to resolve this problem:

  1. Halt the booting member.

  2. Make sure the adapter is properly connected to the LAN interconnect.

  3. Mount the new member's boot partition on another member. For example:

    # mount root2_domain#root /mnt
     
    

  4. Examine the /mnt/etc/sysconfigtab file. The attributes listed in Table C-1 must be set correctly to reflect the member's LAN interconnect interface. One way to display the relevant stanza is sketched in the example that follows.
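
    For example, the following sed command prints the ics_ll_tcp stanza for review (a sketch; it assumes the stanza is followed by a blank line, and the values shown mirror the earlier example):

    # sed -n '/^ics_ll_tcp:/,/^$/p' /mnt/etc/sysconfigtab
    ics_ll_tcp:
    ics_tcp_adapter0 = nr0
    ics_tcp_nr0[0] = ee0
    ics_tcp_nr0[1] = ee1
    ics_tcp_inetaddr0 = 10.1.0.1
    ics_tcp_netmask0 = 255.255.255.0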

  5. Edit /mnt/etc/sysconfigtab as appropriate.

  6. Unmount the member's boot partition:

    # umount /mnt
     
    

  7. Reboot the member.

4.7.3    Booting Member Panics with "ics_broadcast_setup" Message

If you boot a new member into the cluster and it panics with an "ics_broadcast_setup: sobind failed error=49" message, you may have specified a nonexistent device as the member's physical cluster interconnect interface when you ran clu_add_member.

Perform the following steps to resolve this problem:

  1. Halt the booting member.

  2. Mount the new member's boot partition on another member. For example:

    # mount root2_domain#root /mnt
     
    

  3. Examine the /mnt/etc/sysconfigtab file. The attributes listed in Table C-1 must be set to correctly reflect the member's LAN interconnect interface.

  4. Edit /mnt/etc/sysconfigtab as appropriate.

  5. Unmount the member's boot partition:

    # umount /mnt
     
    

  6. Reboot the member.

4.7.4    Booting Member Displays "ifconfig ioctl (SIOCIFADD): Function not implemented: nr0" Message

If you boot a new member into the cluster and it displays the "ifconfig ioctl (SIOCIFADD): Function not implemented: nr0" message shortly after the installation tasks commence, a NetRAIN virtual interface used for the cluster interconnect has probably been misconfigured; for example, you may have edited the /etc/rc.config file to apply traditional NetRAIN administration to the LAN interconnect. In this case, the NetRAIN configuration in the /etc/rc.config file is ignored and the NetRAIN interface defined in /etc/sysconfigtab is used as the cluster interconnect.

Note

If the address you specify in /etc/rc.config for the cluster interconnect NetRAIN device is on the same subnet as that used by the cluster interconnect virtual device (ics0), the boot will display hundreds of instances of the following message after the ifconfig message:

WARNING: ics_socket_event: reconfig: error 54 on channel xx,
                assume node 2 is down
 

Eventually, the cluster will remove either the booting member or an existing member (if only one member is up) with one of the following panics:

CNX QDISK: Yielding to foreign owner with quorum.
 
CNX MGR: this node removed from cluster.
 

As discussed in Section 4.1, you must never configure a NetRAIN set that is used for a cluster interconnect in the /etc/rc.config file. (The NetRAIN virtual interface for the cluster interconnect is configured in the /etc/sysconfigtab file.)

Perform the following steps to resolve this problem:

  1. Use the rcmgr delete command to edit the newly booted member's /cluster/members/{memb}/etc/rc.config file to remove the NRDEV_x, NRCONFIG_x, NETDEV_x, and IFCONFIG_x variables associated with the device.

  2. Use the rcmgr set command to decrement the NR_DEVICES and NUM_NETCONFIG variables so that they no longer count the cluster interconnect NetRAIN device, which is now defined only in /etc/sysconfigtab.

  3. Reboot the member.

4.7.5    Many Broadcast Errors on Booting or Booting New Member Panics Existing Member

The Spanning Tree Protocol (STP) must be disabled on all Ethernet switch ports connected to the adapters on cluster members, whether the adapters are single adapters or members of NetRAIN virtual interfaces. If STP is not disabled, cluster members may be flooded by broadcast messages that, in effect, create denial-of-service symptoms in the cluster. You may see hundreds of instances of the following message when booting the first and subsequent members:

arp: local IP address 10.1.0.100 in use by hardware address 00-00-00-00-00-00
 

These messages will be followed by:

CNX MGR: cnx_pinger: broadcast problem: err 35
 

Booting additional members into this cluster may result in a hang or panic of existing members, especially if a quorum disk is configured. During the boot, you may see the following message:

CNX MGR: cannot form: quorum disk is in use. Unable to establish
contact with members using disk.
 

However, after 30 seconds or so, the member may succeed in discovering the quorum disk and form its own cluster, while the existing members hang or panic.

4.7.6    Cannot Manually Fail Over Devices in a NetRAIN Virtual Interface

NetRAIN monitors the health of inactive interfaces by checking whether they are receiving packets and, if necessary, by sending probe packets from the active interface. If an inactive interface becomes disconnected, NetRAIN may mark it as DEAD. If you pull the cables on the active adapter, NetRAIN attempts to activate the DEAD standby adapter. Unless there is a real problem with this adapter, the failover works properly.

However, a manual NetRAIN switch operation (for example, ifconfig nr0 switch) behaves in a different way. In this case, NetRAIN does not attempt to fail over to a DEAD adapter when there are no healthy standby adapters. The ifconfig nr0 switch command returns a message such as the following:

ifconfig ioctl (SIOCIFSWITCH) No such device nr0
 

You may see this behavior in a dual-switch configuration if one switch is power cycled and you immediately try to manually fail over an active adapter from the other switch. After the switch that has been powered on has initialized itself (in a few minutes or so), manual NetRAIN failover should behave properly. If the failover does not work correctly, examine the cabling of the switches and adapters and use the ifconfig and niffconfig commands to determine the state of the interfaces.

4.7.7    Applications Unable to Map to Port

By default, the communications subsystem in a cluster using a LAN interconnect uses port 900 as a rendezvous port for cluster broadcast traffic and reserves ports 901 through 910 and 912 through 917 for nonbroadcast channels. If an application uses a hardcoded reference to one of these ports, it will fail to bind to the port.

To remedy this situation, change the ports used by the LAN interconnect. Edit the ics_tcp_rendezvous_port and ics_tcp_ports attributes in the ics_ll_tcp subsystem, as described in sys_attrs_ics_ll_tcp(5), and reboot the entire cluster. The rendezvous port must be identical on all cluster members; the nonbroadcast ports may differ across members, although administration is simplified by defining the same ports on each member.
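
For example, a minimal sketch of the relevant stanza in each member's /etc/sysconfigtab file, assuming a hypothetical replacement rendezvous port of 920 (the ics_tcp_ports attribute, which controls the nonbroadcast ports, is set in the same stanza; see sys_attrs_ics_ll_tcp(5) for its exact syntax):

ics_ll_tcp:
ics_tcp_rendezvous_port = 920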