1    Managing Clusters Overview

Managing a TruCluster Server cluster is similar to managing a standalone Tru64 UNIX system. Of the more than 600 commands and utilities for system administration, fewer than 20 apply exclusively to clusters. You use most of those commands when creating a cluster, adding a new member to a cluster, or making an application highly available. If you know how to manage a Tru64 UNIX system, you already know most of what is needed to manage a TruCluster Server cluster.

This manual describes the relatively few situations where managing a cluster is different. For documentation about the other management procedures, see the Tru64 UNIX System Administration guide.

Before reading further, familiarize yourself with the material in the TruCluster Server Cluster Technical Overview. You need to understand the information in that manual to manage a cluster.

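This chapter discusses commands and utilities that are specific to clusters (Section 1.1) and commands and features that behave differently in a cluster (Section 1.2).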

In most cases, the fact that you are administering a cluster rather than a single system becomes apparent because of the occasional need to manage one of the following aspects of the TruCluster Server:

In addition to the previous items, there are some command-level exceptions when a cluster does not appear to the user like a single computer system. For example, when you execute the wall command, the message is sent only to users who are logged in on the cluster member where the command executes. To send a message to all users who are logged in on all cluster members, use the wall -c command.
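The difference between the two forms of wall can be modeled in a few lines. The following Python sketch is illustrative only: it is not part of TruCluster Server, and the function name wall_recipients, the session tuples, and the member names are hypothetical. It assumes only what is stated above, namely that plain wall reaches users on the local member and wall -c reaches users on every member.

```python
def wall_recipients(sessions, local_member, clusterwide=False):
    """Return the users who would see a broadcast message.

    sessions:     iterable of (user, member) pairs, one per login session
                  (hypothetical data layout, for illustration only).
    local_member: the member on which the command is executed.
    clusterwide:  True models `wall -c`; False models plain `wall`.
    """
    return sorted({user for user, member in sessions
                   if clusterwide or member == local_member})


# Hypothetical login sessions on a two-member cluster.
sessions = [("alice", "member1"), ("bob", "member2"), ("carol", "member1")]

print(wall_recipients(sessions, "member1"))                    # ['alice', 'carol']
print(wall_recipients(sessions, "member1", clusterwide=True))  # ['alice', 'bob', 'carol']
```

In the model, bob never sees a plain wall message sent from member1, which is the behavior that makes the -c option necessary.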

1.1    Commands and Utilities for Clusters

Table 1-1 lists commands that are specific to managing TruCluster Server systems. These commands manipulate or query aspects of a cluster. You can find descriptions for these commands in the reference pages.

Table 1-1:  Cluster Commands

| Function | Command | Description |
| --- | --- | --- |
| Create and configure cluster members | clu_create(8) | Creates an initial cluster member on a Tru64 UNIX system. |
| | clu_add_member(8) | Adds a member to a cluster. |
| | clu_delete_member(8) | Deletes a member from a cluster. |
| | clu_check_config(8) | Verifies that the TruCluster Server has been properly installed, and that the cluster is correctly configured. |
| | clu_get_info(8) | Displays information about a cluster and its members. |
| Define and manage highly available applications | caad(8) | Starts the CAA daemon. |
| | caa_profile(8) | Manages an application availability profile and performs basic syntax verification. |
| | caa_register(8) | Registers an application with CAA. |
| | caa_relocate(8) | Manually relocates a highly available application from one cluster member to another. |
| | caa_start(8) | Starts a highly available application registered with the CAA daemon. |
| | caa_stat(1) | Provides status on applications registered with CAA. |
| | caa_stop(8) | Stops a highly available application. |
| | caa_unregister(8) | Unregisters a highly available application. |
| Manage cluster alias | cluamgr(8) | Creates and manages cluster aliases. |
| Manage quorum and votes | clu_quorum(8) | Configures or deletes a quorum disk, or adjusts quorum disk votes, member votes, or expected votes. |
| Manage context-dependent symbolic links (CDSLs) | mkcdsl(8) | Makes or checks CDSLs. |
| Manage device request dispatcher | drdmgr(8) | Gets or sets distributed device attributes. |
| Manage Cluster File System (CFS) | cfsmgr(8) | Manages a mounted file system in a cluster. |
| Query the status of Memory Channel | imcs(1) | Reports the status of the Memory Channel application programming interface (API) library, libimc. |
| | imc_init(1) | Initializes and configures the Memory Channel API library, libimc, on the current host. |

1.2    Commands and Features That Are Different in a Cluster

The following tables list Tru64 UNIX commands and subsystems that have cluster-specific options, or that behave differently in a cluster than on a standalone Tru64 UNIX system.

In general, commands that manage processes are not cluster-aware and can be used only to manage the member on which they are executed.

Table 1-2 describes the differences in commands and utilities that manage file systems and storage.

In a standalone Tru64 UNIX system, the root file system (/) is root_domain#root. In a cluster, the root file system is always cluster_root#root. The boot partition for each cluster member is rootmemberID_domain#root.

For example, on the cluster member with member ID 6, the boot partition, /cluster/members/member6/boot_partition, is root6_domain#root.
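The naming rule above is mechanical, so it can be expressed as a small helper. This is a minimal Python sketch and not a TruCluster Server utility; the function name boot_partition_names and the dictionary keys are hypothetical, and the only facts it encodes are the two name patterns given above.

```python
def boot_partition_names(member_id):
    """Build the boot partition names for a cluster member.

    Encodes only the patterns stated in the text:
      mount point:     /cluster/members/member<ID>/boot_partition
      domain#fileset:  root<ID>_domain#root
    The function name and dictionary keys are illustrative assumptions.
    """
    if not isinstance(member_id, int) or member_id < 1:
        raise ValueError("member_id must be a positive integer")
    return {
        "mount_point": f"/cluster/members/member{member_id}/boot_partition",
        "domain_fileset": f"root{member_id}_domain#root",
    }


names = boot_partition_names(6)
print(names["mount_point"])     # /cluster/members/member6/boot_partition
print(names["domain_fileset"])  # root6_domain#root
```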

Table 1-2:  File Systems and Storage Differences

Command Differences
addvol(8)

In a single system, you cannot use addvol to expand root_domain. However, in a cluster, you can use addvol to add volumes to the cluster_root domain.

You can remove volumes from the cluster_root domain with the rmvol command.

Logical Storage Manager (LSM) volumes cannot be used within the cluster_root domain. An attempt to use the addvol command to add an LSM volume to the cluster_root domain fails.

bttape(8)

The bttape utility is not supported in clusters.

For more information about backing up and restoring files, see Section 9.8.

df(1)

The df command does not account for data in client caches. Data in client caches is synchronized to the server at least every 30 seconds. Until synchronization occurs, the physical file system is not aware of the cached data and does not allocate storage for it.

iostat(1)

The iostat command displays statistics for devices on a shared or private bus that are directly connected to the member on which the command executes.

Statistics pertain to traffic that is generated to and from the local member.

LSM

voldisk(8)

volencap(8)

volreconfig(8)

volstat(8)

volmigrate(8)

volunmigrate(8)

The voldisk list command can give different results on different members for disks that are not under LSM control (that is, autoconfig disks). The differences are typically limited to disabled disk groups. For example, one member might show a disabled disk group and another member might not display that disk group at all.

In a cluster, the volencap swap command places the swap devices for an individual cluster member into an LSM volume. Run the command on each member whose swap devices you want to encapsulate.

The volreconfig command is required only when you encapsulate members' swap devices. Run the command on each member whose swap devices you want to encapsulate. When encapsulating the cluster_usr domain with the volencap command, you must shut down the cluster to complete the encapsulation. The volreconfig command is called during the cluster reboot; you do not need to run it separately.

The volstat command returns statistics only for the member on which it is executed.

The volmigrate command modifies an Advanced File System (AdvFS) domain to use LSM volumes for its underlying storage. The volunmigrate command modifies any AdvFS domain to use physical disks instead of LSM volumes for its underlying storage.

For more information on LSM in a cluster, see Chapter 10.

mount(8)

Network File System (NFS) loopback mounts are not supported. For more information, see Section 7.6.2.3.

Other commands that run through mountd, like umount and export, receive a Program unavailable error when the commands are sent from external clients and do not use the default cluster alias or an alias listed in /etc/exports.aliases.

Prestoserve

presto(8)

dxpresto(8X)

prestosetup(8)

prestoctl_svc(8)

Prestoserve is not supported in a cluster.

showfsets(8)

The showfsets command does not account for data in client caches. Data in client caches is synchronized to the server at least every 30 seconds. Until synchronization occurs, the physical file system is not aware of the cached data and does not allocate storage for it.

Fileset quotas and storage limitations are enforced by ensuring that clients do not cache so much dirty data that they exceed quotas or the actual amount of physical storage.

UNIX File System (UFS)

Memory File System (MFS)

A UFS file system is served for read-only access based on connectivity. Upon member failure, CFS selects a new server for the file system. Upon path failure, CFS uses an alternate device request dispatcher path to the storage.

A cluster member can mount a UFS file system read/write. The file system is accessible only by that member. There is no remote access; there is no failover. An MFS file system mount, whether read-only or read/write, is accessible only by the member that mounts it. The server for an MFS file system or a read/write UFS file system is the member that initializes the mount.

verify(8)

You can use the verify command on the cluster root domain, but you cannot use the -f and -d options.

For more information, see Section 9.11.1.
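The UFS and MFS access rules in Table 1-2 reduce to a simple decision. The following Python sketch models that decision under a stated simplification: the table says read-only UFS is served "based on connectivity," and the sketch assumes every member has connectivity to the storage. The function name members_with_access and its parameters are hypothetical and do not correspond to any cluster command.

```python
def members_with_access(fs_type, read_write, mounting_member, members):
    """Return the cluster members that can access a UFS or MFS mount.

    Simplified model of Table 1-2 (assumption: every member has
    connectivity to the underlying storage):
      - UFS mounted read-only: served clusterwide.
      - UFS mounted read/write: only the mounting member, no failover.
      - MFS, read-only or read/write: only the mounting member.
    """
    fs_type = fs_type.lower()
    if fs_type not in ("ufs", "mfs"):
        raise ValueError("this sketch models only UFS and MFS")
    if mounting_member not in members:
        raise ValueError("mounting_member must be a cluster member")
    if fs_type == "ufs" and not read_write:
        return sorted(members)
    return [mounting_member]


members = ["member1", "member2", "member3"]  # hypothetical member names
print(members_with_access("ufs", False, "member2", members))
print(members_with_access("ufs", True, "member2", members))
print(members_with_access("mfs", False, "member2", members))
```

Only the first call returns all three members; the read/write UFS mount and the MFS mount are both limited to member2.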

Table 1-3 describes the differences in commands and utilities that manage networking.

Table 1-3:  Networking Differences

Command Differences

Berkeley Internet Name Domain (BIND)

bindconfig(8)

bindsetup(8)

svcsetup(8)

The bindsetup command was retired in Tru64 UNIX Version 5.0. Use the sysman dns command or the equivalent command, bindconfig, to configure BIND in a cluster.

BIND client configuration is clusterwide. All cluster members have the same client configuration.

Only one member of a cluster can be a BIND server. A BIND server is configured as a highly available service under CAA. The cluster alias acts as the server name.

For more information, see Section 7.4.

Broadcast messages

wall(1)

rwall(1)

The wall -c command sends messages to all users on all members of the cluster. Without any options, the wall command sends messages to all users who are logged in to the member where the command is executed.

Broadcast messages to the default cluster alias from rwall are sent to all users logged in on all cluster members.

In a cluster, a clu_wall daemon runs on each cluster member to receive wall -c messages.

Dynamic Host Configuration Protocol (DHCP)

joinc(8)

A cluster can be a DHCP server, but cluster members cannot be DHCP clients. Do not run joinc in a cluster. Cluster members must use static addressing.

For more information, see Section 7.1.

dsfmgr(8)

When using the -a class option, specify c (cluster) as the entry_type.

The output from the -s option indicates c (cluster) as the scope of the device.

The -o and -O options, which create device special files in the old format, are not valid in a cluster.

Mail

mailconfig(8)

mailsetup(8)

mailstats(8)

All members that are running mail must have the same mail configuration and, therefore, must have the same protocols enabled. All members must be either clients or servers. See Section 7.8 for details.

The mailstats command returns mail statistics for the cluster member on which it was run. The mail statistics file, /usr/adm/sendmail/sendmail.st, is a member-specific file; each cluster member has its own version of the file.

Network File System (NFS)

nfsconfig(8)

rpc.lockd(8)

rpc.statd(8)

Use sysman nfs or the nfsconfig command to configure NFS. Do not use the nfssetup command; it was retired in Tru64 UNIX Version 5.0.

Cluster members can run client versions of lockd and statd. Only one cluster member runs an additional lockd and statd pair for the NFS server. The server lockd and statd are highly available and are under the control of CAA.

For more information, see Section 7.6.

Network management

netconfig(8)

netsetup(8)

gated(8)

routed(8)

If, as we recommended, you configured networks during cluster configuration, gated was configured as the routing daemon. See the TruCluster Server Cluster Installation manual for more information.

If you later run netconfig, you must select gated, not routed, as the routing daemon.

The netsetup command has been retired. Do not use it.

Network Interface Failure Finder (NIFF)

niffconfig(8)

niffd(8)

In order for NIFF to monitor the network interfaces in the cluster, niffd, the NIFF daemon, must run on each cluster member. For more information, see Section 6.1.

Network Information Service (NIS)

nissetup(8)

NIS runs as a highly available application. The default cluster alias name is used to identify the NIS master.

For more information, see Section 7.2.

Network Time Protocol (NTP)

ntp(1)

All cluster members require time synchronization. NTP meets this requirement.

Each cluster member is automatically configured as an NTP peer of the other members. You do not need to do any special NTP configuration.

For more information, see Section 7.5.

routed(8)

The routed daemon is not supported in TruCluster Server systems. The cluster alias requires gated.

When you create the initial cluster member, clu_create configures gated. When you add a new cluster member, clu_add_member propagates the configuration to the new member.

For more information about routers, see Section 6.2.

Table 1-4 describes the differences in printing management.

Table 1-4:  Printing Differences

Command Differences

lprsetup(8)

printconfig(8)

A cluster-specific printer attribute, on, designates the cluster members that are serving the printer. The print configuration utilities, lprsetup and printconfig, provide an easy means for setting the on attribute.

The file /etc/printcap is shared by all members in the cluster.

For more information, see Section 7.3.

Advanced Printing Software

For information on installing and using Advanced Printing Software in a cluster, see the configuration notes chapter in the Tru64 UNIX Advanced Printing Software Release Notes.

Table 1-5 describes the differences in managing security. For information on enhanced security in a cluster, see the Tru64 UNIX Security manual.

Table 1-5:  Security Differences

Command Differences

auditd(8)

auditconfig(8)

audit_tool(8)

A cluster is a single security domain. To have root privileges on the cluster, you can log in as root on the cluster alias or on any one of the cluster members. Similarly, access control lists (ACLs) and user authorizations and privileges are clusterwide.

With the exception of audit log files, security-related files, directories, and databases are shared throughout the cluster. Audit log files are specific to each member: an audit daemon, auditd, runs on each member, and each member has its own unique audit log files. If any single cluster member fails, auditing continues uninterrupted for the other cluster members.

To generate an audit report for the entire cluster, you can pass the name of the audit log CDSL to the audit reduction tool, audit_tool. Specify the appropriate individual log names to generate an audit report for one or more members.

If you want enhanced security, we strongly recommend that you configure enhanced security before cluster creation. A clusterwide shutdown and reboot are required to configure enhanced security after cluster creation.

rlogin(1)

rsh(1)

rcp(1)

An rlogin, rsh, or rcp request from the cluster uses the default cluster alias as the source address. Therefore, if a noncluster host must allow remote host access from any account in the cluster, its .rhosts file must include the cluster alias name (in one of the forms by which it is listed in the /etc/hosts file or one resolvable through NIS or the Domain Name System (DNS)).

The same requirement holds for rlogin, rsh, or rcp to work between cluster members.

For more information, see Section 5.3.

Table 1-6 describes the differences in commands and utilities for configuring and managing systems.

Table 1-6:  General System Management Differences

Command Differences

Dataless Management Services (DMS)

DMS is not supported in a TruCluster Server environment. A cluster can be neither a DMS client nor a server.

Event Manager (EVM) and event management

Events have a cluster_event attribute. When this attribute is set to true, the event, when it is posted, is posted to all members of the cluster. Events with cluster_event set to false are posted only to the member on which the event was generated.

For a list of cluster events, see Appendix A.

halt(8)

reboot(8)

init(8)

shutdown(8)

There is no clusterwide halt or reboot.

The halt and reboot commands act only on the member on which the command is executed. halt, reboot, and init have been modified to leave file systems in a cluster mounted, because the file systems are automatically relocated to another cluster member.

You can use shutdown -c to halt a cluster.

The shutdown -c time command fails if any of the commands clu_quorum, clu_add_member, or clu_delete_member is in progress.

You can shut down a cluster to a halt, but you cannot reboot (shutdown -r) the entire cluster.

To shut down a single cluster member, execute the shutdown command from that member.

For more information, see shutdown(8).

hwmgr(8)

In a cluster, the -member option allows you to designate the host name of the cluster member that the hwmgr command acts upon.

Use the -cluster option to specify that the command acts clusterwide.

When neither the -member nor -cluster option is used, hwmgr acts on the system where it is executed.

Process control

ps(1)

A range of possible process identifiers (PIDs) is assigned to each cluster member to provide unique process IDs clusterwide. The ps command reports only on processes that are running on the member where the command executes.

kill(1)

If the passed parameter is greater than zero (0), the signal is sent to the process whose PID matches the passed parameter, no matter on which cluster member it is running. If the passed parameter is less than -1, the signal is sent to all processes (clusterwide) whose process group ID matches the absolute value of the passed parameter.

Even though the PID for init on a cluster member is not 1, kill 1 behaves as it would on a standalone system and sends the signal to all processes on the current cluster member, except for kernel idle and /sbin/init.

rcmgr(8)

The hierarchy of the /etc/rc.config* files allows an administrator to define configuration variables consistently over all systems within a local area network (LAN) and within a cluster.

For more information, see Section 5.1.

sysman_clone(8)

sysman -clone

Configuration cloning and replication are not supported in a cluster.

Attempts to use the sysman -clone command in a cluster fail and return the following message: Error: Cloning in a cluster environment is not supported.

System accounting services and the associated commands

fuser(8)

mailstats(8)

ps(1)

uptime(1)

vmstat(1)

w(1)

who(1)

These commands are not cluster-aware. Executing one of these commands returns information for only the cluster member on which the command executes. It does not return information for the entire cluster.

See Section 5.13.
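As one illustration of the entries in Table 1-6, the cluster_event rule for EVM can be written as a one-line decision. The Python sketch below is a model of that rule only; it does not use the EVM API, and the function name posting_targets and the member names are hypothetical.

```python
def posting_targets(cluster_event, origin, members):
    """Return the members to which an event is posted.

    Models the rule from Table 1-6:
      - cluster_event is true:  the event is posted to all members.
      - cluster_event is false: the event is posted only to the member
        on which it was generated (origin).
    """
    if origin not in members:
        raise ValueError("origin must be a cluster member")
    return sorted(members) if cluster_event else [origin]


members = ["member1", "member2", "member3"]  # hypothetical member names
print(posting_targets(True, "member2", members))   # ['member1', 'member2', 'member3']
print(posting_targets(False, "member2", members))  # ['member2']
```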

Table 1-7 describes features that TruCluster Server does not support.

Table 1-7:  Features Not Supported

Feature Comments

Archiving

bttape(8)

The bttape utility is not supported in clusters.

For more information about backing up and restoring files, see Section 9.8.

LSM

volrootmir(8)

volunroot(8)

The volrootmir and volunroot commands are not supported for clusters.

For more information on LSM in a cluster, see Chapter 10.

mount(8)

NFS loopback mounts are not supported. For more information, see Section 7.6.2.3.

Other commands that run through mountd, like umount and export, receive a Program unavailable error when the commands are sent from external clients and do not use the default cluster alias or an alias listed in /etc/exports.aliases.

Prestoserve

presto(8)

dxpresto(8X)

prestosetup(8)

prestoctl_svc(8)

Prestoserve is not supported in a cluster.

routed(8)

The routed daemon is not supported in TruCluster Server systems. The cluster alias requires gated.

When you create the initial cluster member, clu_create configures gated. When you add a new cluster member, clu_add_member propagates the configuration to the new member.

For more information about routers, see Section 6.2.

Dataless Management Services (DMS)

DMS is not supported in a TruCluster Server environment. A cluster can be neither a DMS client nor a server.

UNIX File System (UFS)

A cluster member can mount a UFS file system read/write. The file system is accessible only by that member. There is no remote access; there is no failover.

sysman_clone(8)

sysman -clone

Configuration cloning and replication are not supported in a cluster.

Attempts to use the sysman -clone command in a cluster fail and return the following message: Error: Cloning in a cluster environment is not supported.