This chapter describes general migration issues that are relevant to all
types of applications.
Table 4-1 lists each migration issue, the types of applications that
might encounter it, and where to find more information.
Table 4-1: Application Migration Considerations
Issues | Application Types Affected | For More Information
Clusterwide and member-specific files | Single-instance, Multi-instance, Distributed | Section 4.1
Device naming | Single-instance, Multi-instance, Distributed | Section 4.2
Interprocess communication | Multi-instance, Distributed | Section 4.3
Synchronized access to shared data | Multi-instance, Distributed | Section 4.4
Member-specific resources | Single-instance | Section 4.5
Expanded process IDs (PIDs) | Multi-instance, Distributed | Section 4.6
Distributed lock manager (DLM) parameters removed | Multi-instance, Distributed | Section 4.7
Licensing | Single-instance, Multi-instance, Distributed | Section 4.8
Blocking layered products | Single-instance, Multi-instance, Distributed | Section 4.9
Other migration issues | Single-instance, Multi-instance, Distributed | Section 4.10
4.1 Clusterwide and Member-Specific Files
In a cluster, there are two sets of configuration data:
Clusterwide data
Clusterwide data pertains to files and logs that can be shared by all
members of a cluster.
For example, when two systems are members of a cluster, they share a
common /etc/passwd file that contains information about the authorized
users for both systems.
Sharing configuration or management data makes file management easier. For example, Apache and Netscape configuration files can be shared, allowing you to manage the application from any node in the cluster.
Member-specific data
Do not allow files that contain member-specific data to be shared by all members of a cluster. Member-specific data may be configuration details that pertain to hardware found only on a specific system, such as a layered product driver for a specific printer connected to one cluster member.
Because the Cluster File System (CFS) makes all files visible to and accessible by all cluster members, those applications that require clusterwide configuration data can easily write to a configuration file that all members can view. However, an application that must use and maintain member-specific configuration information needs to take some additional steps to avoid overwriting files.
To avoid overwriting files, consider using one of the following methods:
Method | Advantage | Disadvantage |
Single file | Easy to manage. | Application must be aware of how to access member-specific data in the single file. |
Multiple files | Keeps configuration information in a set of clusterwide files. | Multiple copies of files need to be maintained. Application must be aware of how to access member-specific files. |
Context-dependent symbolic links (CDSLs) | Keeps configuration information in member-specific areas. CDSLs are transparent to the application; they look like soft links. | Moving or renaming files will break symbolic links. Application must be aware of how to handle CDSLs. Using CDSLs makes it more difficult for an application to find out about other instances of that application in the cluster. |
Consider the alternative that best fits your application's needs.
The
following sections describe each approach.
4.1.1 Using a Single File
Using a single, uniquely named file keeps application configuration information in one clusterwide file as separate records for each node. The application reads and writes the correct record in the file. Managing a single file is easy because all data is in one central location.
As an example, in a cluster the /etc/printcap file contains entries for
specific printers. The following parameter can be specified to indicate
which nodes in the cluster can run the spooler for the print queue:
:on=nodename1,nodename2,nodename3,...:
If the first node is up, it will run the spooler.
If that node goes
down, the next node, if it is up, will run the spooler, and so on.
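If the application itself must locate its own record in such a file,
ordinary string matching on the host name is usually enough. The
following fragment is a minimal sketch of that approach; the file name
/etc/myapp.conf, the one-record-per-line "hostname:value" format, and
the use of gethostname() to select the record are illustrative
assumptions, not part of any TruCluster interface.

/*
 * Sketch: read this member's record from one clusterwide config file.
 * Assumed record format: "<hostname>:<setting>" on each line.
 * The file name and format are illustrative only.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
    char host[256], line[512];
    FILE *fp = fopen("/etc/myapp.conf", "r");   /* shared, clusterwide file */

    if (fp == NULL)
        return 1;
    if (gethostname(host, sizeof(host)) != 0) {
        fclose(fp);
        return 1;
    }

    while (fgets(line, sizeof(line), fp) != NULL) {
        char *sep = strchr(line, ':');
        if (sep == NULL)
            continue;
        *sep = '\0';
        if (strcmp(line, host) == 0)            /* this member's record */
            printf("this member's setting: %s", sep + 1);
    }

    fclose(fp);
    return 0;
}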
4.1.2 Using Multiple Files
Using uniquely named multiple files keeps configuration information in a
set of clusterwide files.
For example, each cluster member has its own member-specific
gated configuration file in /etc.
Instead of using a context-dependent symbolic
link (CDSL) to reference member-specific files through a common
file name, the naming convention for these files takes advantage of
member IDs to create a unique name for each member's file.
For example:
# ls -l /etc/gated.conf.member*
-rw-r--r--   1 root   system   466 Jun 21 17:37 /etc/gated.conf.member1
-rw-r--r--   1 root   system   466 Jun 21 17:37 /etc/gated.conf.member2
-rw-r--r--   1 root   system   466 Jun 21 13:28 /etc/gated.conf.member3
This approach requires more work to manage because multiple copies
of files need to be maintained.
For example, if the member ID of a
cluster member changes, you must find and rename all member-specific
files belonging to that member.
Also, if the application is unaware of
how to access member-specific files, you must configure it to do so.
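If the application itself must open its member-specific file, it can
construct the name from the member ID at run time, as the following
sketch shows. The get_member_id() helper is a hypothetical placeholder;
how the application determines its member ID is left to whatever
mechanism it already uses.

/*
 * Sketch: build the member-specific file name used in the example above.
 * get_member_id() is a hypothetical helper, not a TruCluster API.
 */
#include <stdio.h>

static int
get_member_id(void)
{
    /* Placeholder: replace with however the application determines
     * its member ID (for example, by consulting cluster utilities). */
    return 1;
}

int
main(void)
{
    char path[128];

    snprintf(path, sizeof(path), "/etc/gated.conf.member%d",
             get_member_id());
    printf("member-specific configuration file: %s\n", path);
    return 0;
}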
4.1.3 Using CDSLs
Tru64 UNIX Version 5.0 introduced a special form of symbolic link, called a context-dependent symbolic link (CDSL), that TruCluster Server uses to point to the correct file for each member. CDSLs are useful when running multiple instances of an application on different cluster members on different sets of data.
Using a CDSL keeps configuration information in member-specific areas. However, the data can be referenced through the CDSL. Each member reads the common file name, but is transparently linked to its copy of the configuration file. CDSLs are an alternative to maintaining member-specific configuration information when an application cannot be easily changed to use multiple files.
The following example shows the CDSL structure for the file
/etc/rc.config:
/etc/rc.config -> ../cluster/members/{memb}/etc/rc.config
For example, where a cluster member has a member ID of 3, the pathname
/cluster/members/{memb}/etc/rc.config resolves to
/cluster/members/member3/etc/rc.config.
Tru64 UNIX provides a mkcdsl command that lets system administrators
create CDSLs and update a CDSL inventory file. For more information on
this command, see the TruCluster Server Cluster Administration manual
and mkcdsl(8).
For more information on creating CDSLs and hints to avoid overwriting
them, see the Tru64 UNIX System Administration manual, hier(5), ln(1),
and symlink(2).
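Keep in mind that a CDSL is an ordinary symbolic link whose target
contains the literal {memb} string; the member ID is substituted only
when the path is resolved. The following fragment is a small sketch
that makes this visible by comparing the readlink() output with the
file reached through open(); it is illustrative only.

/*
 * Sketch: a CDSL looks like a normal symbolic link.  readlink() returns
 * the literal target (containing "{memb}"), while open() resolves it to
 * this member's copy of the file.
 */
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

int
main(void)
{
    char target[512];
    ssize_t len;
    int fd;

    len = readlink("/etc/rc.config", target, sizeof(target) - 1);
    if (len >= 0) {
        target[len] = '\0';
        /* Prints the unresolved target, for example:
         * ../cluster/members/{memb}/etc/rc.config */
        printf("link target: %s\n", target);
    }

    /* Opening through the CDSL transparently reaches the
     * member-specific copy of the file. */
    fd = open("/etc/rc.config", O_RDONLY);
    if (fd >= 0)
        close(fd);

    return 0;
}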
4.2 Device Naming
Tru64 UNIX Version 5.0 introduced a new device-naming convention that consists of a descriptive name for the device and an instance number. These two elements form the basename of the device. For example:
Location in /dev | Device Name | Instance | Basename
./disk | dsk | 0 | dsk0
./disk | cdrom | 1 | cdrom1
./tape | tape | 0 | tape0
Moving a disk from one physical connection to another does not change the device name for the disk. For a detailed discussion of this device-naming model, see the Tru64 UNIX System Administration manual.
Although Tru64 UNIX Version 5.0 recognizes both the old-style (rz) and
new-style (dsk) device names, TruCluster Server Version 5.1 and later
recognizes only new-style device names. Applications that depend on
old-style device names or the /dev directory structure must be modified
to use the newer device-naming convention.
You can use the hwmgr utility, a generic utility for managing hardware,
to help map device names to their bus, target, and LUN position after
installing Tru64 UNIX Version 5.1A. For example, enter the following
command to view devices:
# hwmgr -view devices
 HWID:  Device Name          Mfg      Model             Location
--------------------------------------------------------------------
   40:  /dev/disk/dsk2c      DEC      RZ28M    (C) DEC  bus-1-targ-1-lun-0
   41:  /dev/disk/dsk3c      DEC      RZ28L-AS (C) DEC  bus-1-targ-2-lun-0
   42:  /dev/disk/dsk4c      DEC      RZ29B    (C) DEC  bus-1-targ-3-lun-0
   43:  /dev/disk/dsk5c      DEC      RZ28D    (C) DEC  bus-1-targ-4-lun-0
   44:  /dev/disk/dsk6c      DEC      RZ28L-AS (C) DEC  bus-1-targ-5-lun-0
   45:  /dev/disk/dsk7c      DEC      RZ1CF-CF (C) DEC  bus-1-targ-8-lun-0
   46:  /dev/disk/dsk8c      DEC      RZ1CB-CS (C) DEC  bus-1-targ-9-lun-0
   47:  /dev/disk/dsk9c      DEC      RZ1CF-CF (C) DEC  bus-1-targ-10-lun-0
   48:  /dev/disk/dsk10c     DEC      RZ1CF-CF (C) DEC  bus-1-targ-11-lun-0
   49:  /dev/disk/dsk11c     DEC      RZ1CF-CF (C) DEC  bus-1-targ-12-lun-0
   50:  /dev/disk/dsk12c     DEC      RZ1CF-CF (C) DEC  bus-1-targ-13-lun-0
   97:  /dev/kevm
  122:  /dev/disk/floppy2c            3.5in floppy      fdi0-unit-0
  136:  /dev/disk/dsk15c     DEC      RZ28M    (C) DEC  bus-0-targ-0-lun-0
  137:  /dev/disk/dsk16c     DEC      RZ28L-AS (C) DEC  bus-0-targ-1-lun-0
  138:  /dev/disk/dsk17c     DEC      RZ28     (C) DEC  bus-0-targ-2-lun-0
  139:  /dev/disk/dsk18c     DEC      RZ28D    (C) DEC  bus-0-targ-3-lun-0
  140:  /dev/disk/cdrom2c    DEC      RRD46    (C) DEC  bus-0-targ-6-lun-0
Use the following command to view devices clusterwide:
# hwmgr -view devices -cluster
 HWID:  Device Name          Mfg      Model             Hostname    Location
-----------------------------------------------------------------------
    4:  /dev/kevm                                       provolone
   33:  /dev/disk/floppy0c            3.5in floppy      provolone   fdi0-unit-0
   37:  /dev/disk/dsk0c      DEC      RZ26L    (C) DEC  provolone   bus-0-targ-0-lun-0
   38:  /dev/disk/cdrom0c    DEC      RRD46    (C) DEC  provolone   bus-0-targ-4-lun-0
   39:  /dev/disk/dsk1c      DEC      RZ1DF-CB (C) DEC  provolone   bus-0-targ-8-lun-0
   40:  /dev/disk/dsk2c      DEC      RZ28M    (C) DEC  provolone   bus-1-targ-1-lun-0
   40:  /dev/disk/dsk2c      DEC      RZ28M    (C) DEC  pepicelli   bus-1-targ-1-lun-0
   40:  /dev/disk/dsk2c      DEC      RZ28M    (C) DEC  polishham   bus-1-targ-1-lun-0
   .
   .
   .
For more information on using this command, see hwmgr(8).
When modifying applications to use the new device-naming convention, look for the following:
Disks that are included in Advanced File System (AdvFS) domains
Raw disk devices
Disks that are encapsulated in Logical Storage Manager (LSM) volumes or that are included in disk groups
Disk names in scripts
Disk names in data files (Oracle OPS and Informix XPS)
SCSI bus renumbering
Note
If you previously renumbered SCSI buses in your ASE, closely verify the mapping from physical device to bus number during an upgrade to TruCluster Server. See the TruCluster Server Cluster Installation manual for more information.
4.3 Interprocess Communication
Among the mechanisms that are used by applications to perform interprocess communication (IPC) are shared memory, named pipes, and signals. However, shared memory, named pipes, and signals are not supported clusterwide in TruCluster Server. If an application uses any of these IPC methods, it must be restricted to running as a single-instance application.
To run multiple instances of an application on more than one cluster
member, perform all IPC through remote procedure calls (RPCs) or socket
connections.
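The following fragment is a minimal sketch of socket-based IPC between
two instances: one instance connects to a peer instance over TCP and
sends a short message. The peer host name and port number are
illustrative assumptions, and a real application would add its own
protocol, error handling, and reconnection logic.

/*
 * Sketch: cluster-safe IPC over a TCP socket.  One instance connects to
 * a peer instance listening on another member and sends a message.
 * The peer host name and port are examples only.
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define PEER_HOST "member2"     /* example peer host name */
#define PEER_PORT "7500"        /* example service port   */

int
main(void)
{
    struct addrinfo hints, *res;
    const char *msg = "hello from another instance\n";
    int s;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(PEER_HOST, PEER_PORT, &hints, &res) != 0)
        return 1;

    s = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (s < 0) {
        freeaddrinfo(res);
        return 1;
    }
    if (connect(s, res->ai_addr, res->ai_addrlen) < 0) {
        close(s);
        freeaddrinfo(res);
        return 1;
    }

    write(s, msg, strlen(msg));     /* the peer reads this on its socket */

    close(s);
    freeaddrinfo(res);
    return 0;
}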
4.4 Synchronized Access to Shared Data
Multiple instances of an application running within a cluster must synchronize with each other for most of the same reasons that multiprocess and multithreaded applications synchronize on a standalone system. However, memory-based synchronization mechanisms (such as critical sections, mutexes, simple locks, and complex locks) work only on the local system and not clusterwide. Shared file data must be synchronized, or files must be used to synchronize the execution of instances across the cluster.
Because the Cluster File System (CFS) is fully POSIX compliant, an
application can use flock() system calls to synchronize access to
shared files among instances.
You can also use the distributed lock manager (DLM) API library
functions for more sophisticated locking capabilities (such as
additional lock modes, lock conversions, and deadlock detection).
Because the DLM API library is supplied only in the TruCluster Server
product, code that uses its functions and is also meant to run on
nonclustered systems must precede any DLM function calls with a call to
clu_is_member(). The clu_is_member() function verifies that the system
is in fact a cluster member. For more information about this function,
see clu_is_member(3).
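The following fragment sketches this pattern: it calls clu_is_member()
to decide whether cluster-specific (DLM) locking is even an option, and
otherwise relies on flock(), which works both on a standalone system
and clusterwide under CFS. The lock-file path is an illustrative
assumption, the <sys/clu.h> header name should be confirmed against
clu_is_member(3), and the DLM calls themselves are omitted.

/*
 * Minimal sketch: synchronize instances through an advisory lock on a
 * shared file.  clu_is_member() only gates the cluster-specific code
 * path; the DLM calls themselves are omitted.
 */
#include <sys/types.h>
#include <sys/file.h>       /* flock(), LOCK_EX, LOCK_UN */
#include <sys/clu.h>        /* clu_is_member() -- assumed header name */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
    int fd;

    if (clu_is_member()) {
        /* Running in a cluster: the DLM API is available and could be
         * used here for lock conversions, deadlock detection, etc. */
        printf("cluster member: DLM locking is an option\n");
    }

    /* flock() works on a standalone system and clusterwide under CFS.
     * The lock-file path is an example only. */
    fd = open("/var/shared_app/app.lock", O_RDWR | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    if (flock(fd, LOCK_EX) == 0) {      /* block until we own the lock */
        /* ... update the shared data file ... */
        flock(fd, LOCK_UN);
    }

    close(fd);
    return 0;
}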
4.5 Member-Specific Resources
If multiple instances of an application are started simultaneously on
more than one cluster member, the application may not work properly
because it depends on resources that are available only on a specific
member, such as a large number of CPU cycles or a large amount of
physical memory. This dependency may restrict the application to
running as a single instance in a cluster. Removing the dependency on
member-specific resources may be enough to allow the application to run
as multiple instances in a cluster.
4.6 Expanded PIDs
In TruCluster Server, process identifiers (PIDs) are expanded to a full
32-bit value.
The value of PID_MAX is increased to 2147483647 (0x7fffffff);
therefore, any applications that test for PID <= PID_MAX must be
recompiled.
To ensure that PIDs are unique across a cluster, PIDs for each cluster member are based on the member ID and are allocated from a range of numbers unique to that member. The formula for available PIDs in a cluster is:
PID = (memberid * (2**19)) + 2
Typically, the first two values in each member's range are reserved for
the kernel idle process and /sbin/init. For example, on the member
whose member ID is 1, PIDs 524288 and 524289 are assigned to kernel
idle and init, respectively.
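The arithmetic behind this scheme can be expressed directly in code.
The following fragment is a small sketch that computes a member's first
available PID and, assuming PIDs stay within each member's initial
2**19-sized range, recovers the member ID from a PID; the helper names
are illustrative and are not part of any TruCluster API.

/*
 * Sketch of the PID-range arithmetic described above.
 * Helper names are illustrative, not part of any TruCluster API.
 */
#include <stdio.h>
#include <sys/types.h>

#define MEMBER_PID_SHIFT 19     /* 2**19 PIDs per member's initial range */

static pid_t
first_available_pid(int memberid)
{
    /* The first two PIDs in the range go to kernel idle and /sbin/init. */
    return ((pid_t)memberid << MEMBER_PID_SHIFT) + 2;
}

static int
member_of_pid(pid_t pid)
{
    /* Valid only while PIDs remain within the member's initial range. */
    return (int)(pid >> MEMBER_PID_SHIFT);
}

int
main(void)
{
    printf("member 1: first available PID = %d\n",
           (int)first_available_pid(1));        /* 524290 */
    printf("PID 524289 belongs to member %d\n",
           member_of_pid(524289));               /* 1 */
    return 0;
}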
Use PIDs to uniquely identify log and temporary files.
If an application
does store a PID in a file, make sure that that file is member-specific.
4.7 DLM Parameters Removed
Because the distributed lock manager (DLM) persistent resources,
resource groups, and transaction IDs are enabled by default in
TruCluster Server Version 1.6 and later, the dlm_disable_rd and
dlm_disable_grptx attributes are unneeded and have been removed from
the DLM kernel subsystem.
4.8 Licensing
This section discusses licensing constraints and issues.
4.8.1 TruCluster Server Licensing Constraints
TruCluster Server Version 5.1A does not support clusterwide licensing.
Each time that you add a member to the cluster, you must register on
that member the licenses required by all applications that may run on
it.
4.8.2 Layered Product Licensing and Network Adapter Failover
The Redundant Array of Independent Network Adapters (NetRAIN) and the Network Interface Failure Finder (NIFF) provide mechanisms for facilitating network failover and replace the monitored network interface method that was employed in the TruCluster Available Server and Production Server products.
NetRAIN provides transparent network adapter failover for multiple adapter configurations. NetRAIN monitors the status of its network interfaces with NIFF, which detects and reports possible network failures. You can use NIFF to generate events when network devices, including a composite NetRAIN device, fail. You can monitor these events and take appropriate actions when a failure occurs. For more information about NetRAIN and NIFF, see the Tru64 UNIX Network Administration: Connections manual.
In a cluster, an application may fail over and restart itself on another member. If it performs a license check when restarting, it may fail because it was looking for a particular member's IP address or its adapter's media access control (MAC) address.
Licensing schemes that use a network adapter's MAC
address to uniquely identify a machine can be affected by how
NetRAIN changes the MAC address.
All network drivers support the SIOCRPHYSADDR ioctl, which fetches MAC
addresses from the interface. This ioctl returns two addresses in an
array:
Default hardware address -- the permanent address that is taken from the small PROM that each LAN adapter contains.
Current physical address -- the address that the network interface responds to on the wire.
For licensing schemes that are based on MAC addresses, use the default
hardware address that is returned by the SIOCRPHYSADDR ioctl; do not
use the current physical address, because NetRAIN modifies this address
for its own use. See the reference page for your network adapter (for
example, tu(7)) for a sample program that uses the SIOCRPHYSADDR ioctl.
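The following fragment is a rough sketch of fetching both addresses
with the SIOCRPHYSADDR ioctl. It assumes the ifdevea structure, with
default_pa and current_pa members, that the adapter reference pages
describe; see tu(7) or the reference page for your adapter for the
authoritative sample program. The interface name tu0 is only an
example.

/*
 * Sketch: read both MAC addresses from an interface with SIOCRPHYSADDR.
 * The ifdevea structure and its ifr_name/default_pa/current_pa members
 * are assumed from the adapter reference pages; verify against tu(7).
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <net/if.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static void
print_mac(const char *label, const unsigned char *pa)
{
    printf("%s %02x:%02x:%02x:%02x:%02x:%02x\n",
           label, pa[0], pa[1], pa[2], pa[3], pa[4], pa[5]);
}

int
main(void)
{
    struct ifdevea devea;               /* assumed structure, see tu(7) */
    int s = socket(AF_INET, SOCK_DGRAM, 0);

    if (s < 0) {
        perror("socket");
        return 1;
    }

    memset(&devea, 0, sizeof(devea));
    strcpy(devea.ifr_name, "tu0");      /* example interface name */

    if (ioctl(s, SIOCRPHYSADDR, &devea) < 0) {
        perror("SIOCRPHYSADDR");
        close(s);
        return 1;
    }

    /* Use the default (PROM) address for licensing; the current address
     * may have been changed by NetRAIN. */
    print_mac("default hardware address:", devea.default_pa);
    print_mac("current physical address:", devea.current_pa);

    close(s);
    return 0;
}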
4.9 Blocking Layered Products
A blocking layered product is a product that prevents the installupdate
command from completing during an update installation of TruCluster
Server Version 5.1A. Blocking layered products must be removed from the
cluster before starting a rolling upgrade that will include running the
installupdate command.
Unless a layered product's documentation specifically states that you can install a newer version of the product on the first rolled member, and that the layered product knows what actions to take in a mixed-version cluster, we strongly recommend that you do not install either a new layered product or a new version of a currently installed layered product during a rolling upgrade.
The TruCluster Server Cluster Installation manual lists all layered
products that are known to break an update installation on TruCluster
Server Version 5.1A.
4.10 Other Migration Issues
This section discusses other migration issues to consider before
moving applications to TruCluster Server.
4.10.1 UFS Dependencies
TruCluster Server Version 5.1A supports the UNIX File System (UFS) as a read-only file system clusterwide. That is, a UFS file system explicitly mounted read-only is served for clusterwide read-only access by a member selected for its connectivity to the storage containing the file system.
TruCluster Server Version 5.1A also allows a cluster member to mount a
UFS file system read/write, but the file system is then accessible only
by the member that mounted it. No other cluster member can access that
file system, and there is no failover should that member go down.
Advanced File System (AdvFS) file systems are required to upgrade an existing TruCluster Available Server or Production Server environment to TruCluster Server. If you are using UFS, we recommend that you migrate these file systems to AdvFS before beginning an upgrade to TruCluster Server.
See the TruCluster Server Cluster Installation manual for complete
TruCluster Server upgrade requirements, and the Tru64 UNIX AdvFS
Administration manual for information about migrating UFS file systems
to AdvFS.
4.10.2 New On-Disk Format for AdvFS Domains
Tru64 UNIX Version 5.0 introduced a new on-disk format for AdvFS
domains that provides performance enhancements. Tru64 UNIX Version 5.0
and later recognize both the new-style and old-style formats; however,
file domains created under previous versions of Tru64 UNIX do not
support these enhancements.
To convert data to the newer format, back up the data and restore it to a new-style AdvFS domain. There is no conversion utility.