This chapter provides information about the patches included in Patch Kit 4 for the TruCluster Server software.
This chapter is organized as follows:
Section 2.1 provides release notes that are specific to the TruCluster Server software patches in this kit.
Section 2.2 provides capsule summaries of the purpose of the TruCluster Server patches included in this kit.
Tru64 UNIX patch kits are cumulative. For this kit, this means that the patches and related documentation from Patch Kits 1 through 3 are included, along with patches that are new to this kit.
To aid you in using this document, release notes that are new with this release are marked (New) in the section heading. The beginning of Section 2.2 provides a key for understanding the history of individual patches.
2.1 Release Notes
This section provides release notes that are specific to the TruCluster Server
software patches in this kit.
2.1.1 Required Storage Space
The following storage space is required to successfully install this patch kit:

A total of ~250 MB of temporary storage space is required to untar this patch kit (base and TruCluster). We recommend that this kit not be placed in the /, /usr, or /var file systems because doing so may unduly constrain the available storage space for the patching activity.

Up to ~24 MB of storage space in /var/adm/patch/backup may be required for archived original files if you choose to install and revert all patches. See the Patch Kit Installation Instructions for more information.

Up to ~25 MB of storage space in /var/adm/patch may be required for original files if you choose to install and revert all patches. See the Patch Kit Installation Instructions for more information.

Up to ~1221 KB of storage space is required in /var/adm/patch/doc for patch abstract and README documentation.

A total of ~184 KB of storage space is needed in /usr/sbin/dupatch for the patch management utility.

See Section 1.1.1 for information on space needed for the operating system patches.
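Before untarring the kit, you can compare a file system's free space (the Avail column reported by df -k) against the figures above. The helper below is a hypothetical sketch, not part of the patch tools; the 256000 KB threshold approximates the ~250 MB temporary-space requirement.

```shell
# Hypothetical helper (not part of dupatch): compare available space,
# in KB as reported by 'df -k', against a required amount.
check_space() {
    avail_kb=$1
    required_kb=$2
    if [ "$avail_kb" -ge "$required_kb" ]; then
        echo "enough space"
    else
        echo "insufficient space"
    fi
}

# ~250 MB is needed to untar the kit; the values here are illustrative.
# On a live system you would feed in the Avail column from 'df -k'.
check_space 300000 256000
check_space 100000 256000
```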
2.1.2 Updates for Rolling Upgrade Procedures
The following sections provide information on rolling upgrade procedures.
2.1.2.1 Unrecoverable Failure Procedure
The procedure to follow if you encounter unrecoverable failures while running dupatch during a rolling upgrade has changed. The new procedure calls for you to run the clu_upgrade -undo install command and then set the system baseline. The procedure is explained in notes in Section 5.3 and Section 5.6 of the Patch Kit Installation Instructions.
2.1.2.2 During Rolling Patch, Do Not Add or Delete OSF, TCR, IOS, or OSH Subsets
During a rolling upgrade, do not use the /usr/sbin/setld command to add or delete any of the following subsets:

Base Operating System subsets (those with the prefix OSF).
TruCluster Server subsets (those with the prefix TCR).
Worldwide Language Support (WLS) subsets (those with the prefix IOS).
New Hardware Delivery (NHD) subsets (those with the prefix OSH).

Adding or deleting these subsets during a roll creates inconsistencies in the tagged files.
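One way to guard against accidental changes is to record the installed subsets from these four families before the roll begins and compare again afterward. The sketch below filters an assumed setld -i listing; the subset names shown are illustrative, not actual subset identifiers.

```shell
# Illustrative 'setld -i' output; real subset names and states differ.
listing='OSFBASE540      installed  Base System
TCRBASE540      installed  TruCluster Base
IOSWWBASE540    installed  Worldwide Language Support Base
XYZEXTRA100     installed  Unrelated layered product'

# Keep only the subset families that must not change during a roll.
echo "$listing" | grep -E '^(OSF|TCR|IOS|OSH)'
```

Saving this filtered list before the setup stage and diffing it after the clean stage confirms that no OSF, TCR, IOS, or OSH subset was added or deleted mid-roll.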
2.1.2.3 Undoing a Rolling Patch
When you undo the stages of a rolling upgrade, the stages must be undone in the correct order. However, the clu_upgrade command incorrectly allows a user undoing the stages of a rolling patch to run the clu_upgrade undo preinstall command before running the clu_upgrade undo install command.

The problem is that in the install stage, clu_upgrade cannot tell from the dupatch flag files whether the roll is going forward or backward. This ambiguity allows a user who is undoing a rolling patch to run the clu_upgrade undo preinstall command without first having run the clu_upgrade undo install command.

To avoid this problem when undoing the stages of a rolling patch, make sure to follow the documented procedure and undo the stages in order.
2.1.2.4 Ignore Message About Missing ladebug.cat File During Rolling Upgrade
When installing the patch kit during a rolling upgrade, you may see the following error and warning messages. You can ignore these messages and continue with the rolling upgrade.
Creating tagged files.
...............................................................................
.....
*** Error ***
The tar commands used to create tagged files in the '/usr' file system
have reported the following errors and warnings:
    tar: lib/nls/msg/en_US.88591/ladebug.cat : No such file or directory
.........................................................
*** Warning ***
The above errors were detected during the cluster upgrade. If you
believe that the errors are not critical to system operation, you can
choose to continue. If you are unsure, you should check the cluster
upgrade log and refer to clu_upgrade(8) before continuing with the
upgrade.
2.1.2.5 clu_upgrade undo of Install Stage Can Result in Incorrect File Permissions
This note applies only when both of the following are true:

You are using installupdate, dupatch, or nhd_install to perform a rolling upgrade.

You need to undo the install stage; that is, to use the clu_upgrade undo install command.

In this situation, incorrect file permissions can be set for files on the lead member. This can result in the failure of rsh, rlogin, and other commands that assume user IDs or identities by means of setuid.

The clu_upgrade undo install command must be run from a nonlead member that has access to the lead member's boot disk. After the command completes, follow these steps:

Boot the lead member to single-user mode.

Run the following script:

#!/usr/bin/ksh -p
#
# Script for restoring installed permissions
#
cd /
for i in /usr/.smdb./@(OSF|TCR|IOS|OSH)*.sts
do
    grep -q "_INSTALLED" $i 2>/dev/null && /usr/lbin/fverify -y <"${i%.sts}.inv"
done
Rerun installupdate, dupatch, or nhd_install, whichever is appropriate, and complete the rolling upgrade.

For information about rolling upgrades, see Chapter 7 of the Cluster Installation manual and the installupdate(8) and clu_upgrade(8) reference pages.
2.1.2.6 Missing Entry Messages Can Be Ignored During Rolling Patch
During the setup stage of a rolling patch, you might see a message like the following:

Creating tagged files.
............................................................................
clubase: Entry not found in /cluster/admin/tmp/stanza.stdin.597530
clubase: Entry not found in /cluster/admin/tmp/stanza.stdin.597568

An Entry not found message will appear once for each member in the cluster. The number in the message corresponds to a PID. You can safely ignore this Entry not found message.
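Because each message names a stanza.stdin temporary file whose numeric suffix is a PID, a quick filter over the captured log can confirm that the count of messages matches the number of cluster members. This is a hypothetical check over sample output, not a supported tool.

```shell
# Sample messages from the setup stage (one per cluster member).
log='clubase: Entry not found in /cluster/admin/tmp/stanza.stdin.597530
clubase: Entry not found in /cluster/admin/tmp/stanza.stdin.597568'

# One message appears per member; the trailing number is a PID.
echo "$log" | grep -c 'Entry not found'
echo "$log" | sed 's/.*stanza\.stdin\.//'
```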
2.1.2.7 Relocating AutoFS During a Rolling Upgrade on a Cluster
This note applies only to performing rolling upgrades on cluster systems that use AutoFS.
During a cluster rolling upgrade, each cluster member is singly halted and rebooted several times. The Patch Kit Installation Instructions direct you to manually relocate applications under the control of Cluster Application Availability (CAA) prior to halting a member on which CAA applications run.
Depending on the amount of NFS traffic, the manual relocation of AutoFS may sometimes fail. Failure is most likely to occur when NFS traffic is heavy. The following procedure avoids that problem.
At the start of the rolling upgrade procedure, use the caa_stat command to learn which member is running AutoFS. For example:

# caa_stat -t
Name            Type         Target    State     Host
------------------------------------------------------------
autofs          application  ONLINE    ONLINE    rye
cluster_lockd   application  ONLINE    ONLINE    rye
clustercron     application  ONLINE    ONLINE    swiss
dhcp            application  ONLINE    ONLINE    swiss
named           application  ONLINE    ONLINE    rye
To minimize your effort in the procedure that follows, it is desirable to perform the roll stage last on the member where AutoFS runs.
When it comes time to perform a manual relocation on a member where AutoFS is running, follow these steps:
Stop AutoFS by entering the following command on the member where AutoFS runs:
# /usr/sbin/caa_stop -f autofs
Perform the manual relocation of other applications running on that member:
# /usr/sbin/caa_relocate -s current_member -c target_member
After the member that had been running AutoFS has been halted as part of the rolling upgrade procedure, restart AutoFS on a member that is still up. (If this is the roll stage and the halted member is not the last member to be rolled, you can minimize your effort by restarting AutoFS on the member you plan to roll last.)
On a member that is up, enter the following command to restart AutoFS. (The member where AutoFS is to run, target_member, must be up and running in multi-user mode.)
# /usr/sbin/caa_startautofs -c target_member
Continue with the rolling upgrade procedure.
2.1.3 When Taking a Cluster Member to Single-User Mode, First Halt the Member
To take a cluster member from multiuser mode to single-user mode, first halt the member and then boot it to single-user mode. For example:
# shutdown -h now
>>> boot -fl s
Halting and booting the system ensures that it provides the minimal set of services to the cluster and that the running cluster has a minimal reliance on the member running in single-user mode.
When the system reaches single-user mode, run the following commands:
# init s
# bcheckrc
# lmf reset
2.1.4 Additional Steps Required When Installing Patches Before Cluster Creation
This note applies only if you install a patch kit before creating a cluster; that is, if you do the following:

Install the Tru64 UNIX base kit.
Install the TruCluster Server kit.
Install the Version 5.1A Patch Kit-0003 before running the clu_create command.

In this situation, you must then perform three additional steps:

Run versw, the version switch command, to set the new version identifier:

# /usr/sbin/versw -setnew

Run versw to switch to the new version:

# /usr/sbin/versw -switch

Run the clu_create command to create your cluster:

# /usr/sbin/clu_create
2.1.5 Problems with clu_upgrade switch Stage
If the clu_upgrade switch stage does not complete successfully, you may see a message like the following:

versw: No switch due to inconsistent versions

The problem can be due to one or more members running genvmunix, a generic kernel. Use the clu_get_info -full command and note each member's version number, as reported in the line beginning Member base O/S version. If a member has a version number different from that of the other members, shut down the member and reboot it from vmunix, the custom kernel. If multiple members have different version numbers, reboot them one at a time from vmunix.
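The comparison of Member base O/S version lines can be done mechanically: if more than one distinct version string appears, some member needs a reboot from vmunix. The sketch below runs over assumed clu_get_info -full output; the version strings are illustrative.

```shell
# Illustrative 'Member base O/S version' lines from clu_get_info -full;
# the actual version strings on a real cluster will differ.
info='Member base O/S version = T5.1A-20 (Rev. 100)
Member base O/S version = T5.1A-20 (Rev. 100)
Member base O/S version = T5.1A-4 (Rev. 100)'

# More than one distinct version means a member (likely booted from
# genvmunix) needs to be rebooted from its custom vmunix kernel.
distinct=$(echo "$info" | sort -u | wc -l)
if [ "$distinct" -gt 1 ]; then
    echo "version mismatch"
else
    echo "versions consistent"
fi
```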
2.1.6 Cluster Information for Tru64 UNIX Patch 1367.00
See Section 1.1.12.2 for version switch information related to Tru64 UNIX Patch 1367.00.
2.1.7 Change to gated Restriction in TruCluster Patch 210.00
The following information explains the relaxed Cluster Alias: gated restriction, delivered in TruCluster Patch 210.00.

Prior to this patch, we required that you use gated as a routing daemon for the correct operation of cluster alias routing because the cluster alias subsystem did not coexist gracefully with either the routed daemon or static routes. This patch provides an aliasd daemon that does not depend on having gated running in order to function correctly.

The following is a list of features supported by this patch:

The gated and routed routing daemons are supported in a cluster. In addition, static routing is supported (no routing daemons are required).

Because aliasd is optimized for gated, gated remains the default and preferred routing daemon. However, it is no longer mandatory, nor is it the only way to configure routing for a cluster member. For example, you could configure a cluster where all members use static routing, where some members run routed, or where members use a combination of routing daemons and static routes.

However, the existing restriction against using ogated still applies; do not use ogated as a routing daemon in a cluster.
Note
Cluster members do not have to have identical routing configurations. In general, it is simpler to configure all cluster members identically, but in some instances, an experienced cluster administrator might choose to configure one or more members to perform different routing tasks. For example, one member might have CLUAMGR_ROUTE_ARGS="nogated" in its /etc/rc.config file and have a fully populated /etc/routes file. Or a member might run with nogated and routed -q.
The alias daemon

The alias daemon handles the failover of cluster alias IP addresses via the cluster interconnect for either dynamic routing or static routing. If an interface fails, aliasd reroutes alias traffic to another member of the cluster. As long as the cluster interconnect is working, there is always a way for cluster alias traffic to get in or out of the cluster.

Multiple interfaces per subnet (for network load balancing)

Although gated does not support this configuration, because static routing is supported, an administrator can use static (nogated) routing for network load balancing.

By default, the cluster alias subsystem uses gated, customized configuration files (/etc/gated.conf.member<n>), and RIP to advertise host routes for alias addresses. You can disable this behavior by specifying the nogated option to cluamgr, either by running the cluamgr -r nogated command on a member or by setting CLUAMGR_ROUTE_ARGS="nogated" in that member's /etc/rc.config file. For example, the network configuration for a member could use routed, or gated with a site-customized /etc/gated.conf file, or static routing.
For a cluster, there are three general routing configuration scenarios:

The default configuration: aliasd controls gated. Each member has the following in its /etc/rc.config file:

GATED="yes"
CLUAMGR_ROUTE_ARGS=""    # if variable present, set to a null string

If needed, static routes are defined in each member's /etc/routes file.

Note

Static routes in /etc/routes files are installed before routing daemons are started, and are honored by routing daemons.

Members run gated, but the cluster alias and aliasd are independent of it. The administrator has total control over gated and its configuration file, /etc/gated.conf. This approach is useful for an administrator who wants to enable IP forwarding and configure a member as a full-fledged router. Each member that will follow this policy has the following in its /etc/rc.config file:

GATED="yes"
CLUAMGR_ROUTE_ARGS="nogated"
ROUTER="yes"    # if this member will be a full-fledged router

If needed, configure static routes in /etc/routes.

Static routing: one or more cluster members do not run a routing daemon. Each member that will use static routing has the following in its /etc/rc.config file:

GATED="no"
CLUAMGR_ROUTE_ARGS="nogated"
ROUTED="no"
ROUTED_FLAGS=""

Define static routes in that member's /etc/routes file.
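The three scenarios above differ only in a few /etc/rc.config variables, so a member's scenario can be inferred from those settings. The classifier below is a hypothetical sketch using only the variable names and values listed in the scenarios; it is not a supported utility.

```shell
# Hypothetical classifier: map a member's rc.config settings to one of
# the three routing scenarios described above.
classify_routing() {
    rc="$1"
    if ! echo "$rc" | grep -q 'CLUAMGR_ROUTE_ARGS="nogated"'; then
        echo "default: aliasd controls gated"
    elif echo "$rc" | grep -q 'GATED="yes"'; then
        echo "gated runs; cluster alias is independent of it"
    else
        echo "static routing"
    fi
}

# One illustrative rc.config fragment per scenario.
classify_routing 'GATED="yes"
CLUAMGR_ROUTE_ARGS=""'
classify_routing 'GATED="yes"
CLUAMGR_ROUTE_ARGS="nogated"'
classify_routing 'GATED="no"
CLUAMGR_ROUTE_ARGS="nogated"'
```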
2.1.8 Information for TruCluster Patch 272.00
This section provides information for TruCluster Patch 272.00.
2.1.8.1 Enablers for EVM
This patch provides enablers for the Compaq SANworks(TM) Enterprise Volume Manager (EVM) Version 2.0.
2.1.8.2 Rolling Upgrade Version Switch
This patch uses the rolling upgrade version switch to ensure that all members of the cluster have installed the patch before it is enabled.
Prior to throwing the version switch, you can remove this patch by returning to the rolling upgrade install stage, rerunning dupatch, and selecting the Patch Deletion item in the Main Menu.
You can remove this patch after the version switch is thrown, but this requires a shutdown of the entire cluster.
To remove this patch after the version switch is thrown, use the following procedure:
Note
Use this procedure only under the following conditions:
The rolling upgrade that installed this patch, including the clean stage, has completed.
The version switch has been thrown (clu_upgrade -switch).
A new rolling upgrade is not in progress.
All cluster members are up and in multiuser mode.
Run the /usr/sbin/evm_versw_undo command.
When this command completes, it asks whether it should shut down the entire cluster now. The patch removal process is not complete until after the cluster has been shut down and restarted.
If you do not shut down the cluster at this time, you will not be able to shut down and reboot an individual member until the entire cluster has been shut down.
After cluster shutdown, boot the cluster to multiuser mode.
Rerun the rolling upgrade procedure from the beginning (starting with the setup stage). When you rerun dupatch, select the Patch Deletion item in the Main Menu.
For more information about rolling upgrades and removing patches, see the Patch Kit Installation Instructions.
2.1.8.3 Restrictions Removed
The restriction of not supporting multiple filesets from the cluster_root domain has been removed. It is now fully supported to have multiple filesets from the cluster_root domain mounted in a cluster; however, this could slow down the failover of this domain in certain cases and should only be used when necessary.

The restriction of not supporting multiple filesets from a boot partition domain has been removed. It is now fully supported to have multiple filesets from a node's boot partition mounted in a cluster; however, when the CFS server node leaves the cluster, all filesets mounted from that node's boot partition domain will be forcibly unmounted.
2.1.9 CAA and Datastore TruCluster Patch 242.00
This section provides information about TruCluster Patch 242.00.
During a rolling upgrade, when the last member is rolled and immediately after the version switch is thrown, a script is run to put CAA on hold and copy the old datastore to the new datastore. CAA will connect to the new datastore when it is available.
The time required to do this depends on the amount of information in the datastore and the speed of each member machine. For 50 resources, we have found that the datastore conversion itself takes only a few seconds.
To undo this patch, the following command must be run:
/usr/sbin/cluster/caa_rollDatastore backward
You are prompted to guide the backward conversion process. One step of this command will prompt you to kill the caad daemons on all members. A caad daemon may still appear to be running as an uninterruptible sleeping process (state U in the ps command) after issuing a kill -9 command. You can safely ignore this and continue with the conversion process as prompted, because caad will be killed when the process wakes up.
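A quick way to spot such a lingering caad process is to filter the ps output for state U. The sketch below operates on assumed ps output; column order varies between ps options, so treat it as illustrative rather than an exact recipe.

```shell
# Illustrative 'ps' output with PID, state, and command columns.
ps_out='  PID S    COMMAND
 1234 U    /usr/sbin/caad
 5678 S    /usr/sbin/inetd'

# Print PIDs of caad processes in uninterruptible sleep (state U);
# per the note above, these can be ignored during the conversion.
echo "$ps_out" | awk '$2 == "U" && $3 ~ /caad/ { print $1 }'
```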
2.2 Summary of TruCluster Software Patches
This section provides capsule summaries of the patches in Patch Kit 4 for the TruCluster Server software products. Because Tru64 UNIX patch kits are cumulative, each patch lists its state according to the following criteria:
New
Indicates a patch that is new for this release
New (Supersedes Patches ... )
Indicates a patch that is new to the kit but was combined (merged) with one or more patches during the creation of earlier versions of this kit, before it was publicly released.
Existing (Kit 3)
Indicates a patch that was new in the previous Version 5.1A patch kit.
Existing
Indicates a patch that existed in earlier Version 5.1A patch kits.
Supersedes Patches ...
Indicates a patch that was combined (merged) with other patches.
Number: Patch 27.00
Abstract: Fix for clusterwide wall messages not being received
State: Existing
This patch allows the cluster wall daemon to restart following an EVM daemon failure.

Number: Patch 88.00
Abstract: Fix for cluster hang during boot
State: Supersedes Patch 29.00
This patch addresses a situation where the second node in a cluster hangs upon boot while setting the current time and date with ntpdate.

Number: Patch 121.00
Abstract: Using a cluster as a RIS server causes panic
State: Supersedes Patch 29.00
This patch:

Number: Patch 136.00
Abstract: Enhancement for clu_autofs shutdown script
State: Existing
This patch makes the /sbin/init.d/clu_autofs script more robust.

Number: Patch 181.00
Abstract: Fixes problems in the DLM subsystem
State: Supersedes Patches 39.00, 131.00, 178.00, 179.00
This patch:

Number: Patch 188.00
Abstract: Fixes cluster kernel problem that causes a hang
State: Supersedes Patches 70.00 and 186.00
This patch:

Number: Patch 195.00
Abstract: Memory Channel API problem causes system hang
State: Existing (Patch Kit 3)
This patch fixes a problem in the Memory Channel API that can cause a system to hang.

Number: Patch 206.00
Abstract: Fixes kernel memory fault in rm_get_lock_master
State: Supersedes Patches 11.00, 62.00, 97.00, 145.00, 146.00, 148.00, 203.00, 204.00
This patch:

Number: Patch 210.00
Abstract: aliasd now interprets NIFF parameters correctly
State: Supersedes Patches 6.00, 7.00, 9.00, 207.00, 208.00
This patch:

Number: Patch 212.00
Abstract: Corrects performance issues on starting cluster LSM
State: Supersedes Patch 150.00
This patch:

Number: Patch 242.00
Abstract: Fix for Oracle failure during start-up
State: Supersedes Patches 1.00, 2.00, 3.00, 5.00, 53.00, 54.00, 55.00, 56.00, 57.00, 58.00, 60.00, 66.00, 71.00, 72.00, 74.00, 84.00, 93.00, 95.00
This patch:

Number: Patch 244.00
Abstract: Security (SSRT2265, SSRT2265)
State: Supersedes Patches 48.00, 138.00
This patch:

Number: Patch 246.00
Abstract: Fixes lsm disks and cluster quorum tools problems
State: Supersedes Patches 41.00, 80.00, 173.00, 175.00
This patch:

Number: Patch 252.00
Abstract: Fix for ICS panics
State: Supersedes Patches 37.00, 82.00, 132.00, 134.00, 182.00, 183.00, 185.00, 249.00, 250.00
This patch:

Number: Patch 254.00
Abstract: Security
State: Supersedes Patch 52.00
This patch:

Number: Patch 256.00
Abstract: Fix for cluster hang
State: New
This patch enables a cluster to boot even if the cluster root domain devices are private to different cluster members. This is not a recommended configuration; however, it should not result in an unbootable cluster. Currently, this applies to cluster root domains not under LSM control.

Number: Patch 259.00
Abstract: Fixes timing problem in the Connection Manager
State: Supersedes Patches 68.00, 257.00
This patch:

Number: Patch 263.00
Abstract: Fix for cluster panic
State: Supersedes Patches 44.00, 46.00, 189.00, 190.00, 191.00, 193.00, 260.00, 261.00
This patch:

Number: Patch 265.00
Abstract: Fix for cluster alias manager SUITlet
State: New
This patch fixes the problem in which the cluster alias manager SUITlet falsely interprets any cluster alias with virtual={t|f} configured as a virtual alias, regardless of its actual setting.

Number: Patch 269.00
Abstract: A node may panic while under load
State: Supersedes Patches 50.00, 200.00, 267.00
This patch:

Number: Patch 272.00
Abstract: Improves responsiveness of EINPROGRESS handling
State: Supersedes Patches 12.00, 13.00, 14.00, 15.00, 16.00, 17.00, 18.00, 19.00, 20.00, 21.00, 22.00, 23.00, 25.00, 76.00, 92.00, 98.00, 99.00, 100.00, 101.00, 102.00, 103.00, 104.00, 105.00, 106.00, 107.00, 108.00, 109.00, 110.00, 111.00, 112.00, 113.00, 114.00, 116.00, 140.00, 142.00, 64.00, 86.00, 117.00, 119.00, 43.00, 151.00, 152.00, 153.00, 154.00, 155.00, 156.00, 157.00, 158.00, 159.00, 160.00, 161.00, 162.00, 163.00, 164.00, 165.00, 166.00, 167.00, 168.00, 169.00, 170.00, 172.00, 30.00, 31.00, 32.00, 33.00, 35.00, 78.00, 90.00, 122.00, 123.00, 124.00, 125.00, 126.00, 127.00, 129.00, 144.00, 196.00, 198.00, 202.00, 213.00, 214.00, 215.00, 216.00, 217.00, 218.00, 219.00, 220.00, 221.00, 222.00, 223.00, 224.00, 225.00, 226.00, 227.00, 228.00, 229.00, 230.00, 231.00, 232.00, 233.00, 234.00, 235.00, 236.00, 237.00, 238.00, 240.00, 270.00
This patch: