PROBLEM: (79404, 77948, 80189, 80006, 80190) (PATCH ID: TCR505-009)
********
This patch delivers a new stripped clu_genvmunix and several fixes to the cluster rolling upgrade procedure. The fixes to the rolling upgrade procedure are primarily focused on the "undo" of the install stage. A normal rolling upgrade does not need the "undo". However, problems can occur during the install stage that require the "undo" to work properly.

PROBLEM: (81084) (PATCH ID: TCR505-018)
********
This patch fixes a problem seen when running clu_upgrade preinstall commands on certain multi-cpu systems. Numerous error messages similar to the following are seen:

   *** Error *** Could not create: ocolsocols/.Old..ocols

If you see this problem, press Ctrl/C and rerun the clu_upgrade preinstall command.

PROBLEM: (78935) (PATCH ID: TCR505-003)
********
When a client has remote access to a tape device and the tape server loses its path to the device, the server fails the request with an error. The client application should then close the device and reopen it through the new server, if one is available. If the application instead left the tape device open and issued an ioctl call to it, a device lock was taken but not released when the call returned. Any subsequent operation on that device caused the system to crash with "SIMPLE_LOCK: TIME LIMIT EXCEEDED PANIC ON SHARED TAPE".

PROBLEM: (79577) (PATCH ID: TCR505-007)
********
There are various problems when using a tape device on a shared bus in a 5.0A cluster:

   1. Sometimes, when a node was rebooted, the tape device in use could no longer be used by the remaining nodes. Every attempt to open it returned EBUSY until the remaining nodes were rebooted.

   2. When a tape was being used and another node with a direct path to the tape rebooted, the rebooted node became the new server of the tape device, although there was no good reason to change the tape server.

   3. Sometimes there were messages about "close failed" for the tape device in syslog.

All these problems are addressed in this patch.

PROBLEM: (80635) (PATCH ID: TCR505-010)
********
The mount command hangs after DRM has restored the path to an HSG80 storage volume. This may cause a disk to become inaccessible in a cluster environment. All requests to that disk become stuck, and the "drdmgr -n dskXX" command reports "Reconfiguration in progress" indefinitely.

PROBLEM: (80701) (PATCH ID: TCR505-012)
********
The path is lost after DRM has restored the path to an HSG80 storage volume. The path remains unavailable until the node is rebooted.

PROBLEM: (80824) (PATCH ID: TCR505-013)
********
This patch fixes a problem where, on a cluster node, if a new device is detected by a HW scan while the cluster is up and running, one of the following situations can occur:

   1. If the device is Fibre Channel, only one node will be able to use the device.

   2. There is a small risk of data corruption on a parallel SCSI device on a shared bus if the node subsequently loses quorum.

PROBLEM: (75386, UVO58439B) (PATCH ID: TCR505-023)
********
This patch provides the DRD portion of a fix to prevent an AdvFS Domain Panic from occurring during the boot process following a clu_add_member.

PROBLEM: (80559) (PATCH ID: TCR505-011)
********
This patch fixes a problem where, on a cluster node, if a SCSI bus reset occurs when there is a loss of quorum, drd may be blocked on tape devices. When this occurs, the tape device on that bus cannot be used without a reboot, and CPU usage increases heavily.

PROBLEM: (79551) (PATCH ID: TCR505-004)
********
This patch fixes a kernel memory fault panic in the routine cfstok_find_held_tok. The panic occurs when the very first action of a newly allocated thread is a lookup of "." in an NFS filesystem.

PROBLEM: (79140) (PATCH ID: TCR505-005)
********
This patch fixes a problem where mounts that return "ESTALE" may loop forever.

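The application-side recovery described under PATCH ID TCR505-003 (close the tape device and reopen it after the server reports an error) can be sketched as follows. This is only an illustration under stated assumptions: the helper name run_with_reopen, its parameters, and the retry limit are hypothetical and are not part of the patch or of any cluster API.

```python
import os


def run_with_reopen(path, operation, flags=os.O_RDONLY, max_reopens=1):
    """Run operation(fd) on the device at `path` (hypothetical helper).

    If the operation fails with OSError, the descriptor is closed and the
    device is reopened before the operation is tried again, instead of
    issuing further calls on the stale descriptor.
    """
    reopens = 0
    while True:
        fd = os.open(path, flags)
        try:
            return operation(fd)
        except OSError:
            if reopens >= max_reopens:
                raise  # give up and report the last error to the caller
            reopens += 1
        finally:
            # Always close the descriptor, whether the call succeeded or not.
            os.close(fd)
```

A caller would pass a function that performs the read, write, or ioctl on the descriptor it is given. The descriptor is never reused after a failure.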
PROBLEM: (DEK017467) (PATCH ID: TCR505-015)
********
This patch prevents a kernel memory fault panic from occurring when a mount of an AdvFS fileset is attempted without the fileset name properly specified. The stack trace will be similar to the following:

    5 panic
    6 trap
    7 _XentMM
    8 strcpy
    9 cms_extract_domain_info
   10 cms_select_cfs_server
   11 cms_advfs_mount_initial
   12 cms_mount_initial
   13 cms_mount_preprocess
   14 cluster_mount
   15 mount1
   16 mount
   17 syscall
   18 _Xsyscall

PROBLEM: (75386, UVO58439B) (PATCH ID: TCR505-024)
********
This patch provides the CFS/CMS portion of a fix to prevent an AdvFS Domain Panic from occurring during the boot process following a clu_add_member.

PROBLEM: (VIS-2-158, 80305, 80358, 80493) (PATCH ID: TCR505-020)
********
This patch corrects a problem with the cluster file system in which a cluster member will panic with a "kernel memory fault" when running cfsmgr from one to ten times consecutively.

PROBLEM: (HPAQ621TD) (PATCH ID: TCR505-016)
********
This patch provides performance enhancements in CFS. Threads that reach cfs_getpage() or cfs_getapage() through either a cfs_read or a cfs_write no longer acquire a token in those functions.

PROBLEM: (EVT26812A) (PATCH ID: TCR505-017)
********
This patch prevents the following panic from occurring:

   request_internal: client already had token

A typical stack trace will contain the following:

    8 panic
    9 cfsdb_panic
   10 request_internal
   11 svrtok_return_range
   12 svrcfstok_return_range
   13 cfs_tokmsg
   14 rcfstok_return
   15 svr_rcfstok_return
   16 icssvr_daemon_from_pool

PROBLEM: (84176, TKTBB0088, HPAQC0DRG, EVT0576031, HPAQ1093P) (PATCH ID: TCR505-046)
********
This patch prevents a cfsdb_assert panic from occurring in the cfs block reserve code. Systems running process accounting are the most likely to see this type of panic.
Panics seen:

   Assert Failed: brp->br_allocated >= 0
   file: ../../../../src/kernel/tnc_common/tnc_cfe/cfs_blkrsrv.c
   line: 1508
   caller: 0xfffffc0000861f4c
   panic (cpu 2): cfsdb_assert

and

   Assert Failed: bdp->bd_svr_out >= blkreturn
   file: ../../../../src/kernel/tnc_common/tnc_cfe/cfs_blkrsrv.c
   line: 2591
   caller: 0xfffffc00008d8294
   panic (cpu 0): cfsdb_assert

PROBLEM: (82505, SSRT0691U) (PATCH ID: TCR505-028)
********
A potential security vulnerability has been discovered where, under certain circumstances, system integrity may be compromised. This may be in the form of improper file or privilege management. Compaq has corrected this potential vulnerability.

PROBLEM: (84515, 84489, 83411, 83297, 83658) (PATCH ID: TCR505-041)
********
This patch fixes several problems: it addresses the need for IOCTL for remote DRD, adds cleanup for failed remote closes for non-disks, fixes error returns on failed tape/changer closes, and fixes a tape deadlock experienced in netbackups.

PROBLEM: (83411, N/A) (PATCH ID: TCR505-040)
********
This patch fixes an issue with a tape/changer returning busy on open if a close from a remote node failed.

PROBLEM: (83908, 81908) (PATCH ID: TCR505-044)
********
All systems in a cluster experience I/O timeouts when a Fibre Channel cable is pulled from one of the members. The devices from the failing member are initially mapped correctly to another member. The devices affected by the cable pull eventually experience timeouts and become unmapped from all members.

PROBLEM: (EVT25739B, N/A) (PATCH ID: TCR505-035)
********
This patch fixes a problem with a cluster acting as an NFS client, in which there is a potential race where a CFS client node may not correctly "timeout" its cached data for a given file. As a result, processes accessing the file on that particular cluster member may not see changes made to the file via the NFS server or other NFS clients.

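To illustrate what timing out cached data means in the TCR505-035 description, the following sketch shows a generic time-based cache. It is not the CFS or NFS implementation. The class TimedCache, the fetch callback, and the injectable clock are all hypothetical names chosen for this example.

```python
import time


class TimedCache:
    """Cache that refetches an entry once it is older than a timeout."""

    def __init__(self, fetch, timeout_seconds, clock=time.monotonic):
        self._fetch = fetch            # called to get fresh data from the "server"
        self._timeout = timeout_seconds
        self._clock = clock            # injectable so the expiry can be tested
        self._entries = {}             # key -> (value, time the value was fetched)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None:
            value, fetched_at = entry
            if now - fetched_at < self._timeout:
                return value           # still fresh: serve the cached copy
        # Missing or expired: go back to the server and restart the timer.
        value = self._fetch(key)
        self._entries[key] = (value, now)
        return value
```

If the expiry check is skipped or never fires, get() keeps returning the old value even after the server copy changes, which is the kind of stale view the problem description refers to.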
PROBLEM: (QAR.82428) (PATCH ID: TCR505-033)
********
The quorum disk becomes inaccessible if it is added manually with the command "clu_quorum -d add", because the PR flag is not cleaned up. However, it works after the next reboot. In addition, a member cannot boot under specific hardware because the CFS mount fails when the PR flag is not cleaned up.

PROBLEM: (80807, 80883, 82108, 78936, 79186) (PATCH ID: TCR505-029)
********
User data can become corrupted on hardware configurations that use multiported parallel Fibre Channel storage arrays. In addition, if a client that is performing an open on a tape device crashes before the open completes, the tape device becomes inaccessible from the rest of the cluster.

PROBLEM: (BCGM90MM9) (PATCH ID: TCR505-030)
********
This patch provides performance enhancements for copying large files between a CFS client and server within the cluster. (These are files smaller than the total size of the client's physical memory.)

PROBLEM: (83952, SOO12306A) (PATCH ID: TCR505-036)
********
This patch corrects a problem in which a cluster member can panic with the panic string "cfsdb_assert" when an NFS v3 TCP client attempts to create a socket using mknod(2).

PROBLEM: (83749, VNO76936A, MGO38615A) (PATCH ID: TCR505-038)
********
This patch corrects a problem in which a cluster member will panic with the panic string "lock_terminate: lock held" from cinactive().

PROBLEM: (80649, 80982) (PATCH ID: TCR505-026)
********
This patch fixes a problem in CFS in which CFS stops serving lock requests, resulting in a process hang.

PROBLEM: (80295) (PATCH ID: TCR505-025)
********
This patch prevents possible file corruption that can occur during a CFS/NFS race condition.

PROBLEM: (BCGM919P3, EVT376821B, EVT376821, 82934) (PATCH ID: TCR505-034)
********
This patch fixes a hang seen while running collect and the vdump utility. This patch prevents the hang in tok_wait from occurring.
This also prevents a cfsdb_assert panic that contains the following message:

   "Assert Failed: (tcbp->tcb_flags & TOK_GIVEBACK) == 0"

PROBLEM: (79818) (PATCH ID: TCR505-053)
********
This patch fixes a problem in which booting several nodes in a cluster simultaneously could cause a kernel memory fault (KMF) panic.