ECO NUMBER:      VMS721_FIBRE_SCSI-V0600
PRODUCT:         OpenVMS Alpha OPERATING SYSTEM V7.2-1
UPDATE PRODUCT:  OpenVMS Alpha OPERATING SYSTEM V7.2-1

COVER LETTER

1  KIT NAME:

   VMS721_FIBRE_SCSI-V0600

2  KITS SUPERSEDED BY THIS KIT:

   VMS721_FIBRE_SCSI-V0500

3  KIT DEPENDENCIES:

   3.1  The following remedial kit(s), or later, must be installed
        BEFORE installation of this, or any required kit:

        o  VMS721_PCSI-V0100
        o  VMS721_UPDATE-V0300

   3.2  In order to receive all the corrections listed in this kit,
        the following remedial kits, or later, should also be
        installed:

        VMS721_SYS-V1200

4  KIT DESCRIPTION:

   4.1  Version(s) of OpenVMS to which this kit may be applied:

        OpenVMS Alpha V7.2-1

   4.2  Files patched or replaced:

        o  [SYSHLP.UNSUPPORTED]FC$CP.EXE (new image)
        o  [SYSLIB]FC$SDA.EXE (new image)
        o  [SYSLIB]SMI$OBJSHR.EXE (new image)
        o  [SYS$LDR]SYS$DKDRIVER.EXE (new image)
        o  [SYS$LDR]SYS$FGEDRIVER.EXE (new image)
        o  [SYS$LDR]SYS$GKDRIVER.EXE (new image)

-- COVER LETTER -- Page 2 4 June 2002

        o  [SYS$LDR]SYS$MKDRIVER.EXE (new image)
        o  [SYS$LDR]SYS$PGADRIVER.EXE (new image)
        o  [SYS$LDR]SYS$PKADRIVER.EXE (new image)
        o  [SYS$LDR]SYS$PKCDRIVER.EXE (new image)
        o  [SYS$LDR]SYS$PKEDRIVER.EXE (new image)
        o  [SYS$LDR]SYS$PKJDRIVER.EXE (new image)
        o  [SYS$LDR]SYS$PKQDRIVER.EXE (new image)
        o  [SYS$LDR]SYS$PKSDRIVER.EXE (new image)
        o  [SYS$LDR]SYS$PKTDRIVER.EXE (new image)
        o  [SYS$LDR]SYS$PKWDRIVER.EXE (new image)
        o  [SYS$LDR]SYS$PKZDRIVER.EXE (new image)
        o  [SYSMSG]SYSMSG.EXE (new image)
        o  [SYS$LDR]FC$GLOBALS.STB (new file)
        o  [SYSEXE]SYS$CONFIG.DAT (new file)

5  NEW FUNCTIONALITY INTRODUCED IN VMS721_FIBRE_SCSI-V0600 KIT

   o  Interrupt and Response Coalescing

      Interrupt and Response Coalescing is a functional option
      implemented in KGPSA firmware which allows LP8000 and LP9002
      adapters to reduce the number of interrupts seen by a host.
      Given a response count and a delay time (in ms), the adapter
      can defer interrupting the host until that number of responses
      is available or until that amount of time has passed,
      whichever occurs first.
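As an illustration only, the deferral rule described above can be modeled in a few lines of Python. This is a sketch of the stated behavior, not the KGPSA firmware algorithm: the function name coalesce, the list of response arrival times, and the choice to start the delay timer when the first pending response arrives are all assumptions made for the example.

```python
from typing import List, Tuple


def coalesce(arrivals_ms: List[int], response_count: int,
             delay_ms: int) -> List[Tuple[int, int]]:
    """Model interrupt coalescing for a list of response arrival times.

    Returns (interrupt_time_ms, responses_delivered) tuples.
    Assumption: the delay timer starts when the first response becomes
    pending after the previous interrupt. The real firmware may differ.
    """
    interrupts = []
    pending = 0        # responses waiting for an interrupt
    deadline = None    # time at which the delay timer expires

    for t in sorted(arrivals_ms):
        # The timer expired before this response arrived: interrupt then.
        if pending and t > deadline:
            interrupts.append((deadline, pending))
            pending = 0
        # First pending response (re)starts the delay timer.
        if pending == 0:
            deadline = t + delay_ms
        pending += 1
        # Enough responses are available: interrupt immediately.
        # In this sketch a count of 0 or 1 means one interrupt per response.
        if pending >= response_count:
            interrupts.append((t, pending))
            pending = 0

    # Responses still pending are delivered when the timer expires.
    if pending:
        interrupts.append((deadline, pending))
    return interrupts
```

For example, in this model coalesce([0, 2, 9], response_count=4, delay_ms=5) produces two interrupts rather than three: one when the timer expires at 5 ms, covering two responses, and one at 14 ms for the last response.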
      Coalescing also makes each interrupt seen by the host more
      cost-effective, because the host will generally process more
      responses per interrupt than it would without Interrupt
      Coalescing.

      Images Affected:

      -  [SYSHLP.UNSUPPORTED]FC$CP.EXE
      -  [SYS$LDR]SYS$FGEDRIVER.EXE
      -  [SYS$LDR]SYS$PGADRIVER.EXE

   o  Enabling Interrupt and Response Coalescing

      You can turn on Interrupt and Response Coalescing with the
      following command:

          $ MCR SYS$ETC:FC$CP FGx [] -
          $_ []

      The values following FGx are, in order, the enables mask, the
      delay, and the response count.

      -  FGx: the FG device (FGA, FGB, etc.). To determine which FG
         devices are present, refer to the section titled
         "Determining FGx Devices".
      -  Enables: bit 1 = Response Coalescing and bit 0 = Interrupt
         Coalescing.
      -  Delay: in milliseconds; can range from 0 to 255 decimal.
      -  Response count: can range from 0 to 63 decimal.
      -  Any negative value leaves a parameter unchanged.
      -  Values returned are those which are current after any
         changes.

      The recommended command is:

          MCR SYS$ETC:FC$CP FGA 3 1 8

      Replace FGA with the FG device you wish to configure. The
      command must be run once per boot for every Emulex Fibre
      Channel adapter on which Interrupt Coalescing is to be
      enabled. Once enabled, the setting persists across adapter
      initializations, path switches, CPU affinity changes, and so
      on; in other words, until the next boot.

      Interrupt Coalescing can be turned off by passing an "enables"
      value of 0.

   o  Determining FGx Devices

      To tell which FGx device(s) you have on your system, execute
      the following commands:

          $ ANALYZE/SYSTEM
          SDA> CLUE CONFIG/A

      The following is an excerpt from sample output of the above
      commands. Not all fields are shown, only those needed to
      determine the adapter type.

          Adapter Configuration:
          ----------------------
          Port  BusArrayEntry      Device Name / HW-Id
          ----  ----------------   --------------------
          FGA:  FFFFFFFF.810FBC40  KGPSA-CA (Emulex LP8000)
          FGB:  FFFFFFFF.810FBC78  KGPSA-** (Emulex LP9000)

      Interrupt and Response Coalescing will only operate on LP8000
      and LP9002 adapters. If the device name is not listed, you will
      need to EXAMINE the BusArrayEntry entry to tell whether the
      adapter is an LP8000, an LP9002, or an earlier type of adapter.
      Following is an example of the EXAMINE command:

          SDA> EXAMINE FFFFFFFF.810FBC40
          FFFFFFFF.810FBC40:  F80010DF.F80010DF   "ß..øß..ø"
          SDA> EXAMINE FFFFFFFF.810FBC78
          FFFFFFFF.810FBC78:  F90010DF.F90010DF   "ß..ùß..ù"

      The field F80010DF.F80010DF shows that the adapter is an
      LP8000 adapter. The field F90010DF.F90010DF shows that the
      adapter is an LP9002 adapter.

6  PROBLEMS ADDRESSED IN VMS721_FIBRE_SCSI-V0600 KIT

   o  When booting, a shadowed system disk can hang the cluster.

      Images Affected:

      -  [SYS$LDR]SYS$DKDRIVER.EXE

   o  If the system experiences bus starvation because of heavy I/O
      activity, some I/O may time out and reset the SCSI bus. This
      could result in disk mount verifies. Data corruption might
      also occur during heavy I/O timeouts. The I/O timeouts can be
      detected by looking for Error Type 4, Subtype 1 in the error
      log.

      Images Affected:

      -  [SYS$LDR]SYS$PKADRIVER.EXE

   o  When a Fibre Channel disk is being brought back online after a
      controller failover, the user may see a variety of problems,
      ranging from process hangs and system hangs to system crashes
      with a variety of bugchecks. All systems using disks served
      from the affected HSG controllers will be affected.

      The nature of this problem is twofold:

      o  It completely shuts down the HSG controller. Anything
         accessing that HSG will hang until its I/O times out. This
         will cause any number of failures as all the disks become
         unavailable.
      o  It consumes a number of resources on the systems. A crash
         will result from whatever critical resource runs out first.
         If the wrong equilibrium is reached, the systems can appear
         to hang indefinitely. It is possible, but not likely, that
         they will recover with no intervention.

      Images Affected:

      -  [SYS$LDR]SYS$DKDRIVER.EXE

   o  If a non-clustered OpenVMS system running a version prior to
      V7.3 attempts to mount a Fibre Channel disk that has a
      persistent reservation on it, the system will crash with an
      "INVEXCEPTN, Exception while above ASTDEL" bugcheck. A
      persistent reservation can be placed on a disk by the SWCC
      program or when the disk is mounted by OpenVMS V7.3 or later.
      One can also be present because the disk was used by a
      non-OpenVMS operating system that uses persistent
      reservations.

          Crashdump Summary
          -----------------
          Bugcheck Type:    INVEXCEPTN, Exception while above ASTDEL
          Current Process:  NULL
          Current Image:
          Failing PC:       FFFFFFFF.802929C8    SYS$DKDRIVER+109C8
          Failing PS:       38000000.00000804
          Module:           SYS$DKDRIVER
                            (Link Date/Time: 9-FEB-2001 08:51:21.81)
          Offset:           000109C8

      Images Affected:

      -  [SYS$LDR]SYS$DKDRIVER.EXE

   o  When a TLZ10 tape with an incorrect label is installed on a
      789X SCSI adapter (KZPEA 7899 card or built-in 7895 card in a
      DS20E), the user should receive a MEDOFL (Medium offline)
      error. Instead, the user receives an incorrect DRVERR (Drive
      Error) error.

      Images Affected:

      -  [SYS$LDR]SYS$PKADRIVER.EXE

   o  An INIT command issued to a SCSI tape drive can take half an
      hour or more to exit if no tape is in the unit. With this
      change, the INIT command fails immediately when there is no
      media in the drive.

      Images Affected:

      -  [SYS$LDR]SYS$MKDRIVER.EXE

   o  The system can crash with an "INCONSTATE, Inconsistent I/O
      data base" bugcheck at SYS$FGEDRIVER+8C3C.

          Crashdump Summary Information:
          ------------------------------
          Bugcheck Type:    INCONSTATE, Inconsistent I/O data base
          Current Process:  NULL
          Current Image:
          Failing PC:       FFFFFFFF.802DAC3C    SYS$FGEDRIVER+08C3C
          Failing PS:       18000000.00000804
          Module:           SYS$FGEDRIVER
                            (Link Date/Time: 5-DEC-2001 14:41:56.69)
          Offset:           00008C3C

      Images Affected:

      -  [SYS$LDR]SYS$FGEDRIVER.EXE
      -  [SYSLIB]FC$SDA.EXE

   o  The system can crash with an "SSRVEXCEPT, Unexpected system
      service exception" bugcheck.

          Crashdump Summary Information:
          ------------------------------
          Bugcheck Type:    SSRVEXCEPT, Unexpected system service
                            exception
          CPU Type:         AlphaServer 2100 4/233
          Failing PC:       FFFFFFFF.801CB968
                            NSA$REFERENCE_RIGHTS_CHAIN_C+00008
          Failing PS:       10000000.00000201
          Module:           SECURITY
                            (Link Date/Time: 5-AUG-2001 01:12:10.86)
          Offset:           0000B96

      Images Affected:

      -  [SYS$LDR]SYS$FGEDRIVER.EXE

   o  Attempting to mount a TLZ09 gives a DRVERR error.

      Images Affected:

      -  [SYS$LDR]SYS$PKADRIVER.EXE

   o  After an HSV110 controller restart, the system loses its
      connection to the disks on the HSV110.

      Images Affected:

      -  [SYS$LDR]SYS$FGEDRIVER.EXE

7  PROBLEMS ADDRESSED IN VMS721_FIBRE_SCSI-V0500 KIT

   o  If an MCR SYSMAN IO AUTO command is issued to hot add/swap SCSI
      targets, the system can hang.

      Images Affected:

      -  [SYS$LDR]SYS$PKADRIVER.EXE

   o  A system can experience excessive SCSI bus resets due to
      unsynchronized access to device registers.

      Images Affected:

      -  [SYS$LDR]SYS$PKWDRIVER.EXE

   o  Tape density values are not stored correctly. This causes
      third-party tape applications, such as Oracle's RMU backup
      utility, to fail when performing multi-volume tape backups and
      other operations that require the tape density to be stored
      correctly.

      Images Affected:

      -  [SYS$LDR]SYS$MKDRIVER.EXE

   o  The system can hang because I/Os time out and the driver does
      not retry the I/O command. The I/O timeout can be detected
      through the error log.

      Images Affected:

      -  [SYS$LDR]SYS$PKADRIVER.EXE

   o  On OpenVMS Alpha clusters that share a SCSI bus and are
      attached to the bus with KZPBA SCSI controllers, disks may not
      come out of mount verification.

      Images Affected:

      -  [SYS$LDR]SYS$PKQDRIVER.EXE

   o  When performing a multi-volume Backup (and various other
      Backup operations involving label processing) with generic
      SCSI tapes, compaction status gets turned off.

      Images Affected:

      -  [SYS$LDR]SYS$MKDRIVER.EXE

   o  Odd-byte records read from tape into a memory buffer larger
      than the tape record result in one extra byte of data.

      Images Affected:

      -  [SYS$LDR]SYS$PKWDRIVER.EXE

   o  When an HSZ/HSG device reports mirror copy status events, the
      ERRCNT of the device is incremented. This gives the false
      impression that there is a problem with the device(s).

      Images Affected:

      -  [SYS$LDR]SYS$DKDRIVER.EXE

   o  In a multiprocessor environment, exception conditions (such as
      AUTOGEN, disk errors, power glitches, etc.) cause the
      PKWDRIVER, the hardware interface, and the script code to
      become unsynchronized relative to each other. This lack of
      synchronization can cause excessive bus resets, mount verify
      timeouts, command timeouts, I/O system hangs, system crashes
      and/or file corruption. The most obvious symptom is the
      occurrence of SCSI bus resets. These can be seen with a
      "SHOW ERROR" command and will be in the error log.

      Images Affected:

      -  [SYS$LDR]SYS$PKWDRIVER.EXE

   o  Tape drive performance degrades significantly after a tape
      error.

      Images Affected:

      -  [SYS$LDR]SYS$MKDRIVER.EXE

   o  If a disk is mounted software write protected, then when the
      disk enters and completes mount verification, the disk
      becomes software write enabled.

      Images Affected:

      -  [SYS$LDR]SYS$DKDRIVER.EXE

   o  A system can crash with an "INVEXCEPTN, Exception while above
      ASTDEL" bugcheck.

          Crashdump Summary Information:
          ------------------------------
          Bugcheck Type:    INVEXCEPTN, Exception while above ASTDEL
          Current Process:  NULL
          Current Image:
          Failing PC:       FFFFFFFF.802FB820    SCS$POLL_MODE_C+00460
          Failing PS:       1C000000.00000804
          Module:           SYS$SCS
                            (Link Date/Time: 9-DEC-2000 00:44:16.18)
          Offset:           00005820

      Images Affected:

      -  [SYS$LDR]SYS$PGADRIVER.EXE

   o  A system can experience CPUSPINWAIT crashes. See the crash
      dump summary below.

          Crashdump Summary Information:
          ------------------------------
          Bugcheck Type:    CPUSPINWAIT, CPU spinwait timer expired
          Current Process:  NULL
          Current Image:
          Failing PC:       FFFFFFFF.80088384    SMP$TIMEOUT_C+00064
          Failing PS:       18000000.00000804
          Module:           SYSTEM_SYNCHRONIZATION_MIN
                            (Link Date/Time: 26-MAY-2001 22:21:11.81)
          Offset:           00000384

      Images Affected:

      -  [SYS$LDR]SYS$PKQDRIVER.EXE

   o  For 8MM tapes (for example Exabyte, TZK15), a COPY command to
      a freshly initialized tape results in a fatal drive error
      whenever the COPY command is issued on a TMSCP client node.
      The error log will show that an Illegal Request has been sent
      to the drive.

      Images Affected:

      -  [SYS$LDR]SYS$MKDRIVER.EXE

   o  Mounting CDs in the Yamaha CD-Writer CRW8424S results in a
      '%MOUNT-F-FORMAT, invalid media format' error message.

      Images Affected:

      -  [SYS$LDR]SYS$DKDRIVER.EXE

   o  When booting, a shadowed system disk can hang the cluster.

      Images Affected:

      -  [SYS$LDR]SYS$DKDRIVER.EXE

8  KIT INSTALLATION RATING:

   The following kit installation rating, based upon current CLD
   information, is provided to serve as a guide to which customers
   should apply this remedial kit. (Reference attached Disclaimer of
   Warranty and Limitation of Liability Statement)

   INSTALLATION RATING:

   INSTALL_1 : To be installed by all customers.

9  INSTALLATION INSTRUCTIONS:

   Install this kit with the POLYCENTER Software Installation
   utility by logging into the SYSTEM account and typing the
   following at the DCL prompt:

       PRODUCT INSTALL VMS721_FIBRE_SCSI /SOURCE=[location of Kit]

   The kit location may be a tape drive, CD, or a disk directory
   that contains the kit.

   Additional help on installing PCSI kits can be found by typing
   HELP PRODUCT INSTALL at the system prompt.

   This kit requires a system reboot. Compaq strongly recommends
   that a reboot be performed immediately after kit installation to
   avoid system instability.

   If you have other nodes in your OpenVMS cluster, they must also
   be rebooted in order to make use of the new image(s). If it is
   not possible or convenient to reboot the entire cluster at this
   time, a rolling reboot may be performed.

   9.1  Special Installation Instructions:

        9.1.1  Scripting of Answers to Installation Questions

               During installation, this kit will ask several
               questions that require a user response. If you wish
               to automate the installation of this kit and avoid
               having to provide responses to these questions, you
               must create a DCL command procedure that includes
               the following definitions and commands:

               -  $ DEFINE/SYS NO_ASK$BACKUP TRUE
               -  $ DEFINE/SYS NO_ASK$REBOOT TRUE
               -  Add the following qualifiers to the PRODUCT
                  INSTALL command and add that command to the DCL
                  procedure:

                      /PROD=DEC/BASE=AXPVMS/VER=V6.0

               -  Deassign the logicals assigned.

               For example, a sample command file to install the
               VMS721_FIBRE_SCSI-V0600 kit would be:

               $
               $ DEFINE/SYS NO_ASK$BACKUP TRUE
               $ DEFINE/SYS NO_ASK$REBOOT TRUE
               $!
               $ PROD INSTALL VMS721_FIBRE_SCSI/PROD=DEC/BASE=AXPVMS/VER=V6.0
               $!
               $ DEASSIGN/SYS NO_ASK$BACKUP
               $ DEASSIGN/SYS NO_ASK$REBOOT
               $!
               $ exit

Copyright (c) Compaq Computer Corporation, 2002
All Rights Reserved.
Unpublished rights reserved under the copyright laws of the United
States.

COMPAQ, the Compaq logo, VAX, Alpha, VMS, and OpenVMS are registered
in the U.S.
Patent and Trademark Office.

All other product names mentioned herein may be trademarks of their
respective companies.

Confidential computer software. Valid license from Compaq required
for possession, use, or copying. Consistent with FAR 12.211 and
12.212, Commercial Computer Software, Computer Software
Documentation, and Technical Data for Commercial Items are licensed
to the U.S. Government under vendor's standard commercial license.

Compaq shall not be liable for technical or editorial errors or
omissions contained herein. The information in this document is
provided as is without warranty of any kind and is subject to
change without notice. The warranties for Compaq products are set
forth in the express limited warranty statements accompanying such
products. Nothing herein should be construed as constituting an
additional warranty.

DISCLAIMER OF WARRANTY AND LIMITATION OF LIABILITY

THIS PATCH IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND. ALL
EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES,
INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR
PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE HEREBY EXCLUDED TO THE
EXTENT PERMITTED BY APPLICABLE LAW. IN NO EVENT WILL COMPAQ BE
LIABLE FOR ANY LOST REVENUE OR PROFIT, OR FOR SPECIAL, INDIRECT,
CONSEQUENTIAL, INCIDENTAL OR PUNITIVE DAMAGES, HOWEVER CAUSED AND
REGARDLESS OF THE THEORY OF LIABILITY, WITH RESPECT TO ANY PATCH
MADE AVAILABLE HERE OR TO THE USE OF SUCH PATCH.