ECO NUMBER:      VAXRMS02_072
PRODUCT:         OpenVMS VAX OPERATING SYSTEM V7.2
UPDATE PRODUCT:  OpenVMS VAX OPERATING SYSTEM V7.2

                          COVER LETTER

1  KIT NAME:

   VAXRMS02_072

2  KITS SUPERSEDED BY THIS KIT:

   VAXRMS01_072

3  KIT DEPENDENCIES:

   3.1  The following remedial kit(s), or later, must be installed
        BEFORE installation of this, or any required kit:

        VAXUPDATE01_072

   3.2  In order to receive all the corrections listed in this kit,
        the following remedial kits, or later, should also be
        installed:

        None.

4  KIT DESCRIPTION:

   4.1  Version(s) of OpenVMS to which this kit may be applied:

        OpenVMS VAX V7.2

   4.2  Files patched or replaced:

        o  [SYSEXE]CONVERT.EXE (new image)
        o  [SYSLIB]CONVSHR.EXE (new image)
        o  [SYSEXE]EDF.EXE (new image)
        o  [SYSEXE]RECLAIM.EXE (new image)
        o  [SYS$LDR]RECOVERY_UNIT_SERVICES.EXE (new image)
        o  [SYS$LDR]RMS.EXE (new image)
        o  [SYS$LDR]RMSDEF.STB (new file)

-- COVER LETTER -- Page 2 8 October 2002

5  PROBLEMS ADDRESSED IN VAXRMS02_072 KIT

   o  RMS: Fix for potential RMS lock hang when global buffers are
      enabled.

      This problem involves a rare condition in which the last
      accessor of the global section is interrupted by an abort
      rundown, so that the need to clean up the global section may
      not be properly detected. This could result in the survival of
      the global section and a global bucket system lock after the
      last accessor is deleted.

      Since this problem requires the accessor that is aborted to
      also be the last accessor, it is more apt to occur with a file
      whose accessors come and go, that is, a file that is opened and
      closed frequently and may have only one accessor at a
      particular time (e.g. RIGHTSLIST.DAT), than with a file with
      global buffers enabled that typically has several concurrent
      accessors at all times until there is an application (or
      system) shutdown.

      Images Affected:
      - [SYS$LDR]RMS.EXE

   o  RMS: Fix to prevent possible process hangs and system crashes
      when files with global buffers are accessed.

      An application accessing a file with global buffers enabled
      might experience any one of several symptoms, ranging from an
      IVLOCKID being returned to RMS through a possible SSRVEXCEPT
      due to corruption of an RMS internal control structure.

      Prior to this change, it was possible for an internal table
      maintained by RMS (the Global Buffer Interlock Table) within
      its global buffer sections to overflow. This can potentially
      result in corruption to adjoining control structures. No user
      data are compromised; however, the process may hang or the
      system may crash, depending on what is overwritten. This
      problem is most prevalent on systems where there is a high
      turnover of processes.

      Images Affected:
      - [SYS$LDR]RMS.EXE

   o  RMS: Fix for inconsistent secondary key index structure.

      Any application that performs many deletes, or performs updates
      that change a no-duplicates secondary key value to another
      value, in an indexed file is a potential candidate for this
      problem.

      An ANALYZE/RMS_FILE of the indexed file reports the following
      error for a secondary key:

         "Index bucket references missing data bucket with VBN nnn"

      The problem may be that the secondary index structure has
      duplicate index value entries; there should never be duplicates
      in the index structure. If the secondary index allows a binary
      search (is uncompressed), records could be hidden using an
      exact secondary key lookup.

      This problem results from the entire space being
      inappropriately reclaimed for the physically last SIDR record
      in some secondary data bucket which contains only deleted
      entries. This problem is restricted to an indexed file with a
      secondary key that allows no duplicates.

      The primary key contents will be intact and correct, and a
      convert of the file will rebuild the secondary indexes and
      leave the file in a consistent state.

      Images Affected:
      - [SYS$LDR]RMS.EXE

   o  RMS: Fix for some records being potentially skipped over in a
      reverse key search.

      If the very last bucket in the data bucket chain for a
      particular key-of-reference is empty (no valid records), the
      potential exists for any valid records in the next-to-the-last
      bucket (and only this bucket) to be skipped over in the
      backwards scan done by a reverse key search. This problem is
      restricted to a reverse key search.

      Images Affected:
      - [SYS$LDR]RMS.EXE

   o  RMS RU journaling fix for an update window when a SIDR
      (secondary index data record) marked as RU-DELETE may be
      inappropriately re-marked as deleted while the SIDR is still
      part of an active recovery unit.

      If this transaction were aborted, this could result in the new
      secondary key value being retained in the primary data record.

      Images Affected:
      - [SYS$LDR]RMS.EXE

   o  RMS: Avoid an exec-mode infinite loop if RMS ever attempts to
      add a duplicate key value to a compressed index bucket.

      An index bucket should never have a duplicate key value. There
      is the potential, however, that some inconsistency (corruption)
      at a lower level could result in such an attempt in the case of
      a compressed key. A correction has been added to issue a
      nonfatal RMS bugcheck (ISAM) and avoid the loop.

      Images Affected:
      - [SYS$LDR]RMS.EXE

   o  RMS: Fix to prevent a non-fatal SSRVEXCEPT bugcheck when a
      wildcard network copy completes.

      Attempts to perform wildcard file copy operations across the
      network may fail with a non-fatal SSRVEXCEPT bugcheck upon
      completion when a third party event notification software
      package is installed. Stale information contained within a
      recycled DAP message buffer could cause an invalid path to be
      executed during the implicit $DISPLAY of a file during RMS
      rundown.

      Images Affected:
      - [SYS$LDR]RMS.EXE

   o  RMS: Warning added that the limit of 16383 for the number of
      either file opens or stream connects by one process has been
      exceeded.

      The Internal File (IFI) and Stream Identifier (ISI) values for
      file opens or stream connects are limited to 14 bits (or a
      maximum value of 16383). Exceeding the limit can result in
      unpredictable error conditions depending on what operations are
      attempted after the open or connect. Prior to this change, RMS
      failed to issue a warning when this limit was exceeded.

      RMS has been modified to return an RMS$_DME error status so an
      application won't continue when this limit is exceeded for an
      $OPEN or $CONNECT. This serves as a warning to an application
      that it must be redesigned to limit either the number of files
      opened or streams connected to less than 16383.

      Images Affected:
      - [SYS$LDR]RMS.EXE

   o  RMS: Set the return length of the auxiliary buffer for calls
      to SYS$FILESCAN.

      The return length of the auxiliary buffer ("retlen" optional
      parameter) that was passed to SYS$FILESCAN was not being set
      when the Field Flags argument ("fldflags" parameter) was
      absent. This change sets the return length value
      unconditionally when one has been requested.

      Images Affected:
      - [SYS$LDR]RMS.EXE

   o  RMS: Rollback of a remote file transfer change made in OpenVMS
      VAX V7.2R and V7.3 to its day 1 behavior in order to restore
      prior performance metrics for remote file transfers that
      request file transfer mode by setting the SQO (FAB$L_FOP)
      option.

      The change was made to ignore the file transfer mode (FTM)
      request if the remote file was write shared. This has led to a
      number of reports of applications that previously had SQO
      specified for remote files and are now experiencing significant
      performance degradations in their remote applications with VAX
      V7.2R and V7.3.

      We have reviewed the previous behavior for file transfer mode
      and found that while there is the appearance of locking
      inconsistencies for readers when FTM is used, there is no
      potential for data corruption.
      We have concluded that when users set the FTM (SQO) option,
      they are in effect giving permission for the same kind of
      inconsistencies that a user allows when the read-regardless
      (RRL) option is set.

      This change restores the VAX pre-V7.2 behavior for the file
      transfer mode for remote files. If SQO is set, file transfer
      mode will be used regardless of the sharing specified for the
      remote file. Users should expect to see the same kind of
      inconsistencies in reading data as they see when the
      read-regardless (RRL) option is set. The SQO option should be
      disabled if this is not acceptable for some application.

      In addition, to avoid the possibility of a hang that may be
      induced by retrying remote accesses after a record lock error,
      users should consider setting both the no-lock (NLK) and
      read-regardless (RRL) options in the RAB$L_ROP in applications
      that use the file transfer mode (SQO) option for remote file
      accesses.

      An existing application should continue to work with the
      restored behavior without further change, even if it was
      modified to restore the file transfer mode behavior after the
      SQO fix was made in VAX V7.2R (e.g., by adding the UPI sharing
      option).

      There is just one potential problem that we need to point out.
      New applications designed and implemented in VAX V7.2R or V7.3
      that may allow remote accesses to write shared files should
      check whether SQO (FAB$L_FOP) is enabled. Currently the SQO
      option is being ignored (unless the UPI sharing option is
      specified), and the file transfer mode is not being used for
      any remote accesses. With the restore of the VAX pre-V7.2 SQO
      behavior, it will start being used and so the behavior of the
      application could change. Anyone with a new application that
      has SQO set and the possibility of write shared files being
      remotely accessed by the application should consider whether
      the SQO option needs to be disabled.
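
      The before-and-after rule described above can be summarized as
      a small decision model. This is only an illustrative sketch in
      Python, not RMS code: the function names and the boolean
      parameters are hypothetical stand-ins for the SQO (FAB$L_FOP)
      option, the write-shared state of the remote file, and the UPI
      sharing option named in the text.

```python
def uses_file_transfer_mode(sqo, write_shared, upi, restored):
    """Model whether a remote access uses file transfer mode (FTM).

    sqo          -- application set the SQO option (hypothetical flag)
    write_shared -- the remote file is write shared
    upi          -- application specified the UPI sharing option
    restored     -- True models the behavior restored by this kit,
                    False models the V7.2R / V7.3 behavior
    """
    if not sqo:
        # FTM is requested only by setting SQO.
        return False
    if restored:
        # Restored behavior: SQO is honored regardless of sharing.
        return True
    # V7.2R / V7.3 behavior: the request is ignored for write-shared
    # files unless the UPI sharing option is specified.
    return (not write_shared) or upi


def behavior_changes(sqo, write_shared, upi):
    """True when installing the kit changes the outcome for this case."""
    before = uses_file_transfer_mode(sqo, write_shared, upi, restored=False)
    after = uses_file_transfer_mode(sqo, write_shared, upi, restored=True)
    return before != after
```

      Under this model the only case whose outcome changes is SQO
      set, remote file write shared, and UPI not specified, which is
      the case the text asks owners of new applications to review.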

      Images Affected:
      - [SYS$LDR]RMS.EXE

   o  CONVERT/RECLAIM: Fix to prevent the CONVERT/RECLAIM utility
      from producing an inconsistent index structure in an indexed
      file during a reclamation.

      An ANALYZE/RMS_FILE reports the following error:

         "Index bucket references missing data bucket with VBN nnn"

      A level 1 index record associated with a data (level 0) bucket
      that was reclaimed was not removed from the index bucket, as it
      should have been.

      It is extremely difficult to detect, in advance of doing a
      convert/reclaim, whether an indexed file would be vulnerable if
      a reclaim were applied to it. For example, one condition is
      that one of the initial level 1 index buckets associated with
      data buckets eligible for reclamation has some condition (for
      example, only one index record) that will cause a rollback of a
      removed index record during a reclamation.

      Without the fix, performing a full convert (without the
      /RECLAIM qualifier) avoids this problem.

      Images Affected:
      - [SYSEXE]RECLAIM.EXE
      - [SYSEXE]CONVERT.EXE
      - [SYSLIB]CONVSHR.EXE

   o  CONVERT: Fix to prevent accvio when repeated calls are made to
      the CONVERT utility callable interface.

      When making repeated calls to the CONVERT utility callable
      interface from within a user application, an access violation
      or various sort errors may be returned. The following
      conditions must exist for this error to occur:

      - An application must be making repeated calls to the callable
        CONVERT interface.

      - The current file being converted must have more than 3 keys.

      - At least one of the previously converted files must have had
        all compressions disabled.

      - The current file being converted must have some compression
        enabled.

      Images Affected:
      - [SYSEXE]CONVERT.EXE
      - [SYSLIB]CONVSHR.EXE

   o  CONVERT: Fix for SQO error on CONVERT/NOSORT with collated key.

      Producing an indexed file with a collated key using the /NOSORT
      qualifier with the CONVERT utility may fail with the following
      error:

         %CONV-F-READERR, error reading
         -RMS-F-SQO, operation not sequential (SQO set)

      Images Affected:
      - [SYSEXE]CONVERT.EXE
      - [SYSLIB]CONVSHR.EXE

   o  CONVERT: Fix for remote file DAP protocol regression.

      The CONVERT utility fails with the following error when the
      input file is a sequential file on a remote foreign (non-VMS)
      system and the output file is a sequential file on a VMS
      system, if and only if /SORT is explicitly specified or implied
      by /FDL:

         %CONV-F-READERR, Error reading (IBM_filename)
         -RMS-F-BUG_DAP, Data Access Protocol error detected;
          DAP code = 0001A008

      The problem is not reproducible using a remote VMS system.

      Images Affected:
      - [SYSEXE]CONVERT.EXE
      - [SYSLIB]CONVSHR.EXE

   o  CONVERT: Fix for a user-mode accvio when converting a
      sequential file when the maximum record size (MRS) in its file
      header is inappropriately set to zero.

      Images Affected:
      - [SYSLIB]CONVSHR.EXE
      - [SYSEXE]CONVERT.EXE

   o  EDF: Fix to correct the bucket size calculation for values
      that are evenly divisible by the cluster size.

      The EDit/Fdl Design Facility (EDF) may calculate a larger than
      necessary bucket size when the bucket size is evenly divisible
      by the cluster size of the disk. This problem is fixed in
      OpenVMS VAX V7.3.

      Images Affected:
      - [SYSEXE]EDF.EXE

6  PROBLEMS ADDRESSED IN VAXRMS01_072 KIT

   o  Mark the Buffer Descriptor as busy for asynch multistreamed
      block IO autoextends.

      Performing multistreamed asynchronous Block IO to a sequential
      file could result in random data corruption and/or sporadic
      SS$_BADPARAM errors if an autoextend occurs.

   o  Modification to the global buffer hashing interlock mechanism.

      The hashing interlock mechanism, which was implemented in V7.2,
      has been redesigned to eliminate the timer-based strategy that
      was used during periods of high contention.
      A system could become overburdened by Timer Queue activity
      (IPL 8) when the contention for an interlock reached a
      threshold. This Timer Queue activity precluded the interlock
      owner from releasing the interlock and resulted in excessive
      Timer Queue activity consuming the primary CPU.

   o  Correction for processes exiting with RMS IORNDN non-fatal
      bugcheck.

      Processes may disappear with RMS IORNDN non-fatal bugchecks
      when an EXIT is requested by an Executive-mode application
      (such as ACMS). This is a very small timing window, so a
      process with a large number of files open is more likely to
      encounter the problem.

      If the SYSGEN parameter BUGCHECKFATAL is not enabled, then the
      process will be terminated; if it is enabled, then the system
      will crash with an RMSBUG (R2=FFFFFFF0, IORNDN) bugcheck.

   o  Access to corrupted directory results in process deletion.

      Access to a corrupted directory could result in the user's
      process being deleted from the system through an EXEC mode
      exception. Note that a system bugcheck would occur if the
      SYSGEN parameter BUGCHECKFATAL were set.

   o  Fix RUF bugcheck when SS$_CURTIDCHANGE returned.

      The system may crash with a "RUF, Fatal error detected by
      Recovery Unit Facility" bugcheck. R0 in the crash has the error
      code:

         SDA> e/cond r0
         SYSTEM-F-CURTIDCHANGE, already a change to the process
         default transaction in progress

   o  Support for SET RMS_DEFAULT /CONTENTION_POLICY to address
      locking fairness issues.

      The new Alpha global buffer read-mode lock support introduced
      in V7.2-1H1 is functionally compatible with both VAX and older
      Alpha releases. Operations in mixed clusters produce correct
      results. However, there is a locking fairness issue that may
      arise with mixed cluster operations.

      In a mixed cluster environment with very high contention for
      specific buckets, it is possible for accesses to write-shared
      files on nodes using read-mode bucket locking to dominate
      access to a bucket.
      Nodes without this support might be unable to access the bucket
      for a protracted period of time.

      It is also possible to observe comparable behavior on all
      OpenVMS versions when dealing with accesses to write-shared
      files without global buffers enabled, even on a standalone
      system. A similar fairness issue between lock conversions and
      new lock requests may be observed in which the new lock
      requests may remain ungranted for an extended period of time.

      This kit includes support in RMS for a new option to improve
      fairness under high contention conditions for write-shared
      files, but selecting this option may noticeably increase
      locking overhead. The option may be set at a process or system
      level. Since many applications will never encounter this issue,
      the default system behavior leaves this option disabled. A
      future lock management enhancement should make this fairness
      workaround unnecessary for later releases.

      The option is controlled using the /CONTENTION_POLICY qualifier
      to the DCL command SET RMS_DEFAULT.

      The following are valid PROCESS keywords (/SYSTEM not
      specified):

      NEVER           Never use the higher overhead option to improve
                      fairness for any write-shared files accessed by
                      this process; minimal overhead.

      SOMETIMES       Use this option for fairer bucket access (but
                      higher overhead) to any write-shared files with
                      global buffers enabled that are accessed by
                      this process.

      ALWAYS          Use this option for fairer bucket access (but
                      higher overhead) to all write-shared files
                      accessed by this process.

      SYSTEM_DEFAULT  (Default) Use system setting. Note that this
                      keyword is disallowed with /SYSTEM.

      The following are valid SYSTEM keywords (/SYSTEM specified):

      NEVER           (Default) Never use the higher overhead option
                      to improve fairness for any write-shared files
                      accessed on the system; minimal overhead.

      SOMETIMES       Use this option for fairer bucket access (but
                      higher overhead) to any write-shared files with
                      global buffers enabled that are accessed on the
                      system.

      ALWAYS          Use this option for fairer bucket access (but
                      higher overhead) to all write-shared files
                      accessed on the system.

      In addition to the RMS image, modifications to the following
      images are required:

      - [SYSEXE]SET.EXE
      - [SYSEXE]SHOW.EXE
      - [SYSMSG]CLIUTLMSG.EXE
      - replacement of modified SET.CLD in [SYSLIB]DCLTABLES.EXE

      These modified images are available in the VAXCLIU01_072 kit.
      The interface to this new functionality is not available until
      this CLIUTL kit is installed. Until the CLIUTL TIMA kit is
      available and installed, the default of NEVER for the
      CONTENTION_POLICY option cannot be overridden.

   o  Fix to EDIT/FDL to prevent large cluster factor giving
      63-block bucket.

      When running EDIT/FDL, the calculated bucket sizes are always
      rounded up to the closest disk-cluster boundary, with a maximum
      bucket size of 63. This can cause problems when the
      disk-cluster size is large, but the "natural" bucket size for
      the file is small, because the bucket size is rounded up to a
      much larger value than required. Larger bucket sizes increase
      record and bucket lock contention, which can impact
      performance.

   o  Initialize secondary value when using callable CONVERT.

      Callable convert reports a %CONV-F-INSVIRMEM error when a user
      application calls routine CONV$PASS_OPTIONS without a user
      option block.

   o  Fix to prevent callable convert from producing accvio on
      repeated calls.

      The CONVERT utility may return an access violation and/or
      sort_on errors when it is repeatedly invoked from within an
      application utilizing the callable interface. Additionally, an
      invalid file structure may be created when the callable
      interface is invoked repeatedly from within an application.

   o  Correction for CONVSHR SORT_ON and invalid alternate key
      structures.

      Three issues with the CONVERT utility have been addressed:

      + Attempts to CONVERT a prologue 3 indexed file with greater
        than 3 keys (key 3 and above), where the primary key is
        segmented and its segments are not in ascending order, result
        in invalid key structures being generated for key 3 and
        above.

      + Using the callable interface to convert multiple files
        results in a SORT_ON error if any previous file contains no
        records and has at least one alternate key defined.

      + Using the /SECONDARY qualifier with values greater than eight
        results in invalid alternate key structures being generated.

   o  Prevent ISI error on close from CONVERT/RECLAIM.

      CONVERT/RECLAIM reports the following error during image
      rundown:

         "%RMS-F-ISI, invalid internal stream identifier (ISI) value"

   o  Fix %CONVERT-I-SEQ errors converting sequential file.

      Attempts to convert a sequential file to an indexed format may
      report %CONVERT-I-SEQ errors despite convert's invocation of
      the SORT utility. These errors may be reported if any of the
      input file's records are shorter than the primary key's highest
      segment.

   o  Expand statistic display fields for Convert.

      The record count and/or bucket counts displayed by the
      statistics function and ^T function of Convert were previously
      limited to eight digits. This resulted in a field of "*"s being
      displayed when more than eight digits were required.

   o  Prevent RMS-F-AID errors converting multiple input files.

      Converting multiple indexed input files with different area
      attributes can result in an "RMS-F-AID, invalid area id" error.

   o  Correct propagation of LRL value to output file.

      The convert of a file could potentially leave the LRL field for
      the output file as zero despite the value existing for the
      input file. This is inconsistent with previous versions of
      CONVERT. The LRL value is required for some file organizations.

   o  Prevent ASTLM leak with callable CONVERT.

      Repeated calls to CONV$CONVERT using the callable interface
      could exhaust the user's ASTLM.

   o  Prevent process channel consumption with callable CONVERT.

      Repeated calls to CONV$CONVERT using the callable interface
      consume process channels if SYS$COMMAND is not a terminal
      device.

   o  CONVERT utility fails to create temporary work file.

      The CONVERT utility fails to create the temporary work file
      (CONVWORK) when the following conditions exist:

      + RMS indexed file is being created with more than two keys,
        and

      + sort of the primary key is not being performed, and

      + total size of the alternate keys (keys 2-7) represents a
        small percentage of the full record length, and

      + input file is relatively small, and

      + /SECONDARY qualifier is used to specify a value greater than
        the default value of 1.

      The effect of these conditions is that the calculated size of
      the work file is determined to be less than 1 block. When this
      size is passed to the file creation routine, the following
      fatal error is reported:

         "%CONVERT-F-CREA_ERR, error creating work file
          DISK:[DIRECTORY]CONVWORK.TMP;1"
         "-RMS-F-ALQ, invalid allocation quantity (negative, or 0 on
          $EXTEND)"

   o  Fix to CONVERT to prevent SORT_ON and user mode Access
      Violations.

      As part of OpenVMS V7.2, the CONVERT utility was modified to
      eliminate a previous design constraint, in which the output
      file would temporarily become vulnerable to user access during
      the exchange of the file between CONVERT and the SORT32
      utility. This problem would occur during the FAST load
      processing of the secondary keys of the file. (For more details
      see Section 3.21 of the V7.2 New Features manual.)

      As part of this modification, some changes were added to read
      records from the newly created output file while processing
      alternate keys. In the case of an indexed file with
      fixed-length records, the logic for determining whether both
      key and record compression were disabled for the primary key
      was flawed.
      Hence, this problem can lead to a record length being
      incorrectly calculated. It can result in the overwriting of
      internal structures contiguous to a temporary convert buffer
      and cause various errors, ranging from sort errors (SORT_ON) to
      user-mode access violations.

      In order for this problem to occur, the indexed file must have
      ALL of the following characteristics:

      + FIXED length record format

      + More than two keys

      + Primary key must have either key or record compression (or
        both) enabled

7  KIT INSTALLATION RATING:

   The following kit installation rating, based upon current CLD
   information, is provided to serve as a guide to which customers
   should apply this remedial kit. (Reference attached Disclaimer of
   Warranty and Limitation of Liability Statement)

   INSTALLATION RATING:

   INSTALL_1 : To be installed by all customers.

8  INSTALLATION INSTRUCTIONS:

   Install this kit with the VMSINSTAL utility by logging into the
   SYSTEM account, and typing the following at the DCL prompt:

      @SYS$UPDATE:VMSINSTAL VAXRMS02_072 [location of the saveset]

   The saveset location may be a tape drive, CD, or a disk directory
   that contains the kit saveset.

   This kit requires a system reboot. Compaq strongly recommends
   that a reboot be performed immediately after kit installation to
   avoid system instability.

   If you have other nodes in your OpenVMS cluster, they must also
   be rebooted in order to make use of the new image(s). If it is
   not possible or convenient to reboot the entire cluster at this
   time, a rolling reboot may be performed.

   8.1  Special Installation Instructions:

        There are no special installation instructions for the
        VAXRMS02_072 ECO kit.

Copyright (c) Compaq Computer Company, 2002 All Rights Reserved.
Unpublished rights reserved under the copyright laws of the United
States.

COMPAQ, the COMPAQ logo, VAX, Alpha, VMS, and OpenVMS are registered
in the U.S. Patent and Trademark Office.
All other product names mentioned herein may be trademarks of their
respective companies.

Confidential computer software. Valid license from COMPAQ is
required for possession, use, or copying. Consistent with FAR 12.211
and 12.212, Commercial Computer Software, Computer Software
Documentation, and Technical Data for Commercial Items are licensed
to the U.S. Government under vendor's standard commercial license.

COMPAQ shall not be liable for technical or editorial errors or
omissions contained herein. The information in this document is
provided as is without warranty of any kind and is subject to change
without notice. The warranties for COMPAQ products are set forth in
the express limited warranty statements accompanying such products.
Nothing herein should be construed as constituting an additional
warranty.

DISCLAIMER OF WARRANTY AND LIMITATION OF LIABILITY

THIS PATCH IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND. ALL
EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES,
INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR
PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE HEREBY EXCLUDED TO THE
EXTENT PERMITTED BY APPLICABLE LAW. IN NO EVENT WILL COMPAQ BE
LIABLE FOR ANY LOST REVENUE OR PROFIT, OR FOR SPECIAL, INDIRECT,
CONSEQUENTIAL, INCIDENTAL OR PUNITIVE DAMAGES, HOWEVER CAUSED AND
REGARDLESS OF THE THEORY OF LIABILITY, WITH RESPECT TO ANY PATCH
MADE AVAILABLE HERE OR TO THE USE OF SUCH PATCH.