OpenVMS VAXRMS02_071 VAX V7.1 RMS ECO Summary

TITLE: OpenVMS VAXRMS02_071 VAX V7.1 RMS ECO Summary

New Kit Date:       07-FEB-2001
Modification Date:  Not Applicable
Modification Type:  New Kit

NOTE: An OpenVMS saveset or PCSI installation file is stored on the Internet in a self-expanding compressed file. For OpenVMS savesets, the name of the compressed saveset file will be kit_name.a-dcx_vaxexe for OpenVMS VAX or kit_name.a-dcx_axpexe for OpenVMS Alpha. Once the OpenVMS saveset is copied to your system, expand the compressed saveset by typing RUN kitname.dcx_vaxexe or kitname.dcx_alpexe. For PCSI files, once the PCSI file is copied to your system, rename the PCSI file to kitname-dcx_axpexe.pcsi, then it can be expanded by typing RUN kitname-dcx_axpexe.pcsi. The resultant file will be the PCSI installation file which can be used to install the ECO.

Copyright (c) Compaq Computer Corporation 1998. All rights reserved.

OP/SYS:     DIGITAL OpenVMS VAX
COMPONENT:  RMS
SOURCE:     Compaq Computer Corporation

ECO INFORMATION:

  ECO Kit Name:                         VAXRMS02_071
  ECO Kits Superseded by This ECO Kit:  VAXRMS01_071
  ECO Kit Approximate Size:             1026 Blocks
  Kit Applies To:                       OpenVMS VAX V7.1
  System/Cluster Reboot Necessary:      Yes
  Rolling Re-boot Supported:            Yes
  Installation Rating:                  INSTALL_1
                                        1 - To be installed on all systems running the listed version(s) of OpenVMS.

  Kit Dependencies:

    The following remedial kit(s) must be installed BEFORE installation of this kit:

      None

    In order to receive all the corrections listed in this kit, the following remedial kits should also be installed:

      None

ECO KIT SUMMARY:

An ECO kit exists for RMS and global buffer processing on OpenVMS VAX V7.1. This kit addresses the following problems:

PROBLEMS ADDRESSED IN VAXRMS02_071 KIT:

  o Mark the Buffer Descriptor as busy for asynch multistreamed block IO autoextends.

    Performing multistreamed asynchronous Block IO to a sequential file could result in random data corruption and/or sporadic SS$_BADPARAM errors if an autoextend occurs.
    Images Affected: [SYS$LDR]RMS.EXE

  o Fix to prevent an inconsistent primary index structure.

    This problem is characterized by an ANALYZE/RMS_FILE of an indexed file reporting the following error:

      "Index bucket references missing data bucket with VBN nnn"

    When the index bucket is examined, the problem is determined to be that the primary index structure has duplicate index entries. There should never be duplicate entries in the index structure.

    Any file that has a prevalence of deleted records as the last records in a data bucket that is subsequently a candidate for a bucket split could potentially encounter this problem.

    A convert of the file will rebuild the primary index structure and leave the file in a consistent state.

    This problem is fixed in OpenVMS VAX V7.2.

    Images Affected: [SYS$LDR]RMS.EXE

  o Modification to the global buffer hashing interlock mechanism.

    The hashing interlock mechanism, which was implemented in V7.2, has been redesigned to eliminate the timer-based strategy that was used during periods of high contention. A system could become overburdened by Timer Queue activity (IPL 8) when the contention for an interlock reached a threshold. This Timer Queue activity precluded the interlock owner from releasing the interlock and resulted in excessive Timer Queue activity consuming the primary CPU.

    Images Affected: [SYS$LDR]RMS.EXE

  o Correction for processes exiting with RMS IORNDN non-fatal bugcheck.

    Processes may disappear with RMS IORNDN non-fatal bugchecks when an EXIT is requested by an Executive-mode application (such as ACMS). The timing window is very small, so a process with a large number of files is more likely to encounter the problem. If the SYSGEN parameter BUGCHECKFATAL is not enabled, then the process will be terminated; if it is enabled, then the system will crash with a RMSBUG (R2=FFFFFFF0, IORNDN) bugcheck.
    Images Affected: [SYS$LDR]RMS.EXE

  o FAL resource identifier file owner fix

    Creation of a file over the network into a directory owned by a resource identifier strips the high order 2 bits of the owner field, corrupting the ownership information. Checking the owner on the remote system will reveal a UIC-looking value derived from the low order bits of the identifier. Directory commands to a remote system will also strip the high order 2 bits from a general identifier on the remote system, formatting the result in UIC format.

    The stripping of the high order 2 bits of the owner value may result in the following errors being returned to a non-privileged user performing a COPY/LOG to a remote directory owned by a resource identifier:

      %COPY-S-COPIED, {file spec.} copied to {remote file spec.}
      %COPY-E-CLOSEOUT, error closing {file specification} as output
      -RMS-E-PRV, insufficient privilege or file protection violation

    This fix is included in OpenVMS VAX V7.2.

    Images Affected:

      - [SYS$LDR]RMS.EXE
      - [SYSEXE]FAL.EXE

  o SPR_VMS_V5: # 06022
  o SPR_VMS_V5: # 05990
  o SPR_VMS_V5: # 02759
  o SPR_VMS_V5: # 02144
  o SPR_VMS_V5: # 00675
  o V6: # 00287

  o Prevent RMS pool corruption when accessing bad .DIR file

    Access to a corrupted directory could result in the user's process being deleted from the system through an EXEC mode exception (a system bugcheck would occur if the SYSGEN parameter BUGCHECKFATAL were set).

    Images Affected: [SYS$LDR]RMS.EXE

  o Support for SET RMS_DEFAULT /CONTENTION_POLICY to address locking fairness issues.

    The new Alpha global buffer read-mode lock support introduced in V7.2-1H1 is functionally compatible with both VAX and older Alpha releases. Operations in mixed clusters produce correct results. However, there is a locking fairness issue that may arise with mixed cluster operations.
    In a mixed cluster environment with very high contention for specific buckets, it is possible for accesses to write-shared files on nodes using read-mode bucket locking to dominate access to a bucket. Nodes without this support might be unable to access the bucket for a protracted period of time.

    It is also possible to observe comparable behavior on all OpenVMS versions when dealing with accesses to write-shared files without global buffers enabled -- even on a standalone system. A similar fairness issue between lock conversions and new lock requests may be observed in which the new lock requests may remain ungranted for an extended period of time.

    This kit includes support in RMS for a new option to improve fairness under high contention conditions for write-shared files, but selecting this option may noticeably increase locking overhead. The option may be set at a process or system level. Since many applications will never encounter this issue, the default system behavior leaves this option disabled. A future lock management enhancement should make this fairness workaround unnecessary for later releases.

    The option is controlled using the /CONTENTION_POLICY qualifier to the DCL command SET RMS_DEFAULT.

    The following are valid PROCESS keywords (/SYSTEM not specified):

      NEVER           Never use the higher overhead option to improve fairness for any write-shared files accessed by this process; minimal overhead.

      SOMETIMES       Use this option for fairer bucket access (but higher overhead) to any write-shared files with global buffers enabled that are accessed by this process.

      ALWAYS          Use this option for fairer bucket access (but higher overhead) to all write-shared files accessed by this process.

      SYSTEM_DEFAULT  (Default) Use system setting. Note that this keyword is disallowed with /SYSTEM.
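
    The process keywords above can be read as a small decision table. The following Python sketch is illustrative only and is not RMS code: every name in it (Policy, effective_policy, uses_fairness_option) is hypothetical, and it models only the behavior stated in this summary, including the system-level default of NEVER.

```python
from enum import Enum


class Policy(Enum):
    """Keywords accepted by /CONTENTION_POLICY, as listed in this summary."""
    NEVER = "NEVER"
    SOMETIMES = "SOMETIMES"
    ALWAYS = "ALWAYS"
    SYSTEM_DEFAULT = "SYSTEM_DEFAULT"


def effective_policy(process_policy, system_policy=Policy.NEVER):
    """Resolve the policy a process ends up with (hypothetical helper)."""
    # SYSTEM_DEFAULT is disallowed at the system level.
    if system_policy is Policy.SYSTEM_DEFAULT:
        raise ValueError("SYSTEM_DEFAULT is disallowed with /SYSTEM")
    # A process left at SYSTEM_DEFAULT simply inherits the system setting.
    if process_policy is Policy.SYSTEM_DEFAULT:
        return system_policy
    return process_policy


def uses_fairness_option(policy, write_shared, global_buffers):
    """Return True if the higher-overhead fairness option applies to a file."""
    if policy is Policy.SYSTEM_DEFAULT:
        raise ValueError("resolve SYSTEM_DEFAULT with effective_policy() first")
    # The option only ever concerns write-shared files.
    if not write_shared:
        return False
    if policy is Policy.ALWAYS:
        return True
    if policy is Policy.SOMETIMES:
        # SOMETIMES covers only write-shared files with global buffers enabled.
        return global_buffers
    return False  # NEVER


if __name__ == "__main__":
    policy = effective_policy(Policy.SYSTEM_DEFAULT, Policy.SOMETIMES)
    print(policy.value, uses_fairness_option(policy, True, True))
```

    In this model, a process that never issues SET RMS_DEFAULT /CONTENTION_POLICY on a system left at its default resolves to NEVER, which matches the statement that the option is disabled by default.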
    The following are valid SYSTEM keywords (/SYSTEM specified):

      NEVER           (Default) Never use the higher overhead option to improve fairness for any write-shared files accessed on the system; minimal overhead.

      SOMETIMES       Use this option for fairer bucket access (but higher overhead) to any write-shared files with global buffers enabled that are accessed on the system.

      ALWAYS          Use this option for fairer bucket access (but higher overhead) to all write-shared files accessed on the system.

    In addition to the RMS image, modifications to the following images are required:

      - [SYSEXE]SET.EXE
      - [SYSEXE]SHOW.EXE
      - [SYSMSG]CLIUTLMSG.EXE
      - replacement of modified SET.CLD in [SYSLIB]DCLTABLES.EXE

    These modified images are available in the VAXCLIU04_071 kit. The interface to this new functionality is not available until this CLIUTL kit is installed. Until the CLIUTL TIMA kit is available and installed, the default of NEVER for the CONTENTION_POLICY option cannot be overridden.

    Images Affected: [SYS$LDR]RMS.EXE

  o Fix CONVERT/RECLAIM of RU_DELETE'd records.

    A CONVERT/RECLAIM of a fixed record length file with no compression enabled that has been used with RU journaling may inappropriately reclaim buckets containing valid user records. This results in data loss.

    This fix is included in OpenVMS VAX V7.2.

    Images Affected: [SYSLIB]CONVSHR.EXE

  o Fix to prevent callable convert from producing accvio on repeated calls.

    The CONVERT utility may return an access violation and/or sort_on errors when it is repeatedly invoked from within an application utilizing the callable interface. Additionally, an invalid file structure may be created when the callable interface is invoked repeatedly from within an application.

    Images Affected: [SYSLIB]CONVSHR.EXE

  o Expand statistic display fields for Convert

    The record count and/or bucket counts displayed by the statistics function and ^T function of Convert were previously limited to 8 digits.
    This resulted in a field of "*"s being displayed when more than 8 digits were required.

    Images Affected:

      - [SYSEXE]CONVERT.EXE
      - [SYSEXE]RECLAIM.EXE

PROBLEMS ADDRESSED IN VAXRMS01_071 KIT:

  o RMS has implemented a new algorithm for global buffer management that dramatically improves scalability. The performance associated with the previous algorithm effectively limited the maximum number of global buffers on large, shared files. With this change, customers may increase the number of global buffers on these files to the full limit of 32,767 to fully exploit large memory systems.

    Access to the global section used for RMS global buffers is now mainly synchronized using inline atomic instruction sequences rather than distributive locking. This change allows more concurrent access to the section (particularly on SMP machines).

    As has always been true, increasing the number of global buffers on specific files may require some system resources to be increased in size.

    Note that these enhancements required changes to the internal structures representing RMS global buffers. This will lead to some erroneous values being displayed by the System Dump Analyzer (SDA) in two RMS structures (global buffer descriptors and global buffer headers). This is a display anomaly only.

  o RMS and FAL have been enhanced to support time transfers using 128-bit UTC format.

    Currently, RMS and FAL exchange date-time file attributes using an 18-byte ASCII string which includes a 2-digit year. Since the date is pivoted at 1970 (YY's from 70 to 99 map to 1970 through 1999 and YY's from 00 to 69 map to 2000 through 2069), OpenVMS RMS is Year 2000 compliant with regard to file access using DECNET FAL.

    The DAP specification provides for using a 128-bit UTC binary date as an alternative to the ASCII format. This enhancement allows non-VMS operating systems to access file dates on OpenVMS in a completely Year 2000 compliant manner.

    Alternative workarounds:

      + Apply this remedial kit.
      + There is no temporary workaround for avoiding the secondary key inconsistency. However, a convert of the file after the problem has occurred will rebuild the secondary indexes and leave the file in a consistent state.

  o The last data record(s) added at the end of a relative file may be lost (overwritten) after an "explicit" call to the RMS $EXTEND service for a relative file that is being shared by two or more concurrent writers. This problem is restricted to an explicit $EXTEND; autoextends done transparently by RMS for a relative file work correctly.

    NOTE: Until this remedial kit can be installed, this problem can be prevented by not doing any explicit $EXTEND for a shared write relative file. Let the autoextend feature of RMS take care of any extends needed. See "Relative File Extend Size" section in Chapter 3 (Performance Considerations) of the Guide to OpenVMS File Applications for a description of how RMS derives the value it uses for its autoextend feature.

    Alternative Workarounds:

      + Apply this remedial kit.

      + Do not do an explicit $EXTEND for a relative file. Let the autoextend feature of RMS take care of any extends needed. Autoextends work correctly. See "Relative File Extend Size" section in Chapter 3 (Performance Considerations) of the Guide to OpenVMS File Applications for a description of how RMS derives the value it uses for its autoextend feature.

  o Stored semantics is a file attribute used to indicate that a file contains compound documents (i.e., contains a number of integrated components including text, graphics, and scanned images). Support for setting this file attribute is provided through RMS using an item list XAB (XABITM) user interface. In specifying the item list for setting this attribute (XAB$_STORED_SEMANTICS), a user may request a return length (the length of the stored semantics ACE added by the file system to the file).
    RMS without the fix returns a length of zero if the file is on a SMFS device despite the fact that the stored semantics attribute was successfully added to the file.

    Alternative workarounds:

      + Apply this remedial kit.

      + The XABITM XAB$_STORED_SEMANTICS request itself succeeds. There is no temporary workaround for obtaining the correct ACE length at the return length address provided in the XABITM item list.

  o Fix for appender lock timing window hang for shared write, RU journaled sequential file.

    This hang requires all of the following conditions: a shared write sequential file, multi-record streaming enabled, RU journaling enabled on the file, and heavy write contention among sharers at the end-of-file. It has only been reported by one site, and even then there was a lapse of over one year between two occurrences.

    Alternative workarounds:

      + Apply this remedial kit.

      + There is no temporary workaround. Changing the application to use only a single record stream on the journaled file would theoretically work, but this would most likely involve too much of a design change to the application to be a viable workaround. Disabling RU journaling would also not be a viable workaround.

  o Changes to the security profile of a file that was installed with the /OPEN qualifier were not reflected cluster wide until an INSTALL/REPLACE was performed on each node. With this correction, the protection information is dynamically propagated.

    Alternative workarounds:

      + Apply this remedial kit.

      + Following the change in the protection information, perform an INSTALL/REPLACE on all other nodes in the cluster.

  o Fix to properly handle a special bucket split case to prevent an exec-mode ACCVIO occurring during a bucket split.

    This problem is restricted to a file with KEY compression enabled containing a secondary key allowing duplicates with a nearly full SIDR data bucket.
    An ACCVIO (access violation) may occur during a bucket split operation if two conditions both hold: the last valid record in the bucket contains a long list of duplicates and starts before the calculated split point, and the memory page adjacent to the page mapping the current buffer is not valid. If the SYSGEN parameter BUGCHECKFATAL is not enabled, then the process will be terminated; if it is enabled, then the system will crash with a SSRVEXCEPT ACCVIO. The file will not be left corrupted.

    A temporary workaround if the problem ever occurs on a file would be to convert the file using a revised FDL in which key compression is disabled until the remedial kit is applied.

    Alternative workarounds:

      + Apply this remedial kit.

      + Convert the file using a revised FDL in which key compression is disabled until the remedial kit is applied.

  o Fix to correct a unique situation where the delete of a record in a file with key compression enabled could possibly cause the resulting key expansion to overflow the current bucket.

    This problem is restricted to a file with key compression enabled. A nonfatal RMS bugcheck (R2 value FFFFFFD4 (Internal ISAM Error)) could occur during a record delete operation on a prologue 3 file. The ISAM error would be returned when the deletion of a record caused the expansion of the compressed key value of the next record in the bucket to exceed the available space in the data bucket. This would occur if the primary data bucket containing the key was nearly full and the record that followed the record being deleted was compressed to the extent that only one physical byte of the key was stored in the data record. The file will not be left corrupted.

    A temporary workaround if the problem ever occurs on a file would be to convert the file using a revised FDL in which key compression is disabled until the remedial kit is applied.

    This problem was introduced in OpenVMS V6.0.

    Alternative workarounds:

      + Apply this remedial kit.
      + Convert the file using a revised FDL in which key compression is disabled until the remedial kit is applied.

  o Fix to prevent process hangs when RMS rundown is invoked from within an EXEC-mode AST.

    If RMS rundown is invoked from within an EXEC-mode AST, the process will hang indefinitely waiting on the delivery of pending ASTs. As documented in Section 2.5 of the RMS Reference Manual, RMS services should not be called from within an EXEC-mode AST. If, however, an EXEC-mode AST routine fails due to an unhandled condition, RMS rundown will be indirectly invoked by the image exit (resulting in a process hang).

    The rundown routine has been modified to abort RMS rundown with an RMS$_BUSY return status if it has been invoked from within an EXEC-mode AST.

    Alternative workarounds:

      + Apply this remedial kit.

      + While the use of STOP/ID is typically strongly discouraged, due to the nature of this loop, STOP/ID must be used to terminate the process.

  o Fix to prevent a nonfatal RMS bugcheck (ENQDEQFAIL) when a timeout on a lock request coincides with a system detected SS$_DEADLOCK condition for the same resource.

    If an RMS $GET is performed with the WAT and TMO options specified (wait for lock and timeout request after specified number of seconds) in the RAB$L_ROP (record options) field, it is possible for a nonfatal RMS bugcheck (ENQDEQFAIL; R2=FFFFFFF4) to be signaled when the timeout occurs. This can happen if an SS$_DEADLOCK condition is detected by the system (which cancels the lock request) but the timeout AST routine is triggered before RMS receives the

    This ECO kit creates a new startup file containing this correction.

  o Your DSN$STARTUP.COM is modified by this ECO kit to check for the SYSLCK privilege.

  o If 127 resource paths were defined for a DSNlink DECnet Gateway connection to a single resource name, and none of these 127 paths were available, the connection failed with an invalid index error instead of a resource-not-available error.
  o It was possible for an incorrect highlighting map to be used when applying the highlights to your ITS document under these conditions: a document read from an ITS database was read a second time from internal cache after reading intervening documents; the database you had open was remotely served by your DSNlink Host from somewhere else (not coresident with the host); and you used keyword highlighting when reading the documents.

  o Under circumstances similar to ECO B126, text lines returned from the ITS server could be cached at inappropriate relative positions in the cache list.

  o Because the File Copy application does not permit files to be copied from another node, the application contains a test for the presence of a node name. If present, the error NODNAMNOTALLOW is issued. However, the check did not always work, and this error was not issued at times. Instead, the file copy aborted with an unrelated error.

  o Incoming DSNlink mail is handled by DSN$MAIL_xx subprocesses on your system. Those subprocesses could enter an infinite loop if they received certain incorrectly formatted destination mail addresses.

  o An internal hash table initialization problem caused sporadic network link failures, such as "unknown link" errors or "change cipher key" requests. These errors always resulted in premature link aborts or prevented the establishment of new links.

  o If a File Copy job was aborted just after it allocated the space for the file but had not filled any blocks, the subsequent requeued job created another new file. The first empty file, with full space allocation, was not deleted. Additionally, the old in-progress file was locked by the server subprocess that populated the new in-progress file.

  o If a DSNlink Copy job partially copies a file, the file is requeued. If the partial file was deleted on the receiving end before the copy job restarted, the job errored out with a DSN-F-BADDATA error and a corrupt file was placed in the incoming files directory.
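
The node-name test described in the File Copy item above can be sketched in a few lines of Python. This is an illustration only and not the DSNlink implementation: the function and exception names are hypothetical, and the sketch assumes just one fact about OpenVMS file specifications, namely that a node name (optionally followed by a quoted access control string) is terminated by a double colon.

```python
class NodeNameNotAllowed(Exception):
    """Hypothetical stand-in for the NODNAMNOTALLOW error named in the text."""


def has_node_name(file_spec):
    """Return True if file_spec contains a '::' node delimiter outside quotes."""
    in_quotes = False
    for i, ch in enumerate(file_spec):
        if ch == '"':
            # Quoted text (such as an access control string) is skipped.
            in_quotes = not in_quotes
        elif ch == ":" and not in_quotes and file_spec[i + 1:i + 2] == ":":
            return True
    return False


def check_local_spec(file_spec):
    """Reject any specification that names a node; return it unchanged otherwise."""
    if has_node_name(file_spec):
        raise NodeNameNotAllowed("node name not allowed in: " + file_spec)
    return file_spec


if __name__ == "__main__":
    for spec in ("DISK$USER:[DATA]FILE.TXT", "MYNODE::DISK$USER:[DATA]FILE.TXT"):
        try:
            print("accepted:", check_local_spec(spec))
        except NodeNameNotAllowed as err:
            print("rejected:", err)
```

Rejecting the specification before any copy work starts is the behavior the item describes as intended; the reported defect was that the check sometimes did not trigger and the copy later aborted with an unrelated error.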
INSTALLATION NOTES:

The system/cluster does not need to be rebooted after this kit is installed. However, DSNlink should be shut down and restarted in order for all kit changes to take effect.

Please see the DSNLINKC012 Release Notes, which are a part of this kit, for further pre- and post-installation instructions.

All trademarks are the property of their respective owners.

This patch can be found at any of these sites:

Colorado Site
Georgia Site
European Site

Files on this server are as follows:

dsnlinkc012.README
dsnlinkc012.CHKSUM
dsnlinkc012.CVRLET_TXT
dsnlinkc012.a-dcx_axpexe