ECO NUMBER:      VMS73_PTHREAD-V0300
PRODUCT:         OpenVMS Alpha OPERATING SYSTEM V7.3
UPDATE PRODUCT:  OpenVMS Alpha OPERATING SYSTEM V7.3

COVER LETTER                                        6 August 2002

1  KIT NAME:

   VMS73_PTHREAD-V0300

2  KITS SUPERSEDED BY THIS KIT:

   VMS73_PTHREAD-V0200

3  KIT DEPENDENCIES:

   3.1  The following remedial kit(s), or later, must be installed
        BEFORE installation of this, or any required kit:

        VMS73_UPDATE-V0100

   3.2  In order to receive all the corrections listed in this kit,
        the following remedial kits, or later, should also be
        installed:

        None.

4  KIT DESCRIPTION:

   4.1  Version(s) of OpenVMS to which this kit may be applied:

        OpenVMS Alpha V7.3

   4.2  Files patched or replaced:

        o  [SYSLIB]PTHREAD$DBGSHR.EXE (new image)
        o  [SYSLIB]PTHREAD$RTL.EXE (new image)
        o  [SYSLIB]SYS$STARLET_C.TLB (new file)

5  PROBLEMS ADDRESSED IN VMS73_PTHREAD-V0300 KIT

   o  On multiprocessor systems, the EXEC uses the Inner-Mode
      Semaphore (IMS) to serialize execution of many system
      services. Upcalls are used to allow the threads library to
      execute one thread while another is blocked waiting for the
      IMS. The services which the threads library uses to switch
      threads are themselves users of the IMS. Under certain
      conditions, the result is recursion of IMS-free upcalls. If
      the recursion persists long enough, a stack overflow can occur
      in a null thread, which terminates the process. This can only
      happen on a multiprocessor system when use of upcalls and
      multiple kernel threads are enabled.

      Images Affected:

      - [SYSLIB]PTHREAD$DBGSHR.EXE
      - [SYSLIB]PTHREAD$RTL.EXE

   o  The threads library uses $SETIMR to implement
      pthread_cond_timedwait and pthread_delay_np operations. The
      library starts a timer for a given wake-up time, and expects
      to be awoken at that time (or perhaps later, if the system is
      busy). At one customer site, we observed that occasionally the
      wake-up appeared to arrive too early (at a system clock time
      which was earlier than the requested wake-up time).
      Under certain conditions these early wake-ups caused the
      process to hang.

      The cause of the early wake-ups is thought to be use of NTP
      which was configured to allow backward time jumps (which is
      not recommended). If such a backward jump occurs between when
      the timer's target time is reached, and when the library runs
      and reads the system clock, the effective result is an early
      wake-up.

      The library has been enhanced to defend against early
      wake-ups, by requesting a new timer for the original target
      time.

      Images Affected:

      - [SYSLIB]PTHREAD$DBGSHR.EXE
      - [SYSLIB]PTHREAD$RTL.EXE

   o  The public C language header file PTHREAD_EXCEPTION.H was
      using the symbol NULL, without ensuring that this symbol was
      defined. If a user program is compiled with this header, in an
      environment which happens to not cause NULL to be defined, the
      following compiler diagnostic will appear:

      %CC-E-UNDECLARED, In this statement, "NULL" is not declared.

      The header file has been changed to not use the NULL symbol.

      Images Affected:

      - [SYSLIB]SYS$STARLET_C.TLB

   o  Fix a locking problem which could cause a threads bugcheck at
      context-switch.

      The threads library uses internal locks to protect shared
      data. A thread can context-switch away while holding certain
      locks. The internal lock environment is generally restored
      when the thread is later context-switched back in. A bug
      caused the incorrect lock environment to be restored under
      certain conditions. The result is most often a threads
      bugcheck "selected a non-ready thread," but other types of
      process crashes could result.

      Following is the introductory part of the bugcheck report:

      %DECthreads bugcheck (version V3.17-019), terminating execution.
      %Reason: selected a non-ready thread 8 (0x000000001E945740) state running
      %Running on OpenVMS V7.3 on AlphaServer 8400 5/625, 2048Mb; 10 CPUs
      % The bugcheck occurred at 29-MAY-2001 22:45:06.03, running
      % image DSA0:[DB22X.]SERVER.EXE;1 in process 202014EC
      % (named "MCR Srv"), under username "SYSTEM". AST delivery
      % is enabled for all modes; no ASTs active. Upcalls are
      % enabled. Multiple kernel threads are enabled. The current
      % thread sequence number is 1, at 0x7BC34908

      Images Affected:

      - [SYSLIB]PTHREAD$DBGSHR.EXE
      - [SYSLIB]PTHREAD$RTL.EXE

   o  Fix a problem seen in Java that caused process hangs.

      The hang requires an application AST to arrive while the Java
      garbage collector is running under certain conditions. The
      threads library uses the default thread to service the AST
      (when upcalls are enabled, which Java requires). If garbage
      collection is happening at the same time, the default thread
      can be left in an incorrect state, causing a hang. To fix
      this, AST delivery is now deferred while the garbage collector
      is running.

      Images Affected:

      - [SYSLIB]PTHREAD$DBGSHR.EXE
      - [SYSLIB]PTHREAD$RTL.EXE

6  PROBLEMS ADDRESSED IN VMS73_PTHREAD-V0200 KIT

   o  The pthread_cond_timedwait() and pthread_delay_np() timed
      operations could wait too long before timing out. This can
      happen if the operating system's Time Differential Factor
      (TDF) is changed, e.g., due to a Daylight Saving Time (DST)
      adjustment.

      Documentation for these timed operations states that they may
      in fact return late. However, that was not meant to cover, for
      example, a one-hour delay caused by a DST switch.

      With this change, system-time alterations due to general TDF
      changes (such as a DST switch) now have no effect on timed
      thread operations.

      Images Affected:

      - [SYSLIB]PTHREAD$DBGSHR.EXE
      - [SYSLIB]PTHREAD$RTL.EXE

   o  This correction fixes a race condition on a multiprocessor
      system, where thread scheduling states can become confused.
      This is a rare condition, and only happens when upcalls and
      multiple kernel threads are in use.

      The only known manifestation of the problem so far is an
      application hang. One kernel thread is seen to be in an AST in
      HIB state, while that kernel thread is running a threads
      library internal "null" thread. Because an AST is active, no
      more ASTs will be delivered. If further application progress
      depends on servicing subsequent ASTs, the application is hung.

      Images Affected:

      - [SYSLIB]PTHREAD$DBGSHR.EXE
      - [SYSLIB]PTHREAD$RTL.EXE

   o  A process can crash (probably in PTHREAD$RTL, possibly in
      user-mode).

      This problem requires the use of multiple kernel threads (and
      thus upcalls) on a multiprocessor system, and AST delivery.
      Upon completion of an application AST, the threads library can
      mistakenly try to run the default thread on two CPUs
      concurrently. If this occurs, the application will probably
      ACCVIO quickly.

      Images Affected:

      - [SYSLIB]PTHREAD$DBGSHR.EXE
      - [SYSLIB]PTHREAD$RTL.EXE

7  KIT INSTALLATION RATING:

   The following kit installation rating, based upon current CLD
   information, is provided to serve as a guide to which customers
   should apply this remedial kit. (Reference attached Disclaimer of
   Warranty and Limitation of Liability Statement)

   INSTALLATION RATING:

   INSTALL_3 : To be installed by customers experiencing the
               problems corrected.

8  INSTALLATION INSTRUCTIONS:

   Install this kit with the POLYCENTER Software installation
   utility by logging into the SYSTEM account, and typing the
   following at the DCL prompt:

   PRODUCT INSTALL VMS73_PTHREAD /SOURCE=[location of Kit]

   The kit location may be a tape drive, CD, or a disk directory
   that contains the kit.

   Additional help on installing PCSI kits can be found by typing
   HELP PRODUCT INSTALL at the system prompt.

   No reboot is necessary after successful installation of the kit.

   8.1  Special Installation Instructions:

        8.1.1  Scripting of Answers to Installation Questions

        During installation, this kit will ask and require user
        response to several questions. If you wish to automate the
        installation of this kit and avoid having to provide
        responses to these questions, you must create a DCL command
        procedure that includes the following definitions and
        commands:

        - $ DEFINE/SYS NO_ASK$BACKUP TRUE

        - Add the following qualifiers to the PRODUCT INSTALL
          command and add that command to the DCL procedure.

          /PROD=DEC/BASE=AXPVMS/VER=V3.0

        - De-assign the logicals assigned

        For example, a sample command file to install the
        VMS73_PTHREAD kit would be:

        $
        $ DEFINE/SYS NO_ASK$BACKUP TRUE
        $!
        $ PROD INSTALL VMS73_PTHREAD/PROD=DEC/BASE=AXPVMS/VER=V3.0
        $!
        $ DEASSIGN/SYS NO_ASK$BACKUP
        $!
        $ exit

Copyright (c) Compaq Computer Company, 2002
All Rights Reserved.

Unpublished rights reserved under the copyright laws of the United
States.

COMPAQ, the COMPAQ logo, VAX, Alpha, VMS, and OpenVMS are registered
in the U.S. Patent and Trademark Office.

All other product names mentioned herein may be trademarks of their
respective companies.

Confidential computer software. A valid license from COMPAQ is
required for possession, use, or copying. Consistent with FAR 12.211
and 12.212, Commercial Computer Software, Computer Software
Documentation, and Technical Data for Commercial Items are licensed
to the U.S. Government under vendor's standard commercial license.

COMPAQ shall not be liable for technical or editorial errors or
omissions contained herein. The information in this document is
provided as is without warranty of any kind and is subject to change
without notice. The warranties for COMPAQ products are set forth in
the express limited warranty statements accompanying such products.
Nothing herein should be construed as constituting an additional
warranty.

DISCLAIMER OF WARRANTY AND LIMITATION OF LIABILITY

THIS PATCH IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND. ALL
EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES,
INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR
PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE HEREBY EXCLUDED TO THE
EXTENT PERMITTED BY APPLICABLE LAW. IN NO EVENT WILL COMPAQ BE LIABLE
FOR ANY LOST REVENUE OR PROFIT, OR FOR SPECIAL, INDIRECT,
CONSEQUENTIAL, INCIDENTAL OR PUNITIVE DAMAGES, HOWEVER CAUSED AND
REGARDLESS OF THE THEORY OF LIABILITY, WITH RESPECT TO ANY PATCH MADE
AVAILABLE HERE OR TO THE USE OF SUCH PATCH.