OpenVMS VMS721H1_SYS-V0200 Alpha V7.2-1H1 SYSTEM ECO Summary
TITLE: OpenVMS VMS721H1_SYS-V0200 Alpha V7.2-1H1 SYSTEM ECO Summary
New Kit Date : 16-FEB-2001
Modification Date: Not Applicable
Modification Type: Updated Kit Supersedes VMS721H1_SYS-V0100
NOTE: An OpenVMS saveset or PCSI installation file is stored
on the Internet in a self-expanding compressed file.
For OpenVMS savesets, the name of the compressed saveset
file will be kit_name.a-dcx_vaxexe for OpenVMS VAX or
kit_name.a-dcx_axpexe for OpenVMS Alpha. Once the OpenVMS
saveset is copied to your system, expand the compressed
saveset by typing RUN kit_name.a-dcx_vaxexe or RUN kit_name.a-dcx_axpexe.
For PCSI files, once the PCSI file is copied to your system,
rename the PCSI file to kitname-dcx_axpexe.pcsi, then it can
be expanded by typing RUN kitname-dcx_axpexe.pcsi. The resultant
file will be the PCSI installation file which can be used to install
the ECO.
Copyright (c) Compaq Computer Corporation 2000. All rights reserved.
OP/SYS: OpenVMS Alpha
COMPONENT: System Executables
SOURCE: Compaq Computer Corporation
ECO INFORMATION:
ECO Kit Name: VMS721H1_SYS-V0200
DEC-AXPVMS-VMS721H1_SYS-V0200--4.PCSI
ECO Kits Superseded by This ECO Kit: VMS721H1_SYS-V0100
ECO Kit Approximate Size: 14096 Blocks
Kit Applies To: OpenVMS Alpha V7.2-1H1
System/Cluster Reboot Necessary: Yes
Rolling Re-boot Supported: Yes
Installation Rating: INSTALL_1
1 - To be installed on all systems running
the listed version(s) of OpenVMS.
Kit Dependencies:
The following remedial kit(s) must be installed BEFORE
installation of this kit:
VMS721H1_UPDATE-V0300
In order to receive all the corrections listed in this
kit, the following remedial kits should also be installed:
None
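As an aid to reading the kit file name listed above, the following Python sketch splits a PCSI file name such as DEC-AXPVMS-VMS721H1_SYS-V0200--4.PCSI into its hyphen-separated fields and converts the approximate kit size from blocks to bytes. It assumes the usual PCSI naming layout (producer, base system, product, version, update level, kit type) and a 512-byte OpenVMS disk block. The names parse_pcsi_name, PcsiName, and blocks_to_bytes are illustrative only and are not part of any OpenVMS tool.

```python
from typing import NamedTuple

BLOCK_SIZE = 512  # assumption: bytes per OpenVMS disk block


class PcsiName(NamedTuple):
    producer: str
    base: str
    product: str
    version: str
    update: str    # empty when the kit has no update level ("--" in the name)
    kit_type: str


def parse_pcsi_name(filename: str) -> PcsiName:
    """Split a PCSI kit file name into its six hyphen-separated fields."""
    stem, dot, ext = filename.rpartition(".")
    if not dot or not ext.upper().startswith("PCSI"):
        raise ValueError(f"not a PCSI file name: {filename!r}")
    fields = stem.split("-")
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {filename!r}")
    return PcsiName(*fields)


def blocks_to_bytes(blocks: int) -> int:
    """Convert a size in disk blocks to bytes."""
    return blocks * BLOCK_SIZE


kit = parse_pcsi_name("DEC-AXPVMS-VMS721H1_SYS-V0200--4.PCSI")
print(kit.product, kit.version, kit.kit_type)
print(blocks_to_bytes(14096), "bytes (approximate kit size)")
```

The product field recovered this way (VMS721H1_SYS) is the name the kit is known by when it is installed.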
ECO KIT SUMMARY:
An ECO kit exists for System components on OpenVMS Alpha V7.2-1H1. This
kit addresses the following problems:
PROBLEMS ADDRESSED IN VMS721H1_SYS-V0200 KIT:
o A problem in a $TRNLNM code path for INTERLOCKED translations
can result in the service exiting without releasing the
logical name mutex (mutual exclusion semaphore). If that
$TRNLNM request or any subsequent kernel mode system service
request made by that process exits with an error status, the
system will crash with a MTXCNTNONZ bugcheck.
If no kernel mode system service request made by that process
exits with an error status, the system will eventually hang,
with some processes in MUTEX wait trying to acquire the
logical name mutex. If some of those processes have already
acquired other mutexes, such as the I/O database mutex and
GSD mutex, there may be other processes in mutex wait trying
to acquire those mutexes.
The $TRNLNM bug is exercised by a fairly unusual combination
of circumstances and is more likely to be seen on an SMP
system.
Crashdump Summary Information:
------------------------------
Bugcheck Type: MTXCNTNONZ, Mutex count nonzero at system
service exit
Current Process: ORA_PRODC0661
Current Image: $1$DGA21:[ORACLE8.RDBMS]ORACLE.EXE
Failing PC: FFFFFFFF.8008EFF4
__RELEASE_SERVICE_ERROR_EXCEPT+00094
Failing PS: 38000000.00000200
Module: EXCEPTION (Link Date/Time:
28-MAY-1999 23:22:24.23)
Offset: 00018FF4
Images Affected: [SYS$LDR]LOGICAL_NAMES.EXE
o Memory reservations can consist of multiple parts. The code
to clean up the resident sections of memory contained several
paths where the MMG spinlock was not released. This could
cause the system to crash with CPUSPINWAIT or SPLACKERR
bugchecks.
This fix is required for all customers who are using the
reserved memory feature in SYSMAN.
Crashdump Summary Information:
------------------------------
Bugcheck Type: CPUSPINWAIT, CPU spinwait timer expired
Current Process: VERIFICATION
Current Image: DSA7777:[SECTIO]SECTIO.EXE;4
Failing PC: FFFFFFFF.8007C384
SYSTEM_SYNCHRONIZATION_MIN+00384
Failing PS: 18000000.00000803
Module: SYSTEM_SYNCHRONIZATION_MIN (Link Date/Time:
2-JUN-2000 11:12:08.33)
Offset: 00000384
Images Affected: [SYS$LDR]SYSTEM_PRIMITIVES.EXE
[SYS$LDR]SYSTEM_PRIMITIVES_MIN.EXE
[SYS$LDR]SYSTEM_PRIMITIVES.STB
[SYS$LDR]SYSTEM_PRIMITIVES_MIN.STB
o The $GETDVI (Get Device/Volume Information) system service does
not take into account that a device might have multiple UCBs
(unit control blocks) associated with it.
Images Affected: [SYS$LDR]IO_ROUTINES.EXE
o Applications using the kernel threads features can get stuck
in a loop.
Images Affected: [SYS$LDR]PROCESS_MANAGEMENT.EXE
[SYS$LDR]PROCESS_MANAGEMENT_MON.EXE
o A SHOW SYSTEM/FULL command displays the UIC of all
processes (except interactive processes) as [0,0]. $GETJPI
shows the correct UICs, except for the swapper.
o A system can crash with a SSRVEXCEPT bugcheck in module
PROCESS_MANAGEMENT_MON (SYSTEM_CHECK=1) at offset
0000F4B4.
Images Affected: [SYS$LDR]PROCESS_MANAGEMENT.EXE
[SYS$LDR]PROCESS_MANAGEMENT_MON.EXE
o A system can crash with a REFCNTNEG bugcheck in
MMG$FREWSLX_64_C or with a FREEPAGREF bugcheck in
MMG_STD$ALLOC_ZERO_PFN_64_C. In both crashes, one of the
process's level 2 page table PFNs (page frame numbers) is
incorrectly set to 1100.
This crash can occur only with a 64-bit program running on an
AlphaServer GS-Series 80/160/320. The program must page out an
L2 page table page, which happens only when the process's working
set is set low. Allocating an L2 page off-RAD occurs only when
the process's home RAD is changed or under low memory
conditions.
The following SDA command on a crash dump will quickly show if
you have had this problem:
SDA> show process/page/l1/p2
Images Affected: [SYS$LDR]SYS$VM.EXE
o A system can crash with a SPINWAIT bugcheck when non-paged
pool is fragmented. This crash will occur only if either of the
SYSGEN parameters NPAG_GENTLE or NPAG_AGGRESSIVE is set to a
value smaller than 100 (the default case).
Images Affected: [SYS$LDR]SYSTEM_PRIMITIVES.EXE
[SYS$LDR]SYSTEM_PRIMITIVES_MON.EXE
o Protect against corrupt PTEs (page table entries) that do not
have the KRE (kernel read enable) access bit set or that have
invalid PFNs (page frame numbers) on AlphaServer GS-Series
80/160/320 systems. Concurrently, improve system diagnostics.
Images Affected: [SYS$LDR]EXCEPTION.STB
[SYS$LDR]EXCEPTION_MON.STB
[SYSLIB]SDA$SHARE.EXE
[SYS$LDR]SYS$BASE_IMAGE.EXE
o Prevent LNMB (logical name block) headers from being partially
written as a result of corrupt PTEs that do not have the KRE
(kernel read enable) access bit set or PTEs that have invalid
PFNs (page frame numbers).
Images Affected: [SYS$LDR]EXCEPTION.STB
[SYS$LDR]EXCEPTION_MON.STB
[SYSLIB]SDA$SHARE.EXE
[SYS$LDR]SYS$BASE_IMAGE.EXE
o Contention on the MMG (memory management) spinlock
can become excessive when running Oracle 7 with multiple
clients connected simultaneously.
Images Affected: [SYS$LDR]SYS$VM.EXE
o The effect of a SET SECURITY/OBJECT=DEVICE command can be
propagated to the wrong device(s) in a cluster for MK, DK, and
DG devices.
Images Affected: [SYSLIB]IOGEN$SHARE.EXE
[SYS$LDR]SYS$BASE_IMAGE.EXE
[SYS$LDR]IO_ROUTINES.EXE
[SYS$LDR]IO_ROUTINES_MON.EXE
o The swapper code that controls shared page tables may not
function properly, if the same level 2 page table page maps
both a memory resident page table of PFN-mapped pages and
shared page tables. A DELCONPFN crash could occur when the
level 2 page tables are deleted.
Images Affected: [SYS$LDR]SYS$VM.EXE
o A system can crash with a KRNLSTAKNV bugcheck during heavy
I/O activity, such as BACKUP. Forcing the stack out shows
interaction between SYS$DKDRIVER and IO_ROUTINES filling up
the KPB stack, usually with some interrupt topping off the
stack.
Crashdump Summary Information:
------------------------------
Bugcheck Type: KRNLSTAKNV, Kernel stack not valid
Current Process: NULL
Current Image:
Failing PC: FFFFFFFF.825D7310
Failing PS: 00000000.00001504
Module:
Offset: 00000000
Images Affected: [SYS$LDR]IO_ROUTINES.EXE
[SYS$LDR]IO_ROUTINES_MON.EXE
o A CPU can run two processes in the current state, which
results in an INCON_SCHED, 'Inconsistent scheduling' bugcheck.
This problem occurs on a single-CPU system.
Crashdump Summary Information:
------------------------------
Bugcheck Type: INVEXCEPTN, Exception while above ASTDEL
Current Process: BROKER
Current Image: DSA360:[BROKER_U.AXP.][P]BROKER_EDITOR_U.EXE
Failing PC: FFFFFFFF.800C3B98 SCH$QEND_C+00038
Failing PS: 10000000.00000704
Module: PROCESS_MANAGEMENT (Link Date/Time:
29-DEC-1999 04:09:20.9
Offset: 00007B98
Images Affected: [SYS$LDR]SYS$VM.EXE
o A system can crash with an INVEXCEPTN bugcheck at
SCH$QEND_C+38 while getting the address of the process
alignment fault reporting information,
CTL$GL_REPORT_USER_FAULTS.
Crashdump Summary Information:
-------------------------------
Bugcheck Type: INVEXCEPTN, Exception while above ASTDEL
Current Process: BROKER
Current Image: DSA360:[BROKER_U.AXP.][P]BROKER_EDITOR_U.EXE;85
Failing PC: FFFFFFFF.800C3B98 SCH$QEND_C+00038
Failing PS: 10000000.00000704
Module: PROCESS_MANAGEMENT (Link Date/Time:
29-DEC-1999 04:09:20.9
Offset: 00007B98
Images Affected: [SYS$LDR]SYS$VM.EXE
o A multi-threaded process can hang with one thread spinning in
a loop using CPU time.
Images Affected: [SYSLIB]SYS$SSISHR.EXE
o A large crash dump may fail to include the GCT (Galaxy
Configuration Tree).
Images Affected: [SYS$LDR]EXCEPTION.EXE
[SYS$LDR]EXCEPTION_MON.EXE
[SYS$LDR]EXCEPTION.STB
[SYS$LDR]EXCEPTION_MON.STB
o The console callback FIND_NODE was so slow that it caused the
console to lose clock ticks. An earlier fix required the
system manager to manually set the SYSGEN parameter VMS8 to
%x1029000, which would force the console to use the VMS
version of FIND_NODE rather than the callback version. That
effectively solved the problem.
This enhancement sets VMS8 automatically to %x1029000.
Images Affected: [SYS$LDR]SYSTEM_PRIMITIVES.EXE
[SYS$LDR]SYSTEM_PRIMITIVES_MIN.EXE
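The non-paged pool item in the list above gives a simple exposure condition: the crash is possible only when NPAG_GENTLE or NPAG_AGGRESSIVE is set below 100. A minimal Python sketch of that check follows. The function name and the sample values are hypothetical, and the actual parameter values would have to be read from SYSGEN on the system in question.

```python
def exposed_to_pool_spinwait(npag_gentle: int, npag_aggressive: int) -> bool:
    """Return True if either parameter is below 100, the condition
    this summary gives for the fragmented-pool crash."""
    return npag_gentle < 100 or npag_aggressive < 100


# Hypothetical sample values, not taken from any real system.
print(exposed_to_pool_spinwait(100, 100))
print(exposed_to_pool_spinwait(85, 100))
```

A result of True means only that the stated precondition holds, not that a crash will occur.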
PROBLEMS ADDRESSED IN VMS721H1_SYS-V0100 KIT:
o Unnecessary and unwanted path switches can occur on multipath
devices. Under certain circumstances, if a user executes a
manual path switch of one member of a shadow set, the requested
path switch takes place. However, the other member(s) of the
shadow set switch paths as well. Further, if the user attempts
to switch the other member(s) back, the other members will
switch, but the originally switched member will then switch back
to the unwanted path.
Another symptom of this problem is that a transient error
condition on a multipath device can cause a path switch, even
though the current path is still valid.
This problem can occur if a multipath disk device is simultaneously
online, i.e., connected, on more than one path.
This configuration is created in any of the following cases:
+ If two Fibre Channel cables are attached to the two host
Fibre Channel ports on an HSG80 controller.
+ If two or more Fibre Channel host bus adapters on the same
OpenVMS host system connect to the same fabric, i.e., the
same Fibre Channel switch or a set of cascaded switches.
+ If two parallel SCSI buses are connected to the two host
ports on an HSZ80 controller.
Images Affected: [SYS$LDR]MULTIPATH_MON.EXE
o A cross process $GETJPI request for security profile (persona)
information, which includes network privileges and rights, can
lead to a SSRVEXCEPT system crash. See the crash dump summary
below:
Bugcheck Type: SSRVEXCEPT, Unexpected system service exception
CPU Type: COMPAQ AlphaServer DS20E 500 MHz
Current Process: NETACP
Current Image: $2$DKA100:[SYS0.SYSCOMMON.][SYSEXE]NETACP.EXE;1
Failing PC: FFFFFFFF.8011657C EXE_STD$CHECK_IMAGE_NAME_C+0033C
Failing PS: 00000000.00000000
Module: PROCESS_MANAGEMENT_MON (Link Date/Time: 13-MAR-2000
13:54:01.61)
Offset: 0001057C
Images Affected:
- [SYS$LDR]PROCESS_MANAGEMENT.EXE
- [SYS$LDR]PROCESS_MANAGEMENT_MON.EXE
o A non-privileged user can access jobs in a batch queue,
regardless of the queue protections. See the comparative
examples below:
$ show queue/full/all unhf_sys$batch ! from privileged account
Batch queue UNHF_SYS$BATCH, idle, on UNHF::
/BASE_PRIORITY=3 /CPUMAXIMUM=00:30:00 /JOB_LIMIT=3 /OWNER=[SYSTEM]
/PROTECTION=(S:M,O:D,G,W:RS) /WSEXTENT=32768 /WSQUOTA=16384
(IDENTIFIER=[SIS_DEVEL,BANNER_SCT],ACCESS=READ+SUBMIT+MANAGE)
Entry Jobname Username Status
----- ------- -------- ------
2719 DUMMY B_JOHNSTONE Holding
Submitted 1-APR-2000 09:16:27.00 /KEEP
/LOG=$1$DUA233:[B_JOHNSTONE].LOG; /NOPRINT /PRIORITY=100
/RESTART=UNHF_SYS$BATCH
File: _$1$DUA321:[B_JOHNSTONE.COM]DUMMY.COM;10
3182 DUMMY B_JOHNSTONE Holding
Submitted 1-APR-2000 14:15:28.41 /KEEP
/LOG=$1$DUA233:[B_JOHNSTONE].LOG; /NOPRINT /PRIORITY=100
/RESTART=UNHF_SYS$BATCH
File: _$1$DUA321:[B_JOHNSTONE.COM]DUMMY.COM;10
$ show queue/full/all unhf_sys$batch ! from non-privileged account
Batch queue UNHF_SYS$BATCH, idle, on UNHF::
/BASE_PRIORITY=3 /CPUMAXIMUM=00:30:00 /JOB_LIMIT=3 /OWNER=[SYSTEM]
/PROTECTION=(S:M,O:D,G,W:RS) /WSEXTENT=32768 /WSQUOTA=16384
(IDENTIFIER=[SIS_DEVEL,BANNER_SCT],ACCESS=READ+SUBMIT+MANAGE)
Entry Jobname Username Status
----- ------- -------- ------
2719 no privilege Holding
3182 DUMMY B_JOHNSTONE Holding
Submitted 1-APR-2000 14:15:28.41 /KEEP
/LOG=$1$DUA233:[B_JOHNSTONE].LOG; /NOPRINT /PRIORITY=100
/RESTART=UNHF_SYS$BATCH
File: _$1$DUA321:[B_JOHNSTONE.COM]DUMMY.COM;10
In this example, the user can see entry 3182, as well as security
information, but cannot see entry 2719.
This also generates the following security alarm:
%%%%%%%%%%% OPCOM 1-APR-2000 15:38:36.66 %%%%%%%%%%% (from node
UNHA at 1-APR-2000 15:38:36.67)
Message from user AUDIT$SERVER on UNHA
Security alarm (SECURITY) on UNHA, system id: 1028
Auditable event: Object access
Event time: 1-APR-2000 15:38:36.65
PID: 2040C464
Source PID: 21012416
Username: R_KENNEY$
Process owner: [R_KENNEY$]
Object class name: QUEUE
Object name: UNHF_SYS$BATCH
Object owner: [0,0]
Object protection: SYSTEM:M, OWNER:D, GROUP:, WORLD:RS
Access requested: READ
Status: %SYSTEM-F-NOPRIV, insufficient privilege or
object protection violation
Images Affected:
- [SYS$LDR]SECURITY.EXE
- [SYS$LDR]SECURITY_MON.EXE
o An AST (asynchronous system trap) could cause process header
expansion while in CHECK_WINDOW_64. The routine runs at IPL0
and could allow an AST while trying to locate a process
section table entry.
Images Affected: [SYS$LDR]SYS$VM.EXE
o A problem with the $TRNLNM code path for INTERLOCKED translations
can cause the service to exit without releasing the logical name
mutex. If the $TRNLNM request or any subsequent kernel mode system
service request made by that process exits with an error status, the
system will crash with a MTXCNTNONZ bugcheck.
If no kernel mode system service request made by that process
exits with an error status, the system will eventually hang,
with some processes in MUTEX wait trying to acquire the
logical name mutex. If some of those processes have already
acquired other mutexes, such as the I/O database mutex and
GSD mutex, there may be other processes in MUTEX wait trying
to acquire those mutexes.
The $TRNLNM bug is exercised by a fairly unusual combination
of circumstances and is more likely to be seen on an SMP
system.
Images Affected: [SYS$LDR]LOGICAL_NAMES.EXE
o CPUSPINWAIT or SPLACKERR crashes could occur when the MMG
spinlock is not released. This fix is required for all
customers who are using the reserved memory feature in SYSMAN.
Images Affected:
- [SYS$LDR]SYSTEM_PRIMITIVES.EXE
- [SYS$LDR]SYSTEM_PRIMITIVES_MIN.EXE
o The system can crash with an INCONSTATE bugcheck in CACHE$MOUNT.
This occurs when a process, usually RAID$SERVER, is attempting
to mount a disk, usually a member of a Raid set. It appears as
if the volume is being mounted twice and the INCONSTATE bugcheck
occurs.
Images Affected: [SYS$LDR]SYS$VCC.EXE
o An incorrect S-float value was calculated in the IEEE multiply
and divide routines.
Images Affected:
- [SYS$LDR]EXCEPTION.EXE
- [SYS$LDR]EXCEPTION_MON.EXE
o An INCONSTATE bugcheck can occur during a RAID unbind
operation.
Images Affected:
- [SYS$LDR]SYS$VCC.EXE
- [SYS$LDR]SYS$VCC_MON.EXE
o A BLKASTCNT crash occurred with Pathworks enqueuing many
locks, all with blocking ASTs (asynchronous system traps).
The crash occurred when one of the locks was dequeued.
Images Affected: [SYS$LDR]LOCKING.EXE
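The $TRNLNM item in the list above describes a service path that exits while still holding the logical name mutex. The Python sketch below illustrates that class of bug in general terms only. It is not the OpenVMS kernel code, and every name in it (name_mutex, table, translate_buggy, translate_fixed) is invented for illustration. The buggy version returns early on one path without releasing the lock; the fixed version releases it on every exit path.

```python
import threading

name_mutex = threading.Lock()       # stands in for the logical name mutex
table = {"SYS$DISK": "DKA100:"}     # hypothetical logical name table


def translate_buggy(name, interlocked=False):
    name_mutex.acquire()
    if interlocked and name not in table:
        return None                 # BUG: exits with the mutex still held
    value = table.get(name)
    name_mutex.release()
    return value


def translate_fixed(name, interlocked=False):
    with name_mutex:                # released on every exit path
        if interlocked and name not in table:
            return None
        return table.get(name)


translate_buggy("NO_SUCH_NAME", interlocked=True)
print("mutex still held after buggy call:", name_mutex.locked())
name_mutex.release()                # clean up so later callers do not block
```

Once the lock has been leaked, every later caller that needs it waits indefinitely, which corresponds to the processes left in MUTEX wait described in that item.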
INSTALLATION NOTES:
This kit requires a system reboot. Compaq strongly recommends that
a reboot be performed immediately after kit installation to avoid
system instability.
If you have other nodes in your OpenVMS cluster, they must also be
rebooted in order to make use of the new image(s). If it is not
possible or convenient to reboot the entire cluster at this time, a
rolling re-boot may be performed.
INSTALLATION INSTRUCTIONS:
Install this kit with the POLYCENTER Software Installation utility
by logging into the SYSTEM account and typing the following at the
DCL prompt:
PRODUCT INSTALL VMS721H1_SYS /SOURCE=[location of Kit]
The kit location may be a tape drive, CD, or a disk directory that
contains the kit.
Additional help on installing PCSI kits can be found by typing
HELP PRODUCT INSTALL at the system prompt.
All trademarks are the property of their respective owners.
This patch can be found at any of these sites:
Colorado Site
Georgia Site
Files on this server are as follows:
dec-axpvms-vms721h1_sys-v0200--4.README
dec-axpvms-vms721h1_sys-v0200--4.CHKSUM
dec-axpvms-vms721h1_sys-v0200--4.pcsi-dcx_axpexe
vms721h1_sys-v0200.CVRLET_TXT