TITLE: OpenVMS ALPSYSA02_062 Alpha V6.2 SYSTEM ECO Summary
Modification Date: 06-JUN-2000
Modification Type: Documentation Update:
The list of kits that must be installed before or with
this kit has been updated.
NOTE: An OpenVMS saveset or PCSI installation file is stored
on the Internet in a self-expanding compressed file.
For OpenVMS savesets, the name of the compressed saveset
file will be kit_name.a-dcx_vaxexe for OpenVMS VAX or
kit_name.a-dcx_axpexe for OpenVMS Alpha. Once the OpenVMS
saveset is copied to your system, expand the compressed
saveset by typing RUN kit_name.a-dcx_vaxexe or
RUN kit_name.a-dcx_axpexe.
For PCSI files, once the PCSI file is copied to your system,
rename the PCSI file to kitname-dcx_axpexe.pcsi. It can then
be expanded by typing RUN kitname-dcx_axpexe.pcsi. The resulting
file is the PCSI installation file, which can be used to install
the ECO.
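As a rough illustration of the saveset naming convention in the note
above, the following Python sketch builds the compressed saveset file
name for a kit. The function name and the "vax"/"alpha" platform keys
are hypothetical, introduced only for this example; the suffixes are
the ones given in the note.

```python
# Suffixes taken from the note above; the dictionary keys are illustrative.
SAVESET_SUFFIX = {
    "vax": ".a-dcx_vaxexe",
    "alpha": ".a-dcx_axpexe",
}

def compressed_saveset_name(kit_name: str, platform: str) -> str:
    """Return the name of the self-expanding compressed saveset file."""
    return kit_name.lower() + SAVESET_SUFFIX[platform.lower()]

if __name__ == "__main__":
    print(compressed_saveset_name("ALPSYSA02_062", "alpha"))
    # prints: alpsysa02_062.a-dcx_axpexe
```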
Copyright (c) Compaq Computer Corporation 1998, 1999. All rights reserved.
PRODUCT: OpenVMS Alpha
COMPONENTS: ERRORLOG
EXCEPTION
EXEC_INIT
IMAGE_MANAGEMENT
IO_ROUTINES
IO_ROUTINES_MON
LOCKING
MESSAGE_ROUTINES
PROCESS_MANAGEMENT
SYS$BASE_IMAGE
SYS$CLUSTER
SYS$PUBLIC_VECTORS
SYS$SSISHR
SYS$VCC
SYS$VCC_MON
SYS$VM
SYSLDR_DYN
SYSTEM_PRIMITIVES
EXCEPTION.STB
LOCKING.STB
SYS$CLUSTER.STB
SOURCE: Compaq Computer Corporation
ECO INFORMATION:
ECO Kit Name: ALPSYSA02_062
ECO Kits Superseded by This ECO Kit: ALPSYSA01_062
ALPSYS14_062
ALPSYS12_062
ALPSYS08_070
ALPSYS07_070
ALPSYS07_062
ALPSYS02_062
ALPSYSL02_070 for OpenVMS
Alpha V6.2 *ONLY*
ECO Kit Approximate Size: 5904 Blocks
Kit Applies To: OpenVMS Alpha V6.2 through V6.2-1H3
System/Cluster Reboot Necessary: Yes
Rolling Re-boot Supported: Yes
Installation Rating: INSTALL_1
1 - To be installed on all systems running
the listed version(s) of OpenVMS.
Kit Dependencies:
The following remedial kit(s) must be installed BEFORE
installation of this kit:
ALPCLUSIO01_062 and ALPY2K02_062
In order to receive all the corrections listed in this
kit, the following remedial kits should also be installed:
ALPF11X05_062
For an explanation of the cause for the following dependencies, see
the MME-related problem descriptions under the section titled
"PROBLEMS ADDRESSED IN ALPSYSA02_062 KIT":
ALPBACK02_062
ALPDISM02_062
ALPINIT01_062
ALPMOUN04_062
ALPMTAA02_062
ECO KIT SUMMARY:
An ECO kit exists for SYSTEM Components on OpenVMS Alpha V6.2 through
V6.2-1H3. This kit addresses the following problems:
Problems Addressed in ALPSYSA02_062:
o Previous remedial kits that replaced SYS$BASE_IMAGE.EXE did not
disable MOVEFILE on the image. This could cause a problem if
third-party defragmentation software moves
SYS$BASE_IMAGE.EXE.
This kit disables MOVEFILE on the SYS$BASE_IMAGE.EXE image.
o A process using MME could potentially "miss" the VOL1 label on
a tape. Also, a process could "hang" trying to send a message
to the MME process.
This problem can occur in several different areas of the
operating system. To get the full implementation of
this MME fix, the following remedial kits (or the kits that
supersede them) should also be installed:
o ALPBACK03_071
o ALPDISM01_071
o ALPINIT01_071
o ALPMOUN05_071
o ALPMTAA01_071
It is not necessary to install these kits at the same time, but
until they are installed you may still experience this problem.
o A system crash can occur during host-based RAID unbinds
with MME code enabled. A mailbox read synchronization problem
causes the crash.
This problem occurs only when a host-based RAID UNBIND command
is issued while an MME-based application is running.
This problem can occur in several different code areas of the
operating system. To eliminate all instances of this
problem, the following remedial kits (or the kits that supersede
them) will also need to be installed:
o ALPBACK03_071
o ALPDISM01_071
o ALPINIT01_071
o ALPMOUN05_071
o ALPMTAA01_071
o If a system has been up for 497.1 days without rebooting, the
system cell EXE$GL_ABSTIM_TICS (number of 10 millisecond tics
since boot) will overflow. This problem can cause some
processes to remain indefinitely in the RWMPB or COMO
scheduling state.
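The 497.1-day figure is consistent with an unsigned 32-bit counter
that is incremented once every 10 milliseconds. The Python sketch
below checks that arithmetic. The 32-bit width is an assumption
inferred from the figure, not something stated in this summary, and
the names are illustrative only.

```python
# Assumption: EXE$GL_ABSTIM_TICS behaves as an unsigned 32-bit cell
# (inferred from the 497.1-day figure; the width is not stated here).
COUNTER_BITS = 32
TIC_SECONDS = 0.010            # one tic = 10 milliseconds (from the text)
SECONDS_PER_DAY = 24 * 60 * 60

def days_until_overflow(bits: int = COUNTER_BITS,
                        tic: float = TIC_SECONDS) -> float:
    """Return the uptime, in days, at which a counter of `bits` bits wraps."""
    return (2 ** bits) * tic / SECONDS_PER_DAY

if __name__ == "__main__":
    print(f"{days_until_overflow():.1f} days")   # prints: 497.1 days
```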
o A crash occurs, with a PGFIPLHI "Pagefault with IPL too high"
bugcheck, in SYS$VM_PRO+15B0 in the SYS$ADJWSL system service.
The crash occurs because the code page for
SYS$ADJWSL was removed from the system working set.
o The performance counter cell PMS$GL_NPAGDYNEXPS was never
incremented above its initial value of zero. It can be
displayed with the SDA> CLUE MEM/STAT command.
o If the system is temporarily out of Lock IDs and there are
currently no free pages to expand the Lock ID table, then
SYS$VCC could crash with an INCONSTATE bugcheck at
SYS$VCC_NPRO+09700.
o If a page being deleted with the $DELTVA or $DELTVA_64 system
service is a global page with I/O still active, the process can
enter the RWAST scheduling state. Because of a deadlock, it can
remain in the RWAST state indefinitely. When
this problem occurs, all disk I/O for the entire VMScluster can
be hung. The problem has been seen mostly at large ALL-IN-1
sites.
The problem can be detected through the use of the
ANALYZE/SYSTEM utility, by issuing a SHOW PROCESS/REGISTER
command on a process in the RWAST scheduling state. If the PC
register indicates an address in the SYS$VM image or the
MMG$DELPAG or MMG$DELPAG_64 routine and the PS register
indicates IPL 2, then the problem is present.
This update requires a FULL BUILD because of the change
to [LIB]MMGDEF.SDL. This change also defines some new flags,
used only in this update, which must be obtained from a library
not contained within the SYS facility.
o $DEVICE can return the name of a dual-pathed SCSI disk twice.
These disks have two UCBs, which differ in ways that make it
possible to distinguish the primary UCB from the alternate UCB. The
latter can be filtered out by not returning the
name of a UCB with the following characteristics:
1. bits 2P (dual path), CDP (non-preferred path) and SCSI set
in UCB$L_DEVCHAR2;
2. UCB$L_2P_ALTUCB non-zero (pointer to the other UCB).
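The filtering rule above can be sketched in Python as follows. This
is only a model of the logic, not the code in the kit: the bit values
are placeholders (the real mask values in UCB$L_DEVCHAR2 are not given
in this summary), the Ucb class and function names are hypothetical,
and the sketch assumes that both listed characteristics must hold for
a UCB to be skipped.

```python
from dataclasses import dataclass

# Placeholder bit values, chosen only for illustration; the real masks
# for UCB$L_DEVCHAR2 are not given in this summary.
DEV_M_2P = 0x1     # dual-pathed device
DEV_M_CDP = 0x2    # non-preferred (alternate) path
DEV_M_SCSI = 0x4   # SCSI device
ALT_MASK = DEV_M_2P | DEV_M_CDP | DEV_M_SCSI

@dataclass
class Ucb:
    name: str
    devchar2: int       # stands in for UCB$L_DEVCHAR2
    alt_ucb: int = 0    # stands in for UCB$L_2P_ALTUCB (0 = none)

def is_alternate_path(ucb: Ucb) -> bool:
    """True when all three bits are set and the other-UCB pointer is non-zero."""
    return (ucb.devchar2 & ALT_MASK) == ALT_MASK and ucb.alt_ucb != 0

def device_names(ucbs):
    """Return device names, skipping alternate-path UCBs."""
    return [u.name for u in ucbs if not is_alternate_path(u)]

if __name__ == "__main__":
    ucbs = [
        Ucb("DKA100", DEV_M_2P | DEV_M_SCSI, alt_ucb=0x8000),  # primary
        Ucb("DKA100", ALT_MASK, alt_ucb=0x7000),               # alternate
        Ucb("DKB200", DEV_M_SCSI),                             # single path
    ]
    print(device_names(ucbs))   # prints: ['DKA100', 'DKB200']
```

With the alternate UCB skipped, the dual-pathed disk in this example
is reported once rather than twice.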
o The system will not write out a crash dump.
Problems Addressed in ALPSYSA01_062:
o Updates to an application ACE get lost. Customer code locks the
ACL, reads its ACE, updates a count field, rewrites the ACE,
and unlocks the ACL. The change to the count gets lost. To
get the full fix, you must also install the
ALPF11X03_062 remedial kit.
o OpenVMS Alpha programs calling EXE$ALOPHYCNTG when insufficient
SPTEs are available can result in a loss of free memory.
o SSRVEXCEPT crash in SYS$NETWORK_SERVICES.EXE with NET$ACP as
the current process (and image).
o A Lock Manager deadlock search should either find and break a
deadlock or determine that there is no deadlock and remove the
lock at the head of the lock timeout queue. Although there are
valid reasons for aborting a deadlock search
and retrying it later, in the case described here an aborted
search was not the appropriate action.
Each second, a deadlock search was started and aborted shortly
thereafter, with the original lock being left at the head of
the timeout queue. This problem caused continual retry
attempts to perform a deadlock search on the same lock.
Problems Addressed in ALPSYS14_062:
o The wrong kit name was used in the "Problems Addressed" section
of the ALPSYS11_062 kit.
Problems Addressed in ALPSYS12_062:
o User-created protected subsystems with subsystem identifiers
granted to executable images fail to work properly in
manipulating queues via $SNDJBC[W]. Although the image has
the subsystem identifier granted, a NOPRIV error is returned.
Problems Addressed in ALPSYS11_062:
o Updates to an application ACE get lost. The customer code
locks the ACL, reads its ACE, updates a count field,
rewrites the ACE, and unlocks the ACL. Changes to the count
get lost.
o OpenVMS Alpha systems could crash when cleaning up pending ASTs
during rundown, with the address of the next packet pointing to
a bogus value.
o OpenVMS could create processes in the same group UIC with the
same process name. On OpenVMS VAX systems, SS$_NOSLOT could be
returned when one process entry slot is left, the last one
used.
o A potential problem exists when a batch job process termination
message is sent to the JOB_CONTROL process. If the Job
Controller's mailbox is full at the time the message is sent,
the message could be dropped and lost. The result could be
that SHOW QUEUE shows "executing" jobs with no associated
process on the system.
Problems Addressed in ALPSYS10_062:
o System crash due to misinterpretation of a Page Table Entry (PTE).
This problem was corrected in OpenVMS V7.0.
o Using the SYS$SET_SECURITY or SYS$CHANGE_ACL system
services to protect File Objects gives inconsistent results,
protecting the file sometimes and failing at other times.
As a result, applications that use the File Object ACL and ACE
data cannot always guarantee synchronized access to the file. The
problem was fixed in OpenVMS V7.1.
Problems Addressed in ALPSYS09_062:
o A serial console on a Turbolaser (as opposed to a remote or LAT
console) may see device timeouts after issuing commands such as
$ DIRECTORY or $ SHOW DEVICE.
Problems Addressed in ALPSYS03_062:
o A privilege violation occurs when an unprivileged program
attempts to use the MSCP Express I/O modifier.
This problem is corrected in OpenVMS Alpha V7.0.
o Occasional crashes in IOC$DISMOUNT with INCONSTATE or
SSRVEXCEPT.
This problem is corrected in OpenVMS Alpha V7.0.
Problems Addressed in ALPSYS08_070 for OpenVMS Alpha V6.2 through V6.2-1H3:
o There is a remote possibility that a system could experience a
PGFIPLHI crash. The crash occurs because too few system pages
are locked into the system working set, which in turn happens
because the number of pages to be locked or unlocked is
calculated incorrectly.
This problem is more likely to be encountered after the
installation of the ALPF11X03_070 remedial kit or a kit that
supersedes it.
Problems Addressed in ALPSYS07_070 for OpenVMS Alpha V6.2 through V6.2-1H3:
o PGFIPLHI crash when executing the command
$ DIR DEVICE:[*...]*.*.*
Problems Addressed in ALPSYS05_070 for OpenVMS Alpha V6.2 through V6.2-1H3:
o Using "SHOW PROCESS/CONTINUOUS" on a process that makes
frequent calls to $HIBER and $WAKE can result in the process
getting stuck in HIB.
Problems Addressed in ALPSYS01_070 for OpenVMS Alpha V6.2 through V6.2-1H3:
o On an EV5 system, it is possible for some ASTs (Asynchronous
System Traps) to be delivered to their process in a different
order from that in which they were queued.
o User-created protected subsystems with subsystem identifiers
granted to executable images fail to work properly in
manipulating queues via $SNDJBC[W]. Although the image has the
subsystem identifier granted, a NOPRIV error is returned.
Problems Addressed in ALPSYS06_062:
o System crashes with BADPAGFLD.
Problems Addressed in ALPSYS05_062:
o A BADPRCPGFLC bugcheck can occur in a long-running process.
Problems Addressed in ALPSYS02_070 for OpenVMS Alpha V6.2 through V6.2-1H3:
o Under rare conditions, a queue corruption causes the system to
hang or crash while trying to walk the COMO queues. The typical
footprint has a KTB structure in the COLPG wait queue that has
been prematurely pulled out of its appropriate inswap COMO
queue. Along with this corruption, the SWAPPER will appear to
be CUR in a SHO SYS display, although no CPU in the
configuration recognizes it as the active thread.
Problems Addressed in ALPSYS07_062:
o System crashes with one of the following footprints:
INCONSTATE SYS$VCC_NPRO+00009F04
INCONSTATE SYS$VCC_NPRO+00009F00
INCONSTATE VAXCLUSTER_CACHE+03EB1
Problems Addressed in ALPSYS02_062:
o SYS$ENQ System Service call/program hangs due to the event flag
not being set.
o Due to an image build problem, the LOCKING.EXE image included
in the ALPSYS01_062 kit did not contain the problem fix it
should have.
Problems Addressed in ALPSYS01_062:
o Under certain conditions, a fork lock used by the virtual I/O
cache may be created with an incorrect length. This results in
unsynchronized data access which can cause corruption.
INSTALLATION NOTES:
The images in this kit will not take effect until the system is
rebooted. If there are other nodes in the VMScluster, they must
also be rebooted in order to make use of the new images.
If it is not possible or convenient to reboot the entire cluster at
this time, a rolling re-boot may be performed.
This patch can be found at any of these sites:
Colorado Site
Georgia Site
Files on this server are as follows:
alpsysa02_062.README
alpsysa02_062.CHKSUM
alpsysa02_062.CVRLET_TXT
alpsysa02_062.a-dcx_axpexe