TITLE: OpenVMS ALPSCSI05_071 Alpha V7.1 - V7.1-1H2 SCSI ECO Summary
Modification Date: 19-NOV-98
Modification Type: Updated Kit: Supersedes ALPSCSI04_071
NOTE: An OpenVMS saveset or PCSI installation file is stored
on the Internet in a self-expanding compressed file.
The name of the compressed file will be kit_name-dcx_vaxexe
for OpenVMS VAX or kit_name-dcx_axpexe for OpenVMS Alpha.
Once the file is copied to your system, it can be expanded
by typing RUN compressed_file. The resultant file will
be the OpenVMS saveset or PCSI installation file which
can be used to install the ECO.
Copyright (c) Compaq Computer Corporation 1997, 1998. All rights reserved.
****< CAUTION >*****
**** AlphaServer 8400 and 8200 (TURBOLASER) INSTALLATION WARNING ****
If you are installing this remedial kit on an AlphaServer 8400 or 8200
you MUST make sure your console is at Rev 4.0 or later. Rev 4.0 is
available on the Alpha Firmware Update CDrom V3.7. Installing this kit
on a system that has a console revision earlier than 4.0 WILL RESULT IN
AN UNBOOTABLE SYSTEM. To recover from this situation you will need to
back out the new drivers by either booting from an alternate system disk
then deleting the drivers off your regular disk, or by rebuilding your
regular system disk.
PRODUCT: DIGITAL OpenVMS Alpha
COMPONENT: SCSI Drivers - MKSET.EXE
SYS$DKDRIVER.EXE
SYS$GKDRIVER.EXE
SYS$MKDRIVER.EXE
SYS$PKCDRIVER.EXE
SYS$PKEDRIVER.EXE
SYS$PKJDRIVER.EXE
SYS$PKQDRIVER.EXE
SYS$PKSDRIVER.EXE
SYS$PKTDRIVER.EXE
SYS$PKZDRIVER.EXE
SOURCE: Compaq Computer Corporation
ECO INFORMATION:
ECO Kit Name: ALPSCSI05_071
ECO Kits Superseded by This ECO Kit: ALPSCSI04_071
ALPSCSI03_071
ALPSCSI02_071
ALPSCSI01_071
ECO Kit Approximate Size: 2052 Blocks
Kit Applies To: OpenVMS Alpha V7.1, V7.1-1H1, V7.1-1H2
System/Cluster Reboot Necessary: Yes
Rolling Reboot Supported: Yes
Installation Rating: INSTALL_1
1 - To be installed on all systems running
the listed version(s) of OpenVMS.
Also should be installed on:
Systems requiring Ultra SCSI support for
multihost configurations with a maximum of
three hosts, using the KZPBA-CB adapter.
Systems using single-host UltraSCSI support
for the Digital Personal Workstation 433au.
Kit Dependencies:
The following remedial kit(s) must be installed BEFORE
installation of this kit:
None
In order to receive all the corrections listed in this
kit, the following remedial kits should also be installed:
None
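For disk space planning, the approximate kit size quoted above can be converted from OpenVMS disk blocks to bytes, since one OpenVMS disk block is 512 bytes. The following is a minimal sketch for illustration only; the function name is hypothetical and is not part of the kit.

```python
BYTES_PER_BLOCK = 512  # size of one OpenVMS disk block


def blocks_to_bytes(blocks: int) -> int:
    """Convert a size in OpenVMS disk blocks to bytes."""
    return blocks * BYTES_PER_BLOCK


kit_blocks = 2052  # approximate kit size quoted above
print(f"{kit_blocks} blocks = {blocks_to_bytes(kit_blocks)} bytes")
```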
ECO KIT SUMMARY:
An ECO kit exists for SCSI Drivers (DKDRIVER, GKDRIVER, MKDRIVER,
PKCDRIVER, PKEDRIVER, PKJDRIVER, PKQDRIVER, PKSDRIVER, PKTDRIVER,
and PKZDRIVER) on OpenVMS Alpha V7.1 through V7.1-1H2. This kit
addresses the following problems:
PROBLEMS ADDRESSED IN ALPSCSI05_071:
o The major change in this kit is to prevent the possibility of
the loss of an interrupt. Usually the port is busy enough so
that the loss just shows up as reduced performance because
something else generates an interrupt soon enough. However,
situations arise, such as during mount verification, where
the interrupt loss can show up as a port hang.
Minor changes included cleanup, documentation and an attempt
to make debugging any future problems easier.
o Systems using the KZPAA single-ended narrow SCSI bus
controller to communicate with the SCSI bus may crash, if the
system decides it has to reset the SCSI bus attached to that
controller. The specific crash error code was:
Bugcheck code = 00000215: MACHINECHK, Machine check while in
kernel mode
o A bad autosense data pointer in the Current Unit Control
Block (CUCB) resulted in a NOTFCBFCB system crash.
o A system crash with an INCONSTATE error occurred due to
an attempt to execute two untagged WRITE commands. This
problem can only occur on SMP (multiple CPU) machines and
devices that do not support command Tagged Queuing.
o Four related problems occurred:
1. Shared SCSI bus systems crash with an "INCONSTATE,
Inconsistent I/O data base" error message on the console,
when a node on that shared SCSI bus is shut down and
enters AlphaBIOS.
2. SCSI bus resets cause the system to crash with an
"INCONSTATE, Inconsistent I/O database" message on
the console (crash pc EXE$GEN_BUGCHK_C+0003C).
3. An INCONSTATE, "Inconsistent I/O database" message is
displayed on the console (crash pc EXE$GEN_BUGCHK_C+0003C).
4. A crash with "INVEXCEPTN, Exception while above ASTDEL"
(crash pc ERL_STD$ALLOCEMB_C+00408) occurs.
o If an RZ1CC disk is being used together with a KZPBA adapter,
the system may crash. The crash error is INVALID EXCEPTION
ABOVE ASTDEL with reason access violation (ACCVIO).
o A cluster state transition hangs the cluster after I/O is
lost. The problem is a result of a cluster node leaving
the cluster while it has a served disk mounted.
o A system crash (INVEXCEPTN) exception occurs at SYS$PKEDRIVER+0D338.
A sign extend of an SCDT address was not done for a timeout
parameter. The result was an ACCVIO crash which occurred when
an illegal address was used.
o Memory allocated to internal queues, target mode responses,
and firmware may not be bus accessible. This change tests
for that condition and sets the port offline if it occurs.
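The missing sign extension described in the SYS$PKEDRIVER item above can be illustrated with a small sketch. This is not the driver source: the function names and the sample address are hypothetical, and Python is used only to show the arithmetic. The sketch assumes a 32-bit address with bit 31 set, which must have that bit copied into the upper 32 bits to form the intended 64-bit address; zero extension yields a different address.

```python
def zero_extend32(addr32: int) -> int:
    """Zero-extend a 32-bit value to 64 bits (upper 32 bits stay clear)."""
    return addr32 & 0xFFFFFFFF


def sign_extend32(addr32: int) -> int:
    """Sign-extend a 32-bit value to 64 bits (bit 31 is copied upward)."""
    addr32 &= 0xFFFFFFFF
    if addr32 & 0x80000000:
        return addr32 | 0xFFFFFFFF00000000
    return addr32


scdt32 = 0x81234560  # made-up 32-bit address with bit 31 set
print(f"zero-extended: {zero_extend32(scdt32):016X}")
print(f"sign-extended: {sign_extend32(scdt32):016X}")
```

The two results differ in the upper 32 bits, which is the kind of mismatch that can turn a valid address into an illegal one.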
NEW FUNCTIONALITY INCLUDED IN ALPSCSI04_071:
OpenVMS Alpha Version 7.1-1H1 introduced support for certain Ultra
SCSI devices in Ultra SCSI mode in single host configurations.
Since then, a new StorageWorks Ultra SCSI adapter, the KZPBA-CB,
has been released. This differential adapter is also supported by
OpenVMS Alpha Version 7.1-1H1.
This kit extends the Ultra SCSI support to multihost configurations
with a maximum of three hosts, using the KZPBA-CB adapter. The
Ultra SCSI single-ended adapter, the KZPBA-CA, does not support
multihost configurations.
Table 1 summarizes the Ultra SCSI support provided by OpenVMS,
including support for several significant Ultra SCSI devices. For
information about all Ultra SCSI devices supported by OpenVMS and
about configuring OpenVMS Alpha Ultra SCSI clusters, see the
documents described in Table 3.
Table 1 OpenVMS Alpha Ultra SCSI Support
Configuration/Adapter   Version        Description
---------------------   ------------   ------------------------------
Single host             7.1-1H1        The KZPBA-CA is a single-ended
configurations using                   adapter. The KZPAC Ultra SCSI
the KZPBA-CA                           host RAID controller is also
                                       supported in single host
                                       configurations.

Single host             7.1-1H1        The KZPBA-CB is a differential
configurations using                   adapter. The HSZ70 is also
the KZPBA-CB                           supported in Ultra SCSI mode,
                                       using the KZPBA-CB.

Multihost               7.1-1H1 with   Up to three hosts can share
configurations          this kit       the Ultra SCSI bus. The HSZ70
using the KZPBA-CB                     is also supported on the
                                       multihost bus.
--------------------------------------------------------------
Table 2 OpenVMS Restrictions
Restriction               Comments
-----------------------   ----------------------------------
Firmware for the          Earlier firmware versions do not
KZPBA-CB must be          provide multihost support.
Version 5.53 or higher.

Console firmware must     All console SCSI driver fixes are
be updated with the       included on this CD. This CD also
Alpha Systems Firmware    includes the latest version of the
Update CD V5.1.           KZPBA-CB firmware (V5.53 or
                          higher).

DECevent Version 2.6      Earlier versions of DECevent will
or later is required      display all of the logged data,
for analyzing events      but it will be in hexadecimal
logged by the KZPBA       format only.
port driver.
--------------------------------------------------------------
Table 3 provides pointers to additional documentation for Ultra
SCSI devices and for configuring OpenVMS Alpha Ultra SCSI clusters.
Table 3
Documentation for Configuring OpenVMS Alpha Ultra SCSI Clusters
Topic                       Document            Order Number
-------------------------   -----------------   ------------
SCSI devices that           StorageWorks        EK-ULTRA-CG
support Ultra SCSI          Ultra SCSI
operations and how to       Configuration
configure them              Guidelines

KZPBA-CB Ultra SCSI         KZPBA-CB Ultra      AA-R5XWA-TE
storage adapter             SCSI Storage
                            Adapter Module
                            Release Notes

Multihost SCSI bus          Guidelines for      AA-Q28LB-TK
operation in OpenVMS        DIGITAL OpenVMS
Cluster systems             Cluster
                            Configurations

Systems and devices         OpenVMS Operating   SPD 25.01.xx
supported by OpenVMS        System for Alpha
Version 7.1-1H1             and VAX, Version
                            7.1-1H1 Software
                            Product
                            Description

Multihost SCSI              OpenVMS Cluster     SPD 29.78.xx
support                     Software            or later
--------------------------------------------------------------
Information about StorageWorks Ultra SCSI products is available and
periodically updated on the World Wide Web at the following URL:
http://www.storage.digital.com
OpenVMS software product descriptions are also available and
periodically updated on the World Wide Web at the following URL:
http://www.openvms.digital.com
You will find the software product descriptions under Publications,
a choice on the home page.
o Known Problems:
The following problems have been observed in an Ultra SCSI OpenVMS
Cluster configuration, with multiple hosts using KZPBA-CB adapters
sharing an Ultra SCSI bus. DIGITAL plans to correct these problems
in a future release.
+ In a multihost configuration with a heavy I/O-intensive load, if
you shut down one host and reboot it, it might not be able to
rejoin the cluster. The failure to reboot could occur if the
reboot was an automated response (by means of a command procedure)
to an unrecoverable system error, or if a system manager shut down
a system to perform some type of maintenance and then rebooted it.
The system that was shut down and rebooted starts to reboot but
stops partway through the process. Before the system hangs, it
reports several events, including several CLU$CHECK_INQUIRY
commands.
In a two-node configuration consisting of an AlphaServer 4100
system and an AlphaServer 8400 system, this failure has occurred
on both systems. The following example shows the boot events that
occurred before the system halted:
OpenVMS (TM) Alpha Operating System, Version V7.1-1H1
DECnet-I-LOADED, network base image loaded, version = 05.0C.00
%SMP-I-SECMSG, CPU #01 message: P01>>>START
%SMP-I-CPUBOOTED, CPU #01 has joined the PRIMARY CPU in
multiprocessor operation
%SYSINIT-I- waiting to form or join an OpenVMS Cluster
%VMScluster-I-LOADSECDB, loading the cluster security database
%SCCPUVER-I-INQ, CLU$CHECK_INQUIRY for PKB0 PAC -1
%SCCPUVER-I-INQ, CLU$CHECK_INQUIRY for PKB0 PAC -1
%SCCPUVER-I-CHK, CLU$CHECK_SCSI_CPU for PKB0 PAC -1
%EWA0, BNC(10base2) mode set by console
%CNXMAN, Sending VMScluster membership request to system MSCP
%SCCPUVER-I-INQ, CLU$CHECK_INQUIRY for PKB0 PAC -1
%SCCPUVER-I-INQ, CLU$CHECK_INQUIRY for PKB0 PAC -1
%CNXMAN, Now a VMScluster member -- system OATS
%SCCPUVER-I-CHK, CLU$CHECK_SCSI_CPU for PKB0 PAC -1
+ Workaround:
In cases of shutdowns for system maintenance, schedule them for
a time when the system is not under a heavy I/O load.
o If two hosts attempt to write dump files to the system disk at
the same time and if one of these hosts is an AlphaServer 8400,
the AlphaServer 8400 might not complete writing the dump file
and hang, or it might complete writing the dump file but report
an error message. The following error message has been reported
by the AlphaServer 8400 in both cases:
Failed to send Write to DKn-n-n-n-n
o Simultaneous Booting. When two hosts attempt to boot at the same
time, one succeeds and one may fail. This has been observed in
a two-node configuration and in a three-node configuration.
In a two-node configuration consisting of an AlphaServer 4100
system and an AlphaServer 8400 5/625 system (EV56 chip, 625 MHz
CPU, running T5.1-29 console firmware), the host that fails is
always the AlphaServer 8400 system. If the AlphaServer 8400
reports that it was unable to read the system disk, as shown
in the following message, it does not complete booting:
Failed to send Read to DKn-n-n-n-n
This problem is not specific to Ultra SCSI multihost
configurations. It has been observed on multihost configurations
that use other SCSI interconnects.
PROBLEMS ADDRESSED IN ALPSCSI04_071:
o The DMA Timeout was set at 2 seconds. This value was too
short for some devices. DMA Timeout now uses 2 seconds or the
value of SCDT$L_DMA_TIMEOUT, whichever is greater.
o With pool checking turned on, the system can crash with an
ACCVIO.
o A system will hang if a user buffer address of 0 and a non-zero
byte count are given to GKdriver.
o The pool leak fix was never ported to V7.1.
o The system crashes out of GKdriver if a user specifies an invalid
command descriptor.
o An Access Violation (ACCVIO) crash occurs on SCSI device timeouts.
o A system crash will occur on a class driver call to OTS$MOVE.
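The DMA timeout rule in the first item of this list ("2 seconds or the value of SCDT$L_DMA_TIMEOUT, whichever is greater") amounts to taking a maximum. The sketch below is illustrative only and is not driver code; the function and parameter names are hypothetical, and it assumes the SCDT$L_DMA_TIMEOUT value is expressed in seconds.

```python
DEFAULT_DMA_TIMEOUT_SECONDS = 2  # the former fixed timeout, per the item above


def effective_dma_timeout(scdt_dma_timeout: int) -> int:
    """Return the timeout to use: the larger of 2 seconds and the SCDT value.

    scdt_dma_timeout stands in for SCDT$L_DMA_TIMEOUT (assumed to be seconds).
    """
    return max(DEFAULT_DMA_TIMEOUT_SECONDS, scdt_dma_timeout)


for configured in (0, 2, 10):
    print(f"SCDT value {configured} -> timeout {effective_dma_timeout(configured)}")
```

With this rule a device that needs a longer timeout gets it, while a small or unset value never drops below the original 2 seconds.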
NEW FUNCTIONALITY ADDRESSED IN ALPSCSI03_071:
o UltraSCSI support for KZPBA-CB on Digital Personal Workstation
433au.
PROBLEMS ADDRESSED IN ALPSCSI03_071:
o No ECO corrections are included in this kit.
PROBLEMS ADDRESSED IN ALPSCSI02_071:
o If the HSZ configuration utility HSZTERM has an outstanding
I/O to the HSZ, and Mount Verification occurs, then the system
may crash. This usually happens under high I/O loads.
o If Mount Verification occurs while a DK Device is reporting a
write locked condition, the system will crash with an
INVEXCEPTN Bugcheck.
o Unnecessary Mount Verification for HSZ Unit Attention
Conditions will occur.
o The OpenVMS I/O User's Reference Manual added a new Magnetic
Tape I/O Function IO$_FLUSH in Document Revision 1.5 for Alpha
and revision V6.0 for VAX. This function was not fully
implemented.
o A TZ30 or TKZ50 will come up offline when a system boots on
current versions of SYS$MKDRIVER.
o The class driver queue could become frozen. HSZ devices may go
into mount verify and eventually mount verify timeout after an
HSZ70 failover.
o A Queue Full condition causes unnecessary Mount Verification.
+ If a target returns a Queue Full status, an unnecessary
Mount Verification occurs.
+ In SYS$PKSDRIVER, if a command is reinserted on the device
queue after a Queue Full condition occurs, the I/O will
never complete.
o Depending on the sequence in which the nodes of a cluster are
booted, it is possible for a QLogic adapter to return all zeros
to a target mode inquiry. This causes the initiator which sent
the inquiry to believe that the adapter which replies with all
zeros is a disk.
o Incomplete error log entries occur with devices supplying large
amounts of error information.
o Fatal controller errors occur on the QLogic adapter after a SCSI bus
reset.
o An AlphaServer 4100 may see an invalid exception crash under
heavy I/O loads.
o When booting through an HSZ70 disk, accessed through a QLogic
adapter, with a pass-through tape being accessed (configured)
through the same path, a system crash can occur because of a
corrupt timer queue.
o Ensure that a recycled QBUF does not cause PK$CMD_WAIT_COMPLETION
to return without stalling, which would break the synchronization
between the Queue Manager and the SCDRP thread.
o The register dump routine attempts to dump the contents of 124
ISP registers but there are only 101 register locations that
exist. This has been fixed by moving the statement that
determines REG_FILE_SIZE to the proper place in ISP1020DEF.SDL.
o During mailbox I/O, an unexpected QLogic adapter error crashes
the system unnecessarily.
o Add error handling for the following new status values:
Inv_Entry_Type Dev_Queue_Full SCSI_Phase_Err
No_Sense_Data BDR_Received BDR_Sent
SCAM_Event SCSI_Cmd_Done
o Modify error handling for Data_Overrun and Data_Underrun.
o Occurrences of Selection Timeout will no longer be logged by
the port driver (PKQDRIVER). They will be logged by the
upper-level driver instead.
o AlphaServer 1200 and 4100 machines intermittently crash with
machine checks during the boot sequence.
o Mount Verify is not invoked for some recoverable errors.
o PKSDRIVER falsely reports errors.
o An RMS bugcheck may occur under high I/O loads.
PROBLEMS ADDRESSED IN ALPSCSI01_071 KIT:
o IO$_AUDIO function may crash the system.
o Running HSZTERM while heavy I/O occurs results in an INVEXCEPTN
bugcheck in port driver.
o Fatal drive errors occur during attempts to INIT the Exabyte
8200 tape drive.
o Request Sense data is truncated at 19 bytes.
o Unaligned reads (partial block) to a disk cause corruption of
the EXE$GL_ERASEPB (Erase Pattern Buffer). Since this is
used as a convenient source of zeros by various pieces of code,
it can lead to data corruption.
o If Mount Verification occurs while a DK Device is reporting a
write locked condition, the system will crash with an INCONSTATE
bugcheck.
o Disks go into Mount Verify and never come out.
o Error log entries have an incorrect format.
o Controller errors occur in systems with greater than 4 GB
of memory.
o Controller errors may occur during one- and two-byte transfers.
o A system crash may occur after a bus reset or adapter errors.
o Interaction between an RZ26F disk and an RRD45 CDROM causes I/O
performance degradation, bus resets and mount verifications.
INSTALLATION NOTES:
****< CAUTION >*****
***** AlphaServer 8400 and 8200 (TURBOLASER) INSTALLATION WARNING ****
If you are installing this remedial kit on an AlphaServer 8400 or 8200
you MUST make sure your console is at Rev 4.0 or later. Rev 4.0 is
available on the Alpha Firmware Update CDrom V3.7. Installing this kit
on a system that has a console revision earlier than 4.0 WILL RESULT IN
AN UNBOOTABLE SYSTEM. To recover from this situation you will need to
back out the new drivers by either booting from an alternate system disk
then deleting the drivers off your regular disk, or by rebuilding your
regular system disk.
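The console revision requirement in the caution above can be expressed as a simple comparison. The sketch below is a hypothetical helper, not a tool shipped with the kit; it assumes revision strings of the form "4.0", "V4.0", or "T5.1-29" and compares only the major and minor numbers against the Rev 4.0 minimum. It does not read the revision from the console; that value must be obtained separately.

```python
MINIMUM_CONSOLE_REV = (4, 0)  # minimum console revision from the caution above


def parse_revision(text: str) -> tuple:
    """Turn a string such as '4.0', 'V4.0', or 'T5.1-29' into (major, minor)."""
    cleaned = text.strip().lstrip("VvTt").split("-")[0]
    return tuple(int(part) for part in cleaned.split("."))


def console_is_safe(revision: str) -> bool:
    """Return True if the revision meets the Rev 4.0 minimum."""
    return parse_revision(revision) >= MINIMUM_CONSOLE_REV


for rev in ("3.9", "4.0", "T5.1-29"):
    status = "OK" if console_is_safe(rev) else "too old"
    print(f"console {rev}: {status}")
```

Comparing the parsed numbers as a tuple avoids the string-comparison trap in which "4.10" would sort before "4.2".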
The images in this kit will not take effect until the system is
rebooted. If there are other nodes in the VMScluster, they must
also be rebooted in order to make use of the new image(s).
If it is not possible or convenient to reboot the entire cluster at
this time, a rolling reboot may be performed.
Copyright (c) Digital Equipment Corporation 1997, 1998. All rights reserved.
Modification Date: 15-JUN-1998
Modification Type: DOCUMENTATION: Technical Information:
Added V7.1-1H2 Information
****< CAUTION >*****
**** AlphaServer 8400 and 8200 (TURBOLASER) INSTALLATION WARNING ****
If you are installing this remedial kit on an AlphaServer 8400 or 8200
you MUST make sure your console is at Rev 4.0 or later. Rev 4.0 is
available on the Alpha Firmware Update CDrom V3.7. Installing this kit
on a system that has a console revision earlier than 4.0 WILL RESULT IN
AN UNBOOTABLE SYSTEM. To recover from this situation you will need to
back out the new drivers by either booting from an alternate system disk
then deleting the drivers off your regular disk, or by rebuilding your
regular system disk.
PRODUCT: DIGITAL OpenVMS Alpha
COMPONENT: SCSI Drivers - MKSET.EXE
SYS$DKDRIVER.EXE
SYS$GKDRIVER.EXE
SYS$MKDRIVER.EXE
SYS$PKCDRIVER.EXE
SYS$PKEDRIVER.EXE
SYS$PKJDRIVER.EXE
SYS$PKQDRIVER.EXE
SYS$PKSDRIVER.EXE
SYS$PKTDRIVER.EXE
SYS$PKZDRIVER.EXE
SOURCE: Digital Equipment Corporation
ECO INFORMATION:
ECO Kit Name: ALPSCSI04_071
ECO Kits Superseded by This ECO Kit: ALPSCSI03_071
ALPSCSI02_071
ALPSCSI01_071
ECO Kit Approximate Size: 1872 Blocks
Kit Applies To: OpenVMS Alpha V7.1, V7.1-1H1, V7.1-1H2
System/Cluster Reboot Necessary: Yes
Rolling Reboot Supported: Yes
Installation Rating: 3 - To be installed on all systems running
the listed versions of OpenVMS which
are experiencing the problems described.
Also should be installed on:
Systems requiring Ultra SCSI support for
multihost configurations with a maximum of
three hosts, using the KZPBA-CB adapter.
Systems using single-host UltraSCSI support
for the Digital Personal Workstation 433au.
Kit Dependencies:
The following remedial kit(s) must be installed BEFORE
installation of this kit:
None
In order to receive all the corrections listed in this
kit, the following remedial kits should also be installed:
None
Files on this server are as follows:
»alpscsi05_071.README
»alpscsi05_071.CHKSUM
»alpscsi05_071.CVRLET_TXT
»alpscsi05_071.a-dcx_axpexe