OpenVMS ALPSCSI04_070 Alpha V7.0 SCSI ECO Summary
NOTE: An OpenVMS saveset or PCSI installation file is stored
on the Internet in a self-expanding compressed file.
The name of the compressed file will be kit_name-dcx_vaxexe
for OpenVMS VAX or kit_name-dcx_axpexe for OpenVMS Alpha.
Once the file is copied to your system, it can be expanded
by typing RUN compressed_file. The resultant file will
be the OpenVMS saveset or PCSI installation file which
can be used to install the ECO.
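As an illustration of the note above, the following sketch expands the
Alpha file for this kit. The file name is taken from the file list for
this kit. The directory DKA100:[KITS] is a hypothetical location, and the
VMSINSTAL step assumes that the expanded file is a VMSINSTAL saveset named
ALPSCSI04_070.A in that directory.

```dcl
$ ! DKA100:[KITS] is a hypothetical directory holding the copied file.
$ SET DEFAULT DKA100:[KITS]
$ ! Expand the self-expanding compressed file.
$ RUN ALPSCSI04_070.A-DCX_AXPEXE
$ ! Assumption: the result is a VMSINSTAL saveset, ALPSCSI04_070.A.
$ @SYS$UPDATE:VMSINSTAL ALPSCSI04_070 DKA100:[KITS]
```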
Copyright (c) Digital Equipment Corporation 1995, 1997. All rights reserved.
**CAUTION**
** AlphaServer 8400 and 8200 (TURBOLASER) INSTALLATION WARNING **
If you are installing this remedial kit on an AlphaServer 8400 or 8200
you MUST make sure your console is at Rev 4.0 or later. Rev 4.0
is available on the Alpha Firmware Update CD-ROM V3.7. Installing
this kit on a system that has a console revision earlier than 4.0
WILL RESULT IN AN UNBOOTABLE SYSTEM. To recover from this situation
you will need to back out the new drivers by either booting from an
alternate system disk then deleting the drivers off your regular disk,
or by rebuilding your regular system disk.
PRODUCT: OpenVMS Alpha
COMPONENTS: SCSI Drivers - MKSET.EXE
MKSETCLD.CLD
SYS$DKDRIVER.EXE
SYS$GKDRIVER.EXE
SYS$MKDRIVER.EXE
SYS$PKCDRIVER.EXE
SYS$PKEDRIVER.EXE
SYS$PKJDRIVER.EXE
SYS$PKQDRIVER.EXE
SYS$PKSDRIVER.EXE
SYS$PKTDRIVER.EXE
SYS$PKZDRIVER.EXE
SOURCE: Digital Equipment Corporation
ECO INFORMATION:
ECO Kit Name: ALPSCSI04_070
ECO Kits Superseded by This ECO Kit: ALPSCSI03_070
ALPSCSI02_070 (V7.0 Only)
ALPSCSI01_070 (V7.0 Only)
ECO Kit Approximate Size: 1720 Blocks
Kit Applies To: OpenVMS Alpha V7.0
System/Cluster Reboot Necessary: Yes
Installation Rating: 1 - To be installed on all systems running
the listed version(s) of OpenVMS.
NOTE: In order to receive the full fixes listed in this kit,
the following remedial kits also need to be installed:
None
ECO KIT SUMMARY:
An ECO kit exists for SCSI Drivers on OpenVMS Alpha V7.0. This kit
addresses the following problems:
Problems Addressed in the ALPSCSI04_070 Kit:
o If the HSZ configuration utility, HSZTERM, has an outstanding I/O
to the HSZ, and Mount Verification occurs, then the system may
crash. This usually happens under high I/O loads.
o When the new Quantum Atlas 2 disk drives are mounted in a cluster
running Alpha OpenVMS V6.2-1H3, the system can enter an indefinite
loop at mount verification, with each host issuing MODE SELECT
commands.
o If Mount Verification occurs while a DK Device is reporting a write
locked condition, the system will crash with an INVEXCEPTN Bugcheck.
o Unnecessary Mount Verification for HSZ Unit Attention Conditions.
o The OpenVMS I/O User's Reference Manual added a new Magnetic Tape
I/O Function IO$_FLUSH in Document Revision 1.5 for Alpha and
revision 6.0 for VAX. This function was not fully implemented.
o A TZ30 or TKZ50 will come up offline when a system boots on current
versions of SYS$MKDRIVER.
o The class driver queue could become frozen.
HSZ devices may go into mount verify and eventually mount verify
timeout after an HSZ70 failover.
o If a target returns a Queue Full status, an unnecessary Mount
Verification occurs.
In SYS$PKSDRIVER, if a command is reinserted on the device queue
after a Queue Full condition occurs, the I/O will never complete.
o RZ28B devices are not recognized by AUTOCONFIGURE.
o An AlphaServer 4100 may see an invalid exception crash under heavy
I/O loads.
o Mount Verify not invoked for some recoverable errors.
Problems Addressed in the ALPSCSI03_070 Kit:
o Get or set volume does not work if CDROM_AUDIO.C is used.
CDROM_AUDIO.C is a sample program in SYS$EXAMPLES which shows
how to use the audio functions supported by DKDRIVER (a SCSI
disk class driver). The program logs CHECK CONDITIONS and
fatal drive errors.
o Mount fails on some devices.
o Some non-Digital disks cannot be accessed by DKDRIVER due
to "invalid mode sense" errors.
o Tagged Command Queuing cannot be disabled at the drive level.
o The Fujitsu M2512A drive does not work on OpenVMS Alpha.
o COPY/WRITE_CHECK fails to return an error when a known bad
block is written.
o DKDRIVER does not properly support non-512 block devices.
o Errors are logged when mounting some disks.
o Unformatted floppies fail during format attempts.
o Certain characteristics, such as mode sense 10 and TCQ, cannot
be permanently disabled.
o Some third party SCSI-2 disks fail during data check operations.
o A problem may occur during configuration of SCSI devices.
o Mount verification occurs repeatedly with no error log
entries to explain why.
o Incorrect access to the mode page value for the WCE bit in the
Caching mode page may occur.
o Miscalculation of the DMA timeout value may occur.
o The maximum usable disk size is 8.6 GB. Disk drives with a
capacity greater than 8.6 GB are not fully utilized.
o During a datacheck on SYS$PKEDRIVER, the ports may crash.
This occurs because an attempt is made to read the autosense
buffer after it has been deallocated.
o Recoverable errors on disks are treated as fatal except
for "data recovered" errors of some types.
o Some recoverable errors were being treated as successful,
which could lead to data corruption.
o Deferred errors leave I/O incomplete and no errors are
reported. This can lead to undetected errors in disk I/O.
o Geometry changes occur during packet acknowledgments (packack)
which causes unexpected behavior in serving on OpenVMS clusters.
o Two names and two paths appear for SCSI disks when one side
of a shared bus configures before the other and the MSCP path
to the disk is seen first. This causes problems because when
F$DEVICE finds both, host-based RAID does not work, quorum
disks do not function correctly, and the local path is not
used when it is available and otherwise would be used.
o SPI$CMD_BUFFER_ALLOC and SPI$BUFFER_MAP calls to port drivers
can return error codes instead of allocating or mapping
buffers. The port drivers crash, but the class driver is
the root of the problem: the class driver code does not check
for these error codes and continues to use the pointers in the
SCDRP and other structures as though they were valid. At least
one crash has definitely been traced to this, and several other
mysterious crashes may be related. The result can be pool
corruption or, in some cases, disk corruption.
o POOLCHECK crashes while disks are being mounted.
o A Burns platform (Alphabook 1xxx/4xxx) system disk (IBM DPRS)
is corrupted by INIT commands, by ANALYZE/DISK/REPAIR, or by
continued use.
o Badblock revectoring delivers incorrect negative block numbers
to the disk to be revectored. These requests are rejected, which
means that bad blocks are not revectored correctly.
o The FORCE_ERROR routine that is used to force errors on certain
blocks (so all shadowset members have the same error block
numbers) is incorrectly overwriting the boot block instead of the
block selected.
o IO$_AUDIO function may crash the system.
o Running HSZTERM while heavy I/O occurs results in an
INVEXCEPTN bugcheck in the port driver.
o Third-party archivers and Desktop Backup, which create
non-ANSI tapes, can see SS$_TAPEPOSLOST and SS$_DATAOVERUN
errors when they are positioning the tape.
A new utility, SYS$ETC:MKSET.EXE, can parse DCL commands and
generate these requests. It requires PHY_IO privilege since
IO$_SETCHAR is a physical I/O function; therefore, it cannot be
run by nonprivileged users.
The utility can be used by enabling the MKSET command:
$ SET COMMAND MKSET
The syntax of the command is:
$ MKSET/'qualifier' MKcuuu:
There are three qualifiers for the MKSET command:
NEVER - Never use the new SKIPFILE functionality.
ALWAYS - Always use the new SKIPFILE functionality.
PER_IO - Allow utilities such as BACKUP and DUMP to
enable and disable the SKIPFILE functionality.
Likewise, user programs can enable the SKIPFILE
functionality with the IO$M_ALLOW_FAST modifier
for the IO$_SKIPFILE function.
For more information on this utility see:
SYS$ETC:MKSET.TXT
o Fatal drive errors occur during attempts to INIT the Exabyte
8200 tape drive.
o Request Sense data is truncated at 19 bytes.
o If a Queue Full status is returned by a target, a MEDOFL
status is returned by the Class Driver. This causes Mount
Verification and an unnecessary SYSTEM-W-NOTQUEUED errorlog
entry.
o Unaligned reads (partial block) to a disk cause corruption of
the EXE$GL_ERASEPB (Erase Pattern Buffer). Since this is
used as a convenient source of zeros by various pieces of code,
it can lead to data corruption.
o If Mount Verification occurs while a DK Device is reporting a
write locked condition, the system will crash with an INCONSTATE
bugcheck.
o Disks go into Mount Verify and never come out.
o Error log entries have an incorrect format.
o Controller errors occur in systems with greater than 4 GB
of memory.
o Controller errors may occur during one- and two-byte transfers.
o A system crash may occur after a bus reset or adapter errors.
o An RZ74 will not mount if the disk is not already spinning.
o Devices that require longer DMA and disconnect timeouts cannot
be used until a fixed driver is supplied.
o The mechanism for disabling SDTR, which was available in
ALPSCSI02_070, was not documented in that kit.
o Shadow copies and merges involving SCSI-attached disks may
cause a system crash.
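As an illustration of the MKSET syntax described above for the
ALPSCSI03_070 kit, the following sketch selects the PER_IO setting for
one tape drive. The device name MKA500: is a hypothetical example;
substitute your own tape device. The sketch assumes that the account is
authorized for the PHY_IO privilege.

```dcl
$ ! MKSET requires PHY_IO; this assumes the account is authorized for it.
$ SET PROCESS/PRIVILEGES=PHY_IO
$ ! Enable the MKSET command, as shown in the kit description.
$ SET COMMAND MKSET
$ ! MKA500: is a hypothetical tape device name.
$ MKSET/PER_IO MKA500:
```

With PER_IO, utilities such as BACKUP and DUMP decide for each request
whether to use the SKIPFILE functionality. Use NEVER or ALWAYS to fix
the behavior for the drive instead.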
Problems Addressed in the ALPSCSI02_070 Kit:
o Extended Sense Data from the HSZ40 is truncated to about 20
bytes. This provides too little information to determine
when a RAID set member fails.
o Premature command timeouts and SCSI bus resets may occur on
SMP systems. Occasionally, the SCSI bus resets will cause
a system crash. This problem occurs on SMP machines with the
KZMSA adapter installed on DEC 7000 and AlphaServer 8000
machines or the Adaptec AHA-1740/1742 adapter installed on
AlphaServer 2100 machines.
o The system can crash due to the driver having multiple bad
block threads running at the same time.
o Memory may be exhausted with BUFIO data structures.
o In a two-node SCSI cluster, shutting down one node can cause
the surviving node to hang, especially if the system disk is
the only disk on the bus.
o SDTR (Synchronous Data Transfer Request) negotiations occur on every
command issued through the IO$_DIAGNOSE QIO function. This
can result in degradation of system performance.
o Some SCSI-1 devices will become inoperative if they get SDTR
negotiation messages.
o Preventing SDTR negotiations may crash the system.
o System crashes may intermittently occur due to bugchecks
(INCONSTATE) in PKEDRIVER when the bus state is unknown.
o Some SCSI-1 devices generate phase errors with the SCSI-2
driver.
o The following DEVICE ERROR may appear in the error log on
Alpha 8400 and 8200 systems with SCSI disks connected to a
KFTIA (ITIOP) IO module:
ENTRY TYPE - Device Error
VMS SCSI Error type - Send SCSI Command Failed
Port status - Unknown Port Status (hex value is 32C)
o PKSDRIVER may crash with an ACCVIO BUG_CHECK when a second SCSI
cluster node boots.
o An insufficient number of queue elements are available on the
Adapter Driver Free Queue (ADFQ). These elements are used
during SCSI bus reset processing. During heavy SCSI bus
cluster traffic, the current number of free queue elements
may run out.
o An INCONSTATE system crash may occur due to double deallocation
of map registers.
o DIAGNOSE reports unusual error log information for KZPSA
errors.
o The KZPSA takes 6 seconds to initialize. This time could be
reduced.
Problems Addressed in the ALPSCSI01_070 Kit:
o A problem occurs on a check condition. When the request sense
command is issued, both the condition code returned from SYS$QIO and
the condition code returned in IOSB (after synchronization by
SYS$SYNCH) indicate success. Also, the correct sense data block is
transferred to the address specified in S2DGB$L_32DATADDR.
Unfortunately the byte count in the IOSB is zero instead of the
actual transfer length.
o Serious performance degradation may occur with devices that use
GKDRIVER.
o Some SCSI devices that provide parameters cannot be used. They
cause controller errors when in fact nothing is wrong.
o Tapes, especially the TZ87, run so slowly during COPY that they
appear to be hung.
o During BACKUP, the TLZ6L (TLZ06 with autoloader) and TLZ7L can take
so long to rewind the current tape and load the next tape that a
SCSI command timeout error occurs and the backup aborts.
o A device at target ID 0 can be lost after a SCSI bus reset by
PKSDRIVER.
o Device errors may occur on KZPSA devices.
Problems Addressed in the ALPSCSI01_070 Kit:
o SCSI $QIO(IO$_DIAGNOSE) for the write functions fails.
o MOUNT/CLUSTER/NOWRITE does not write-lock the device on the node
which owns the disk. On the serving node, a DCL SHOW DEVICE command
will report the device as write-locked, but users on the serving
node may still modify the device.
o Compaction works only on the first volume of a multi-volume saveset.
o TSZ07 density cannot be changed back and forth between 6250 bpi and
1600 bpi.
o PKCDRIVER resets the 53C94 chip if the target does not enter the
next phase within two seconds.
o A failure of the ISP1020 DUMP_RAM command causes a checksum error in
the read firmware. This improperly causes a bugcheck.
o Error log entries are improperly formatted. Not all registers are
dumped.
o Some diagnostic error messages are not seen for severe problems.
o Driver does not set field in SPDT.
o On OpenVMS Alpha systems containing greater than 2GB of memory,
PKSDRIVER would fail to deallocate a single non-paged pool MISC
(SGMAP) entry on most SCSI I/O requests. The system will either
hang or fail to recover from a non-paged pool expansion failure.
INSTALLATION NOTES:
The system should be rebooted after successful installation of this kit
and the new KZPSA-BB firmware. If the system is a member of a VMScluster,
the entire cluster should be rebooted.
In order to receive this complete problem correction you must also
install revision A09, or later, of the KZPSA-BB firmware. This revision
includes the changes necessary for support of systems with >1GB of memory
when using OpenVMS. There are a number of ways to acquire this firmware,
and although the following procedures are for firmware revision A09,
subsequent revisions of firmware can be acquired in the same way:
o Replace the older image file on your existing KZPSA Alpha AXP
Software Diskette. Delete the older A0x.img file and load the
latest file from the Alpha Firmware CD.
Then you can use that diskette with the FWUPDATE program to
update your KZPSA per the user guide instructions.
o Order a new KZPSA Alpha AXP Software Diskette from SSB at
603-884-4446 starting September 11, 1995. The part number
is AK-QGTNG-CA. Then you can use that diskette with the
FWUPDATE program to update your KZPSA per the user guide
instructions.
o If you have an AlphaServer 8200 or AlphaServer 8400 system,
you can order the CD kit part # QZ-00RAD-E8.1.0. This contains
the CD and release notes for loading. A09 is included in the
LFU application. This part can be ordered starting September 15,
This patch can be found at any of these sites:
Colorado Site
Georgia Site
Files on this server are as follows:
alpscsi04_070.README
alpscsi04_070.CHKSUM
alpscsi04_070.CVRLET_TXT
alpscsi04_070.a-dcx_axpexe