OpenVMS ALPDRIV17_H3062 Alpha V6.2-1H3 PCA/PNDRIVERS and SCS ECO Summary
NOTE: An OpenVMS saveset or PCSI installation file is stored
on the Internet in a self-expanding compressed file.
The name of the compressed file will be kit_name-dcx_vaxexe
for OpenVMS VAX or kit_name-dcx_axpexe for OpenVMS Alpha.
Once the file is copied to your system, it can be expanded
by typing RUN compressed_file. The resultant file will
be the OpenVMS saveset or PCSI installation file which
can be used to install the ECO.
Copyright (c) Digital Equipment Corporation 1998. All rights reserved.
OP/SYS: DIGITAL OpenVMS Alpha
COMPONENT: SYS$PCADRIVER
SYS$PNDRIVER
SYS$SCS.EXE
SOURCE: Digital Equipment Corporation
ECO INFORMATION:
ECO Kit Name: ALPDRIV17_H3062
ECO Kits Superseded by This ECO Kit: ALPDRIV12_H3062
ECO Kit Approximate Size: 1044 Blocks
Kit Applies To: OpenVMS Alpha V6.2-1H3
System/Cluster Reboot Necessary: Yes
Rolling Reboot Supported: Yes
Installation Rating: 3 - To be installed on all systems running
the listed versions of OpenVMS which
are experiencing the problems described.
INSTALLATION NOTE: Please see detailed installation instructions
in the INSTALLATION NOTES section below. *DO NOT* install this
ECO kit without first reviewing these instructions.
Kit Dependencies:
The following remedial kit(s) must be installed BEFORE
installation of this kit:
None
In order to receive all the corrections listed in this
kit, the following remedial kits should also be installed:
None
ECO KIT SUMMARY:
An ECO kit exists for PCADRIVER, PNDRIVER and SCS on OpenVMS Alpha
V6.2-1H3. This kit addresses the following problems:
Problems Addressed in ALPDRIV17_H3062:
o Messages such as:
%PNA0, Inappropriate SCA Control Message - FLAGS/OPC/STATUS/PORT
00/00/00/00
may appear on the console, with associated errorlog messages,
on systems with HSX disk controllers.
o Following a CI-port MFQE (message-free-queue-empty) interrupt
that occurs with no SCS-credit deficit (that is, not in
"optimistic SCS-credit mgmt. mode": MFQ entry-count = SCS
Rcv-credits), a subsequent legitimate MFQE interrupt (with an
SCS-credit deficit) results in a series of secondary errors.
These cause port resets, the MFQ is never expanded, and a
series of error-log entries like the following is posted
(key ID: error-type/sub-type = 0x8102):
Logging OS 1. OpenVMS
System Architecture 2. Alpha
OS version V7.1
Event sequence number 3653.
Timestamp of occurrence 20-OCT-1997 00:01:27
Time since reboot 2 Day(s) 12:14:28
Host name GDC140
System Model AlphaServer 8400 Model 5/300
Entry type 98. Asynchronous Device Attention
---- Device Profile ----
Unit GDC140$PNA0
Product Name CIXCD (XMI to CI Adapter)
------ Adapter Data -----
Error Type/SubType x8102 Hardware Error, Unspecified Port
Hardware Error. Port will be RE-STARTED.
Count - Remaining Retries 36.
CASR x00000001 Bit 0: Message Free Que EXHAUSTED (AMFQE)
AMCSR x00000004 Bit 2: Interrupt ENABLE (IE)
PESR xFFFFFFFF
XDEV x05110C2F Device Type is: 0x0C2F = CIMNA
Device Revision is: 0x11 = A1
Firmware Revision is: 0x05 = V-5
ASNR x00000001
XBER x00000040 XMI Node ID is: 1.
Commander ID is: 2 = Microcode CMDR
XFADR xFFFFFE00
XFAER x73FF0FFF
PDCSR x00000001
PFAR x0000055C
Extra Longword 1 x00000000
Extra Longword 2 x00000000
Extra Longword 3 x00000000
----- Software Info -----
UCB$x_ERTCNT 128. Retries Remaining
UCB$x_ERTMAX 10. Retries Allowable
UCB$x_STS x00000000
UCB$x_ERRCNT 30. Errors This Unit
UCB$L_DEVCHAR1 x0C450000 Sharable
Available
Error Logging
Capable of Input
Capable of Output
o When the ALPDRIV15_062 Cluster Ports TIMA Kit is used with
non-NPORT SCS-port drivers (that is, drivers for ports other
than CIPCA, CIMNA, or KFMSB), NPAGEDYN pool corruption occurs
in the pool following the end of each non-NPORT PDT
(port-descriptor-table: 1-per-SCS-port).
Symptoms vary according to how the corrupted NPAGEDYN is being
used at the time, but can include INVEXCPTN, SSRVEXCPTN, and
other ACCVIOs.
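The register decodings printed in the error-log entry above can be reproduced with simple bit arithmetic. The following Python sketch is illustrative only and is not part of the kit. The XDEV field layout (device type in the low 16 bits, device revision in the next 8, firmware revision in the top 8) is an assumption inferred from the values the log prints, and the function names are hypothetical.

```python
def decode_xdev(xdev):
    """Split a 32-bit XDEV value into the three fields shown in the log.

    Field positions are inferred from the logged example (x05110C2F),
    not taken from adapter documentation.
    """
    return {
        "device_type": xdev & 0xFFFF,              # bits <15:0>
        "device_revision": (xdev >> 16) & 0xFF,    # bits <23:16>
        "firmware_revision": (xdev >> 24) & 0xFF,  # bits <31:24>
    }


def set_bits(value):
    """Return the positions of the bits that are set in a 32-bit value."""
    return [bit for bit in range(32) if value & (1 << bit)]


fields = decode_xdev(0x05110C2F)
print("Device type       0x%04X" % fields["device_type"])
print("Device revision   0x%02X" % fields["device_revision"])
print("Firmware revision 0x%02X" % fields["firmware_revision"])
print("CASR bits set:", set_bits(0x00000001))
print("AMCSR bits set:", set_bits(0x00000004))
```

The same helpers apply to any other register value in these entries; the meaning of each bit still has to be taken from the log text itself.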
Problems Addressed in ALPDRIV12_H3062:
o On Alpha systems with many Virtual Circuit failures, the
system may BUGCHECK with a CLUEXIT or may simply hang.
The subsequent dump shows many CDTs (Connection Descriptor
Tables) in the DISC_MATCH state and no free CDTs.
o The following problems affect only Turbolaser AS8200/8400
systems capable of exceeding 4 gigabytes of memory.
CIMNA (NPORT CIXCD XMI-to-CI adapter for Laser/Turbolaser) and
KFMSB (XMI-to-DSSI) adapters will fail to initialize or start
under OpenVMS if non-paged-dynamic (NPAGEDYN) pool contains
PFNs (physical pages) over 4 gigabytes (PA > 32 bits) and
BAP (bus-addressable-pool) is merged with NPAGEDYN due to the
absence of a PCI bus on the system. If any of the NPORT
structures (ABLK, AMPB, QBUFs, CRRRs, BDL, BDLT) contain
physical addresses (PA) > 32 bits, these devices fail to
start, producing the following errors. CDTs appear in various
states when examined with SDA.
1. CIMNA ERRORS
===============
The CIMNA will exhibit "port-timeouts" or XMI transaction-
timeout (TTO) memory-system errors on boot, such as:
TURBOLASER CONSOLE LOG:
-----------------------
%PNA0, Port Error Bit(s) Set -
CNF/PMC/PSR 08110C2F/00000004/00000208
%PNA0, Port is Reinitializing (48 Retries Left). Check the
Error Log.
----------------------------------------------
%PNA0, CI port timeout.
%PNA0, Port is Reinitializing (49 Retries Left). Check the
Error Log.
----------------------------------------------
CIMNA ERROR LOG ENTRY:
----------------------
********************** ENTRY 2 *****************************
Logging OS 1. OpenVMS
System Architecture 2. Alpha
OS version V7.1
Event sequence number 1.
Timestamp of occurrence 01-JAN-1996 00:00:04
Time since reboot 0 Day(s) 0:00:04
Host name ANDA1A
System Model AlphaServer 8400 5/300
Entry type 98. Asynchronous Device Attention
---- Device Profile ----
Unit ANDA1A$PNA0
Product Name CIXCD (XMI to CI Adapter)
------ Adapter Data -----
Error Type/SubType x8102 Hardware Error,Unspecified Port
Hardware Error. Port will be RE-STARTED.
Count - Remaining Retries 50.
CASR 00000208 Bit 3: Memory System ERROR (MSE)
Bit 9: Uninitialize State (UNIN)
AMCSR x00000004 Bit 2: Interrupt ENABLE (IE)
PESR xFFFFFFFF
XDEV x08110C2F Device Type is: 0x0C2F = CIMNA
Device Revision is: 0x11 = A1
Firmware Revision is: 0x08 = V-8
ASNR x00000208
XBER x8000A060 Bit 13: Transaction Timeout (TTO)
Bit 15: Command NoAck (CNAK)
Bit 31: Error Summary (ES)
XMI Node ID is: 1.
Commander ID is: 3 = INTR
XFADR xFFFFFFFF XMI Failing Addr[00:28]: x1FFFFFFF
XMI Failing Addr[39]: x00000001
Failing Length: x00000003
XFAER x13FF0000 Mask[00:15]: x00000000
XMI Failing Addr[29:38]: x000003FF
XMI Failing Command: 1, READ
PDCSR x00000208
PFAR x0000055C
Extra Longword 1 x00000000
Extra Longword 2 x00000000
Extra Longword 3 x00000000
----- Software Info -----
UCB$x_ERTCNT 0. Retries Remaining
UCB$x_ERTMAX 0. Retries Allowable
UCB$x_STS x10000000
UCB$x_ERRCNT 1. Errors This Unit
UCB$L_DEVCHAR1 x0C450000 Sharable
Available
Error Logging
Capable of Input
Capable of Output
************************************************************
2. KFMSB ERRORS
===============
TURBOLASER CONSOLE LOG
----------------------
"Port Error Bit(s) Set - CNF/PMC/PSR
xxxxxxxx/xxxxxxxx/05008010"
NOTE: The PSR (taken from the ASR: Adapter Status Register)
translates to:
- <04> Adapter Abnormal Condition
- <15> Channel 1 flag
- <30:24> (=5) Illegal Carrier Address
o SCS SYSAP data-transfer mapping requests will generate
incorrect (miscalculated) physical address pointers, causing
disk/tape data-transfer corruption, if the requested
page_offset extends beyond the first page of the requested
transfer (that is, it exceeds 8 KB - 1, the Alpha page size;
the first page is the one defined by SVAPTE in the SCS$MAP
request). SYS$SCS takes the page_offset from CDRP$L_BOFF and
SVAPTE from CDRP$L_SVAPTE, both of which are supplied by the
SCS client SYSAP (DUDRIVER, CNXMGR, etc.). OpenVMS SCS SYSAPs
are not believed to use CDRP$L_BOFF values > 8 KB - 1, but
user-written SCS SYSAP applications might use a value wider
than 13 bits, since CDRP$L_BOFF is 32 bits (formerly
CDRP$W_BOFF, 16 bits).
o Performance is degraded on CIXCD and CIPCA based systems
when communicating with other NADP (non-alternating-dual-path)
nodes during single-CI-path operation. Single-path operation
results from CI-cable failure/removal or CI single-path
failures (NO_RESPONSE, NAK errors). NADP-supporting nodes
currently are HSJ40, HSJ50, CIPCA, and CIMNA/CIXCD.
o The CIPCA will not properly re-initialize after a PCI-DMA-Engine
"bus error" (PCI bus master abort or target abort). The
port driver will repeatedly fail to re-initialize the port
until the retry count of 50 is exhausted. The console OPA0
output is typically as follows:
%PNB0, Port Error Bit(s) Set - NODESTS/CASR(H)/(L)
%PNB0, Port is Reinitializing (48 Retries Left).
Check the Error Log.
o When booting Alpha machines the console may display messages
such as:
%PNA0, Inappropriate SCA Control Message - FLAGS/OPC/STATUS/PORT
00/00/00/00
with associated errorlog messages.
o The CIPCA Direct-DMA (DDMA) pool will not correctly initialize
on AS4100 systems with greater than 1 GB of memory running
OpenVMS V6.2-1H3 or the V6.2 remedial stream. When more than
1 GB of memory is present and there is no DDMA pool, one of
the following 3 symptoms will occur following an NPAGEDYN
expansion event:
o SPINWAIT system-crash
o System-hang which will respond to a ^P HALT request, and
will generate a forced crash if the system-disk/dump-file
is NOT on a CIPCA;
o System-hang which will not respond to a ^P HALT request.
A system-reset (front-panel reset switch) is required to
clear. NO DUMP is created.
NOTE: All crashes/hangs also resulted in CIPCA LED
error-code=PCI-DMA-ENGINE-RING-ERROR-1/0
(code = 0x01C or 0x01B).
System data cells SCS$GQ_DDMA_BASE & SCS$GQ_DDMA_LEN will
both contain a "00000000" value on AS4100 systems with
CIPCA and 1 Gb. of memory (use SDA> to examine).
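Two of the problems listed above (the SCS data-transfer mapping problem with CDRP$L_BOFF values larger than one page, and the CIMNA/KFMSB failure with physical addresses above 32 bits) come down to Alpha page arithmetic. The following Python sketch illustrates that arithmetic only. It is not the SYS$SCS or port-driver code, the function names are hypothetical, and it assumes the 8 KB Alpha page size stated above.

```python
PAGE_SHIFT = 13                       # 8 KB Alpha page = 2**13 bytes
PAGE_SIZE = 1 << PAGE_SHIFT
PAGE_OFFSET_MASK = PAGE_SIZE - 1      # low 13 bits of a byte offset


def split_byte_offset(boff):
    """Split a byte offset into (whole pages to skip, offset within page).

    An offset above 8 KB - 1 points past the first page of the transfer,
    so the page index must advance by the number of whole pages skipped.
    """
    return boff >> PAGE_SHIFT, boff & PAGE_OFFSET_MASK


def pfn_exceeds_32_bits(pfn):
    """True if the physical address of this page needs more than 32 bits."""
    return (pfn << PAGE_SHIFT) > 0xFFFFFFFF


print(split_byte_offset(8191))        # still inside the first page
print(split_byte_offset(8192))        # first byte of the next page
print(pfn_exceeds_32_bits(524287))    # last page below 4 gigabytes
print(pfn_exceeds_32_bits(524288))    # first page at 4 gigabytes
```

With 8 KB pages, any PFN of 524288 (2**19) or higher corresponds to a physical address that no longer fits in 32 bits.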
INSTALLATION NOTES:
The images in this kit will not take effect until the system is
rebooted. If there are other nodes in the VMScluster, they must
also be rebooted in order to make use of the new image(s).
If it is not possible or convenient to reboot the entire cluster at
this time, a rolling re-boot may be performed.
o Multiprocessor Systems with CIPCAs: SMP_SPINWAIT Restriction
If your system uses a CIPCA adapter and you operate with
MULTIPROCESSING set to a non-zero value, you must reset the
value of the SMP_SPINWAIT parameter to 300000 (3 seconds)
instead of the default 100000 (1 second).
If you do not change the value of SMP_SPINWAIT, a CIPCA adapter
error could generate a CPUSPINWAIT system bugcheck similar to
the following:
**** OpenVMS (TM) Alpha Operating System V7.1 - BUGCHECK ****
** Code=0000078C: CPUSPINWAIT, CPU spinwait timer expired
This restriction will be removed in a future OpenVMS release.
Note:
This release note supersedes a similar release note, note
4.15.2.4.5, in the OpenVMS Version 7.1 Release Notes manual
as well as 6.2-1H3 sec:1.13.1, which also included a
SYSTEM_CHECK parameter restriction. The SYSTEM_CHECK
parameter restriction is incorrect. Furthermore, the
earlier Release note stated that the change to the
SMP_SPINWAIT parameter was required for a MULTIPROCESSING
parameter setting of 1 or 2. This requirement applies to
all non-zero MULTIPROCESSING parameter settings.
o Do not install this TIMA kit on systems which have a boot path
through KFMSBs and HSD10s. If you do, the system will not
boot. OpenVMS Engineering is aware of this problem and a new
This patch can be found at any of these sites:
Colorado Site
Georgia Site
Files on this server are as follows:
alpdriv17_h3062.README
alpdriv17_h3062.CHKSUM
alpdriv17_h3062.CVRLET_TXT
alpdriv17_h3062.a-dcx_axpexe