TITLE: OpenVMS VMS721_F11X-V0200 Alpha V7.2-1 F11BXQP ECO Summary
Modification Date: 03-OCT-2000
Modification Type: Updated Kit: Supersedes VMS721_F11X-V0100
NOTE: An OpenVMS saveset or PCSI installation file is stored
on the Internet in a self-expanding compressed file.
For OpenVMS savesets, the name of the compressed saveset
file will be kit_name.a-dcx_vaxexe for OpenVMS VAX or
kit_name.a-dcx_axpexe for OpenVMS Alpha. Once the OpenVMS
saveset is copied to your system, expand the compressed
saveset by typing RUN kit_name.a-dcx_vaxexe or RUN kit_name.a-dcx_axpexe.
For PCSI files, once the PCSI file is copied to your system,
rename the PCSI file to kitname-dcx_axpexe.pcsi, then it can
be expanded by typing RUN kitname-dcx_axpexe.pcsi. The resultant
file will be the PCSI installation file which can be used to install
the ECO.
Copyright (c) Compaq Computer Corporation 1999, 2000. All rights reserved.
OP/SYS: OpenVMS Alpha
COMPONENT: F11BXQP
SOURCE: Compaq Computer Corporation
ECO INFORMATION:
ECO Kit Name: VMS721_F11X-V0200
DEC-AXPVMS-VMS721_F11X-V0200--4.PCSI
ECO Kits Superseded by This ECO Kit: VMS721_F11X-V0100
ECO Kit Approximate Size: 960 Blocks
Kit Applies To: OpenVMS Alpha V7.2-1
System/Cluster Reboot Necessary: Yes
Rolling Re-boot Supported: Yes
Installation Rating: INSTALL_1
1 - To be installed on all systems running
the listed version(s) of OpenVMS.
Kit Dependencies:
The following remedial kit(s) must be installed BEFORE
installation of this kit:
VMS721_UPDATE-V0100
VMS721_PCSI-V0100
In order to receive all the corrections listed in this
kit, the following remedial kits should also be installed:
None
ECO KIT SUMMARY:
An ECO kit exists for the F11BXQP on OpenVMS Alpha V7.2-1. This kit
addresses the following problems:
Problems Addressed in VMS721_F11X-V0200:
o The system crashed at F11BXQP+11F54 with an XQPERR bugcheck
while running Storage Library System (SLS) backup. A portion
of a sample dump appears below:
Dump taken on 21-JUN-1999 20:57:45.45
XQPERR, Error detected by file system XQP
Time of system crash: 21-JUN-1999 20:57:45.45
Version of system: OpenVMS (TM) VAX Version V7.1
System Version Major ID/Minor ID: 1/0
VAXcluster node: PMSA04, a VAX 7000-830
Crash CPU ID/Primary CPU ID: 02/00
Bitmask of CPUs active/available: 00000007/00000007
CPU bugcheck codes:
CPU 02 -- XQPERR, Error detected by file system XQP
2 others -- CPUEXIT, Shutdown requested by another CPU
CPU 02 reason for Bugcheck: XQPERR, Error detected by file system XQP
Process currently executing on this CPU: SYSBAK_C059
Current image file: DSA947:[SLS$FILES_VAX.][SYSTEM]VMSBUXX.EXE;4
Current IPL: 0 (decimal)
CPU database address: 862A4000
MPB address: 8B32BF00
General registers:
R0 = 00000002 R1 = 0000000C R2 = 8A69CE90 R3 = 00000000
R4 = 00000002 R5 = 0000023C R6 = 7FE8F9A8 R7 = 00000001
R8 = 7FE8FA40 R9 = 800083F4 R10 = 7FE8FA64 R11 = 7FE8FA60
AP = 7FE8F334 FP = 7FE8F2FC SP = 7FE8F2C8 PC = 862F8D58
PSL = 00000004
Processor registers:
P0BR = BD0FB400 SBR = 1EF80400 ASTLVL = 00000001
P0LR = 00001025 SLR = 003FFF00 SISR = 00000000
P1BR = BCD2D800 PCBB = 562FF420 ICCS = 80000080
P1LR = 001FEDC7 SCBB = 1EF53200 SID = 17000201
LDEV = 00018008 LBER = 00000000 LCNR = 00000001
LCON0 = 1F000004 LCON1 = 00000000 TOUR = 00000000
LBECR0 = 0044003A LBECR1 = 00009120 LMODE = 000332A4
LMERR = 00000000 BIU_STAT = F00E1070 BIU_ADDR = 00000298
MMESTS = 1C008000 TBSTS = 800001D0 PCSTS = FFFFF800
ISP = 862A6200
KSP = 7FE8F2C8
ESP = 7FFE9800
SSP = 7FFECA44
USP = 7FDBAFD4
No spinlocks currently owned by CPU 02
SDA> ex/inst @pc-30;30
F11BXQP+11F28: BSBB F11BXQP+11F81
F11BXQP+11F2A: BNEQ F11BXQP+11F43
F11BXQP+11F2C: BBS #03,18(R2),F11BXQP+11F46
F11BXQP+11F31: PUSHL R2
F11BXQP+11F33: CALLS #01,F11BXQP+00B0E
F11BXQP+11F3A: MOVL R2,R0
F11BXQP+11F3D: JSB F11BXQP+122EE
F11BXQP+11F43: BRW F11BXQP+1205C
F11BXQP+11F46: BBC #05,18(R2),F11BXQP+11F4F
F11BXQP+11F4B: BUGW #05CC
F11BXQP+11F4F: BBC #02,18(R2),F11BXQP+11F58
F11BXQP+11F54: BUGW #05CC
F11BXQP+11F58: MOVL @30(SP),R0
SDA> ex/inst @pc-30;30
8A69CEA8: 0000000E "...."
0000000E = 1110 binary, so bit 2 is set and the BBC at F11BXQP+11F4F
does not branch around the bugcheck at F11BXQP+11F54.
Process index: 005B Name: SYSBAK_C059 Extended PID: 2120BC5B
----------------------------------------------------------------
Images Affected:
- [SYS$LDR]F11BXQP.EXE
- [SYS$LDR]F11BXQP.STB
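The bit tests in the XQPERR dump above can be replayed with a short
Python sketch. It is illustrative only: the helper bit_is_set is a
hypothetical name, and the only inputs are the longword shown at
8A69CEA8 (R2 + 18 hex) and the three branch instructions listed in the
disassembly. On VAX, BBS branches when the tested bit is set and BBC
branches when it is clear.

```python
# Longword displayed at 8A69CEA8 (R2 + 18 hex) in the sample dump.
FLAGS = 0x0000000E


def bit_is_set(value: int, bit: int) -> bool:
    """Return True if bit position `bit` (0 = least significant) is set."""
    return (value >> bit) & 1 == 1


# BBS branches when the bit is set; BBC branches when the bit is clear.
bbs_bit3_branches = bit_is_set(FLAGS, 3)      # BBS #03 at F11BXQP+11F2C
bbc_bit5_branches = not bit_is_set(FLAGS, 5)  # BBC #05 at F11BXQP+11F46
bbc_bit2_branches = not bit_is_set(FLAGS, 2)  # BBC #02 at F11BXQP+11F4F

print(format(FLAGS, "04b"))   # 1110
print(bbs_bit3_branches, bbc_bit5_branches, bbc_bit2_branches)
```

The first two branches are taken, which skips the BUGW at F11BXQP+11F4B.
The last branch is not taken, so execution falls into the BUGW at
F11BXQP+11F54, the address reported for the crash.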
o A DIRECTORY command on a directory file that is larger than 127
blocks can cause a false SS$_ENDOFFILE (EOF) to be reported.
Images Affected:
- [SYS$LDR]F11BXQP.EXE
- [SYS$LDR]F11BXQP.STB
o A process would sometimes hang while writing to a sequential
file and could not be deleted. The process appeared to have a
'lost' I/O outstanding, when in fact the I/O was on the file
control block (FCB) high water mark (HWM) wait queue, waiting
for other I/Os to complete.
Images Affected:
- [SYS$LDR]F11BXQP.EXE
- [SYS$LDR]F11BXQP.STB
o The ancillary control process START_ACP MOUNT routine
bugchecked with a NOTUCBRVT during the mounting of a jukebox
device. A portion of the dump appears below:
Dump taken on 27-OCT-1999 21:54:33.46
NOTUCBRVT, Not UCB pointer in RVT
Version of system: OpenVMS (TM) Alpha Operating System, Version V7.2
VMScluster node: APOLLO, a AlphaServer 8400 5/440
Process currently executing on this CPU: BATCH_2730
Current image file: DSA20:[SYS0.SYSCOMMON.][SYSEXE]VMOUNT.EXE
Current IPL: 2 (decimal)
MOUNT routine START_ACP tripped over the following RVT because
RVT$L_REFC (00000002) is not equal to the number of UCBs in
RVT$L_UCBLST (one):
SDA> format @r1
FFFFFFFF.821A3F00 RVT$L_STRUCLKID 5E05FD3C
FFFFFFFF.821A3F04 RVT$L_REFC 00000002
FFFFFFFF.821A3F08 RVT$W_SIZE 00C0
FFFFFFFF.821A3F0A RVT$B_TYPE 0E
FFFFFFFF.821A3F0B RVT$B_NVOLS 0A
FFFFFFFF.821A3F0C RVT$T_STRUCNAME 31
FFFFFFFF.821A3F0D 353939
FFFFFFFF.821A3F10 20203031
FFFFFFFF.821A3F14 20202020
FFFFFFFF.821A3F18 RVT$T_VLSLCKNAM 31
FFFFFFFF.821A3F19 353939
FFFFFFFF.821A3F1C 20203031
FFFFFFFF.821A3F20 20202020
FFFFFFFF.821A3F24 RVT$L_BLOCKID 3B076C2F
FFFFFFFF.821A3F28 RVT$B_ACB 00
FFFFFFFF.821A3F29 0000000
FFFFFFFF.821A3F2C 00000000
FFFFFFFF.821A3F30 20000000
FFFFFFFF.821A3F34 00010001
FFFFFFFF.821A3F38 A4DCCFB0 XQP$DEQBLOCKER
FFFFFFFF.821A3F3C 821A3F00
FFFFFFFF.821A3F40 00000000
FFFFFFFF.821A3F44 00000000
FFFFFFFF.821A3F48 00000000
FFFFFFFF.821A3F4C RVT$L_TRANS 00000001
FFFFFFFF.821A3F50 RVT$L_ACTIVITY 00000001
FFFFFFFF.821A3F54 RVT$A_RVTVCB 8164E240
RVT$L_UCBLST
RVT$C_LENGTH
SDA> show stack /long (.+rvt$l_ucblst);4*(@(.+rvt$b_nvols))&ff
FFFFFFFF.821A3F54 8164E240
FFFFFFFF.821A3F58 00000000
FFFFFFFF.821A3F5C 00000000
FFFFFFFF.821A3F60 00000000
FFFFFFFF.821A3F64 00000000
FFFFFFFF.821A3F68 00000000
FFFFFFFF.821A3F6C 00000000
FFFFFFFF.821A3F70 00000000
FFFFFFFF.821A3F74 00000000
FFFFFFFF.821A3F78 00000000
Images Affected: [SYS$LDR]F11BXQP.EXE
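The mismatch described in the NOTUCBRVT dump above can be expressed as a
small Python sketch. The values are copied from the dump; the helper
names count_ucbs and rvt_is_consistent are hypothetical, and the
comparison rule is only the one stated in the text (reference count
versus number of UCBs in the list), not the actual START_ACP code.

```python
# Values copied from the SDA output in the dump.
RVT_L_REFC = 0x00000002
RVT_B_NVOLS = 0x0A
# The ten longwords displayed for RVT$L_UCBLST.
UCBLST = [0x8164E240] + [0x00000000] * 9


def count_ucbs(ucb_list):
    """Count the non-zero (populated) entries in the UCB list."""
    return sum(1 for entry in ucb_list if entry != 0)


def rvt_is_consistent(refc, ucb_list):
    """Apply the rule from the text: REFC must equal the UCB count."""
    return refc == count_ucbs(ucb_list)


print(len(UCBLST) == RVT_B_NVOLS)             # True: ten slots displayed
print(count_ucbs(UCBLST))                     # 1
print(rvt_is_consistent(RVT_L_REFC, UCBLST))  # False
```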
o An exception, which leads to an INVEXCEPTN bugcheck, occurs in
XQP routine INS_LIMBO or TRIM_LIMBO. The footprint is a
corrupt limbo queue (EXE$GQ_LIMBOQ) and the exception occurs
during a VAX_INSQUE or VAX_REMQUE.
Images Affected:
- [SYS$LDR]F11BXQP.EXE
- [SYS$LDR]F11BXQP.STB
o If a window control block (WCB) list in routine MARK_INCOMPLETE
becomes corrupted, the system can crash with a NOTWCBWCB bugcheck.
Images Affected:
- [SYS$LDR]F11BXQP.EXE
- [SYS$LDR]F11BXQP.STB
o Two XQPERR bugchecks were improperly added to REM_LIMBOQ. Both
ran after a file control block (FCB) was removed from the limbo
queue, and both have now been removed:
1. If the queue is empty, then bugcheck if the FCB reference
count is 1 (accounting for FID_TO_SPEC).
2. If the queue is not empty, then bugcheck if the FCB
reference count is not 1; otherwise decrement EXE$GL_LIMBOLEN.
Images Affected: [SYS$LDR]F11BXQP.EXE
o The INVSECURESTATE bugcheck ("Invalid state detected by security
subsystem") is now FATAL.
Images Affected: [SYS$LDR]F11BXQP.EXE
o Separator pages for print jobs which are created via COPY to a
spooled device do not include the complete file specification.
The current length calculation includes the file name, but not
the device name and null directory specification.
Images Affected: [SYS$LDR]F11BXQP.EXE
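The length problem described above can be illustrated with a short
Python sketch. Everything in it is an assumption made for illustration:
the device name, the "[]" form of the null directory, the file name,
and both helper functions are hypothetical and are not taken from the
XQP source.

```python
# Hypothetical components of a file specification on a spooled device.
DEVICE = "LPA0:"
NULL_DIRECTORY = "[]"
FILE_NAME = "REPORT.LIS;1"


def name_only_length(file_name: str) -> int:
    """Length as described for the faulty case: file name only."""
    return len(file_name)


def full_spec_length(device: str, directory: str, file_name: str) -> int:
    """Length covering device, null directory and file name."""
    return len(device) + len(directory) + len(file_name)


short = name_only_length(FILE_NAME)
full = full_spec_length(DEVICE, NULL_DIRECTORY, FILE_NAME)
print(short, full)   # 12 19

# Truncating the full string to the short length loses the tail.
print((DEVICE + NULL_DIRECTORY + FILE_NAME)[:short])
```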
o The system can crash with an XQPERR bugcheck in routine
RES_SEQ_MISMATCH. The error message is "Found a stale
referenced or non-directory FCB in FCB queue".
Images Affected: [SYS$LDR]F11BXQP.EXE
o The system can crash with an XQPERR bugcheck in routine
MAKE_DEACCESS. The error message is "deaccess conversion
failed". In one instance, the APPEND command was used to
update files and/or create new files.
Images Affected: [SYS$LDR]F11BXQP.EXE
o The system can crash with an XQPERR bugcheck at offset
UPDATE_INDX_C+000C8. A crash summary follows below:
Crash Time: 2-MAY-2000 18:08:04.93
Bugcheck Type: XQPERR, Error detected by file system XQP
Node: CSUPR3 (Cluster)
CPU Type: AlphaServer 8400 5/625
VMS Version: V7.2-1
Current Process: KELLYS
Current Image: DSA0:[SYS2.SYSCOMMON.][SYSEXE]RENAME.EXE
Failing PC: FFFFFFFF.BF636BBC UPDATE_INDX_C+000C8
Failing PS: 00000000.00000000
Module: F11BXQP (Link Date/Time: 13-MAR-2000 21:14:54.58)
Offset: 00020BBC
Boot Time: 30-APR-2000 15:42:27.00
System Uptime: 2 02:25:37.93
Crash/Primary CPU: 00/00
System/CPU Type: 0C05
Saved Processes: 122
Pagesize: 8 KByte (8192 bytes)
Physical Memory: 4096 MByte (524288 PFNs, contiguous memory)
Dumpfile Pagelets: 521595 blocks
Dump Flags: olddump,writecomp,errlogcomp,dump_style
Dump Type: compressed,selective,dosd
EXE$GL_FLAGS: poolpging,init,bugdump,savedump
Paging Files: 5 Pagefiles and 1 Swapfile installed
Stack Pointers:
KSP = 00000000.6C588F40 ESP = 00000000.7FFA6000
SSP = 00000000.7FFAC100
USP = 00000000.6C437B30
General Registers:
R0 = 00000000.00000012 R1 = 00000000.000005B0 R2 = FFFFFFFF.BF66D3D0
R3 = 00000000.6C580012 R4 = FFFFFFFF.BE75A600 R5 = 00000000.00000024
R6 = 00000000.6C589724 R7 = 00000000.00000032 R8 = 00000000.6C589724
R9 = 00000000.6C589C60 R10 = 00000000.6C589724 R11 = 00000000.6C589968
R12 = 00000000.6C589A8C R13 = 00000000.6C58984C R14 = 00000000.6C589C9C
R15 = 00000000.6C589758 R16 = 00000000.000005B4 R17 = 00000000.00000001
R18 = 00000000.00000012 R19 = 00000000.00000000 R20 = 00000000.00000001
R21 = 00000000.6C580000 R22 = 00000000.00000012 R23 = FFFFFFFF.BE75A602
R24 = 00000000.00000012 AI = 00000000.FF000000 RA = FFFFFFFF.BF636B90
PV = FFFFFFFF.8D0AA3A0 R28 = FFFFFFFF.FFFFFE01 FP = 00000000.6C588F40
PC = FFFFFFFF.BF636BC0 PS = 00000000.00000000
---------------------
Images Affected: [SYS$LDR]F11BXQP.EXE
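Several figures in the crash summary above can be cross-checked with
simple arithmetic. The Python sketch below uses only values printed in
the summary. It assumes that "Offset" is the failing PC relative to the
load address of the F11BXQP module; that reading, and all variable
names, are assumptions rather than statements from the summary.

```python
from datetime import datetime, timedelta

# Failing PC and module offset as printed in the crash summary.
FAILING_PC = 0xFFFFFFFFBF636BBC
MODULE_OFFSET = 0x00020BBC
# Assumed interpretation: load address = failing PC - offset.
module_base = FAILING_PC - MODULE_OFFSET
print(f"{module_base:016X}")   # FFFFFFFFBF616000

# Physical memory: PFN count times page size, in megabytes.
PFN_COUNT = 524288
PAGE_SIZE = 8192
physical_mb = PFN_COUNT * PAGE_SIZE // (1024 * 1024)
print(physical_mb)             # 4096

# Uptime: crash time minus boot time.
boot = datetime(2000, 4, 30, 15, 42, 27)
crash = datetime(2000, 5, 2, 18, 8, 4, 930000)
uptime = crash - boot
print(uptime)                  # 2 days, 2:25:37.930000
```

All three results agree with the summary: 4096 MByte of physical memory
and a system uptime of 2 02:25:37.93.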
Problems Addressed in VMS721_F11X-V0100:
o An XQPERR bugcheck in LOCKERS can occur when the retry limit
on the F11B$x lock is reached.
This problem can occur when the owner of the $x lock is
running at a high process priority and a number of
processes that are in a clustered system are also trying to
validate this lock, but at a lower process priority.
Image(s) Affected - [SYS$LDR]F11BXQP.EXE
o After releasing the current process's IPL/Fork lock, a
system can crash with an SPLACQERR bugcheck.
Image(s) Affected - [SYS$LDR]F11BXQP.EXE
o A directory file becomes "corrupt" and DUMP/DIRECTORY
displays a block similar to the following:
Virtual block number 3574 (00000DF6), 512 (0200) bytes
0000 Directory Entry:
0000 Size: 508
0002 Version limit: 32767
0004 Type: 0 (FID)
0005 Name count: 24
0006 Name: COSLR1201_01_JUPICC2.LIS
001E Version: 7859 FID: (40993,5,0)
0026 Version: 7858 FID: (40990,1,0)
002E Version: 7857 FID: (40988,3,0)
...
01E6 Version: 7802 FID: (40455,1,0)
01EE Version: 7801 FID: (40454,1,0)
01F6 Version: 32767 FID: (16744447,65535,0)
01FE End of records (-1)
The directory shuffle code creates the above erroneous
directory entry for the following reasons:
1. So that a new directory buffer will have a valid
structure (this allows VALIDATE_DIRBLK to write
the block to disk); and
2. The entry will be spotted as incorrect (via VERIFY)
if the system crashes in the middle of this shuffle.
After the directory block (with the erroneous directory entry)
is written to disk, the bad entry is removed. A subsequent
call to READ_BLOCK assumes that the block comes from the
buffer cache and not from disk. Under heavy load, this
assumption may not be true as the directory block may have
been kicked out of the cache.
Image(s) Affected - [SYS$LDR]F11BXQP.EXE
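The offsets in the DUMP/DIRECTORY output above are internally
consistent, which the following Python sketch checks. It uses only
numbers visible in the dump and assumes that the entries elided by
"..." are consecutive versions and that the Size field counts the
bytes that follow the size word. The constant names are hypothetical.

```python
BLOCK_SIZE = 512
NAME = b"COSLR1201_01_JUPICC2.LIS"

# Fixed part of the entry: size word, version limit word,
# type byte, name count byte.
HEADER_FIXED = 2 + 2 + 1 + 1
# Each version entry spans 8 bytes (offsets 001E, 0026, 002E, ...).
VERSION_ENTRY = 8

first_version_offset = HEADER_FIXED + len(NAME)
print(hex(first_version_offset))   # 0x1e

# Versions 7859 down to 7801 (assumed consecutive) plus the
# placeholder entry with version 32767.
versions = list(range(7859, 7800, -1)) + [32767]
placeholder_offset = first_version_offset + VERSION_ENTRY * (len(versions) - 1)
end_offset = first_version_offset + VERSION_ENTRY * len(versions)
print(hex(placeholder_offset), hex(end_offset))   # 0x1f6 0x1fe

# Size field: bytes following the 2-byte size word (assumption).
size_field = end_offset - 2
print(size_field)                  # 508

# The end-of-records word (-1) fills the last two bytes of the block.
print(end_offset + 2 == BLOCK_SIZE)   # True
```

The placeholder entry therefore lands at offset 01F6 and the entry fills
the block exactly, matching the dump.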
o XQP DELETE code accepts an FCB (File Control Block) off the
limbo queue if not IO$V_DELETE. This prevents the
invalidation of VIOC cache blocks as the result of a RENAME
operation. This causes a large amount of XQP (FCB) and VIOC
(CFCB) non-paged pool usage as well as XQPERR bugchecks.
Image(s) Affected - [SYS$LDR]F11BXQP.EXE
o The following problem occurs under these circumstances:
1. A directory with multiple headers (e.g., from a large ACL)
is deleted on one node (A) in a cluster; and
2. the directory had been previously accessed on another node
(B) in the cluster.
Files later created with the headers deleted in step 1 would
show up on node B with the error:
%SYSTEM-F-NOSUCHFILE, no such file.
Image(s) Affected - [SYS$LDR]F11BXQP.EXE
INSTALLATION NOTES:
The images in this kit will not take effect until the system is
rebooted. If there are other nodes in the VMScluster, they must
also be rebooted in order to make use of the new image(s).
If it is not possible or convenient to reboot the entire cluster at
this time, a rolling re-boot may be performed.
All trademarks are the property of their respective owners.
This patch can be found at any of these sites:
Colorado Site
Georgia Site
Files on this server are as follows:
dec-axpvms-vms721_f11x-v0200--4.README
dec-axpvms-vms721_f11x-v0200--4.CHKSUM
dec-axpvms-vms721_f11x-v0200--4.pcsi-dcx_axpexe
vms721_f11x-v0200.CVRLET_TXT