OpenVMS ALPRMS04_061 Alpha V6.1 RMS/Convert ECO Summary
NOTE: An OpenVMS saveset or PCSI installation file is stored
on the Internet in a self-expanding compressed file.
The name of the compressed file will be kit_name-dcx_vaxexe
for OpenVMS VAX or kit_name-dcx_axpexe for OpenVMS Alpha.
Once the file is copied to your system, it can be expanded
by typing RUN compressed_file. The resultant file will
be the OpenVMS saveset or PCSI installation file which
can be used to install the ECO.
Copyright (c) Digital Equipment Corporation 1997. All rights reserved.
OP/SYS: OpenVMS Alpha
COMPONENT: RMS.EXE
CONVERT.EXE
CONVSHR.EXE
EDF_TV.EXE
RECLAIM.EXE
RECOVER.EXE
RMSREC$SERVER.EXE
SOURCE: Digital Equipment Corporation
ECO INFORMATION:
ECO Kit Name: ALPRMS04_061
ECO Kits Superseded by This ECO Kit: ALPRMS03_061
AXPRMS02_061 (AXPRMS)
AXPRMS01_061
ECO Kit Approximate Size: 4644 Blocks
Kit Applies To: OpenVMS Alpha V6.1, V6.1-1H1, V6.1-1H2
System/Cluster Reboot Necessary: Yes
Installation Rating: 1 - To be installed on all systems running
the listed version(s) of OpenVMS.
NOTE: In order to receive the full fixes listed in this kit,
the following remedial kits also need to be installed:
ALPSYS17_061
IMPORTANT NOTES:
o It is strongly recommended that this kit be applied immediately
after any system is upgraded to OpenVMS Alpha V6.1, V6.1-1H1, or
V6.1-1H2, before production activity begins on the system.
o In order for your system to operate properly, the ALPSYS17_061 kit
MUST be installed prior to installing the ALPRMS04_061 kit. If the
ALPSYS17_061 has not been installed and you attempt to install the
ALPRMS04_061 kit, the installation will fail.
ECO KIT SUMMARY:
An ECO kit exists for RMS and CONVERT Utilities on OpenVMS Alpha V6.1.
This kit addresses the following problems:
Problems addressed in ALPRMS04_061 kit:
o Make the last chance handler more robust when a process that has a
file open with global buffers enabled is terminated with STOP/ID.
STOP/ID invokes the RMS abort last chance handler
(RM$LAST_CHANCE). Problems addressed include:
o A potential deadlock in the RMS last-chance handler if the
same file is open more than once within a process or
subprocess and it has global buffers set on it.
o A potential fatal KRNLSTAKNV system crash due to nonfatal
RMS bugchecks from failing $GETLKI calls which cause error
rundown recursion until the stack fills up.
These problems are fixed in OpenVMS Alpha V7.1.
NOTE: Until the system can be upgraded to OpenVMS Alpha V7.1
or this remedial kit can be installed, these symptoms may be
avoided by refraining from the use of the 'STOP/ID' DCL
command. Instead, if a process needs to be terminated (which
should be a rare event), use a program that does a $FORCEX,
waits a few seconds, and then does a $DELPRC.
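The order of operations in the workaround above can be sketched as follows. This is only a model in Python, not OpenVMS code: force_exit and delete_process are hypothetical callables standing in for $FORCEX and $DELPRC, and the 3-second default is an assumed reading of "a few seconds".

```python
import time


def stop_process_gently(pid, force_exit, delete_process,
                        grace_seconds=3.0, sleep=time.sleep):
    """Model of the STOP/ID workaround: force image exit, wait, delete.

    force_exit and delete_process are hypothetical stand-ins for the
    $FORCEX and $DELPRC system services. grace_seconds is an assumed
    value; the kit notes only say "a few seconds".
    """
    force_exit(pid)        # ask the image to exit so normal rundown can run
    sleep(grace_seconds)   # give the rundown time to complete
    delete_process(pid)    # then delete the process itself
```

The point of the ordering is that the image gets a chance to run down normally before the process is deleted.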
o Fix to exclude block I/O access from XAB area maximum check on
the open of an indexed file.
This problem is fixed in OpenVMS Alpha V7.1.
o This kit contains a fix for a compressed primary key problem that
occurs when records are deleted using sequential access rather than
keyed access.
While sequentially walking through (accessing) the primary data
records in an indexed file whose primary key is a string key with
key compression enabled, one of the records is deleted.
The following error may be returned when an attempt is made to
sequentially get the next record:
RMS-F-BUG, fatal RMS condition, process deleted.
The file is not left corrupted. A convert of the file will
leave the file in a consistent state.
This problem is fixed in OpenVMS Alpha V7.1.
o Fix for crash with kernel-mode caller.
A kernel-mode caller may crash with KRNLSTAKNV (kernel stack
not valid). A page in the kernel stack has the 'fault on
write' bit set.
This problem is fixed in OpenVMS Alpha V7.1.
o Fix for RMS bugcheck NOTLOCKED (R2=FFFFFFEF) on shared
sequential file.
A blocking AST is delivered to the process to release the file
lock. Before releasing the file lock, the process has to release
the lock it holds on the EOF buffer. A NOTLOCKED bugcheck is
returned if the process is holding a PW (protected write) rather
than an EX (exclusive) lock on the EOF buffer. The check should
be for either an EX or a PW lock.
This problem is fixed in OpenVMS Alpha V7.1.
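The change to the EOF buffer lock check described above can be modeled as follows. This is a minimal sketch in Python, not the RMS source: the function names are hypothetical, and the enum simply lists the six lock manager modes.

```python
from enum import Enum


class LockMode(Enum):
    """Lock manager modes; the numeric values here are illustrative."""
    NL = 0
    CR = 1
    CW = 2
    PR = 3
    PW = 4
    EX = 5


def eof_lock_ok_before_fix(mode):
    # Old check: anything other than EX led to the NOTLOCKED bugcheck.
    return mode is LockMode.EX


def eof_lock_ok_after_fix(mode):
    # Corrected check: either EX or PW is an acceptable lock mode.
    return mode in (LockMode.EX, LockMode.PW)
```

With the old check a process holding a PW lock fails the test; with the corrected check it passes, while weaker modes such as PR are still rejected.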
o A loop may occur at the EXEC-MODE AST level during the deletion
of network logical links. The process that loops cannot be deleted
and will be an application that does network operations.
RMS maintains a process cache of recent logical links for a
period of time in an attempt to reuse them and save on image
and process activations. When a link becomes inactive and is
added to the cache, RMS deletes any links in surplus of three.
In walking the link cache queue to delete these links, there is
a potential race condition that could result in a stale pointer
being used after a stall, which leads to the loop.
This problem is fixed in OpenVMS Alpha V7.1.
o This kit contains an autoextend fix for shared block I/O writes,
which require use of the UPI option that disables all locking
(file or EOF bucket).
In this particular case, with no locking synchronization, two or
more autoextends may occur concurrently if two or more processes
or two or more asynchronous streams are writing to a file. Though
the file system will do the extends synchronously, the processing
of the extend results may complete out of sequence and result in
RMS bugcheck FTL$_BADEBKHBK (R2=FFFFFFDC).
This problem is fixed in OpenVMS Alpha V7.1.
o Improved error reporting on ASB allocation failure. If there
is a user context to return the error to, RMS$_DME will now be
returned instead of RMS$_BUSY.
When RMS allocates an Asynchronous Status Block (ASB) for an
RMS thread from process-permanent pages (PIOPAGES), and the
allocation attempt fails due to lack of memory, the error is
not properly handled: in some cases the error is ignored,
leading to subsequent hangs and inconsistencies, or else the
true cause of the error is obscured by overlapping bugchecks.
These allocation failures are most likely to be encountered
with RMS journaling applications, and can be avoided by
increasing the PIOPAGES SYSGEN parameter.
This problem is fixed in OpenVMS Alpha V7.1.
o Non-indexed file opened with an indexed file XAB may ACCVIO.
If a non-indexed file (sequential or relative) is opened with
one or more XABs specific to indexed files (XABKEY, XABSUM, or
XABALL) chained to the FAB, the process may either:
o Terminate (SYSGEN parameter BUGCHECKFATAL not enabled) with
an access violation (ACCVIO); or
o Crash the system (BUGCHECKFATAL enabled) with a SSRVEXCEPT
ACCVIO.
A necessary condition for this to occur is that the virtual
address space used for the program region (P0) must be more
than half a gigabyte. In other words, it requires both a very
large application and one that has indexed XABs chained to the
FAB for a non-indexed file. We are aware of only one application
that this problem has occurred with to date: ALL-IN-1.
This problem is fixed in OpenVMS Alpha V7.1.
NOTE: Until the system can be upgraded to OpenVMS Alpha V7.1
or this remedial kit can be installed, these symptoms may be
avoided by modifying the application's source code. Remove any
of the indexed XABs (XABSUM, XABKEY, or XABALL) from the chain
off the FAB for any file that is not an indexed file.
o Fix for SDA-W-FAOERR when displaying RMS FWA (File Work Area).
The contents of a buffer (LOGNAM) are being displayed after the
space has been returned and reallocated for some other use.
This problem is fixed in OpenVMS Alpha V7.1.
o Statistics setting on a file is lost from the FDL produced by
ANALYZE/RMS/FDL. File monitoring is always set to 'no' whether
it is enabled or not. The problem is not with ANALYZE[/RMS],
but with RMS's processing of an XABITM list.
This problem is fixed in OpenVMS Alpha V7.1.
Problems Addressed in the ALPRMS03_061 Kit:
o A system or cluster may hang with several processes in an LEF (local
event flag) state waiting for a lock to be converted or granted.
The symptoms for this problem are:
o A process holds a lock (PR, PW, or EX) on a widely used file.
o The lock held is blocking other processes (the lock resource
has one or more processes in the conversion or wait queue)
o The blocking process, once identified, is found to have
executive-mode AST delivery disabled. If the file is one used
for login (such as RIGHTSLIST.DAT), no new processes can log in
to the system or cluster.
With exec-mode ASTs disabled, a blocking AST cannot be delivered to
a process for it to release a lock it is holding that is blocking
some other process. For a widely used file this can lead to a queue
of processes waiting for a lock. So, what begins with one process
hung in a wait state because it can't be interrupted by an exec-mode
AST, can slowly lead to many processes on a system or cluster being
in a wait state.
This problem is fixed in OpenVMS Alpha V6.2.
o The symptom for this problem is that a process remains in an LEF
(local event flag) state, though the operation it is waiting for has
completed.
To determine that the problem has occurred, you may use SDA to
locate the RAB or FAB for the pending RMS operation, and examine the
status longword (RAB$L_STS or FAB$L_STS). If it is non-zero, and
the process remains in LEF, the problem has occurred.
This problem is fixed in OpenVMS Alpha V6.2.
o Fix for DAP (Data Access Protocol) violation on $WRITE. This is
only a problem when the $WRITEs are issued on a Phase V node.
Under DECnet/OSI, an RMS process issuing $WRITEs to a remote node
which encounters a non-continuable error (such as device-full) may
hang along with the remote FAL. An example is using BACKUP on a
Phase V node to write a saveset to another node (which may be Phase IV
or Phase V) when the remote device fills up.
This problem is fixed in OpenVMS Alpha V6.2.
o A process hangs during file lookup. This problem happens only in
the context of multithreaded and/or AST driven servers that have
multiple $CREATEs, $OPENs and/or $PARSEs outstanding. The symptoms
include:
o The process hangs (executing inside both RMS and the XQP), and
file lookups fail with RMS$_FNF or RMS$_DNF errors and a
secondary status of SS$_NOSUCHFILE.
o A file of the same name is returned from a superior directory,
also of the same name (e.g., a lookup of [A.A]B.C might return
[A]B.C).
This problem is fixed in OpenVMS Alpha V6.2.
o Make the last chance handler more robust to re-entry, to avoid a
potential system crash when a process that has a file open with
global buffers enabled is terminated with STOP/ID.
STOP/ID invokes the RMS last chance handler (RM$LAST_CHANCE). The
last chance handler was written with the assumption that it will be
called exactly once during any process rundown. An example of this
assumption is the dequeuing of the global buffer header (GBH) lock,
though the pointer to it is not cleared.
Evidence from several recent crash dumps suggests that it may be
possible for a STOP/ID to result in the last chance handler being
entered more than once. If this happens and the process being
stopped has global buffers enabled on one or more files, a fatal
KRNLSTAKNV system crash may result due to nonfatal RMS bugchecks
from failing $GETLKI calls which cause error rundown "recursion"
until the stack fills up.
This problem is fixed in OpenVMS Alpha V7.0.
o Audit check fix for installed /OPEN images. Incomplete security
success audit information is generated upon image activation of
installed /OPEN images in some cases.
This problem is fixed in the next release after OpenVMS Alpha V7.0.
o Fix for Known File Entry (KFE) file open SSRVEXCEPT (Unexpected
system service exception) due to ACCVIO in RM$KNOWNFILE (RMS0OPEN).
Heavy concurrent INSTALL and F$FILE_ATTRIBUTE usage may cause
locking conflicts accessing the KFE list, which can result in a bad
KFE pointer being handed to RMS leading to ACCVIO.
This problem is fixed in OpenVMS Alpha V7.0.
o Quota borrowing implemented for RMS internal AST to avoid an RMS
bugcheck. An RMS process that's operating at ASTLM may bugcheck on
an AST generated internally by RMS for rescheduling execution. This
extra AST is currently charged against the user's ASTLM.
This problem is fixed in OpenVMS Alpha V6.2.
o This fix addresses several synchronization problems with $WAIT on a
FAB:
o Status is currently cleared on the initial entry to $WAIT so it
can destroy the status of the operation it is checking.
o $WAIT on a $CLOSE does not wait.
These problems are fixed in OpenVMS Alpha V6.2.
o An update to the end-of-file for a sequential file with an undefined
(UDF) record format inappropriately rounds the first free byte (FFB)
up to an even number. This results in the last record, if an odd
size, containing one more byte than it should.
This problem is fixed in OpenVMS Alpha V6.2.
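The effect of the rounding described above can be illustrated with a little arithmetic. This is a sketch in Python under stated assumptions: the end of file is modeled as an end-of-file block number plus a first free byte (FFB) within that 512-byte block, and the function name and round_to_even flag are hypothetical.

```python
BLOCK_SIZE = 512  # bytes per disk block


def eof_position(byte_length, round_to_even):
    """Return (end-of-file block, first free byte) for a file length.

    round_to_even=True models the faulty behavior for UDF files, where
    an odd length is rounded up to the next even byte.
    """
    if round_to_even and byte_length % 2:
        byte_length += 1
    eof_block = byte_length // BLOCK_SIZE + 1   # blocks are numbered from 1
    ffb = byte_length % BLOCK_SIZE              # offset within that block
    return eof_block, ffb


# A UDF file holding 1025 bytes of data:
correct = eof_position(1025, round_to_even=False)   # FFB stays odd
rounded = eof_position(1025, round_to_even=True)    # FFB one byte too high
```

In the rounded case the recorded end of file lies one byte past the real data, which is why the last record appears to contain one extra byte.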
o Add check of extend values returned in FIB by file system.
A check has been added to the RMS code to test the assumption that
if the file system (XQP) returns no error for an autoextend, the
values returned in the FIB (FIB$L_EXSZ and FIB$L_EXVBN) result in an
increase in the size of the HBK. If there is an increase (as
expected), then the HBK in the IFAB is updated. If there is no
increase, an RMS bugcheck (FTL$_BADEBKHBK) is returned and the
process is terminated. This check was added to guard against the
possibility of an infinite loop that might occur under the remote
circumstance of the XQP failing to update the FIB values for a
successful extend operation in the context of a process in an
overdrawn quota state.
This problem is fixed in OpenVMS Alpha V6.2.
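The added check on the extend values can be sketched as follows. This is a model in Python, not the RMS code: the class and function names are hypothetical, and the formula for the new HBK (first extended VBN plus extend size, minus one) is an assumption about how the two FIB values combine.

```python
class BadEbkHbk(Exception):
    """Stands in for the FTL$_BADEBKHBK RMS bugcheck."""


def apply_extend(old_hbk, exsz, exvbn):
    """Validate the values the file system returned for an extend.

    exsz and exvbn model FIB$L_EXSZ and FIB$L_EXVBN. The new highest
    block is assumed to be exvbn + exsz - 1.
    """
    new_hbk = exvbn + exsz - 1
    if new_hbk <= old_hbk:
        # No growth although the extend reported success: stop here
        # rather than retry forever.
        raise BadEbkHbk("extend reported success but HBK did not grow")
    return new_hbk  # value to store as the HBK in the IFAB
```

Failing fast in the no-growth case is what removes the possibility of the infinite loop mentioned above.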
o Fix added to change dangling SIDR (Secondary Index Data Record)
error status from RMS$_BUG to RMS$_RNF.
This fix changes the handling of a dangling SIDR pointing to a
nonexistent record from an RMS$_BUG status to a status of RMS$_RNF.
Also a cleanup of the SIDR element is done if the process doing the
$GET (or $FIND) has WRITE access to the file.
If a process has any modified buckets whose writeback to disk has
been deferred (deferred-write option) when a power failure, system
crash or a STOP/ID occurs, the indexed file may be left with a
SIDR pointing to a nonexistent primary data record. This can happen
when some other process has requested through the lock manager
access to a modified SIDR data bucket but not the primary data
bucket. This request triggers a blocking AST to writeback the SIDR
bucket to disk. If the process with the modified buckets closes the
file or exits normally thereafter, then any remaining modified
buckets are written back to disk and now the SIDR points to a
primary data record that exists in the "disk" file. It is the
combination of both deferred-write and the rare event of a power
failure, system crash or a STOP/ID that may leave an indexed
file with this inconsistency. (Note that enabling RMS RU journaling
automatically enables deferred-write.)
If a subsequent $GET or $FIND via a secondary key attempts to
retrieve a nonexistent primary data record pointed to by a dangling
SIDR, then without this fix an RMS$_BUG status (with an associated
RAB$L_STV of RMS$_RNF) will be returned. Although the text of the
message associated with RMS$_BUG indicates the process has been
terminated, in the case of a dangling SIDR, this is not true. Even
though the process is not terminated, a side-effect of handling this
condition as a bug is that the internal context for the record
operation (IRAB) is not updated. The result of this is that it is
impossible for a process to recover from this error.
RMS$_BUG is a status returned for a broad spectrum of conditions
that RMS views as suggesting some risk in continuing any further
processing. In its regular use of RMS$_BUG, it identifies the
specific condition associated with the bug in R2 and does in fact
terminate the process. In the case of a dangling SIDR pointing to a
nonexistent record there is no risk in continuing.
The handling of the error status varies depending on whether the
dangling SIDR is encountered as part of an exact match keyed access
rather than either a sequential access or a NXTEQ/NXT keyed access:
o Exact match keyed retrieval
Status (RAB$L_STS) of RMS$_RNF is returned. RNF is a status
routinely checked for by existing applications so this change
will not require modification of existing applications. In
addition a nonzero status value is returned in the status value
(RAB$L_STV) to indicate the special case of a dangling SIDR
pointing to a record-id that is inconsistent with the
information in the primary data bucket header. Except for this
condition, a status of RMS$_RNF will continue to have a status
value of zero associated with it.
This has the advantage of conveying the information to the user
that this special case of a dangling SIDR has occurred while
allowing the application to continue. Applications can build
into the design of any exact keyed retrievals via a secondary
key, checking the associated status value and signaling it when
it is nonzero, if they so choose, or simply displaying an
informational text message and continuing.
o Sequential access or NXTEQ/NXT (KGE/KGT) keyed retrieval
No explicit status will be returned to the user. When this
condition is detected, RMS silently advances to the next SIDR
element or array attempting to find the next "non-deleted"
primary data record via the secondary key.
This problem is fixed in OpenVMS Alpha V6.2.
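The two cases of dangling SIDR handling described above can be modeled as follows. This is a simplified sketch in Python, not RMS internals: the dictionaries stand in for the secondary index and the primary data records, DANGLING_SIDR_STV is a placeholder (the text says only that the status value is nonzero), and the cleanup of the SIDR element for callers with write access is omitted.

```python
RMS_NORMAL = "RMS$_NORMAL"
RMS_RNF = "RMS$_RNF"
DANGLING_SIDR_STV = 1  # placeholder: the notes only say "nonzero"


def exact_match_get(sidr, primary, key):
    """Exact-match retrieval via a secondary key.

    sidr maps secondary key -> record id, primary maps record id ->
    record. Returns (sts, stv, record).
    """
    if key not in sidr:
        return RMS_RNF, 0, None                  # ordinary record not found
    rid = sidr[key]
    if rid not in primary:
        return RMS_RNF, DANGLING_SIDR_STV, None  # dangling SIDR: nonzero STV
    return RMS_NORMAL, 0, primary[rid]


def sequential_get(sidr_entries, primary, position):
    """Sequential retrieval over (key, record id) pairs in key order.

    Dangling entries are skipped without any status being reported.
    Returns (next_position, record), or None when no entries remain.
    """
    for pos in range(position, len(sidr_entries)):
        _key, rid = sidr_entries[pos]
        if rid in primary:
            return pos + 1, primary[rid]
    return None
```

An application doing exact-match retrievals can tell the dangling case apart from an ordinary RMS$_RNF by testing the status value for nonzero.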
o Fix to validate key-of-reference for $CONNECT & $REWIND.
Key-of-reference in RAB (RAB$B_KRF) is one of the input fields to
both $CONNECT and $REWIND. Documentation for each of these
services indicates RMS$_KRF (invalid key-of-reference) as one of
the possible error statuses that can be returned by each of these
services. No validation, however, is done of key-of-reference by
either of these services. Currently, it is not done until either
a $GET or $FIND operation.
This problem is fixed in OpenVMS Alpha V6.2.
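The earlier validation of the key-of-reference can be sketched as follows. This is a model in Python, not the RMS code: the function name and the key_count parameter are hypothetical, and keys are assumed to be numbered from 0 (the primary key) upward.

```python
def validate_key_of_reference(krf, key_count):
    """Model the check now made at $CONNECT and $REWIND time.

    krf models RAB$B_KRF; key_count is the number of keys defined for
    the indexed file. Keys are assumed to be numbered 0..key_count-1.
    """
    if 0 <= krf < key_count:
        return "RMS$_NORMAL"
    return "RMS$_KRF"  # invalid key-of-reference, reported immediately
```

With the fix, an out-of-range key-of-reference is reported by the service that received it instead of surfacing later on the first $GET or $FIND.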
o Fix for a relative file with a fixed-length record format.
This problem is restricted to a relative file with a fixed length
record format that has global buffers enabled on it. An improper
move instruction (MOVW that has been corrected to MOVZWL) was used
to move maximum record size (word in length) into a register for
relative file with fixed length record format. Prior to V6.1 this
register happened to always be clear so no problem was encountered.
For V6.1, if the file has global buffers enabled, the register is
not clear. This results in an RMS-F-UBF (invalid user buffer) error
being (erroneously) reported to the user.
This problem is fixed in OpenVMS Alpha V6.2.
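The difference between the two move instructions can be shown with bit arithmetic. This is a sketch in Python that models a 32-bit register; the function names are hypothetical and the model covers only the behavior relevant here: MOVW replaces the low 16 bits of the destination and leaves the upper 16 bits alone, while MOVZWL zero-extends the word to a full longword.

```python
def movw(register, word):
    """Word move: only the low 16 bits of the register are replaced."""
    return (register & 0xFFFF0000) | (word & 0xFFFF)


def movzwl(word):
    """Zero-extended move: the whole 32-bit register is replaced."""
    return word & 0xFFFF


max_record_size = 512

# Register happens to be clear: both instructions give the same value.
clean = movw(0x00000000, max_record_size)

# Register holds leftover upper bits: MOVW yields a bogus, too-large size.
stale = movw(0x00010000, max_record_size)

fixed = movzwl(max_record_size)
```

This matches the description above: as long as the register happened to be clear the word move was harmless, and once the upper bits were left set the record size was wrong.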
o A key-greater-than $GET or $FIND to a newly created indexed file
that has not had a single record added to it (primary index is
uninitialized) returns an RNF (record not found) error to the user.
A key-less-than $GET or $FIND to a newly created indexed file that
has not had a single record added to it (primary index is
uninitialized) returns an IDX (index not initialized) error to the
user.
While strictly speaking the errors could be different for the two
cases, this requires different handling by the user. The error
reported to the user for a key-less-than $GET or $FIND in the case
of a newly created file that has never had any records added to it
(index is uninitialized) has been changed to be consistent with the
error reported under the same circumstances for a key-greater-than
$GET or $FIND. Specifically, the error that will now be reported
for both cases will be RNF (record not found).
This change was made in OpenVMS Alpha V6.2.
o RMS-F-AID error (invalid area ID) is not reported on the open of an
indexed file with an XABALL declaration that has an area ID that is
exactly one greater than the maximum number of areas in that indexed
file.
If FDL$GENERATE is used to generate an FDL from this file open, a
bad area definition in the FDL will be created.
This problem is fixed in OpenVMS Alpha V7.0.
o An ISAM RMS nonfatal bugcheck is returned by RM$EXPAND_KEY (module
RM3PCKUNP). This bugcheck is forced if key expansion for the
deletion of a SIDR would overflow a SIDR data bucket.
The current problem occurs in the context of an $UPDATE that changes
a secondary key value (with key compression enabled) if both the
SIDR data bucket freespace is exactly equal to the bucket size minus
one and the key expansion of the next SIDR record (if any) results
in a gain in bytes exactly equal to the number of bytes to be
deleted. For both conditions to be satisfied makes the probability
of this problem's occurrence extremely rare.
The update to the primary record will have been completed; the
problem occurs during the delete operation to remove the old
secondary key value after the new updated secondary key value has
been added. The file is not left corrupted. A convert of the file
will rebuild the secondary indexes and leave the file in a
consistent state.
This problem is fixed in OpenVMS Alpha V7.0.
o Fix for a SIDR compressed key expansion problem that could result in
a corrupted SIDR bucket. The conditions that are required make the
probability of its occurrence extremely rare.
For this problem to occur, all of the following are required: (1)
the presence of many duplicate records for a secondary key, (2) a
secondary key of string type that is 9 bytes or more in length, and
(3) a very particular
key pattern which results in a number of bytes at the front of the
key being compressed. The problem occurs in the context of a bucket
split when the current design moves an entire SIDR array into a new
bucket. It is possible that when the compressed secondary key for
that SIDR array is expanded in its position as the first record in
the new bucket, it may grow larger than the bucket.
ANALYZE/RMS_FILE will most likely report the following error:
"Invalid first free byte offset in bucket header."
Since the problem is with a secondary key bucket, a convert will
recover all the data records in the corrupted file.
This problem is fixed in OpenVMS Alpha V7.0.
o Fix to $TRUNCATE for zero-byte record problem for STREAM_LF
sequential file.
The RMS service $TRUNCATE does not properly handle a zero-byte
record in a STREAM_LF file. Truncating a STREAM_LF file after
reading a zero-byte record will fail with an RMS-F-CUR (no current
record) error. This problem is restricted to a sequential file with
a STREAM_LF record format.
This problem is fixed in OpenVMS Alpha V6.2.
o Running APL (which creates a system logical name that points to the
terminal) and then stopping (STOP/ID) the process results in the
terminal being unavailable. Subsequent attempts to log in to this
terminal fail. This is due to the channel not being properly deassigned.
This problem is fixed in OpenVMS Alpha V6.2.
o Fix for RMS-F-MBC (invalid multi-block count) in callable CONVERT.
If callable convert (CONV$CONVERT preceded by CONV$PASS_FILES and
CONV$PASS_OPTIONS) is called multiple times, with the FDL option
enabled in an earlier call and disabled in a later call, the call
may fail with the following error:
%CONV-F-OPENOUT, error opening as output
-RMS-F-MBC, invalid multi-block count
The problem was introduced in OpenVMS Alpha V6.1, and is due to a
stale global pointer. If the user does multiple calls, with the
FDL option set on an earlier call and disabled on a later call, it
currently will use a stale global pointer to virtual memory (VM)
that was released on the earlier call to set the multi-block count
(MBC) in the output file's RAB. If the VM has been reallocated for
other purposes, an invalid MBC can result.
The failure is unpredictable since it requires the space to have
been reallocated and some specific bytes within this space to have
been overwritten with invalid values.
This problem is fixed in OpenVMS Alpha V6.2.
o The following error will be reported in converting a VFC-format file
to a fixed-format file using the /PAD qualifier:
%LIB-F-BADBLOADR, bad block address
The problem is restricted to a convert job that involves an input
file with a VFC record format that is being converted to an output
file with a fixed record format using the /PAD qualifier.
This problem was introduced in OpenVMS Alpha V6.1 and is fixed in
OpenVMS Alpha V7.0.
Problems Addressed in the AXPRMS02_061 Kit:
o Repeated intermittent SSRVEXCEPT bugchecks in UCX$inet_acp
following an RMS bugcheck R2=FFFFFFFD.
This problem is fixed in OpenVMS Alpha V6.2.
o If one of the files being closed by the RMS abort rundown
procedure has global buffers set on it and there is an
end-of-file (EOF) sublock associated with the system-owned global
buffer header (GBH) parent lock, then the EOF sublock should be
dequeued as part of rundown. Without the fix an incorrect branch
instruction results in a $DEQ being done if the lock-id is zero
(SS$_INVLOCKID returned) rather than nonzero.
This problem is fixed in OpenVMS Alpha V6.2.
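The inverted branch described above can be modeled as follows. This is a sketch in Python, not the rundown code: the function names are hypothetical, and deq is a caller-supplied stand-in for the $DEQ system service.

```python
def rundown_before_fix(eof_lock_id, deq):
    """Faulty test: the dequeue is attempted only when the id is zero."""
    if eof_lock_id == 0:
        return deq(eof_lock_id)   # fails: zero is not a valid lock id
    return None                   # a real EOF sublock is left queued


def rundown_after_fix(eof_lock_id, deq):
    """Corrected test: dequeue the EOF sublock only when one exists."""
    if eof_lock_id != 0:
        return deq(eof_lock_id)
    return None


def fake_deq(lock_id):
    # Hypothetical stand-in for $DEQ, used only to exercise the model.
    return "dequeued" if lock_id != 0 else "invalid lock id"
```

In the faulty version an existing EOF sublock is never dequeued, and a zero lock id produces a failing dequeue call.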
o If the data bucket pointed to by the level 1 "high key" index
value for a duplicate key (primary or secondary) has one or more
continuation buckets chained to it and the last bucket is full,
under rare conditions a race condition may occur when two
processes are concurrently adding (putting) records to a
duplicate key, each with a key value higher than the highest key
value currently stored in the file.
The user-visible symptoms are as follows:
* If the index is compressed, an infinite loop will occur due
to an attempt to add a duplicate entry to the level 1 index
bucket.
* If the index is not compressed, a duplicate entry will
successfully be added to the level 1 index bucket.
NOTE: An index bucket should never have a duplicate entry
for any index value.
In either case this can result in:
* Records in the level 0 data chain being out of sorted order.
* Records being hidden from a keyed lookup (though visible
with a sequential scan).
Workaround:
The file should be converted.
This problem is fixed in OpenVMS Alpha V6.2.
o Fix to convert for undefined record format input file. In the
case of an UNDEFINED input file, the convert may report:
%LIB-F-BADBLOADR, bad block address
or
%FDL-E-INVBLK, invalid RMS control block at virtual address 000000
Workaround:
The records would have to be read and rewritten to the new
file using a program.
This problem is fixed in OpenVMS Alpha V6.2.
o If the caller calls RMS in EXEC mode and specifies an ERR AST and
an error occurs early in the RMS service, the user will see an
ASTFLT bugcheck or a system service exception. The simplest way
to replicate the problem is by issuing the following command in
mail:
MAIL> SHOW FORWARD/USER=*
when there are no user forwarding addresses.
This problem is fixed in OpenVMS Alpha V6.2.
Problems Addressed in the AXPRMS01_061 Kit:
o A fix to the Convert/Reclaim utility implemented in OpenVMS VAX V6.1
and OpenVMS Alpha V6.1 contains a new feature that may leave a
secondary index data (SIDR) bucket in a state that under the right
circumstances could subsequently become corrupted.
Any file with at least one secondary key allowing duplicates that
has a Convert/Reclaim performed on it on an OpenVMS V6.1 system
is a possible candidate for this problem. MAIL.MAI files are
prime candidates, since the VMS Mail utility performs automatic
Convert/Reclaim operations on these files transparent to the user.
Any of the following errors may be returned when corruption of
the user's mail file is encountered:
- RMS-F-IRC, illegal record encountered; VBN or record number = nn
- MAIL-W-WRITERR, error writing 'file-spec'
- MAIL-F-CODERR, internal coding error. Please submit an SPR.
Before the corruption is noticed, a CONVERT/RECLAIM will have been
performed on the mail file via one of the following methods:
+ If AUTO_PURGE is enabled or a PURGE command issued and more
than 32767 bytes of deleted messages have accumulated, Mail
will automatically perform a CONVERT/RECLAIM on the user's
behalf.
+ The user issues a PURGE/RECLAIM in Mail.
+ A CONVERT/RECLAIM MAIL.MAI command is issued at DCL.
Any new mail messages received after the corruption occurs may be
orphaned. This occurs because the external mail file
(MAIL$xxx.MAI) is created before the entry into the mail file
(MAIL.MAI) fails.
The order of events leading up to the problem is as follows:
1. Even though a SIDR bucket is found to contain only deleted
records, under a few rare conditions, Convert/Reclaim may
assess that the bucket cannot be reclaimed. For any such
"empty" bucket that is not reclaimed, the V6.1 fix adds the
resetting of the freespace to the beginning of the data
portion of the bucket immediately after the bucket header
(byte offset hex '0E').
2. At the end of the Convert/Reclaim, the data file is not
corrupted. An ANALYZE/RMS of this file will show no errors.
However, at some subsequent point, if a new record is
added to the data file, one of the SIDR buckets may become
damaged if the data file has the following set of
characteristics:
o At least one secondary key that allows duplicates
o At least three SIDR buckets associated with a secondary
key allowing duplicates were not reclaimed
o The second of these SIDR buckets had its freespace
reset to hex '0E'
o A continuation bucket for a different key value
immediately precedes the SIDR bucket with the freespace
of hex '0E'
3. If all of these characteristics are met, then after the
Convert/Reclaim, the SIDR bucket with the freespace of hex
'0E' may become damaged if the continuation bucket preceding
it ever becomes full and another record with that same key
value is added. The problem occurs because the RMS code
has a built-in assumption that a SIDR bucket will always
contain a key value at the beginning of the bucket, as a
place holder for the index entry above, even if all the
records in the bucket have been deleted.
Any access (either an attempt to do a read or another record
insert) to this SIDR bucket once it has become damaged returns
the following error:
RMS-F-IRC, illegal record encountered; VBN or record number = nn
And once the SIDR bucket has become damaged, an ANALYZE/RMS of
the file reports the following two errors:
*** VBN nn: Data record spills over into free space of bucket.
Unrecoverable error encountered in structure of file.
If there is a need to verify that this error is due to this
specific problem, then use ANALYZE/RMS/INTERACTIVE to position to
the bucket identified in the error (POSITION/BUCKET vbn). If the
corruption is due to this problem, "Free Space Offset" will be
hex '13' (except for larger VBNs it may be either hex '14' or '15').
Since any damage is restricted to the secondary index structure,
all the data records can be recovered.
This problem was introduced in OpenVMS Alpha V6.1 and is fixed in
OpenVMS Alpha V6.2.
Workaround:
No data records will be lost due to this problem. Any damage is
restricted to the secondary index structure. Therefore, all the
records can be recovered and the secondary index structure
rebuilt by performing a full Convert on the file:
CONVERT infile outfile
o When converting a record from any record format without FORTRAN
carriage control (in any type of file organization) to a
fixed-length record format with FORTRAN carriage control, the
last character from the input record may be overwritten by the
pad character if the record is short and requires padding.
This problem is fixed in OpenVMS Alpha V6.2.
o EDIT/FDL/NOINTERACTIVE may redefine key segments, and the PRINT
record attribute is not accepted for relative files.
This problem was introduced in OpenVMS Alpha V6.1 and is fixed in
OpenVMS Alpha V6.2.
INSTALLATION NOTES:
In order for the corrections in this kit to take effect, the system must
be rebooted. If the system is a member of a VMScluster, the entire
cluster should be rebooted.
If you cannot reboot the other nodes that use this common system disk
then perform the following:
o The CONVSHR.EXE image must be re-installed on the other nodes
that use the same system disk as the one this kit was installed
on. To do this perform the following command:
$ INSTALL REPLACE SYS$SHARE:CONVSHR.EXE
In order for your system to operate properly, the ALPSYS17_061 kit MUST
be installed prior to installing the ALPRMS04_061 kit. If the
ALPSYS17_061 has not been installed and you attempt to install the
ALPRMS04_061 kit, the installation will fail.
It is strongly recommended that this kit be applied immediately after any
system is upgraded to OpenVMS Alpha V6.1 before production activity
begins on the system. Once the kit is applied, no data files thereafter
using Convert/Reclaim will be left with unreclaimed buckets with a
freespace of hex '0E'; therefore, this problem will not occur.
If the kit is applied after there has been some production activity on a
V6.1 system, the remedial images do not correct any files that already
contain any buckets with a freespace of hex '0E'. A full convert
(CONVERT infile outfile) is needed to take care of this. In other words,
once the kit is applied, a file can experience the same corruption
thereafter if Convert/Reclaim was used on it prior to applying the kit
and it was not converted after the kit was applied. Users should not rely
on an ANALYZE/RMS revealing an error since the right conditions leading
to the corruption may not have occurred yet.
System managers already running OpenVMS Alpha V6.1 should, once the
remedial kit has been applied, advise all their users to perform a full
convert at a convenient time on any MAIL.MAI files or any other data
files on which Convert/Reclaim may have been directly used since
upgrading to OpenVMS Alpha V6.1. A precautionary convert will preclude
the possibility of an untimely occurrence of the corruption at some
future time.
Files on this server are as follows:
alprms04_061.README
alprms04_061.CHKSUM
alprms04_061.CVRLET_TXT
alprms04_061.a-dcx_axpexe