TITLE: RTR V3.2 RTRI255 RTR V3.2 for Windows NT Intel ECO Summary
Copyright (c) Compaq Computer Corporation 1999. All rights reserved.
Modification Date: 28-DEC-99
Modification Type: Updated Kit Supersedes RTRI249
PRODUCT: Reliable Transaction Router (RTR) V3.2 for Windows NT/Intel
OP/SYS: Windows NT
SOURCE: Compaq Computer Corporation
ECO INFORMATION:
ECO Kit Name: RTRI255
ECO Kits Superseded by This ECO Kit: RTRI249
ECO Kit Approximate Size: 4868 Blocks
2492416 Bytes
Kit Applies To: RTR V3.2
System/Cluster Reboot Necessary: Unknown
Rolling Re-boot Supported: Information Not Available
Installation Rating: INSTALL_UNKNOWN
Kit Dependencies:
The following remedial kit(s) must be installed BEFORE
installation of this kit:
None
In order to receive all the corrections listed in this
kit, the following remedial kits should also be installed:
None
ECO KIT SUMMARY:
An ECO kit exists for Reliable Transaction Router (RTR) V3.2 on
Windows NT V4.0. This kit addresses the following problems:
Problems Addressed in the RTRI255 Kit (ECO-4):
o 14-1-743 Wrong return status RTR_STS_COMSTAUNO
In RTR V3.2 it was sometimes possible for a transaction
which had not yet been voted on by a server which exited
in mid-transaction to be aborted with incorrect status
RTR_STS_COMSTAUNO.
o 14-1-805 Attempt to create a partition that already
exists returns incorrect status
An attempt to create a partition that already existed
used to return the error KRINUSE (key range in use).
This has been superseded by the more explicit PRTALREXI
(partition already exists).
o 14-1-813 MONITOR SYSTEM shows WARNING on calls due to
invalid time call
The MONITOR SYSTEM monitor picture would sometimes
incorrectly display a warning state for the CALL row.
o 14-1-841 Replayed shadow transaction stuck in VOTED
The implementation of RTR's cooperative recovery
protocol algorithm has been enhanced so that some
situations which would previously hang during permanent
network link outages are now recovered correctly using
the remaining connections.
o 14-1-842 Transactions which do not specify a timeout abort
If a frontend failed over to another router, then failed
back to the original router, transactions which were in
progress could sometimes be rejected with status
RTR_STS_TXTIMOUT.
o 14-1-845 Transactions remain in VOTED state
In earlier versions of RTR it was occasionally possible
for transactions to remain hanging in VOTED state on a
shadow primary backend and be aborted with status
RTR_STS_FELINLOS on the secondary backend after network link
failures in a slow WAN environment.
o 14-1-846 Transactions remain in SENDING state
In previous versions of RTR it was occasionally possible
for a transaction to remain hanging in SENDING state
on a backend after a network partition had forced the
backend to lose quorum.
o 14-5-111 Certain RTR commands now recorded in the RTR
operator log
Operator log files created by previous versions of
RTR could sometimes be difficult to interpret. By
recording certain RTR commands, such as START RTR and
CREATE FACILITY, the RTR log file has become easier to
interpret.
o 14-5-156 Logging partition state transitions insufficient
Previous versions of RTR did not report backend
partition state transitions in the operator log file
with sufficient detail.
Backend partition state transitions are now reported as
follows:
- Previously unlogged state transitions are recorded in
the operator log with the new PRTSTATRA message.
- The PRTBEGIN message is no longer generated.
- The PRTCREATED and PRTEND message formats have been
changed to match that of the PRTSTATRA messages.
o 14-8-267 V3.1D-to-V3.2 Journal incompatibility corrected
If you upgraded from V3.1D to an earlier version of V3.2,
it was possible to encounter situations which caused the
RTR ACP to crash.
o 14-8-287 Named partition state change caused crash
When using the CREATE PARTITION command, it was possible
for RTR to crash on the backend node if the last channel
using the partition was closed at the same instant that a
state-change message from the router was pending.
Problems Addressed in the RTRI249 Kit (ECO-1):
o 14-1-433 Show transactions not recovered on link
break/reconnect
If a secondary shadow backend lost its link to the RTR
router after the router had sent a vote request, and the
server on the primary shadow accepted the transaction,
then in unusual circumstances it was possible that
the transaction would not be immediately recovered on
the secondary shadow after the link to the router was
re-established. In such cases it required a cycle of
the servers on the secondary site for the remembered
transaction to be recovered from the primary shadow
journal.
This has now been fixed.
o 14-1-617 Problems with DUMP JOURNAL
In previous versions of RTR, qualifiers which required a
value did not generate an error if the value was not
supplied or was supplied incorrectly. Incorrect or
missing values now generate an error message.
If a string of fewer than five characters was passed for
partition record class, the partition record counter
was not updated and the record was not available. These
problems have been fixed by comparing each character
instead of five characters at a time.
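The short-string failure described above can be illustrated with a minimal Python sketch. It is not RTR source code: the record class names, the function names, and the assumption that the supplied value is matched as an abbreviation are all hypothetical, chosen only to show why comparing a fixed five characters at a time misses values shorter than five characters.

```python
# Illustrative model only; not RTR code. All names here are hypothetical.
RECORD_CLASSES = ("partition", "transaction", "facility")


def match_fixed_width(supplied, width=5):
    """Old-style check: compare a fixed block of `width` characters."""
    # A value shorter than `width` can never equal a full-width slice.
    return [c for c in RECORD_CLASSES if c[:width] == supplied[:width]]


def match_per_character(supplied):
    """Revised check: compare one character at a time."""
    matches = []
    for candidate in RECORD_CLASSES:
        if not supplied or len(supplied) > len(candidate):
            continue  # empty or over-long values never match
        if all(a == b for a, b in zip(supplied, candidate)):
            matches.append(candidate)
    return matches
```

In this model the value "part" is missed by the fixed-width check but matched by the per-character check, while a full-length value is matched by both.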
o 14-1-777 Transaction state is not getting EXCEPTION
after issuing rtr_close/imme
SET PARTITION /RECOVERY_RETRY_COUNT is new functionality
implemented in RTR V3.2. The scope of this command was
not fully documented, and is clarified here.
If an application server dies while processing a
transaction recovered from RTR journal, then RTR will
present the transaction to another (concurrent or
standby) server. The RECOVERY_RETRY_LIMIT indicates
the maximum number of times the transaction should be
presented to a server for recovery before being written
to the journal as an exception.
There are two types of recovery operations where
transactions are recovered from journal: local recovery
and shadow recovery. Shadow recovery is the process
of recovering the remembered transactions written to a
primary shadow journal while the secondary shadow site
is down.
The SET PARTITION /RECOVERY_RETRY_COUNT parameter
does not have an effect on remembered transactions
recovered during shadow recovery. That is, if there
is a killer transaction remembered in the journal on
a primary shadow node, on this node RTR does not count
the number of times the transaction is recovered by a
recovering secondary shadow node. The way to ensure that
a remembered transaction will be exceptioned by RTR is
by starting a sufficient number of concurrent servers on
the recovering secondary shadow node.
For this reason, RTR recommends that the number of
concurrent secondary shadow servers started is greater
than the value set for the RECOVERY_RETRY_LIMIT on a
partition. This will ensure that a remembered (killer)
transaction being recovered from a primary shadow
journal will be exceptioned if the retry limit is
exceeded.
Only those transactions that have reached voting stage
on a server can be exceptioned. If a server always dies
before voting on a transaction, then the transaction
will be aborted by RTR after the third try. This is
a hard-coded limit (the so-called "three strikes and
you're out" feature).
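The recommendation above can be checked with a small toy model in Python. It is a sketch under stated assumptions, not RTR code: the function names are hypothetical, and the model assumes that the recovering secondary counts one presentation per server and exceptions the transaction once that count exceeds the retry limit.

```python
# Toy model of the behaviour described above; not RTR code.
HARD_CODED_ABORT_TRIES = 3  # the "three strikes" limit quoted in the text


def recommended_min_servers(retry_limit):
    """Smallest server count that is greater than the retry limit."""
    return retry_limit + 1


def shadow_recovery_outcome(concurrent_servers, retry_limit):
    """Model a killer transaction that reaches the voting stage and then
    kills every server it is presented to during shadow recovery."""
    presentations = 0
    servers_left = concurrent_servers
    while servers_left > 0:
        presentations += 1
        servers_left -= 1  # the server dies while processing
        if presentations > retry_limit:  # assumed counting rule
            return "exception"
    return "not exceptioned (no servers left)"


def pre_vote_outcome(server_deaths):
    """A transaction whose server always dies before voting."""
    if server_deaths >= HARD_CODED_ABORT_TRIES:
        return "aborted"
    return "retry"
```

With a retry limit of 3, the model exceptions the transaction when four servers are started but not when only three are, which is why the server count should be greater than the limit.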
o 14-1-791 Backends erroneously remain inquorate after
routers trimmed
In versions V3.1D-eco14 and V3.2 of RTR it was sometimes
possible for nodes to erroneously remain inquorate
following a TRIM FACILITY operation.
This has now been fixed.
o 14-1-792 Revised rtrreq.c and rtrsrv.c sample RTR
applications
The sample client and server used in the IVP have been
extensively revised. Please pay special attention to the
comments which explain how to write a wakeup handler,
and comments drawing attention to several common
programming mistakes we have seen in RTR applications.
o 14-1-50 Looping RTR process for empty node string, e.g.,
/NODE=dna.
Specifying an incomplete node specification, such as
one with only the protocol prefix, e.g., "RTR SHOW
RTR /NODE=dna." could cause the RTR process to loop,
consuming CPU.
This problem has been fixed.
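As an illustration of the kind of check involved, the following Python sketch rejects a node specification that consists of a protocol prefix with no node name. It is not RTR code: the function name and the set of recognised prefixes are assumptions made for the example.

```python
# Illustrative sketch only; not RTR code.
KNOWN_PREFIXES = ("dna", "tcp")  # assumed set of protocol prefixes


def parse_node_spec(spec):
    """Split 'prefix.node' and reject a prefix with no node name."""
    if not spec:
        raise ValueError("empty node specification")
    prefix, sep, node = spec.partition(".")
    if sep and prefix.lower() in KNOWN_PREFIXES:
        if not node:
            # e.g. "dna." names a protocol but no node
            raise ValueError(f"incomplete node specification: {spec!r}")
        return prefix.lower(), node
    # No recognised prefix: treat the whole string as the node name.
    return None, spec
```

Rejecting the incomplete value up front means later code never has to search for a node with an empty name.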
o 14-1-582 ACP access violation
If a number of concurrent servers died in sequence
while processing the same transaction, then under rare
circumstances it was possible the ACP could also abort.
This was due to a counter being incremented incorrectly
and has now been fixed.
o 14-1-760 ACP crashed when modifying journal size
After a journal had been modified, the Flow Control
subsystem of RTR was not properly updated with the
new size. This could result in a hang or crash
situation even though the journal size was increased
to accommodate increased traffic.
This problem has been fixed.
o 14-1-763 rtr_close_channel fails for distributed
transaction
Calling rtr_close_channel while a distributed
transaction was pending caused an incorrect status to
be returned.
The correct status is now returned.
o 14-1-772 CALL CLOSE_CHANNEL defaults to IMMEDIATE
The flag RTR_F_CLO_IMMEDIATE is a new flag added in RTR
V3.2 that allows the caller to close a server channel
without acknowledging the transaction on the channel.
By default, the flag is not set when calling the
rtr_close_channel API. However, the /IMMEDIATE qualifier
is implicitly present in the RTR CLI version of the API
(rtr call rtr_close_channel).
Because this is incompatible with the behavior of
previous versions of RTR, functionality has been
restored to the same as before V3.2. When using the
CLI version of the API (rtr call rtr_close_channel),
/NOIMMEDIATE is now the default.
o 14-1-774 TOOMANCHA and distributed transaction left open
after rtr_open_channel() failure
If rtr_open_channel failed after the RTR ACP had been
stopped, then that channel remained unavailable for a
subsequent open. The application could eventually run
out of channels and receive RTR_STS_TOOMANCHA.
Now if rtr_open_channel fails after a distributed
transaction has been opened, the distributed transaction
is always closed.
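The effect of a failed open that keeps holding a channel slot can be sketched with a small Python model. It is not RTR code: the ChannelTable class, its capacity, and the exception standing in for RTR_STS_TOOMANCHA are hypothetical and serve only to show how releasing the slot on failure avoids exhaustion.

```python
# Toy model of channel exhaustion; not RTR code.
class TooManyChannels(Exception):
    """Stands in for the RTR_STS_TOOMANCHA status."""


class ChannelTable:
    def __init__(self, capacity):
        self.capacity = capacity
        self.in_use = set()
        self._next_id = 0

    def open(self, connect, release_on_failure=True):
        if len(self.in_use) >= self.capacity:
            raise TooManyChannels()
        channel = self._next_id
        self._next_id += 1
        self.in_use.add(channel)  # reserve the slot before connecting
        try:
            connect()
        except ConnectionError:
            if release_on_failure:
                self.in_use.discard(channel)  # give the slot back
            raise
        return channel


def failing_connect():
    raise ConnectionError("ACP not running")


def opens_until_exhausted(table, release_on_failure, attempts=10):
    """Return the attempt at which the table is exhausted, or None."""
    for attempt in range(1, attempts + 1):
        try:
            table.open(failing_connect, release_on_failure)
        except ConnectionError:
            continue
        except TooManyChannels:
            return attempt
    return None
```

With a capacity of two, the leaking variant reports exhaustion on the third failed attempt, while the releasing variant never does.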
o 14-3-291 SHOW SERVER truncates shd_rec_icpl to shd_rec_ic
Some of the values previously truncated by the brief
SHOW SERVER command are now displayed more fully.
o 14-3-298 Application may crash if invoked before RTR
after a reboot
Normally the RTR executable must have been invoked at
least once since reboot before an RTR application can
be started. If an RTR application is invoked first, the
first RTR API call now always returns RTRNOTSTA (RTR not
started).
o 14-7-420 IOS tid on IP only nodes is not unique
Using previous versions of RTR, if you ran client
applications that used the RTR V2 API on systems that
had DECnet disabled, then there was a remote possibility
that the same transaction identifier could be generated
on two such systems if RTR was started on both systems
within milliseconds of each other.
This has now been fixed.
o 14-8-215 Faster loading of large journals on first
CREATE FACILITY
RTR now takes much less time to load journals containing
a large number of journaled transactions.
o 14-8-257 The broadcast message was not delivered from BE
to client
If a frontend lost the connection to its original
router, and was the first frontend to connect to the
router it failed over to, then the frontend could stop
receiving broadcasts. Further, backends could also fail
to receive broadcasts delivered by routers added to a
facility after the server applications had started.
These problems have been fixed.
o 14-8-262 RTR has both backends as primary for some
transactions (STR#1885690)
In a partitioned network situation (when each of two
routers has access to only half of the backend nodes),
RTR will choose the router with the lower network
address as the one that remains or becomes active. In
previous versions of RTR, this would sometimes result in
both sets of backends becoming active, due to a problem
with the network ID comparison algorithm.
This has been corrected.
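The selection rule described above only works if both halves of the partitioned network arrive at the same answer. The Python sketch below shows one way to make the choice deterministic. It is not RTR code: it assumes IPv4 addresses as the network identifier, and the function name is hypothetical; this note does not describe the identifier format RTR actually compares.

```python
# Illustrative sketch only; not RTR code.
import ipaddress


def active_router(router_addresses):
    """Return the router with the numerically lowest address."""
    # Comparing parsed integer values keeps the result independent of
    # how the address text is formatted or of the input order.
    return min(router_addresses, key=lambda a: int(ipaddress.ip_address(a)))
```

Because the result does not depend on the order in which the routers are listed, every node that sees the same pair of routers selects the same one.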
o 14-1-305 The RTR V2 command DCL_TX_PRC() treated unsigned
quadword keys as signed.
The RTR V2 interface now handles unsigned quadword keys
correctly.
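The difference between signed and unsigned interpretation of a quadword (64-bit) key can be seen in the following Python sketch. It is a generic illustration, not RTR code, and the sample key values are chosen only for the example.

```python
# Generic illustration of signed versus unsigned 64-bit ordering.
import struct


def as_signed64(value):
    """Reinterpret an unsigned 64-bit value as a signed 64-bit integer."""
    return struct.unpack("<q", struct.pack("<Q", value))[0]


keys = [1, 0x7FFFFFFFFFFFFFFF, 0x8000000000000000, 0xFFFFFFFFFFFFFFFF]

unsigned_order = sorted(keys)                 # the intended ordering
signed_order = sorted(keys, key=as_signed64)  # ordering if treated as signed
```

Any key with the top bit set becomes negative when treated as signed, so it sorts ahead of small keys and can fall into the wrong key range.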
o 14-1-472 Winsock2 error messages logged as uninterpreted
numbers
Winsock2 error messages were sometimes logged as
uninterpreted numbers, e.g., 10091.
RTR now includes a translation table for all known
Winsock2 messages so that the symbol and meaning can
be written in the log file.
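A translation table of this kind can be sketched in Python as follows. This is not the table shipped with RTR: the function name and the output format are assumptions, and only a handful of standard Winsock codes are included for illustration.

```python
# Illustrative sketch only; not the RTR translation table.
WINSOCK_ERRORS = {
    10054: ("WSAECONNRESET", "Connection reset by peer"),
    10060: ("WSAETIMEDOUT", "Connection timed out"),
    10061: ("WSAECONNREFUSED", "Connection refused"),
    10091: ("WSASYSNOTREADY", "Network subsystem is unavailable"),
}


def describe_winsock_error(code):
    """Return the symbol and meaning for a code, or the bare number."""
    entry = WINSOCK_ERRORS.get(code)
    if entry is None:
        return f"Winsock error {code}"  # unknown codes stay numeric
    symbol, meaning = entry
    return f"{symbol} ({code}): {meaning}"
```

For the code 10091 quoted above, the sketch yields the symbol WSASYSNOTREADY and its meaning instead of the bare number.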
o 14-3-301 Winsock errors logged on one line with
no garbage
Winsock errors are now logged like any other system call
errors, with no extra new lines or uninitialized garbage
characters.
Known Problems with Workarounds
o 14-3-303 Install procedure requires all RTR processes to be
terminated, including rtrd
All RTR processes and RTR applications must be terminated
before installing a new version of RTR. After using "rtr stop
rtr" and "rtr disc server", check for any surviving
processes, such as rtrd and applications programmed to handle
RTR_STS_NOACP, and terminate them until there
are none left. Note that all the RTR ACP and comserver
processes must be terminated before rtrd; otherwise they will
simply create a new rtrd.
On WIN32, rtrd can be terminated with the command "rtr disc
server/daemon", or by selecting the rtr.exe image and End
Process in the Windows NT Task Manager, or in the Windows
95/98 Ctrl-Alt-Del dialog.
INSTALLATION NOTES:
The Reliable Transaction Router Version 3.2 ECO4 installation
procedure is the same as the installation procedure for RTR
Version 3.2. Refer to the Installation Guide for further
information.
All trademarks are the property of their respective owners.
This patch can be found at any of these sites:
Colorado Site
Georgia Site
Files on this server are as follows:
rtri255.README
rtri255.CHKSUM
rtri255.CVRLET_TXT
rtri255.exe