TITLE: RTR V3.1D RTRAVME010D Reliable Trans. Router V3.1D Alpha ECO Summary
Copyright (c) Compaq Computer Corporation 1998, 1999. All rights reserved.
Modification Date: 02-MAR-99
Modification Type: Updated Kit Supersedes RTRAVME08D
PRODUCT: Reliable Transaction Router V3.1D for OpenVMS Alpha (RTR)
OP/SYS: OpenVMS Alpha
COMPONENTS: RTR.EXE
LIBRTR.EXE
SOURCE: Compaq Computer Corporation
ECO INFORMATION:
ECO Kit Name: RTRAVME010D
ECO Kits Superseded by This ECO Kit: RTRAVME08D
ECO Kit Approximate Size: 6149 Blocks
3148288 Bytes
Kit Applies To: RTR V3.1D for OpenVMS Alpha
OpenVMS Alpha V6.1*, V6.2*, V7.0, V7.1*
System/Cluster Reboot Necessary: No
Installation Rating: INSTALL_UNKNOWN
Kit Dependencies:
The following remedial kit(s) must be installed BEFORE
installation of this kit:
None
In order to receive all the corrections listed in this
kit, the following remedial kits should also be installed:
None
ECO KIT SUMMARY:
An ECO kit exists for Reliable Transaction Router on OpenVMS Alpha V6.1
through V7.1. This kit addresses the following problems:
Problems Addressed in the RTRAVME010D Kit:
This kit (ECO-10) contains the following corrections to RTR V3.1D (210)
ECO8 (ECO9 was not released on OpenVMS):
o 14-1-436,14-3-225 Additional problem with aborted transactions
This bug was previously addressed in ECO7, but a further side effect
of the original change has been discovered and fixed. The problem
had to do with a potential ACP crash if a journal flush operation
was attempted after an aborted transaction.
o 14-1-496 Calling RTR_SET_INFO would hang if ACP is not running
Calling rtr_set_info() or running the RTR SET TRAN command would
hang if the ACP was not running. This problem has been corrected.
o 14-1-497 Monitor performance degrades as cube of row count
MONITOR commands can now handle hundreds of rows efficiently.
o 14-3-215 Rows dropped in monitor display with large # of rows
MONITOR can now display more than 100 rows, subject to the /ROWS
qualifier in the monitor file.
RTR will now display more than 1000 rows, provided the values of
the /ROWS qualifiers in the relevant *.mon file are edited
accordingly.
This is most easily verified by redirecting the output to a file or
pipe. If the output goes to a terminal, you can use the SCROLL
commands, which are bound to various numeric keypad keys, to scroll
all except the last line of monitor output.
o 14-3-221 RTR crash during server reject plus network problems
Under unusual conditions (for example, one server rejecting the TX
and another accepting the same TX, or a TX being aborted due to
resource problems -- all at the same time that network fluctuations
are occurring) it was possible that RTR would find an inconsistent
TX state while recovering a TX from JNL. This resulted in RTR
crashing and has now been fixed.
o 14-3-224 Crash in 1-node configuration due to length mismatch
RTR was aborting if it detected a length mismatch in the message
passed to it. This has now been fixed.
o 14-3-226,14-3-162 ACP crash after facility deletion
After a facility is deleted, it is possible for the RTRACP to
receive a message from an application that references the deleted
facility. The verification that the facility had been deleted
failed on rare occasions causing the RTRACP to abort. This has now
been fixed.
o 14-3-229 Logging of journal record deletion errors
If there is an error deleting records from the RTR journal, then an
error is logged. Previously, RTR would silently continue.
o 14-3-239 Virtual address space full error
RTR tries to extend the virtual address space of the ACP if there is
not enough space to allocate data structures when a client or server
application is started. If the ACP failed to do this, it would
crash. This has now been fixed. Any such failure will simply
prevent the new application from starting, rather than crashing the
ACP.
o 14-3-241 Application crash trying to send large messages to looping ACP
Several changes were made to combat this combination of application
crash and rapidly expanding ACP heap:
Flow control is now granted only to the channel and facility
that requested it. A problem was discovered and corrected
whereby a grant of flow control credit could allow unrelated
channels to send too. This is believed to be the prime cause
of the symptoms reported.
An application that is unable to send to the ACP due to
resource shortage, for example if the ACP is alive but no
longer receiving for whatever reason, now keeps trying
indefinitely, and will now appear to hang rather than crash.
The TCP_NODELAY option, which disables the Nagle algorithm, is no
longer enabled on any RTR platform. This will improve
throughput under load, although there may be a slight impact on
response time under certain conditions.
o 14-3-260 Superfluous network traffic for nonexistent channels
Whenever a channel opens or closes, RTR sends an update message to
the router so that it can modify its broadcast routing information,
if necessary. In previous versions of RTR such messages were sent
even if no channel existed for the facility. In cases where
machines with the Frontend role had a large number of facilities
defined, this could result in significant network traffic that would
be quite noticeable over slow links, such as asynchronous
connections over telephone wire. RTR no longer sends these messages
unless a channel exists on the facility.
o 14-3-262,14-3-214 Hash table algorithm bug
The algorithm for accessing certain RTR data stored using a hash
table was found to be inefficient and could sometimes fail to find
data elements correctly. This bug has primarily affected access to
Transaction IDs and may have caused excessive CPU usage during data
retrieval or failure to find certain data elements. This bug has
been corrected.
o 14-3-266 Broadcast message corruption
Reception of an illegal or unrecognizable broadcast now results in a
log file entry (BMHDRVSN) rather than the demise of the ACP process.
If such entries persist you may wish to consider checking the
network for correct operation.
o 14-8-152 Multiple broadcast or data received on wrong channel
On Windows 95/NT systems with PATHWORKS installed, RTR would not
detect that the client had closed its channel when the client
application was aborted by closing down the window. RTR now detects
when the client has aborted the channel and closes the channel.
o 14-1-389 RTR diagnostic output improved
The exit handler for RTR processes on VMS has been improved to
provide more and better diagnostic output. When reporting problems
with RTR, please submit any RTR.DMP and RTR_ERROR.LOG files that are
created.
o 14-8-162, 14-8-147, 14-8-164 Servers hanging during failover recovery
Transaction recovery as a result of server failover could result in
server applications getting hung in 'local recovery' state if it
also happened that more than 10 client channels had simultaneously
caused new transactions to be presented to the backend node. This
has been fixed both by increasing the limit to 50 and by adding a
check to make sure that recovery is complete before enforcing the
limit, which is designed to keep a backend node from getting
overwhelmed when transactions are coming in at a rate faster than it
can handle.
o 14-8-163 Corrupt network message caused RTR crash
Reception of a corrupt network message would hitherto result in a
failed assertion and the demise of the RTR ACP process. The
behavior has been changed to yield a log file entry (BADNETMSG),
followed by a reset of the link concerned. If such log file entries
persist for a particular pair of nodes, it may mean that a network
problem exists, and you should consider checking the network
hardware for correct operation.
The RTR log entry has also been improved so that it identifies more
clearly the link on which an error is reported.
o 14-8-167 Null bytes display in SHOW PARTITION output
The display of null bytes in the upper and lower key bounds has been
suppressed if the bytes appear at the end of a key of type string.
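The TCP_NODELAY change described under 14-3-241 above can be
illustrated with the standard sockets interface. The following is a
minimal Python sketch, not RTR code: it only shows that setting
TCP_NODELAY on a TCP socket disables the Nagle algorithm and that
clearing it (the behavior this kit now uses) leaves Nagle active.
The helper names nagle_enabled and set_nodelay are hypothetical.

```python
import socket


def nagle_enabled(sock: socket.socket) -> bool:
    """Return True when the Nagle algorithm is active (TCP_NODELAY is off)."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) == 0


def set_nodelay(sock: socket.socket, on: bool) -> None:
    """Set or clear TCP_NODELAY; setting it disables the Nagle algorithm."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1 if on else 0)


if __name__ == "__main__":
    # No connection is needed to inspect or change the option.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        set_nodelay(s, True)    # small writes are sent immediately
        print("Nagle enabled:", nagle_enabled(s))
        set_nodelay(s, False)   # small writes may be coalesced
        print("Nagle enabled:", nagle_enabled(s))
```

With the option cleared, the TCP stack may coalesce small writes into
fewer segments, which is consistent with the throughput and response
time trade-off noted in the entry.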
The following restrictions apply to this kit:
o 14-1-285: A temporary inconsistency in shadow server state can
occur during initial facility startup of a shadowed configuration.
A shadow server can erroneously remain in state "sec_act" until the
rest of the facility has been started.
o 14-3-67: An application's wakeup routine may be called more often
than necessary.
o 14-7-785: RTR applications are not thread safe.
The current implementation of RTR for OpenVMS is not thread-safe and
applications may not call RTR V3 or V2 API routines from more than
one thread. A threaded RTR shared image for OpenVMS Version 7 has
been developed and may be made available on request.
If you are using or considering using a threaded application, please
refer to the documentation for DECthreads, and note the extra
conditions concerning AST re-entrancy in an OpenVMS threaded
application.
o 14-1-544: This version of RTR does not support a mixture of VAX and
Alpha nodes in the same cluster if both are configured as Backends.
This compatibility issue will be addressed in RTR V3.2.
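One common way to live with the single-thread restriction in 14-7-785
is to funnel every API call through one dedicated thread. The
following Python sketch illustrates that pattern only. It is not RTR
code, the class SingleThreadGateway is hypothetical, and it does not
address the AST re-entrancy conditions mentioned above; a real
OpenVMS application would apply the same idea around its own RTR
calls.

```python
import queue
import threading


class SingleThreadGateway:
    """Run every submitted call on one dedicated worker thread."""

    def __init__(self):
        self._requests = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            item = self._requests.get()
            if item is None:            # shutdown sentinel
                break
            func, args, reply = item
            try:
                reply.put((True, func(*args)))
            except Exception as exc:    # hand the error back to the caller
                reply.put((False, exc))

    def call(self, func, *args):
        """Block until func(*args) has run on the worker thread."""
        reply = queue.Queue(maxsize=1)
        self._requests.put((func, args, reply))
        ok, value = reply.get()
        if not ok:
            raise value
        return value

    def close(self):
        """Stop the worker after all queued calls have been served."""
        self._requests.put(None)
        self._worker.join()
```

Any thread may invoke gateway.call(some_function, arguments), but
some_function itself only ever executes on the gateway's worker
thread, so the wrapped routines are never entered from two threads.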
Problems Addressed in the RTRAVME08D Kit:
o 14-8-144 RTR crash when ASYNC cable disconnected
Disconnecting a cable that was being used by an asynchronous
DECnet link to a remote machine could result in an ACP failure
when the transport marked the sockets as invalid. RTR has been
changed to handle this error by temporarily suspending all network
activity on the affected node. Network activity will resume as soon
as the network is found to be usable again.
o 14-8-154 Router crash when link to Frontend disconnected
Router ACPs configured to accept anonymous clients could, under
some circumstances, fail when handling a network link loss event.
This has been corrected.
o 14-8-155 New environment variables for adjusting connection timeout
parameter
Two new environment variables have been created to give operators
greater discretion in determining how long to wait before retrying
a network connection attempt.
The RTR_TIMEOUT_CONNECT variable controls how long a connecting
node will wait for a response from the connectee to its link initiation
request. This value defaults to 60 seconds.
If the RTR_TIMEOUT_CONNECT period expires without a response from
the connectee, RTR will wait an additional period determined by
the RTR_TIMEOUT_CONNECT_RELAX variable. This variable defaults
to a value of 90 seconds. The purpose of the "relax" period is
to allow the connector to accept a connection request from the
connectee node, if any are forthcoming. It is important not to
set this value too low on Backends and Routers, as such machines
are likely to be receiving connection requests from many other
machines. On machines configured to use only the Frontend role,
however, you can safely set RTR_TIMEOUT_CONNECT_RELAX to just a
few seconds so that the node can be free to attempt to connect to
another router as quickly as possible.
The minimum value for RTR_TIMEOUT_CONNECT is 5 and the minimum
for RTR_TIMEOUT_CONNECT_RELAX is 1.
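The interaction of the two timeout variables described under 14-8-155
can be sketched as follows. This Python fragment is illustrative
only. The defaults (60 and 90 seconds) and minimums (5 and 1) come
from the text above; the fallback to the default for unset or
non-numeric values and the raising of too-small values to the minimum
are assumptions, since this note does not say how RTR treats such
values. The function names are hypothetical.

```python
import os

# Defaults and minimums as stated in the release note (seconds).
DEFAULTS = {"RTR_TIMEOUT_CONNECT": 60, "RTR_TIMEOUT_CONNECT_RELAX": 90}
MINIMUMS = {"RTR_TIMEOUT_CONNECT": 5, "RTR_TIMEOUT_CONNECT_RELAX": 1}


def effective_timeout(name, env=None):
    """Resolve a timeout in seconds, applying the default and minimum.

    Assumption: unset or non-numeric values fall back to the default,
    and values below the minimum are raised to the minimum.
    """
    env = os.environ if env is None else env
    raw = env.get(name)
    try:
        value = int(raw) if raw is not None else DEFAULTS[name]
    except ValueError:
        value = DEFAULTS[name]
    return max(value, MINIMUMS[name])


def wait_before_next_attempt(env=None):
    """Connect wait plus relax wait when the connectee never responds."""
    return (effective_timeout("RTR_TIMEOUT_CONNECT", env)
            + effective_timeout("RTR_TIMEOUT_CONNECT_RELAX", env))


if __name__ == "__main__":
    print(wait_before_next_attempt({}))
    print(wait_before_next_attempt({"RTR_TIMEOUT_CONNECT_RELAX": "3"}))
```

Under these assumptions, an unresponsive connectee costs 150 seconds
with the default settings, and a Frontend-only node that sets
RTR_TIMEOUT_CONNECT_RELAX to 3 reduces that to 63 seconds.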
The following restrictions apply to this kit:
o 14-1-285: A temporary inconsistency in shadow server state can
occur during initial facility startup of a shadowed configuration.
A shadow server can erroneously remain in state "sec_act" until the
rest of the facility has been started.
o 14-3-67: An application's wakeup routine may be called more often
than necessary.
INSTALLATION NOTES:
The Reliable Transaction Router installation procedure uses the
POLYCENTER Software Installation Utility (PCSI). For details on using
PCSI, refer to the OpenVMS System Manager's Manual, Section "Installing
with the POLYCENTER Software Installation Utility".
The logical name PCSI$SOURCE is used to define the location of the
software kits you want to install. For example, if the Reliable
Transaction Router software is located in DISK1:[KITS], enter the
following at the DCL prompt (or include the line in the system manager's
login command file):
$ DEFINE PCSI$SOURCE DISK1:[KITS]
When running the installation procedure for Reliable Transaction Router,
you can choose whether to install the ODBC Over RTR Oracle7 Server. This
is an RTR server used for supporting ODBC-enabled applications on
Windows. You should not install the ODBC Over RTR Oracle7 Server unless
you already have Oracle7 installed.
To start the installation, type the command:-
$ PRODUCT INSTALL RTR
You will see a display similar to the following:-
The following product has been selected:
DEC AlphaVMS RTR V3.1-D225 [Available]
Do you want to continue? [YES]
Press Return. You may safely accept the installation default options.
This patch can be found at any of these sites:
Colorado Site
Georgia Site
Files on this server are as follows:
dec-axpvms-rtr-v0301-d225-1.README
dec-axpvms-rtr-v0301-d225-1.CHKSUM
dec-axpvms-rtr-v0301-d225-1.CVRLET_TXT
dec-axpvms-rtr-v0301-d225-1.exe