Reliable Transaction Router
Release Notes




         rtr> set facility FACILITY_NAME/REPLY_CHECKSUM 

To turn off:


         rtr> set facility FACILITY_NAME/NOREPLY_CHECKSUM 

To view the flag:


         rtr> show facility FACILITY_NAME/CONFIG 
              show facility FACILITY_NAME/FULL 

Note the Reply Checksum: label in the following example. A value of yes indicates that the response matching feature is enabled.


 
RTR> set  facility miked/REPLY_CHECKSUM 
 
RTR> show facility miked/config 
 
Facilities: 
 
Facility:                      miked 
 
Configuration:- 
 
Frontend:            yes   Router:              yes   Backend:             yes 
Reply Checksum:      yes   Router call-out:      no   Backend call-out:     no 
                           Load balance:         no   Quorum-check off:     no 

  • 14-3-161 MONITOR CALLS/ID=n where n is not a valid id - monitors all ids
    Using the monitor command with any of the qualifiers /link, /process, /facility or /partition generated an empty display if the requested entity did not exist. This differed from V2 behavior and was considered by some to be misleading. V2 behavior has been restored.
  • 14-3-164 RTR_F_REP_INDEPENDENT flag needs to be specified even when superfluous
    If the server channel has been opened with RTR_F_OPE_EXPLICIT_ACCEPT, then the RTR_F_REP_INDEPENDENT flag can only be used together with RTR_F_REP_ACCEPT.
    If the server channel has been opened with implicit accept, then the use of RTR_F_REP_INDEPENDENT implies the use of RTR_F_REP_ACCEPT.
  • 14-3-170 Terminal output is no longer unbuffered
    On some platforms the output stream was unbuffered by default when bound to a terminal device (the normal case), and incurred a large number of buffered I/O operations. This was noticeably inefficient when using the RTR command interface over certain kinds of packet-based network links. This problem has been resolved.
  • 14-3-203 ACP crash when other nodes shutdown
    Configurations where more than 100 frontends were connected to any particular router could experience an ACP failure while managing quorum loss. This has been corrected.
    Automatic router failback has been restored for RTR V2 frontends connecting to RTR V3 routers.
  • 14-3-205 Inconsistent TR TX timeout if no link to FE
    Using previous versions of RTR, if a router lost a connection to a frontend that had a transaction active in enqueuing state, then the router would abort the transaction after a period of about one minute if the frontend link was not reestablished. This happened even if the client had specified a transaction timeout much less than this when starting the transaction.
    This is now fixed: if the router loses its connection to the frontend, a transaction in enqueuing state on the router is aborted after the interval specified by the client, provided that interval is less than one minute.
  • 14-3-210 START RTR qualifiers from V2
    Attempts to use obsolete V2 qualifiers to the START RTR command cause a warning to be issued. Qualifiers affected are partitions, cache_pages, and relations. Warnings are also generated if an OpenVMS qualifier is used on a non-OpenVMS platform.
  • 14-3-213 Facility names can be up to 30 characters
    Although the FACNAMLON message states the facility name can only have 30 characters, prior versions of RTR V3 would allow the system manager to create facilities with names as long as 31 characters. It was however not possible to open channels to such facilities.
    The documented maximum length of a facility name string is 30 characters. This limit is also now enforced by the 'create facility' command.
    The application interface symbol RTR_MAX_FACNAM_LEN in <rtr.h> still indicates, at compile time, a maximum length of 31 characters (not including the string terminator), even though the system management interface in this release does not create such a facility. Applications should normally use rtr_open_channel() to check whether a facility name has a valid length and valid characters, and whether the facility has been created. This ensures that applications do not need to be recompiled should 31-character facility names be supported at runtime in a future release.
  • 14-3-215 Rows dropped in monitor display with large # of rows
    RTR will now display more than 1000 rows, provided the values for the /ROWS qualifiers in the relevant *.mon file are edited.
    This is most easily verified by redirecting the output to a file or pipe. If the output goes to a terminal, then you can use the SCROLL commands which are bound to various numeric keypad keys to scroll all except the last line of monitor output.
  • 14-3-219 Unthreaded applications received repeated wakeups before next RTR API call
    RTR now suppresses additional signal-based wakeups after the first until the next RTR API call. Previously, the repeated wakeups caused a problem for an application that issued a write() to a pipe in the wakeup handler without first checking a flag to see whether a read() had occurred since the last wakeup. The pipe could fill while the application was too busy to select on it and read it until empty, at which point the next write() would block and hang the application in the signal-based wakeup handler. This has been corrected.
  • 14-3-224 ACP crash in 1 node config
    RTR aborted if it detected a length mismatch in a message passed to it. RTR has now been changed so that if this condition is detected, diagnostic information is written to the operator log and the link is disconnected. RTR will not abort.
  • 14-3-226 ACP crash, ncf_validate_fdbptr
    After a facility is deleted, the RTR ACP can receive a message from an application that references the deleted facility. In rare cases, verification that the facility had been deleted failed, causing the RTR ACP to abort. This has been corrected.
  • 14-3-229 tx replay after reboot
    If there is an error deleting records from the RTR journal, an error is logged. Previously, RTR would continue without logging the error.
  • 14-3-239 Virtual address space full
    RTR tries to extend the virtual address space of the ACP if there is insufficient space to allocate data structures when a client or server application is started. If the ACP failed to do this, it would crash. This has now been corrected. Any such failure will simply prevent the new application from starting rather than crashing the ACP.
  • 14-3-241 Application crash trying to send large messages to unresponsive ACP
    An application that is unable to send to the ACP due to resource shortage, for example if the ACP is alive but no longer receiving for whatever reason, now keeps trying indefinitely, and will now appear to hang rather than crash.
  • 14-3-250 Flow control has negative credit
    Applications with multiple channels engaged on more than one facility could experience flow control difficulties causing indefinite delays in transaction completion. This has been corrected.
  • 14-3-258 Stop inquorate standby from going active
    When there is a network segmentation in an active/standby configuration, the segment in the minority would become active. This behavior resulted in two active servers for the same partition. RTR now puts the inquorate or minority server in wt_quorum state and the majority server in active state.
  • 14-3-261 Assertion in knl_net_compare_ids() from ncf_accept_ast2()
    Corruption of network messages passed during the link connect phase could cause failure of the receiving RTR ACP process. This has been corrected.
  • 14-3-265 Successive ACP crashes
    Reception of a corrupt network message could cause a failed assertion and demise of the RTR ACP process. The behavior has been changed to yield a log file entry (BADNETMSG), followed by a reset of the link concerned. If such log file entries persist for a particular pair of nodes, it may mean that a network problem exists, and you should consider checking the network hardware for correct operation.
    The RTR KNL subsystem log entry has also been improved to better identify the link on which it reports errors.
  • 14-3-266 ACP crash apparently caused by word shear of a packet
    Reception of an illegal or unrecognizable broadcast now causes a log file entry (BMHDRVSN) rather than demise of the ACP process. If such entries persist you may wish to consider checking the network for correct operation.
  • 14-3-282 Dual-ported TCP router not establishing facility links
    Problems can arise if nodes in your configuration have multiple network adapters and the IP name server is not configured to return all the configured IP addresses for such nodes. This causes such nodes to reply to connection requests with an ID that is different from that determined by the initiator of the connection. This can cause refused connections, or only the first connecting facility gaining a current router.
    This version of RTR has been changed to operate correctly in this partially configured environment.
    It is also now possible to provide RTR with full configuration information about hosts with multiple adapters through alias entries in the hosts database. Provide alias entries corresponding to the alternate interface addresses, and refer to these aliases when defining the primary entry. If using a hosts file on UNIX, the alias entries should be defined prior to any references to them.
  • 14-3-289 TR failback imperfect
    The implementation of frontend router failback has been improved. Frontend nodes are now more likely to maintain connections with their preferred routers. On systems with multiple similarly configured facilities, this reduces the number of network links required and consequently lowers resource consumption.
  • 14-5-69 Journal record version control
    V3.2 of RTR implements some changes in the format of records in the journal. Rolling upgrades from earlier versions of RTR V3 are handled automatically, except for clustered nodes operating standby servers; these configurations must be upgraded simultaneously.
    Records written by RTR V3.2 cannot be used by earlier versions of RTR, so when installing an older version of RTR over RTR V3.2 you must create a new journal.
  • 14-5-82 Update SHOW NODE to display inactivity timer
    The format of the SHOW NODE command has been changed to include a display of the current inactivity timer setting for the node.
  • 14-5-129 Improvements to netstat and connects monitor pictures
    The Monitor Connects and Monitor Netstat pictures now include information in their summary sections indicating if all required links are connected or not.
  • 14-7-751 Error handling for RTR_PREF_PROT violations
    Additional checks at facility creation time now detect and report the absence of any specified transports, and of required or optional name-to-address lookups on specified or available transports. Errors or warnings are issued to the terminal session, and also recorded in the RTR log file.
  • 14-8-131 Failure to come up in remember mode
    When a node in remember mode fails during recovery, it will return to remember mode. Previously a node in remember mode would undergo local shadow recovery, then shadow recovery failure, when it could not access the journal of the secondary node. RTR now knows that it was in remember mode during its recovery process and if the secondary is not available, it will return to remember mode.
  • 14-8-144 RTR crash when ASYNC cable disconnected
    Disconnecting a cable that was being used by an asynchronous DECnet link to a remote machine could cause an ACP failure when the transport marked the sockets as invalid. RTR has been changed to handle this error by temporarily suspending all network activity on the affected node. Network activity will resume as soon as the network is found to be usable again.
  • 14-8-162 RTRACP (V3.1C) backend crash
    Transaction recovery as a result of server failover could leave server applications hung in 'local recovery' state if more than 10 client channels had simultaneously caused new transactions to be presented to the backend node. This has been fixed both by increasing the limit to 50 and by checking that recovery is complete before enforcing the limit. The limit is designed to keep a backend node from being overwhelmed when transactions arrive faster than it can handle them.
  • 14-8-181 RTR ACP Crashes
    RTR on a frontend could select a router as its current router immediately after that router had been trimmed from the facility. This could potentially leave the frontend in a 'connecting' state. This has been corrected.
  • 14-8-199 Large monitor screen last line overwritten
    In previous versions, RTR generated lines which were off the bottom of the physical screen. Most screens moved as far as they could to the last line when told to move off the bottom, so that you saw the last line and all subsequent lines superimposed on the bottom line. Usually the last line in the monitor file covered the others completely because none of the other lines were longer. This problem has been corrected.
  • 14-8-204 Inquorate router could cause backend ACP failure
    In previous versions of RTR, an inquorate router could send messages to backends informing them to change state at the same time that one or more quorate routers were asking them to change to or remain in a different state. In certain situations when network links are unreliable, this could sometimes cause a buildup of messages in the RTR ACP process on the backend node that caused it to fail due to lack of memory.
    This problem has been corrected by not allowing an inquorate router to send state update messages to backends.
  • 14-8-207 Concurrent timer cancellation could cause data corruption
    Internal RTR data could get corrupted when timers for the same events were scheduled concurrently (within the same 1 second time slice). This should have occurred only under unusual circumstances or when RTR is consistently denied access to resources due to privilege constraints. RTR has been corrected to avoid this occurrence even under extreme circumstances.
  • 14-8-214 Potential distributed loop
    It was possible for a distributed loop to occur between a backend and two or more routers. The problem occurred when one router suggested the backend go into active mode, but another suggested standby. When the backend accepted the standby suggestion, it entered standby mode and broadcast its decision to the routers. This occurred regardless of whether the backend was already in standby mode (and therefore had already broadcast its status to the routers) or not. The routers would then respond to the backend with their suggestions.
    RTR now does not broadcast when a standby-to-standby transition occurs, as the routers will have already been informed of the backend's status.
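
For applications written in the style described under 14-3-219 above, the flag check that the note mentions can be sketched as follows. This is a minimal POSIX C illustration and not RTR code: wake_init(), on_wakeup(), wake_drain() and the wake_pending flag are hypothetical names, and the handler is assumed to be installed for whatever signal the application uses for its wakeups. The handler writes one byte to a non-blocking pipe only when the previous byte has been read, so repeated wakeups can neither fill the pipe nor block inside the handler.

```c
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/* All names here are hypothetical; this is not part of the RTR API. */
static int wake_pipe[2];                        /* [0] = read end, [1] = write end */
static volatile sig_atomic_t wake_pending = 0;  /* 1 while a wakeup byte is unread */

/* Create the pipe and make both ends non-blocking. Returns 0 on success. */
int wake_init(void)
{
    if (pipe(wake_pipe) != 0)
        return -1;
    for (int i = 0; i < 2; i++) {
        int fl = fcntl(wake_pipe[i], F_GETFL);
        if (fl == -1 || fcntl(wake_pipe[i], F_SETFL, fl | O_NONBLOCK) == -1)
            return -1;
    }
    return 0;
}

/* Wakeup handler: write at most one byte per unread wakeup. */
void on_wakeup(int signo)
{
    (void)signo;
    if (wake_pending)
        return;                 /* the previous wakeup has not been read yet */
    wake_pending = 1;
    char b = 1;
    ssize_t r = write(wake_pipe[1], &b, 1);   /* non-blocking, so it cannot hang */
    (void)r;
}

/*
 * Main-loop side: read the pipe until empty, then clear the flag.
 * The caller should check for pending work AFTER this returns, so that a
 * wakeup suppressed while the flag was still set is not missed.
 * Returns the number of bytes drained.
 */
int wake_drain(void)
{
    char buf[64];
    int total = 0;
    ssize_t n;
    while ((n = read(wake_pipe[0], buf, sizeof buf)) > 0)
        total += (int)n;
    wake_pending = 0;
    return total;
}
```

A typical main loop would select() on the read end of the pipe, call wake_drain() when it becomes readable, and then process whatever work is outstanding.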

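The reply flag rule stated under 14-3-164 above can be read as a small decision function. The sketch below is an illustration only: the DEMO_ constants are placeholder bit values, not the definitions from <rtr.h>, and effective_reply_flags() is a hypothetical helper rather than an RTR routine. It assumes that an INDEPENDENT reply without ACCEPT on an explicit-accept channel is treated as an invalid combination.

```c
#include <stddef.h>

/* Placeholder bit values for illustration; the real flags live in <rtr.h>. */
enum {
    DEMO_REP_ACCEPT      = 1 << 0,
    DEMO_REP_INDEPENDENT = 1 << 1
};

/*
 * Compute the reply flags that take effect on a server channel.
 *   explicit_accept: nonzero if the channel was opened with explicit accept.
 *   flags:           reply flags requested by the caller.
 * Stores the effective flags in *out and returns 0, or returns -1 if the
 * combination is not allowed under the rule in 14-3-164.
 */
int effective_reply_flags(int explicit_accept, unsigned flags, unsigned *out)
{
    if (out == NULL)
        return -1;
    if (flags & DEMO_REP_INDEPENDENT) {
        if (explicit_accept) {
            /* Explicit accept: INDEPENDENT is only valid together with ACCEPT. */
            if (!(flags & DEMO_REP_ACCEPT))
                return -1;
        } else {
            /* Implicit accept: INDEPENDENT implies ACCEPT. */
            flags |= DEMO_REP_ACCEPT;
        }
    }
    *out = flags;
    return 0;
}
```

For example, on an implicit-accept channel a request for DEMO_REP_INDEPENDENT alone yields both bits set, while the same request on an explicit-accept channel is rejected.
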
    1.3 Known Problems with Workarounds

    1.4 Restrictions

