Reliable Transaction Router
System Manager's Manual



5.2.16 Monitor Link


    LINK COUNTERS  8-MAR-1999 14:24:44, NODE: -ALL- , LINK: -ALL-  -> -ALL- 
 
NCF_PROTOCOL             0        NIO_WRITING              0 
NCF_TIMEOUT              0        NIO_READING              0 
NCF_LINKEXIT             0        NIO_WRERROR              0 
NCF_DISCONNECT           0        NIO_RDERROR              0 
NCF_THIRDPARTY           0        NIO_RDINCMP              0 
NCF_PATHLOST             0        NIO_SEQERR               0 
NCF_RESPONSES_SENT       3        NIO_BUFOVF               0 
NCF_QUERIES_RCVD        11        NIO_READS_ACTIVE         3 
NCF_RESPONSES_RCVD       2        NIO_WRITES_ACTIVE        0 
NCF_QUERIES_SENT        10        NIO_BYTES_RCVD    44206961 
NCF_IPT_MSGS_RCVD       16        NIO_BYTES_SENT    44206996 
NCF_IPT_MSGS_SENT       16        NIO_PCKTS_RCVD      412114 
NCF_LINK_GAIN            4        NIO_PCKTS_SENT      412114 
NCF_LINK_LOSS            0        NIO_MSGS_RCVD       189284 
NCF_ABORTED              0        NIO_TMO_SENDS       161307 
NCF_REJECTEE             1        QRM_REQ_LINK_QUEUE       0 
NCF_ACCEPTED             1        QRM_RSP_LINK_QUEUE       0 
NCF_CONFIRMED            1 
NCF_INITIATED            2 

Displays a number of per-link counters. Use the /LINK=link-name qualifier to display the values for one specific link; otherwise, the total values for all links are displayed.
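
The relationship between the per-link values and the -ALL- totals can be sketched as follows. This is an illustration only: the `total_counters` helper and the sample link names and values are hypothetical, and the sketch assumes that each total is a simple sum of the per-link counters.

```python
from collections import Counter

def total_counters(per_link):
    """Sum counter values across links (assumed to mirror the -ALL- display)."""
    total = Counter()
    for counters in per_link.values():
        total.update(counters)  # adds each counter value to the running total
    return dict(total)

# Hypothetical per-link samples; names and values are illustrative only.
per_link = {
    "nodea->nodeb": {"NCF_LINK_GAIN": 1, "NIO_PCKTS_SENT": 100},
    "nodea->nodec": {"NCF_LINK_GAIN": 3, "NIO_PCKTS_SENT": 250},
}
totals = total_counters(per_link)
print(totals["NCF_LINK_GAIN"], totals["NIO_PCKTS_SENT"])
```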

5.2.17 Monitor Netbytes


LINK TRAFFIC IN BYTES Fri Apr 16 1999 17:41:12, NODE: nodea.zko.dec.com 
 
                             Bytes  Rcvd                  Bytes Sent 
                      --------------------------    -------------------------- 
                           Abs      Rate     Max          Abs     Rate     Max 
Total                 59201466    6776.0      -    1002579480 113857.0      - 
 
nodea->nodea            3072248     336.0   336.0      3072248    336.0    336.0 
nodea->nodeb           42569496    4974.0  4974.0    717457678  84438.0  84438.0 
nodea->nodec           13559722    1466.0  1466.0    282049554  29083.0  29083.0 
 

Displays a list of the links to other nodes. For each link, the display shows the total number of bytes received and sent on that link and the number of bytes received and sent per second. The values are derived from the NIO_BYTES_RCVD and NIO_BYTES_SENT counters. The Max field represents the maximum rate since the link started.
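
As a rough sketch of how the Abs, Rate, and Max columns relate to a byte counter, the following assumes that the counter is sampled at known times and that the rate is the counter difference divided by the elapsed seconds. The `link_rates` function and the sample values are hypothetical; the way RTR itself samples the counters is not described here.

```python
def link_rates(samples):
    """Return (abs, rate, max_rate) from time-ordered (seconds, byte_counter) samples.

    rate is bytes per second over the most recent interval; max_rate is the
    highest per-interval rate seen across all samples.
    """
    rate = 0.0
    max_rate = 0.0
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        rate = (b1 - b0) / (t1 - t0)
        max_rate = max(max_rate, rate)
    return samples[-1][1], rate, max_rate

# Hypothetical samples of a counter such as NIO_BYTES_RCVD, taken 10 seconds apart.
samples = [(0, 0), (10, 5000), (20, 8360)]
abs_bytes, rate, max_rate = link_rates(samples)
print(abs_bytes, rate, max_rate)
```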

5.2.18 Monitor Netstat


              C o n n e c t i o n   S t a t u s   D e t a i l 
 
              Node: NODEA                    Mon March 15 1999 09:50:28 
 
                Ini  Cnf  Acc  Abo  Rej Loss Gain Ctmo Rstr State  Type FailCode 
Node    Link     12    0    2   12   12    1    3    0    0 
 
NODEA ->nodeb     0    0    0    0    0    0    1    0    0    up alpha 
NODEA ->nodec     6    0    0    6    6    0    0    0    0  down     ? 76490676 
NODEA ->noded     6    0    0    6    6    0    0    0    0  down     ? 76490676 
NODEA ->nodee     0    0    2    0    0    1    2    0    0    up alpha 

Displays the link status for connected links in detail and the fail code for any links on which a connection has failed. Unconnected links where the connection has been lost are highlighted. Link aborts, rejects, loss, gain, restarts, state, and architecture of the remote node are also displayed. This display includes more detail than the monitor connects display.

5.2.19 Monitor Partit


PARTITION DEFINITIONS  Tue Apr  6 1999 10:40:31, NODE: NODEA 
 
Partition-name                              #    #      bounds      callout 
                                State     svrs segs   lo      hi    type 
RTR$DEFAULT_PARTITION_16777218   active      1    1  "A"   "A"     - 
RTR$DEFAULT_PARTITION_16777221   active      1    1  "B"   "B"     - 
RTR$DEFAULT_PARTITION_16777217   active      1    1  0     429496 

Partitions are shown in the form node$facility_partition-id, or by partition name if a partition name has been specified using the SET PARTITION command. The number of servers and key segments are shown for each partition. The least significant byte of the partition's low and high bounds is also shown, along with the callout type (if any). The partition state meanings are given in Table 5-3.

Table 5-3 Monitor Partition States
State          Meaning
wt_tr_ok       Server is waiting for routers to accept it
wt_quorum      Server is waiting for backend to be quorate
lcl_rec        Local recovery
lcl_rec_fail   Primary server waiting for access to a restart journal
lcl_rec_icpl   Getting next journal to recover from
lcl_rec_cpl    Processed all journals for local recovery
shd_rec        Shadow recovery
shd_rec_fail   Shadow server waiting for access to a restart journal
shd_rec_icpl   Shadow getting next journal to recover from
shd_rec_cpl    Processed all journals for shadow recovery
catchup        Secondary is catching up with primary
standby        Server is declared as standby
active         Server is active
pri_act        Server is active as primary shadow
sec_act        Server is active as secondary shadow
remember       Primary is running without shadow secondary

5.2.20 Monitor Queues


         TRANSACTION QUEUES BY PARTITION 15-JAN-1999 12:42:53, NODE: NODEA 
 
Partition-name                                 Processed          Queued     # 
                                          Txns    Msgs  Rplys    Txn  Msg   Svrs 
NODEA$NODEB$16842753                      5792    5794      0      2    6    3 
 

Shows transaction queues on a per-partition basis. Uses counters from the Transaction Manager (TM) and the Requester/Server configurator (RSC).

5.2.21 Monitor Quorum


QUORUM STATUS BY NODE AND FACILITY Tue Apr  6 1999 10:50:24, NODE: NODEA 
(node/role counts can be inaccurate for incorrectly configured facilities) 
States: bad configuration,not connected,minority,uncertain,quorate     node/roles 
Node     Facility                       State                          CNF RCH QRT 
NODEA    RTR$DEFAULT_FACILITY           TR:quorate,BE:quorate            2   2   2 
NODEA    shadow                         TR:quorate,BE:minority(ffranc)   4   2   1 

Quorum states for router (TR) and backend (BE) nodes and roles are shown in the State column.

The numbers of nodes seen as configured (CNF), reachable (RCH), and quorate (QRT) are shown for each node in the node/roles columns.

5.2.22 Monitor Recovery


RECOVERY INFORMATION at Tue Apr  6 1999 10:54:50 on NODEA 
 
                           Last        Restart-Recovery     Shadow-Recovery 
              Server       Recovery    Journal  Txns        Journal  Txns 
Partition-id  State        Backend     Scans    Recovered   Scans    Recovered 
------------  ------       -------     -------  ---------   -------  ---------  
16777218      active       NODEA             1          0         0          0 
16777221      active       NODEA             1          0         0          0 
16777217      active       NODEA             1          0         0          0 

Shows the progress of transaction recovery. Last recovery backend is the last backend accessed to recover transactions. If the server state is lcl_rec_fail or shd_rec_fail, this entry is the name of the backend that could not be accessed. Journal scans is the number of journal files searched. Transactions recovered is the number of transactions found for this partition.

Server recovery state meanings are shown in Table 5-4.

Table 5-4 Monitor Recovery States
State          Meaning
wt_tr_ok       Server is waiting for routers to accept it
wt_quorum      Server is waiting for backend to be quorate
lcl_rec        Local recovery
lcl_rec_fail   Primary server waiting for access to a restart journal
lcl_rec_icpl   Getting next journal to recover from
lcl_rec_cpl    Processed all journals for local recovery
shd_rec        Shadow recovery
shd_rec_fail   Shadow server waiting for access to a restart journal
shd_rec_icpl   Shadow getting next journal to recover from
shd_rec_cpl    Processed all journals for shadow recovery
catchup        Secondary is catching up with primary
standby        Server is declared as standby
active         Server is active
pri_act        Server is active as primary shadow
sec_act        Server is active as secondary shadow
remember       Primary is running without shadow secondary

5.2.23 Monitor Rejects


                             Rejected Transaction Summary 
NODE: NODEA                       PROCESS: 20413894      Fri Apr  9 1999 10:26:14 
 
        Time           Pid     Chan    Reason   Status Text 
-------------------   ------  ------  --------  ----------------------------- 
Fri Apr  9 10:18:43  20417266 client         0  No server available to handle 
Fri Apr  9 10:17:47  20417274 server         0  Client aborted tx 

Displays the last rtr_mt_rejected message received by each running process.

Table 5-5 MONITOR REJECTS Fields
Field        Meaning
Time         Time of day that the rtr_mt_rejected message was received.
Pid          The process id that received the message.
Chan         The type of channel (client or server) that received the message.
Reason       The reason field returned in the rtr_status_data_t buffer.
Status Text  The textual status that describes the reject reason.

5.2.24 Monitor Rejhist


                             Rejected Transaction History 
NODE: NODEA                       PROCESS: 38009A8B      Mon Mar  9 1999 10:26:14 
 
        Time           Chan    Reason   Status Text 
-------------------   ------   -------  -------------------------------------- 
Mon Mar 15 18:06:06   client         0  Client aborted tx 
Mon Mar 15 18:06:41   server         0  Normal successful completion 
Mon Mar 15 18:06:41   client         0  Server aborted tx 
 
------------------------------------------------------------------------------- 
 
                        number of reject msgs = 3 
                        number of accept msgs = 0 
                        rejected / total txns = 100% 
 

Displays the last ten rtr_mt_rejected messages received by the selected process. This picture should always be invoked with the /ID qualifier. The transaction identifier associated with the rejected transaction can be displayed with the SHOW PROCESS <id>/COUNTER=api_reject* command.

Table 5-6 MONITOR REJHIST Fields
Field        Meaning
Time         Time of day that the rtr_mt_rejected message was generated.
Chan         The type of channel (client or server) that received the message.
Reason       The reason field returned in the rtr_status_data_t buffer.
Status Text  The textual status that describes the reject reason.

5.2.25 Monitor Response


 
                  TRANSACTION DURATION AT 10:24:51 Fri Apr  9 1999 
 
 Process     Image         Client Response Time        Server Response Time 
   ID        Name        seconds 0   1   2   3   4   seconds 0   1   2   3   4 
20413894   SERVER.EXE;4   0.000                       3.670 
20417266   RTR.EXE;75     2.200                       3.440 
20417274   SERVER.EXE;4   0.000                       1.160 

Displays the elapsed time that a transaction has been active on the opened channels of a process. On the client, transaction duration is measured between the rtr_start_tx or rtr_send_tx call and the receipt of the final rtr_mt_accepted or rtr_mt_rejected message. A call to rtr_reject_tx also marks the end of a transaction. On the server, transaction duration is measured between receipt of a rtr_mt_msg1 or rtr_mt_msg1_uncertain message and the receipt of the final rtr_mt_rejected message or rtr_reject_tx call. Accepted transaction end times are recorded when the server issues a rtr_receive_message call to request a new transaction for processing.
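
The client-side measurement described above can be sketched as follows. This is a simplified illustration, not RTR code: the `client_durations` function and the event list are hypothetical, and the sketch assumes that the first rtr_start_tx or rtr_send_tx call opens the transaction and that the first accepted, rejected, or rtr_reject_tx event closes it.

```python
CLIENT_START = {"rtr_start_tx", "rtr_send_tx"}
CLIENT_END = {"rtr_mt_accepted", "rtr_mt_rejected", "rtr_reject_tx"}

def client_durations(events):
    """Return elapsed seconds for each completed transaction on one channel.

    events is a time-ordered list of (seconds, name) pairs.
    """
    durations = []
    started = None
    for when, name in events:
        if name in CLIENT_START and started is None:
            started = when  # first start/send call opens the transaction
        elif name in CLIENT_END and started is not None:
            durations.append(round(when - started, 3))
            started = None  # transaction closed; wait for the next start
    return durations

# Hypothetical event trace for one client channel.
events = [
    (0.0, "rtr_start_tx"),
    (0.5, "rtr_send_tx"),
    (2.2, "rtr_mt_accepted"),
    (3.0, "rtr_send_tx"),
    (3.4, "rtr_reject_tx"),
]
durations = client_durations(events)
print(durations)
```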

5.2.26 Monitor Rolequorum


 
  QUORUM COUNTS BY FACILITY  7-JAN-1999 14:32:48, NODE: -ALL- 
 
     Router View of           Backend View of 
         backends    routers       backend      routers 
       CNF RCH QRT  CNF RCH QRT  CNF RCH QRT  CNF RCH QRT 
 
VIP                           1   1   1    1   1   1    1   1   1    1   1   1 

5.2.27 Monitor Routers


              ROUTER TRANSACTION COUNTERS AT 14:33:29  7-JAN-1999 
 
Node:     -ALL- 
Facility: -ALL- 
 
            Abs        Rate         10  20  30  40  50  60  70  80  90 100 
Starts:      116641    25.7 
Enqueues:    116641    25.7 
Commits:     116641    25.6 
Aborts:           0     0.0 

Displays information on a router node. It gives an indication of the utilization of the router in terms of transactions and broadcasts routed through this node. It is useful for monitoring performance or locating problems. Uses counters in the Transaction Manager (TM) subsystem.

5.2.28 Monitor Routing


ROUTING STATISTICS BY FACILITY  Thu Apr-15-1999 14:34:20, NODE: -ALL- 
 
                                        Transactions         Broadcasts 
                                       Absolute    Rate   Absolute    Rate 
Total                                    118489    39.2      68444   994.0 
 
VIP                                      118489    39.2      68444   994.0 

Displays statistics of transaction and broadcast traffic by facility. Rate is the number of transactions or broadcasts per second within the monitoring interval.

5.2.29 Monitor RSCBE


 
RTR> Monitor rscbe 
 
 
Most Recent RSC Dclsrv Calls History on Backend LENGTH Thu Mar  4 1999,15:19:41 
 
Key Range Id:   16777216     Partition Start Time: THU MAR  4 15:18:22 1999 
Image Name:   RTR.EXE 
 
T-delta  RSC calls           router            state                 seq_nr 
       0 send_dcl_to_master  sfranc            wait_for_response              0 
       1 recv_status_ok      sfranc            rstart_rvy                     1 
       1 send_dcl_to_master  sfranc            rstart_rvy_incomp              1 
       1 recv_status_ok      sfranc            rstart_rvy                     1 
       1 send_dcl_to_master  sfranc            rstart_rvy_incomp              1 
       1 recv_status_ok      sfranc            active                         1 
       1 send_dcl_to_other   depth             active                         1 
       1 recv_status_ok      depth             active                         1 

Displays backend request to server messages and the declared state for routers.

5.2.30 Monitor RTR


RTR> Monitor RTR 
 
                   RTR COUNTERS  7-JAN-1999 14:35:05, NODE: -ALL- 
 
ACP_WAKEUPS         310484   QRM_QCXTS               48 
ACP_WAKE_REQS       859200   QRM_QIRS                 0 
CM_BYTES_PRESENT    1048576  QRM_RAES                 0 
CM_BYTES_IN_USE      51968   QRM_MAXQUOTA        278718 
CM_FREECHCPCKT        3738   QRM_CURQUOTA        278718 
TIM_TIMER_SETS      238349   QRM_RDES                18 
TIM_TIMER_CANCELS    34165   QRM_QARS                 0 
TIM_TIMER_DELIVERS  204174   QRM_QUERIES_SENT    481308 
TM_QRM_QUERIES_SENT      0   QRM_QUERIES_RCVD    481308 
TM_QRM_QUERIES_RCVD      0   QRM_RESPONS_SENT      3082 
NCF_REJECTEE             1   QRM_RESPONS_RCVD      3092 
NCF_REJECTER             1   QRM_RETRIES              7 
NCF_FACILITY_UP          2   QRM_TIMEOUTS             0 
NCF_FACILITY_DOWN        0 
RSC_ALLOC_MEM         2098 

Displays various per-node counters.

5.2.31 Monitor Stalls


                NETWORK STALLS AT 29-JAN-1999 15:35:03, ON NODE: TR1 
                     QIOs      Bytes  Link             Stalls 
                 Issued Rate    Sent Drops Secs  <3s  <10s  <30s  >30s Tot 
Total              5467  0.0  327148   2     33   23     1     0     0  24 
TR1 -> TR1           29  0.0    3718   0      0    0     0     0     0   0 
TR1 -> FE2          509  0.0   20707   0      4    4     0     0     0   4 
TR1 -> BE1          303  0.0   13707   0      3    3     0     0     0   3 
TR2 -> TR2          111  0.0   11682   0      0    0     0     0     0   0 
TR2 -> BE1          504  0.0   22743   0     18    8     1     0     0   9 Stall 
FE1 -> FE1           64  0.0    6645   0      0    0     0     0     0   0 
FE1 -> FE2          373  0.0   18890   0      2    2     0     0     0   2 
FE1 -> BE1          310  0.0   24487   0      0    0     0     0     0   0 
FE2 -> FE2          231  0.0   18900   0      0    0     0     0     0   0 
FE2 -> BE1          284  0.0   22503   1      1    1     0     0     0   1 
FE2 -> TR1          536  0.0   28166   0      0    0     0     0     0   0 
FE2 -> FE1          396  0.0   23643   0      0    0     0     0     0   0 
BE1 -> BE1          355  0.0   28121   0      0    0     0     0     0   0 
BE1 -> FE2          284  0.0   13014   1      0    0     0     0     0   0 
BE1 -> TR2          515  0.0   27502   0      2    2     0     0     0   2 
BE1 -> TR1          328  0.0   25698   0      1    1     0     0     0   1 
BE1 -> FE1          335  0.0   17022   0      2    2     0     0     0   2 

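Displays in real time any network links that are currently stalling (that is, waiting to transmit outbound traffic) and provides a history of the stalls that the various links have encountered during their lifetime. The display shows, for each link, the QIOs issued and their rate, the bytes sent, the number of link drops, and the stall columns: Secs, the counts of stalls lasting <3s, <10s, <30s, and >30s, and the total (Tot).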

5.2.32 Monitor System


                  System Status at 10:27:51 Fri Apr  9 1999 
                           node: NODEA 
 
     Resource                OK   Warning 
Facility QUORUM states......          x 
 
JOURNAL free space..........  x 
                                            Note: Additional detail 
Link CONNECTS...............          x     about a resource can be 
                                            obtained by monitoring 
Link traffic STALLS.........  x             the subsystem specified 
                                            in capital letters. 
FLOW control credits........  x 
                                            For example, to get more 
PARTITION states............  x             information on links 
                                            type MONITOR CONNECTS 
CALL Msg outstanding..........        x 
                                            To modify threshold values, 
Transaction QUEUES............ x            edit the file SYSTEM.MON. 
 
Transaction REJECTS........... x 
 
Broadcast EVENT discards...... x 

Displays the state of critical resources within the RTR environment. If a resource has exceeded a predefined threshold, a warning indicator is displayed.

The default thresholds are as follows:
Quorum     Warn if any roles are inquorate.
Journal    Warn if journal free space is less than 30% of total.
Links      Warn if any link is disconnected.
Stalls     Warn if 10 second stalls are greater than 1% of all messages sent.
Flow       Warn if the wait is more than one second for 10% of the total credit requests.
Partition  Warn if any of the partitions are not in one of the following states: Standby, Active, Pri_act, or Sec_act.
Calls      Warn if any messages have been pending for more than 30 seconds.
Queues     Warn if the transaction queue cannot be emptied within 10 seconds.
Rejects    Warn if the number of rejects (non-user) is greater than 5% of the total transactions processed, or a reject (non-user) has occurred within the last 30 minutes.
Events     Warn if the number of discards is greater than 5% of the total events sent.

Threshold values can be customized by editing the file SYSTEM.MON.
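
A few of the default thresholds listed above can be expressed as simple checks, as in the sketch below. It is an illustration only: the `system_warnings` function and the keys of the measurement dictionary are hypothetical, a link state other than "up" is assumed to mean disconnected, and nothing here reflects the format of the SYSTEM.MON file.

```python
OK_PARTITION_STATES = {"standby", "active", "pri_act", "sec_act"}

def system_warnings(m):
    """Return the names of resources that breach the default thresholds.

    m is a hypothetical dict of already-collected measurements; it is not
    an RTR data structure.
    """
    warnings = []
    # Journal: warn if free space is less than 30% of total.
    if m["journal_free_bytes"] < 0.30 * m["journal_total_bytes"]:
        warnings.append("Journal")
    # Links: warn if any link is disconnected (assumed: state is not "up").
    if any(state != "up" for state in m["link_states"]):
        warnings.append("Links")
    # Partition: warn if any partition is outside the four accepted states.
    if any(s not in OK_PARTITION_STATES for s in m["partition_states"]):
        warnings.append("Partition")
    # Events: warn if discards exceed 5% of the total events sent.
    if m["events_sent"] and m["event_discards"] > 0.05 * m["events_sent"]:
        warnings.append("Events")
    return warnings

# Hypothetical measurements for one node.
sample = {
    "journal_free_bytes": 200,
    "journal_total_bytes": 1000,
    "link_states": ["up", "down"],
    "partition_states": ["active", "lcl_rec"],
    "events_sent": 1000,
    "event_discards": 10,
}
result = system_warnings(sample)
print(result)
```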

