PROBLEM: (QAR 58405) (Patch ID: TCR150-002)
********

This patch fixes a problem booting a second member into a cluster.
The node may boot and form a cluster from the point of view of
cnxshow. However, attempts to access a DRD that is expected to be
served from the booting node generate the following message on the
console:

    bss_rm_iorequest: recvd I/O from dead host 0

Another problem may appear if a node of this "cluster" is rebooted.
In this case cnxshow will not show a normal cluster.

One can tell whether a cluster has this problem from the boot-time
messages on the node that is already up. Let the booting node be
node 0 and the node that is up be node 1. The messages on node 1
when node 0 boots should look like this:

    memory channel request from node 0
    memory channel update request from node 0
    memory channel - adding node 0

If the last message, "memory channel - adding node 0", is missing,
the problem exists.

One can also tell by examining the kernel with dbx and looking at
bss_server_work.last_bitmap:

    dbx -k /vmunix
    (dbx) p bss_server_work.last_bitmap

This should have the same value on all nodes, and the number of bits
set should equal the number of nodes in the cluster (a sketch of this
check follows this problem description).

It is possible that "memory channel - adding node 0" is present and
the bss_server_work.last_bitmap values look good, but cnxshow still
does not display a good cluster. This happens when the cluster, at
some point in the past, exhibited the missing "memory channel -
adding node 0" behavior. Note that the cluster may exhibit the
missing "memory channel - adding node 0" behavior frequently or
infrequently.
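For readers who want to verify the dbx output by hand, the following
is a minimal sketch of the check described above. It assumes the
value printed for bss_server_work.last_bitmap is an unsigned 64-bit
mask with one bit per cluster member; the example value and node
count are hypothetical and should be replaced with the values from
your own cluster.

    #include <stdio.h>

    /* Count the bits set in a last_bitmap value copied from dbx
     * output.  Assumes the bitmap is an unsigned 64-bit mask with
     * one bit per cluster member (illustration only). */
    static int count_set_bits(unsigned long long bitmap)
    {
        int count = 0;
        while (bitmap != 0) {
            count += (int)(bitmap & 1ULL);
            bitmap >>= 1;
        }
        return count;
    }

    int main(void)
    {
        unsigned long long last_bitmap = 0x3ULL; /* value printed by dbx (hypothetical) */
        int expected_nodes = 2;                  /* number of nodes in the cluster      */

        int set = count_set_bits(last_bitmap);
        printf("bits set: %d, expected nodes: %d -> %s\n",
               set, expected_nodes,
               (set == expected_nodes) ? "looks consistent" : "possible problem");
        return 0;
    }

If the bit count differs from the number of cluster members, or the
value differs between nodes, the cluster has the problem described
above.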
PROBLEM: (QAR 60982) (Patch ID: TCR150-015)
********

In a virtual hub cluster, shutting down one node can cause the other
to crash. Typical panic strings on the node that crashes are:

    rm_failover_self
    rm_failover_all: target rail offline

PROBLEM: (Patch ID: TCR150-021)
********

Various repairs in Memory Channel error handling, including fixes for
virtual hub booting with a cable unplugged. The typical panic string
for a boot with the cable unplugged is:

    rm_delete_context: fatal MC error

Panics removed in the failover code are:

    rm_failover_request_int_common: failed to free error_cnt lock
    rm_failover_request_int_common: failed to get error cnt lock

A panic removed in error handling is:

    rmerror_int: failed to free error count lock

A fix for noticing that a node has gone down during error handling
keeps another node from panicking with:

    rm_delete_context: fatal MC error

PROBLEM: (QAR 58777, QAR 59100, QAR 59466, QAR 59898, QAR 62225) (Patch ID: TCR150-026)
********

This patch corrects various problems with Memory Channel (MC) error
handling discovered in cable-pull-under-load tests. Typical panic
strings are "rm_delete_context: fatal MC error" and "Kernel Memory
Fault". In addition, the nodes may hang.

Pulling cables is still not recommended, even with this patch,
because we are still sorting out some problems that result in memory
corruption when cables are pulled. This patch adds audits to detect
some of the corruption and will crash a node if corruption is
detected. To test error handling in a safe way, power down the
active hub. While cable pull is not trouble free with this patch, it
is felt that error handling in general is more robust in this
implementation.

PROBLEM: (Patch ID: TCR150-029)
********

Hubless MC2 systems hang during boot and/or experience error
interrupts.

PROBLEM: (none) (Patch ID: TCR150-052)
********

Reliable Datagram (RDG) messaging delivers low-latency,
high-bandwidth networking for cluster applications. Cluster
applications wishing to use these features code to the API defined
in the RDG shared library, librdg.so.

PROBLEM: () (Patch ID: TCR150-065)
********

Applications developed to the Reliable Datagram API may see a problem
where RdgIoPoll() indicates that an I/O has completed when one
actually has not (an illustrative polling sketch appears after the
last entry in this section).

PROBLEM: (QAR 75850) (Patch ID: TCR150-078)
********

This patch fixes a kernel memory fault in rm_lock_update_retry().

PROBLEM: (QAR 73648) (Patch ID: TCR150-069)
********

This patch fixes a problem where both nodes in a cluster panic at the
same time with a simple_lock timeout panic.
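The RDG entries above refer to the librdg.so API only by name; the
actual function prototypes are not documented here. The following is
a minimal, hypothetical sketch of a completion-polling loop of the
kind affected by the RdgIoPoll() problem in TCR150-065. The
rdg_handle_t type, the RdgIoPoll() signature, and its return
convention (nonzero meaning the I/O has completed) are assumptions
made purely for illustration; a real application would use the
declarations shipped with librdg.so instead of the stub shown here.

    #include <stdio.h>

    typedef int rdg_handle_t;   /* assumed handle type (illustration only) */

    /* Assumed prototype for illustration; the real librdg.so
     * interface may differ.  Stubbed here so the sketch runs
     * stand-alone. */
    static int RdgIoPoll(rdg_handle_t handle)
    {
        (void)handle;
        return 1;   /* pretend the polled I/O has completed */
    }

    int main(void)
    {
        rdg_handle_t io = 0;    /* handle for a previously issued RDG I/O */

        /* Spin until the library reports completion.  The TCR150-065
         * fix addresses cases where this report could arrive before
         * the I/O had actually finished. */
        while (RdgIoPoll(io) == 0)
            ;   /* a real application would yield or sleep here */

        printf("RDG I/O reported complete\n");
        return 0;
    }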