PROBLEM: (CFS.68084) (PATCH ID: TCR160-029)
********
If two nodes in a cluster are communicating using the mc-api (for example, running an MPI application) and a third node, not involved in the calculation, is rebooted, the first two nodes can hang and must be rebooted to clear the hang. When a node enters or leaves the cluster (boots or shuts down), the Memory Channel failover code is invoked. The failover code was attempting to take a high-level lock which, under certain circumstances, was held by another process, so the mc-api failover code would wait forever in kernel mode, resulting in a system hang. The problem was fixed by changing the mc-api failover code to take the lower-level lock.

PROBLEM: (76608, 76806) (PATCH ID: TCR160-050)
********
This patch fixes a problem that can cause a panic in mcs_wait_cluster_event() when using the Memory Channel API. The following is an example stack trace from a lockmode=4 panic:

    0 boot()
    1 panic()
    2 simple_lock_fault()
    3 simple_unlock_count_violation()
    4 mcs_wait_cluster_event()
    5 mcs_configure()
    6 kmodcall()
    7 syscall()
    8 _Xsyscall()

PROBLEM: (CFS.76473) (PATCH ID: TCR160-064)
********
This patch fixes a problem with the Memory Channel API in which, if a node crashes while holding an mc-api lock, the lock is under certain circumstances not released after the crash. For the problem to occur, there must be three or more nodes in the cluster, and the node handling the cleanup after the crash (known as the primary mapper) must not have the lock allocated.

PROBLEM: (DE_G03574) (PATCH ID: TCR160-080)
********
This patch solves a problem with signal handling in the MC-API. A CPU would be totally blocked by one process until a busy condition was removed. On a single-CPU machine, this problem brings the whole machine down.
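
The signal-handling problem in TCR160-080 is an instance of a general pattern: a wait loop that spins on a busy condition without giving up the CPU or checking for a pending interruption. The following is a minimal user-space sketch of the non-blocking alternative, not the actual kernel fix. The function wait_while_busy() and its parameters are hypothetical names chosen for illustration, and a threading.Event stands in for a pending signal.

```python
import threading


def wait_while_busy(is_busy, stop_requested, poll_interval=0.001):
    """Wait until is_busy() returns False without monopolizing the CPU.

    is_busy        -- callable returning True while the resource is busy
    stop_requested -- threading.Event standing in for a pending signal
    poll_interval  -- seconds to sleep between polls

    Returns True if the busy condition cleared, or False if
    stop_requested was set before that happened.
    """
    while is_busy():
        # Event.wait() sleeps for up to poll_interval, which lets other
        # runnable work use the CPU, and returns True as soon as the
        # event is set, so an interruption is noticed promptly.
        if stop_requested.wait(poll_interval):
            return False
    return True


if __name__ == "__main__":
    stop = threading.Event()
    remaining = {"polls": 3}

    def busy():
        # Report "busy" for the first three polls, then "free".
        remaining["polls"] -= 1
        return remaining["polls"] >= 0

    print(wait_while_busy(busy, stop))
```

A loop that called is_busy() repeatedly with no sleep and no interruption check would show the behavior described in the problem statement: on a single CPU, nothing else could run to clear the busy condition.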