PROBLEM: (QAR 48495) (Patch ID: TCR141-002)
********
One node would hang when another node went down. This is seen only when the cluster is VERY busy. The driver kept an image of the last known cluster configuration in memory and updated it regularly during interrupt processing. Handling a large number of notification interrupts sometimes caused that image to be updated before the driver processed the interrupt from a node entering or leaving the cluster. When it then processed the node entering/leaving interrupt, the driver could not determine what had changed, and did nothing. The driver now keeps a separate image of the cluster configuration for this purpose.

PROBLEM: (QAR 47824) (Patch ID: TCR141-002)
********
System would panic with the message "MC stuck removing {primary/alternate}'s node" on an 8-node cluster.

PROBLEM: (QAR 48549) (Patch ID: TCR141-002)
********
The driver thought it was setting a 1-2 second inactivity timeout when, in fact, it was setting a 4-8 second timeout, and vice versa. This happens only on virtual hub systems, and has not actually caused a problem, other than being wrong.

PROBLEM: (QAR 47370) (Patch ID: TCR141-002)
********
For redundant Memory Channel configurations supporting failover, turning off the current primary hub successfully causes a failover to the alternate hub, which becomes the new primary. With this fix, the old primary hub can then be turned on again and the new primary hub turned off, and the system will fail over successfully. Without this fix, the cluster would crash.

PROBLEM: (CLD MGO101413) (Patch ID: TCR141-036)
********
The 'sysconfig -q rm' command may crash a node. Without this patch, the command causes the rm_spur driver to probe the Memory Channel hardware in a way that essentially takes the adapter offline momentarily. Any data transfer in progress will fail catastrophically.
Typical crash data might look like:

     0 boot                src/kernel/arch/alpha/machdep.c : 2634
     1 panic               src/kernel/bsd/subr_prf.c : 707
     2 thread_block        src/kernel/kern/sched_prim.c : 1925
     3 thread_preempt      src/kernel/kern/sched_prim.c : 3820
     4 boot                src/kernel/arch/alpha/machdep.c : 2581
     5 panic               src/kernel/bsd/subr_prf.c : 791
     6 tlaser_pci_clr_err  src/kernel/io/dec/pci/pcia.c : 3467
     7 _XentInt            src/kernel/arch/alpha/locore.s : 1049
     8 read_io_port        src/kernel/io/dec/pci/pcia.c : 764
     9 tlaser_pci_config   src/kernel/io/dec/pci/pcia.c : 4141
    10 rmGetWindowSize     src/kernel/io/dec/pci/rm_spur.c : 2394
    11 rm_window_size      src/kernel/io/dec/pci/rm_spur.c : 2376
    12 rm_spur_configure   src/kernel/io/dec/pci/rm_spur_cfg.c : 176
    13 kmodcall            src/kernel/bsd/kern_kmodcall.c : 360
    14 syscall             src/kernel/arch/alpha/syscall_trap.c : 564
    15 _Xsyscall           src/kernel/arch/alpha/locore.s : 1209

and stack trace:

    CONTEXT: PANIC  PID: 32727  COMMAND: "sysconfig"
    THREAD: fffffc00ef584b00  CPU: 0 (active)
    EVENT: ----------------
    STATE: RUN  CPU: 0  PID: 32727  THREAD: fffffc00ef584b00
    COMMAND: "sysconfig"

     1: panic+172: boot(0x0, 0x4)
     2: thread_block+120: panic("thread_block: interrupt level call")
     3: thread_preempt+260: thread_block()
     4: boot+512: thread_preempt(0x26)
     5: panic+588: boot(0x0, 0x0)
     6: tlaser_pci_clr_err+904: panic("pciaerror")
     7: _XentInt+124: tlaser_pci_clr_err(0xfffffc000069c5f0)
     8: read_io_port+604: _XentInt()

         r0, v0: 0x10000008            r16, a0: 0xc300040218
         r1, t0: 0x40000000000         r17, a1: 0x1
         r2, t1: 0x2000                r18, a2: 0x300000000
         r3, t2: 0x1                   r19, a3: 0
         r4, t3: 0x1                   r20, a4: 0
         r5, t4: 0x2000000007265       r21, a5: 0x3
         r6, t5: 0x2000000007265       r22, t8: 0xc
         r7, t6: 0x300000000000000     r23, t9: 0xfffffc00c2d0f450
         r8, t7: 0x1                   r24,t10: 0xc300040200
         r9, s0: 0xfffffc00ffde8000    r25,t11: 0x18
        r10, s1: 0xc300000000          r26, ra: 0xfffffc0000649618
        r11, s2: 0                     r27,t12: 0xfffffc000064df24
        r12, s3: 0x4                   r28, at: 0xfffffc00005c2730
        r13, s4: 0xc300002010          r29, gp: 0xfffffc0000753b00
        r14, s5: 0x4                   r30, sp: 0xfffffffe91ccb670
        r15, s6: 0xc300000010          r31,zer: 0x1

     9: tlaser_pci_config+736: read_io_port(0xc300002010, 0x4, 0x0)
    10: rmGetWindowSize+172: tlaser_pci_config(0xc300000010, 0x4, 0x0, 0x10)
    11: rm_window_size+32: rmGetWindowSize(0xfffffc00ffde8000)
    12: rm_spur_configure+304: rm_window_size(0x1)
    13: kmodcall+2892: rm_spur_configure(0x2, 0xfffffc0057f85200, 0xc, 0x0, 0x0)
    14: syscall+752: kmodcall(???, 0xfffffffe91ccb8f0, 0xfffffffe91ccb8e0)
    15: _Xsyscall+116: syscall(0x1)
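
ILLUSTRATION (QAR 48495): The hang described under QAR 48495 above comes down to comparing the new cluster membership against an image that may already have been refreshed. The following is a minimal sketch of that idea in Python, not the driver's code. The class, its method names, and the use of sets of node numbers are all hypothetical, chosen only to show why a separate image lets the join/leave handler see the change.

```python
class ClusterView:
    """Toy model of the two configuration images (all names hypothetical)."""

    def __init__(self, members):
        self.live_image = frozenset(members)    # refreshed on every notification
        self.change_image = frozenset(members)  # updated only by the join/leave path

    def on_notification(self, hw_members):
        # Busy-cluster path: refresh the general-purpose image only.
        self.live_image = frozenset(hw_members)

    def on_membership_change_old(self, hw_members):
        # Assumed pre-fix behavior: diff against the shared image, which a
        # notification may already have refreshed, so the diff can be empty.
        now = frozenset(hw_members)
        changed = self.live_image ^ now
        self.live_image = now
        return changed

    def on_membership_change(self, hw_members):
        # Assumed post-fix behavior: diff against the private image.
        now = frozenset(hw_members)
        changed = self.change_image ^ now
        self.change_image = now
        self.live_image = now
        return changed


def demo():
    """Node 3 leaves, but a notification interrupt is handled first."""
    before, after = {1, 2, 3}, {1, 2}

    old = ClusterView(before)
    old.on_notification(after)
    missed = old.on_membership_change_old(after)

    new = ClusterView(before)
    new.on_notification(after)
    seen = new.on_membership_change(after)
    return missed, seen
```

In this model, demo() returns an empty set for the shared-image case (nothing appears to have changed, so the handler would do nothing) and a set containing node 3 for the separate-image case.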