Connectrix Cisco - Switch is powered off due to Kernel panic

Summary: The MDS switch went down and did not reload after a kernel panic. The switch had to be powered on manually.


Symptoms

The stack-trace output shows only 5 of the 6 CPUs online (CPU 0 through CPU 4 are listed). As a result, the switch went offline and did not reload.
 
`show system reset-reason`
----- reset reason for module 1 (from Supervisor in slot 1) ---
1) At 65175 usecs after Tue Oct 13 02:48:39 2020
    Reason: Kernel Panic
    Service:
    Version: 8.1(1a)
   
`show logging nvram`
2020 Oct 14 18:44:02.277 switch %SYSLOG-2-SYSTEM_MSG : Syslogs wont be logged into logflash until logflash is online
2020 Oct 14 18:44:05.851 switch %KERN-0-SYSTEM_MSG: [    0.036807] Host controller irq 55 - kernel
2020 Oct 14 18:44:05.880 switch %KERN-0-SYSTEM_MSG: [    0.057612] Assign root port irq 55 - kernel
2020 Oct 14 18:44:05.881 switch %KERN-0-SYSTEM_MSG: [    0.057633] Host controller irq 54 - kernel
2020 Oct 14 18:44:05.883 switch %KERN-0-SYSTEM_MSG: [    0.059007] Assign root port irq 54 - kernel
2020 Oct 14 18:44:05.912 switch %KERN-0-SYSTEM_MSG: [    0.774555] Enabling all PCI devices - kernel

`show logging onboard module stack-trace`
CPU 0 
Call Trace:
[b0463bc4] do_raw_spin_lock+0xec/0x120(unreliable)
[b0698738] dev_watchdog+0x58/0x29c
[b04324c4] call_timer_fn+0x48/0xfc
[b0432770] run_timer_softirq+0x1f4/0x248
[b042af38] __do_softirq+0x16c/0x368
[b042b428] irq_exit+0x68/0x90
[b040a3c8] timer_interrupt+0x204/0x278
[b040f388] ret_from_except+0x0/0x18

CPU 1   Process: swapper/1 (pid 0)
Call Trace:
[b04230a4] cpm_idle_wait+0x14/0x24(unreliable)
[b0469d4c] cpu_startup_entry+0x124/0x1e8
[b0410d78] start_secondary+0x1fc/0x200
[b0402234] start_secondary_47x+0x24/0x48

CPU 2   Process: sysinfo (pid 3855)
Call Trace:
[b0480d58] smp_call_function_many+0x268/0x274(unreliable)
[b04182f8] flush_tlb_mm+0x58/0x68
[b042532c] copy_process.part.69+0xbb0/0x11a0
[b0425a84] do_fork+0xd0/0x318
[b040ecac] ret_from_syscall+0x0/0x3c

CPU 3   Process: swapper/3 (pid 0)
Call Trace:
[b04230a4] cpm_idle_wait+0x14/0x24(unreliable)
[b0469d4c] cpu_startup_entry+0x124/0x1e8
[b0410d78] start_secondary+0x1fc/0x200
[b0402234] start_secondary_47x+0x24/0x48

CPU 4   Process: swapper/4 (pid 0)
Call Trace:
[b04230a4] cpm_idle_wait+0x14/0x24(unreliable)
[b0469d4c] cpu_startup_entry+0x124/0x1e8
[b0410d78] start_secondary+0x1fc/0x200
[b0402234] start_secondary_47x+0x24/0x48
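
The CPU count can also be checked by script when the stack-trace output has been saved to a text file. The following is a minimal sketch, not a Dell or Cisco tool. It assumes each online CPU appears as a line starting with `CPU <n>` (as in the output above) and that the CPUs are numbered 0 through 5. The function names and the sample text are hypothetical.

```python
import re

# From this article: the switch is expected to report 6 CPUs.
# Numbering them 0..5 is an assumption based on the trace above.
EXPECTED_CPUS = 6

def cpus_in_trace(trace_text):
    """Return the sorted CPU numbers that have a 'CPU <n>' header line."""
    found = re.findall(r"^CPU\s+(\d+)", trace_text, flags=re.MULTILINE)
    return sorted({int(n) for n in found})

def missing_cpus(trace_text, expected=EXPECTED_CPUS):
    """Return the CPU numbers in 0..expected-1 that have no header line."""
    seen = set(cpus_in_trace(trace_text))
    return [n for n in range(expected) if n not in seen]

# Abbreviated copy of the headers from the stack trace above.
sample = """CPU 0
Call Trace:
[b0463bc4] do_raw_spin_lock+0xec/0x120(unreliable)

CPU 1   Process: swapper/1 (pid 0)
CPU 2   Process: sysinfo (pid 3855)
CPU 3   Process: swapper/3 (pid 0)
CPU 4   Process: swapper/4 (pid 0)
"""

print(missing_cpus(sample))  # prints [5]
```

An empty list means all expected CPUs were listed in the trace.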

Cause

One of the CPUs was offline, which caused the switch to power off.

Resolution

Permanent Fix:
  • If not all 6 CPUs are online, replace the switch and make sure the new switch is running NX-OS v8.4(1a) or later.
  • If all 6 CPUs are online, upgrade the firmware to NX-OS v8.4(1a) or later to avoid recurrence.
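
The two cases above can be sketched as a small decision helper. This is an illustration only, under stated assumptions: the version parser handles just the simple `major.minor(maintenance[letter])` form such as `8.1(1a)`, and it treats a letter suffix as later than the same release without one. The function names are hypothetical, and the final return value covers a case this article does not address.

```python
import re

# NX-OS 8.4(1a): the minimum release named in this article.
FIXED_RELEASE = (8, 4, 1, "a")

def parse_nxos(version):
    """Parse a string like '8.1(1a)' into a comparable tuple (8, 1, 1, 'a')."""
    m = re.fullmatch(r"(\d+)\.(\d+)\((\d+)([a-z]*)\)", version.strip())
    if m is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, maint, letter = m.groups()
    return (int(major), int(minor), int(maint), letter)

def recommended_action(online_cpu_count, version, expected_cpus=6):
    """Map the CPU count and running release to the fix listed in this article."""
    if online_cpu_count < expected_cpus:
        return "replace the switch; run NX-OS 8.4(1a) or later on the new switch"
    if parse_nxos(version) < FIXED_RELEASE:
        return "upgrade to NX-OS 8.4(1a) or later"
    # Not covered by this article: all CPUs online and already on a fixed release.
    return "no action listed in this article"

# Values from the Symptoms section: 5 CPUs online, Version 8.1(1a).
print(recommended_action(5, "8.1(1a)"))
```

The CPU check comes first because the article recommends replacement whenever a CPU is offline, regardless of the running release.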

Additional Information

This is applicable only to Cisco MDS 9396S switches.

Affected Products

Connectrix MDS 9396S

Article Properties
Article Number: 000181486
Article Type: Solution
Last Modified: 07 Jan 2021
Version:  1