Connectrix Cisco: Switch Shut Down Due to a Kernel Panic
Summary: The MDS switch shut down and did not reload after a kernel panic. The switch must be powered on manually.
Symptoms
The stack-trace output shows that only 5 of the 6 CPUs were online, so the switch went offline and did not reload.
"show system reset-reason"
----- reset reason for module 1 (from Supervisor in slot 1) ---
1) At 65175 usecs after Tue Oct 13 02:48:39 2020
Reason: Kernel Panic
Service:
Version: 8.1(1a)
"show logging nvram"
2020 Oct 14 18:44:02.277 switch %SYSLOG-2-SYSTEM_MSG: Syslogs wont be logged into logflash until logflash is online
2020 Oct 14 18:44:05.851 switch %KERN-0-SYSTEM_MSG: [0.036807] Host controller irq 55 - kernel
2020 Oct 14 18:44:05.880 switch %KERN-0-SYSTEM_MSG: [0.057612] Assign root port irq 55 - kernel
2020 Oct 14 18:44:05.881 switch %KERN-0-SYSTEM_MSG: [0.057633] Host controller irq 54 - kernel
2020 Oct 14 18:44:05.883 switch %KERN-0-SYSTEM_MSG: [0.059007] Assign root port irq 54 - kernel
2020 Oct 14 18:44:05.912 switch %KERN-0-SYSTEM_MSG: [0.774555] Enabling all PCI devices - kernel
"show logging onboard module stack-trace"
CPU 0
Call Trace:
[b0463bc4] do_raw_spin_lock+0xec/0x120 (unreliable)
[b0698738] dev_watchdog+0x58/0x29c
[b04324c4] call_timer_fn+0x48/0xfc
[b0432770] run_timer_softirq+0x1f4/0x248
[b042af38] __do_softirq+0x16c/0x368
[b042b428] irq_exit+0x68/0x90
[b040a3c8] timer_interrupt+0x204/0x278
[b040f388] ret_from_except+0x0/0x18
CPU 1 Process: swapper/1 (pid 0)
Call Trace:
[b04230a4] cpm_idle_wait+0x14/0x24 (unreliable)
[b0469d4c] cpu_startup_entry+0x124/0x1e8
[b0410d78] start_secondary+0x1fc/0x200
[b0402234] start_secondary_47x+0x24/0x48
CPU 2 Process: sysinfo (pid 3855)
Call Trace:
[b0480d58] smp_call_function_many+0x268/0x274 (unreliable)
[b04182f8] flush_tlb_mm+0x58/0x68
[b042532c] copy_process.part.69+0xbb0/0x11a0
[b0425a84] do_fork+0xd0/0x318
[b040ecac] ret_from_syscall+0x0/0x3c
CPU 3 Process: swapper/3 (pid 0)
Call Trace:
[b04230a4] cpm_idle_wait+0x14/0x24 (unreliable)
[b0469d4c] cpu_startup_entry+0x124/0x1e8
[b0410d78] start_secondary+0x1fc/0x200
[b0402234] start_secondary_47x+0x24/0x48
CPU 4 Process: swapper/4 (pid 0)
Call Trace:
[b04230a4] cpm_idle_wait+0x14/0x24 (unreliable)
[b0469d4c] cpu_startup_entry+0x124/0x1e8
[b0410d78] start_secondary+0x1fc/0x200
[b0402234] start_secondary_47x+0x24/0x48
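The CPU count can be checked from a saved copy of the stack-trace output rather than by eye. The following is a minimal Python sketch, not a Dell or Cisco tool. It assumes that each CPU section begins with a line starting with "CPU <n>", that the switch should report six CPUs numbered 0 through 5, and that a CPU with no section in the trace was offline when the trace was captured. The names `cpus_in_trace` and `missing_cpus` are hypothetical.

```python
import re

# Assumption: the switch is expected to report six CPUs, numbered 0-5.
EXPECTED_CPUS = 6

# Matches section headers such as "CPU 0" or "CPU 3 Process: swapper/3 (pid 0)".
CPU_HEADER = re.compile(r"^\s*CPU\s+(\d+)\b", re.MULTILINE)


def cpus_in_trace(trace_text: str) -> set[int]:
    """Return the CPU ids that have a section in the stack-trace text."""
    return {int(match.group(1)) for match in CPU_HEADER.finditer(trace_text)}


def missing_cpus(trace_text: str, expected: int = EXPECTED_CPUS) -> list[int]:
    """Return the expected CPU ids that never appear in the trace."""
    return sorted(set(range(expected)) - cpus_in_trace(trace_text))


if __name__ == "__main__":
    # Abbreviated copy of the section headers from the trace in this article.
    sample = "\n".join([
        "CPU 0",
        "Call Trace:",
        "[b0463bc4] do_raw_spin_lock+0xec/0x120 (unreliable)",
        "CPU 1 Process: swapper/1 (pid 0)",
        "CPU 2 Process: sysinfo (pid 3855)",
        "CPU 3 Process: swapper/3 (pid 0)",
        "CPU 4 Process: swapper/4 (pid 0)",
    ])
    print("CPUs missing from trace:", missing_cpus(sample))
```

For the trace in this article, only CPUs 0 through 4 have a section, so the sketch reports CPU 5 as missing.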
Cause
One of the CPUs went offline, which caused the switch to shut down.
Resolution
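Permanent fix:
- If fewer than 6 CPUs are online, replace the switch and make sure the replacement switch runs NX-OS v8.4(1a) or later.
- If all 6 CPUs are online, upgrade the firmware to NX-OS v8.4(1a) or later to prevent a recurrence.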
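The two cases above can be expressed as a small decision helper. This is a Python sketch under stated assumptions, not vendor tooling: the version parser only handles strings shaped like "8.1(1a)" (major.minor, then a maintenance number with an optional letter suffix in parentheses), and the names `parse_nxos_version` and `recommend` are hypothetical. The third outcome, for a healthy switch already on a fixed release, is not described in this article and is included only so the function always returns something.

```python
import re

FIXED_RELEASE = "8.4(1a)"  # minimum release named in this article
EXPECTED_CPUS = 6          # CPU count named in this article

# Assumption: versions look like "8.1(1a)"; other NX-OS formats are rejected.
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]*)\)$")


def parse_nxos_version(text: str) -> tuple[int, int, int, str]:
    """Split a version string into a tuple that compares in release order."""
    match = VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, maintenance, suffix = match.groups()
    return int(major), int(minor), int(maintenance), suffix


def recommend(online_cpus: int, running_version: str) -> str:
    """Map the CPU count and running release to the action in the Resolution."""
    if online_cpus < EXPECTED_CPUS:
        return f"replace the switch; run NX-OS {FIXED_RELEASE} or later on the replacement"
    if parse_nxos_version(running_version) < parse_nxos_version(FIXED_RELEASE):
        return f"upgrade to NX-OS {FIXED_RELEASE} or later"
    return "no action from this article applies"


if __name__ == "__main__":
    # Values taken from the Symptoms section: 5 CPUs online, version 8.1(1a).
    print(recommend(5, "8.1(1a)"))
```

Comparing parsed tuples rather than raw strings avoids ordering mistakes such as treating "8.10(1)" as older than "8.4(1a)".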
Additional Information
This applies only to Cisco MDS 9396S switches.
Affected Products
Connectrix MDS 9396S
Article Properties
Article Number: 000181486
Article Type: Solution
Last Modified: 07 Jan 2021
Version: 1