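Connectrix: Cisco MDS: Module Reset with System Errorcode 0x42b8001e Fatal Error
Summary: When this issue occurs, the module resets and a few ports enter the "hw_Failure" state. The expected behavior is that only the identified port range is placed in the hardware failure state, without the entire module being reloaded. The error "F16_PLDA_RETRY_MERR" is a multi-bit ECC error, which is an uncorrectable hardware fault.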
Symptoms
A specific port range goes into the "hw_Failure" state:
`show interface brief`
-----------------------------------------------------------------------------
Interface  Vsan  Admin  Admin  Status     SFP  Oper  Oper    Port     Logical
                 Mode   Trunk                  Mode  Speed   Channel  Type
                        Mode                         (Gbps)
-----------------------------------------------------------------------------
fc9/41     1400  FX     off    hwFailure  swl  --    --      --       --
fc9/42     1400  FX     off    hwFailure  swl  --    --      --       --
fc9/43     1400  FX     off    hwFailure  swl  --    --      --       --
fc9/44     1400  FX     off    hwFailure  swl  --    --      --       --
fc9/45     1     FX     off    hwFailure  swl  --    --      --       --
fc9/46     1     FX     off    hwFailure  swl  --    --      --       --
fc9/47     1400  FX     off    hwFailure  swl  --    --      --       --
fc9/48     1     E      on     hwFailure  swl  --    --      57       --
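On a switch with many ports, the affected interfaces can be picked out of a captured copy of this output with a short script. The following is a minimal Python sketch, assuming the output has been saved as text and that the Status value is the fifth whitespace-separated field, as in the rows above. The function name `hw_failed_ports` is illustrative and is not part of any Cisco or Dell tooling.

```python
def hw_failed_ports(brief_output: str) -> list[str]:
    """Return FC interface names whose Status column reads hwFailure."""
    failed = []
    for line in brief_output.splitlines():
        fields = line.split()
        # Data rows start with an interface name such as fc9/41.
        # Assumption: Status is the fifth whitespace-separated field.
        if (
            len(fields) >= 5
            and fields[0].startswith("fc")
            and fields[4] == "hwFailure"
        ):
            failed.append(fields[0])
    return failed


# Rows copied from the output above; the header line is skipped
# because it does not start with an interface name.
sample = """\
Interface Vsan Admin Admin Status SFP Oper Oper Port Logical
fc9/41 1400 FX off hwFailure swl -- -- -- --
fc9/45 1 FX off hwFailure swl -- -- -- --
fc9/48 1 E on hwFailure swl -- -- 57 --
"""

print(hw_failed_ports(sample))
```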
The error code appears in the module internal exception log, as shown below:
`show module internal exceptionlog module 9`
********* Exception info for module 9 ********
exception information --- exception instance 1 ----
Module Slot Number: 9
Device Id : 204
Device Name : F16 Generic Driver
Device Errorcode : 0xccc05600
Device ID : 204 (0xcc)
Device Instance : 05 (0x05)
Dev Type (HW/SW) : 06 (0x06)
ErrNum (devInfo) : 00 (0x00)
System Errorcode : 0x42b8001e fatal error
Error Type : FATAL error
PhyPortLayer : Fibre Channel
Port(s) Affected : fc9/41-48
Error Description : F16_PLDA_RETRY_MERR
DSAP : 0 (0x0)
Time : Mon Jan 6 22:22:32 2025
(Ticks: 677CAC08 jiffies)
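To check a captured exception log for this specific signature, the `key : value` lines can be read into a dictionary. This is a rough Python sketch, assuming each field sits on its own line and is separated from its value by the first colon, as in the log above. The names `parse_exception_log` and `matches_this_article` are hypothetical helpers introduced here for illustration.

```python
def parse_exception_log(text: str) -> dict[str, str]:
    """Collect 'key : value' lines from an exception log into a dict."""
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip the parenthesised tick counter and lines without a colon.
        if line.startswith("(") or ":" not in line:
            continue
        # Split on the first colon only so the timestamp stays intact.
        key, _, value = line.partition(":")
        if value.strip():
            fields[key.strip()] = value.strip()
    return fields


def matches_this_article(fields: dict[str, str]) -> bool:
    """True when the log carries the error signature described here."""
    return (
        fields.get("Error Description") == "F16_PLDA_RETRY_MERR"
        and fields.get("System Errorcode", "").startswith("0x42b8001e")
    )


# Lines copied from the exception log above.
sample = """\
Module Slot Number: 9
Device Name : F16 Generic Driver
System Errorcode : 0x42b8001e fatal error
Port(s) Affected : fc9/41-48
Error Description : F16_PLDA_RETRY_MERR
Time : Mon Jan 6 22:22:32 2025
(Ticks: 677CAC08 jiffies)
"""

info = parse_exception_log(sample)
print(info["Port(s) Affected"], matches_this_article(info))
```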
`show logging nvram` shows the module reset:
`show logging nvram`
2025 Jan 6 22:22:32 WTXA19710H15 %MODULE-2-MOD_SOMEPORTS_FAILED: Module 9 (Serial number: JAE18280N1K) reported failure on ports fc9/41-48 (Fibre Channel) due to fatal error in device DEV_F16_CMN (device error 0xccc05600)
2025 Jan 6 22:22:33 WTXA19710H15 %CALLHOME-2-EVENT: PORT_FAILURE
`show logging log`
2025 Jan 6 22:22:32 WTXA19710H15 %MODULE-2-MOD_SOMEPORTS_FAILED: Module 9 (Serial number: JAE18280N1K) reported failure on ports fc9/41-48 (Fibre Channel) due to fatal error in device DEV_F16_CMN (device error 0xccc05600)
2025 Jan 6 22:22:32 WTXA19710H15 %PORT-CHANNEL-5-PORT_DOWN: port-channel57: fc9/48 is down
2025 Jan 6 22:22:33 WTXA19710H15 %PORT-5-IF_DOWN_HW_FAILURE: %$VSAN 1%$ Interface fc9/48 is down (Hardware Failure) port-channel57 ISL to WTXA19710C02 fc7/22
2025 Jan 6 22:22:33 WTXA19710H15 %CALLHOME-2-EVENT: PORT_FAILURE
2025 Jan 6 22:22:33 WTXA19710H15 %PORT-5-IF_DOWN_HW_FAILURE: %$VSAN 1400%$ Interface fc9/47 is down (Hardware Failure) ltx15brwccas01_h0
2025 Jan 6 22:22:33 WTXA19710H15 %PORT-5-IF_DOWN_HW_FAILURE: %$VSAN 1%$ Interface fc9/46 is down (Hardware Failure)
2025 Jan 6 22:22:33 WTXA19710H15 %PORT-5-IF_DOWN_HW_FAILURE: %$VSAN 1%$ Interface fc9/45 is down (Hardware Failure)
2025 Jan 6 22:22:33 WTXA19710H15 %PORT-5-IF_DOWN_HW_FAILURE: %$VSAN 1400%$ Interface fc9/44 is down (Hardware Failure)
2025 Jan 6 22:22:33 WTXA19710H15 %PORT-5-IF_DOWN_HW_FAILURE: %$VSAN 1400%$ Interface fc9/43 is down (Hardware Failure)
2025 Jan 6 22:22:33 WTXA19710H15 %PORT-5-IF_DOWN_HW_FAILURE: %$VSAN 1400%$ Interface fc9/42 is down (Hardware Failure) ltx14brwccas02_h0
2025 Jan 6 22:22:33 WTXA19710H15 %PORT-5-IF_DOWN_HW_FAILURE: %$VSAN 1400%$ Interface fc9/41 is down (Hardware Failure) ltx14brwccas01_h0
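The `%PORT-5-IF_DOWN_HW_FAILURE` messages above identify each failed interface and its VSAN, which helps when assessing which fabrics are impacted. As an illustration only, the Python sketch below extracts that mapping from saved log text. It assumes the message layout matches the entries shown above; the pattern `HW_FAIL` and the function `failed_interfaces` are names chosen for this example.

```python
import re

# Matches the message body as it appears in the log entries above.
HW_FAIL = re.compile(
    r"%PORT-5-IF_DOWN_HW_FAILURE: %\$VSAN (?P<vsan>\d+)%\$ "
    r"Interface (?P<intf>fc\d+/\d+) is down"
)


def failed_interfaces(log_text: str) -> dict[str, int]:
    """Map each interface reported down with Hardware Failure to its VSAN."""
    return {
        m.group("intf"): int(m.group("vsan"))
        for m in HW_FAIL.finditer(log_text)
    }


# Three entries copied from the log above.
sample = (
    "2025 Jan 6 22:22:33 WTXA19710H15 %PORT-5-IF_DOWN_HW_FAILURE: "
    "%$VSAN 1%$ Interface fc9/48 is down (Hardware Failure) "
    "port-channel57 ISL to WTXA19710C02 fc7/22\n"
    "2025 Jan 6 22:22:33 WTXA19710H15 %PORT-5-IF_DOWN_HW_FAILURE: "
    "%$VSAN 1400%$ Interface fc9/47 is down (Hardware Failure) "
    "ltx15brwccas01_h0\n"
    "2025 Jan 6 22:22:33 WTXA19710H15 %PORT-5-IF_DOWN_HW_FAILURE: "
    "%$VSAN 1%$ Interface fc9/46 is down (Hardware Failure)\n"
)

for intf, vsan in failed_interfaces(sample).items():
    print(intf, "VSAN", vsan)
```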
Cause
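A DEV_F16_CMN error on Cisco MDS 9000 series switches typically indicates a hardware issue related to the F16 ASIC. This error usually causes the module to reboot because of an unrecoverable multi-bit error correction code (ECC) error.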
Resolution
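The only way to recover the interfaces from the hw_Failure state is to reload the module, which is disruptive. In the command below, x is the module slot number:
# reload module x
If the interfaces are in a temporary hardware failure state, reloading the module recovers them. If the interfaces have a permanent hardware failure, proceed with module replacement.
Warning: This activity is disruptive and should be performed during a maintenance window.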
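One way to tell the temporary case from the permanent one is to re-check the affected port range after the reload. The Python sketch below is a hedged illustration of that comparison, not a documented procedure. It assumes the range is written as in the "Port(s) Affected" field (for example fc9/41-48) and that the post-reload status of each port has already been collected into a dictionary. The helper names and the sample status values are hypothetical.

```python
import re


def expand_port_range(spec: str) -> list[str]:
    """Expand a range such as 'fc9/41-48' into individual interface names."""
    m = re.fullmatch(r"(fc\d+/)(\d+)-(\d+)", spec)
    if not m:
        # Not a range; treat the input as a single interface name.
        return [spec]
    prefix, first, last = m.group(1), int(m.group(2)), int(m.group(3))
    return [f"{prefix}{n}" for n in range(first, last + 1)]


def still_failed(affected: str, status_by_port: dict[str, str]) -> list[str]:
    """Return ports from the affected range that remain in hwFailure."""
    return [
        port
        for port in expand_port_range(affected)
        if status_by_port.get(port) == "hwFailure"
    ]


# Hypothetical post-reload statuses, for illustration only.
after_reload = {"fc9/41": "up", "fc9/42": "hwFailure", "fc9/43": "up"}

remaining = still_failed("fc9/41-48", after_reload)
if remaining:
    print("Still in hwFailure, consider module replacement:", remaining)
else:
    print("All affected ports recovered")
```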
Additional Information
SR #203555104
Affected Products
Connectrix MDS-Series Hardware
Article Properties
Article Number: 000271449
Article Type: Solution
Last Modified: 03 Feb 2025
Version: 1