Connectrix: Cisco: okButDiagFailed on Supervisor Module and show tech-support details Cannot Be Collected

Summary: The Supervisor module status is HA-Standby, but in addition to the DIAG failure, errors are being streamed on NFDC.


Symptoms

While investigating this issue, the switch logs cannot be collected because the TMP folder is full.

Show tech details will take 4-8 minutes to complete. Please Wait ...
Collecting show-tech at Tue Nov  5 09:35:02 2024

---- Show of Part 0 Completed ----

---- Show of Part 1 Completed ----

---- Show of Part 2 Completed ----

---- Show of Part 3 Completed ----

RSFP-DBG: Total Time Taken = 0s

RSFP-DBG: ID 00 PID 30420 30420: 0s

RSFP-DBG: ID 01 PID 30421 30421: 0s

RSFP-DBG: ID 02 PID 30422 30422: 0s

RSFP-DBG: ID 03 PID 30423 30423: 0s
Done collecting show-tech at Tue Nov  5 09:35:02 2024

############Collecting Data from Line=cards###########
/isan/bin/tcap_bash_nounzip: line 18: cannot create temp file for here-document: No space left on device
/isan/bin/tcap_bash_nounzip: line 18: cannot create temp file for here-document: No space left on device
/isan/bin/tcap_bash_nounzip: line 18: cannot create temp file for here-document: No space left on device
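
The failure signature above can be searched for automatically when the session was captured to a text file. The following is a minimal Python sketch under that assumption; the function name find_no_space_errors is hypothetical and is not part of any Cisco or Dell tool.

```python
import re

# Signature seen in the transcript above when a filesystem is full.
NO_SPACE = re.compile(r"No space left on device")


def find_no_space_errors(transcript: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that report a full filesystem."""
    hits = []
    for number, line in enumerate(transcript.splitlines(), start=1):
        if NO_SPACE.search(line):
            hits.append((number, line.strip()))
    return hits


# Lines copied from the outputs shown in this article.
sample = """\
############Collecting Data from Line=cards###########
/isan/bin/tcap_bash_nounzip: line 18: cannot create temp file for here-document: No space left on device
df: write error: No space left on device
"""

for number, line in find_no_space_errors(sample):
    print(f"line {number}: {line}")
```

Any hit indicates that the collection was cut short and that the resulting show tech-support file is likely incomplete.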

 

+ The /var/tmp directory is 100% full.

 

show system internal flash
df: write error: No space left on device
Mount-on                  1K-blocks      Used   Available   Use%  Filesystem


show system internal dir /var/tmp
                                                                ./     1536040
                                                               ../         380
                                                 cfg_status.log_1            0
                               esrs_curl_response_155235395781085            0
                               esrs_http_response_155235395781085            1
                               esrs_curl_response_222655613294187            0
                               esrs_http_response_222655613294187            1
                               esrs_curl_response_102722484048004            1
                               esrs_http_response_102722484048004            1
                               esrs_curl_response_234654123105028            1
                               esrs_http_response_234654123105028            1
                               esrs_curl_response_234154112134614            1
                               esrs_http_response_234154112134614            1
                               esrs_curl_response_115721101643319            1
                               esrs_http_response_115721101643319            1
                               esrs_curl_response_010652840900059            1
                               esrs_http_response_010652840900059            1
                               esrs_curl_response_010152761078453            1
                               esrs_http_response_010152761078453            1
                               esrs_curl_response_005652696830477            1
                               esrs_http_response_005652696830477            1
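
To see which kind of file dominates /var/tmp, the listing above can be grouped by name. The following Python sketch is illustrative only: it assumes each row is a name followed by a numeric size (the unit is not stated in the output), and the function name summarize_listing is hypothetical.

```python
import re
from collections import Counter

# Each listing row is "<name> <size>"; the size unit is not stated in the output.
ROW = re.compile(r"^\s*(\S+)\s+(\d+)\s*$")


def summarize_listing(listing: str) -> Counter:
    """Count entries per name family, ignoring the trailing numeric ID."""
    families = Counter()
    for line in listing.splitlines():
        match = ROW.match(line)
        if not match:
            continue
        name = match.group(1)
        if name in ("./", "../"):
            continue  # skip the directory entries themselves
        # "esrs_curl_response_155235395781085" -> "esrs_curl_response"
        families[re.sub(r"_\d+$", "", name)] += 1
    return families


# A few rows copied from the listing in this article.
sample = """\
                                ./     1536040
                               ../         380
                 cfg_status.log_1            0
 esrs_curl_response_155235395781085            0
 esrs_http_response_155235395781085            1
 esrs_curl_response_222655613294187            0
"""

for family, count in summarize_listing(sample).most_common():
    print(f"{family}: {count} file(s)")
```

In the listing shown in this article, nearly every entry belongs to the esrs_curl_response or esrs_http_response families.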

 

If a similar issue is found, contact Dell Support to obtain a DPlug from Cisco and clear the TMP space so that the logs can be collected.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 

After the logs were collected, the module was found to be reporting a BIOS error because the tests were errDisabled.

exception information --- exception instance 7 ----
Module Slot Number: 4
Device Id         : 0
Device Name       : undef
Device Errorcode  : 0x00000000
Device ID         : 00 (0x00)
Device Instance   : 00 (0x00)
Dev Type (HW/SW)  : 00 (0x00)
ErrNum (devInfo)  : 00 (0x00)
System Errorcode  : 0x40710022 BIOS file write error 
Error Type        : Warning
PhyPortLayer      : 0x0
Port(s) Affected  : 
Error Description : Secondary BootROM test failed
DSAP              : 0 (0x0)
UUID              : 483 (0x1e3)
Time              : Sat Mar  9 11:05:18 2024
                    (Ticks: 65EC345E jiffies) 

2024 Feb 23 14:57:32 OSL-D1-9706-31-Fabric2 %PORT-5-IF_UP: %$VSAN 100%$ Interface fc1/15 is up in mode F   
2024 Mar  9 11:05:18 OSL-D1-9706-31-Fabric2 %DIAGCLIENT-2-EEM_ACTION_HM_SHUTDOWN: Test <PrimaryBootROM> has been disabled as a part of default EEM action
2024 Mar  9 11:05:18 OSL-D1-9706-31-Fabric2 %DEVICE_TEST-2-PRIMARY_BOOTROM_FAIL: Module 4 has failed test PrimaryBootROM 20 times on device Primary BootROM due to error BIOS file write error
2024 Mar  9 11:05:18 OSL-D1-9706-31-Fabric2 %MODULE-4-MOD_WARNING: Module 4 (Serial number: JAE22290BR9) reported warning 4/1-4/0 due to BIOS file write error in device DEV_UNDEF (device error 0x0)
2024 Mar  9 11:05:18 OSL-D1-9706-31-Fabric2 %DIAGCLIENT-2-EEM_ACTION_HM_SHUTDOWN: Test <SecondaryBootROM> has been disabled as a part of default EEM action
2024 Mar  9 11:05:18 OSL-D1-9706-31-Fabric2 %DEVICE_TEST-2-SECONDARY_BOOTROM_FAIL: Module 4 has failed test SecondaryBootROM 20 times on device Secondary BootROM due to error BIOS file write error
2024 Mar  9 11:05:18 OSL-D1-9706-31-Fabric2 %CALLHOME-2-EVENT: MODULE_WARNING
2024 Mar  9 11:05:18 OSL-D1-9706-31-Fabric2 %MODULE-4-MOD_WARNING: Module 4 (Serial number: JAE22290BR9) reported warning 4/1-4/0 due to BIOS file write error in device DEV_UNDEF (device error 0x0)
2024 Mar  9 11:05:18 OSL-D1-9706-31-Fabric2 %CALLHOME-2-EVENT: GOLD-minor
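
When reviewing a large syslog, the BootROM test failures can be pulled out with a short script. The following Python sketch assumes the DEVICE_TEST message layout shown above; the names BootromFailure and parse_bootrom_failures are hypothetical, and other NX-OS releases may word the message differently.

```python
import re
from dataclasses import dataclass

# Pattern modeled on the two DEVICE_TEST lines in this article.
FAIL = re.compile(
    r"%DEVICE_TEST-\d-(?P<mnemonic>\w+): Module (?P<module>\d+) has failed test "
    r"(?P<test>\S+) (?P<count>\d+) times on device (?P<device>.+?) "
    r"due to error (?P<error>.+)$"
)


@dataclass
class BootromFailure:
    module: int
    test: str
    count: int
    device: str
    error: str


def parse_bootrom_failures(log: str) -> list[BootromFailure]:
    """Extract one record per DEVICE_TEST failure line; other lines are ignored."""
    failures = []
    for line in log.splitlines():
        match = FAIL.search(line.strip())
        if match:
            failures.append(
                BootromFailure(
                    module=int(match["module"]),
                    test=match["test"],
                    count=int(match["count"]),
                    device=match["device"],
                    error=match["error"],
                )
            )
    return failures


# Lines copied from the log in this article.
sample = """\
2024 Feb 23 14:57:32 OSL-D1-9706-31-Fabric2 %PORT-5-IF_UP: %$VSAN 100%$ Interface fc1/15 is up in mode F
2024 Mar  9 11:05:18 OSL-D1-9706-31-Fabric2 %DEVICE_TEST-2-PRIMARY_BOOTROM_FAIL: Module 4 has failed test PrimaryBootROM 20 times on device Primary BootROM due to error BIOS file write error
2024 Mar  9 11:05:18 OSL-D1-9706-31-Fabric2 %DEVICE_TEST-2-SECONDARY_BOOTROM_FAIL: Module 4 has failed test SecondaryBootROM 20 times on device Secondary BootROM due to error BIOS file write error
"""

for failure in parse_bootrom_failures(sample):
    print(f"module {failure.module}: {failure.test} failed {failure.count} times ({failure.error})")
```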

Cause

The module reports a BIOS error because the tests were errDisabled.

Resolution

Run the BIOS tests to restore the module status.

# show system verify bios flash 0
# show system verify bios flash 1

# diagnostic clear result module 4 test 5
# diagnostic clear result module 4 test 6

# diagnostic start module 4 test 5
# diagnostic start module 4 test 6

# diagnostic stop module 4 test 5
# diagnostic stop module 4 test 6

# show diagnostic result module 4 detail
# show module
 
# show module
Mod  Online Diag Status
---  ------------------
1    Pass
2    Pass
3    Pass
4    Pass <<<<<<<<
5    Pass
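
The final check can also be scripted against saved show module output. The following Python sketch assumes the two-column "Mod / Online Diag Status" layout shown above; the function name non_passing_modules is hypothetical, and the "Fail" status in the second sample is illustrative rather than taken from a real switch.

```python
import re

# A data row starts with the module number followed by its status word.
ROW = re.compile(r"^\s*(\d+)\s+(\S+)")


def non_passing_modules(output: str) -> dict[int, str]:
    """Return {module: status} for every module whose status is not 'Pass'."""
    problems = {}
    for line in output.splitlines():
        match = ROW.match(line)
        if match and match.group(2) != "Pass":
            problems[int(match.group(1))] = match.group(2)
    return problems


# Table copied from this article (after the fix).
after_fix = """\
Mod  Online Diag Status
---  ------------------
1    Pass
2    Pass
3    Pass
4    Pass <<<<<<<<
5    Pass
"""

# Illustrative only: "Fail" is an assumed status string.
before_fix = after_fix.replace("4    Pass", "4    Fail")

print(non_passing_modules(after_fix))
print(non_passing_modules(before_fix))
```

An empty result means that every listed module reports Pass.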

Affected Products

Connectrix MDS-9706, Connectrix MDS-9706-V2
Article Properties
Article Number: 000289853
Article Type: Solution
Last Modified: 27 Feb 2025
Version: 1