ECS:RAP015:温度故障;症状代码:2010
Summary: 节点上的温度传感器报告已达到严重级别。
This article applies to
This article does not apply to
This article is not tied to any specific product.
Not all product versions are identified in this article.
Symptoms
温度传感器检测到温度高于严重阈值。
某个组件可能运行不正常,导致温度传感器报告已达到严重级别。
节点上的温度传感器报告已达到严重级别。
某个组件可能运行不正常,导致温度传感器报告已达到严重级别。
节点上的温度传感器报告已达到严重级别。
Cause
出现导致温度传感器超过严重级别的问题。
Resolution
对于第 2 代,滚动到底部。
第 3 代硬件:
1.使用 cs_hal 检查报告的节点上的温度传感器的状态。
命令:
第 3 代硬件:
1.使用 cs_hal 检查报告的节点上的温度传感器的状态。
命令:
#cs_hal sensors temp
示例:对于 Gen3,只有三个温度传感器,如下所示。
admin@n1-mgmt:~> cs_hal sensors temp Entity Type Label Status Info ----- ----- ----- ----- ----- Processor Temperature Temp OK 53 Degrees Celsius Processor Temperature Temp OK 54 Degrees Celsius System Board Temperature Inlet Temp CRIT 40 Degrees Celsius; above critical threshold System Board Temperature Exhaust Temp OK 50 Degrees Celsius NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information. admin@n1-mgmt:~>
2. 检查机架中的所有节点,并查看 其他节点 是否报告温度传感器 不是“OK”
命令:
示例:在此示例中,机架上半部分的多个节点报告入口温度过高。
3.可能的情况:
年4.重要信息:使用 https://central.dell.com/case-lookup/ 并查找 PSNT (产品序列号标签)来检查历史记录。检查过去 3-6 个月内发生了多少次事件。检查问题是持续存在并影响多个节点,还是整个机架受到影响并且入口温度高于正常值,则表明存在需要解决的持续环境问题。除非有明确的行动计划和结论来解决温度问题,否则请勿将案例作为重复案例关闭。
5.如果 PE 团队未发现问题,或者历史记录包含同一警报中的多次实例(3 个月或更长时间),则通过 Swarm 咨询 L2 ,并 准备向 CE 发出工单 ,以检查受影响的机架和节点的环境状况。
命令:
viprexec -i cs_hal sensors temp
示例:在此示例中,机架上半部分的多个节点报告入口温度过高。
admin@n1-mgmt:~> viprexec -i cs_hal sensors temp Output from host : 192.168.219.1 Entity Type Label Status Info ----- ----- ----- ----- ----- Processor Temperature Temp OK 53 Degrees Celsius Processor Temperature Temp OK 53 Degrees Celsius System Board Temperature Inlet Temp CRIT 40 Degrees Celsius; above critical threshold System Board Temperature Exhaust Temp OK 50 Degrees Celsius NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information. Output from host : 192.168.219.2 Entity Type Label Status Info ----- ----- ----- ----- ----- Processor Temperature Temp OK 47 Degrees Celsius Processor Temperature Temp OK 49 Degrees Celsius System Board Temperature Inlet Temp CRIT 39 Degrees Celsius; above critical threshold System Board Temperature Exhaust Temp OK 50 Degrees Celsius NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information. Output from host : 192.168.219.3 Entity Type Label Status Info ----- ----- ----- ----- ----- Processor Temperature Temp OK 46 Degrees Celsius Processor Temperature Temp OK 46 Degrees Celsius System Board Temperature Inlet Temp OK 35 Degrees Celsius System Board Temperature Exhaust Temp OK 47 Degrees Celsius NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information. Output from host : 192.168.219.4 Entity Type Label Status Info ----- ----- ----- ----- ----- Processor Temperature Temp OK 48 Degrees Celsius Processor Temperature Temp OK 50 Degrees Celsius System Board Temperature Inlet Temp OK 35 Degrees Celsius System Board Temperature Exhaust Temp OK 47 Degrees Celsius NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information. Output from host : 192.168.219.5 Entity Type Label Status Info ----- ----- ----- ----- ----- Processor Temperature Temp OK 48 Degrees Celsius Processor Temperature Temp OK 50 Degrees Celsius System Board Temperature Inlet Temp WARN 38 Degrees Celsius; above non-critical threshold System Board Temperature Exhaust Temp OK 49 Degrees Celsius NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information. Output from host : 192.168.219.6 Entity Type Label Status Info ----- ----- ----- ----- ----- Processor Temperature Temp OK 50 Degrees Celsius Processor Temperature Temp OK 52 Degrees Celsius System Board Temperature Inlet Temp CRIT 39 Degrees Celsius; above critical threshold System Board Temperature Exhaust Temp OK 51 Degrees Celsius NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information. Output from host : 192.168.219.7 Entity Type Label Status Info ----- ----- ----- ----- ----- Processor Temperature Temp OK 45 Degrees Celsius Processor Temperature Temp OK 48 Degrees Celsius System Board Temperature Inlet Temp OK 36 Degrees Celsius System Board Temperature Exhaust Temp OK 47 Degrees Celsius NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information. Output from host : 192.168.219.8 Entity Type Label Status Info ----- ----- ----- ----- ----- Processor Temperature Temp OK 51 Degrees Celsius Processor Temperature Temp OK 49 Degrees Celsius System Board Temperature Inlet Temp OK 31 Degrees Celsius System Board Temperature Exhaust Temp OK 43 Degrees Celsius NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information. Output from host : 192.168.219.9 Entity Type Label Status Info ----- ----- ----- ----- ----- Processor Temperature Temp OK 52 Degrees Celsius Processor Temperature Temp OK 51 Degrees Celsius System Board Temperature Inlet Temp OK 30 Degrees Celsius System Board Temperature Exhaust Temp OK 42 Degrees Celsius NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information. Output from host : 192.168.219.10 Entity Type Label Status Info ----- ----- ----- ----- ----- Processor Temperature Temp OK 54 Degrees Celsius Processor Temperature Temp OK 51 Degrees Celsius System Board Temperature Inlet Temp OK 28 Degrees Celsius System Board Temperature Exhaust Temp OK 41 Degrees Celsius NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information. 192.168.219.7 Entity Type Label Status Info ----- ----- ----- ----- ----- Processor Temperature Temp OK 45 Degrees Celsius Processor Temperature Temp OK 48 Degrees Celsius System Board Temperature Inlet Temp OK 36 Degrees Celsius System Board Temperature Exhaust Temp OK 47 Degrees Celsius Output from host : 192.168.219.11 Entity Type Label Status Info ----- ----- ----- ----- ----- Processor Temperature Temp OK 56 Degrees Celsius Processor Temperature Temp OK 55 Degrees Celsius System Board Temperature Inlet Temp OK 27 Degrees Celsius System Board Temperature Exhaust Temp OK 40 Degrees Celsius NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information. Output from host : 192.168.219.12 Entity Type Label Status Info ----- ----- ----- ----- ----- Processor Temperature Temp OK 59 Degrees Celsius Processor Temperature Temp OK 59 Degrees Celsius System Board Temperature Inlet Temp OK 26 Degrees Celsius System Board Temperature Exhaust Temp OK 38 Degrees Celsius NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information. Output from host : 192.168.219.13 Entity Type Label Status Info ----- ----- ----- ----- ----- Processor Temperature Temp OK 51 Degrees Celsius Processor Temperature Temp OK 49 Degrees Celsius System Board Temperature Inlet Temp OK 26 Degrees Celsius System Board Temperature Exhaust Temp OK 36 Degrees Celsius NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information. Output from host : 192.168.219.14 Entity Type Label Status Info ----- ----- ----- ----- ----- Processor Temperature Temp OK 57 Degrees Celsius Processor Temperature Temp OK 60 Degrees Celsius System Board Temperature Inlet Temp OK 26 Degrees Celsius System Board Temperature Exhaust Temp OK 38 Degrees Celsius NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information. Output from host : 192.168.219.15 Entity Type Label Status Info ----- ----- ----- ----- ----- Processor Temperature Temp OK 59 Degrees Celsius Processor Temperature Temp OK 59 Degrees Celsius System Board Temperature Inlet Temp OK 26 Degrees Celsius System Board Temperature Exhaust Temp OK 39 Degrees Celsius NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information. Output from host : 192.168.219.16 Entity Type Label Status Info ----- ----- ----- ----- ----- Processor Temperature Temp OK 56 Degrees Celsius Processor Temperature Temp OK 56 Degrees Celsius System Board Temperature Inlet Temp OK 26 Degrees Celsius System Board Temperature Exhaust Temp OK 38 Degrees Celsius NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information. admin@n1-mgmt:~>
3.可能的情况:
- 一个节点仅 报告传感器或多个传感器:如果仅在温度报告不是“OK”的一个节点上出现此问题,则这很可能表明存在部件问题或节点通风不畅,因为更有可能是内部问题,而不是机架问题。
- 多个节点 受到影响,这更多的是机架本身或数据中心内的环境问题
4.检查风扇运行正常。如果没有,则可能需要更换风扇。
命令:
#cs_hal sensors fan示例:
admin@ecs:~>cs_hal sensors fan Output from host : 192.168.219.1 Entity Type Label Status Info ----- ----- ----- ----- ----- System Board Fan Fan1 OK 12600 RPM System Board Fan Fan2 OK 12600 RPM System Board Fan Fan3 OK 16920 RPM System Board Fan Fan4 OK 16800 RPM System Board Fan Fan5 OK 17040 RPM System Board Fan Fan6 OK 16920 RPM System Board Fan Fan Redundancy OK fully redundant; NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information.3.如果所有风扇都报告正常,则表示风扇系统没有问题。联系 Power Edge 团队 ,以检查部件是否需要更换。如果有任何风扇报告问题,请遵循 ECS:呼叫总部:风扇故障;SymptomCode:2008
年4.重要信息:使用 https://central.dell.com/case-lookup/ 并查找 PSNT (产品序列号标签)来检查历史记录。检查过去 3-6 个月内发生了多少次事件。检查问题是持续存在并影响多个节点,还是整个机架受到影响并且入口温度高于正常值,则表明存在需要解决的持续环境问题。除非有明确的行动计划和结论来解决温度问题,否则请勿将案例作为重复案例关闭。
5.如果 PE 团队未发现问题,或者历史记录包含同一警报中的多次实例(3 个月或更长时间),则通过 Swarm 咨询 L2 ,并 准备向 CE 发出工单 ,以检查受影响的机架和节点的环境状况。
第 2 代:
1.使用 cs_hal 检查温度传感器的状态。
示例:
# cs_hal sensors temp Entity Type Label Status Info ----- ----- ----- ----- ----- System Board Temperature SSB Therm Trip OK System Board Temperature BB Inlet Temp OK 32 Degrees Celsius CPU (DCMI Compat) Temperature HSBP Temp OK -222 Degrees Celsius System Board Temperature SSB Temp OK 60 Degrees Celsius System Board Temperature BB BMC Temp OK 51 Degrees Celsius System Board Temperature P1 VR Temp OK 38 Degrees Celsius System Board Temperature IB Temp OK 46 Degrees Celsius System Board Temperature Exit Air Temp OK 54 Degrees Celsius Front Panel Temperature IOM Temp OK 43 Degrees Celsius Drive Backplane Temperature HSBP PSOC OK 37 Degrees Celsius Front Panel Temperature LAN NIC Temp OK 67 Degrees Celsius Power Supply Temperature PS1 Temperature OK 34 Degrees Celsius Power Supply Temperature PS2 Temperature OK 34 Degrees Celsius Processor Temperature P1 Therm Margin OK 216 Degrees Celsius Processor Temperature P2 Therm Margin OK 206 Degrees Celsius Processor Temperature P1 Therm Ctrl % OK 0 Unspecified Processor Temperature P2 Therm Ctrl % OK 0 Unspecified Processor Temperature P1 DTS Therm Mgn OK 216 Degrees Celsius Processor Temperature P2 DTS Therm Mgn OK 206 Degrees Celsius Processor Temperature P1 VRD Hot OK Processor Temperature P2 VRD Hot OK System Board Temperature DIMM Thrm Mrgn 1 OK 201 Degrees Celsius System Board Temperature DIMM Thrm Mrgn 2 OK 200 Degrees Celsius System Board Temperature DIMM Thrm Mrgn 3 OK 198 Degrees Celsius System Board Temperature DIMM Thrm Mrgn 4 OK 197 Degrees Celsius System Board Temperature Agg Thrm Mgn 1 OK 233 Degrees Celsius
2.对于第 3 代版本,请遵循相同的步骤(但不向 PowerEdge 报告),未来将更新第 2 代的更多详细信息。
Affected Products
ECS ApplianceProducts
ECS ApplianceArticle Properties
Article Number: 000046763
Article Type: Solution
Last Modified: 30 Apr 2024
Version: 6
Find answers to your questions from other Dell users
Support Services
Check if your device is covered by Support Services.