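ECS: The System Has Detected a High Temperature on a Node
Summary: What should I check if I receive an email alert notifying me that the system has detected a high temperature sensor reading on a node?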
Instructions
-
Identify the hardware type of the node that raised the alert.
admin@node1:~> sudo xdoctor -x
Telegraf Version: 3.8.0.2-1549.73c8abc2
Fabric Version: 3.8.0.2-4347.d30cd09
Fabric-Zookeeper Version: 3.8.0.2-120.b4a1c5c
Utilities Version: 3.7.0.4-1166.b78f3fe
Influxdb Version: 3.8.0.2-1549.73c8abc2
Grafana Version: 3.8.0.2-1549.73c8abc2
Syslog Version: 3.8.0.2-4347.d30cd09
Service Version: 9.0.0.0-22840.479b013c74
Os Version: 3.8.0.2-2113.3fa664c.3
Fluxd Version: 3.8.0.2-1549.73c8abc2
Throttler Version: 3.8.0.2-1549.73c8abc2
Object Image Version: 3.8.0.2-138636.7343cd5c2c3
--------------------
ECS Version: 3.8.0.2
--------------------
HW Gen : 2
HW Model: U-Series
HW Code : S2600KP
-------------------------
xDoctor Version: 4.8-98.0
-------------------------
If the node is Gen 1/2, reply to the form in the email stating that you need assistance. If it is a Gen 3 node, follow the rest of this knowledge base article.
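The decision in this step can be sketched in Python. This is a minimal illustration, not part of xDoctor: it assumes the `xdoctor -x` output has already been captured as text, and the helper names `parse_hw_gen` and `next_action` are hypothetical. Only the `HW Gen` field shown above is read.

```python
import re


def parse_hw_gen(xdoctor_output):
    """Return the number after 'HW Gen :' in captured output, or None if absent."""
    match = re.search(r"HW Gen\s*:\s*(\d+)", xdoctor_output)
    return int(match.group(1)) if match else None


def next_action(hw_gen):
    """Map the hardware generation to the action this article describes."""
    if hw_gen in (1, 2):
        return "reply to the email form"
    if hw_gen == 3:
        return "continue with the remaining steps"
    # The article gives no guidance for other values (hypothetical fallback).
    return "unknown"


# Excerpt of the sample output shown above.
sample = """HW Gen : 2
HW Model: U-Series
HW Code : S2600KP"""

print(next_action(parse_hw_gen(sample)))  # prints "reply to the email form"
```

Searching for the field by name rather than by line position means the sketch does not depend on how many version lines precede it.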
-
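Check the current status of the temperature sensors. In the example below, two nodes show "CRIT", which indicates a problem on those two nodes. If all nodes report "OK" but you have received this alert several times recently, it may be a recurring issue. In that case, reply to the form in the email stating that you need assistance and that temperature alerts are being raised periodically.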
admin@node1:~> viprexec -i cs_hal sensors temp

Output from host : xxx.xxx.xxx.xxx
Entity        Type         Label         Status  Info
-----         -----        -----         -----   -----
Processor     Temperature  Temp          OK      53 Degrees Celsius
Processor     Temperature  Temp          OK      53 Degrees Celsius
System Board  Temperature  Inlet Temp    CRIT    40 Degrees Celsius; above critical threshold
System Board  Temperature  Exhaust Temp  OK      50 Degrees Celsius
NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information.

Output from host : xxx.xxx.xxx.xxx
Entity        Type         Label         Status  Info
-----         -----        -----         -----   -----
Processor     Temperature  Temp          OK      47 Degrees Celsius
Processor     Temperature  Temp          OK      49 Degrees Celsius
System Board  Temperature  Inlet Temp    CRIT    39 Degrees Celsius; above critical threshold
System Board  Temperature  Exhaust Temp  OK      50 Degrees Celsius
NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information.

Output from host : xxx.xxx.xxx.xxx
Entity        Type         Label         Status  Info
-----         -----        -----         -----   -----
Processor     Temperature  Temp          OK      46 Degrees Celsius
Processor     Temperature  Temp          OK      46 Degrees Celsius
System Board  Temperature  Inlet Temp    OK      35 Degrees Celsius
System Board  Temperature  Exhaust Temp  OK      47 Degrees Celsius
NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information.
...
...
...
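If multiple nodes report a status other than "OK", the problem may lie in your data center environment. Check that nothing in the area where the ECS is located could be raising its temperature.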
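On a rack with many nodes, scanning this output by eye is error-prone. The following Python sketch groups temperature rows by host and keeps only those whose status is not "OK". It assumes the output was saved as text with one sensor row per line, as a terminal prints it. The function name `non_ok_temperature_sensors` and the `192.0.2.x` addresses are hypothetical placeholders, and the row pattern is inferred only from the sample rows in this article.

```python
import re

# "Output from host : <address>" starts a new per-node section.
HOST = re.compile(r"^Output from host\s*:\s*(?P<host>\S+)")
# Row layout inferred from the sample: Entity, Type, Label, Status, Info.
ROW = re.compile(
    r"^(?P<entity>.+?)\s+Temperature\s+(?P<label>.+?)\s+"
    r"(?P<status>[A-Z]+)\s+(?P<info>.+)$"
)


def non_ok_temperature_sensors(output):
    """Map each host to a list of (label, status, info) for sensors not 'OK'."""
    findings = {}
    host = None
    for raw in output.splitlines():
        line = raw.strip()
        host_match = HOST.match(line)
        if host_match:
            host = host_match.group("host")
            findings.setdefault(host, [])
            continue
        row = ROW.match(line)
        if row and host is not None and row.group("status") != "OK":
            findings[host].append(
                (row.group("label"), row.group("status"), row.group("info"))
            )
    return findings


# Placeholder addresses; the sensor rows are taken from the sample above.
sample = """Output from host : 192.0.2.1
Processor Temperature Temp OK 53 Degrees Celsius
System Board Temperature Inlet Temp CRIT 40 Degrees Celsius; above critical threshold
System Board Temperature Exhaust Temp OK 50 Degrees Celsius
Output from host : 192.0.2.3
Processor Temperature Temp OK 46 Degrees Celsius
System Board Temperature Inlet Temp OK 35 Degrees Celsius
"""

for node, sensors in non_ok_temperature_sensors(sample).items():
    print(node, sensors if sensors else "all temperature sensors OK")
```

Hosts with an empty list reported every temperature sensor as "OK". The number of hosts with non-empty lists can then be compared against the guidance in this step.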
-
Check the status of the ECS fans.
admin@ecs:~> cs_hal sensors fan

Output from host : xxx.xxx.xxx.xxx
Entity        Type  Label           Status  Info
-----         ----- -----           -----   -----
System Board  Fan   Fan1            OK      12600 RPM
System Board  Fan   Fan2            OK      12600 RPM
System Board  Fan   Fan3            OK      16920 RPM
System Board  Fan   Fan4            OK      16800 RPM
System Board  Fan   Fan5            OK      17040 RPM
System Board  Fan   Fan6            OK      16920 RPM
System Board  Fan   Fan Redundancy  OK      fully redundant;
NOTE: on Axum and EX-series, use "sudo -i racadm getsensorinfo" to obtain sensor information.
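The fan listing can be summarized in Python. This sketch assumes the output was saved as text with one row per line. The names `parse_fans` and `fans_needing_attention` are hypothetical, and the row pattern is inferred only from the sample in this article. It extracts the RPM value where one is present and lists any fan whose status is not "OK".

```python
import re

# Row layout inferred from the sample: Entity, Type ("Fan"), Label, Status, Info.
FAN_ROW = re.compile(
    r"^(?P<entity>.+?)\s+Fan\s+(?P<label>.+?)\s+(?P<status>[A-Z]+)\s+(?P<info>.+)$"
)
RPM = re.compile(r"(\d+)\s*RPM")


def parse_fans(output):
    """Return one dict per fan row with its label, status, and RPM (or None)."""
    fans = []
    for raw in output.splitlines():
        row = FAN_ROW.match(raw.strip())
        if not row:
            continue  # header, separator, NOTE, and prompt lines are skipped
        rpm = RPM.search(row.group("info"))
        fans.append({
            "label": row.group("label"),
            "status": row.group("status"),
            "rpm": int(rpm.group(1)) if rpm else None,
        })
    return fans


def fans_needing_attention(fans):
    """Return the labels of rows whose status is anything other than 'OK'."""
    return [fan["label"] for fan in fans if fan["status"] != "OK"]


# Rows taken from the sample output above.
sample = """System Board Fan Fan1 OK 12600 RPM
System Board Fan Fan3 OK 16920 RPM
System Board Fan Fan Redundancy OK fully redundant;
"""

fans = parse_fans(sample)
print(fans)
print(fans_needing_attention(fans))
```

The "Fan Redundancy" row carries no RPM figure, so its `rpm` value is `None` and only its status is checked.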
-
Reply to the form in the email stating that you need assistance, and include the temperature sensor output and the fan output.
Affected Products
ECS
Article Properties
Article Number: 000227188
Article Type: How To
Last Modified: 30 Jul 2024
Version: 2