ECS: xDoctor: RAP072: xDoctor reports hardware event - The system inlet temperature is greater than the upper critical threshold

Summary: The system inlet temperature is greater than the upper critical threshold.


Symptoms

The ECS node has logged a hardware event on one of its components.
---------------------------------------------------
ERROR - xDoctor detected a hardware event.
--------------------------------------------------
Node = Nodes
Extra = {"Nodes": {"169.254.1.8": {"model": "ECSv3_R740XD2", "errors": {"TMP0121": {"category": "System", "severity": "Critical", "timestamp": "2023-10-21 14:05:22", "seq_number": "330430", "dell_model": "", "message": "The system inlet temperature is greater than the upper critical threshold."}}, "service_tag": "BBXXXX"}}} 
RAP = RAP072
Solution = KB 521400
Timestamp = 2023-11-16_125259
PSNT = CKM00xxxxxxx @ 4.8-94.1
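
The `Extra` field in the report above is a JSON object, so the affected node and event code can be extracted programmatically. The following is a minimal sketch in Python, assuming the field has been copied into a string. The function name `list_hardware_errors` is hypothetical and is not part of xDoctor.

```python
import json

# Extra field copied from the xDoctor report shown above.
EXTRA = (
    '{"Nodes": {"169.254.1.8": {"model": "ECSv3_R740XD2", "errors": '
    '{"TMP0121": {"category": "System", "severity": "Critical", '
    '"timestamp": "2023-10-21 14:05:22", "seq_number": "330430", '
    '"dell_model": "", "message": "The system inlet temperature is '
    'greater than the upper critical threshold."}}, '
    '"service_tag": "BBXXXX"}}}'
)


def list_hardware_errors(extra_json):
    """Return (node_ip, error_code, severity, timestamp) tuples from an Extra field."""
    data = json.loads(extra_json)
    rows = []
    for node_ip, node in data.get("Nodes", {}).items():
        for code, err in node.get("errors", {}).items():
            rows.append((node_ip, code, err.get("severity"), err.get("timestamp")))
    return rows


# Print one line per reported error.
for row in list_hardware_errors(EXTRA):
    print(*row, sep=" | ")
```

This only reads the fields that appear in the sample report. Other xDoctor reports may carry additional keys, which the sketch ignores.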

Cause

The inlet temperature sensor measures the ambient temperature of the air entering the node. This event is logged when that reading exceeds the upper critical threshold.

Resolution

Check the SEL logs on all nodes of the ECS to see whether other nodes logged the same alert.
admin@ecsnode1: svc_exec "sudo ipmitool sel elist | grep Temp"
svc_exec v1.0.6 (svc_tools v2.14.0)                 Started 2023-11-22 07:10:02

Output from node: r1n1                                retval: 1

Output from node: r1n2                                retval: 1

Output from node: r1n3                                retval: 1

Output from node: r1n4                                retval: 1

Output from node: r1n5                                retval: 1

Output from node: r1n6                                retval: 0
   c | 10/21/2023 | 13:48:59 | Temperature Inlet Temp | Upper Non-critical going high | Asserted | Reading 37 > Threshold 37 degrees C
   d | 10/21/2023 | 14:20:50 | Temperature Inlet Temp | Upper Non-critical going high | Deasserted | Reading 34 > Threshold 37 degrees C

Output from node: r1n7                                retval: 0
   c | 10/21/2023 | 13:50:13 | Temperature Inlet Temp | Upper Non-critical going high | Asserted | Reading 37 > Threshold 37 degrees C
   d | 10/21/2023 | 14:20:41 | Temperature Inlet Temp | Upper Non-critical going high | Deasserted | Reading 34 > Threshold 37 degrees C

Output from node: r1n8                                retval: 0
   e | 10/21/2023 | 13:52:17 | Temperature Inlet Temp | Upper Non-critical going high | Asserted | Reading 37 > Threshold 37 degrees C
   f | 10/21/2023 | 14:05:21 | Temperature Inlet Temp | Upper Critical going high | Asserted | Reading 39 > Threshold 39 degrees C
  10 | 10/21/2023 | 14:20:52 | Temperature Inlet Temp | Upper Critical going high | Deasserted | Reading 36 > Threshold 39 degrees C
  11 | 10/21/2023 | 14:21:21 | Temperature Inlet Temp | Upper Non-critical going high | Deasserted | Reading 34 > Threshold 37 degrees C

Output from node: r1n9                                retval: 0
   c | 10/21/2023 | 13:56:51 | Temperature Inlet Temp | Upper Non-critical going high | Asserted | Reading 37 > Threshold 37 degrees C
   d | 10/21/2023 | 14:14:28 | Temperature Inlet Temp | Upper Critical going high | Asserted | Reading 39 > Threshold 39 degrees C
   e | 10/21/2023 | 14:20:44 | Temperature Inlet Temp | Upper Critical going high | Deasserted | Reading 36 > Threshold 39 degrees C
   f | 10/21/2023 | 14:21:04 | Temperature Inlet Temp | Upper Non-critical going high | Deasserted | Reading 34 > Threshold 37 degrees C

Output from node: r1n10                               retval: 0
   e | 10/21/2023 | 13:59:00 | Temperature Inlet Temp | Upper Non-critical going high | Asserted | Reading 37 > Threshold 37 degrees C
   f | 10/21/2023 | 14:21:07 | Temperature Inlet Temp | Upper Non-critical going high | Deasserted | Reading 34 > Threshold 37 degrees C

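In the example above, the alert appears on multiple nodes in the rack (r1n6 through r1n10) within the same time window, and nodes r1n8 and r1n9 also crossed the upper critical threshold. When several nodes report a high inlet temperature at the same time, the likely cause is an environmental problem in the data center, such as a power or air conditioning (AC) failure, rather than a fault on an individual node.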
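
On larger racks, the comparison across nodes can be summarized from the saved `svc_exec` output. The sketch below is written in Python and assumes the output has been captured as text in the format shown above. The name `summarize_sel` and the shortened `SAMPLE` string are illustrative and are not part of any ECS tool.

```python
import re

# Shortened copy of the svc_exec output shown above (three nodes only).
SAMPLE = """\
Output from node: r1n5                                retval: 1

Output from node: r1n6                                retval: 0
   c | 10/21/2023 | 13:48:59 | Temperature Inlet Temp | Upper Non-critical going high | Asserted | Reading 37 > Threshold 37 degrees C
   d | 10/21/2023 | 14:20:50 | Temperature Inlet Temp | Upper Non-critical going high | Deasserted | Reading 34 > Threshold 37 degrees C

Output from node: r1n8                                retval: 0
   e | 10/21/2023 | 13:52:17 | Temperature Inlet Temp | Upper Non-critical going high | Asserted | Reading 37 > Threshold 37 degrees C
   f | 10/21/2023 | 14:05:21 | Temperature Inlet Temp | Upper Critical going high | Asserted | Reading 39 > Threshold 39 degrees C
  10 | 10/21/2023 | 14:20:52 | Temperature Inlet Temp | Upper Critical going high | Deasserted | Reading 36 > Threshold 39 degrees C
  11 | 10/21/2023 | 14:21:21 | Temperature Inlet Temp | Upper Non-critical going high | Deasserted | Reading 34 > Threshold 37 degrees C
"""

NODE_RE = re.compile(r"^Output from node:\s+(\S+)")


def summarize_sel(text):
    """Map each node name to the set of inlet temperature events it asserted."""
    asserted = {}
    node = None
    for line in text.splitlines():
        match = NODE_RE.match(line)
        if match:
            node = match.group(1)
            asserted.setdefault(node, set())  # keep nodes that have no events
            continue
        # SEL lines have 7 pipe-separated fields: id, date, time, sensor, event, state, reading.
        fields = [f.strip() for f in line.split("|")]
        if (node and len(fields) == 7
                and fields[3] == "Temperature Inlet Temp"
                and fields[5] == "Asserted"):
            asserted[node].add(fields[4])
    return asserted


summary = summarize_sel(SAMPLE)
affected = sorted(name for name, events in summary.items() if events)
print("Nodes with inlet temperature events:", affected)
if len(affected) > 1:
    print("Multiple nodes affected: check data center cooling and power first.")
```

The sketch counts only `Asserted` entries, so a node whose log contains no matching lines (as with the nodes that returned `retval: 1` above) is reported as unaffected.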

Check the data center for any such power or AC issues.

If no issues are found in the data center, open a service request with Dell ECS support for further investigation.

Affected Products

ECS

Article Properties

Article Number: 000220323
Article Type: Solution
Last Modified: 21 Mar 2024
Version: 3