Connectrix B-Series: Switch went down due to incorrect sensor temperature
Summary: The switch shut down because of an incorrect sensor temperature reading. The RASLOG reports: High temperature (12821 C) exceeds system temperature limit. System will shut down within 2 minutes.
Symptoms
The switch shuts down and the following messages appear in the RASLOG. This behavior was observed on FOS v9.1.1c.
[EM-1014], 16758, CHASSIS, ERROR, Unable to read sensor on Switch (-23).
[EM-1014], 16759, CHASSIS, ERROR, Unable to read sensor on Fan 1 (-10).
[EM-1014], 16761, CHASSIS, ERROR, Unable to read sensor on PS 1 (-1).
[HIL-1506], 16767, FFDC | CHASSIS, CRITICAL, High temperature (12821 C) exceeds system temperature limit. System will shut down within 2 minutes.
[MAPS-1020], 16770, FID 128, WARNING, Switch wide status has changed from HEALTHY to CRITICAL.
[HIL-1509], 16771, FFDC | CHASSIS, CRITICAL, High temperature (12821 C). Warning time expired. System preparing for shutdown.
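When reviewing a saved RASLOG, it can help to flag HIL-1506 and HIL-1509 entries whose reported temperature is physically implausible, since that points to a bad sensor read rather than real overheating. The following is a minimal sketch, not a Brocade or Dell tool. The function name find_implausible_temps and the plausible range of 0 to 125 C are assumptions made for illustration; adjust the range to your platform.

```python
import re

# Captures the message ID and the temperature reported in HIL-1506 / HIL-1509
# entries, for example "(12821 C)" or "(-1 C)".
TEMP_RE = re.compile(r"\[(HIL-150[69])\].*?High temperature \((-?\d+) C\)")

# Assumed plausible range in degrees C. These bounds are illustrative only and
# are not taken from vendor documentation.
PLAUSIBLE_MIN_C = 0
PLAUSIBLE_MAX_C = 125


def find_implausible_temps(lines):
    """Return (message_id, temperature) for entries outside the assumed range."""
    hits = []
    for line in lines:
        match = TEMP_RE.search(line)
        if match is None:
            continue
        msg_id, temp = match.group(1), int(match.group(2))
        if not (PLAUSIBLE_MIN_C <= temp <= PLAUSIBLE_MAX_C):
            hits.append((msg_id, temp))
    return hits


sample = [
    "[EM-1014], 16758, CHASSIS, ERROR, Unable to read sensor on Switch (-23).",
    "[HIL-1506], 16767, FFDC | CHASSIS, CRITICAL, High temperature (12821 C) "
    "exceeds system temperature limit. System will shut down within 2 minutes.",
    "[HIL-1509], 16771, FFDC | CHASSIS, CRITICAL, High temperature (12821 C). "
    "Warning time expired. System preparing for shutdown.",
]

print(find_implausible_temps(sample))
# [('HIL-1506', 12821), ('HIL-1509', 12821)]
```

A match here, especially when preceded by EM-1014 "Unable to read sensor" errors, is consistent with the symptom described in this article.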
Cause
This is caused by defect FOS-855493, in which the switch shuts down after reading an incorrect sensor temperature.
Resolution
The defect is fixed in FOS v9.1.1d, v9.2.0b, v9.2.1a, and v9.2.2. The defect information below is taken from the v9.2.0b release notes:
--------------
Defect ID: FOS-855493
Technical Severity: Medium
Probability: Low
Product: Fabric OS
Technology Group: Monitoring/RAS
Reported In Release: FOS9.0.1
Technology: Equipment Status
Symptom: Switch shutdown after abnormal sensor temperatures such as (-1 C) or (191 C) are reported: [HIL-1506], 3498/333, FFDC | , CRITICAL, sw0, High temperature (-1 C) exceeds system temperature limit. System will shut down within 2 minutes., OID:0x43000000, SPOID:0x4300000
Condition: A broadcast storm generated an extremely high CPU load condition. This resulted in the i2c sensor read to intermittently fail and used an invalid temperature value to shutdown system.
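
To check a switch's FOS version against the fixed releases listed in the Resolution, a small helper along these lines may be useful. It is a sketch under stated assumptions: the function name has_fix is hypothetical, only the four release branches named in this article are evaluated, and patch letters are assumed to sort alphabetically within a branch. For any other branch or version format it returns None rather than guessing.

```python
import re

# Minimum patch letter that contains the fix, per release branch, taken from
# the Resolution text (v9.1.1d, v9.2.0b, v9.2.1a, v9.2.2). An empty string
# means the base release of that branch already contains the fix.
FIXED_MIN_PATCH = {
    (9, 1, 1): "d",
    (9, 2, 0): "b",
    (9, 2, 1): "a",
    (9, 2, 2): "",
}

VERSION_RE = re.compile(r"v?(\d+)\.(\d+)\.(\d+)([a-z]?)")


def has_fix(version):
    """Return True/False for branches listed in this article, else None.

    Assumption: patch letters within a branch are ordered alphabetically and
    a release with no letter precedes the 'a' patch.
    """
    match = VERSION_RE.fullmatch(version.strip().lower())
    if match is None:
        return None  # unrecognized version format
    branch = tuple(int(part) for part in match.group(1, 2, 3))
    patch = match.group(4)
    if branch not in FIXED_MIN_PATCH:
        return None  # this article does not state the fix status
    return patch >= FIXED_MIN_PATCH[branch]


print(has_fix("v9.1.1c"))  # False
print(has_fix("v9.1.1d"))  # True
print(has_fix("v9.0.1"))   # None
```

A None result means the answer cannot be determined from this article alone; consult the release notes for that version.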