Article Number: 521925


Data Domain: DD9800 - voltage is faulty - alert

Summary: Data Domain DD9800 Revision 2 - Troubleshooting Alert: The storage processor (SP) has failed - Voltage is faulty

Primary Product: Data Domain

Product: Data Domain

Last Published: 23 Mar 2020

Article Type: Break Fix

Published Status: Online

Version: 5


Article Content

Issue


Due to an overlapping bitmap entry, a DD9800 Revision 2 system may indict a storage processor (SP) with the cause "Voltage is faulty" when the real issue is a CPU External IERR.
When both events (CPU External IERR and "Voltage is faulty") occur together, it is safe to ignore the voltage alert and focus troubleshooting on why the CPU encountered the IERR. For more details on troubleshooting CPU IERR errors on DD9800 platforms, contact your Support provider.

Affected Systems:
- DD9800 Rev 2
- DDOS 5.7.x, 6.0.x, and 6.1.x releases earlier than 6.1.3.0

Symptoms:

Posted Alert 
=========
Time:           Tue Mar 27 07:19:53 2018
Alert Id:       p0-57
Event Id:       EVT-ENVIRONMENT-00032
Event Message:  The storage processor has failed
Object:         Enclosure=1
Additional Information: Cause=Voltage is faulty

"Voltage is faulty" event found in messages.engineering log
===============================================
Mar 15 05:48:27  platmon: CRITICAL: The storage processor has failed. Enclosure=1 Cause="Voltage is faulty"
Mar 15 05:48:28 platmon: INFO: Event posted: p0-275 (11000113:28521xxxx): EVT-ENVIRONMENT-00032: The storage processor has failed EVT-OBJ::Enclosure=1 EVT-INFO::Cause=Voltage is faulty
Mar 15 05:48:28  platmon: INFO: _ems_post_pubsub_event: Publishing event  for alert EVT-ENVIRONMENT-00032

CPU IERR event found in bios.txt log
===============================
   1 | 03/15/2018 | 05:33:39 | SMI Critical Interrupt Events Enter_SMI | SMI Critical Interrupt | Asserted | Used AUX Log (LSB 0x0) Used AUX Log (MSB 0x0)
   2 | 03/15/2018 | 05:33:41 | CPU Status Events CPU2_Status | CPU IERR | Asserted |  CPU External IERR
   3 | 03/15/2018 | 05:33:41 | Entering IERR Interrupt Events Enter_SMI | IERR Interrupt | Asserted | Used AUX Log (LSB 0x24) Used AUX Log (MSB 0x0)
   4 | 03/15/2018 | 05:33:42 | BMC Chassis Ctrl Events BMC_Chassis_Ctrl | Reset through BMC | Asserted
   5 | 03/15/2018 | 05:34:04 | Power Unit DC_State | State Asserted | Deasserted
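
The check described in the Issue section (voltage alert plus CPU External IERR) can be sketched as a simple substring match over lines already collected from messages.engineering and bios.txt. This is a minimal illustration only: the function names `any_line_contains` and `voltage_alert_is_suspect` are hypothetical, not part of any DDOS tool, and the sketch does not compare timestamps. In the sample logs above, the IERR is logged at 05:33:41 and the voltage alert at 05:48:27.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if any of the n lines contains the substring needle. */
bool any_line_contains(const char *const lines[], size_t n, const char *needle)
{
    assert(needle != NULL);
    assert(lines != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (lines[i] != NULL && strstr(lines[i], needle) != NULL)
            return true;
    }
    return false;
}

/* Hypothetical triage rule (an assumption for illustration): the voltage
 * alert is suspect when messages.engineering reports "Voltage is faulty"
 * and bios.txt also records a "CPU External IERR". */
bool voltage_alert_is_suspect(const char *const msg_lines[], size_t n_msg,
                              const char *const bios_lines[], size_t n_bios)
{
    return any_line_contains(msg_lines, n_msg, "Voltage is faulty")
        && any_line_contains(bios_lines, n_bios, "CPU External IERR");
}
```

A true result only suggests that the voltage cause may be mislabeled; the logs should still be reviewed by hand before the alert is dismissed.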


 
Cause

This issue is found in DDOS versions 5.7, 6.0, and 6.1.

The root cause is that two entries in the SP fault bitmap overlap: both are defined as bit 17, so when an IERR event occurs, the alert incorrectly displays "Voltage is faulty".

#define APL_FRU_FAULT_SP_CPUMISC        (1 << 17)
#define APL_FRU_FAILEDMAP_VOLTFAULT_SP  (1 << 17)
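
To illustrate why the overlap produces the wrong cause text, the following sketch reuses the two definitions quoted above and decodes a fault bitmap with a hypothetical function. The function `describe_sp_fault`, the order of its checks, and the strings "CPU misc fault" and "No fault" are assumptions made for illustration. This is not the DDOS source code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Definitions as quoted in this article: both resolve to bit 17. */
#define APL_FRU_FAULT_SP_CPUMISC        (1 << 17)
#define APL_FRU_FAILEDMAP_VOLTFAULT_SP  (1 << 17)

/* Returns true if two bit masks share at least one bit. */
bool masks_overlap(uint32_t a, uint32_t b)
{
    return (a & b) != 0;
}

/* Hypothetical decoder (assumption): tests the voltage flag first.
 * Because both flags are the same bit, a CPU fault also matches the
 * voltage test, and the second branch can never be reached. */
const char *describe_sp_fault(uint32_t fault_bitmap)
{
    if (fault_bitmap & APL_FRU_FAILEDMAP_VOLTFAULT_SP)
        return "Voltage is faulty";
    if (fault_bitmap & APL_FRU_FAULT_SP_CPUMISC)
        return "CPU misc fault";
    return "No fault";
}
```

With these definitions, a bitmap that has only the CPU flag set decodes as "Voltage is faulty". A compile-time check, for example a C11 `_Static_assert` that the two masks do not share bits, is one way such a collision could be caught during a build.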

 

Resolution
The overlapping bitmap entry will be fixed in DDOS 6.1.3.x.
 
Notes


 


Article Properties

First Published

Wed Jun 06 2018 21:40:02 GMT
