PowerFlex 4.X: SMART_AGGREGATED_STATE_FAILED - DAX Device - SIO03.02.0000013

Summary: PowerFlex reports that one of the dax devices is about to fail and should be replaced. NVDIMM correctable memory errors can cause a dax device to show as "Failed Now" in PowerFlex when the device is not truly failed. ...


Symptoms

This error is usually triggered for a standard PowerFlex device, in which case the recommended action is to replace the disk. This article describes additional troubleshooting steps for the case where the device in question is a DAX device, which is an NVDIMM rather than a disk.


PowerFlex reports the following alert, indicating that one of the dax devices is about to fail or is operating at reduced performance. The same alert is also sent as a DialHome alert:

 

SIO03.02.0000013
The disk may be about to fail, or may be operating with reduced performance.    
SMART_AGGREGATED_STATE_FAILED_NOW    
SDS.Device.SMART_Aggregated_State_Failed_Now        
Recommended Action: Consider replacing the disk.

 


Cause

PowerFlex detects that a correctable or uncorrectable error occurred on one of the NVDIMMs for a storage node. Once an error is detected, it generates the Smart Aggregated State Failed Now error.

Resolution

Sanitizing the NVDIMM can resolve the problem without replacing it. Replace the NVDIMM only if the problem persists after sanitization.


  • For 15G nodes, you can only sanitize all NVDIMMs at once. Remove all devices (SDS devices and DAX devices) before sanitizing to avoid data corruption.
  • For 14G nodes, you can sanitize either only the NVDIMM in question or all NVDIMMs. Follow the steps below to identify the failing NVDIMM.
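
The two rules above can be written down as a small lookup, which may help when scripting a pre-check. This is only a sketch: the names `SANITIZE_SCOPES`, `is_scope_allowed`, and `precautions` are hypothetical and not part of any PowerFlex or iDRAC tooling, and the table holds nothing beyond what the bullets state.

```python
# Hypothetical helper: encodes only the two rules stated in this article.
SANITIZE_SCOPES = {
    "14G": ("single", "all"),  # the one NVDIMM in question, or all NVDIMMs
    "15G": ("all",),           # all NVDIMMs only
}

def is_scope_allowed(generation, scope):
    """Return True if the sanitize scope is permitted for the node generation."""
    return scope in SANITIZE_SCOPES.get(generation.upper(), ())

def precautions(generation):
    """Return the precaution this article states for the node generation."""
    notes = {
        "14G": "Identify the failing NVDIMM before sanitizing.",
        "15G": "Remove all SDS and DAX devices before sanitizing.",
    }
    return notes.get(generation.upper(), "Generation not covered by this article.")

if __name__ == "__main__":
    for gen in ("14G", "15G"):
        print(gen, SANITIZE_SCOPES[gen], "-", precautions(gen))
```

An unknown generation returns no allowed scope, so a calling script stops rather than guessing.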

 

Validating the Failure

Step 1: In PowerFlex Manager, go to Block > Devices

 

Step 2: Select the columns box to the upper right and add "S.M.A.R.T State" to the list.

 


 

Step 3: Review the list of dax devices and confirm whether any of them show "Failed Now".

 


 

 

Step 4: For each dax device in a "Failed Now" state, determine which SDS the dax device belongs to and log in to the iDRAC of that node.

 

Step 5: Check the iDRAC Lifecycle Log for correctable errors reported against any of the NVDIMMs.

  • Correctable memory errors on NVDIMMs can cause a dax device to show "Failed Now" in PowerFlex, despite not being failed.

 


 

 

Step 6: SSH into the Primary MDM of the corresponding cluster and run the following commands to validate the dax health, slot information, and serial number (SN) of the NVDIMM that shows as failed:

  • Log in to the Primary MDM as admin. Enclose the password in single quotes ' ' if it contains special characters.
    scli --login --username admin --password '<password>' --management_system_ip <pfmp_ip>

  

 

  • Run the following command to list the details of all SDSs:
scli --query_all_sds

  

Example:

SDS ID: 2e400d3800000004 Name: LAB10_SDS3 State: Connected, Joined IP: XXX
SDS ID: 2e400d3700000003 Name: LAB10_SDS2 State: Connected, Joined IP: XXX
SDS ID: 2e400d3600000002 Name: LAB10_SDS4 State: Connected, Joined IP: XXX
SDS ID: 2e400d3500000001 Name: LAB10_SDS1 State: Connected, Joined IP: XXX
SDS ID: 2e400d3400000000 Name: LAB10_SDS5 State: Connected, Joined IP: XXX
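
When a cluster has many SDSs, the listing can be parsed rather than read by eye. The following is a minimal sketch that assumes every line follows the layout shown in the example above; the function names `parse_sds_list` and `find_sds` are hypothetical, and the regular expression is untested against other PowerFlex versions.

```python
import re

# Lines copied from the example output above (IP values are masked as XXX).
SAMPLE = """\
SDS ID: 2e400d3800000004 Name: LAB10_SDS3 State: Connected, Joined IP: XXX
SDS ID: 2e400d3700000003 Name: LAB10_SDS2 State: Connected, Joined IP: XXX
SDS ID: 2e400d3600000002 Name: LAB10_SDS4 State: Connected, Joined IP: XXX
"""

# Assumption: each line is "SDS ID: <id> Name: <name> State: <state> IP: <ip>".
SDS_LINE = re.compile(
    r"SDS ID:\s*(?P<id>\S+)\s+Name:\s*(?P<name>\S+)\s+"
    r"State:\s*(?P<state>.*?)\s+IP:\s*(?P<ip>.*)$"
)

def parse_sds_list(text):
    """Return one dict (id, name, state, ip) per SDS line found in the text."""
    rows = []
    for line in text.splitlines():
        match = SDS_LINE.search(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows

def find_sds(rows, name):
    """Return the row whose name matches exactly, or None if there is none."""
    return next((row for row in rows if row["name"] == name), None)

if __name__ == "__main__":
    sds = find_sds(parse_sds_list(SAMPLE), "LAB10_SDS2")
    print(sds["id"], sds["state"])
```

The SDS name returned here is the value to pass to `--sds_name` in the next command.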

 


  
  • Locate the impacted SDS, note its name, and then run the following command:
scli --query_sds --sds_name <sds name>
  • Note the device path of every device on this SDS. These paths are needed to add the devices back in Step 8.
  • Note the ID of the impacted device, and then run the following command:
scli --query_sds_device_info --device_id <device_id>

Example output:

# scli --query_sds_device_info --device_id XXXXX
        Device ID: _________ Name: /dev/dax1.0 Path: /dev/dax1.0
                ScaleIO Device Configuration:
                        Original Path: /dev/dax1.0
                        Acceleration Pool: _________
                        Used for RFcache: no
                        Capacity Limit: 31.4 GB (32107 MB)
                        Device State: Normal
                Physical Device Information:
                        Device Type: UNKNOWN
                        Media Type: NVDIMM
                        Auto Detected Media Type: UNKNOWN
                        Vendor Name: N/A
                        Model Name: nmem2
                        Serial Number: _________
                        Slot Number: B5
                        Firmware Version: N/A
                        Cache Look-ahead: not Active
                        Write Cache: not Active
                        ATA Security: not Active
                        Logical Sector Size: 0 B
                        Physical Sector Size: 0 B
                        Capacity: 0 GB
                        LED Setting: OFF
                Background Device Scanner Information:
                        Scanned: 0 MB
                        Error Fixes: 0
                        Compare Errors: 0
                SMART Information:
                        Aggregated State: FAILED_NOW
                        Temperature State: NEVER_FAILED
                                Current Value: 34 Worst Value: 34 Threshold: 0
                        Media Wearout Indicator State: NEVER_FAILED
                                Current Value: 95 Worst Value: 95 Threshold: 5
                RAID Controller Information:
                        Serial Number: N/A
                        RAID vDisk Status: N/A
                        RAID vDisk Type: N/A
                        RAID vDisk Cache: N/A
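
The fields that matter for this procedure are the device path, media type, slot number, serial number, and SMART aggregated state. The sketch below pulls them out of a trimmed copy of the example output. It assumes section headers end with a colon and have no value, and that fields are "Key: value" lines. `parse_device_info` and `summarize` are hypothetical names, not part of scli.

```python
# Trimmed excerpt of the example output above; masked values are kept as-is.
SAMPLE = """\
        Device ID: _________ Name: /dev/dax1.0 Path: /dev/dax1.0
                ScaleIO Device Configuration:
                        Original Path: /dev/dax1.0
                Physical Device Information:
                        Media Type: NVDIMM
                        Auto Detected Media Type: UNKNOWN
                        Serial Number: _________
                        Slot Number: B5
                SMART Information:
                        Aggregated State: FAILED_NOW
                RAID Controller Information:
                        Serial Number: N/A
"""

def parse_device_info(text):
    """Group 'Key: value' lines under the section header that precedes them."""
    sections = {}
    current = "Header"  # bucket for lines that appear before any section header
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):  # a section header such as "SMART Information:"
            current = line[:-1]
            sections.setdefault(current, {})
            continue
        key, sep, value = line.partition(": ")
        if sep:  # lines with several fields keep the remainder as raw text
            sections.setdefault(current, {})[key] = value.strip()
    return sections

def summarize(sections):
    """Pick out the fields used to match a dax device to a physical NVDIMM."""
    cfg = sections.get("ScaleIO Device Configuration", {})
    phys = sections.get("Physical Device Information", {})
    smart = sections.get("SMART Information", {})
    return {
        "path": cfg.get("Original Path"),
        "media_type": phys.get("Media Type"),
        "slot": phys.get("Slot Number"),
        "serial": phys.get("Serial Number"),
        "smart_state": smart.get("Aggregated State"),
    }

if __name__ == "__main__":
    print(summarize(parse_device_info(SAMPLE)))
```

Keeping fields grouped by section matters because "Serial Number" appears twice, once for the physical device and once for the RAID controller.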

 

 

Step 7: After identifying which dax device corresponds to the noted NVDIMM, follow the standard procedure to sanitize the NVDIMMs through the iDRAC if the errors are correctable. If the errors are uncorrectable, follow the standard NVDIMM troubleshooting guidelines.
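
The decision in this step, combined with the guidance in the Resolution section to replace the NVDIMM only if the problem persists after sanitization, can be summarized as a small function. This is a sketch for illustration only. `ErrorKind` and `next_action` are hypothetical names, and the returned strings paraphrase this article rather than any tool output.

```python
from enum import Enum

class ErrorKind(Enum):
    CORRECTABLE = "correctable"
    UNCORRECTABLE = "uncorrectable"

def next_action(kind, persists_after_sanitize=False):
    """Map the error type seen in the iDRAC logs to the next step in this article."""
    if kind is ErrorKind.UNCORRECTABLE:
        return "Follow the standard NVDIMM troubleshooting guidelines."
    if persists_after_sanitize:
        # Replacement is the fallback only after sanitization did not help.
        return "Replace the NVDIMM."
    return "Sanitize the NVDIMM through the iDRAC."

if __name__ == "__main__":
    print(next_action(ErrorKind.CORRECTABLE))
    print(next_action(ErrorKind.CORRECTABLE, persists_after_sanitize=True))
    print(next_action(ErrorKind.UNCORRECTABLE))
```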

 

Step 8: After the NVDIMMs have been sanitized, add the dax devices back to the SDS, and then add the PowerFlex devices back using the device paths noted in Step 6. The dax devices then show as "Never Failed", and the corresponding alert is cleared from the Alerts tab automatically.

For details, see steps 9 and 10 in the article "PowerFlex: How to Replace NVDIMM in 15G and later PowerEdge Node".

Additional Information

Affected Products

PowerFlex rack, ScaleIO
Article Properties
Article Number: 000438714
Article Type: Solution
Last Modified: 30 June 2026
Version:  3