PowerFlex 5.x: Disk Group Capacity fault tolerance state is dual degraded
Summary: An alert is raised when a Disk Group (DG) enters a dual degraded capacity fault tolerance state, indicating that the system is at risk of data unavailability if another failure occurs.
Symptoms
Alert Message:
Device Group: dg_title (ID <DG ID>) capacity fault tolerance state is dual degraded.
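The ID in this alert is the value needed for --device_group_id in Step 2. If the alert text is available in a log or ticket, it can be pulled out with a one-line filter. This is a minimal sketch, not a PowerFlex feature: the helper name extract_dg_id is hypothetical, and it assumes the alert text matches the message format shown above exactly.

```shell
#!/bin/sh
# Hypothetical helper: reads alert text on stdin and prints whatever sits
# between "(ID " and ")" on dual degraded capacity alerts.
# Lines that do not match the alert format produce no output.
extract_dg_id() {
    sed -n 's/.*(ID \([^)]*\)) capacity fault tolerance state is dual degraded.*/\1/p'
}
```

Usage, with a made-up ID for illustration: echo 'Device Group: dg1 (ID 0a1b2c3d) capacity fault tolerance state is dual degraded.' | extract_dg_id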
Impact
The system has experienced two simultaneous failures within the same DG. If another failure occurs before the rebuild completes, data availability will be at risk.
Cause
A DG enters a dual degraded state when:
- Two devices or nodes within the group are unavailable or failed.
- The system is unable to maintain full redundancy.
Resolution
Step 1: Validate Rebuild Progress
From the Primary MDM, run the following command to monitor the current rebuild status:
scli --query_all | grep -i rebuild
If a rebuild is in progress, wait for it to complete before taking further action.
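To follow the rebuild over time instead of rerunning the query by hand, the same check can be wrapped in a small polling loop. This is a minimal sketch, not a Dell-provided tool: the helper names rebuild_lines and watch_rebuild, the default of 10 samples, and the 60-second interval are all assumptions. The script assumes nothing about the layout of the scli --query_all output. It only timestamps the lines that mention "rebuild" so that successive samples can be compared. It also assumes it runs on the Primary MDM with an scli session already logged in.

```shell
#!/bin/sh
# Hypothetical helpers; only `scli --query_all` comes from this article.

# Keep only the stdin lines that mention "rebuild" (case-insensitive)
# and prefix each one with the time the sample was taken.
rebuild_lines() {
    ts=$(date '+%Y-%m-%d %H:%M:%S')
    grep -i rebuild | while IFS= read -r line; do
        printf '%s  %s\n' "$ts" "$line"
    done
}

# Sample the rebuild status COUNT times, INTERVAL seconds apart.
# Defaults (10 samples, 60 s) are arbitrary illustration values.
watch_rebuild() {
    count=${1:-10}
    interval=${2:-60}
    i=0
    while [ "$i" -lt "$count" ]; do
        scli --query_all | rebuild_lines
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

Usage: watch_rebuild 10 60 prints ten timestamped samples, one per minute. If the reported values stop changing between samples, the rebuild may have stalled or may not have started, which is the case Step 2 addresses.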
Step 2: Increase Spare Capacity
If the rebuild has not started, the system may lack sufficient spare capacity.
Use the following command to adjust spare capacity:
scli --modify_device_group --device_group_id <Device Group ID> --spare_storage_node_count <Number of storage nodes> --spare_device_count <Number of devices>
- --spare_storage_node_count (0–2): Number of storage nodes to reserve as spare capacity
- --spare_device_count (0–24): Number of devices to reserve as spare capacity
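Because both parameters accept only a limited range, it can help to check the values before running the command. The following is a minimal sketch under stated assumptions: the helper name build_spare_cmd is hypothetical, the ranges are taken from the parameter descriptions above, and the helper only prints the resulting command for review rather than executing it.

```shell
#!/bin/sh
# Hypothetical helper: validates the Step 2 values against the documented
# ranges (nodes 0-2, devices 0-24) and prints the scli command.
# It does NOT run the command; review the output and run it manually.
build_spare_cmd() {
    dg_id=$1
    nodes=$2
    devices=$3

    if [ -z "$dg_id" ]; then
        echo "error: Device Group ID is required" >&2
        return 1
    fi
    # Reject empty or non-numeric counts before comparing them.
    case $nodes in
        ''|*[!0-9]*) echo "error: node count must be a whole number" >&2; return 1 ;;
    esac
    case $devices in
        ''|*[!0-9]*) echo "error: device count must be a whole number" >&2; return 1 ;;
    esac
    if [ "$nodes" -gt 2 ]; then
        echo "error: --spare_storage_node_count must be 0-2" >&2
        return 1
    fi
    if [ "$devices" -gt 24 ]; then
        echo "error: --spare_device_count must be 0-24" >&2
        return 1
    fi

    printf 'scli --modify_device_group --device_group_id %s --spare_storage_node_count %s --spare_device_count %s\n' \
        "$dg_id" "$nodes" "$devices"
}
```

Usage: build_spare_cmd <Device Group ID> 1 2 prints the full command with one spare storage node and two spare devices. A value outside the documented range produces an error message and a non-zero exit status instead.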
Step 3: Recovery Actions
- Add or repair Storage Nodes or Devices to restore redundancy.
- Clear any device errors in PowerFlex.
- Free up capacity in the Storage Pool, if possible.
- If any Storage Nodes are in Maintenance Mode, exit it:
scli --exit_storage_node_maintenance_mode
Impacted Versions
PowerFlex 5.x