PowerFlex: Exceeding Critical Capacity Threshold Impact
Summary: How running your PowerFlex storage capacity above the 90% critical threshold can impact the cluster.
Symptoms
When capacity utilization is above the Critical Capacity Threshold, the Alerts view in the user interface shows the following alert:
Severity: High - Capacity utilization above critical threshold
Cause
Due to thinly provisioned volumes or snapshot usage, the capacity utilization of the storage pool reaches the critical threshold. Once the critical threshold is reached, PowerFlex does not attempt to rebalance until the situation is corrected. If the cluster cannot rebalance, new write operations to the disks can fail. Thin volumes have capacity allocated on demand at the time of the write, so thin writes succeed only while capacity can be allocated on all devices of the storage pool.
When a cluster is this full and cannot rebalance, small inconsistencies in device allocation eventually cause one of the devices to reach 100% capacity, and IO stops. With rebalancing inactive, these inconsistencies are never corrected.
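The failure mode described above can be illustrated with a small toy model. This is only a sketch and not PowerFlex code: the `Device` class, the function names, the device sizes, and the 1 GB allocation unit are all hypothetical, chosen only to show how a pool at 95% utilization can have rebalance paused while one device is already completely full.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical stand-in for one storage device in a pool."""
    capacity_gb: float
    used_gb: float

    @property
    def free_gb(self) -> float:
        return self.capacity_gb - self.used_gb


def pool_utilization_pct(devices):
    """Pool-wide utilization as a percentage of total capacity."""
    total = sum(d.capacity_gb for d in devices)
    used = sum(d.used_gb for d in devices)
    return 100.0 * used / total


def rebalance_allowed(devices, critical_threshold_pct=90.0):
    """Rebalance is paused once utilization reaches the critical threshold."""
    return pool_utilization_pct(devices) < critical_threshold_pct


def thin_write_can_allocate(devices, chunk_gb):
    """A thin write needs capacity to be allocatable on all devices of the pool."""
    return all(d.free_gb >= chunk_gb for d in devices)


# Illustrative values only: the pool is 95% full overall, but the small
# allocation differences mean the first device has no free capacity left.
devices = [Device(1000, 1000), Device(1000, 930), Device(1000, 920)]

utilization = pool_utilization_pct(devices)
can_rebalance = rebalance_allowed(devices)
can_write = thin_write_can_allocate(devices, chunk_gb=1)
print(f"utilization={utilization:.1f}% rebalance={can_rebalance} thin_write={can_write}")
```

In this model the pool still has 150 GB free in total, yet the thin write cannot be allocated because one device is full and rebalance is not running to even out the difference.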
Resolution
Temporarily raising the critical threshold percentage allows the rebalance process to resume and even out the capacity allocation, which releases capacity on the full devices and allows IO service to be restored.
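The effect of raising the threshold can be sketched with a simplified simulation. It is not PowerFlex code and does not call any PowerFlex interface: the function names, the device sizes, and the 97% value are hypothetical illustrations, and the rebalance step assumes an idealized even spread of data in proportion to device size.

```python
def utilization_pct(used, capacity):
    """Pool-wide utilization as a percentage of total capacity."""
    return 100.0 * sum(used) / sum(capacity)


def rebalance(used, capacity):
    """Idealized rebalance: spread used capacity in proportion to device size."""
    total_used = sum(used)
    total_capacity = sum(capacity)
    return [c * total_used / total_capacity for c in capacity]


def rebalance_if_allowed(used, capacity, critical_threshold_pct):
    """Rebalance only while utilization is below the critical threshold."""
    if utilization_pct(used, capacity) >= critical_threshold_pct:
        return list(used)  # rebalance is paused, allocation stays skewed
    return rebalance(used, capacity)


capacity = [1000.0, 1000.0, 1000.0]
used = [1000.0, 930.0, 920.0]  # pool is 95% full, first device is 100% full

# With the threshold at 90%, rebalance stays paused and nothing changes.
at_default = rebalance_if_allowed(used, capacity, critical_threshold_pct=90.0)

# With the threshold temporarily raised above current utilization
# (97% is an arbitrary example value), rebalance runs again.
after_raise = rebalance_if_allowed(used, capacity, critical_threshold_pct=97.0)

min_free_before = min(c - u for c, u in zip(capacity, at_default))
min_free_after = min(c - u for c, u in zip(capacity, after_raise))
print(f"least free capacity: before={min_free_before:.0f} GB, after={min_free_after:.0f} GB")
```

In this model the total used capacity does not change. Only its distribution does, which is why the previously full device regains free capacity once rebalance can run.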