XtremIO: How to Resolve and Manage Cluster Physical Capacity alerts (XTR0200302-4, XTR0203792 and XTR0203892) (User Correctable)
Summary: An XtremIO article detailing the cause and resolution of the following cluster physical capacity alerts: sys_ud_ssd_space_limited (XTR0200302), sys_ud_ssd_space_very_limited (XTR0200303), sys_ud_ssd_space_no_free (XTR0200304), user_physical_capacity_high (XTR0203792) and user_physical_capacity_very_high (XTR0203892) ...
Symptoms
The following alerts are created when the used physical capacity on an XtremIO array exceeds a predefined threshold or completely runs out:
| Alert Name | Symptom Code | Description |
|---|---|---|
| sys_ud_ssd_space_limited | XTR0200302 | Cluster-free physical capacity is low. Threshold: more than 85 percent is used |
| sys_ud_ssd_space_very_limited | XTR0200303 | Cluster-free physical capacity is critically low. Threshold: more than 90 percent is used |
| sys_ud_ssd_space_no_free | XTR0200304 | The cluster has no free physical capacity |
| user_physical_capacity_high | XTR0203792 | Exceeded SSD high utilization threshold. There are 70%(free_ud_ssd_space)s KB remaining. Note: This can be changed to a user-defined percentage |
| user_physical_capacity_very_high | XTR0203892 | Exceeded SSD very high utilization threshold. There are 80%(free_ud_ssd_space)s KB remaining. Note: This can be changed to a user-defined percentage |
There is no current impact on the cluster if any of the above symptom codes are raised, except for symptom code XTR0200304.
Symptom code XTR0200304 indicates that cluster physical space is fully consumed, in which case the cluster does not accept any more writes and cluster data become read-only/write-protected from a host perspective. This may cause some hosts to disconnect, or have read-only access to data, or both.
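As a rough illustration of how the built-in thresholds relate to one another, the following Python sketch maps a used-capacity percentage to the alert that would be expected. It is not XtremIO code. The function name `expected_alerts`, the choice to report only the most severe built-in alert, and the strict "greater than" comparison for the user-defined thresholds are all assumptions made for the example. The user-defined alerts are treated as disabled unless a threshold is passed in.

```python
from typing import List, Optional

# Built-in thresholds from the alert table (percent of physical capacity used).
LIMITED_PCT = 85.0
VERY_LIMITED_PCT = 90.0
NO_FREE_PCT = 100.0


def expected_alerts(used_pct: float,
                    user_high_pct: Optional[float] = None,
                    user_very_high_pct: Optional[float] = None) -> List[str]:
    """Return the alerts expected at a given used-capacity percentage.

    Hypothetical helper: only the most severe built-in alert is returned,
    which is an assumption, not documented array behavior.
    """
    if not 0.0 <= used_pct <= 100.0:
        raise ValueError("used_pct must be between 0 and 100")

    alerts: List[str] = []
    if used_pct >= NO_FREE_PCT:
        alerts.append("XTR0200304 sys_ud_ssd_space_no_free")
    elif used_pct > VERY_LIMITED_PCT:
        alerts.append("XTR0200303 sys_ud_ssd_space_very_limited")
    elif used_pct > LIMITED_PCT:
        alerts.append("XTR0200302 sys_ud_ssd_space_limited")

    # User-defined alerts apply only when a threshold is supplied (assumed
    # to mirror activity-mode=enabled with a chosen threshold).
    if user_high_pct is not None and used_pct > user_high_pct:
        alerts.append("XTR0203792 user_physical_capacity_high")
    if user_very_high_pct is not None and used_pct > user_very_high_pct:
        alerts.append("XTR0203892 user_physical_capacity_very_high")
    return alerts


if __name__ == "__main__":
    for pct in (50.0, 86.0, 95.0, 100.0):
        print(pct, expected_alerts(pct))
```

Only the 100 percent case corresponds to the write-protected state described above. The other results are warnings with no immediate impact on the cluster.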
Cause
Refer to the Symptoms section of this article for the cause of each of the listed alerts.
Resolution
If the listed alerts are being reported for a Disaster Recovery (DR) array using RecoverPoint replication, then refer to KB 494416 - XtremIO running out of free capacity (RecoverPoint DR site) for potential known issues and their resolution.
If the above is not applicable, then free up physical capacity on the affected XtremIO cluster by doing one or more of the following:
1. Reclaim space
   Deleted space should be reclaimed on the host side. Follow the instructions for the host OS type in the relevant Space Reclamation section of the XtremIO Storage Array Host Configuration Guide.
2. Delete or remove unused volumes
   Assess the volumes actively in use within your XtremIO cluster. Consider deleting or removing any volumes that are no longer required by external applications. Instructions for deleting or removing unused volumes are in the Deleting Volumes section of the XtremIO Storage Array User Guide corresponding to the software version of the affected cluster. You can also refer to the Resolution section of KB 468164 - XtremIO: Managing Cluster Capacity in Response to Product Alerts for detailed instructions.
   Note: Although physical capacity is automatically reclaimed by XtremIO following volume deletion or removal, deleted space should also be reclaimed on the host side by following the instructions in step 1.
3. Offload under-utilized volumes to another platform
4. Perform Online Cluster Expansion (OCE) to add more storage to the array. This can be scheduled by contacting your local account team (SAM, DSM, and ASR) to discuss an action plan to scale out the XtremIO array.
Note: To raise an alert at a user-defined capacity threshold (a percentage from 0 to 100), connect to the XMCLI in the UI or through PuTTY and run the following command:
xmcli (admin)> modify-alert-definition alert-type="user_physical_capacity_very_high" activity-mode=enabled clearance-mode="ack_required" threshold=<threshold_value>
Modified Alert Definition user_physical_capacity_very_high
- activity-mode=enabled raises an alert when the defined threshold is reached. The default for user_physical_capacity alerts is activity-mode=disabled
- clearance-mode=ack_required causes the alert to clear only when the capacity percentage is below the threshold AND the alert is acknowledged
- threshold=<threshold_value> sets the percentage of used space that must be reached before the alert is triggered
To check how the alerts are set up, run the show-alert-definitions command using XMCLI.
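When the same threshold change has to be applied to several clusters, it can help to generate the command text rather than type it each time. The Python sketch below only builds the `modify-alert-definition` string shown above and checks that the threshold lies in the 0 to 100 range. It does not connect to a cluster. The function name `build_modify_alert_command` is hypothetical, and accepting only whole-number thresholds is an assumption.

```python
# The two user-defined capacity alerts named in this article.
USER_ALERT_TYPES = (
    "user_physical_capacity_high",
    "user_physical_capacity_very_high",
)


def build_modify_alert_command(alert_type: str,
                               threshold: int,
                               enabled: bool = True,
                               clearance_mode: str = "ack_required") -> str:
    """Build the XMCLI command text for a user-defined capacity alert.

    Hypothetical helper: it only formats the string and never runs it.
    """
    if alert_type not in USER_ALERT_TYPES:
        raise ValueError(f"unsupported alert type: {alert_type}")
    # Assumption: thresholds are whole percentages between 0 and 100.
    if isinstance(threshold, bool) or not isinstance(threshold, int):
        raise ValueError("threshold must be an integer")
    if not 0 <= threshold <= 100:
        raise ValueError("threshold must be between 0 and 100")

    activity_mode = "enabled" if enabled else "disabled"
    return (
        f'modify-alert-definition alert-type="{alert_type}" '
        f"activity-mode={activity_mode} "
        f'clearance-mode="{clearance_mode}" '
        f"threshold={threshold}"
    )


if __name__ == "__main__":
    print(build_modify_alert_command("user_physical_capacity_high", 70))
    print(build_modify_alert_command("user_physical_capacity_very_high", 80))
```

The generated line can be pasted at the `xmcli (admin)>` prompt. Running show-alert-definitions afterwards confirms whether the change took effect.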
Additional Information
An automatic Global Services Service Request (SR) will be generated for symptom codes XTR0200302, XTR0200303, and XTR0200304 of this article.