Data Domain: After space issues are resolved with the cloud provider, the cloud profile remains in "Disconnected" status with error "Insufficient storage at the cloud provider"

Summary: On systems with Long-Term Retention/Cloud Tier (LTR/CT) configured, the cloud profile may stay in "Disconnected" status even after resolving the capacity issues with the cloud provider. ...


Symptoms

DDOS marks a cloud profile as "Unavailable" or "Disconnected" if protocol requests to or responses from the configured cloud provider return errors. For the particular case of capacity issues in the cloud provider or account, HTTP 507 (Insufficient Storage) errors are returned.

In DDOS releases prior to 7.7.2, this error code was treated like any other, and the cloud profile and provider could end up unavailable if the errors repeated frequently. However, because HTTP 507 is a special kind of error, one which the cloud provider uses to report being at capacity, DDOS 7.7.2 and later handles it in a special way: it sets the cloud profile to the status shown below:
Name        Profile             Status         Reason
---------   -----------------   ------------   -------------------------------------------
cloud_unit  cloud_profile       Disconnected   Insufficient storage at the cloud provider.
---------   -----------------   ------------   -------------------------------------------
 
To let the DD administrator know why the cloud unit is no longer working, a persistent registry key is set, indicating that the cloud provider is unavailable. From that point on, the DD FS process no longer contacts the cloud provider.

After the capacity issues in the cloud provider are resolved, the cloud profile and unit remain in the same status. No indication of cloud activity can be seen in the usual "ddfs.info" and "csm.log" files. Restarting the FS does not solve the issue.
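
The symptom can be checked for in a script. The following is a minimal sketch, assuming the status listing shown above has already been captured as text (how it is captured is outside the scope of this article). The function name `find_nospace_units` and the column-splitting rule (columns separated by two or more spaces) are assumptions made for illustration, not part of DDOS:

```python
import re

# Reason text as shown in the status listing (trailing period not required).
NOSPACE_REASON = "Insufficient storage at the cloud provider"

def find_nospace_units(listing: str) -> list[str]:
    """Return the names of cloud units that are Disconnected for capacity reasons."""
    units = []
    for line in listing.splitlines():
        # Assumption: columns are separated by runs of two or more spaces,
        # while the Reason column itself only contains single spaces.
        fields = re.split(r"\s{2,}", line.strip(), maxsplit=3)
        if len(fields) != 4:
            continue  # not a four-column row
        name, _profile, status, reason = fields
        if status == "Disconnected" and reason.startswith(NOSPACE_REASON):
            units.append(name)
    return units

SAMPLE = """\
Name        Profile             Status         Reason
---------   -----------------   ------------   -------------------------------------------
cloud_unit  cloud_profile       Disconnected   Insufficient storage at the cloud provider.
---------   -----------------   ------------   -------------------------------------------
"""

print(find_nospace_units(SAMPLE))  # ['cloud_unit']
```

Header and separator rows are skipped because their third column is not "Disconnected".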

Cause

When the cloud provider reports being at capacity (through HTTP 507 errors), the cloud profile and provider are marked as disconnected, and a registry key is set. This registry key prevents the cloud code from reaching out to the cloud provider again, so even if the space issue is resolved, the cloud unit cannot be accessed without further intervention by the administrator.

Although this issue has been seen only for ECS as a cloud provider, there is nothing preventing it from occurring for other cloud providers, as long as they can return HTTP 507 error codes to the DD when capacity issues exist (that is, when the account that is used for storage has reached some given quota).
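
The behavior described in this section amounts to a latch: once set, the flag blocks all further contact until it is explicitly cleared. The toy model below illustrates that logic only. It is not DDOS code, and every name in it (`CloudUnitModel`, `nospace_flag`, `try_request`, and so on) is hypothetical:

```python
class CloudUnitModel:
    """Toy model of the latch described above; not actual DDOS behavior or code."""

    def __init__(self):
        # Stands in for the persistent registry key.
        self.nospace_flag = False

    def try_request(self, provider_returns_507: bool) -> bool:
        """Return True if the provider was contacted, False if the request was blocked."""
        if self.nospace_flag:
            # Flag is set: the provider is not contacted at all, so a
            # provider that has recovered is never noticed.
            return False
        if provider_returns_507:
            # Provider reports being at capacity: latch the flag.
            self.nospace_flag = True
        return True

    def restart_fs(self) -> None:
        # The flag is persistent, so a restart leaves it unchanged.
        pass

    def clear_state_nospace(self) -> None:
        # Models the administrator clearing the flag.
        self.nospace_flag = False


unit = CloudUnitModel()
unit.try_request(provider_returns_507=True)          # provider full, flag latched
print(unit.try_request(provider_returns_507=False))  # False: still blocked
unit.restart_fs()
print(unit.try_request(provider_returns_507=False))  # False: restart does not help
unit.clear_state_nospace()
print(unit.try_request(provider_returns_507=False))  # True: provider contacted again
```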

Resolution

DDOS 7.7.2 and later releases include a new command, which must be used to clear the cloud unit state so that the unit can be accessed again after it became disconnected due to capacity:
#  cloud unit clear-state <unit-name> {nospace}
                                       Clear the registry entry indicating
                                       cloud unit is not available for writing
 

For the example above, the command below would need to be run after resolving the cloud provider capacity issues, to allow the cloud provider to be used again:
# cloud unit clear-state cloud_unit nospace
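
Where several cloud units are affected, the command lines can be generated rather than typed by hand. A minimal sketch follows. The helper name `clear_state_commands` is hypothetical, and the whitespace check is an assumption about what a usable unit name looks like, not a documented DDOS naming rule. The sketch only builds the command text; it does not run anything on the DD system:

```python
def clear_state_commands(unit_names):
    """Build one 'cloud unit clear-state <unit-name> nospace' line per unit name."""
    commands = []
    for name in unit_names:
        # Assumption: a unit name is a single non-empty token without whitespace.
        if not name or any(ch.isspace() for ch in name):
            raise ValueError(f"unexpected cloud unit name: {name!r}")
        commands.append(f"cloud unit clear-state {name} nospace")
    return commands


for command in clear_state_commands(["cloud_unit"]):
    print(command)  # cloud unit clear-state cloud_unit nospace
```

The commands should only be run after the capacity issue at the cloud provider has been resolved.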
 

Affected Products

Data Domain
Article Properties
Article Number: 000203168
Article Type: Solution
Last Modified: 12 May 2023
Version:  3