Cloud unit does not free space after multiple cleaning cycles
Summary: Cloud unit does not free space after multiple cleaning cycles.
Symptoms
This issue can occur on any DD with cloud tier enabled.
Cloud cleaning does not produce any utilization drop when many pending GC cycles are stuck waiting to process background deletes.
Cause
If cloud unit timeouts are too frequent, the DD cannot process background deletes.
EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Timeout was reached
EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN
Resolution
The following issues can occur when a cloud unit gets disconnected.
- DDFS goes into a not-responding state or panics.
- Data movement either stops or is suspended.
- Cloud cleaning is either aborted or does not clean expired data from the cloud tier.
- Cloud recall does not work, takes a long time, or reports data not found.
The cloud unit does not show a utilization drop after a cleaning cycle if multiple pending cycles are stuck in a not-responding state. The evidence can be seen in the auto-support.
GC remote delete stats for CP edccloudtier2*:
                                                                 Recent      Cumulative
Number of delete list containers <cid,offset> processed:            174            1799
Number of delete list containers <cid> skipped:                     173            1219
Number of regions to delete:                                          0       627732657
Number of regions deleted:                                      8236883       215044853
Bytes deleted:                                             623322937273  16683421699021
Run time (s):                                                    332316         9324202
Deletion rate (region/s):                                            24              23
>Pending cycles:                                                     17              17
Follow these steps as applicable to the scenario.
- Check connectivity with the cloud provider:
  - Check for any connection issue on port 443.
  - Check name resolution of the cloud provider from the DD.
- If everything is fine between the DD and the cloud tier, check on the DD whether the file system is not responding or has panicked.
- Reboot the DD to release any process that has stopped responding, so that cloud connectivity is restored and the pending GC cycles can be processed.
- The issue is resolved by restoring cloud connectivity and then running the commands below on the DD to stop and restart background deletes.
Stop the async deletes:
# cloud clean background-delete stop
Restart the bulk deletes:
# cloud clean background-delete start
- Bug #234809: Upgrade the DD to 6.1.2.x if you have more than one cloud unit and are experiencing cloud cleaning issues.
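The two connectivity checks in the steps above (name resolution and a TCP connection on port 443) could be approximated with a short Python sketch. It is meant to run from an ordinary host on the same network path as the DD, not on the DD itself, so a success here is only a hint about what the DD sees. The function names are illustrative, and the endpoint in the trailing comment is a hypothetical placeholder for the real cloud provider endpoint.

```python
import socket


def resolve(hostname, port=443):
    """Return the sorted unique IP addresses for hostname, or [] if lookup fails."""
    try:
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def can_connect(hostname, port=443, timeout=5.0):
    """Return True if a TCP connection to hostname:port opens within timeout seconds."""
    try:
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call (hypothetical endpoint, replace with the provider's real one):
#   resolve("objectstore.example.com")
#   can_connect("objectstore.example.com", 443)
```

An empty list from `resolve` points at a name resolution problem, while a `False` from `can_connect` after a successful lookup points at a firewall, routing, or provider-side problem on port 443.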
Additional Information
The cloud unit does not show a utilization drop after a cleaning cycle if multiple pending cycles are stuck in a not-responding state. The evidence can be seen in the auto-support.
Before resolution
GC remote delete stats for CP edccloudtier2*:
                                                                 Recent      Cumulative
Number of delete list containers <cid,offset> processed:            174            1799
Number of delete list containers <cid> skipped:                     173            1219
Number of regions to delete:                                          0       627732657
Number of regions deleted:                                      8236883       215044853
Bytes deleted:                                             623322937273  16683421699021
Run time (s):                                                    332316         9324202
Deletion rate (region/s):                                            24              23
>Pending cycles:                                                     17              17
After resolution
GC remote delete stats for CP edccloudtier2*:
                                                                 Recent      Cumulative
Number of delete list containers <cid,offset> processed:           1285               0
Number of delete list containers <cid> skipped:                     776               0
Number of regions to delete:                                          0               0
Number of regions deleted:                                    165625861               0
Bytes deleted:                                           11764930559267               0
Run time (s):                                                     19897               0
Deletion rate (region/s):                                          8313               0
Pending cycles:                                                       1               1
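The before/after comparison can also be made programmatically. The following is a minimal Python sketch, assuming the GC remote delete stats have been copied out of the auto-support as text. The helper names `parse_gc_stats` and `backlog_cleared` are illustrative and not part of any DD tool, and the rule used (recent pending cycles fall and recent deletion rate rises) is only one reasonable reading of the two samples above.

```python
import re
from typing import Dict, Tuple

# Fields of interest, spelled exactly as they appear in the stats block.
FIELDS = (
    "Number of regions deleted",
    "Run time (s)",
    "Deletion rate (region/s)",
    "Pending cycles",
)


def parse_gc_stats(text: str) -> Dict[str, Tuple[int, int]]:
    """Map each known field to its (recent, cumulative) values."""
    stats = {}
    for field in FIELDS:
        match = re.search(re.escape(field) + r":\s*(\d+)\s+(\d+)", text)
        if match:
            stats[field] = (int(match.group(1)), int(match.group(2)))
    return stats


def backlog_cleared(before: dict, after: dict) -> bool:
    """True if recent pending cycles fell and the recent deletion rate rose."""
    return (
        after["Pending cycles"][0] < before["Pending cycles"][0]
        and after["Deletion rate (region/s)"][0]
        > before["Deletion rate (region/s)"][0]
    )


# Excerpts of the two samples shown above.
BEFORE = """Number of regions deleted: 8236883 215044853
Run time (s): 332316 9324202
Deletion rate (region/s): 24 23
>Pending cycles: 17 17"""

AFTER = """Number of regions deleted: 165625861 0
Run time (s): 19897 0
Deletion rate (region/s): 8313 0
Pending cycles: 1 1"""

print(backlog_cleared(parse_gc_stats(BEFORE), parse_gc_stats(AFTER)))
```

Because each field is searched for by name, the parser works on both the one-field-per-line layout and the flattened single-line form in which the stats sometimes get pasted.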
The evidence can be seen by analyzing multiple auto-supports and looking at the GC pending cycles. Below are some examples.
autosupport_2018-11-08.out:Pending cycles: 5 5
autosupport_2018-11-13.out:Pending cycles: 6 6
autosupport_2018-11-24.out:Pending cycles: 7 7
autosupport_2018-12-15.out:Pending cycles: 9 9
autosupport_2018-12-16.out:Pending cycles: 10 10
autosupport_2018-12-28.out:Pending cycles: 11 11
autosupport_2018-12-29.out:Pending cycles: 12 12
autosupport_2019-01-11.out:Pending cycles: 13 13
autosupport_2019-01-18.out:Pending cycles: 14 14
autosupport_2019-01-19.out:Pending cycles: 15 15
autosupport_2019-02-05.out:Pending cycles: 16 16
autosupport_2019-02-07.out:Pending cycles: 17 17
autosupport_2019-02-08.out:Pending cycles: 1 1
autosupport_2019-02-09.out:Pending cycles: 1 1
autosupport_2019-02-10.out:Pending cycles: 1 1
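Tracking the pending cycle count across many auto-supports is easy to script. The sketch below is one possible approach in Python, assuming the input is grep-style text in the `autosupport_<date>.out:Pending cycles: <recent> <cumulative>` form shown above. The function names are illustrative, and only the first (recent) column is used.

```python
import re
from typing import List, Optional, Tuple

# Matches e.g. "autosupport_2018-11-08.out:Pending cycles: 5 5"
LINE_RE = re.compile(
    r"autosupport_(\d{4}-\d{2}-\d{2})\.out:\s*>?\s*Pending cycles:\s*(\d+)\s+(\d+)"
)


def parse_pending_cycles(text: str) -> List[Tuple[str, int]]:
    """Return (date, recent pending cycles) pairs, sorted by date."""
    rows = [(m.group(1), int(m.group(2))) for m in LINE_RE.finditer(text)]
    return sorted(rows)  # ISO dates sort correctly as plain strings


def first_drop(rows: List[Tuple[str, int]]) -> Optional[str]:
    """Return the date of the first sample lower than its predecessor, else None."""
    for (_, prev), (date, cur) in zip(rows, rows[1:]):
        if cur < prev:
            return date
    return None


# Three of the samples listed above.
SAMPLE = """\
autosupport_2019-02-05.out:Pending cycles: 16 16
autosupport_2019-02-07.out:Pending cycles: 17 17
autosupport_2019-02-08.out:Pending cycles: 1 1
"""

rows = parse_pending_cycles(SAMPLE)
print(rows)
print("first drop on:", first_drop(rows))
```

A count that only ever rises, with `first_drop` returning `None`, matches the stuck state described in this article. The first drop marks the auto-support in which the backlog was processed.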
The cloud unit timeout issues can be seen in the logs or in the output of "# alerts show history".
INFO: Event posted: m0-1585 (21000631:553649713): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN
INFO: Event posted: m0-1586 (21000632:553649714): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN
INFO: Event posted: m0-1587 (21000633:553649715): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN
INFO: Event posted: m0-1588 (21000634:553649716): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN
INFO: Event posted: m0-1589 (21000635:553649717): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN
INFO: Event posted: m0-1591 (21000637:553649719): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN
INFO: Event posted: m0-1578 (2100062a:553649706): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN
INFO: Event posted: m0-1579 (2100062b:553649707): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN
INFO: Event posted: m0-1580 (2100062c:553649708): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN
INFO: Event posted: m0-1581 (2100062d:553649709): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN
INFO: Event posted: m0-1582 (2100062e:553649710): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN
INFO: Event posted: m0-1583 (2100062f:553649711): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN
INFO: Event posted: m0-1584 (21000630:553649712): EVT-CLOUD-00001: Unable to access provider for cloud unit edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 EVT-INFO::Cause=Cloud profile state is DOWN
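To judge how frequent the timeouts are, the EVT-CLOUD-00001 events can be tallied per cloud unit and cause. The following Python sketch assumes the log or alert history has been saved as text in the message format shown in this article. The regular expression and the function name `count_causes` are illustrative, not part of any DD tool.

```python
import re
from collections import Counter

# Captures the cloud unit name and the cause text of each EVT-CLOUD-00001 event.
# The cause ends at the next "INFO:" or "EVT-CLOUD-" token, or at end of line,
# so events fused onto a single line are still separated.
EVENT_RE = re.compile(
    r"EVT-CLOUD-00001: Unable to access provider for cloud unit "
    r"(?P<unit>\S+?)\.EVT-OBJ::CloudUnit=\S+\s+EVT-INFO::Cause="
    r"(?P<cause>.+?)(?=\s+(?:INFO:|EVT-CLOUD-)|\s*$)",
    re.MULTILINE,
)


def count_causes(log_text: str) -> Counter:
    """Count events keyed by (cloud unit, cause)."""
    return Counter(
        (m.group("unit"), m.group("cause")) for m in EVENT_RE.finditer(log_text)
    )


# The two alert messages quoted in the Cause section of this article.
SAMPLE = (
    "EVT-CLOUD-00001: Unable to access provider for cloud unit "
    "edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 "
    "EVT-INFO::Cause=Timeout was reached\n"
    "EVT-CLOUD-00001: Unable to access provider for cloud unit "
    "edccloudtier2.EVT-OBJ::CloudUnit=edccloudtier2 "
    "EVT-INFO::Cause=Cloud profile state is DOWN\n"
)

for (unit, cause), n in count_causes(SAMPLE).items():
    print(f"{unit}: {cause}: {n}")
```

A high count of "Timeout was reached" or "Cloud profile state is DOWN" for the same cloud unit over a short period is consistent with the cause described in this article.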