Data Domain: Scheduled cleaning fails to start, posting WARNING "EVT-GC-00002: Unable to start scheduled file system cleaning"
Summary: DataDomain clean (GC) is scheduled to run on particular days and times. In more recent DDOS versions, when there is any such schedule and, for any reason, the clean process cannot be started, this is noticed by the system monitoring daemon, which raises an alert. ...
Symptoms
Data Domain clean (garbage collection, GC) is scheduled to run on particular days and times. In DDOS 6.0.x and later versions, if such a schedule exists and the clean process cannot be started for any reason, the system monitoring daemon notices this and eventually raises an alert such as the one below:
# alerts show current
Id      Post Time                  Severity   Class        Object   Message
-----   ------------------------   --------   ----------   ------   -----------------------------------------------------------------------------------------
m0-11   Tue Jun 27 16:32:03 2017   WARNING    Filesystem            EVT-GC-00002: Unable to start scheduled file system cleaning on Tue Jun 27 16:04:00 2017.
-----   ------------------------   --------   ----------   ------   -----------------------------------------------------------------------------------------
An alert ASUP is also sent, with details like the following:
Hostname: dd-6800
Location: Lab4_Row_M
System SerialNo: APMxxxxxxxxxxxxxx
Chassis SerialNo: FCxxxxxxxxxxxxxxx
ModelNo: DD6800
Version: 6.0.0.1
Time: Tue Jun 27 16:15:02 2017
Alert Id: m0-11
Event Id: EVT-GC-00002
Event Message: Unable to start scheduled file system cleaning on Tue Jun 27 16:04:00 2017.
Event Description: Cleaning has not started as scheduled. Space for deleted files will not be reclaimed until cleaning completes. This may impact the ability to backup.
Recommended Action: Determine the reason why cleaning did not start. Manually start cleaning if free space needs to be reclaimed before the next scheduled cleaning. If problem persists, contact your contracted support provider or visit us online at https://support.emc.com.
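When correlating the alert with system logs, it can help to extract the scheduled start time from the event message, since that is the time to look for in the logs rather than the alert post time. The following is a minimal Python sketch that assumes the message text has exactly the format shown in the example alert above; the function name scheduled_time is illustrative and not part of any Data Domain tooling.

```python
import re
from datetime import datetime

# Timestamp layout as it appears in the example alert, e.g. "Tue Jun 27 16:04:00 2017".
STAMP_FORMAT = "%a %b %d %H:%M:%S %Y"

# Assumption: the event message always follows the wording of the example alert.
EVENT_MESSAGE = re.compile(
    r"Unable to start scheduled file system cleaning on (?P<when>.+?)\.\s*$"
)

def scheduled_time(message):
    """Return the scheduled clean start time from an EVT-GC-00002 message, or None."""
    match = EVENT_MESSAGE.search(message)
    if match is None:
        return None
    return datetime.strptime(match.group("when"), STAMP_FORMAT)

message = ("EVT-GC-00002: Unable to start scheduled file system cleaning "
           "on Tue Jun 27 16:04:00 2017.")
scheduled = scheduled_time(message)
posted = datetime.strptime("Tue Jun 27 16:32:03 2017", STAMP_FORMAT)
delay = posted - scheduled  # 0:28:03 for the alert shown above
```

In the example alert, the scheduled start was 16:04:00 and the alert was posted at 16:32:03, so log searches should focus on the minutes around 16:04.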
Cause
The alert only indicates that a scheduled clean process could not be started at the time it was supposed to start. There are multiple possible reasons for this, most of which do not indicate a problem. Reasons why the alert may be triggered include:
- DD GC was already running at the time the scheduled clean process was due to start. Only one GC process can run at any given time, and a new attempt does not preempt a running GC, so the scheduled one was skipped, which raises the alert
- Actions incompatible with GC were in progress, for example data-movement (FMIG) from the Active to the Archive storage tier, or Cloud Tier cleaning, at the time Active tier GC was about to start
- A previous change of the system time zone may have left the internal "cron" daemon, which is in charge of scheduled tasks, running on the old time zone instead of the new one. Depending on the previous and current time zones, DD GC may then run several hours earlier or later than expected, which raises the alert for the skipped GC. See the KB "Data Domain: How to modify the date/time and/or time zone on a Data Domain Restorer (DDR)" for more details on time zone changes in a DD
- Internally, the DD clean is started by submitting a job to the internal "sms" daemon for the "filesys clean start" command. If "sms" is not responsive, or the FS fails to respond to "sms" in time, GC will not start and will be skipped. Check the "sms.info" log for matching entries such as the following, which indicate that clean was attempted but the job failed to start:
02/28 12:00:26.495 (tid 0xa79c040): completed job: 3278752 for operation: sms_filesys_clean_start, duration: 25067 msec, status: **** The filesystem is not responding.
- Similar to the above, but caused by a "Time backward jump": the cron service is not synchronized back with the newly set time
The ASUP may contain entries like the following:
config.snmp.trapinfo.17 = File system is disabled due to a critical condition.EVT-OBJ::Enclosure=1 EVT-INFO::Cause=System Time backward jumped
config.snmp.trapinfo.19 = Unable to start scheduled file system cleaning on Tue Nov 15 06:00:00 2022.
- If the FS was down or unresponsive, an HA failover was taking place, or the DD was rebooting or down at the scheduled time, GC may have been skipped as well
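The "sms.info" check mentioned in the list above can be sketched in Python as follows. This is only an illustration under stated assumptions: the line layout is inferred from the single log entry quoted above, a status beginning with "****" is treated as a failure (an assumption based on that same entry), and the function name failed_clean_starts is hypothetical.

```python
import re

# Layout inferred from the one sms.info entry quoted in this article (assumption).
CLEAN_START_JOB = re.compile(
    r"^(?P<stamp>\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+) .*?"
    r"completed job: (?P<job>\d+) for operation: sms_filesys_clean_start, "
    r"duration: (?P<msec>\d+) msec, status: (?P<status>.*)$"
)

def failed_clean_starts(lines):
    """Return (timestamp, job id, error text) for clean start jobs that failed."""
    failures = []
    for line in lines:
        match = CLEAN_START_JOB.search(line.strip())
        # Assumption: an error status is prefixed with "****", as in the quoted entry.
        if match and match.group("status").startswith("****"):
            error_text = match.group("status").lstrip("* ").strip()
            failures.append((match.group("stamp"), int(match.group("job")), error_text))
    return failures

sample = [
    "02/28 12:00:26.495 (tid 0xa79c040): completed job: 3278752 for operation: "
    "sms_filesys_clean_start, duration: 25067 msec, status: **** The filesystem "
    "is not responding.",
]
failures = failed_clean_starts(sample)
```

Matching a failure timestamp against the scheduled clean time from the alert helps confirm that "sms" was the reason the clean did not start.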
Another reason we have seen in the past, albeit very infrequently, for GC to be skipped, is some inconsistency for the clean schedule in the registry. For example, the registry and the CLI both show GC is scheduled to run on Sundays at 06.00 AM local time:
# reg show collection.1.expunge.schedule
collection.1.expunge.schedule.days = Sun
collection.1.expunge.schedule.time = 0600
# filesys clean show config
Filesystem Cleaning Configuration
---------------------------------
50 Percent Throttle
Filesystem cleaning is scheduled to run "Sun" at "0600".
However, a different registry key (collection.1.crontab.expunge), which is used by the "crontab" process scheduler to start the configured jobs, is incorrect, for example:
# reg show collection.1.crontab.expunge
collection.1.crontab.expunge = 00 6 * * 2 root /ddr/bin/ddsh -s filesys clean start nowait scheduled
The above registry key indicates scheduled clean is to be started at 06.00 AM local time on Tuesdays (2 in the fifth "crontab" job specification) instead of Sundays (0).
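The comparison between the configured schedule and the "crontab" registry key can be sketched in Python as follows. This is a minimal illustration, not a Data Domain tool: the function name schedule_matches is hypothetical, the input strings must be copied by hand from the "reg show" output, and the sketch assumes numeric day-of-week fields (0 or 7 for Sunday, as in standard crontab syntax) and comma-separated day lists.

```python
# Day names as printed by the CLI, mapped to crontab day-of-week numbers (0 = Sunday).
CRON_DAY = {"Sun": 0, "Mon": 1, "Tue": 2, "Wed": 3, "Thu": 4, "Fri": 5, "Sat": 6}

def schedule_matches(days, time_hhmm, crontab_value):
    """Check that a crontab.expunge value agrees with the schedule days and time.

    days          -- value of collection.1.expunge.schedule.days, e.g. "Sun"
    time_hhmm     -- value of collection.1.expunge.schedule.time, e.g. "0600"
    crontab_value -- value of collection.1.crontab.expunge
    """
    fields = crontab_value.split()
    cron_minute, cron_hour = int(fields[0]), int(fields[1])
    # Assumption: the day-of-week field holds numbers only; 7 is treated as Sunday.
    cron_days = {int(day) % 7 for day in fields[4].split(",")}
    wanted_days = {CRON_DAY[day] for day in days.split(",")}
    wanted_hour, wanted_minute = int(time_hhmm[:2]), int(time_hhmm[2:])
    return (cron_minute, cron_hour, cron_days) == (wanted_minute, wanted_hour, wanted_days)

# Values from the example above: schedule says Sunday, crontab says Tuesday.
consistent = schedule_matches(
    "Sun", "0600",
    "00 6 * * 2 root /ddr/bin/ddsh -s filesys clean start nowait scheduled",
)
```

For the example values the check reports a mismatch, because the fifth crontab field is 2 (Tuesday) while the schedule says Sunday.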
Resolution
You may clear the alert at any time, but doing so will not resolve the underlying issue nor result in clean immediately starting. Depending on the cause for the skipped GC cycle the approach will be different, and this KB will not go into further details about it. Please check the DELL EMC DataDomain KB articles for assistance or, if not, reach out to your contracted support provider.
In the case of a 'Time backward jump', double-check that the registry configuration matches the 'filesys clean' schedule, then restart the cron service:
* Note: restarting the cron service requires a bash mode console. If needed, open a new SR to get help from Data Domain Support.
1. Double-check the job configuration:
# filesys clean show schedule
Filesystem cleaning is scheduled to run "Wed" at "1600".
# reg show collection.1.crontab.expunge
collection.1.crontab.expunge = 0 16 * * 3 root /ddr/bin/ddsh -s filesys clean start nowait scheduled
2. Set a new schedule if needed:
# filesys clean set schedule Wed 1600
3. Restart the cron service (use either one of the following commands):
# /etc/init.d/crond restart
or
# systemctl restart crond.service
For the issue with the inconsistent registry entries only, the fix is to forcibly set the correct clean schedule again from the CLI. Continuing with the example, the administrator would have to set the clean schedule to Sundays at 06.00 AM, even if "filesys clean show schedule" already reports that to be the case:
# filesys clean show schedule
Filesystem cleaning is scheduled to run "Sun" at "0600".
# filesys clean set schedule Sun 0600
Filesystem cleaning is scheduled to run "Sun" at "0600".
# filesys clean show schedule
Filesystem cleaning is scheduled to run "Sun" at "0600".
After doing this, confirm that the registry key indicating clean to be scheduled for the wrong day has been updated:
# reg show collection.1.crontab.expunge
collection.1.crontab.expunge = 0 6 * * 0 root /ddr/bin/ddsh -s filesys clean start nowait scheduled
Affected Products
Data Domain, DD OS 6.0
Article Properties
Article Number: 000052147
Article Type: Solution
Last Modified: 17 Jul 2023
Version: 4