FluidFS Customer Notification: How the reclaimer service frees space on a NAS pool
Summary: This Customer Notification explains how the reclaimer service frees space on a NAS pool.
Symptoms
When a snapshot is deleted, it is scanned to free the blocks it exclusively owned. This background process is called the reclaimer.
The larger the snapshot, the longer this operation takes. After the reclaimer has finished, all of the freed space becomes available to the volume.
The reclaimer must run before the freed blocks can be unmapped on the backend SAN volumes using SCSI unmap (if enabled).
The reclaimer process is queued to run whenever data is deleted on the NAS pool, including data deleted from shares and NAS volumes, and deleted snapshots.
Cause
Known limitations and Issues
- The reclaimer service cannot be run manually or stopped for an extended period of time. Once it begins, it must finish its queue before space is released to the NAS pool.
- Reclaiming snapshots is resource-intensive. If a lot of reclaiming activity occurs concurrently, it could cause performance problems across the cluster.
- Resource-intensive reclaim operations can degrade performance enough to affect client access to the cluster.
- There is a snapshot creation and expiration limit that varies by appliance based on overall system load. This could directly impact the reclaimer and system functionality.
- While reclaimer has been improved in FluidFS firmware v6 for snapshot deletions, it is possible for an overloaded reclaimer service to affect client access. These events are reported as "clients may encounter a long period of partial data access".
Check whether the performance problems occur around the time that some snapshots expire.
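One way to make this check concrete is to compare the times at which performance problems were observed against the snapshot expiration times. The following is a minimal sketch, not a FluidFS tool: the function name, the sample timestamps, and the 30-minute window are all assumptions chosen for illustration, and the timestamps would have to be collected by hand from event logs and snapshot schedules.

```python
from datetime import datetime, timedelta

def incidents_near_expiry(incidents, expirations, window=timedelta(minutes=30)):
    """Return (incident, expiry) pairs where the incident started
    no more than `window` after a snapshot expired.

    The 30-minute default is an assumption for illustration only.
    """
    matches = []
    for incident in incidents:
        for expiry in expirations:
            delay = incident - expiry
            if timedelta(0) <= delay <= window:
                matches.append((incident, expiry))
    return matches

# Hypothetical sample data: two snapshot expirations, two observed slowdowns.
expirations = [datetime(2024, 1, 10, 2, 0), datetime(2024, 1, 10, 14, 0)]
incidents = [datetime(2024, 1, 10, 2, 12), datetime(2024, 1, 10, 9, 45)]

near = incidents_near_expiry(incidents, expirations)
for incident, expiry in near:
    print(f"Slowdown at {incident} followed snapshot expiry at {expiry}")
```

If most slowdowns pair with an expiration, reclaimer activity is a plausible contributor. A match alone does not prove it.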
There are multiple types of snapshots:
- Manual snapshots - Snapshots that expire when the administrator deletes them, or according to the expiry time set by the administrator.
- Scheduled snapshots - Snapshots that expire according to the schedule details. The names are based on the schedule name.
- Network Data Management Protocol (NDMP) snapshots - Snapshots that expire when the NDMP backup completes. The names start with ndmp.
- Replication snapshots - Snapshots that expire after the next replication completes successfully. (During a replication there are two snapshots, the previous snapshot and the current snapshot.) Replication snapshot names begin with rep.
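The naming conventions above can be used to sort a list of snapshot names by type when reviewing which snapshots expire together. The sketch below is a heuristic only: the function name, the sample names, and the assumption that a scheduled snapshot name starts with its schedule name are illustrative, and a manual snapshot that happens to start with "rep" or "ndmp" would be misclassified.

```python
def classify_snapshot(name, schedule_names=()):
    """Guess the snapshot type from its name prefix.

    schedule_names is a caller-supplied list of known schedule names
    (hypothetical here); anything unmatched is treated as manual.
    """
    if name.startswith("ndmp"):
        return "ndmp"
    if name.startswith("rep"):
        return "replication"
    if any(name.startswith(schedule) for schedule in schedule_names):
        return "scheduled"
    return "manual"

# Hypothetical snapshot names for illustration.
names = ["ndmp_backup_1", "rep_vol1_0001", "hourly_2024_01_10", "before_upgrade"]
kinds = {name: classify_snapshot(name, schedule_names=["hourly"]) for name in names}
print(kinds)
```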
Resolution
Staggering snapshot tasks (Standard Snapshots, Replication, NDMP)
If many snapshots expire simultaneously, it might cause performance issues.
Fewer but larger snapshots that expire simultaneously can also cause performance problems.
It is recommended to stagger hourly snapshots in steps of 10 minutes, and to spread daily snapshots across the day (preferably expiring at night). Weekly snapshots should preferably expire on weekends.
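The 10-minute staggering of hourly snapshots can be planned with a small helper before the schedules are entered in the management interface. This is a sketch under stated assumptions: the function and volume names are hypothetical, and it only computes minute-of-hour offsets; it does not talk to FluidFS.

```python
def stagger_hourly(volumes, step_minutes=10):
    """Assign each volume a minute-of-hour offset, step_minutes apart.

    Offsets wrap around within the hour, so with the default step
    more than six volumes will share some slots.
    """
    if not 1 <= step_minutes <= 60:
        raise ValueError("step_minutes must be between 1 and 60")
    slots = 60 // step_minutes
    return {vol: (i % slots) * step_minutes for i, vol in enumerate(volumes)}

# Hypothetical NAS volume names for illustration.
volumes = ["vol_a", "vol_b", "vol_c", "vol_d", "vol_e", "vol_f", "vol_g"]
offsets = stagger_hourly(volumes)
for vol, minute in offsets.items():
    print(f"{vol}: hourly snapshot at minute {minute:02d}")
```

With seven volumes and a 10-minute step, the seventh volume wraps back to minute 00 and shares a slot with the first, so large or busy volumes are best placed in different slots.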