When a snapshot is deleted, it is scanned to free the blocks it exclusively owned. This process, known as the reclaimer, runs in the background. The bigger the snapshot, the longer the operation takes. After the reclaimer has finished, all of the freed space becomes available to the volume.
The reclaimer must run before the freed blocks can be unmapped with SCSI UNMAP (if enabled) on the back-end SAN volumes.
The reclaimer is queued to run whenever data is deleted from the NAS pool, including data deleted from shares, NAS volume deletions, and snapshot deletions.
Known limitations and issues
- The reclaimer service cannot be run manually or stopped for an extended period of time. Once it begins, it must finish its queue before space is released to the NAS pool.
- Reclaiming snapshots is resource intensive. If a lot of reclaiming activity occurs concurrently, it could cause performance problems across the cluster.
- The compatibility matrix outlines a snapshot creation/expiration rate of 300 per hour. This could directly affect the reclaimer and system functionality.
- Although the reclaimer was improved in FluidFS firmware v6, specifically for snapshot deletions, an overloaded reclaimer service can still affect client access. These events are typically reported as "Clients may encounter a long period of partial data access".
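As a rough illustration of the rate mentioned above, the following Python sketch finds the busiest 60-minute window in a list of snapshot creation/expiration timestamps. It is not a FluidFS tool: the function names are hypothetical, the timestamps are assumed to have been collected by the administrator from logs or schedules, and treating 300 per hour as an upper bound is an assumption made for this example.

```python
from datetime import datetime, timedelta

# Figure quoted from the compatibility matrix; treated here as an upper bound (assumption).
RATE_PER_HOUR = 300


def peak_hourly_rate(event_times):
    """Return the largest number of events that fall within any 60-minute window."""
    times = sorted(event_times)
    window = timedelta(hours=1)
    peak = 0
    start = 0
    for end, current in enumerate(times):
        # Move the window start forward until it is less than one hour before `current`.
        while current - times[start] >= window:
            start += 1
        peak = max(peak, end - start + 1)
    return peak


def exceeds_rate(event_times, limit=RATE_PER_HOUR):
    """True if any 60-minute window contains more than `limit` events."""
    return peak_hourly_rate(event_times) > limit
```

For example, feeding in the planned expiry times of all scheduled snapshots gives a quick indication of whether the schedules cluster too many expirations into a single hour.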
"Clients may encounter a long period of partial data access"
Check whether the performance problems occur around the time that some snapshots expire.
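One way to make this check concrete is to compare the timestamps of the reported events with the snapshot expiry times. The sketch below is a minimal example under stated assumptions: both lists of timestamps are gathered by the administrator, the function name is hypothetical, and the 15-minute tolerance is an arbitrary illustrative value rather than a documented figure.

```python
from datetime import datetime, timedelta


def events_near_expiry(event_times, expiry_times, tolerance=timedelta(minutes=15)):
    """Pair each event with the first snapshot expiry that precedes it by at most `tolerance`."""
    matches = []
    for event in event_times:
        for expiry in expiry_times:
            # Only count events that happen at or after the expiry, within the tolerance.
            if timedelta(0) <= event - expiry <= tolerance:
                matches.append((event, expiry))
                break
    return matches
```

If most events are matched to an expiry time, snapshot expiration is a likely contributor; if few are, the cause is probably elsewhere.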
Note that there are multiple types of snapshots:
- Ad-hoc snapshots – Snapshots that expire when the administrator deletes them, or according to the expiry time set by the administrator. The name is defined by the administrator who created them.
- Scheduled snapshots – Snapshots that expire according to the schedule details. The names are based on the schedule name.
- NDMP snapshots – Snapshots that expire when the NDMP backup completes. The names start with ndmp.
- Replication snapshots – Snapshots that expire after the next replication completes successfully. (During a replication there are two snapshots: the previous one and the current one.) Replication snapshot names begin with rep.
Staggering snapshot tasks (Standard Snapshots, Replication, NDMP)
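The naming conventions above can be used to sort a list of snapshot names by type. The following Python sketch is illustrative only: the function name and the schedule_names parameter are hypothetical, matching is done case-insensitively as an assumption, and scheduled snapshots are assumed to start with the schedule name (the text only says the names are based on it).

```python
def classify_snapshot(name, schedule_names=()):
    """Guess the snapshot type from its name, using the prefixes described above."""
    lowered = name.lower()
    if lowered.startswith("ndmp"):
        return "ndmp"
    if lowered.startswith("rep"):
        return "replication"
    # Assumption: a scheduled snapshot name starts with the name of its schedule.
    if any(s and lowered.startswith(s.lower()) for s in schedule_names):
        return "scheduled"
    # Anything else is treated as an administrator-named (ad-hoc) snapshot.
    return "ad-hoc"
```

Because ad-hoc names are chosen freely, a name such as "report" would be classified as a replication snapshot, so the result should be treated as a hint rather than a definitive answer.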
If many snapshots expire at the same time, it might cause performance issues.
Fewer, but larger, snapshots that expire at the same time can also cause performance problems.
It is recommended that you stagger hourly snapshots across the hour (in steps of 10 minutes) and stagger daily snapshots across the day (preferably expiring at night). Weekly snapshots should preferably expire on weekends.
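The 10-minute staggering of hourly snapshots can be sketched as a simple offset assignment. This Python example is only a planning aid: the function name and the schedule names passed to it are hypothetical, and the resulting offsets still have to be entered into the snapshot schedules by the administrator.

```python
def stagger_hourly(schedule_names, step_minutes=10):
    """Assign each hourly schedule a minute-of-the-hour offset in fixed steps."""
    slots = 60 // step_minutes  # number of distinct offsets available within one hour
    return {
        name: (index % slots) * step_minutes
        for index, name in enumerate(schedule_names)
    }
```

With the default step there are six offsets (0, 10, 20, 30, 40, 50), so a seventh schedule wraps around and shares minute 0 with the first one.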