Avamar: Space reclamation process Part 1: Garbage Collection
Summary: This KB article describes garbage collection, the first part of the Avamar space reclamation process.
Instructions
GSAN and on the hard drives.
The current implementation of garbage collection (GC) was introduced with Avamar v7.0, and its design remains largely unchanged.
What does garbage collection do?
Garbage collection is the first stage of the process where Avamar reclaims space that was used to store backup data.
Garbage collection reclaims space in the GSAN by removing data chunks that are no longer referenced by any backup:
- Data is said to be "defined" if it can be looked up in the index.
- Data is said to be "referenced" if it exists as part of a backup (that is, the hash is present in the User Accounting System, composite stripes, or directory elements).
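As a minimal sketch of the two definitions above, the following Python models the index and the backup references as plain sets. The function name, set names, and sample hash values are hypothetical and are not taken from Avamar.

```python
def removable_hashes(defined, referenced):
    """Return hashes that are defined (in the index) but not referenced by any backup."""
    return defined - referenced


# Hypothetical sample data: three hashes in the index, two still used by backups.
defined = {"a1", "b2", "c3"}
referenced = {"a1", "c3"}

candidates = removable_hashes(defined, referenced)
print(sorted(candidates))  # ['b2']
```

Only "b2" is a candidate for removal, because it can be looked up in the index but no backup refers to it.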
Space reclaimed by garbage collection cannot be reused until after crunching has run. Crunching runs immediately after the daily scheduled garbage collection has finished. See Avamar: Space reclamation process Part 2: Crunching.
When does garbage collection run?
Garbage collection runs at the beginning of the maintenance window, before the checkpoint/hfscheck/checkpoint cycle. During this time, incoming backups to the system should be limited, so garbage collection can run without loading the system heavily.
How long does garbage collection run for?
By default, garbage collection runs for 4 hours. If two passes do not complete within this time, the run time of the next garbage collection will be increased by 15 minutes. This continues until either two passes complete successfully, or the default limit of 7 hours (420 minutes) is reached.
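The run-time adjustment described above can be worked through with a short Python sketch. The constant and function names are hypothetical. The sketch also assumes that the run time is left unchanged once two passes complete, which this article does not state.

```python
DEFAULT_MINUTES = 240  # 4 hours
STEP_MINUTES = 15      # increase applied when two passes do not complete
LIMIT_MINUTES = 420    # 7 hours


def next_run_minutes(current_minutes, two_passes_completed):
    """Return the run time for the next garbage collection."""
    if two_passes_completed:
        # Assumption: the run time stays as it is once two passes complete.
        return current_minutes
    return min(current_minutes + STEP_MINUTES, LIMIT_MINUTES)


def runs_to_limit():
    """Count consecutive incomplete runs needed to go from the default to the limit."""
    minutes, runs = DEFAULT_MINUTES, 0
    while minutes < LIMIT_MINUTES:
        minutes = next_run_minutes(minutes, two_passes_completed=False)
        runs += 1
    return runs


print(runs_to_limit())  # 12, because (420 - 240) / 15 = 12
```

Under these assumptions, a system that never completes two passes reaches the 7-hour limit after 12 consecutive runs.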
What can prevent garbage collection from running successfully?
- Maintenance scheduler or, more specifically, garbage collection is disabled. The output of status.dpn can confirm this.
- Operating system capacity is above the disknogc value (which may be 86% to 89%).
- Time synchronization issues between Avamar nodes.
- Index stripes are splitting.
- Hash referenced bit maps are not able to reset.
For an up-to-date list, see Avamar: Troubleshooting Garbage Collection (GC) Failures (Resolution Path).
How garbage collection works
1. Garbage collection reads entries in the user accounting system, the composite stripes, and directory elements to build a Table Of Reference Counts (TORC).
   - In the TORC, garbage collection records all hashes on the system and how many times each hash is referenced.
2. Once the TORC is complete, each node loads a subset of its individual index stripes into memory. The gccount parameter defines the number of stripes read. For each hash defined in the index, garbage collection looks up the hash in the TORC to check if it is referenced.
   - If the hash exists in both the index and the TORC, there is nothing to do. Every hash in the TORC has a reference count of at least 1, so the hash is both defined and referenced.
   - If the hash exists in the index, but not in the TORC, the hash is defined but not referenced, so it can be removed.
   - hfscheck failure.
3. As noted earlier, hashes that are not referenced are not part of any backup, so they can be safely removed from the Avamar server. To do this, garbage collection:
   a. Removes the entry in the index.
   b. Zeroes out the entry for the hash in the Chunk Header Descriptor (CHD). The CHD defines where individual chunks are inside the stripe container.
4. If the chunk that garbage collection removed was a composite, the TORC must be updated.
   - Going back to step 1, the reference counts in the TORC include references made by composite stripes, which contain composite chunks.
   - Since a composite chunk was removed, the reference count in the TORC can be decremented by one for any hashes referenced by that composite chunk.
   - Garbage collection does this by reading in the composite, to see which hashes it references, and then updating the TORC.
5. Garbage collection unloads the previous set of index stripes from memory, and then loads a new set. Steps 2-4 are repeated for these new index stripes.
6. Once all the index stripes have been read, any data chunks (known as 'atomic' chunks) in the TORC that have no references (from step 4) are removed.
7. Once all the indexes have been read, garbage collection starts a new pass.
   - All the index stripes are re-read, looking for data that is no longer referenced after the previous passes. This is necessary because hashes are not read in a logical order, but rather in the order they are stored in the indexes.
   - Garbage collection is not certain to find the hashes in the optimal order. A hash can remain referenced until the end of the pass.
   - Two passes of garbage collection can comfortably maintain a "steady-state" capacity in most Avamar server environments.
   - Garbage collection performs passes until it runs out of time, or a pass completes without removing any data.
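The steps above can be illustrated with a simplified Python model. It is a sketch under stated assumptions, not the GSAN implementation. All function names and the sample data are hypothetical. Index stripes are modeled as sets of hashes, and composites as a mapping from a composite hash to the hashes it references. A hash counts as unreferenced when its TORC count is zero. The CHD and on-disk storage are not modeled, and the time limit is approximated by a maximum number of passes.

```python
from collections import Counter


def build_torc(backup_roots, composites):
    """Step 1: count references from backups and from every composite chunk."""
    torc = Counter()
    for root in backup_roots:
        torc[root] += 1
    for children in composites.values():
        for child in children:
            torc[child] += 1
    return torc


def run_pass(index_stripes, composites, torc, gccount):
    """Run one pass over all index stripes, gccount stripes at a time."""
    removed = []
    for start in range(0, len(index_stripes), gccount):
        for stripe in index_stripes[start:start + gccount]:
            for h in sorted(stripe):  # sorted only to keep the demo deterministic
                if torc[h] > 0:
                    continue  # defined and referenced: nothing to do
                stripe.discard(h)  # step 3: drop the index entry (CHD not modeled)
                removed.append(h)
                # Step 4: a removed composite no longer references its children.
                for child in composites.pop(h, ()):
                    torc[child] -= 1
    # End of pass: remove atomic chunks whose count dropped to zero after
    # their stripe had already been scanned.
    for stripe in index_stripes:
        for h in sorted(stripe):
            if h not in composites and torc[h] == 0:
                stripe.discard(h)
                removed.append(h)
    return removed


def collect_garbage(backup_roots, composites, index_stripes, gccount, max_passes=10):
    """Repeat passes until one removes nothing (max_passes stands in for the time limit)."""
    torc = build_torc(backup_roots, composites)
    passes = []
    for _ in range(max_passes):
        removed = run_pass(index_stripes, composites, torc, gccount)
        if not removed:
            break
        passes.append(removed)
    return passes


# Hypothetical data: "root_a" is a live backup, "root_old" is an expired one.
composites = {
    "root_a": ["c1", "d1"],
    "root_old": ["c_old", "d1"],
    "c1": ["d2"],
    "c_old": ["d3"],
}
stripes = [{"d1", "d3"}, {"root_old", "root_a", "c_old"}, {"c1", "d2"}]

passes = collect_garbage(["root_a"], composites, stripes, gccount=2)
print(passes)  # [['root_old'], ['c_old', 'd3']]
```

In this example the first pass removes only "root_old", because "c_old" has already been scanned by the time its reference count drops to zero. The second pass removes "c_old" and then the atomic chunk "d3", which shows why more than one pass is needed when hashes are not read in a logical order.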
Manual garbage collection
Micromanaging an Avamar server should not be required. The scheduler is intended to automate the running of maintenance tasks. If Avamar capacity is high, see the Avamar Operational Best Practices Guide and Avamar: Capacity Management Concepts and Training.
On rare occasions, running garbage collection manually might help alleviate acute issues where the GSAN "User capacity" is so high that the system enters read-only mode.
In these cases, garbage collection is run manually to bring down the capacity level to just below the read-only threshold. This allows the backup window to run.
Automated garbage collection can continue working as normal.
Avamar Support should fully investigate and understand the situation before manual garbage collection is considered.
It is never appropriate to request that Support runs manual garbage collection on a system without authorization from an L2 support engineer after such an investigation.