Avamar: Space reclamation process Part 1: Garbage Collection

Summary: This KB article describes the first part of the Avamar space reclamation process, known as garbage collection.

Instructions

This article is the first in a series which documents how Avamar recycles space, both within the GSAN and on the hard drives.
 

The current implementation of garbage collection (GC) was introduced with Avamar v7.0, and its design remains largely unchanged.

What does garbage collection do?

Garbage collection is the first stage of the process where Avamar reclaims space that was used to store backup data.

It operates on the cur directory, and frees up space within the GSAN by removing data chunks no longer referenced by any backup:
  • Data is said to be "defined" if it can be looked up in the index.
  • Data is said to be "referenced" if it exists as part of a backup (that is, the hash is present in the User Accounting System, composite stripes, or directory elements).
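
The two definitions above can be illustrated with a minimal Python sketch. It treats the index and the set of backup references as plain sets of hash strings. The function name `classify_hashes`, its parameters, and the sample hash values are hypothetical and are not Avamar interfaces.

```python
def classify_hashes(index_hashes, referenced_hashes):
    """Group hashes by whether they are defined and/or referenced.

    index_hashes:      hashes that can be looked up in the index (defined)
    referenced_hashes: hashes that are part of some backup (referenced)
    Both arguments are hypothetical stand-ins for Avamar's internal data.
    """
    defined = set(index_hashes)
    referenced = set(referenced_hashes)
    return {
        # Defined and referenced: live backup data, nothing to do.
        "keep": defined & referenced,
        # Defined but not referenced: candidates for garbage collection.
        "garbage": defined - referenced,
        # Referenced but not defined: a data integrity problem.
        "integrity_error": referenced - defined,
    }


result = classify_hashes(
    index_hashes={"h1", "h2", "h3"},
    referenced_hashes={"h1", "h3"},
)
```

In this made-up example, only `h2` is defined but unreferenced, so it is the only candidate for removal.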

Space reclaimed by garbage collection cannot be reused until after crunching has run. Crunching runs immediately after the daily scheduled garbage collection has finished. See Avamar: Space reclamation process Part 2: Crunching.

When does garbage collection run?

Garbage collection runs at the beginning of the maintenance window, before the checkpoint/hfscheck/checkpoint cycle. During this time, incoming backups to the system should be limited, so garbage collection can run without loading the system heavily.

How long does garbage collection run for?

By default, garbage collection runs for 4 hours. If two passes do not complete within this time, the run time of the next garbage collection will be increased by 15 minutes. This continues until either two passes complete successfully, or the default limit of 7 hours (420 minutes) is reached.
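
The adjustment rule above can be sketched in Python as follows. The function and constant names are hypothetical. The article does not say what happens to the limit once two passes complete, so the sketch simply leaves it unchanged, which is an assumption.

```python
DEFAULT_LIMIT_MIN = 240  # default run time: 4 hours
MAX_LIMIT_MIN = 420      # default upper limit: 7 hours
STEP_MIN = 15            # increase applied after an incomplete run


def next_gc_limit(current_limit_min, passes_completed):
    """Return the time limit, in minutes, for the next garbage collection run."""
    if passes_completed >= 2:
        # Assumption: the limit stays where it is once two passes complete.
        return current_limit_min
    # Fewer than two passes completed: extend the next run, up to the cap.
    return min(current_limit_min + STEP_MIN, MAX_LIMIT_MIN)


# Count how many consecutive incomplete runs it takes to reach the cap.
limit = DEFAULT_LIMIT_MIN
runs_to_cap = 0
while limit < MAX_LIMIT_MIN:
    limit = next_gc_limit(limit, passes_completed=1)
    runs_to_cap += 1
```

With these defaults, 12 consecutive runs that fall short of two passes take the limit from 240 to 420 minutes.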

What can prevent garbage collection from running successfully?

For an up-to-date list of common issues, see Avamar: Troubleshooting Garbage Collection (GC) Failures (Resolution Path). Some articles may require authentication on the Dell Support site to be viewed.

How garbage collection works

Step 1 - Building the table of reference counts (TORC):
    • Garbage collection reads entries in the user accounting system, the composite stripes, and directory elements to build a Table Of Reference Counts (TORC).
    • In the TORC, garbage collection records all hashes on the system and how many times each hash is referenced.
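
As a rough illustration of step 1, the sketch below builds a reference-count table with Python's `collections.Counter`. The data model is an assumption made for the example: the user accounting system is a list of backup root hashes, and composites and directory elements are dictionaries mapping a parent hash to the hashes it references. None of these names are Avamar interfaces.

```python
from collections import Counter


def build_torc(user_accounting, composites, directory_elements):
    """Count how many times each hash is referenced.

    user_accounting:    list of root hashes, one per backup (assumed model)
    composites:         composite hash -> list of referenced hashes (assumed)
    directory_elements: directory hash -> list of referenced hashes (assumed)
    """
    torc = Counter()
    # Each backup recorded in user accounting references its root hash once.
    torc.update(user_accounting)
    # Each composite or directory element references its children once each.
    for children in composites.values():
        torc.update(children)
    for children in directory_elements.values():
        torc.update(children)
    return torc


torc = build_torc(
    user_accounting=["rootA", "rootB"],
    composites={"rootA": ["c1", "c2"], "rootB": ["c2"]},
    directory_elements={},
)
```

Here `c2` is shared by both backups and ends up with a count of 2. A hash that nothing references never appears in the table at all.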
Step 2 - Reading the indexes:
    • Once the TORC is complete, each node loads a subset of its individual index stripes into memory. The gccount parameter defines the number of stripes read. For each hash defined in the index, garbage collection looks up the hash in the TORC to check if it is referenced.
    • If the hash exists in both the index and the TORC, there is nothing to do. Every hash in the TORC has a reference count of at least 1, so the hash is both defined and referenced.
    • If the hash exists in the index but not in the TORC, the hash is defined but not referenced, so it can be removed.
Note: If the hash existed in the TORC but not in the index, this would be a data integrity error (a hash that is referenced but not defined). This results in an hfscheck failure.
 
Step 3 - Remove unreferenced hashes:
    • As noted earlier, hashes that are not referenced are not part of any backup, so they can be safely removed from Avamar.
    • To do this, garbage collection:
      a. Removes the entry in the index.
      b. Zeroes out the entry for the hash in the Chunk Header Descriptor (CHD). The CHD defines where individual chunks are inside the stripe container.
At this point, Avamar has marked the area that the hash was occupying as empty. For performance and capacity reasons, the data itself is not deleted at this stage.
 
Step 4 - Update the TORC:
    • If the chunk which garbage collection removed was a composite, the TORC must be updated.
    • As described in step 1, the reference counts in the TORC include references made by composite stripes, which contain composite chunks.
    • Since a composite chunk was removed, the reference count in the TORC can be decremented by one for any hashes referenced by that composite chunk.
    • Garbage collection does this by reading in the composite, to see which hashes it references, and then updating the TORC.
Step 5 - Read the next set of indexes:
    • Garbage collection unloads the previous set of index stripes from memory, and then loads a new set.
    • Steps 2-4 are repeated for these new index stripes.
    • Once all the index stripes have been read, any data chunks (known as 'atomic' chunks) in the TORC that no longer have references (as a result of step 4) are removed.
Step 6 - Start a new pass:
    • Once all the indexes have been read, garbage collection starts a new pass.
    • All the index stripes are re-read, looking for data that is no longer referenced after the previous passes. This is necessary because hashes are not read in a logical order, but rather in the order they are stored in the indexes.
    • Garbage collection is not certain to find the hashes in the optimal order. A hash can remain referenced until the end of the pass.
    • Two passes of garbage collection can comfortably maintain a "steady-state" capacity in most Avamar server environments.
    • Garbage collection performs passes until it runs out of time, or a pass completes without removing any data.
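
The steps above can be sketched as a small simulation that shows why more than one pass is needed. This is a simplified model, not Avamar's implementation. `run_gc`, `batch_size` (standing in for the gccount parameter), and `max_passes` (standing in for the time limit) are hypothetical names, and the index is reduced to an ordered list of hashes.

```python
from collections import Counter


def run_gc(index_order, composites, torc, batch_size=2, max_passes=10):
    """Simulate garbage collection passes; return hashes removed per pass.

    index_order: hashes in the order they are stored in the index
    composites:  composite hash -> list of hashes it references
    torc:        Counter of reference counts (modified in place)
    """
    index = list(index_order)
    removed_per_pass = []
    for _ in range(max_passes):
        removed = []
        # Steps 2-5: read the index one batch of entries at a time.
        for start in range(0, len(index), batch_size):
            for h in index[start:start + batch_size]:
                if torc[h] > 0:
                    continue  # defined and referenced: nothing to do
                removed.append(h)  # step 3: defined but not referenced
                # Step 4: a removed composite no longer references its children.
                for child in composites.get(h, []):
                    torc[child] -= 1
        # End of step 5: remove atomic chunks whose count dropped to zero.
        gone = set(removed)
        for h in index:
            if h not in gone and h not in composites and torc[h] <= 0:
                removed.append(h)
                gone.add(h)
        index = [h for h in index if h not in gone]
        if not removed:
            break  # step 6: a pass that removes nothing ends the run
        removed_per_pass.append(removed)
    return removed_per_pass


# compA is unreferenced; compB and atom1 are only reachable through it.
composites = {"compA": ["compB"], "compB": ["atom1"], "root": ["atom2"]}
torc = Counter({"root": 1, "atom2": 1, "compB": 1, "atom1": 1})
passes = run_gc(["atom1", "compB", "root", "atom2", "compA"], composites, torc)
```

In this example, `compB` is read before `compA`, so it still looks referenced during the first pass. It is only removed on the second pass, together with `atom1`, which illustrates how the order of entries in the index affects how many passes are needed.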
 

Manual garbage collection

Micromanaging an Avamar server should not be required. The scheduler is intended to automate the running of maintenance tasks. If Avamar capacity is high, see the Avamar Operational Best Practices Guide and Avamar: Capacity Management Concepts and Training.

On rare occasions, running garbage collection manually might help alleviate acute issues where the GSAN "User capacity" is so high that the system enters read-only mode.

In these cases, garbage collection is run manually to bring down the capacity level to just below the read-only threshold. This allows the backup window to run.

Automated garbage collection can continue working as normal.

Avamar Support should fully investigate and understand the situation before manual garbage collection is considered.

It is never appropriate to request that Support run manual garbage collection on a system without authorization from an L2 support engineer following such an investigation.

See Avamar: About the use of manual Garbage Collection

Affected Products

Avamar

Products

Avamar, Avamar Server
Article Properties
Article Number: 000068726
Article Type: How To
Last Modified: 05 Aug 2025
Version:  12