Avamar-Data Domain: High DD utilization on Target: Analysis & Best Practices

Summary: The Source and Target Data Domains are not expected to be exactly equal in disk utilization. This document describes the possible reasons the target Data Domain might show higher utilization than the source Data Domain. It is important to note that the discrepancy in utilization may be a result of a combination of the reasons below.


Symptoms

Target Data Domain shows higher utilization than Source Data Domain.

Cause

From Avamar perspective:

Rollback
In case of a rollback on the source Data Domain, the destination Data Domain can hold extra days of data depending on the rollback time. This discrepancy will exist until the extra backups on the destination expire.

Example: DD1 replicates to DD2. Since the source was rolled back two days, there are three backups on the source but five backups already replicated to the destination.
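A minimal sketch of this effect (illustrative Python only; the dates, the DD1/DD2 names, and the rollback point are hypothetical and not the output of any Avamar or Data Domain tool):

    # Illustrative sketch: a rollback on the source leaves already-replicated
    # backups on the destination until they expire. Dates are hypothetical.
    from datetime import date

    replicated_to_dd2 = [date(2024, 9, d) for d in range(1, 6)]  # 5 backups on DD2
    rollback_point = date(2024, 9, 3)                            # source rolled back 2 days

    on_dd1 = [b for b in replicated_to_dd2 if b <= rollback_point]       # 3 backups remain
    extra_on_dd2 = [b for b in replicated_to_dd2 if b > rollback_point]  # 2 extra backups

    print(f"DD1 (source): {len(on_dd1)} backups; "
          f"DD2 (destination): {len(replicated_to_dd2)} backups "
          f"({len(extra_on_dd2)} extra until they expire)")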

Partial Replications
If a replication does not complete successfully, the data that has already been replicated is retained for a minimum of seven days and then removed by Data Domain cleaning. Partial replications contain data and fingerprints that allow a subsequent retry of the replication to run faster.
The partial replication overhead can be as high as the full amount of replicated data if a replication fails right before it completes.

Difference in retention
In the Avamar server configuration, it is possible to keep the replicas on the destination server for longer than on the source. This causes differences in capacity utilization.

Avamar configuration differences
A checkpoint backup of the Avamar server can be significantly large. If it is configured only on the destination Avamar, it increases the utilization of the Data Domain on the destination.

From Data Domain perspective:

Fingerprint.

When data is sent to the Data Domain during replication, it is deduplicated. A fingerprint of the data is sent to the destination Data Domain first to check whether the destination already has the data.

  • If the destination Data Domain reports that the fingerprint is present, the data does not need to be re-sent.

  • If the destination Data Domain reports that the fingerprint is not found, it means one of the following:

    • The fingerprint is not present on the destination.

    • The destination Data Domain has the fingerprint but wants the data to be sent anyway to improve the spatial locality on the Data Domain.

    • The Data Domain is busy and does not complete the full fingerprint lookup.

If duplicate data is sent to the Data Domain, it is deduplicated during cleaning by removing the extra copies of the data.
The destination Data Domain will have higher utilization, but the variation should not be large.
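A simplified sketch of this fingerprint pre-check (illustrative Python; the function names and the in-memory index are hypothetical and do not represent the actual Data Domain replication protocol):

    # Illustrative model of fingerprint filtering during replication.
    # The real protocol is internal to Data Domain; names here are hypothetical.
    import hashlib

    def fingerprint(chunk: bytes) -> str:
        return hashlib.sha1(chunk).hexdigest()

    def replicate(chunks, destination_index, force_send=frozenset()):
        """Send only chunks whose fingerprints the destination does not report having.

        force_send models the cases where the destination asks for the data anyway
        (to improve spatial locality, or because it was too busy to finish the lookup);
        such duplicate copies are removed later by Data Domain cleaning.
        """
        bytes_sent = 0
        for chunk in chunks:
            fp = fingerprint(chunk)
            if fp not in destination_index or fp in force_send:
                destination_index.add(fp)
                bytes_sent += len(chunk)
        return bytes_sent

    # Usage: replicating the same chunks twice sends data only the first time.
    dest_index = set()
    chunks = [b"chunk-a" * 1024, b"chunk-b" * 1024]
    print(replicate(chunks, dest_index))  # full amount sent
    print(replicate(chunks, dest_index))  # 0, fingerprints already on the destination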

Metadata overhead.
Each backed-up file comes with its file information metadata and a fingerprint for each of its chunks.

Example: For a 1 TB file, the utilization cost is 0.3% of the file size.

For an average 8 kB chunk of data there is 82 B of metadata. This is about 0.01% overhead for post-comp capacity.
This overhead increases further with Avamar integration, since Avamar combines backups to obtain a synthetic full backup from incrementals each time a backup completes.
We also observe that metadata overhead increases when backups are skipped or when data is replicated out of order.
The only backups that do not create this overhead are VM backups, where the metadata cost is minimized.

Example: When a backup is replicated out of order, it creates an L0 (full) backup on the target, which has a much larger metadata overhead than an Inc (incremental). Let's say we have five days' worth of backups.

Replication Oldest to Newest:

The first replication will be L0, and all subsequent replications will be Inc.
1xL0 + 4xInc

Replication Newest to Oldest:

All replications will be L0 because the previous day (n-1) is not yet available on the target to base an Inc on.
5xL0

Replication skips a backup:

Let's say the backup on day 3 was skipped. Day 1 is L0, Day 2 is Inc, Day 4 is again L0, and Day 5 is Inc.
L0+Inc+L0+Inc
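The three orderings above can be summarized with a small sketch (illustrative Python; it only encodes the rule described in this section, that a replicated backup lands as Inc when the previous day's backup is already on the target and as L0 otherwise):

    # Illustrative: classify replicated backups as L0 (full) or Inc (incremental)
    # based on whether the previous day's backup is already present on the target.
    def classify(replication_order, skipped=frozenset()):
        on_target = set()
        kinds = {}
        for day in replication_order:
            if day in skipped:
                continue
            kinds[day] = "Inc" if (day - 1) in on_target else "L0"
            on_target.add(day)
        return [kinds[d] for d in sorted(kinds)]

    print(classify([1, 2, 3, 4, 5]))               # ['L0', 'Inc', 'Inc', 'Inc', 'Inc']
    print(classify([5, 4, 3, 2, 1]))               # all 'L0'
    print(classify([1, 2, 3, 4, 5], skipped={3}))  # ['L0', 'Inc', 'L0', 'Inc']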


File Tracking
Data Domain must know how to build each file from its deduplicated chunks. If the Data Domain does not have this information, it must rebuild it and re-create the fingerprint chain, which might cause a significant increase in capacity.
There are two scenarios that can cause a significant increase in capacity on the destination Data Domain:


1. File tracking is lost:

Example: If the destination Data Domain is set up in DNS with multiple IPs and the IPs are distributed in round-robin fashion, the source Data Domain connects to a different IP each time. The copy of data sent yesterday is not recognized, so more data is sent, which also increases the metadata cost.
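A quick way to check from the source side whether the destination hostname resolves to more than one address (illustrative Python using standard DNS resolution; the hostname below is a placeholder):

    # Illustrative check: a destination hostname that resolves to several addresses
    # hints that round-robin DNS may be in use. The hostname is a placeholder.
    import socket

    host = "dd-destination.example.com"
    addresses = sorted({info[4][0] for info in socket.getaddrinfo(host, None)})

    if len(addresses) > 1:
        print(f"{host} resolves to multiple addresses: {addresses}")
        print("Round-robin resolution can break file tracking on replication.")
    else:
        print(f"{host} resolves to a single address: {addresses}")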

2. File tracking is not enabled:
Example: SFS_BFT_ENABLED needs to be set to true to ensure that Base File Tracking can synthesize backups on the destination system. This allows incoming replications to be optimized for storage. If SFS_BFT_ENABLED is set to false, the data saved to the final backup location on the DD is equal to the pre-comp incoming data.

This issue may occur when SFS_BFT_ENABLED is left set to false after an Avamar server rollback is complete.

This may result in a very large discrepancy.  The space is reclaimed once the backups expire.

In-line dedupe
Data Domain may request duplicate data of up to 6% of the logical size of the data in order to optimize its in-line deduplication. For example, for 10 TB of logical data this can amount to roughly 600 GB of duplicate data.

Difference in the deduplication and compression.
Each Data Domain performs its own deduplication and compression of data on its local storage independently. Depending on how the data is stored on the destination, the results will not be equal, causing a difference in utilization.

Data Domain cleaning
If the source and destination Data Domains run cleaning on different days, or if one of the Data Domains runs it more often or for longer, there will be a discrepancy in the capacity utilized.

Resolution

Best Practices:

Since there will be discrepancies in utilization between the two Data Domain systems (source and destination), the following best practices can help minimize the difference:

  1. Minimize the possibility of rollback by attending to hfscheck failures and hardware failures as soon as they occur.

  2. Ensure that replications are completing successfully. If there is an ongoing issue with replications completing, reach out to Dell Technologies Support to review the configuration.

  3. If you need to keep the two Data Domains at similar utilization, keep the same retention on the source and target copies of the backups, and ensure the checkpoint backup is configured the same on both Avamar servers.

  4. Ensure replications always run oldest to newest and that no backups are skipped.

  5. If Data Domain is configured with multiple IPs, ensure that the IPs are not distributed in round-robin fashion.

  6. Have both Data Domain systems, source and destination, run cleaning on the same day and time.

  7. Have SFS_BFT_ENABLED set to true. This must be enabled by Technical Support (raise an SR and reference this KB# - 182755).

Affected Products

Avamar Server
Article Properties
Article Number: 000182755
Article Type: Solution
Last Modified: 20 Sept 2024
Version:  6