Avamar

Last reply by 06-11-2021 Solved

Duplicate data on 2 VMs in a Data Domain

Ok.

I'm new to the Avamar world.

Let's say I have serverA, which has 1 TB of data. It has been backing up to a Data Domain for 3 years.

Then server2 is set up as a new server, and the 1 TB of data is copied to server2.

How does Avamar treat the duplicate data on server2?

Is it scanned and backed up as new data, or is it recognized as duplicate data and not backed up again?

Basically, I want to know whether the same data on 2 different VMs is treated as new data or as already backed up.

So if I delete the serverA backups, would I save a bunch of space, since the data has already been moved to server2?

Thanks

Accepted Solutions

For guest-level backups, the data will be de-duplicated.

For image-level backups, Avamar uses Data Domain's fixed-size segmentation mode for performance reasons. If the data is allocated on disk along the same block boundaries, it will de-duplicate very well (though not perfectly). However, if the data is written to the new VM at an offset relative to the copy on the other VM, de-duplication will be poor, because the shift in alignment makes all the blocks on the new VM look like new data to the de-duplication algorithm.
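
To make the alignment point concrete, here is a minimal sketch of fixed-size block de-duplication. It is only an illustration of the general idea, not Avamar's or Data Domain's actual implementation: the 4 KiB block size, the SHA-256 fingerprints, the 512-byte shift, and the helper names `block_hashes` and `dedup_ratio` are all assumptions made up for this example.

```python
import hashlib
import random

BLOCK_SIZE = 4096  # illustrative only, not Data Domain's real segment size


def block_hashes(data, block_size=BLOCK_SIZE):
    """Cut data at fixed offsets and return one SHA-256 digest per block."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def dedup_ratio(stored, incoming):
    """Fraction of incoming blocks whose fingerprint is already stored."""
    known = set(block_hashes(stored))
    incoming_blocks = block_hashes(incoming)
    hits = sum(1 for digest in incoming_blocks if digest in known)
    return hits / len(incoming_blocks)


# Random bytes stand in for the data on serverA (hypothetical payload).
rng = random.Random(0)
payload = bytes(rng.getrandbits(8) for _ in range(64 * BLOCK_SIZE))

aligned = payload                  # copy sits on the same block boundaries
shifted = b"\x00" * 512 + payload  # copy starts 512 bytes later on the new disk

aligned_ratio = dedup_ratio(payload, aligned)
shifted_ratio = dedup_ratio(payload, shifted)

print(f"aligned copy: {aligned_ratio:.0%} of blocks already stored")
print(f"shifted copy: {shifted_ratio:.0%} of blocks already stored")
```

In this toy model the aligned copy matches every stored block, while the shifted copy matches none, even though the underlying bytes are the same. Real disk images behave less starkly, but the mechanism is the same one described above.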

Note that DDBoost will sometimes send duplicate segments on purpose so that data is stored contiguously on disk on the appliance. This is done for performance reasons. These duplicate segments are cleaned up automatically during the weekly cleaning cycle.

If the data was de-duplicated, deleting serverA will not recover very much (if any) space. If you review the Avamar DPN Summary report for the time period when the data was copied to server2, you should be able to see whether there was a large jump in the bytes sent value. If there was a large jump, deleting serverA may recover some space. If there wasn't a significant ingest of new bytes, you'd be better off keeping the backups for serverA around since they're not costing you much additional capacity.
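
If you export the per-day figures into a list, a quick check like the one below can flag a jump. This is a rough sketch under stated assumptions: the `(day, bytes)` pairs, the sample numbers, the `find_ingest_jumps` helper, and the "10x the median" threshold are all hypothetical and do not reflect the actual layout of the DPN Summary report.

```python
from statistics import median

GB = 1024 ** 3


def find_ingest_jumps(daily_new_bytes, factor=10.0):
    """Return the (day, bytes) entries that exceed factor x the median value."""
    baseline = median(value for _, value in daily_new_bytes)
    return [
        (day, value)
        for day, value in daily_new_bytes
        if value > factor * baseline
    ]


# Made-up sample values for server2, purely for illustration.
sample = [
    ("day 1", 5 * GB),
    ("day 2", 4 * GB),
    ("day 3", 950 * GB),  # the day the 1 TB copy was first backed up
    ("day 4", 6 * GB),
    ("day 5", 5 * GB),
]

jumps = find_ingest_jumps(sample)
for day, value in jumps:
    print(f"{day}: {value / GB:.0f} GB of new bytes, well above the usual level")
```

A flagged day around the time of the copy would suggest the data did not de-duplicate well, in which case deleting the serverA backups may free some space. No flagged day suggests the copy mostly de-duplicated.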
