Data Domain: Replication Resync Explanation
Summary: This article explains how a replication resync determines what data to send across the network.
Instructions
PURPOSE
- This article explains how "replication resync" determines what data to send across the network. For the procedure on how to perform a resynchronization, see the article: Data Domain: How to Break and Resync Directory Replication.
APPLIES TO
- All Data Domain (DD) Models
- Directory Replication
- All software releases 4.3 and above
SOLUTION
Resync is nearly identical to initialize, and can even be specified in lieu of initialize for directory replication contexts, with the following differences:
The replica directory is *not* required to be empty. Internally, the destination moves the existing files out of the way.
(In 4.3 to 4.5 inclusive, this is by renaming these files to a $destdir/.ddrsaved/ directory. In 4.6+, it is by making a snapshot then deleting the files from the replication context $destdir.)
For each file in the source directory, resync checks whether the same relative path exists on the destination (looking in $destdir/.ddrsaved or in the snapshot).
If the file exists, it checks if the destination file is identical to the source file.
If it is identical, the existing replica file is hard linked into place with no further requirement to filter or send any of its contents.
The file identity check happens in constant time.
The check succeeds if the replica file was created by directory replication in release 4.3 or later.
This implies that seeding can be accomplished by running collection replication (convenient if both Data Domain systems are on the same LAN), then breaking collection replication, and then performing a directory replication resync.
If a replica file is not found or the content does not match, the file is replicated normally, that is, filtered on a segment by segment basis.
The whole point of resync is to avoid this filtering where possible.
A potential pitfall is that users or application logic may move or rename originator or replica files after replication is broken but before resync.
The pathname lookup in resync then finds no match, and each affected file must be fully filtered.
Since the segments exist on the destination, high replication compression is achieved, but an opportunity is missed to avoid sending the segment references and filtering them in the first place.
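The per-file decision described above can be illustrated with a minimal Python sketch. It is not Data Domain code: the names FileEntry, identity, and plan_resync are hypothetical, and the identity string is only a stand-in for whatever constant-time comparison resync performs internally. The sketch shows why a rename defeats the shortcut: the lookup is keyed on the relative path, so a renamed file with unchanged content still falls through to normal filtering.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class FileEntry:
    # Hypothetical stand-in for the constant-time identity that resync compares.
    identity: str


def plan_resync(
    source: Dict[str, FileEntry],
    saved: Dict[str, FileEntry],
) -> Tuple[List[str], List[str]]:
    """Split source paths into files that can be hard linked into place
    and files that must be replicated normally (segment-by-segment filtering).

    `saved` models the files moved aside on the destination
    ($destdir/.ddrsaved or the snapshot), keyed by relative path.
    """
    to_link: List[str] = []
    to_filter: List[str] = []
    for path, entry in sorted(source.items()):
        old = saved.get(path)  # lookup is by relative path only
        if old is not None and old.identity == entry.identity:
            to_link.append(path)    # identical: reuse the existing replica file
        else:
            to_filter.append(path)  # missing or different: replicate normally
    return to_link, to_filter


source = {
    "a/1.img": FileEntry("id-1"),
    "a/2.img": FileEntry("id-2"),
    "b/3.img": FileEntry("id-3"),
}
saved = {
    "a/1.img": FileEntry("id-1"),          # unchanged since the break
    "a/2.img": FileEntry("id-2-changed"),  # content differs
    "b/renamed.img": FileEntry("id-3"),    # same content, renamed on the replica
}

linked, filtered = plan_resync(source, saved)
print(linked)    # ['a/1.img']
print(filtered)  # ['a/2.img', 'b/3.img']
```

In this example, b/3.img is filtered even though identical content exists on the destination under another name, which is the missed opportunity described above.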
- In 4.5, resync fails if the replica has, or ever had, retention lock enabled.
- In 4.6 and above, there are some additional requirements related to retention-locked files on the replica. In essence, any retention-locked file on the destination must also exist and must have matching content and attributes on the originator. This is to prevent replication from attempting to remove retention-locked files on the replica.
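The 4.6+ requirement can be modeled as a simple pre-check, sketched below in Python. This is an illustration only, under stated assumptions: the ReplicaFile type, its content_id and attributes fields, and the retention_lock_conflicts function are hypothetical, and the article does not specify which attributes are compared or how the system reports a violation.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ReplicaFile:
    content_id: str                 # hypothetical stand-in for file content identity
    attributes: Tuple[str, ...]     # hypothetical; the compared attributes are not specified
    retention_locked: bool = False  # only meaningful for destination files here


def retention_lock_conflicts(
    originator: Dict[str, ReplicaFile],
    replica: Dict[str, ReplicaFile],
) -> List[str]:
    """Return destination paths that violate the rule: every retention-locked
    file on the destination must exist on the originator with matching
    content and attributes."""
    conflicts: List[str] = []
    for path, rfile in sorted(replica.items()):
        if not rfile.retention_locked:
            continue  # files without retention lock are not constrained by this rule
        ofile = originator.get(path)
        if (
            ofile is None
            or ofile.content_id != rfile.content_id
            or ofile.attributes != rfile.attributes
        ):
            conflicts.append(path)
    return conflicts


originator = {
    "x": ReplicaFile("c1", ("0644",)),
    "y": ReplicaFile("c2", ("0644",)),
}
replica = {
    "x": ReplicaFile("c1", ("0644",), retention_locked=True),      # matches
    "y": ReplicaFile("c2-old", ("0644",), retention_locked=True),  # content differs
    "z": ReplicaFile("c3", ("0644",), retention_locked=True),      # missing on originator
    "w": ReplicaFile("c4", ("0644",)),                             # not locked, ignored
}

conflicts = retention_lock_conflicts(originator, replica)
print(conflicts)  # ['y', 'z']
```

An empty result corresponds to the case where resync would never need to remove a retention-locked file on the replica.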