Data Domain: Best Practices for Directory and Pool Replication

Summary: Best Practices for Directory Replication

This article is not tied to any specific product. Not all product versions are identified in this article.

Instructions

Best Practices for Directory Replication

PURPOSE

This article defines best practices for configuring directory replication.

APPLIES TO

  • All Data Domain systems
  • All Software Releases

RECOMMENDATIONS

  1. Spread the workload across as many contexts as possible.
    Ideal single-context precompressed throughput is in the 200-300 MB/s range. In configurations where multistreaming is available, ideal single-context performance is similar to ideal multicontext performance; however, several variables limit the effectiveness of multistreaming:

    • If the source DDR has many replication contexts, the logic that divides multistreaming streams among contexts limits the available number of streams.
    • Multistreaming is not active during a snapshot-based initialize/recover. By default, a snapshot-based initialize is used if the source context has more than 1 million entries.
    • Multistreaming for replicating CIFS data was introduced in DD OS 5.0.

    The ideal multicontext precompressed throughput by model varies from around 200MB/sec to 500MB/sec or more.
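
    To make the throughput figures above concrete, here is a minimal sketch (not a Data Domain tool; the 10 TB dataset size and the 250/500 MB/s rates are assumed values picked from the ranges quoted above) that estimates how long seeding a dataset would take at single-context versus multicontext rates:

    ```python
    def replication_hours(dataset_gb: float, throughput_mb_s: float) -> float:
        """Hours to move dataset_gb at a sustained precompressed rate of
        throughput_mb_s (MB/s). Purely illustrative back-of-envelope math."""
        return dataset_gb * 1024 / throughput_mb_s / 3600

    dataset_gb = 10_000  # hypothetical 10 TB precompressed dataset
    for label, rate in [("single context @ 250 MB/s", 250),
                        ("multicontext   @ 500 MB/s", 500)]:
        print(f"{label}: {replication_hours(dataset_gb, rate):.1f} h")
    ```

    The point of the arithmetic: halving or doubling the effective per-context rate changes seeding time proportionally, which is why spreading the workload across contexts matters.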

  2. Design the workload with moderately sized files.
    File size can have a significant impact on the overall performance of any replication context. In general, files smaller than 10 MB cannot be replicated efficiently.

    Also, when a replication pair reconnects after an unexpected disconnect, the source must restart from the beginning of the file that was being replicated at the time of the disconnect. If the file is very large and there are frequent disconnects (for example, due to an unreliable network), replication can effectively become "stuck" trying to replicate the same file over and over again. This is most commonly seen with files exceeding 100 GB in size. Aside from this restart behavior, file size itself has no performance implication.
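
    The restart-from-byte-zero behavior can be modeled with a short sketch. Assuming (purely for illustration) that disconnects arrive as a memoryless Poisson process with a given mean uptime, the expected number of full transfer attempts grows exponentially with the file's transfer time; the 50 MB/s link rate and one-hour mean uptime below are assumed example values:

    ```python
    import math

    def expected_attempts(file_gb: float, throughput_mb_s: float,
                          mean_uptime_s: float) -> float:
        """Expected number of from-scratch transfer attempts for one file,
        assuming memoryless (exponential) disconnects with the given mean
        uptime. Each retry restarts at byte 0, as described above."""
        transfer_s = file_gb * 1024 / throughput_mb_s
        return math.exp(transfer_s / mean_uptime_s)

    for gb in (1, 100):
        print(f"{gb:>4} GB file: ~{expected_attempts(gb, 50, 3600):.2f} attempts")
    ```

    Under these assumptions a 1 GB file barely notices hourly drops, while a 100 GB file needs noticeably more attempts on average; with much shorter uptimes the expected attempt count for large files explodes, which is the "stuck" condition described above.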

  3. Design the workload to take advantage of replication scheduling.
    Files are queued for replication when they are closed internally. When a modified file is closed, a replication log "close" record is generated for the file, and replication queues the new data in the file for sending. If there are no other replication operations ahead of it in the queue (that is, unprocessed log records), the new data is sent immediately; otherwise, the file is replicated after the earlier log records are processed.

    The timing of file closes is as follows:

    • 10 minutes after the last access, NFS closes the file.
    • All files are closed every hour, regardless of how recently they were written.
    • If many files are being accessed or written, files may be closed sooner than the above rules dictate. Backup software that writes files in smaller fragments (say, 1 MB) may cause replication to initiate sooner because of the number of files being generated.
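    The first two close rules above combine into a simple upper bound on how long an idle NFS file waits before it is closed and queued. A minimal sketch of that bound (assuming a fixed 3600-second sweep period, and ignoring the "many files" early-close case):

    ```python
    def max_close_delay_s(seconds_until_hourly_sweep: float) -> float:
        """Upper bound (seconds) on how long an idle NFS file stays open
        before being closed and queued for replication: 10 minutes after
        the last access, or at the hourly sweep, whichever comes first."""
        return min(600.0, seconds_until_hourly_sweep)

    print(max_close_delay_s(2400))  # sweep far away -> 10-minute rule wins
    print(max_close_delay_s(120))   # sweep imminent -> hourly sweep wins
    ```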
  4. Use a dedicated network if possible.
    Packet loss rates as low as 0.1% can severely degrade network throughput, particularly on high bandwidth-delay networks. For networks with bandwidth at or below T2, RTTs (round-trip times) of up to one second provide good throughput. For networks at T3 or above, throughput degrades significantly starting at an RTT of 300-500 ms.

    More generally, throughput under packet loss is approximately:

    Throughput = MSS / (RTT * sqrt(p))

    where:
    MSS = maximum segment size (typically 1460 bytes)
    RTT = round-trip time
    p   = probability of packet loss
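
    This formula can be evaluated directly. A minimal sketch (the 100 ms RTT and 0.1% loss figures are example inputs, not values from the article):

    ```python
    import math

    def tcp_throughput_mbps(rtt_s: float, loss_prob: float,
                            mss_bytes: int = 1460) -> float:
        """Approximate TCP throughput bound, MSS / (RTT * sqrt(p)),
        converted from bytes/sec to megabits/sec."""
        return mss_bytes / (rtt_s * math.sqrt(loss_prob)) * 8 / 1e6

    # 100 ms round trip with 0.1% packet loss caps a single stream at
    # roughly 3.7 Mbps, no matter how much raw bandwidth the link has:
    print(f"{tcp_throughput_mbps(0.100, 0.001):.1f} Mbps")
    ```

    Note how the bound depends only on RTT and loss, which is why even small loss rates on long-haul links dominate replication throughput.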

  5. Evaluate Delta Replication (Low-Bandwidth Optimization).
    In DD OS 4.8 and later, delta replication, also called "low-bandwidth optimization," can increase the virtual throughput of directory or pool replication across links with less than 6 megabits per second (Mbps) of available bandwidth. Delta replication incurs significant additional CPU and I/O overhead on both the source and destination Data Domain systems; if low-bandwidth optimization is enabled across links with more than 6 Mbps of bandwidth, it is unlikely that any gain in virtual throughput will be realized. Generally speaking, if:

    • The data to be replicated is less than 96% identical to data already existing on the destination system
    • There are less than 6 Mbps of available bandwidth
    • Both systems have spare CPU and I/O capacity

      Low-bandwidth optimization should be enabled. Monitor the output of "replication show history" over several weeks. The "Low-bw-optim" ratio should average 2.00 or more, and the network throughput (Network bytes divided by time interval) should not be much less than the available bandwidth. If the "Low-bw-optim" ratio does not average 2.00 or more, then delta compression is probably not effective on the dataset and should be disabled. If the network throughput is much less than the available bandwidth, then most likely one or both Data Domain systems do not have enough spare CPU or I/O capacity to support delta replication, and it should be disabled.
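
    The enable/keep-enabled rules above can be summarized in two small helper functions. This is an illustrative sketch, not a Data Domain utility; the 75% "not much less than available bandwidth" threshold is an assumed interpretation, since the article does not give an exact cutoff:

    ```python
    def should_enable_low_bw_optim(pct_identical: float, link_mbps: float,
                                   spare_cpu_io: bool) -> bool:
        """Rule of thumb above: enable delta replication only when the data
        is less than 96% identical to what is already on the destination,
        the link has less than 6 Mbps available, and both systems have
        spare CPU and I/O capacity."""
        return pct_identical < 96 and link_mbps < 6 and spare_cpu_io

    def keep_enabled(low_bw_optim_ratios: list, net_mbps: float,
                     link_mbps: float) -> bool:
        """Monitoring guidance above: the Low-bw-optim ratio from
        'replication show history' should average 2.00 or more, and network
        throughput should stay close to the available bandwidth (here,
        >= 75% of it -- an assumed threshold)."""
        avg = sum(low_bw_optim_ratios) / len(low_bw_optim_ratios)
        return avg >= 2.0 and net_mbps >= 0.75 * link_mbps
    ```

    For example, a link with 4 Mbps available, 90%-identical data, and spare capacity passes the enable check, while a sustained Low-bw-optim ratio averaging 1.35 fails the keep-enabled check and suggests disabling the feature.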

  6. Follow Best Practices for other components and third-party backup applications.
    Our best practices guides are written with overall performance in mind. Deviations from Data Domain's suggested best practices can have significant performance implications across several areas, though the impact may not be immediately obvious.

REFERENCE

Troubleshooting Replication Lag (Dell KB article 180482)

 

Affected Products

Data Domain

Products

Data Domain
Article Properties
Article Number: 000012092
Article Type: How To
Last Modified: 01 Sep 2025
Version:  6