Avamar: Information to collect to troubleshoot Avamar Replication Performance Issues (Resolution Path)
Summary: This article should be used for collecting the initial information for troubleshooting Avamar (and Avamar with Data Domain Integration) Replication Performance issues.
Instructions
For general replication performance troubleshooting, the starting point should include background and concepts, items to test, and changes to perform.
See the Resolution Path article Avamar: Replication Performance Troubleshooting and Tuning (Resolution Path) for these topics.
This article covers only the information to collect for replication performance issues, including replication timeout-end. It does not cover replication failures caused by non-performance configuration issues.
For the information to collect on non-performance replication issues, see Avamar: How to gather information to troubleshoot replication issues
Most of the following items require discussion between the technical support teams and the customer.
If any of the information listed in this article is missing, troubleshooting effort and time to resolution may increase, depending on the issues found.
Reference login articles:
Using SSH (SSH) to log in to a remote Data Domain without giving a password [with data domain]
General Environment Questions and Information:
- Discuss with the customer the physical locations (cities, countries) of the source and target sites and the distance in miles between them.
- Discuss and document the goal of the performance tuning, or what specifically must be fixed (beyond performance).
- (Example: To catch up on X days behind after rollback or failures, to complete initial replication seeding for the first time, to complete a root2root (R2R) migration being performed by Professional Services and so forth)
- Configuration design:
One-to-one, One-to-many, Many-to-one, Cross replication, Cascading replication, or Other
- From the source and target Avamar (and DD server if applicable), get grid hostname, version, and capacity:
- For Avamar: Run the status.dpn command on all related Avamar server utility nodes, Avamar Virtual Editions (AVE), or single nodes. See Avamar: How to understand the output generated by the status.dpn command
- For Data Domain on every Avamar server: Run the "mccli dd show-prop" command. This DD information can also be gathered from replication logs or from ddrmaint commands. See the appendix at the end of this article.
- Hardware type (including DD if applicable):
- What is the hardware type and version? This can affect the number of streams, and the amount of disk I/O ingest.
- What is the capacity per node, and the total backup data capacity overall? (This is important to know because it helps estimate how much data there might be to replicate or catch up on.)
- For Data Domain, this info is found in replication logs or from ddrmaint commands.
- For Dell Support, all Avamar and DD hardware types can be found on Avalanche and the Autosupport (ASUP) if they are configured for Email Home
- The following command can also be run on the Data Domain:
system show model
- Network: This is NOT a section for testing speed, but a discussion between Dell Support and customers regarding:
- The customer expectations of the network speed and expectations for replication
- Is the replication network shared by other applications or usages
- Ask the customer if they currently have a dedicated secondary network for replication (or plan to potentially configure in the future)
- If yes, what are the internal and external IP addresses for source and target
- If a Data Domain is involved, check whether anything other than the Avamar grid under review is also replicating to that same Data Domain
- If yes, are there multiple Avamar grids, or other backup solutions
- If yes, are they simultaneous, or staggered
- The amount of data
- Determine if there are any customer firewalls or QoS network throttles outside of the Avamar product configured or present
- Does the customer have any WAN Accelerators on their network?
- NOTE: If WAN Accelerators are present, this can become apparent at a later testing step, when iperf shows fast results but no other data transmission is nearly as fast. Iperf is a simple Linux-based "network speed test tool," and its traffic is both very compressible and deduplicatable. Real client backup data is far less so, since it is already compressed and deduplicated before it is replicated over the network.
- On Avamar, incorrect usage of WAN Accelerators can make tuning the replication performance more difficult. While they can inaccurately inflate performance test results from iperf alone, they often do not help Avamar replication at all. More often, they make performance tuning more difficult and time-consuming. Discuss further with Avamar Support about the limitations and possible harm to performance tuning, as dedup/compression-type WAN Accelerators provide zero performance benefit to Avamar-only traffic and can slow the performance tuning procedure as well.
- For Data Domain, the presence of a WAN accelerator on the network can adversely impact replication performance. Verify with the network administrator whether a WAN accelerator exists within the Data Domain system network. While working with the network administrator and confirming minimum impact on the overall network, disable the WAN accelerator. This should be done as a limited test. Reference the Data Domain article Data Domain: Analyzing Slow Replication Issues [on DD].
- For WAN Accelerators that are used to address unsupported levels of network ping latency and that communicate over User Datagram Protocol (UDP), discuss the possible benefit with your Dell Technologies Account Team or Avamar Support. Where possible, normal performance tuning using this Resolution Path, without accelerators, should resolve most ping latency issues.
- Customer Requirements:
- What are the Service Level Objective (SLO) and Service Level Agreement (SLA) requirements in terms of customer backup data, protection, and environment?
- Must all backups be replicated?
- Are older backups skipped, or is that a possibility?
- Are only certain clients replicated?
- (and so on)
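The command output requested earlier in this list (status.dpn and "mccli dd show-prop" on the Avamar side, "system show model" on the Data Domain) can be captured in one pass and attached to the case. The following is only a minimal sketch, not an official collection tool: the output file name, the run_and_log helper, and the DD_HOST and DD_USER variables are hypothetical names chosen for illustration, and the Data Domain step assumes SSH access to the Data Domain has already been set up as described in the login article referenced above.

```shell
#!/bin/bash
# Sketch: capture the output of the commands named in this article into one file.
# OUT and run_and_log are hypothetical names used only for this example.
OUT="${OUT:-/tmp/repl_perf_info_$(uname -n)_$(date +%Y%m%d_%H%M%S).txt}"

run_and_log() {
    # Write a header, then the command output. Record missing commands and
    # non-zero exit codes instead of stopping the collection.
    echo "===== $* =====" >> "$OUT"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" >> "$OUT" 2>&1 || echo "(command exited with status $?)" >> "$OUT"
    else
        echo "(command not found on this host: $1)" >> "$OUT"
    fi
    echo >> "$OUT"
}

run_and_log uname -n
run_and_log date
run_and_log status.dpn
run_and_log mccli dd show-prop

# Optional: run "system show model" on the Data Domain over SSH.
# DD_HOST and DD_USER are assumptions; set them to match the environment.
if [ -n "${DD_HOST:-}" ]; then
    run_and_log ssh "${DD_USER:-sysadmin}@${DD_HOST}" system show model
fi

echo "Collected output written to $OUT"
```

Run the sketch on the utility node (or AVE or single node) of each source and target Avamar server so that one file per server is produced. On a host where a command is not available, the file records that fact rather than failing.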
More specific configuration questions:
- Client account general knowledge to help with configuration impacts. From discussion with the customer, roughly:
- How many clients exist on the server in total (If only a subset is replicated, how many)
- What are the different client plugin types (file system, Exchange, NDMP, and so on)
- What are the largest sizes of client backups in general
If there is any uncertainty, validate this information beyond discussion, as client backup size can be a limiting factor, especially depending on the backend type (gsan vs. DD). Try running the "Bytes Protected Client 2" report in the Avamar Admin UI: select a date range covering the last couple of days (in case a backup has not yet run in the previous day) and sort the output by size. Instructions for running reports can be found in the current Avamar Administration Guide Technical Note.
- If DD integration is present, determine the backend storage (Avamar vs. Data Domain) of the very large clients above, by type and size. For example, are NDMP clients all backed up to Data Domain but file system clients all to the Avamar backend? Is the backend dependent on size, mixed, or a random pattern?
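When discussing the largest client backups against the customer's network speed expectations, a rough transfer-time figure can help frame the conversation. The sketch below is plain arithmetic, not an Avamar or Data Domain tool: estimate_hours is a hypothetical helper, the sizes are decimal gigabytes, the link speed is in megabits per second, and the optional efficiency factor (the fraction of the link actually usable) is an assumption to be agreed with the customer. The amount of data that actually crosses the network is not known precisely in advance, so treat the result only as an order-of-magnitude guide.

```shell
# estimate_hours DATA_GB LINK_MBPS [EFFICIENCY]
# Prints the hours needed to move DATA_GB (decimal GB) over a LINK_MBPS link,
# assuming only the EFFICIENCY fraction of the link is usable (default 1).
estimate_hours() {
    awk -v gb="$1" -v mbps="$2" -v eff="${3:-1}" 'BEGIN {
        if (mbps <= 0 || eff <= 0) { print "invalid input"; exit 1 }
        bits = gb * 8 * 1000 * 1000 * 1000        # decimal GB to bits
        secs = bits / (mbps * 1000 * 1000 * eff)  # usable bits per second
        printf "%.1f\n", secs / 3600
    }'
}

estimate_hours 1000 100        # 1000 GB over 100 Mbps, full link: prints 22.2
estimate_hours 1000 100 0.5    # same data with half the link usable: prints 44.4
```

If the estimate for a single large client already exceeds the available replication window, that is worth raising in the discussion of customer requirements before any tuning starts.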