
Avamar: Troubleshooting slow backup performance

Summary: This article breaks down Avamar backup performance into its component parts. It provides practical guidelines on how to investigate a slow Avamar backup, identify bottlenecks, and mitigate their effects. ...


Article Content


Symptoms

This article focuses on:
  • Avamar clients which back up file systems or databases, to an Avamar server or Data Domain back-end.
  • L1 backups where the initial backup has completed and a full backup is present on the Avamar server.


Why optimize client backup performance?

      
    Typical symptoms of slow backup performance:
    • Backup fails to complete within the scheduled window. The activity monitor reports "Client time out - end"
    • Backup does not get a chance to start before the scheduled window ends. The activity monitor reports "Client time out - start"
    • Garbage collection regularly fails with MSG_ERR_BACKUPSINPROGRESS or MSG_ERR_TRYAGAINLATER 


    Understanding what happens during an Avamar backup from a performance perspective

    A detailed explanation of what happens in the background to influence Avamar client backup performance and behavior can be found in:

     

      Cause

      See Resolution for a list of causes.

      Resolution

      Gather information:
      Gather detailed information about the issue: 


      Determine which part of the backup chain has the most severe bottleneck:
      The following schematic shows the main components in a backup system.  
      Diagram showing Avamar backup chain from the backup data through data storage, the avamar client, the network and finally the Avamar and Data Domain servers.

      Bottlenecks ALWAYS exist, but we should work to understand where they are.
      If we can do this, and mitigate the bottleneck, performance should improve. 
      Once a bottleneck is mitigated, another bottleneck may become apparent. Our end goal is to reach a situation where backup duration is acceptable. 


      Avamar server-side bottlenecks:
      If ALL backups to an Avamar server are slow, consider the possibility of a server-side issue.  
      If ALL backups to an Avamar server are slow during certain times of day, consider server-side contention or a network bottleneck.
      If there is a performance issue with one or a few backup clients, focus on each client by itself.


      Server Health:
      A healthy Avamar server is unlikely to be a bottleneck for backups.

      Check the health of the backup server.  


      Avamar restricts client connections to preserve acceptable levels of performance.
      See Avamar: How many simultaneous client sessions can be made to the Avamar server? (versions 6.1 and later)  

      Server Contention:

      If there are times of day when backup performance is poor, this might indicate contention.

       
      • Arrange maintenance and backup schedules so they do not overlap.
      • Review the output of the status.dpn and top commands to check the load on the data nodes
      • Run mapall 'iostat -x' on the data nodes. Check %iowait, %idle, and %util to see whether the I/O bandwidth of any disk is saturated.
      • To isolate a particular client's performance, test the backup when the Avamar server is not performing maintenance tasks or other backups or replication.


      Data Domain backup ingestion performance:
      Log in to the Dell Support portal and review:



      Network side bottlenecks:
      The network may be a bottleneck if a client is backed up over a WAN.

      Network latency:
      This affects the rate at which clients can check if hashes are present on the Avamar server.  

      • Run ping from the client to the Avamar server and check the network's packet loss and latency

      Network bandwidth:
      During a backup, new data must be sent over the network to the Avamar server. Check the log of a completed backup to find how much new data was sent.
      2014-11-20 04:45:30 avtar Info <5156>: Backup #1180 timestamp 2014-11-20 04:45:28, 23 files, 5 folders, 291.7 GB (23 files, 4.316 GB, 1.48% new) 
      
      If the client and server are separated by a WAN, can the link transmit the necessary data within the backup window?
      In this case, the data that must be transmitted is 4.316 GB.

      These values are all interrelated:
      • Amount of new backup data
      • Time available for backup
      • Effective network bandwidth

      Image showing that backup completion depends on amount of new data, network bandwidth and time available

      Greater amounts of new data require more network bandwidth or a longer backup time.
      These factors have practical limits but can be controlled to some degree by the user.
      Consider if any of them can be manipulated to accommodate a timely backup.
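
The relationship between new data, backup window, and bandwidth can be checked with simple arithmetic. The sketch below is illustrative only: the function names and the 8-hour window are assumptions, and it assumes that "GB" in the avtar log means 2**30 bytes and that the link is fully available to the backup.

```python
def required_mbps(new_data_gb: float, window_hours: float) -> float:
    """Minimum sustained throughput (megabits/s) to send the new data in the window."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    bits = new_data_gb * 1024**3 * 8  # assumption: 1 GB = 2**30 bytes
    return bits / (window_hours * 3600) / 1_000_000


def hours_needed(new_data_gb: float, effective_mbps: float) -> float:
    """Time (hours) to send the new data at a given effective throughput."""
    if effective_mbps <= 0:
        raise ValueError("effective_mbps must be positive")
    bits = new_data_gb * 1024**3 * 8  # assumption: 1 GB = 2**30 bytes
    return bits / (effective_mbps * 1_000_000) / 3600


if __name__ == "__main__":
    # 4.316 GB of new data is taken from the example log line;
    # the 8-hour window is a hypothetical value.
    print(f"Needed for an 8 h window: {required_mbps(4.316, 8.0):.2f} Mbps")
    print(f"Time at 10 Mbps effective: {hours_needed(4.316, 10.0):.2f} h")
```

Effective throughput over a WAN is usually lower than the nominal link speed, so treat the result as a lower bound on what the link must deliver.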


      If a network bottleneck or server communication problem is suspected:
      Confirm network throughput between the client and the backup device. 


      Enable avtar comstats logging to facilitate troubleshooting.

       

      Client-side bottlenecks:
      View the avtar backup log in a sophisticated text editor such as Notepad++.


      Ensure this is not the client's initial backup to the server:
      First-time backups are expected to be slow.

      If this is a mature client, check if the backup configuration has recently changed.


      Ensure that the backup was not prematurely canceled:
      Search the backup log for 'canceled'. Below is an example where an impatient user canceled an L1 backup.
       

      2013-11-05 12:15:29 avtar Info <5157>: PARTIAL Backup #14 timestamp 2011-11-05 12:13:36, 2,030 files, 562 folders, 397.3 MB (691 files, 17.44 MB, 4.39% new)
      2013-11-05 12:15:29 avtar Info <7539>: Label "MOD-xxxxxxxxxx", scheduled to expire 11/12/11, none backup
      2013-11-05 12:15:29 avtar Info <6083>: Backed-up 397.3 MB in 1.36 minutes: 17 GB/hour (89,593 files/hour)
      2013-11-05 12:15:29 avtar Info <7883>: Finished at 2011-11-05 12:15:29 GMT Standard Time, Elapsed time: 0000h:01m:21s
      2013-11-05 12:15:29 avtar Info <8468>: Sending wrapup message to parent
      2013-11-05 12:15:29 avtar Info <5314>: Command failed (exit code 10013: Externally canceled)
      


      In cases such as this, where a backup terminates gracefully, the data is retained as a 'PARTIAL' backup.

      Although partial backup logs indicate backup performance, proper analysis requires the log from a completed backup.
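
The search for a canceled or partial backup can be scripted when many logs need checking. This is a minimal sketch: the function name is hypothetical, and it relies only on the strings visible in the example above ("PARTIAL Backup #" and "canceled").

```python
import re


def find_cancellation(log_text: str) -> dict:
    """Report whether a log shows a PARTIAL backup and list lines mentioning cancellation."""
    partial = bool(re.search(r"\bPARTIAL Backup #\d+", log_text))
    canceled_lines = [
        line.strip()
        for line in log_text.splitlines()
        # "cancel" matches both "canceled" and "cancelled"
        if re.search(r"cancel", line, re.IGNORECASE)
    ]
    return {"partial": partial, "canceled_lines": canceled_lines}


if __name__ == "__main__":
    # Two lines copied from the example log above.
    sample = (
        "2013-11-05 12:15:29 avtar Info <5157>: PARTIAL Backup #14 timestamp 2011-11-05 12:13:36, "
        "2,030 files, 562 folders, 397.3 MB (691 files, 17.44 MB, 4.39% new)\n"
        "2013-11-05 12:15:29 avtar Info <5314>: Command failed (exit code 10013: Externally canceled)\n"
    )
    print(find_cancellation(sample))
```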


      Check the log for file cache or hash cache sizing issues:



      Check if throttling flags are passed to avtar:
      Avtar CPU or network throttling greatly reduces backup performance. 
      See Avamar: How to throttle an Avamar client's consumption of system resources (CPU, network, I/O & memory).

      This can be detected in the backup log.

      2013-09-06 14:22:13 avtar Info <6557>: Network bandwidth throttling is enabled, limiting to approx. 0.512 Mbps (62.50 KB/sec)
      2013-09-06 14:22:13 avtar Info <6558>: CPU throttling is enabled, limiting CPU usage to approx. 70%
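
A quick way to check a log for these messages is sketched below. The function name is hypothetical; the event codes <6557> and <6558> and the phrase "throttling is enabled" are taken from the example lines above, and other client versions may word the messages differently.

```python
import re

# Matches the two throttling messages shown in the example log lines.
THROTTLE_RE = re.compile(r"avtar Info <(?:6557|6558)>: (.*throttling is enabled.*)")


def throttling_messages(log_text: str) -> list:
    """Return the text of any throttling messages found in an avtar log."""
    found = []
    for line in log_text.splitlines():
        match = THROTTLE_RE.search(line)
        if match:
            found.append(match.group(1).strip())
    return found


if __name__ == "__main__":
    sample = (
        "2013-09-06 14:22:13 avtar Info <6557>: Network bandwidth throttling is enabled, "
        "limiting to approx. 0.512 Mbps (62.50 KB/sec)\n"
        "2013-09-06 14:22:13 avtar Info <6558>: CPU throttling is enabled, "
        "limiting CPU usage to approx. 70%\n"
    )
    for message in throttling_messages(sample):
        print(message)
```

An empty result suggests that no throttling flags were passed to avtar for that backup, at least in the message format assumed here.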
      


      Is there an Avamar client CPU or memory bottleneck?
      An Avamar backup runs as fast as hardware allows and competes with other services for resources. Be mindful of the client's "day job" and when it is busy. 

      Monitor the client using Task Manager or Process Explorer (on Windows) or the 'top' command (UNIX or Linux). These can reveal if CPU saturation occurs during the backup. 

      Dell has an internal "LogAnalyzer" tool which charts resource consumption and performance over time. Work with Support to use this.

      Cache files are loaded into memory during the backup. Check the client's memory usage to watch for page faults or clues that the client is short of RAM.

      This is less of an issue when Avamar v7.x clients backing up to Data Domain use the 'paging cache' (f_cache2.dat).
      The paging cache reduces memory footprint on a client compared with the traditional 'monolithic' avtar cache.


      Check for a client-side I/O bottleneck:
      After client cache sizing, the next factor determining backup performance is the storage system which hosts the backup data and feeds it to avtar.


      Ensure that the target storage is healthy:
      Ensure that there are no problems with the target storage device preventing optimum performance. 
       

      Ensure that third-party software is not competing with avtar for I/O:
      Are any applications on the client competing with the Avamar client for storage I/O?
      Real-time or on-access scanning by anti-virus software can drastically reduce Avamar client performance.



      Can the file scan be configured to run in parallel? 
      Sometimes, backup data is hosted across multiple volumes serviced by separate read heads. In these scenarios, it may be possible to configure volume parallelism so that Avamar scans multiple volumes simultaneously. 



      Ensure that the client is not backing up data using CIFS or NFS:
      Backup of CIFS or NFS data is only supported through an NDMP accelerator. 



      Check if storage compression or encryption is in use:
      Backup performance may be lower than expected if the data being backed up resides on storage that is compressed or encrypted at the file system level.


      Analyzing Windows client resource bottlenecks with Perfmon:
      The following article helps you create performance graphs to understand whether the client is waiting on a particular resource at a given moment. Consider using it alongside graphs produced by the LogAnalyzer tool.



      Backup of Outlook archive .pst files
      A backup that contains many or large .pst files may perform slowly.



      Benchmarking storage performance 
      Check the performance of the storage device where the target data is hosted.



      Poor backup performance due to the data being backed up:
      The most common cause of slow backups is the characteristics of the data being backed up.


      Check if there is a lot of new or changed data:

      A few large new or modified files may cause an otherwise fast backup to overrun the backup window. To identify those files see:

      Windows clients

      Linux and UNIX Clients - Check if the client's dataset contains any large, sparse files. 



      Check the backup summary lines to understand the backup scope and identify outlier values:
      Search the backup log for the strings "Backup #" or "Backed-up".

      2017-06-07 20:21:38 avtar Info <5156>: Backup #441 timestamp 2017-06-07 20:21:38, 2,653,523 files, 255,181 folders, 1,566 GB (10,777 files, 668.4 MB, 0.04% new)
      2017-06-07 20:21:38 avtar Info <6083>: Backed-up 1,566 GB in 1281.60 minutes: 73 GB/hour (124,228 files/hour)
      
      These can save you a lot of time when investigating backup performance.
      For the output above, consider:
      1. Whether this is an initial or level 1 backup. (An initial backup is unlikely, since the backup number is #441)
      2. Whether the number of files in the backup is reasonable. (2.65 million files is reasonable)
      3. Whether the file-to-folder ratio is typical. (It is approximately 10:1, which is typical)
      4. The total amount of data in the dataset. (~1.5 TB)
      5. The number of files to be processed and the proportion of the total number of files. (~11 K out of ~2.65 M files is reasonable)
      6. The total size of all files to be processed. (this can only be an estimate)
      7. The amount of changed data to be sent to the Avamar server. (668 MB)
      8. Whether the change rate is reasonable. Higher change rates can be tolerated for smaller datasets (0.04% is reasonable)
      9. Whether the performance per hour, given the overall size and scope of the backup, is reasonable. (124 K files/hour would be considered slow performance given the other figures)
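
The checklist above can be partly automated by parsing the two summary lines. The sketch below is an illustration, not a supported tool: the regular expressions are inferred from the example lines in this article only, the function and field names are hypothetical, and other avtar versions may format these lines differently.

```python
import re

# Pattern for the <5156> "Backup #" summary line, inferred from the example above.
SUMMARY_RE = re.compile(
    r"Backup #(?P<num>\d+) timestamp \S+ \S+, "
    r"(?P<files>[\d,]+) files, (?P<folders>[\d,]+) folders, "
    r"(?P<size>[\d.,]+) (?P<size_unit>\w+) "
    r"\((?P<changed_files>[\d,]+) files, "
    r"(?P<new>[\d.,]+) (?P<new_unit>\w+), (?P<pct_new>[\d.]+)% new\)"
)
# Pattern for the <6083> "Backed-up" rate line, inferred from the example above.
RATE_RE = re.compile(
    r"Backed-up [\d.,]+ \w+ in (?P<minutes>[\d.,]+) minutes: "
    r"(?P<rate>[\d.,]+) (?P<rate_unit>\w+)/hour "
    r"\((?P<files_per_hour>[\d,]+) files/hour\)"
)


def num(text: str) -> float:
    """Convert a number with thousands separators, such as '2,653,523', to float."""
    return float(text.replace(",", ""))


def summarize(log_text: str):
    """Extract the figures used in the checklist; return None if either line is missing."""
    s = SUMMARY_RE.search(log_text)
    r = RATE_RE.search(log_text)
    if not s or not r:
        return None
    files, folders = num(s["files"]), num(s["folders"])
    return {
        "backup_number": int(s["num"]),
        "files": int(files),
        "folders": int(folders),
        "files_per_folder": files / folders if folders else None,
        "changed_file_pct": 100 * num(s["changed_files"]) / files if files else None,
        "new_data": f'{s["new"]} {s["new_unit"]}',
        "pct_new": float(s["pct_new"]),
        "hours": num(r["minutes"]) / 60,
        "files_per_hour": int(num(r["files_per_hour"])),
    }


# The two example lines from this article.
SAMPLE = (
    "2017-06-07 20:21:38 avtar Info <5156>: Backup #441 timestamp 2017-06-07 20:21:38, "
    "2,653,523 files, 255,181 folders, 1,566 GB (10,777 files, 668.4 MB, 0.04% new)\n"
    "2017-06-07 20:21:38 avtar Info <6083>: Backed-up 1,566 GB in 1281.60 minutes: "
    "73 GB/hour (124,228 files/hour)\n"
)

if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(f"{key}: {value}")
```

Comparing these figures across several nights of logs for the same client makes outliers easier to spot than reading the lines one at a time.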

      Frequently, these details provide us with enough data to understand the cause of poor backup performance.
      If necessary, review the status line messages that are generated while the backup runs.

      Determine if any of the values in these two log lines are outliers. In other words, are they larger or smaller than is typical?
      If you are familiar with the backup behaviour it is easier to detect anomalies.



      File to folder ratio
      Most customer datasets have a file-to-folder ratio of approximately 10:1, and avtar is tuned to reflect this.
      If a dataset has a low file to folder ratio as in the example below, the backup may not run as efficiently without minor tuning.  

      2015-11-18 00:34:32 avtar Info <5156>: Backup #75 timestamp 2015-11-18 00:24:43, 4,007,032 files, 1,974,043 folders, 1,589 GB (2,680 files, 419.4 MB, 0.03% new)
      

      See Avamar client backup performance tuning for datasets with low ratio of files to folders.



      Performance analysis using avtar log Status information messages:
      Using Notepad++ or a similar editor, filter the log for avtar Info lines which contain Status messages. These are periodic status messages reported by avtar, and can be found by searching for the event code <5100> or <8688>, depending on the version of the Avamar client.
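
The same filter can be applied with a short script instead of a text editor. This is a minimal sketch: the function name and the log file name are hypothetical, and only the event codes <5100> and <8688> come from this article.

```python
# Event codes for periodic status messages, as given in this article.
STATUS_CODES = ("<5100>", "<8688>")


def status_lines(lines):
    """Return the avtar Info lines that carry one of the status event codes."""
    return [
        line.rstrip("\n")
        for line in lines
        if "avtar Info" in line and any(code in line for code in STATUS_CODES)
    ]


if __name__ == "__main__":
    import sys

    # Usage: python filter_status.py <path-to-avtar-log>  (script name is hypothetical)
    if len(sys.argv) == 2:
        with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
            for line in status_lines(log_file):
                print(line)
```

Reading the filtered lines in time order shows how progress changes over the course of the backup, which helps locate the period in which the backup slowed down.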



      Check for third-party applications unexpectedly updating file metadata:
      Some applications may change file metadata. If this happens, Avamar will back up the entire file.


      Review the use of include and exclude flags. Avoid 'include' statements:
      The Operational Best Practices guide discusses Include and Exclude lists. 

      Avamar must compare every file in the backup dataset with both lists to determine whether to back up the file. This comparison process adds overhead, and can increase backup runtime.

      Check the client's avs\var directory for the presence of an avtar.cmd file.
      Check if that file contains any active --exclude or --exclude-from-file statements.
      If a directory or file system is excluded, but include flags are used, avtar scans it for items which it has been told to 'include'.



      Check if the dataset contains reparse points or stub files:
      Be wary if a dataset contains stub files or pointers to data stored on another device.
      Backup performance suffers if avtar has to wait for the remote file to be recalled.
      Examples of such software are: Enterprise Vault Archiver, Moonwalk, and DiskXtender.



      Backups of virtual clients with an Avamar guest installation



      Known backup performance-related issues from v7.2 due to file scanning behavior change
       
       
       

      Additional Information

      Other Notes

      • Ensure that virtual machine clients are not resource limited or adhering to strict hardware limitations that impact the ability of the Avamar backup to complete quickly.  On busy machines the operating system might be overloaded or juggling too many threads, resulting in severe context switching.
      • Use the Avamar Operational Best Practices guide to optimize the Avamar system, schedule backups, and tune client caches.

      Other References

      Article Properties


      Affected Product

      Avamar, Avamar Client

      Last Published Date

      05 Feb 2024

      Version

      17

      Article Type

      Solution