Mosyzlak
1 Copper

Avamar 6.1 Slow Scanning of Files

We've been trying to get our entire DFS data set backed up by Avamar 6.1 and are very close but still fighting some very poor performance to fit it all within a 24 hour period. The results from a recently completed backup job are below:

2014-05-07 06:59:22 avtar Info <5156>: Backup #70 timestamp 2014-05-07 06:59:22, 8,509,264 files, 838,732 folders, 3,767 GB (566,573 files, 5.636 GB, 0.15% new)

2014-05-07 06:59:22 avtar Info <7539>: Label "12pm-12PM_File_Server", scheduled to expire after 07/05/14 (2014-07-05 02:00:00 UTC), daily backup

2014-05-07 06:59:22 avtar Info <6083>: Backed-up 3,767 GB in 1139.38 minutes: 198 GB/hour (448,101 files/hour)

2014-05-07 06:59:25 avtar Info <5587>: Updating cache files in C:\Program Files\avs\var

2014-05-07 06:59:25 avtar Info <5069>: - Writing cache file "C:\Program Files\avs\var\f_cache.dat"

2014-05-07 06:59:27 avtar Info <5546>: Cache update complete C:\Program Files\avs\var\f_cache.dat (352.0MB of 768MB max)

It is taking 19 hours for the Avamar client to complete the full backup when there is only 5.636GB of new data actually being backed up. I suspect it is having to scan far too many files; the log shows 566,573 files as changed, but I don't think that is actually the case.

We have added resources to this Hyper-V VM, so it is now using 4 CPUs and 6 to 8 GB of dynamic RAM. It is a Windows Server 2012 Standard server.

Currently this backup is failing: I added some additional data and now it is only getting roughly 30% of the way through. I think there are some underlying issues causing the backup to be quite slow.

Any ideas?

13 Replies
ionthegeek
4 Beryllium

It is taking 19 hours for the Avamar client to complete the full backup when there is only 5.636GB of new data to actually be backed up.

This is a misleading way to look at Avamar performance. The 5.636GB of data is only the amount of new data sent to the server -- Avamar still has to process the other ~4TB of data to make sure it hasn't changed.

The GB/hour rating on the <6083> log line is the most relevant number -- this system is backing up 198GB/hour which is more than double the typical performance for Windows filesystem backups.
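
As a quick sanity check on that figure, here is the arithmetic using the numbers from the <6083> line (an illustrative calculation, not Avamar output):

```python
# Figures taken from the completed job's <6083> log line.
total_gb = 3767          # data processed during the backup
minutes = 1139.38        # elapsed backup time
files = 8_509_264        # files examined

hours = minutes / 60
gb_per_hour = total_gb / hours
files_per_hour = files / hours

print(f"{gb_per_hour:.0f} GB/hour, {files_per_hour:,.0f} files/hour")
```

Running this reproduces the 198 GB/hour and ~448,101 files/hour figures the client reported, so the log line is internally consistent.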

When troubleshooting filesystem backup performance, the most important first step is to ensure that the file cache is sized correctly. The Avamar client generates a hash of the metadata for each file and compares this with the file cache to determine if the file has changed. If there is a file cache miss, the file will be chunked, meaning the client has to read the entire file from disk. If the cache is undersized, the client will needlessly re-chunk the files whose cache entries were discarded. Based on the log snip above, the cache appears to be adequately sized (only 352MB of 768MB are in use).

If the cache is adequately sized, there must be some other reason the client is finding >500,000 changed files.

  • Has this client completed an initial backup of all of these files? Any new files must be backed up at least once before their entries in the file cache will be used.
  • Have prior backups wrapped up cleanly? A sealed backup is created whenever a backup job completes or times out; this sealed backup is used to validate the cache information during subsequent backup jobs. If a backup fails, no partial is created, so any new information added to the caches during that backup cannot be used.
  • Is some application changing the modification dates or other metadata for these files without changing the content? A change to the modification date, ACLs, etc. will cause a file cache miss and the file will have to be re-chunked.
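
That last point is easy to see in miniature. Here is a toy sketch of the metadata-hash/file-cache check; the field names and hashing scheme are hypothetical, not Avamar's actual format:

```python
import hashlib

file_cache = set()  # metadata hashes seen in the last sealed backup

def metadata_hash(path, size, mtime, acl):
    """Hash the attributes the client compares (illustrative fields only)."""
    return hashlib.sha1(f"{path}|{size}|{mtime}|{acl}".encode()).hexdigest()

def needs_rechunk(path, size, mtime, acl):
    """True means a cache miss: the whole file must be read and chunked."""
    h = metadata_hash(path, size, mtime, acl)
    if h in file_cache:
        return False
    file_cache.add(h)
    return True

# First sight of the file: a miss, so it gets chunked.
print(needs_rechunk("D:/dfs/report.docx", 4096, 1399400000, "acl-v1"))  # True
# Unchanged on the next run: a hit, no re-read needed.
print(needs_rechunk("D:/dfs/report.docx", 4096, 1399400000, "acl-v1"))  # False
# Something rewrites an ACL: identical content, but a miss again.
print(needs_rechunk("D:/dfs/report.docx", 4096, 1399400000, "acl-v2"))  # True
```

The last call shows why a metadata-only change (modification date, ACLs) forces a full re-read even though no content changed.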

Regarding adding resources to the system, the bottleneck for the vast majority of clients is the client's I/O performance. Adding more CPU power or memory is unlikely to improve performance.

null01
1 Copper


To add to Ian's comment, I would also suggest looking at any virus software running on the client. Trend (and others) have settings that can trigger a virus scan on file Access rather than file Modification.

Mosyzlak
1 Copper


Ian,

My main concern is that the 5.63GB shows that not much has changed, and that the majority of the work being performed by the Avamar client is a comparison of filename, location, and modified date against what is in the cache file. Out of curiosity, is the 5.63GB the amount of block-level data transferred, or the total size of the files modified since the last backup?

I know the maths works out to 198GB of backup data processed per hour, but to me this seems quite slow considering that only 0.15% of that data actually needs to be backed up rather than simply parsed. Maybe there's some complexity to the parsing process that I'm not aware of which means it is actually performing to expectation, but as I said above, it seems slow for what is a non-disk-intense process.

  • There were five consecutive backups which completed without error right before the backup I extracted that log section from, and the log I used was also from a successful, error-free backup. I had added some data to the backup set during those six successful backups, and that added substantial time to each backup, which I assume is due to Avamar hashing the new files.
  • As above
  • I can't think of any particular reason that so many files would be modified in a single day. I have reviewed the completion statistics of a few of the successful backups and they all show a similar number of files in the brackets (modified files?). In every run it is at least 550,000 files, sometimes a few tens of thousands more, likely due to me expanding the backup set. This is for a company of about 300 users, which equates to about 1,800 files per user per day, so I doubt it is users changing them. DFS shouldn't be modifying these files unless a user has, and neither should the virus scanner. The fact that the number is consistent leads me to think that there is a specific set or type of files that is not making it into the cache for whatever reason.

The memory was increased initially because it was previously only 4GB, which was restricting the cache size. It is currently assigned dynamic memory, but I believe it is always initialising with a max cache of 700+ MB. The CPU increase was on advice from the BaaS vendor and was probably more of a guess from them than a real idea. I have previously monitored the disk queue length to ensure there is no bottleneck there and it seemed fine then; I might give it another run to see if anything has changed in the meantime.

Thanks for your input, Ian; it always helps to hear others' ideas. Hopefully something will twig in my head soon!

Mosyzlak
1 Copper


null0,

We have tried this previously as well, to no great effect. It can't hurt to test it again, seeing as a fair number of things have changed in the meantime.

Thanks!

ionthegeek
4 Beryllium

My main concern is that the 5.63GB shows that not much has changed and the majority of the work being performed by the Avamar client is primarily a comparison between filename, location and modified date compared to what is in the cache file.

It is vanishingly unlikely that the client is spending any significant amount of time running cache lookups. The file metadata lookups ("stat" on Linux; I don't recall the API call for Windows offhand) and file cache lookups are both incredibly fast. On Windows systems, the typical scan rate is about a million files per hour.


For each and every file cache miss, Avamar has to completely re-read and re-chunk the file. That is almost certainly what is taking the time here.

Out of curiosity is the 5.63GB the amount of block level data transfer performed or the total size of the modified files since last backup?

The 5.63GB is the total size of the chunks that were transferred to the Avamar server, not the size of the modified files. The total size of the modified files is not reported in the "Status" log lines.

0.15% of that data actually needs to be backed up rather than simply parsed

We are still re-chunking more than 500,000 files a day and that's going to take time. If these files are not new, some process is touching the files in a way that is making Avamar think they have been changed. The 0.15% change rate of the file content tells me that the content of the files is probably not changing.

Maybe there's some complexity to the parsing process that I'm not aware of which is making this process actually performing to expectation but as I said above, it seems slow for what is a non-disk intense process.

The backup procedure for an individual file is as follows:

  1. "stat" the file and hash the result (we don't hash everything -- for example, the client ignores access time changes by default)
  2. Look up the file metadata hash in the file cache. If there is a cache hit, mark the file added and move on.
  3. If there is a file cache miss, read the file and chunk it up into variable sized chunks. Compress each chunk and hash it. For each chunk:
    1. Look up the chunk in the hash cache. If there is a cache hit, mark the chunk added and move on
    2. If there is a hash cache miss, send the hash to the server and ask if it already has the associated chunk. If the server has the chunk, mark the chunk added and move on
    3. If the server does not have the chunk, send the chunk and its hash over the network, mark the chunk added and move on

I'm leaving "mark <thing> added and move on" deliberately vague because the full explanation is a bit much for a forum post. Please trust me that this process is also very fast.

More than 90% of the performance gain in Avamar comes from the file cache. If we're getting file cache misses for 500,000 files, that's going to cause a big performance hit. Processing the 500,000 changed files is a disk intense process. Those files will be read in their entirety and re-chunked by the client. Whether or not the data needs to be sent to the server at that point is largely irrelevant -- we have already wasted the time reading them off the disk and re-chunking them.
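
The whole procedure above can be sketched in a few lines. This is a simplified model: the "server" and both caches are in-memory stand-ins, and the chunking here is fixed-size rather than Avamar's variable-size chunking:

```python
import hashlib
import zlib

file_cache = set()   # metadata hashes of files already backed up
hash_cache = set()   # hashes of chunks the client knows the server has
server = {}          # chunk hash -> compressed chunk (stand-in for the grid)

def backup_file(meta, content, chunk_size=4):
    """Returns bytes actually sent; a toy model of the avtar flow above."""
    meta_hash = hashlib.sha1(meta.encode()).hexdigest()
    if meta_hash in file_cache:          # step 2: file cache hit, move on
        return 0
    file_cache.add(meta_hash)
    sent = 0
    # Step 3: read the whole file and split it into chunks.
    for i in range(0, len(content), chunk_size):
        chunk = zlib.compress(content[i:i + chunk_size])
        h = hashlib.sha1(chunk).hexdigest()
        if h in hash_cache:              # step 3.1: hash cache hit
            continue
        hash_cache.add(h)
        if h in server:                  # step 3.2: server already has it
            continue
        server[h] = chunk                # step 3.3: send chunk + hash
        sent += len(chunk)
    return sent

first = backup_file("report.docx|4096|t1", b"aaaabbbbcccc")
again = backup_file("report.docx|4096|t1", b"aaaabbbbcccc")
touched = backup_file("report.docx|4096|t2", b"aaaabbbbcccc")
print(first > 0, again == 0, touched == 0)  # True True True
```

The third call is the interesting one: the metadata changed (the "t2"), so the file cache misses and the whole file is re-read and re-chunked, yet zero bytes are sent because every chunk hash is already known. That is exactly the "wasted read" scenario described above.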

The typical performance rates in an Avamar environment are:

  • 30GB/hr per node server ingest rate (75GB/hr per node for 7.8TB nodes)
  • 700GB/hr logical network throughput for typical Gigabit LAN links (350GB/hr raw with roughly 2:1 compression)
  • 1 million files per hour filesystem stat performance
  • 100GB/hr Windows filesystem read performance
  • 300GB/hr Linux filesystem read performance

For Avamar servers with 4 storage nodes or more, the bottleneck for any given backup is almost certainly the client I/O performance.
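
Plugging this job's own numbers into those rates (rough back-of-the-envelope arithmetic, not a sizing tool): at roughly one million files per hour, just stat-ing 8.5 million files accounts for about 8.5 of the 19 hours, which leaves the remaining ~10.5 hours dominated by re-reading and re-chunking the cache-miss files:

```python
files = 8_509_264              # files in the backup, from the log
scan_rate = 1_000_000          # files/hour, typical Windows stat rate
backup_hours = 1139.38 / 60    # elapsed time from the log

scan_hours = files / scan_rate
other_hours = backup_hours - scan_hours
print(f"scan: {scan_hours:.1f} h, leaving {other_hours:.1f} h for chunking")
```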

The fact that the number is consistent lends me to thinking that there is a specific set or type of files that are not making it into the cache for whatever reason.

That's possible but I can practically guarantee the issue is environmental in some way. The Avamar client's behaviour is extremely predictable. When the caches are correctly sized, I have never seen the client fail to cache a file that should have been cached.


I would recommend enabling --statistics in the dataset. It adds about 20 lines of information to the log showing the breakdown of chunk sizes, etc. Another option, with caveats: enabling --verbose=2 in the dataset will provide information about each file that causes a file cache miss, but please be aware that the resulting log will be extremely large. Shockingly large.


If you are not able to determine why the client is seeing so many files as changed, I would recommend opening a service request and working with support.

Mosyzlak
1 Copper


Ian,

Thanks for the very technical and detailed information; it helps to have a better understanding of what happens within the client rather than treating it as a black-box system. I'll add the --statistics option and report back with what I receive in two days (today's backup has already begun). Is this added to the "avagent.cmd" file?

Currently the C: drive has 80GB of free space, would enabling verbose logging use anywhere near that much?

Thanks,

Shayne

ionthegeek
4 Beryllium


You can add the statistics flag to avtar.cmd or the client's dataset. I believe there is an Advanced option in the dataset to control statistics, actually.

The verbose logs won't be 80GB but you should be careful with this flag either way. Besides the large log size, plug-in backups don't like the verbose flag so that one definitely has to go in the dataset, not into avtar.cmd.
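
For reference, flag files like avtar.cmd typically take one option per line; a minimal hypothetical example containing only the statistics flag (per the caveat above, --verbose=2 belongs in the dataset, not here):

```
--statistics
```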

avamar_exorcist
3 Argentium

Accepted Solution

Besides Trend, mentioned earlier, the use of Sophos on a client is known to cause file metadata to be changed, resulting in a large proportion of files being re-chunked by avtar.

You can read more about it here:

185839 - Poor backup performance on Avamar client running Sophos anti-virus


Mosyzlak
1 Copper


Sophos is installed on this server. Do you think that Sophos is causing Avamar to consider a large number of files as having been changed in some way?
