Avamar: File system backup failing due to server timeout issues
Summary: This Knowledge Base (KB) article addresses Avamar backup failures caused by connection or timeout issues.
Symptoms
A file system backup of a Windows client may fail with the following ERROR and FATAL messages in the log:
2012/12/28-21:59:34.82899 [ade-async] ERROR: <0001> Message timed out (could not be sent), id=xxx count=0 delay=14105.7"
...
2012-12-29 02:29:34 avtar Error <5774>: Internal Error: backtree: ADD_HASH timeout error
2012-12-29 02:29:34 avtar Stats <15102>: 2012-12-29 02:29:34 COMSTATS:0 sent= 8 recv[0]= 12 pending= 7/175 int= 0/50 send= 82 bytes= 1120+ 209192 sleepms= 0 delay=(16.457 [15.056..19.919] sd=1.946 n= 11) (15.461 [15.461..15.461] sd=0.000 n= 1)
2012-12-29 02:29:34 avtar Info <5726>: Aborting backup due to error (1:EXC_TIMEOUT)
2012/12/28-21:59:34.95399 [avtar] nbackmain connection exception 1 EXC_TIMEOUT
2012-12-29 02:29:34 avtar FATAL <5397>: Server timeout: Communication failure with server, aborting. Correct network connectivity issues, verify access to the server and retry.
...
<ERROR>Message exceeded retry count in retry.cpp 25415 count=16 </ERROR>
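As a rough aid for triage, the signatures in the excerpt above can be searched for automatically. The following is a minimal sketch, assuming the avtar log is available as plain text lines. The label names and the helper functions scan_log and looks_like_server_timeout are hypothetical and are not part of any Avamar tooling. The patterns are taken only from the messages shown above.

```python
import re
from collections import Counter

# Patterns copied from the log excerpt in this article.
# The labels on the left are illustrative names, not Avamar terminology.
SIGNATURES = {
    "message_timed_out": re.compile(r"Message timed out \(could not be sent\)"),
    "add_hash_timeout": re.compile(r"avtar Error <5774>"),
    "abort_exc_timeout": re.compile(r"avtar Info <5726>.*EXC_TIMEOUT"),
    "server_timeout_fatal": re.compile(r"avtar FATAL <5397>"),
    "retry_count_exceeded": re.compile(r"Message exceeded retry count"),
}


def scan_log(lines):
    """Count how many lines match each timeout signature."""
    counts = Counter()
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


def looks_like_server_timeout(counts):
    """True if the log shows the FATAL <5397> or the EXC_TIMEOUT abort."""
    return counts["server_timeout_fatal"] > 0 or counts["abort_exc_timeout"] > 0


if __name__ == "__main__":
    sample = [
        "2012-12-29 02:29:34 avtar Error <5774>: Internal Error: backtree: ADD_HASH timeout error",
        "2012-12-29 02:29:34 avtar Info <5726>: Aborting backup due to error (1:EXC_TIMEOUT)",
        "2012-12-29 02:29:34 avtar FATAL <5397>: Server timeout: Communication failure with server, aborting.",
    ]
    result = scan_log(sample)
    print(dict(result))
    print("server timeout suspected:", looks_like_server_timeout(result))
```

A match only indicates that the log resembles the symptoms in this article. It does not identify which of the causes below applies.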
Cause
Any of the following root causes can produce this failure:
- The Windows client might experience abnormally high or irregular network latency or connection issues.
- A time synchronization issue between the Avamar grid and the client.
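To get a first impression of latency or connection problems from the client side, repeated TCP connection attempts to the Avamar server can be timed. The following is a minimal sketch, not an Avamar utility. The function names are hypothetical, and the host and port are parameters that must be replaced with the values used in your environment. It measures only TCP connect time and does not replace proper network diagnostics.

```python
import socket
import statistics
import time


def measure_connect_latency(host, port, attempts=5, timeout=5.0):
    """Return TCP connect times in milliseconds; None marks a failed attempt."""
    samples = []
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            # Open and immediately close a TCP connection to the server.
            with socket.create_connection((host, port), timeout=timeout):
                samples.append((time.perf_counter() - start) * 1000.0)
        except OSError:
            # Covers refused connections, unreachable hosts, and timeouts.
            samples.append(None)
    return samples


def summarize(samples):
    """Summarize the samples: failure count plus median and maximum latency."""
    ok = [s for s in samples if s is not None]
    return {
        "attempts": len(samples),
        "failures": len(samples) - len(ok),
        "median_ms": statistics.median(ok) if ok else None,
        "max_ms": max(ok) if ok else None,
    }
```

For example, summarize(measure_connect_latency(server_name, server_port)) could be run from the affected Windows client, where server_name and server_port are placeholders for your Avamar server and the port its clients use. Failed attempts or a large gap between the median and maximum values would be consistent with the irregular latency described above.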
Resolution
Edit the avtar.cmd file (create it if it does not exist), add the following flags, and then retry the backup:
--debug
--comstats
--stats
--conntimeout=2800
--readonly-retry-timeout=4500
--msg-retry-timeout=300
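If the flags must be applied on several clients, the edit can be scripted. The following is a minimal sketch, assuming avtar.cmd holds one flag per line as in the list above. The helper names flag_name and merge_flags are hypothetical, and the path to avtar.cmd is passed in by the caller because its location depends on the client installation.

```python
from pathlib import Path

# Flags and values exactly as listed in this article.
FLAGS = [
    "--debug",
    "--comstats",
    "--stats",
    "--conntimeout=2800",
    "--readonly-retry-timeout=4500",
    "--msg-retry-timeout=300",
]


def flag_name(line):
    """Return the part of a flag line before any '=' sign."""
    return line.strip().split("=", 1)[0]


def merge_flags(cmd_file, flags=FLAGS):
    """Add the flags to cmd_file, replacing older values of the same flags."""
    path = Path(cmd_file)
    existing = path.read_text().splitlines() if path.exists() else []
    wanted = {flag_name(f): f for f in flags}
    seen = set()
    merged = []
    for line in existing:
        name = flag_name(line)
        if name in wanted:
            if name not in seen:
                merged.append(wanted[name])  # replace the old value in place
                seen.add(name)
            # later duplicates of the same flag are dropped
        else:
            merged.append(line)  # keep unrelated lines untouched
    # Append any flags that were not already present, in list order.
    merged.extend(f for n, f in wanted.items() if n not in seen)
    path.write_text("\n".join(merged) + "\n")
    return merged
```

Running merge_flags a second time on the same file leaves it unchanged, so the script is safe to repeat. Keep a backup copy of the original avtar.cmd so that the flags can be removed once troubleshooting is complete.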
Run a proactive health check on the Avamar grid to verify that there are no network configuration or duplicate IP address issues.
For instructions on running the script, see the KB article "Avamar: How to run the proactive_check.pl health check script on an Avamar Server".
If the health check finds no network configuration problems on the grid, or if the above flags do not resolve the issue, the cause most likely lies in the network environment between the client and the grid. Possible causes include abnormally high network latency, quality of service (QoS) configuration issues, and duplex mismatches.
Additional Information
The conntimeout parameter sets the connection timeout, in seconds. A larger value gives the client backup more time to retry before it times out.
The readonly-retry-timeout parameter sets the number of seconds that avtar retries the connection before timing out while the Avamar grid is in read-only mode.
The msg-retry-timeout parameter extends the retry timeout, so the client sends fewer HASH_IS_PRESENT messages for the same hash when the server is busy. Debugging must be enabled to see these messages.