Unsolved

March 7th, 2013 05:00

Backup failing with network congestion errors


Hello,

Backup is failing intermittently for a Linux client with these errors:

avtar Info <7644>: Server not responding (possible network congestion?) (300 seconds)
[retry]  ERROR: <0001> Message exceeded retry count in retry.cpp 20 count=120

avtar Error <5774>: Internal Error: backtree: ADD_HASH timeout error

avtar Info <5726>: Aborting backup due to error (1:EXC_TIMEOUT)

avtar FATAL <5397>: Server timeout: Communication failure with server, aborting

The backup fails for two days, and on the third day it completes successfully.

There are no maintenance tasks running on the Avamar server during the backup, and other clients are being backed up fine.

Avamar server version : 6.0.0-592
Linux client plugin version : 5.0.105-169

Any ideas?

Thanks

2K Posts

March 7th, 2013 06:00

If this is only happening to one client, the cause is likely an intermittent network problem between the client and the server. If you set up ping to run through the backup window and log the results to a file, you should be able to see whether the network is dropping.
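For anyone who wants to script the ping logging suggested above, here is a minimal sketch in Python. It is only one possible way to do it, not an Avamar tool. The host name, log file name, interval, and duration are hypothetical placeholders, and the `-c`/`-W` flags assume the Linux iputils `ping`.

```python
import datetime
import subprocess
import time


def ping_once(host, timeout_s=2):
    """Send one ICMP echo using the system ping (Linux iputils flags assumed).

    Returns (ok, detail), where detail is the last line of ping's output
    or the error text if ping could not be run.
    """
    try:
        proc = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            capture_output=True,
            text=True,
            timeout=timeout_s + 5,
        )
    except (subprocess.TimeoutExpired, OSError) as exc:
        return False, str(exc)
    lines = proc.stdout.strip().splitlines()
    detail = lines[-1] if lines else proc.stderr.strip()
    return proc.returncode == 0, detail


def format_entry(when, host, ok, detail):
    """Build one timestamped log line so drops can be matched to backup times."""
    status = "OK" if ok else "FAIL"
    return f"{when.isoformat(timespec='seconds')} {host} {status} {detail}"


def monitor(host, logfile, interval_s=10, duration_s=8 * 3600):
    """Ping host every interval_s seconds for duration_s seconds, appending to logfile."""
    deadline = time.monotonic() + duration_s
    with open(logfile, "a") as log:
        while time.monotonic() < deadline:
            ok, detail = ping_once(host)
            log.write(format_entry(datetime.datetime.now(), host, ok, detail) + "\n")
            log.flush()  # keep the file current in case the script is interrupted
            time.sleep(interval_s)
```

Started on the client shortly before the scheduled backup, for example with `monitor("avamar-server.example.com", "ping_backup_window.log")`, this leaves a file where any `FAIL` lines can be compared against the time of the avtar timeout.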

143 Posts

March 7th, 2013 12:00

We see this on some of our replication streams every few days.  I'd be curious to hear what you find, especially if it ends up being something other than a loss of network connectivity.

143 Posts

March 7th, 2013 14:00

ianderson wrote:

fdxpilot, you may want to work with support for your issue. If it's happening every few days, it sounds like there's something happening on the system that should be looked at.

I see you're way ahead of me! L2 support is already working with Engineering on your issue.

We definitely do our part to keep support and engineering employed. The more eyes on our SRs and ESCs the better. :-) Hopefully sandcruise will open an SR and find a resolution for the backup issue, too.

2K Posts

March 7th, 2013 14:00

fdxpilot, you may want to work with support for your issue. If it's happening every few days, it sounds like there's something happening on the system that should be looked at.

I see you're way ahead of me! L2 support is already working with Engineering on your issue.

2K Posts

March 7th, 2013 14:00

The message itself is generic -- it just means that the client has sent messages to the server and hasn't received a reply for (300|600|900|...) seconds. This can happen if there are network issues, but it can also happen if the server is read-only, under heavy load, etc.
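To see how long the client waited each time, the wait values can be pulled out of the client log. The following is a rough sketch in Python, assuming the log lines look like the one quoted in the first post; the function name is made up for illustration, and real logs may carry extra prefixes that the pattern simply ignores.

```python
import re

# Matches lines such as:
#   avtar Info <7644>: Server not responding (possible network congestion?) (300 seconds)
# The pattern is based only on the line quoted in this thread.
TIMEOUT_RE = re.compile(
    r"avtar Info <7644>: Server not responding .*\((\d+) seconds\)"
)


def timeout_waits(lines):
    """Return the wait time in seconds from every 'Server not responding' line."""
    waits = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            waits.append(int(match.group(1)))
    return waits
```

Passing an open log file to `timeout_waits` gives the sequence of waits. A run that keeps climbing (300, 600, 900, ...) means the server stayed silent the whole time, which is worth comparing against server load or read-only periods as well as network drops.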

fdxpilot, you may want to work with support for your issue. If it's happening every few days, it sounds like there's something happening on the system that should be looked at.

11 Posts

March 29th, 2013 04:00

It doesn't seem to be a communication issue: the on-demand backup completes successfully, and only the scheduled backup fails.

ianderson, do you think upgrading the Avamar Linux client plugin to a newer version would help, since the client is at a lower version than the server?

fdxpilot, if you find a resolution to the issue, please post an update.

Thanks in advance
