Lost connection to server, exiting

Question

Running into a very strange issue, wanted to see if anyone else has experienced something like this before.

Backups are failing with a "Lost connection to server, exiting" error. Near as I can tell, it simply drops the connection. Looking at the network, we are not seeing errors anywhere along the chain (no significant packet loss, overflows, or port resets), everything is hard coded 100/full end to end, speeds are fantastic while the backup is running, and there is no firewall in place. Nothing indicates the port is being torn down, yet that appears to be the case.

Shorter/smaller backups appear to run fine, though there is no real rhyme or reason to when the larger ones fail. A daily backup that gets roughly 20-30 GB a day always runs fine. Generally, a backup needs to push 50+ GB of data, and usually more like 80-100 GB, before it dies. A file-by-file batch job from the client ran without a hitch (each file being a filesystem backup), so I don't think we have a bad file.

Server is running Solaris 8, Networker 7.3.2.
Client was upgraded to Solaris 10, Networker 7.1.3, then upgraded to Networker 7.3.3.

Before this problem cropped up, the client was running Solaris 8 with Networker 7.1.3 without incident. Nothing else has been changed except the client hardware/OS upgrade (ok, not a minor change).

Currently we have a ticket open with Legato, and they have been going over logs, even a -D9 log, but they are still scratching their heads. I hope to get a product specialist on the case soon, but figured I'd post here too, in case someone else has fought strangeness like this before.

Dave

ble1 · Answer

Have you tried an FTP test to see if that drops as well? Try testing it both ways (push and pull).
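If no FTP server is handy, the same idea can be approximated with a plain TCP stream: push a large amount of data over one connection and record how many bytes got through before any error. The sketch below is a minimal illustration, not an FTP client; the function names (`sink`, `push`, `loopback_demo`) and the chunk size are assumptions made for the example. Run the receiving side on one host and the sending side on the other, then swap roles to test both directions.

```python
import socket
import threading

CHUNK = 64 * 1024  # bytes per send/recv call (arbitrary choice)


def sink(server_sock, result):
    """Accept one connection and count bytes until the peer closes."""
    conn, _ = server_sock.accept()
    received = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:
                break
            received += len(data)
    result["received"] = received


def push(host, port, total_bytes):
    """Send total_bytes of zeros; return (bytes_sent, error_or_None)."""
    sent = 0
    payload = b"\0" * CHUNK
    try:
        with socket.create_connection((host, port)) as s:
            while sent < total_bytes:
                n = min(CHUNK, total_bytes - sent)
                s.sendall(payload[:n])
                sent += n
    except OSError as exc:
        # A dropped connection surfaces here; 'sent' shows how far we got.
        return sent, exc
    return sent, None


def loopback_demo(total_bytes=1024 * 1024):
    """Run both ends on 127.0.0.1 to show the helpers work together."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result = {}
    t = threading.Thread(target=sink, args=(server, result))
    t.start()
    sent, err = push("127.0.0.1", port, total_bytes)
    t.join()
    server.close()
    return sent, result["received"], err


if __name__ == "__main__":
    print(loopback_demo())
```

For a realistic test the transfer size should be in the range where the backups fail (tens of GB), since smaller transfers reportedly go through fine.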

shareef2 · Answer

Hi,

It is quite funny to reply 2 years later, but I just wanted to know if the root cause has been discovered.

I also have a question about your platform: is your client SPARC-based or x86?

Thanks

Shareef

nandfred1 · Answer

Another year and a half has gone by.

Well, I haven't found the root cause, but we are seeing the same thing.

Windows 2008 R2 server.

Linux, Windows and AIX storage nodes.

The lost connections seem to occur only towards AIX and Linux clients, and it appears to be the control connection to the index server that is lost.

But we can't for the life of us find out where. Network says it's the client that resets its control connections.

We tried without firewalls, same result. Backing up directly to the backup server, i.e. with no storage node involved, made no difference.

Right now we are testing a Linux backup server to see if that helps, and a Windows backup server at the same site. Maybe that helps, if only to narrow down where to look.

nandfred1 · Answer

Eureka, or however the saying goes: it was there all the time, on page 80 of the 7.6 SP2 release notes.

Disable TCP chimney by running the following command:

netsh int tcp set global chimney=disabled

That seems to have worked in our setup.

ra298 · Answer

That's a fix for Windows. I'm facing the same problem on Linux now.

Daniea3 · Answer

Hi rha, on Linux, is it the mount point that fails with the above error? Regards, Arun

vermad1 · Answer

The error itself says that there is some communication problem. Please check the TCP/IP settings and enable TCP keepalive on the client. Also check whether the network has any issues.
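For reference, TCP keepalive is a socket option that makes an otherwise idle connection exchange periodic probes, which helps when a long-idle control connection is dropped somewhere along the path. The Python sketch below only illustrates what the option does at the socket level; it is not how NetWorker itself is configured, and the default timers are OS-wide settings. The function name `enable_keepalive` and the timer values are assumptions picked for the example.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive for one socket; return True if it is enabled.

    idle/interval/count are illustrative values, not recommendations.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The per-socket timer options are platform-specific, so only set
    # the ones this platform's socket module actually exposes.
    if hasattr(socket, "TCP_KEEPIDLE"):   # seconds idle before first probe
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):  # seconds between probes
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):    # failed probes before giving up
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)

    return bool(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print("keepalive enabled:", enable_keepalive(s))
    s.close()
```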

mridul_singh · Answer

The same kind of TCP offloading to the network card applies on Linux as well. You can use the ethtool command to check the offload settings and see if they are enabled. If so, disable them and that should do the trick.
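A rough way to see which offloads are switched on is to read the feature list printed by `ethtool -k <interface>`. The Python sketch below parses that output into a dictionary; the helper names, the interface name `eth0`, and the sample text are assumptions for illustration, and real output varies by driver and ethtool version.

```python
import subprocess

# Illustrative sample only; real `ethtool -k` output differs per driver.
SAMPLE = """\
Features for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
"""


def parse_offload_features(text):
    """Map feature name -> True/False from `ethtool -k` style output."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(":")
        value = value.strip()
        if not sep or not value:
            continue  # skips blank lines and the "Features for ..." header
        state = value.split()[0]  # drops trailing notes such as "[fixed]"
        if state in ("on", "off"):
            features[name.strip()] = (state == "on")
    return features


def read_features(interface="eth0"):
    """Run ethtool for one interface (needs ethtool installed)."""
    out = subprocess.run(
        ["ethtool", "-k", interface],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_offload_features(out)


if __name__ == "__main__":
    for name, enabled in parse_offload_features(SAMPLE).items():
        print(f"{name}: {'on' if enabled else 'off'}")
```

Individual offloads can then be switched off with the capital `-K` form, for example `ethtool -K eth0 tso off`. Such a change does not persist across reboots unless it is also added to the interface configuration.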
