
February 20th, 2007 01:00

error from accept call: Software caused connection abort

Hello all,

We have a very urgent problem, so any suggestion will be well received. We have a very large backup platform, with Alphastor SR 3.1 (Solaris 8) and NW 7.2.2 on the NetWorker server (Solaris 8) and on all of the storage nodes (Solaris 8 and Solaris 9). Two L700s are controlled by Alphastor, and an ADIC i-2000 is not controlled by Alphastor.

Since last week, our backup platform has been unstable. Each night, coinciding with the heavy backup load, the NetWorker server stops responding (mount requests fail, backup sessions do not start, etc.), and in daemon.log we see messages like these:

error from accept call: Software caused connection abort
01/26/07 22:56:52 nsrmmdbd: remote machine is 0.0.0.0/0 and local machine is 0.0.0.0/9263.
service at 0.0.0.0/9263
01/26/07 22:56:52 nsrmmdbd: 1275 cannot accept any more connections - Software caused connection abort

The messages appear for all the daemons (not only nsrmmdbd). After this, the machine stops responding as well. We cannot kill the NetWorker daemons or reboot the machine normally; in the end we have to reboot it with the power button. We think the source of the problem could be a corrupted /nsr/res, but we are not sure. We also think it could be related to communications, because the problem coincides with the relocation of the backup platform to another site (all the elements are in the new place, and it is possible that the Ethernet network is not working well, but we cannot prove it).

Any ideas? We are in serious trouble. For now, we are going to recreate /nsr/res from scratch and test it.

Thanks in advance, and best regards!!!

February 21st, 2007 09:00

To me it looks as if you ran out of system resources on the box (open file descriptors, for example).
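One rough way to check the open-files theory is to count how many descriptors each NetWorker daemon holds while the backup load is high. This is only a sketch: it assumes a procfs with a /proc/<pid>/fd directory and the pgrep utility, and count_fds is a hypothetical helper name, not a NetWorker command.

```shell
# count_fds is a hypothetical helper: print the number of open file
# descriptors for one PID by listing its /proc/<pid>/fd directory.
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# Report every process whose name matches "nsr" (nsrd, nsrmmd, nsrmmdbd, ...).
# Run as root so that the fd directories of all daemons are readable.
for pid in $(pgrep nsr); do
    echo "pid=$pid open_fds=$(count_fds "$pid")"
done
```

If the counts sit close to the per-process limit when the errors start, that would support the resource explanation. If they stay low, the cause is more likely elsewhere.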

February 28th, 2007 01:00

Hello Hrvoje!

Thank you for your response. We did the following and the problem seems to be solved, at least for the moment (it has not recurred for a week):
- nsrim -X
- nsrck -L6
- Rebuilt /nsr/res (with all the configuration)

The error message appeared again yesterday (with the nsrmmd daemon), but the backup was working correctly and the machine did not go down as it did last week. I am going to check your idea about open files. Do you know what value is recommended for a backup server (Solaris 8) with around 400 clients?

Thanks in advance Hrvoje!

February 28th, 2007 06:00

rlim_fd_max in /etc/system, if I remember correctly. You can also view and change the limit with the ulimit command.
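As a minimal sketch of what that looks like in practice, the commands below read the soft and hard descriptor limits of the current shell and raise the soft limit to the hard one. The /etc/system lines are shown only as comments, and the values in angle brackets are placeholders, not recommendations.

```shell
# Current per-process open-file limits for this shell.
soft=$(ulimit -Sn)   # soft limit: what processes started from here get by default
hard=$(ulimit -Hn)   # hard limit: ceiling the soft limit may be raised to
echo "soft=$soft hard=$hard"

# Raise the soft limit to the hard limit for this shell and its children,
# for example before restarting the NetWorker daemons from it.
ulimit -Sn "$hard"
echo "new soft=$(ulimit -Sn)"

# System-wide defaults on Solaris are set in /etc/system and take effect
# after a reboot. Placeholder values, to be chosen for your own load:
#   set rlim_fd_max=<hard limit>
#   set rlim_fd_cur=<soft limit>
```

A change made with ulimit affects only that shell and the processes started from it, so the /etc/system settings are the ones that survive a reboot.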
