Unsolved
11 Posts
February 20th, 2007 01:00
error from accept call: Software caused connection abort
Hello all,
We have a very urgent problem, so any suggestion will be gratefully received. We have a very large backup platform, with Alphastor SR 3.1 (Solaris 8) and NW 7.2.2 on the NetWorker server (Solaris 8) and on all of the storage nodes (Solaris 8 and Solaris 9). Two L700 libraries are controlled by Alphastor, and a further ADIC i-2000 is not controlled by Alphastor.
Since last week, our backup platform has been unstable. Each night, coinciding with the heavy backup load, the NetWorker server stops responding (mount requests fail, backup sessions do not start, etc.), and in daemon.log we see messages like:
"error from accept call: Software caused connection abort
01/26/07 22:56:52 nsrmmdbd: remote machine is 0.0.0.0/0 and local machine is 0.0.0.0/9263.
service at 0.0.0.0/9263
01/26/07 22:56:52 nsrmmdbd: 1275 cannot accept any more connections - Software caused connection abort"
The messages come from all of the daemons (not only nsrmmdbd). After this, the machine itself stops responding as well. We can neither kill the NetWorker daemons nor reboot the machine normally; in the end we have to reboot it with the power button. We think the source of the problem could be a corrupted /nsr/res, but we are not sure. We also think it could be related to communications, because the problem coincides with the relocation of the backup platform to another site (all of the components are at the new site, and it is possible that the Ethernet network is not working well, but we cannot prove it).
Any ideas? We are in very big trouble. Now we are going to recreate /nsr/res from scratch and test it.
Thanks in advance, and best regards!!!
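For anyone trying to confirm that these messages really line up with the nightly backup window, a small script can tally them per hour and per daemon. This is only a sketch: it assumes log lines shaped like the excerpt quoted above, the function names are made up for illustration, and the default path /nsr/logs/daemon.log is an assumption that may differ on your server.

```python
import re
from collections import Counter

# Matches lines shaped like the excerpt quoted in the post, e.g.
# "01/26/07 22:56:52 nsrmmdbd: 1275 cannot accept any more connections - ..."
LINE_RE = re.compile(
    r"(\d{2}/\d{2}/\d{2}) (\d{2}):\d{2}:\d{2} (\w+): "
    r".*cannot accept any more connections"
)


def count_accept_failures(lines):
    """Count accept-failure messages per (date, hour, daemon)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            date, hour, daemon = match.groups()
            counts[(date, hour, daemon)] += 1
    return counts


def summarize(path="/nsr/logs/daemon.log"):  # assumed log location
    """Print one line per (date, hour, daemon) bucket found in the log."""
    with open(path, errors="replace") as handle:
        counts = count_accept_failures(handle)
    for (date, hour, daemon), n in sorted(counts.items()):
        print(f"{date} {hour}:00 {daemon}: {n}")
```

If the counts cluster in the hours when most savegroups start, that would support the load theory; an even spread through the day would point elsewhere.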

ble1
4 Operator
•
14.4K Posts
0
February 21st, 2007 09:00
csanchezgonzale
11 Posts
0
February 28th, 2007 01:00
Thank you for your response. We did the following, and the problem seems to be solved, at least for the moment (it has not recurred for a week):
- nsrim -X
- nsrck -L6
- Rebuilt /nsr/res (with the full configuration)
The error message appeared again yesterday (from the nsrmmd daemon), but the backups kept working correctly and the machine did not go down as it did last week. I am going to check your idea about open files. Do you know what value is recommended for a backup server (Solaris 8) with around 400 clients?
Thanks in advance, Hrvoje!
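As a starting point for the open-files check, the current limits can be read with Python's standard resource module. This is a minimal sketch, assuming the "open files" idea refers to the per-process file descriptor limit; the helper names are illustrative. It reports the limits of the process that runs it (the same numbers the shell shows with ulimit -n), not those of NetWorker daemons that are already running, and it does not answer the sizing question.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open file descriptor limits of this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"soft limit: {describe(soft)}, hard limit: {describe(hard)}")
```

Run it from the same account and shell environment that start the NetWorker daemons, so that the inherited limits are comparable.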
ble1
4 Operator
•
14.4K Posts
0
February 28th, 2007 06:00