You bad boy, you killed the server. Well, when nsr_shutdown hung, which processes were left alive? Was it only nsrd? If so, that's fine (killing the media database daemon is another matter). Try starting it in debug mode and see where it hangs.
!? So the server cannot find nsrexecd on the very machine it is running on. Hm... the last time I saw that was when the IP of the backup server had changed by some undefined magic.
The daemon.log looks fine, but I was getting some weird timeout errors on some of the backups and everything was moving extremely slowly (minutes instead of seconds), so I tried to stop it with nsr_shutdown and that hung too. Something must have happened last night that I don't know about.
I don't know. To be honest, I believe it is something related to that box, but it's hard to say what from here. That invalid connection is confusing me. Given that nothing works, I would suggest rebooting the box and seeing what happens (if there is any change at all). If not, open a case with support.
ble1
4 Operator
•
14.4K Posts
0
September 14th, 2006 08:00
lalexis
2 Intern
•
253 Posts
0
September 14th, 2006 09:00
bravored1
23 Posts
0
September 14th, 2006 09:00
ble1
4 Operator
•
14.4K Posts
0
September 14th, 2006 10:00
# nsrd -D9 > /nsr/logs/nsrd.debug 2>&1 &
# tail -f /nsr/logs/nsrd.debug
lalexis
2 Intern
•
253 Posts
0
September 14th, 2006 11:00
$ shrimp /nsr/logs> nsrexecd
$ shrimp /nsr/logs> nsrd -D9 > /nsr/logs/nsrd.debug 2>&1 &
[1] 91366
$ shrimp /nsr/logs> tail -f /nsr/logs/nsrd.debug
09/14/06 14:05:31 nsrd: Cannot contact nsrexecd service on shrimp.tridentad.org,
Timed out
09/14/06 14:05:31 nsrd: nsrexecd is unavailable, cannot start.
The nsrexecd processes were running.
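A quick way to confirm how many nsrexecd processes are actually up (two are expected, unless on 7.3.x) might look like the sketch below. `count_nsrexecd` is a made-up helper name, not a NetWorker command; it simply counts lines of `ps -ef` output that mention nsrexecd.

```shell
#!/bin/sh
# count_nsrexecd: hypothetical helper, not part of NetWorker.
# Reads `ps -ef` style output on stdin and prints how many lines mention
# nsrexecd. The [n] bracket keeps the pattern from matching the grep
# process itself when this is used in a live pipeline.
count_nsrexecd() {
    grep -c '[n]srexecd'
}

# Example use on the backup server (run by hand):
#   ps -ef | count_nsrexecd
```

Note that `grep -c` returns a non-zero exit status when the count is 0, which is handy if you want to script a "not running" check around it.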
ble1
4 Operator
•
14.4K Posts
0
September 14th, 2006 11:00
So, check that your IP is correct. Check DNS. Check /etc/hosts.
Then run nsr_shutdown.
Then run nsrexecd. There should be two of them running (unless you are on 7.3.x). If you want, you can even try running it in debug mode, the same way as nsrd.
Then try to run nsrd in debug mode again.
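The /etc/hosts part of those checks can be sketched roughly as follows. This is only a sketch: `hosts_lookup` is a made-up helper name, and it assumes the usual hosts-file layout (address first, then one or more names). Compare what it prints with what `nslookup` returns for the same name and with the addresses actually configured on the box.

```shell
#!/bin/sh
# hosts_lookup FILE NAME
# Hypothetical helper (not part of NetWorker): print every address that a
# hosts-format file maps to NAME, ignoring comments.
hosts_lookup() {
    awk -v name="$2" '
        { sub(/#.*/, "") }                 # strip comments
        NF < 2 { next }                    # skip blank or address-only lines
        {
            # field 1 is the address, the remaining fields are names
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; break }
        }
    ' "$1"
}

# Example use on the backup server (run by hand, output depends on the box):
#   hosts_lookup /etc/hosts "$(hostname)"
#   nslookup "$(hostname)"
```

If the two commands disagree, or the name maps to more than one address, that would fit the "IP changed by some undefined magic" scenario.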
lalexis
2 Intern
•
253 Posts
0
September 14th, 2006 12:00
09/14/06 15:09:09 nsrexecd: mondaemon_kill_check: entry
09/14/06 15:10:23 nsrexecd: clu_is_localvirthost(): unknown cluster type
09/14/06 15:11:37 nsrexecd: clu_is_localvirthost(): unknown cluster type
09/14/06 15:11:37 nsrexecd: clu_is_localvirthost(): unknown cluster type
09/14/06 15:11:37 nsrexecd: clu_is_localvirthost(): unknown cluster type
09/14/06 15:11:37 nsrexecd: clu_is_localvirthost(): unknown cluster type
09/14/06 15:11:37 nsrexecd: Aborting connection.
invalid connection from 0.0.0.0/34674 to 0.0.0.0/0.
Aborting due to: Connection reset by peer
09/14/06 15:11:37 nsrexecd: mondaemon_kill_check: entry
lalexis
2 Intern
•
253 Posts
0
September 14th, 2006 12:00
lalexis
2 Intern
•
253 Posts
0
September 14th, 2006 12:00
ble1
4 Operator
•
14.4K Posts
0
September 14th, 2006 12:00
ble1
4 Operator
•
14.4K Posts
0
September 14th, 2006 12:00
ble1
4 Operator
•
14.4K Posts
0
September 14th, 2006 12:00
Can you tell me why nsr_shutdown was executed in the first place? What problems did you have? Let's see how it all started, as that might tell us what is going on.
lalexis
2 Intern
•
253 Posts
0
September 14th, 2006 12:00
$ shrimp /nsr/logs> ps -ef|grep nsr
root 364866 390518 0 14:58:54 pts/101 0:00 grep nsr
root 1053844 1225912 0 14:57:21 pts/3 0:00 nsrexecd -D9
root 419090 1053844 0 14:57:21 pts/3 0:00 nsrexecd -D9
What I am getting for the debug is:
$ shrimp /nsr> nsrexecd -D9 > /nsr/logs/nsrexec.out
nsrexecd:
clu_init_lc(): ENTRY...
nsrexecd:
get_lc_fspath_vhost_map(): ENTRY ...
nsrexecd: No access to file: /usr/bin/lcmap ...
nsrexecd:
clu_init_lc(): Can't build fspath_vhost_map...
nsrexecd:
dump_map_lc(): ENTRY ...
nsrexecd: Lc_use_local_vhost_list = FALSE
nsrexecd: MOUNTED filesystems
nsrexecd:
dump_map_lc(): EXIT ...
nsrexecd:
clu_is_cluster_host_lc(): ENTRY ...
nsrexecd: process started, pid 419090
09/14/06 14:57:23 nsrexecd: Origin is /usr/bin/
09/14/06 14:57:23 nsrexecd: interface addr = 192.168.1.2
09/14/06 14:57:23 nsrexecd: interface addr = 192.168.1.4
09/14/06 14:57:23 nsrexecd: interface addr = 199.67.19.132
09/14/06 14:57:23 nsrexecd: could not find GUI metadata directory
09/14/06 14:57:23 nsrexecd: Adding port range (service): 7937-9936
09/14/06 14:57:23 nsrexecd: Adding port range (connection): 10001-30000
09/14/06 14:57:23 nsrexecd:
Calling clnttcp_create function
09/14/06 14:57:23 nsrexecd: socket bound to port 13690
09/14/06 14:57:23 nsrexecd:
Calling clnttcp_create function
09/14/06 14:57:23 nsrexecd: socket bound to port 19717
09/14/06 14:57:23 nsrexecd:
Calling clnttcp_create function
09/14/06 14:57:23 nsrexecd: socket bound to port 28304
09/14/06 14:57:23 nsrexecd:
Calling clnttcp_create function
09/14/06 14:57:23 nsrexecd: socket bound to port 12442
09/14/06 14:57:23 nsrexecd: mondaemon_check count 1
09/14/06 14:57:23 nsrexecd: checking file ..
09/14/06 14:57:23 nsrexecd: checking file ...
09/14/06 14:57:23 nsrexecd: checking file nsrla.res.lck.
09/14/06 14:57:23 nsrexecd: checking file .nsr.
09/14/06 14:57:23 nsrexecd: checking file sec.
09/14/06 14:57:23 nsrexecd: checking file product.res.lck.
09/14/06 14:57:23 nsrexecd: checking file nsrdb.lck.
09/14/06 14:59:23 nsrexecd: mondaemon_kill_check: entry
09/14/06 15:01:23 nsrexecd: mondaemon_kill_check: entry
09/14/06 15:03:23 nsrexecd: mondaemon_kill_check: entry
I am still getting the same error when trying to start nsrd.
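Since the suspicion is an address mismatch, one rough way to follow up is to pull the `interface addr` lines out of the nsrexecd debug output and set them next to what the server name resolves to. A sketch only: `interface_addrs` is a made-up helper name, and it assumes the log lines look like the ones pasted above.

```shell
#!/bin/sh
# interface_addrs LOGFILE
# Hypothetical helper (not part of NetWorker): list the addresses nsrexecd
# reported at startup, based on lines of the form
#   "<date> <time> nsrexecd: interface addr = A.B.C.D"
interface_addrs() {
    awk '/nsrexecd: interface addr = / { print $NF }' "$1"
}

# Example use (run by hand):
#   interface_addrs /nsr/logs/nsrexec.out
#   nslookup shrimp.tridentad.org
```

If the address that the name resolves to is not in the list printed from the log, name resolution on the box is a likely place to look.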
lalexis
2 Intern
•
253 Posts
0
September 14th, 2006 16:00
lalexis
2 Intern
•
253 Posts
0
September 14th, 2006 17:00