4 Operator

 • 

14.4K Posts

September 14th, 2006 08:00

You bad boy, you killed the server :D Well, when nsr_shutdown hung, which processes were left alive? Was it only nsrd? If so, that's fine (killing the media database daemon is another matter). Try starting it in debug mode and see where it hangs.

2 Intern

 • 

253 Posts

September 14th, 2006 09:00

Everything hung when I tried to shut it down. How do I start it in debug mode?

23 Posts

September 14th, 2006 09:00

How do I start nsr in debug mode on Solaris?

4 Operator

 • 

14.4K Posts

September 14th, 2006 10:00

# /usr/sbin/nsrexecd
# nsrd -D9 > /nsr/logs/nsrd.debug 2>&1 &
# tail -f /nsr/logs/nsrd.debug

2 Intern

 • 

253 Posts

September 14th, 2006 11:00

I did the steps just the way you have them and this is what I got:

$ shrimp /nsr/logs> nsrexecd
$ shrimp /nsr/logs> nsrd -D9 > /nsr/logs/nsrd.debug 2>&1 &
[1] 91366
$ shrimp /nsr/logs> tail -f /nsr/logs/nsrd.debug
09/14/06 14:05:31 nsrd: Cannot contact nsrexecd service on shrimp.tridentad.org,
Timed out
09/14/06 14:05:31 nsrd: nsrexecd is unavailable, cannot start.

The nsrexecd processes were running.

4 Operator

 • 

14.4K Posts

September 14th, 2006 11:00

!? So the server cannot find nsrexecd on the machine it is running on. Hm... the last time I saw that was when the IP of the backup server had changed by some undefined magic.

So, check that your IP is correct. Check DNS. Check /etc/hosts.

Then nsr_shutdown.

Then run nsrexecd. There should be two of them running (unless you are on 7.3.x). If you want, you can even run it in debug mode the same way as nsrd.

Then try to run nsrd in debug mode again.
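For the /etc/hosts part of that check, here is a minimal sketch. The function name hosts_addr is made up for illustration and is not a NetWorker tool. It prints the address that a hosts-format file lists for a given name, so the result can be compared with what DNS returns and with the addresses the box really has.

```shell
#!/bin/sh
# Sketch only: hosts_addr is a hypothetical helper, not part of NetWorker.
# Print the first address a hosts-format file lists for a name.
#   $1 = host name or alias to look for
#   $2 = file to search (defaults to /etc/hosts)
hosts_addr() {
    name=$1
    file=${2:-/etc/hosts}
    while read -r addr names; do
        # Skip blank lines and full-line comments.
        case $addr in ''|'#'*) continue ;; esac
        for n in $names; do
            # Everything after a '#' on the line is a comment.
            case $n in '#'*) break ;; esac
            if [ "$n" = "$name" ]; then
                echo "$addr"
                return 0
            fi
        done
    done < "$file"
    return 1
}

# Example use (left commented out so the block has no side effects):
#   hosts_addr "$(hostname)"     # what /etc/hosts says
#   nslookup "$(hostname)"       # what DNS says
```

If the two answers differ, or neither matches an address configured on the server, that would fit the "IP changed" scenario described above.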

2 Intern

 • 

253 Posts

September 14th, 2006 12:00

nsrexecd just came up with this:

09/14/06 15:09:09 nsrexecd: mondaemon_kill_check: entry
09/14/06 15:10:23 nsrexecd: clu_is_localvirthost(): unknown cluster type
09/14/06 15:11:37 nsrexecd: clu_is_localvirthost(): unknown cluster type
09/14/06 15:11:37 nsrexecd: clu_is_localvirthost(): unknown cluster type
09/14/06 15:11:37 nsrexecd: clu_is_localvirthost(): unknown cluster type
09/14/06 15:11:37 nsrexecd: clu_is_localvirthost(): unknown cluster type
09/14/06 15:11:37 nsrexecd: Aborting connection.
invalid connection from 0.0.0.0/34674 to 0.0.0.0/0.
Aborting due to: Connection reset by peer
09/14/06 15:11:37 nsrexecd: mondaemon_kill_check: entry

2 Intern

 • 

253 Posts

September 14th, 2006 12:00

The daemon.log looks fine, but I was getting some weird timeout errors on some of the backups and everything was moving extremely slowly (minutes instead of seconds), so I tried to stop it with nsr_shutdown and that hung too. Something must have happened last night that I don't know about.

2 Intern

 • 

253 Posts

September 14th, 2006 12:00

I did open a case with support, and after they tinkered with it for a while they suggested rebooting. I am scheduled to reboot later tonight.

4 Operator

 • 

14.4K Posts

September 14th, 2006 12:00

Is there anything in /var/adm/messages that might give you additional clues (from the system side)?

4 Operator

 • 

14.4K Posts

September 14th, 2006 12:00

That sounds fine. How did you get to this mess anyway? What were the last entries in daemon.log before things started to fall apart?

4 Operator

 • 

14.4K Posts

September 14th, 2006 12:00

I don't know. To be honest, I believe it is something related to that box, but it's hard to say what from here. That invalid connection is confusing me. Given that nothing works, I would suggest rebooting the box and seeing whether anything changes. If not, open this with support.

Can you tell me why nsr_shutdown was executed in the first place? What problems did you have? Let's see how it all started, as that might tell us what is going on.

2 Intern

 • 

253 Posts

September 14th, 2006 12:00

I started nsrexecd in debug mode and there are two running:

$ shrimp /nsr/logs> ps -ef|grep nsr
root 364866 390518 0 14:58:54 pts/101 0:00 grep nsr
root 1053844 1225912 0 14:57:21 pts/3 0:00 nsrexecd -D9
root 419090 1053844 0 14:57:21 pts/3 0:00 nsrexecd -D9

This is the debug output I am getting:

$ shrimp /nsr> nsrexecd -D9 > /nsr/logs/nsrexec.out
nsrexecd:
clu_init_lc(): ENTRY...
nsrexecd:
get_lc_fspath_vhost_map(): ENTRY ...
nsrexecd: No access to file: /usr/bin/lcmap ...
nsrexecd:
clu_init_lc(): Can't build fspath_vhost_map...
nsrexecd:
dump_map_lc(): ENTRY ...
nsrexecd: Lc_use_local_vhost_list = FALSE
nsrexecd: MOUNTED filesystems
nsrexecd:
dump_map_lc(): EXIT ...
nsrexecd:
clu_is_cluster_host_lc(): ENTRY ...
nsrexecd: process started, pid 419090
09/14/06 14:57:23 nsrexecd: Origin is /usr/bin/
09/14/06 14:57:23 nsrexecd: interface addr = 192.168.1.2
09/14/06 14:57:23 nsrexecd: interface addr = 192.168.1.4
09/14/06 14:57:23 nsrexecd: interface addr = 199.67.19.132
09/14/06 14:57:23 nsrexecd: could not find GUI metadata directory
09/14/06 14:57:23 nsrexecd: Adding port range (service): 7937-9936
09/14/06 14:57:23 nsrexecd: Adding port range (connection): 10001-30000
09/14/06 14:57:23 nsrexecd:
Calling clnttcp_create function
09/14/06 14:57:23 nsrexecd: socket bound to port 13690
09/14/06 14:57:23 nsrexecd:
Calling clnttcp_create function
09/14/06 14:57:23 nsrexecd: socket bound to port 19717
09/14/06 14:57:23 nsrexecd:
Calling clnttcp_create function
09/14/06 14:57:23 nsrexecd: socket bound to port 28304
09/14/06 14:57:23 nsrexecd:
Calling clnttcp_create function
09/14/06 14:57:23 nsrexecd: socket bound to port 12442
09/14/06 14:57:23 nsrexecd: mondaemon_check count 1
09/14/06 14:57:23 nsrexecd: checking file ..
09/14/06 14:57:23 nsrexecd: checking file ...
09/14/06 14:57:23 nsrexecd: checking file nsrla.res.lck.
09/14/06 14:57:23 nsrexecd: checking file .nsr.
09/14/06 14:57:23 nsrexecd: checking file sec.
09/14/06 14:57:23 nsrexecd: checking file product.res.lck.
09/14/06 14:57:23 nsrexecd: checking file nsrdb.lck.
09/14/06 14:59:23 nsrexecd: mondaemon_kill_check: entry
09/14/06 15:01:23 nsrexecd: mondaemon_kill_check: entry
09/14/06 15:03:23 nsrexecd: mondaemon_kill_check: entry


I am still getting the same error when trying to start nsrd.
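One more thing that could be checked at this point is whether anything is actually listening on the nsrexecd service port. This is a rough sketch under two assumptions: that 7937 (the first port of the service range in the debug output above) is the port nsrd tries to reach, and that netstat -an on the box prints local addresses as host.port or host:port. The helper name listening_on is made up for illustration.

```shell
#!/bin/sh
# Sketch only: listening_on is a hypothetical helper, not a NetWorker tool.
# Reads `netstat -an` style output on stdin and succeeds if some line has
# an address ending in .PORT or :PORT and is in LISTEN state.
#   $1 = TCP port number to look for
listening_on() {
    port=$1
    grep -E "[.:]${port}[[:space:]].*LISTEN" >/dev/null
}

# Example use (left commented out so the block has no side effects):
#   if netstat -an | listening_on 7937; then
#       echo "something is listening on 7937"
#   else
#       echo "nothing is listening on 7937"
#   fi
```

If nothing is listening even though ps shows nsrexecd, the daemon is up but not accepting connections, which would match the "Cannot contact nsrexecd service" error from nsrd.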

2 Intern

 • 

253 Posts

September 14th, 2006 16:00

I rebooted and it is still not coming up. Any ideas?

2 Intern

 • 

253 Posts

September 14th, 2006 17:00

It looks like it is coming up, but now I get a message saying that the trial enabler code has expired. It has had a permanent code since November.