2 Bronze

nsrd taking 45 mins to start

Hi guys,

We have an interesting problem with a NetWorker 7.6.0.8 installation on a Red Hat EL4 x86 host.

When we start the NetWorker services, we notice that two nsrexecd PIDs exist and nsrd does not come online for another 45 minutes.

If I monitor the nsr folder, there is zero activity in the logs directory (or any other dir) after we start the script. We're logged in as root of course. Once nsrd starts, there are no problems with regular NetWorker operations. The logs do not indicate any errors or any problems of any sort after everything starts.

We've taken a look at the OS logs but nothing is occurring at the time of starting NetWorker.

I have not tried renaming res, mm or cfi yet as a troubleshooting step, but I have renamed tmp.

Any thoughts?

Thanks!

Justin

10 Replies
3 Argentum

Re: nsrd taking 45 mins to start

Hi,

Please have a look at the process activity with strace/trace. Maybe you will find out where it is stuck. Please also have a look at the network configuration. I had a similar problem where NetWorker startup took a long time because of an /etc/hosts misconfiguration.
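One kind of /etc/hosts misconfiguration that is easy to check for is a hostname mapped to more than one IPv4 address. The following is only a rough sketch of such a check; the function name check_hosts_file is made up for illustration, and it will not catch every possible hosts file problem.

```shell
#!/bin/sh
# Report hostnames that are mapped to more than one IPv4 address.
# check_hosts_file is a hypothetical helper name, not a NetWorker tool.
# Usage: check_hosts_file [path]   (defaults to /etc/hosts)
check_hosts_file() {
    awk '
        { sub(/#.*/, "") }          # strip comments
        $1 ~ /:/ { next }           # skip IPv6 lines such as ::1
        NF >= 2 {
            for (i = 2; i <= NF; i++) {
                name = tolower($i)
                if (name in ip && ip[name] != $1)
                    print "CONFLICT " name ": " ip[name] " and " $1
                else
                    ip[name] = $1
            }
        }
    ' "${1:-/etc/hosts}"
}
```

Running check_hosts_file with no argument reads /etc/hosts; no output means no conflicting entries were found.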

2 Bronze

Re: nsrd taking 45 mins to start

Hello,

Thanks for the reply! I tried trace/strace, but it was not installed on the host at the time.

I will definitely look into the network configuration as you suggested. I did notice they had teamed nics. I will also request to have strace installed.

Thanks again

Justin

3 Argentum

Re: nsrd taking 45 mins to start

Well, strace will tell you on which system call it is waiting/hanging.

2 Bronze

Re: nsrd taking 45 mins to start

Great, thank you

3 Argentum

Re: nsrd taking 45 mins to start

If nsrd is not starting at all until 45 mins after nsrexecd starts, then the problem is with nsrexecd.

The startup script starts nsrexecd and then nsrd. nsrd will not be started until nsrexecd has completely started.

In addition, normally there will only be one nsrexecd running. If there are two, this is more evidence that nsrexecd did not finish its startup phase quickly. The likely reason for this is a network configuration issue.
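To see how many nsrexecd processes exist and how long each has been running, something along these lines may help. It is just a sketch: procs_named is a made-up helper name, and the ps column options are the standard POSIX ones, which I am assuming behave the same on the host in question.

```shell
#!/bin/sh
# Print PID and elapsed time for every process whose command name is $1.
# Reads "pid etime comm" lines on stdin (procs_named is a hypothetical helper).
procs_named() {
    awk -v name="$1" '$3 == name { print $1, $2 }'
}

# Live check, only attempted if ps is available on this host.
command -v ps >/dev/null 2>&1 &&
    ps -e -o pid= -o etime= -o comm= | procs_named nsrexecd
```

Two lines of output while nsrd is still missing would match the symptom described in the original post.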

To debug this, stop all NetWorker processes, then:

script /tmp/nsrexecd.txt

nsrexecd -D9

(wait)

<ctrl-c>

exit

Review the output file and look at which hostnames and IP addresses it references.
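To speed up the review, the captured file can be scanned for anything that looks like an IPv4 address. This is a minimal sketch, since the exact format of the -D9 output is not shown here; list_ips is a made-up helper name, and the pattern will also match version strings such as 7.6.0.8.

```shell
#!/bin/sh
# Count how often each IPv4-looking string appears in a captured log.
# list_ips is a hypothetical helper; note that dotted version numbers
# (for example 7.6.0.8) match the pattern too.
list_ips() {
    grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' "$1" | sort | uniq -c | sort -rn
}
```

For example, list_ips /tmp/nsrexecd.txt prints each address with its count, most frequent first. An unexpected address that is retried many times is a good candidate for closer inspection.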

2 Bronze

Re: nsrd taking 45 mins to start

Thank you Wallace, this is very helpful. I wasn't sure at what point nsrd would be expected to start, i.e. after nsrexecd has completed starting.

Thanks!!

Justin

3 Argentum

Re: nsrd taking 45 mins to start

I once worked with a customer who seemed to have the same symptoms. While watching the nsrexecd -D9 output, the customer and I noticed a reference to 127.0.0.2! We were not sure where this came from, but nsrexecd was definitely trying to resolve it in its logic. It was not in the local hosts file.

It turned out that the Linux server had defined 127.0.0.2 in its network configuration.  Once this was removed, nsrexecd started without any unusual delay.
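A quick way to look for extra loopback addresses like that one is to filter the interface list. This is only a sketch: extra_loopbacks is a made-up helper name, and I am assuming the one-line output format of "ip -o -4 addr show" from iproute2, which may differ on older releases.

```shell
#!/bin/sh
# Print interface and address for any IPv4 address in 127.0.0.0/8
# other than 127.0.0.1. Expects "ip -o -4 addr show" output on stdin.
# extra_loopbacks is a hypothetical helper name.
extra_loopbacks() {
    awk '$3 == "inet" {
        split($4, a, "/")                       # drop the /prefix length
        if (a[1] ~ /^127\./ && a[1] != "127.0.0.1")
            print $2, a[1]
    }'
}

# Live check, only attempted if the ip command is available.
command -v ip >/dev/null 2>&1 &&
    ip -o -4 addr show | extra_loopbacks
```

No output means only the usual 127.0.0.1 is configured.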

I am not saying that this is what you will see, but the debug information should help.  If you cannot see anything obvious, then open a support ticket for assistance.

Let us know too...  Good luck!

2 Bronze

Re: nsrd taking 45 mins to start

Hi Wallace,

Thanks for the feedback. I put nsrexecd into D9 mode and couldn't see anything unusual after 5 mins or so of waiting. We did some more poking around and found the /nsr/res/servers config file had a bogus hostname at the top of the list before the NetWorker server name. We removed the bogus host so the nsr host was at the top of the list and everything now starts perfectly.

So it appears it was timing out on a non-existent host. Interesting problem!
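For anyone who hits the same thing, a check along these lines would probably have found it faster. It is a rough sketch that assumes the servers file holds one hostname per line and that getent resolves names the same way nsrexecd does, which may not be exactly true. check_servers is a made-up name.

```shell
#!/bin/sh
# Time the name lookup for each entry in a NetWorker servers file.
# Assumes one hostname per line; blank lines and "#" lines are skipped.
# check_servers is a hypothetical helper, not part of NetWorker.
check_servers() {
    while read -r host _; do
        case "$host" in ''|'#'*) continue ;; esac
        start=$(date +%s)
        if getent hosts "$host" >/dev/null 2>&1; then
            status=OK
        else
            status=FAIL
        fi
        end=$(date +%s)
        echo "$status $host $((end - start))s"
    done < "${1:-/nsr/res/servers}"
}
```

Running check_servers with no argument reads /nsr/res/servers. A FAIL entry, or one that takes many seconds, points at a host that cannot be resolved.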

Thanks for your assistance!

Justin

3 Argentum

Re: nsrd taking 45 mins to start

Thanks for the feedback. When you mentioned the servers file, it reminded me of another service request I had worked on with a different customer. Same problem, same root cause, and same solution!

I suspect that the bogus hostname would have shown up in the debug output. It is not surprising that it is the root cause. nsrexecd is responsible for allowing or refusing connections from other NetWorker servers, so it makes sense that it would read the servers file if it exists and then try to resolve the hostnames.

Regardless...  problem solved...  Q.E.D. 
