May 24th, 2010 02:00

Scheduled backup fails and Manual backup succeeds

Hi

I'm running NetWorker 7.5.2 on a Windows 2008 R2 NetWorker server.

I recently uninstalled 7.5.2 from two of my Windows 2008 clients and installed 7.5.1.10 to fix the VSS backup issue.

After installing this version, my scheduled backups will not run. However, a manual backup succeeds when I run it from the client.

I then went back and installed version 7.5.2 on the two clients, but the scheduled backup still fails with the error: Probe job had unrecoverable failure(s), this job is being abandoned.

I rebooted my NetWorker server, and this also didn't solve the problem.

I then deleted the clients in NMC and recreated them; this also did not work.

When I opened the NetWorker GUI on one of the clients, it said that the backup server is not registered.

We will be rebooting the two clients, hoping that the reboot will fix the communication error.

I want to know whether I should uninstall the NetWorker software from the clients and delete them in the NetWorker GUI before the reboot, and then do a fresh install.

Or should I leave the software installed and then reboot the servers?

I just want to make sure that I do it the right way, as it's very difficult to get downtime on production systems.

Thanks and Regards!

Umraan

736 Posts

May 24th, 2010 02:00

I think that before you start uninstalling and rebooting machines, you should check the name resolution between the clients and servers. Do you have the hosts files on the clients and the server correctly filled out with IP address, short name and FQDN? Does nslookup work from all sides, for both IP addresses and hostnames? You might need to change something in the hosts files or on the DNS server.
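If it helps, here is a rough Python sketch of that forward and reverse check, to be run on both the client and the server. It is only an illustration using the standard socket module, not a NetWorker tool; the function names are made up for this example, and the results depend on whatever resolver order (hosts file, DNS) the machine uses.

```python
import socket

def forward_lookup(name):
    """Return every address (IPv4 and IPv6) that a name resolves to."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    # info[4] is the socket address tuple; its first element is the IP string
    return sorted({info[4][0] for info in infos})

def reverse_lookup(address):
    """Return the primary hostname for an address, or None if it has none."""
    try:
        return socket.gethostbyaddr(address)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None

def check_host(name):
    """Pair each forward-resolved address with its reverse-resolved name."""
    return [(address, reverse_lookup(address)) for address in forward_lookup(name)]
```

Calling check_host with the client's FQDN and short name, and again with the server's, should show every address (including any IPv6 one) mapping back to the expected name. An address that comes back as None or with a different name is worth a closer look.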

-Bobby

96 Posts

May 24th, 2010 03:00

I've already checked the name resolution between the clients and the server, and DNS resolves from all ends.

There are no error logs either, so this is a tricky one.

Hoping the reboot could fix it.

96 Posts

May 24th, 2010 06:00

Is there perhaps a way that I can delete the client completely and re-create it with a new client ID?

736 Posts

May 24th, 2010 06:00

I'm not sure how this is going to help with a communication issue, but you can do this by just deleting the client from the NMC and creating a new instance. It will be created with a new client ID by default. You could also try putting the IP address of the client in the aliases field. This sometimes helps with this sort of issue.

-Bobby

87 Posts

May 24th, 2010 07:00

Hello,

Check the Windows firewall on the client server.

Regards,

30 Posts

May 24th, 2010 07:00

Hi Umraan,

One question: you said that you downgraded the NetWorker version on the clients. What version of Windows Server 2008 are you running on the clients? Is it R2?

If it's not R2, I would say this is a peer issue, since you downgraded the software but it's still using the same nsrladb (the Legato folder was not deleted).

Please try the following:

Stop nsrexecd on the client, rename nsrladb in the NetWorker client installation path, restart nsrexecd, and try the backup again.

It might be necessary to do the same thing on the server, as it might not be able to get fresh peer information from the client if the communication is not correct.

In that case, from CMD type net stop nsrexecd to stop all the services on the server, rename nsrladb there as well, and restart the NetWorker processes.
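For the client-side steps, a rough Python sketch of the stop, rename, restart sequence is below. It is an illustration only: the function name and the backup suffix are made up for this example, and you have to supply the folder that actually contains nsrladb on your system. The path in the usage note is an assumption about a default install, so check it first. Run it from an elevated prompt.

```python
import subprocess
import time
from pathlib import Path

def reset_peer_db(res_dir, service="nsrexecd", run=subprocess.run):
    """Stop the service, move nsrladb aside, then start the service again.

    res_dir is the folder that contains nsrladb (site-specific, verify it).
    Returns the new path of the renamed folder, or None if nothing was renamed.
    """
    old = Path(res_dir) / "nsrladb"
    # Timestamped name so an earlier backup copy is never overwritten
    backup = old.with_name("nsrladb.old." + time.strftime("%Y%m%d%H%M%S"))
    renamed = None
    run(["net", "stop", service], check=True)
    try:
        if old.exists():
            old.rename(backup)
            renamed = backup
    finally:
        # Restart even if the rename failed, so the client is not left down
        run(["net", "start", service], check=True)
    return renamed
```

Usage would look like reset_peer_db(r"C:\Program Files\Legato\nsr\res"), where that path is an assumed location. The folder is renamed rather than deleted, so it can be put back if this turns out not to be the cause.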

Hope it helps.

Regards,

Bruno.

43 Posts

May 24th, 2010 15:00

Run savegrp in debug mode (savegrp -D9) on the client having this issue. The debug output should give you more information on why the probe failed.

Also review the sso files generated under nsr\tmp\sg\.

96 Posts

May 24th, 2010 21:00

I've tried all of the above and am still not able to resolve or find the issue.

Does anyone know how to register a client?

I deleted the client completely, as well as the index directory of the client, and it now says that the client is not registered.

Does anyone know how to re-register a client?

96 Posts

May 25th, 2010 02:00

Hi Bobby,

I followed the steps in the link you sent me.

I stopped the NetWorker services on both client and server, but the backup is still failing.

One of the network engineers advised that it's the IPv6 address that is causing the issue with reverse DNS lookup.

Thanks for all the responses in helping me get this issue resolved, it's much appreciated!

I'll keep you posted.

736 Posts

May 25th, 2010 02:00

If you've followed the steps in esg67127 (which lists the most likely causes of this)

http://solutions.emc.com/EMCSolutionView.asp?id=esg67127&usertype=C

you should follow the previous advice and run the savegrp in debug mode and check the output and the /tmp/sg/ output for a clearer indication of the source of the problem.

-Bobby

96 Posts

May 25th, 2010 05:00

Hi

I ran the backup from the command line on the client using the -D option and got the following.

It looks suspect to me, but I'm not sure what it's telling me. I know it's authentication.

I only copied the important info:

Perhaps someone can tell me what this means?

Bound TCP/IPv6 socket descriptor 820 to port 8048
RPC Authentication: Client failed to authenticate using GSS Legato: savefs failed to authenticate with nsrexecd using GSS Legato: Authentication error; why = (unknown authentication error - 15)
39076:savefs: RPC warning: Could not use GSS Legato authentication. (severity 2, number 7)

lgto_auth: redirected to 1ictbkp01.avinet.co.za prog 390103 vers 2
Creating tcp RPC client handle with host server1.com(IPV6 address)
Creating TCP/IPv6 RPC client handle
Creating TCP/IPv6 RPC client handle
Attempting to bind IPv6 socket descriptor 816
Socket bound to OS determined port
Setting RPC socket send buffer size to 65536
Setting RPC socket recv buffer size to 65536
Bound TCP/IPv6 socket descriptor 816 to port 7938

Bound TCP/IPv6 socket descriptor 820 to port 8048
freeing unused errinfo with msgid 0
freeing unused errinfo with msgid 0
RPC Authentication: Client failed to authenticate using GSS Legato: savefs failed to authenticate with nsrexecd using GSS Legato: Authentication error; why = (unknown authentication error - 15)
freeing unused errinfo with msgid 0
freeing unused errinfo with msgid 0
39076:savefs: RPC warning: Could not use GSS Legato authentication. (severity 2, number 7)

lgto_auth for `nsrmmdbd' succeeded
Creating tcp RPC client handle with host localhost (::1)
Creating TCP/IPv6 RPC client handle
Creating TCP/IPv6 RPC client handle
Attempting to bind IPv6 socket descriptor 816
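Debug output like this gets long, and a pasted console capture can be hard-wrapped at the window width. A rough Python sketch for sifting it is below. It rejoins lines that were split at a fixed width (80 columns is an assumption), then picks out the authentication failures and counts the IPv6 client handles. The function names are made up for this example, and the matched strings are simply the ones visible in the output above.

```python
def unwrap_console(lines, width=80):
    """Rejoin lines that a fixed-width console split in the middle."""
    joined, buffer = [], ""
    for line in lines:
        buffer += line
        # A line shorter than the console width ends a logical line
        if len(line) < width:
            joined.append(buffer)
            buffer = ""
    if buffer:
        joined.append(buffer)
    return joined

def summarize_debug(lines, width=80):
    """Collect authentication failures and count IPv6 RPC client handles."""
    logical = unwrap_console(lines, width)
    return {
        "auth_failures": [l for l in logical if "failed to authenticate" in l],
        "ipv6_handles": sum("TCP/IPv6 RPC client handle" in l for l in logical),
    }
```

To use it, read the saved debug output with something like open(path).read().splitlines() and pass the result to summarize_debug. Seeing IPv6 handles next to the authentication failures is a hint to look at how the IPv6 addresses resolve.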

68 Posts

May 25th, 2010 10:00

What I would do is

1) Turn off IPv6 unless your whole network has been changed over to it. I would bet you do not need it.

2) In your hosts file, add the IP address and the short and full names for the server you wish to back up, the NetWorker backup server, and any proxy you are using. Do this on both sides, NetWorker server and client.

If you still have the problem:

3) Delete the group and client in the NetWorker GUI and then recreate them. (This is from an old bug that comes back sometimes. I am not on 7.5.5, but on 7.4.4 and earlier it was there.)
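For step 2, the hosts entries take the usual form of address, then full name, then short name. Here is a small Python sketch that prints them in that shape. The host names and the 192.0.2.x addresses are placeholders for illustration only; substitute your own.

```python
def hosts_line(ip, fqdn, short_name):
    """Format one hosts-file entry: address, then FQDN, then short name."""
    return f"{ip}\t{fqdn}\t{short_name}"

def hosts_block(entries):
    """Join several (ip, fqdn, short_name) tuples into hosts-file text."""
    return "\n".join(hosts_line(*entry) for entry in entries)

# Placeholder values only; replace with your real server and client
entries = [
    ("192.0.2.10", "nwserver.example.com", "nwserver"),
    ("192.0.2.21", "client1.example.com", "client1"),
]
print(hosts_block(entries))
```

The same lines would go into the hosts file on both the NetWorker server and the client, so that each side resolves the other the same way.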

96 Posts

May 25th, 2010 22:00

I managed to fix the problem.

It was in fact a reverse DNS lookup issue.

The network engineer also mentioned to me that the IPv6 address caused the conflict between the servers.

Thank you all for the input, it's much appreciated!
