Unable to perform image level restore with VMware Protection

Hello!

I'm hoping someone has seen this issue before or can at least help with my situation. I have a support case open, and it has been open for a while. Thanks in advance!

I have an 8.1.1 environment where I'm currently using VMware Protection. I have EBR configured with an external proxy. I have NO issues backing up VMs, and I have NO issues performing a file-level restore. However, I am unable to perform an image-level restore. Support reviewed the EBR logs and noticed the errors below:

2014-04-17 10:48:18 avnwcomm Info <0000>: NWP_LOG_OUTPUT: NW Client Plugin: recover_start_session: Client '0' is not properly configured on the NetWorker Server.

2014-04-17 10:48:18 avnwcomm Info <0000>: NWP_LOG_OUTPUT: NW Client Plugin: ASSERT(retries < MAX_RSTART_RETRIES) failed in nsr/libnwp/nwp_helper.c: 1446

2014-04-17 10:48:18 avnwcomm Info <0000>: NWP_LOG_OUTPUT: NW Client Plugin: ASSERT(rs->rs_rsi != NULL) failed in nsr/libnwp/nwp_helper.c: 1462

2014-04-17 10:48:18 avnwcomm Info <0000>: NWP_LOG_OUTPUT: NW Client Plugin: ASSERT(!err) failed in nsr/libnwp/nwp_helper.c: 440

2014-04-17 10:48:18 avnwcomm Info <0000>: NWP_LOG_OUTPUT: NW Client Plugin: ABORT session operation successful. Reason for abort: nwp_start_recover_session_helper: cannot start recover session, Client '0' is not properly configured on the NetWorker Server.

2014-04-17 10:48:18 avnwcomm Info <0000>: NWP_LOG_OUTPUT: NW Client Plugin: ASSERT(!err) failed in nsr/libnwp/nwp_intf.c: 331

2014-04-17 10:48:18 avnwcomm Info <0000>: NWP_LOG_OUTPUT: NW Client Plugin: cannot start recover session, Client '0' is not properly configured on the NetWorker Server.

2014-04-17 10:48:18 avnwcomm Error <0000>: Received error from NetWorker connection attempt:

Error Code 16: cannot start recover session, Client '0' is not properly configured on the NetWorker Server.

We think that at some point during its initial configuration the external proxy was given the name "0", and our assumption was that EBR still held a reference to that name. Troubleshooting steps included deleting the external proxy and recreating it. We also ran:

mccli group show --recursive

mccli client show --recursive=true | grep proxy

We deleted any bad "proxy" entries from the EBR database. Unfortunately, we still had the same issue: the restore would fail 10 minutes after it was initiated.
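
To confirm which client name the plugin is rejecting, the name can be pulled straight out of log lines like the ones quoted above. This is only a minimal sketch: the function name rejected_clients is hypothetical, and it assumes the log text is fed in on standard input.

```shell
# rejected_clients: hypothetical helper, not part of NetWorker.
# Reads log text on stdin and prints each distinct client name found in
# messages of the form "Client '<name>' is not properly configured".
rejected_clients() {
    sed -n "s/.*Client '\([^']*\)' is not properly configured.*/\1/p" | sort -u
}
```

Running it as rejected_clients < saved-log-file (the file name is a placeholder) against the excerpt above should print a single line containing 0.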

Re: Unable to perform image level restore with VMware Protection

Can you please query nsrla with nsradmin and post the output here? It might help us troubleshoot.

Regards,
Prajith

Re: Unable to perform image level restore with VMware Protection

Can you check whether there is an issue in the nsrla resource on the appliance:

nsradmin -p nsrexec

nsradmin> p type: nsrla

Check the "name" and "IP address" attributes.

If it's bad, let's consider re-creating it.

Re: Unable to perform image level restore with VMware Protection

Okay, here's the output, and I do see the issue. How do we recreate it?

nsradmin> p type:nsrla

                        type: NSRLA;

                        name: 0;

              nsrmmd version: ;

             nsrsnmd version: ;

NW instance info operations: ;

       NW instance info file: ;

          installed products: ;

                auth methods: "0.0.0.0/0,nsrauth/oldauth";

           max auth attempts: 8;

               administrator: root, "user=root,host=0";

                        arch: x86_64;

                 kernel arch: x86_64;

                    CPU type: x86_64;

                machine type: desktop;

                          OS: Linux 2.6.32.59-0.7-default;

           NetWorker version: 8.1.1.Build.245;

              client OS type: Linux;

                        CPUs: 4;

                     MB used: 21411;

                  IP address: 127.0.0.2, 10.1.1.7,

                              "fe80::250:56ff:feb9:764c";

  environment variable names: ;

I see the name as "0" and I'm not sure if the "127.0.0.2" needs to be there as well.
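
The check done by eye here can also be scripted. The sketch below rests on the assumption, implied by the replies in this thread, that the NSRLA name should match the appliance's hostname. Both function names are hypothetical, and the nsradmin output is expected on standard input.

```shell
# nsrla_name: hypothetical helper. Reads "p type: nsrla" output on stdin
# and prints the value of the "name" attribute (first match only).
nsrla_name() {
    sed -n 's/^[[:space:]]*name:[[:space:]]*\([^;]*\);.*/\1/p' | head -n 1
}

# check_nsrla_name: hypothetical helper.
# $1 = name reported by NSRLA, $2 = hostname you expect (short or FQDN).
# Prints "ok" when $2 equals $1 or is $1 followed by a domain suffix.
check_nsrla_name() {
    case "$2" in
        "$1"|"$1".*) echo "ok" ;;
        *) echo "mismatch: NSRLA name is '$1', expected '$2'" ;;
    esac
}
```

For the output above, nsrla_name prints 0, and comparing that against the appliance's real hostname reports a mismatch.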

Re: Unable to perform image level restore with VMware Protection

To correct this, SSH to the EBR appliance.

Stop services: service networker stop

Then move /nsr/res/nsrladb out of the way: mv /nsr/res/nsrladb /tmp

Start services: service networker start

Then check the same output as above.

If it's good, then clear the peer information on the NetWorker server:

nsradmin -p nsrexec

> d type: nsr peer information; name: <ebr-name>

> yes

Hope this helps
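
The appliance-side steps above could be wrapped in a small script. This is a sketch rather than a supported procedure: the function name reset_nsrladb and the NSRLADB_RUN switch are hypothetical, and the timestamped backup name under /tmp is an addition (the post simply moves the directory into /tmp). By default it only prints the commands, so they can be reviewed first.

```shell
# reset_nsrladb: hypothetical wrapper around the three steps above.
# Prints the commands by default; set NSRLADB_RUN=1 to execute them
# (as root, on the EBR appliance).
reset_nsrladb() {
    # Timestamped target so an earlier /tmp/nsrladb is not overwritten
    # (this naming is an assumption, not from the original steps).
    backup="/tmp/nsrladb.$(date +%Y%m%d%H%M%S)"
    for cmd in \
        "service networker stop" \
        "mv /nsr/res/nsrladb $backup" \
        "service networker start"
    do
        if [ "${NSRLADB_RUN:-0}" = "1" ]; then
            $cmd || return 1    # stop at the first failing step
        else
            echo "$cmd"
        fi
    done
}
```

After the services restart, re-run the nsrla query shown earlier and confirm the name before clearing peer information on the NetWorker server.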

Re: Unable to perform image level restore with VMware Protection

So far so good. I'm performing a restore now, and it's actually creating the VM, which we were not able to do before. Keeping my fingers crossed.

As for the command:

nsradmin -p nsrexec

> d type: nsr peer information; name: <ebr-name>

> yes

It reported that there were "no resources to delete!" I'm assuming this is fine.

What about the 127.0.0.2 IP address? Is that normal?

Re: Unable to perform image level restore with VMware Protection


For the nsradmin command, you have to replace <ebr-name> with the name of the EBR. In this case, that's probably:

d type: nsr peer information; name: 0

rcla2

Re: Re: Unable to perform image level restore with VMware Protection

Hello,

I have almost the same issue: I am unable to restore any VM, even though backups are running smoothly, or at least appear to be.

See attached log file for details.

It looks like it is the same issue as:

https://community.emc.com/thread/202150

But the NetWorker service on the VBA is started:

     cjebr3test:/space/home/admin # service networker status

     +--o nsrexecd (17329)

Any help would be appreciated.

R.

Re: Re: Re: Unable to perform image level restore with VMware Protection

The log shows this message: Reason for abort: nwp_start_recover_session_helper: "Unknown host"

This indicates that the issue is the underlying connectivity between the proxy and NetWorker.

The proxy could be internal or external, and it could be attempting to connect to the NetWorker server or to a storage node.

To fix this, check name resolution from the VBA/external proxy to NetWorker and vice versa.

For example, suppose your VBA node name is 'X-VBA', the proxy is 'N-proxy', the NetWorker server is 'Y-Server', and the storage node is 'Z-SN'.

On X-VBA run:

nslookup Y-Server

nslookup Z-SN

On N-proxy run:

nslookup Y-Server

nslookup Z-SN

On Y-Server run:

nslookup X-VBA

nslookup N-proxy

On Z-SN run:

nslookup X-VBA

nslookup N-proxy

Once you find a problem you can fix it by updating the DNS server entries on that host.
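
The per-host lookups above can be looped so that failures stand out. A minimal sketch with some assumptions: the function name check_resolution and the LOOKUP variable are hypothetical, and the default lookup is getent hosts rather than the nslookup used above, because its exit status reflects whether the name resolved. Note that getent also consults /etc/hosts, so it tests what the system resolver sees rather than DNS alone. Set LOOKUP=nslookup to use nslookup instead, bearing in mind that some builds may not return a failure status.

```shell
# check_resolution: hypothetical helper. Tries to resolve each host name
# given as an argument and prints OK or FAIL per name.
# Returns non-zero if any lookup failed.
check_resolution() {
    failed=0
    for host in "$@"; do
        # LOOKUP is an assumed override; unquoted so "getent hosts" splits
        # into command and argument.
        if ${LOOKUP:-getent hosts} "$host" >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}
```

On X-VBA and N-proxy this would be run as check_resolution Y-Server Z-SN, and on Y-Server and Z-SN as check_resolution X-VBA N-proxy, using the placeholder names from the example.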

Hope this helps!

Regards,

Mahesh

rcla2

Re: Re: Unable to perform image level restore with VMware Protection

Hello,

My infrastructure is quite simple, as this is a NetWorker test infrastructure. It is composed of a single NetWorker server, which is also a storage node (called naboo), and a VBA (called cjebr3test). DNS is OK, see below:

     cjebr3test:/space/home/admin # nslookup naboo.company.com

     Server:         172.31.1.4

     Address:        172.31.1.4#53

     Non-authoritative answer:

     Name:   naboo.company.com

     Address: 172.30.0.193

     cjebr3test:/space/home/admin # nslookup 172.30.0.193

     Server:         172.31.1.4

     Address:        172.31.1.4#53

     Non-authoritative answer:

     193.0.30.172.in-addr.arpa        name = naboo.company.com.

     [root@naboo nsr]# nslookup cjebr3test.company.com

     Server:         172.30.100.1

     Address:        172.30.100.1#53

     Name:   cjebr3test.company.com

     Address: 172.31.1.62

     [root@naboo nsr]# nslookup 172.31.1.62

     Server:         172.30.100.1

     Address:        172.30.100.1#53

     62.1.31.172.in-addr.arpa        name = cjebr3test.company.com.

I found a workaround that lets me restore a full VM: I added the IP address of the NetWorker server, followed by its FQDN and its short name, to the /etc/hosts file, and then it worked fine!
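
For anyone repeating this workaround, the entry has the form "IP address, FQDN, short name". The sketch below only prints such a line for review and does not touch /etc/hosts. The function name hosts_line is hypothetical, and the short name is derived by cutting the FQDN at its first dot.

```shell
# hosts_line: hypothetical helper. Prints an /etc/hosts entry in the
# order "IP<TAB>FQDN shortname".
# $1 = IP address, $2 = fully qualified name.
hosts_line() {
    printf '%s\t%s %s\n' "$1" "$2" "${2%%.*}"
}
```

With the values from this post, hosts_line 172.30.0.193 naboo.company.com prints the address followed by a tab and then naboo.company.com naboo.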

I am wondering whether this issue might be related to the fact that the NetWorker server is simply called "naboo" while the storage node is called "naboo.company.com".

I noticed this difference by looking at the NMC Devices tab.

Moreover, I have many entries in /nsr/logs/daemon.log (at least one per second) like this one:

73139 04/09/2015 04:41:52 PM  nsrd RAP warning Received no results for NSR Storage Node resource query for naboo on server naboo

R
