Unable to perform image level restore with VMware Protection
Hello!
I'm hoping someone has seen this issue before, or can at least help with my situation. I have a support case open, and it has been open for a while. Thanks in advance!
I have an 8.1.1 environment where I'm currently using VMware Protection. I have EBR configured with an external proxy. I have NO issues backing up VMs, and I have NO issues performing a file-level restore. However, I am unable to perform an image-level restore. Support reviewed the EBR logs and noticed the error below:
2014-04-17 10:48:18 avnwcomm Info <0000>: NWP_LOG_OUTPUT: NW Client Plugin: recover_start_session: Client '0' is not properly configured on the NetWorker Server.
2014-04-17 10:48:18 avnwcomm Info <0000>: NWP_LOG_OUTPUT: NW Client Plugin: ASSERT(retries < MAX_RSTART_RETRIES) failed in nsr/libnwp/nwp_helper.c: 1446
2014-04-17 10:48:18 avnwcomm Info <0000>: NWP_LOG_OUTPUT: NW Client Plugin: ASSERT(rs->rs_rsi != NULL) failed in nsr/libnwp/nwp_helper.c: 1462
2014-04-17 10:48:18 avnwcomm Info <0000>: NWP_LOG_OUTPUT: NW Client Plugin: ASSERT(!err) failed in nsr/libnwp/nwp_helper.c: 440
2014-04-17 10:48:18 avnwcomm Info <0000>: NWP_LOG_OUTPUT: NW Client Plugin: ABORT session operation successful. Reason for abort: nwp_start_recover_session_helper: cannot start recover session, Client '0' is not properly configured on the NetWorker Server.
2014-04-17 10:48:18 avnwcomm Info <0000>: NWP_LOG_OUTPUT: NW Client Plugin: ASSERT(!err) failed in nsr/libnwp/nwp_intf.c: 331
2014-04-17 10:48:18 avnwcomm Info <0000>: NWP_LOG_OUTPUT: NW Client Plugin: cannot start recover session, Client '0' is not properly configured on the NetWorker Server.
2014-04-17 10:48:18 avnwcomm Error <0000>: Received error from NetWorker connection attempt:
Error Code 16: cannot start recover session, Client '0' is not properly configured on the NetWorker Server.
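When the log is long, it can help to pull out just the abort reasons and the client name NetWorker is complaining about. This is a minimal sketch, assuming the avnwcomm line format shown in the excerpt above. The function names are made up for illustration, and the log is read from standard input, so it works with whichever log file support asked you to collect.

```shell
# Print the distinct "Reason for abort" messages from a log on stdin.
# Assumes the message runs from "Reason for abort: " to the end of the line.
abort_reasons() {
    sed -n 's/.*Reason for abort: //p' | sort -u
}

# Print the distinct client names that NetWorker reports as
# "not properly configured" (the text between the single quotes).
misconfigured_clients() {
    sed -n "s/.*Client '\([^']*\)' is not properly configured.*/\1/p" | sort -u
}
```

Usage would be something like `misconfigured_clients < saved-log.txt`, where `saved-log.txt` is a hypothetical copy of the log. Run against the excerpt above, it should print only `0`.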
Our guess is that the external proxy was given the name "0" at some point during its initial configuration, and that EBR still holds a reference to that name. Troubleshooting so far included deleting the external proxy and recreating it. We also ran:
mccli group show --recursive
mccli client show --recursive=true | grep proxy
We deleted any bad "proxy" entries from the EBR database. Unfortunately, we still get the same issue: the restore fails 10 minutes after it is initiated.
TheStorageProto
May 8th, 2014 07:00
Can you check whether there is a problem with the nsrla resource on the appliance:
nsradmin -p nsrexec
nsradmin> p type: nsrla
Check the "name" and "IP address" attributes.
If it's bad, let's consider re-creating it.
nmc2
May 8th, 2014 07:00
Can you please query nsrla with nsradmin and post the output here? It might help us troubleshoot.
Regards,
Prajith
odurasler757
May 9th, 2014 06:00
Okay, here's the output, and I do see the issue. How do we recreate it?
nsradmin> p type:nsrla
type: NSRLA;
name: 0;
nsrmmd version: ;
nsrsnmd version: ;
NW instance info operations: ;
NW instance info file: ;
installed products: ;
auth methods: "0.0.0.0/0,nsrauth/oldauth";
max auth attempts: 8;
administrator: root, "user=root,host=0";
arch: x86_64;
kernel arch: x86_64;
CPU type: x86_64;
machine type: desktop;
OS: Linux 2.6.32.59-0.7-default;
NetWorker version: 8.1.1.Build.245;
client OS type: Linux;
CPUs: 4;
MB used: 21411;
IP address: 127.0.0.2, 10.1.1.7,
"fe80::250:56ff:feb9:764c";
environment variable names: ;
I see the name is "0", and I'm not sure whether "127.0.0.2" should be there as well.
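For anyone wanting to script this check, here is a rough sketch that pulls the "name" attribute out of saved `p type: nsrla` output and compares it with the host name you expect. It assumes the `attribute: value;` layout shown above. The function names and the expected host names you pass in are illustrative, not anything NetWorker provides.

```shell
# Read saved "p type: nsrla" output on stdin and print the value of the
# "name" attribute (the text between "name:" and the trailing ";").
nsrla_name() {
    sed -n 's/^[[:space:]]*name:[[:space:]]*\(.*\);[[:space:]]*$/\1/p'
}

# Succeed only if the first argument matches one of the expected names
# (for example the short name and the FQDN), ignoring case.
nsrla_name_ok() {
    actual=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    shift
    for expected in "$@"; do
        if [ "$actual" = "$(printf '%s' "$expected" | tr 'A-Z' 'a-z')" ]; then
            return 0
        fi
    done
    return 1
}
```

With the output above saved to a file, `nsrla_name < nsrla.txt` would print `0`, and `nsrla_name_ok 0 <your-ebr-hostname>` would fail, which is the symptom discussed in this thread.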
TheStorageProto
May 9th, 2014 07:00
To correct this, SSH to the EBR:
Stop services: service networker stop
Then move /nsr/res/nsrladb out of the way: mv /nsr/res/nsrladb /tmp
Start services: service networker start
Then check the same output as above.
If it's good, then clear the peer information on the NetWorker server:
nsradmin -p nsrexec
> d type: nsr peer information; name:
> yes
Hope this helps
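The stop/move/start steps above can be wrapped so they are printed before anything is actually run. This is only a sketch: `DRY_RUN` and `BACKUP_DIR` are hypothetical variables introduced here, the default target stays `/tmp` as in the post, and it would need to run as root on the EBR appliance.

```shell
# Echo the command when DRY_RUN is 1 (the default); otherwise execute it.
run_step() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Stop NetWorker, move the nsrladb directory aside, start NetWorker again.
# Stops at the first step that fails.
reset_nsrladb() {
    backup_dir=${BACKUP_DIR:-/tmp}
    run_step service networker stop &&
    run_step mv /nsr/res/nsrladb "$backup_dir" &&
    run_step service networker start
}
```

Calling `reset_nsrladb` prints the three commands for review; `DRY_RUN=0 reset_nsrladb` runs them. Note that if a previous `nsrladb` copy already sits in the target directory, `mv` will not behave the same way, so check that first.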
odurasler757
May 9th, 2014 07:00
So far so good. I'm performing a restore now, and it's actually creating the VM, which we were not able to do before. Keeping my fingers crossed.
As for the command:
nsradmin -p nsrexec
> d type: nsr peer information; name:
> yes
It reported "no resources to delete!" I'm assuming this is fine.
What about the 127.0.0.2 IP address? Is that normal?
TheStorageProto
May 9th, 2014 08:00
For the nsradmin command, you have to put the name of the EBR after "name:". In this case, that's probably:
d type: nsr peer information; name:0
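To avoid typing the resource query with an empty name again, a small helper can print the two nsradmin commands for a given peer name: first the `p` query to see what would match, then the `d` command itself. This is a sketch only. It prints text to paste at the `nsradmin -p nsrexec` prompt and does not run nsradmin, and the function name is made up.

```shell
# Print the nsradmin commands to inspect and then delete the
# "nsr peer information" resource for one peer name.
# Nothing is executed; the output is meant to be reviewed and pasted.
peer_info_commands() {
    peer=${1:-}
    if [ -z "$peer" ]; then
        echo "usage: peer_info_commands <peer-name>" >&2
        return 1
    fi
    printf 'p type: nsr peer information; name: %s\n' "$peer"
    printf 'd type: nsr peer information; name: %s\n' "$peer"
}
```

For the case in this thread, `peer_info_commands 0` prints the query and the delete command with the name filled in.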
rcla2
April 8th, 2015 08:00
Hello,
I have almost the same issue: I am unable to restore any VM even though backups are running smoothly, or at least appear to be.
See attached log file for details.
It looks like it is the same issue as:
https://community.emc.com/thread/202150
But the NetWorker service on the VBA is started:
cjebr3test:/space/home/admin # service networker status
+--o nsrexecd (17329)
Any help would be appreciated.
R.
1 Attachment
failedonly1
TheStorageProto
April 8th, 2015 15:00
The log shows this message: Reason for abort: nwp_start_recover_session_helper: "Unknown host"
This indicates that the issue is the underlying connectivity between the proxy and NetWorker.
The proxy could be internal or external, and it could be attempting to connect to the NetWorker server or to a storage node.
To fix this, check name resolution from the VBA/external proxy to NetWorker and vice versa.
For example, if your VBA node name is 'X-VBA', the proxy is 'N-Proxy', the NetWorker server is 'Y-Server' and the storage node is 'Z-SN':
On X-VBA run:
nslookup Y-Server
nslookup Z-SN
On N-Proxy run:
nslookup Y-Server
nslookup Z-SN
On Y-Server run:
nslookup X-VBA
nslookup N-Proxy
On Z-SN run:
nslookup X-VBA
nslookup N-Proxy
Once you find a problem, you can fix it by updating the DNS server entries on that host.
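The per-host checks above can be looped so that failures stand out. A minimal sketch, with assumptions: `LOOKUP` is a hypothetical variable defaulting to `nslookup`, and the loop relies on the lookup tool exiting non-zero on failure. Older nslookup builds may not always do that, so `host` or `getent hosts` can be substituted. `getent hosts` also goes through the system resolver (including /etc/hosts), whereas nslookup queries DNS directly.

```shell
# Run a lookup command against each host name given as an argument and
# report OK/FAIL per name. Returns non-zero if any lookup failed.
# LOOKUP is left unquoted on purpose so that "getent hosts" splits into words.
check_names() {
    lookup=${LOOKUP:-nslookup}
    failed=0
    for host in "$@"; do
        if $lookup "$host" >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}
```

On X-VBA or N-Proxy this would be `check_names Y-Server Z-SN`; on Y-Server or Z-SN it would be `check_names X-VBA N-Proxy`.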
Hope this helps!
Regards,
Mahesh
rcla2
April 9th, 2015 08:00
Hello,
My infrastructure is quite simple, since this is a NetWorker test environment: a single NetWorker server that is also a storage node (called naboo) and a VBA (called cjebr3test). DNS is OK, see below:
cjebr3test:/space/home/admin # nslookup naboo.company.com
Server: 172.31.1.4
Address: 172.31.1.4#53
Non-authoritative answer:
Name: naboo.company.com
Address: 172.30.0.193
cjebr3test:/space/home/admin # nslookup 172.30.0.193
Server: 172.31.1.4
Address: 172.31.1.4#53
Non-authoritative answer:
193.0.30.172.in-addr.arpa name = naboo.company.com.
[root@naboo nsr]# nslookup cjebr3test.company.com
Server: 172.30.100.1
Address: 172.30.100.1#53
Name: cjebr3test.company.com
Address: 172.31.1.62
[root@naboo nsr]# nslookup 172.31.1.62
Server: 172.30.100.1
Address: 172.30.100.1#53
62.1.31.172.in-addr.arpa name = cjebr3test.company.com.
I found a workaround that lets me restore a full VM: I added the IP address of the NetWorker server, followed by its FQDN and its short name, to the /etc/hosts file, and then it worked fine!
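For reference, the hosts-file workaround can be scripted so the entry is only added once. This is a sketch under assumptions: the function name is made up, the short name is taken to be everything before the first dot of the FQDN, and the file path is an optional third argument (defaulting to /etc/hosts) so it can be tried on a copy first.

```shell
# Append "IP<TAB>FQDN shortname" to a hosts file unless the FQDN already
# appears as a host name on a non-comment line.
add_hosts_entry() {
    ip=$1
    fqdn=$2
    hosts_file=${3:-/etc/hosts}
    short=${fqdn%%.*}
    if awk -v h="$fqdn" '
            $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
            END { exit !found }' "$hosts_file" 2>/dev/null; then
        echo "already present: $fqdn"
        return 0
    fi
    printf '%s\t%s %s\n' "$ip" "$fqdn" "$short" >> "$hosts_file"
}
```

With the addresses from this thread, that would be `add_hosts_entry 172.30.0.193 naboo.company.com`, run as root on the host whose /etc/hosts was edited.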
I wonder whether this issue is related to the fact that the NetWorker server is simply called "naboo" while the storage node is called "naboo.company.com".
I noticed this difference while looking at the NMC Devices tab.
Moreover, I have many entries in /nsr/logs/daemon.log (at least one per second) like this one:
73139 04/09/2015 04:41:52 PM nsrd RAP warning Received no results for NSR Storage Node resource query for naboo on server naboo
R