NetWorker


RPC send operation failed; errno = An existing connection was forcibly closed by the remote host.

Hello,

We have recently started getting RPC errors with our backups, which we did not see before.

Environment:

Server/Storage Node: NetWorker 7.4.5.10.Build.810 Enterprise Edition on Sun Solaris 9

Tape Library: Sun StorageTek SL500 - 5xLTO3 tape drives

Clients: 7.6.4.3.Build.1070, 7.4.5.Build.758, 7.4.3.Build.569 on Windows 2003 and Windows 2008 servers.

We also have Linux and Solaris clients but this problem does not occur on those clients.

The error messages for the savesets that fail are the following:

39078:save: RPC error: RPC send operation failed; errno = An existing connection was forcibly closed by the remote host.

39078:save: RPC error: RPC send operation failed.  A network connection could not be established with the host.

74209:save: Quit signal received.

Here are some examples from the NetWorker savegroup completion logs:

--- Unsuccessful Save Sets ---

* win2008-yx01:VSS SYSTEM FILESET:\ 1 retry attempted

* win2008-yx01:VSS SYSTEM FILESET:\ 39078:save: RPC error: RPC send operation failed.  A network connection could not be established with the host.

* win2008-yx01:VSS SYSTEM FILESET:\

* win2008-yx01:VSS SYSTEM FILESET:\ System Writer - ERROR: Failed to save FileGroup files, writer = System Writer

* win2008-yx01:VSS SYSTEM FILESET:\ System Writer - Error saving writer System Writer

* win2008-yx01:VSS SYSTEM FILESET:\ System Writer - ERROR: Aborting backup of saveset VSS SYSTEM FILESET: because of the error with writer System Writer.

* win2008-yx01:VSS SYSTEM FILESET:\ System Writer - Error saving

* win2003-db02:D:\ORACLE 1 retry attempted

* win2003-db02:D:\ORACLE 39078:save: RPC error: RPC send operation failed; errno = An existing connection was forcibly closed by the remote host.

* win2003-db02:D:\ORACLE

* win2003-db02:D:\ORACLE

* win2003-db02:D:\ORACLE 74209:save: Quit signal received.

     * <ERROR> :  Failed with error(s)

* win2008-db05:D:\ 1 retry attempted

* win2008-db05:D:\ 39078:save: RPC error: RPC send operation failed.  A network connection could not be established with the host.

* win2008-db05:D:\

     * <ERROR> :  error while saving \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy8\Inetpub\catalog.wci\0001000E.ci

* win2008-db05:C:\ 1 retry attempted

* win2008-db05:C:\ 39078:save: RPC error: RPC send operation failed.  A network connection could not be established with the host.

* win2008-db05:C:\

* win2008-db05:C:\ 5195:save: save failed on \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy7\WINSRV\$NtServicePackUninstall$\reg00254

     * <ERROR> :  error while saving \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy7\WINSRV\$NtServicePackUninstall$\reg00254

* win2003-db21:G:\ 1 retry attempted

* win2003-db21:G:\ 39078:save: RPC error: RPC send operation failed.  A network connection could not be established with the host.

* win2003-db21:G:\

     * <ERROR> :  error while saving \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy633\Backup\WPMSharepoint.bak

* win2003-db21:C:\ 1 retry attempted

* win2003-db21:C:\ 39078:save: RPC error: RPC send operation failed.  A network connection could not be established with the host.

* win2003-db21:C:\

     * <ERROR> :  error while saving \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy634\hp\hpdiags\tcstorage.dll

* win2003-file01:D:\ 1 retry attempted

* win2003-file01:D:\ 39078:save: RPC error: RPC send operation failed.  A network connection could not be established with the host.

* win2003-file01:D:\

     * <ERROR> :  error while saving \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy330\APPS\Ecl\2004a_1\frontsim\2002a_1\old\Eclipse\2002a_1\flogrid\tutorials\RESCUE\cloudspin\cloudspin.bin

* win2003-lotus01:C:\ 1 retry attempted

* win2003-lotus01:C:\ 39078:save: RPC error: RPC send operation failed; errno = An existing connection was forcibly closed by the remote host.

* win2003-lotus01:C:\

* win2003-lotus01:C:\

* win2003-lotus01:C:\ 74209:save: Quit signal received.
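To see at a glance which clients and save sets hit which message code, the completion log above can be grouped by client. The following is only a rough sketch: the regular expression is inferred from the log lines quoted in this post, not from any documented NetWorker log format, and `summarize_failures` and `SAMPLE` are illustrative names.

```python
import re
from collections import defaultdict

# Pattern inferred from the lines quoted above (an assumption), e.g.:
#   * win2003-db02:D:\ORACLE 39078:save: RPC error: ...
LINE_RE = re.compile(
    r"^\*\s+(?P<client>[^:\s]+):(?P<saveset>.+?)\s+(?P<code>\d+):save:\s+(?P<msg>.+)$"
)


def summarize_failures(log_text):
    """Return {client: {message_code: set_of_savesets}} for coded save errors."""
    summary = defaultdict(lambda: defaultdict(set))
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if match:
            summary[match["client"]][match["code"]].add(match["saveset"])
    return summary


# A few lines copied from the completion log in this post.
SAMPLE = r"""
* win2003-db02:D:\ORACLE 1 retry attempted
* win2003-db02:D:\ORACLE 39078:save: RPC error: RPC send operation failed; errno = An existing connection was forcibly closed by the remote host.
* win2003-db02:D:\ORACLE 74209:save: Quit signal received.
* win2008-db05:C:\ 39078:save: RPC error: RPC send operation failed.  A network connection could not be established with the host.
* win2008-db05:C:\ 5195:save: save failed on \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy7\WINSRV\$NtServicePackUninstall$\reg00254
"""

if __name__ == "__main__":
    for client, codes in sorted(summarize_failures(SAMPLE).items()):
        for code, savesets in sorted(codes.items()):
            print(client, code, sorted(savesets))
```

Lines without a numeric message code (for example "1 retry attempted") are skipped on purpose, so the summary only counts real error messages.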

Thank you very much for your help/contribution in the resolution of this problem.

Replies (15)

Hi Celio ,

Would you please check the vss writers on those machines by running the following command:

vssadmin list writers

Ensure that all writers are in the Stable state with no errors reported.
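If there are many clients to check, the writer list can be scanned automatically. The sketch below assumes the usual `vssadmin list writers` layout ("Writer name:", "State:", "Last error:" lines); the sample text is made up for illustration and `unstable_writers` is a hypothetical helper name.

```python
import re


def unstable_writers(vssadmin_output):
    """Return [(writer, state, last_error)] for writers that are not healthy."""
    problems = []
    name = state = None
    for line in vssadmin_output.splitlines():
        line = line.strip()
        if line.startswith("Writer name:"):
            name = line.split(":", 1)[1].strip().strip("'")
            state = None
        elif line.startswith("State:"):
            # "State: [1] Stable" -> "Stable"
            state = re.sub(r"^\[\d+\]\s*", "", line.split(":", 1)[1].strip())
        elif line.startswith("Last error:") and name is not None:
            last_error = line.split(":", 1)[1].strip()
            if state != "Stable" or last_error != "No error":
                problems.append((name, state, last_error))
            name = None
    return problems


# Illustrative sample only, not output captured from the affected servers.
SAMPLE = """
Writer name: 'System Writer'
   State: [1] Stable
   Last error: No error

Writer name: 'WMI Writer'
   State: [7] Failed
   Last error: Timed out
"""

if __name__ == "__main__":
    for writer, state, error in unstable_writers(SAMPLE):
        print(f"{writer}: state={state}, last error={error}")
```

On a client, the real output could be captured with `subprocess.run(["vssadmin", "list", "writers"], capture_output=True, text=True).stdout` and passed to the same function, assuming Python is available there.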

Secondly, run some communication checks between the backup server and the client using nslookup and rpcinfo. From the backup server to the NetWorker client (using both the short name and the FQDN):

  • nslookup <client>
  • rpcinfo -p <client>

  And the following from the client to the backup server (again using both the short name and the FQDN):

  • check that the backup server is listed in the servers file under the nsr directory
  • nslookup <backup-server>
  • rpcinfo -p <backup-server>
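The name checks above can be repeated for several hosts with a small script. This is a minimal sketch, not a replacement for nslookup or rpcinfo: it only tests forward resolution and whether a TCP port answers. Ports 7937 and 7938 are assumed here to be the NetWorker nsrexecd and portmapper defaults; adjust them if your service port range differs. All function names are illustrative.

```python
import socket
import sys


def check_names(names, resolve=socket.gethostbyname):
    """Map each name to its IPv4 address, or to None if it does not resolve."""
    results = {}
    for name in names:
        try:
            results[name] = resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
    return results


def consistent(results):
    """True if every name resolved and all names point at the same address."""
    addresses = set(results.values())
    return None not in addresses and len(addresses) == 1


def port_open(host, port, timeout=5.0):
    """True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Usage: python check_client.py <short-name> <fqdn>
    names = sys.argv[1:]
    if names:
        resolved = check_names(names)
        print("resolution:", resolved, "consistent:", consistent(resolved))
        for port in (7937, 7938):  # assumed NetWorker defaults
            print(names[0], port, "open" if port_open(names[0], port) else "closed")
```

Run it once on the backup server with the client's short name and FQDN, and once on the client with the server's names, so both directions are covered.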

If all of the above checks pass and name resolution is correct, add VSS:*=off to the save operations field for the Windows 2003 clients. In NMC, navigate to Configuration, then Clients, select each Windows 2003 client that is failing with these errors, and add VSS:*=off to the save operations field under the Apps & Modules tab.

For the Windows 2008 clients, modify the save command field to:

save -a '"ignore-all-missing-system-files=yes"'

Re-run the backups and let us know the results.

Ahmed Bahaa

Hi Celio,

Additionally, I recommend that you delete the NSR peer information and clear the tmp directory on the client. Please follow the steps below to delete the NSR peer information on the NetWorker server and on the client.

  1. At NetWorker server command line, go to the location /nsr/res
  2. Type the command: 
    nsradmin -p nsrexec
    print type:nsr peer information; name:client_name
    delete
    y

   (replace client_name with the name of the client)

  1. At the client command line, go to the location /nsr/res
  2. Type the command:
    nsradmin -p nsrexec
    print type:nsr peer information
    delete
    y

From the client side, stop the NetWorker services and then rename the tmp directory under <installation-path>\nsr to tmp.old and then start the services again.

Hope this helps as well.

Ahmed Bahaa

Hello Ahmed,

Thank you very much for your help.

I've tried all your suggested steps:

  • name resolution verification - OK
  • rpcinfo -p - OK
  • "VSS:*=off" for Windows 2003 clients - DONE
  • save -a '"ignore-all-missing-system-files=yes"' for Windows 2008 clients - DONE
  • delete the nsr peer information and renaming the client's nsr/tmp folder - DONE

But unfortunately the problem persists.

39078:save: RPC error: RPC send operation failed; errno = An existing connection was forcibly closed by the remote host.


74209:save: Quit signal received.

Hi,

1- For testing purposes, disable the antivirus.

2- If you still get the same error after that, do you have a firewall? Either way, set the following registry values:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
KeepAliveTime=3420000

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\
TcpWindowSize=256000

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\
GlobalMaxTcpWindowSize=16777216

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\
KeepAliveInterval=1000
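To avoid typing the values by hand on each client, the four settings above can be written to a .reg file and imported. This sketch assumes all four values are REG_DWORD entries given in decimal (KeepAliveTime and KeepAliveInterval are in milliseconds, so 3420000 is 57 minutes and 1000 is 1 second). `render_reg` is an illustrative name, and the script only builds the file text; it does not touch the registry.

```python
TCPIP_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"

# Decimal values exactly as listed in the post above.
VALUES = {
    "KeepAliveTime": 3420000,
    "TcpWindowSize": 256000,
    "GlobalMaxTcpWindowSize": 16777216,
    "KeepAliveInterval": 1000,
}


def render_reg(key, values):
    """Build the text of a .reg file that sets each value as a DWORD under key."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    for name, value in values.items():
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError(f"{name} does not fit in a 32-bit DWORD: {value}")
        # .reg files expect DWORDs as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    print(render_reg(TCPIP_KEY, VALUES))
```

Review the generated text before importing it, and export the existing Parameters key first so the change can be rolled back.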

Please confirm.

Regards,

Mustafa

Hi,

Yes, I forgot to mention that I disabled the AntiVirus (Symantec Endpoint Protection) before testing again.

No, there is no firewall between the networker server and client, otherwise the rpcinfo wouldn't work.

I will try the registry keys for TCP/IP and I will get back to you.

Thanks again!

Do you have to reboot in order for the registry parameters to be effective?

Yes, a reboot is required.

Set the above on both the backup server and the client.

Mustafa

Unfortunately, I cannot reboot the client as it is a production server.

The Backup Server is on Sun Solaris.

Thanks.

I suspect that the problem is related to open files. What do you think? Is there a way to back up open files or simply skip them?

Thanks again.
