DeaconZ28-2015

Backup Job Failed - not sure why exactly

Need some help here. We are in the middle of an IT centralization project and have taken over two remote-site file servers. There is no longer any IT staff at the location, so we need to send the backups over the WAN (a 50 Mbps MPLS circuit at the site). The clients are a Windows 2003 server and a Windows 2008 R2 server, both running NetWorker 8.1.0.2. Our NW server here is also 8.1.0.2, and the clients use Client Direct to back up straight to our Data Domain. Backups run for a while, then abort. We are looking at the servers, but they seem fine otherwise. Here are the messages we receive; any input is appreciated. We are particularly concerned about the 2003 box, since it is heavily used (although not during the backup window).

--- Unsuccessful Save Sets ---

This one is Windows 2003

* GA-FILE02.domain.com:J:\WRData libDDBoost version: major: 2, minor: 6, patch: 1, engineering: 1, build: 394260

* GA-FILE02.domain.com:J:\WRData 86704:save: Successfully established DDCL session for save-set ID '2307045880' (GA-file02.domain.com:J:\WRData).

* GA-FILE02.domain.com:J:\WRData 76677:save: RPC send operation failed; errno = Unknown error

* GA-FILE02.domain.com:J:\WRData 32177:save: xdr of win32 attributes failed for `\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy182\WRData\Drawings\258819\258819\00\19\18.C4'

* GA-FILE02.domain.com:J:\WRData 74209:save: Quit signal received.

* GA-FILE02.domain.com:J:\WRData 99123:save: Handling an abort while processing Windows backup.

* GA-FILE02.domain.com:J:\WRData Unable to find any full backups of the save set 'GA-FILE02.domain.com:J:\WRData' in the media database. Performing a full backup.

  GA-FILE02.domain.com:J:\WRData: retried 1 times.

This one is Windows 2008 R2

* GA-FILE03.domain.com:G:\Engineering Unable to find any full backups of the save set 'GA-FILE03.domain.com:G:\Engineering' in the media database. Performing a full backup.

  GA-FILE03.domain.com:G:\Engineering: retried 1 times.

* GA-FILE03.domain.com:G:\Teamcenter 39078:save: RPC error: RPC send operation failed; errno = An existing connection was forcibly closed by the remote host.

* GA-FILE03.domain.com:G:\Teamcenter

* GA-FILE03.domain.com:G:\Teamcenter

* GA-FILE03.domain.com:G:\Teamcenter 74209:save: Quit signal received.

* GA-FILE03.domain.com:G:\Teamcenter Unable to find any full backups of the save set 'GA-FILE03.domain.com:G:\Teamcenter' in the media database. Performing a full backup.

  GA-FILE03.domain.com:G:\Teamcenter: retried 1 times.
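When comparing failures across several clients, it can help to group the messages by their numeric message ID. The following is only a rough sketch: the function name `group_by_message_id` is made up, and the regular expression assumes the `* <saveset> <id>:save: <text>` layout seen in the report above, with no spaces in the save set name.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   * HOST:J:\Path 76677:save: RPC send operation failed; ...
# Assumption: the save set name contains no whitespace.
LINE_RE = re.compile(
    r"^\*\s+(?P<saveset>\S+)\s+(?P<msg_id>\d+):save:\s+(?P<text>.+)$"
)

def group_by_message_id(lines):
    """Map each message ID to a list of (save set, message text) pairs."""
    groups = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # lines without an "<id>:save:" part are skipped
            groups[match.group("msg_id")].append(
                (match.group("saveset"), match.group("text"))
            )
    return dict(groups)

# A few lines copied from the report above.
SAMPLE = [
    r"* GA-FILE02.domain.com:J:\WRData 76677:save: RPC send operation failed; errno = Unknown error",
    r"* GA-FILE02.domain.com:J:\WRData 74209:save: Quit signal received.",
    r"* GA-FILE03.domain.com:G:\Teamcenter 39078:save: RPC error: RPC send operation failed; errno = An existing connection was forcibly closed by the remote host.",
    r"* GA-FILE03.domain.com:G:\Teamcenter",
    r"* GA-FILE03.domain.com:G:\Teamcenter 74209:save: Quit signal received.",
]

if __name__ == "__main__":
    for msg_id, hits in sorted(group_by_message_id(SAMPLE).items()):
        print(msg_id, [saveset for saveset, _ in hits])
```

Grouped this way, both clients show an RPC send failure followed by a quit signal, which points at the connection rather than at either server individually.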

CarlosRojas

Re: Backup Job Failed - not sure why exactly

Hi there,

Before getting into deeper troubleshooting, a basic question:

Have you tuned the TCP/IP settings on the server and the clients?

Something like this:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\TcpWindowSize=256000

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\GlobalMaxTcpWindowSize=16777216

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\KeepAliveInterval=1000

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\KeepAliveTime=600000

You need to reboot afterwards.

Those values can be increased or adjusted as required.
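If it helps, here is a minimal sketch that prints `reg add` command lines for the four values listed above, so they can be reviewed before being run from an elevated command prompt. The helper name `reg_add_commands` is made up for illustration, and it assumes all four values are created as REG_DWORD.

```python
# Sketch only: generate `reg add` commands for the suggested TCP values.
# Nothing is written to the registry here; the commands are just printed.
TCPIP_KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"

SUGGESTED = {
    "TcpWindowSize": 256000,             # bytes
    "GlobalMaxTcpWindowSize": 16777216,  # bytes
    "KeepAliveInterval": 1000,           # milliseconds
    "KeepAliveTime": 600000,             # milliseconds
}

def reg_add_commands(values, key=TCPIP_KEY):
    """Return one `reg add` command line per value (all as REG_DWORD)."""
    return [
        f'reg add "{key}" /v {name} /t REG_DWORD /d {data} /f'
        for name, data in values.items()
    ]

if __name__ == "__main__":
    for command in reg_add_commands(SUGGESTED):
        print(command)
```

The `/f` switch overwrites an existing value without prompting, so export the key first if you want an easy way back.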

Also consider disabling TCP chimney.
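For reference, a small sketch of the commands normally used to turn TCP Chimney Offload off. The two `netsh` lines are the ones Microsoft documents for these OS versions; the helper name `chimney_command` and the version strings are only for illustration. On Windows 2003 the setting exists only when the Scalable Networking Pack (included in SP2) is installed.

```python
# Sketch only: look up the documented command that disables TCP Chimney
# Offload for a given Windows Server version.
CHIMNEY_OFF = {
    "2003": "netsh int ip set chimney DISABLED",
    "2008": "netsh int tcp set global chimney=disabled",
}

def chimney_command(os_version):
    """Return the disable command for e.g. "2003" or "2008 R2"."""
    for prefix, command in CHIMNEY_OFF.items():
        if os_version.startswith(prefix):
            return command
    raise ValueError(f"no command known for Windows {os_version}")

if __name__ == "__main__":
    for version in ("2003", "2008 R2"):
        print(version, "->", chimney_command(version))
```

On 2008 and later, `netsh int tcp show global` displays the current chimney state.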

One final suggestion: if you can move the NW server OS to Windows 2012, you will also see a big improvement.

Thank you,

Carlos

DeaconZ28-2015

Re: Backup Job Failed - not sure why exactly

Thanks, I am checking the settings now.

These two are not present at all:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\TcpWindowSize=256000

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\GlobalMaxTcpWindowSize=16777216

These two exist on the 2003 server but are currently set to 10000 and 900000, respectively (the suggested values are shown):

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\KeepAliveInterval=1000

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\KeepAliveTime=600000

None of those are on the 2008 R2 box.

We plan on upgrading the NW box to 2012, but were waiting to replace the hardware first. Does an in-place upgrade from 2008 R2 to 2012 change the host ID?

CarlosRojas

Re: Backup Job Failed - not sure why exactly

Hi,

I would say that for Windows 2003 all 4 values should be added.
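To show why the window size matters on a 50 Mbps WAN link, here is a rough back-of-the-envelope sketch. A single TCP stream cannot move more than one window per round trip, so throughput is capped at window / RTT. The 40 ms round-trip time is purely an assumption (measure the real value with ping); 65535 bytes is the largest window possible without TCP window scaling, and 256000 is the suggested TcpWindowSize.

```python
# Sketch only: per-stream TCP throughput ceiling and the window needed
# to fill a link. RTT_MS is an assumed value, not a measurement.

def max_tcp_throughput_mbps(window_bytes, rtt_ms):
    """Upper bound for one TCP stream: one window per round trip."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1_000_000

def window_needed_bytes(link_mbps, rtt_ms):
    """Bandwidth-delay product: bytes in flight needed to fill the link."""
    return link_mbps * 1_000_000 / 8 * (rtt_ms / 1000)

RTT_MS = 40     # assumed round-trip time across the MPLS circuit
LINK_MBPS = 50  # circuit speed from the original post

if __name__ == "__main__":
    for window in (65535, 256000):
        ceiling = max_tcp_throughput_mbps(window, RTT_MS)
        print(f"window {window} bytes: at most {ceiling:.1f} Mbps")
    needed = window_needed_bytes(LINK_MBPS, RTT_MS)
    print(f"window needed to fill the link: {needed:.0f} bytes")
```

With the assumed 40 ms RTT, a 65535-byte window tops out at about 13.1 Mbps, while 256000 bytes allows about 51.2 Mbps, just above the roughly 250000 bytes needed to fill the circuit. A different RTT changes these numbers proportionally.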

For Windows 2008, adding the first two should be enough, but still disable TCP chimney.

Also, is there any firewall in between? Check the firewall logs just in case.

Also check the client-side logs, as this could be some other VSS-related issue.

Another recommendation would be to set the inactivity timeout on the group for those clients to "0" (unlimited).

And yes, I think upgrading the OS will change the host ID.

Thank you,

Carlos
