I would like to know whether anyone has seen this issue before.
I'm performing a Windows file system backup of a volume residing on SAN storage. The backup is SAN-based. The backup of this volume consistently fails after 1228 GB.
11/16/2013 10:28:03 AM 0 0 2 3936 3924 0 limbue101.lim.emea.dell.com nsrd e3acspaclib01:T:\ done saving to pool 'filesystem' (E10071) 1228 GB
11/17/2013 04:57:50 PM 0 0 2 3936 3924 0 limbue101.lim.emea.dell.com nsrd e3acspaclib01:T:\ done saving to pool 'filesystem' (E30203) 1228 GB
NetWorker doesn't give any more details.
On the client server itself, there is an error in the Windows application log that corresponds to the job failure:
Description: Faulting application save.exe, version 220.127.116.11, time stamp 0x4febfc65, faulting module liblocal.dll, version 18.104.22.168, time stamp 0x4febefdc, exception code 0xc00000fd, fault offset 0x000000000006a587, process id 0x158, application start time 0x01cee4a127c148cd.
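For reference, the exception code in that event can be decoded against Microsoft's documented NTSTATUS values, where 0xC00000FD is STATUS_STACK_OVERFLOW. Below is a minimal Python sketch of that lookup. The table is only a small hand-picked subset of the NTSTATUS list, and the function name is purely illustrative, not part of NetWorker or Windows.

```python
# Small, hand-picked subset of Microsoft's documented NTSTATUS values.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000374: "STATUS_HEAP_CORRUPTION",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def describe_exception_code(code_text):
    """Return the symbolic name for a hex exception code string, if known."""
    code = int(code_text, 16)  # int() accepts the "0x" prefix when base is 16
    return NTSTATUS_NAMES.get(code, "unknown (0x%08X)" % code)


# Exception code taken from the application log entry above.
print(describe_exception_code("0xc00000fd"))  # prints STATUS_STACK_OVERFLOW
```

A stack overflow in save.exe would fit a problem that depends on what is being traversed rather than on how much data has been written, but that is only a working hypothesis.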
There are no errors seen on the SAN fabric that correspond to the failures.
Other volumes on the same server back up without issue.
The client server is W2k8 R2 running NW 7637.
The NetWorker server is W2k3 SP2 running NW 7637.
One idea I have is to run CHKDSK on the affected volume.
Does anyone have other suggestions?
If liblocal fails at the same time as the error occurs, I would suspect NW. You can run save from the CLI to see whether you get any more detail. I would also track how much memory the save command consumes over time, especially as it nears the expected failure point. A 1+ TB save set is not small, and I wonder whether something at that size is breaking NW. You may also wish to update to the latest NW 7 version on the client side.
I would first make sure that the correct TCP/IP timers are in place.
Also check the client inactivity timeout.
Anyway, with a crash I would rather open a case with support, but I agree with Hrvoje: first try upgrading to the latest NW 7.x cumulative fix, as the solution is most likely already included in the latest build.
I would suggest:
1. Upgrade to the latest version of NetWorker 7.6.5 or 8.0/8.1.
If the issue persists,
2. This issue can also be caused by a conflict between NetWorker and other software, especially anti-virus programs. Check whether any anti-virus program is running on the client. If so, temporarily disable or remove it and then check the backup result.
Thanks, guys, for responding.
I upgraded NW to 22.214.171.124 but am still seeing the same problem.
application save.exe, version 126.96.36.199, time stamp 0x5239057e, faulting module liblocal.dll, version 188.8.131.52, time stamp 0x5238fc54, exception code 0xc00000fd, fault offset 0x000000000006a8c7, process id 0x2940, application start time 0x01cee602839f1475
There are other volumes of similar size on this server that back up without issue.
Memory/CPU usage does not seem to be an issue.
There are no relevant entries in the NW log files on the server.
On the NW server, the group is configured with an inactivity timeout of 0 and a file inactivity threshold of 0.
Thanks for taking the time to respond.
Can you try two tests:
- start save from client and see if it breaks at the same point
- if it breaks at the same point, start backup in debug mode only for that point
If this takes too much time, since you want backups rather than tests first, manually split this save set into folders and back up each one separately. One of them may fail, possibly at the same point where it was breaking before, and you can then apply the tests above to that isolated area.
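To help decide how to split the save set, the top-level folders and their sizes can be listed first. This is a rough Python sketch that only reads the file system and does not call NetWorker. The function name `subdir_sizes` is made up for illustration, and it uses an explicit stack instead of recursion so that very deep trees do not hit Python's own recursion limit.

```python
import os


def subdir_sizes(root):
    """Return {subdirectory path: total file bytes beneath it} for root's immediate children."""
    sizes = {}
    with os.scandir(root) as entries:
        children = [e.path for e in entries if e.is_dir(follow_symlinks=False)]
    for child in children:
        total = 0
        stack = [child]  # explicit stack: no recursion, so depth is not a problem
        while stack:
            current = stack.pop()
            try:
                with os.scandir(current) as it:
                    for entry in it:
                        if entry.is_dir(follow_symlinks=False):
                            stack.append(entry.path)
                        elif entry.is_file(follow_symlinks=False):
                            total += entry.stat(follow_symlinks=False).st_size
            except OSError:
                # Unreadable directory: skip it rather than abort the whole scan.
                continue
        sizes[child] = total
    return sizes
```

Calling `subdir_sizes("T:\\")` on the client would give one entry per top-level folder, which can then be backed up as separate save sets until the failing one is found. Expect the scan to take a while on a volume of this size.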
IMHO, the picture seems to point to a file system problem. In this case, another NW client version will of course not help.
Thanks, guys, for the suggestions.
Sorry for the delay in responding.
I broke down the save sets and performed a manual save on the directory where the backup is failing.
The directory path of the last file backed up is listed below.
While I think it's a file system issue, I'm wondering whether NetWorker would have an issue navigating such a large file system to back up a file.
Has anyone ever seen anything like this before?
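One way to check whether path depth or length is unusual on the failing directory is to measure it directly. Exception code 0xc00000fd is the Windows stack overflow status, so an exceptionally deep directory tree is a plausible (but unconfirmed) trigger. The Python sketch below walks a tree with an explicit stack and reports the deepest directory and the longest path it finds. The function name is hypothetical, and on Windows, paths beyond the classic 260-character limit may need the `\\?\` prefix on the root to be readable at all.

```python
import os


def deepest_and_longest(root):
    """Walk root iteratively; return (max_depth, deepest_dir, longest_path)."""
    max_depth, deepest_dir, longest_path = 0, root, root
    stack = [(root, 0)]  # (directory, depth below root)
    while stack:
        current, depth = stack.pop()
        if depth > max_depth:
            max_depth, deepest_dir = depth, current
        try:
            with os.scandir(current) as it:
                for entry in it:
                    if len(entry.path) > len(longest_path):
                        longest_path = entry.path
                    if entry.is_dir(follow_symlinks=False):
                        stack.append((entry.path, depth + 1))
        except OSError:
            # Unreadable or too-long path: skip it and keep scanning.
            continue
    return max_depth, deepest_dir, longest_path
```

Running this against the failing directory and against one of the volumes that backs up cleanly would show whether the failing one stands out in depth or path length.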