Running Windows Server Backup causes the disk errors below. We have a RAID 1+0 (4 disks) array on a PERC H310 controller:
I found a good article concerning the event 153 errors: http://blogs.msdn.com/b/ntdebugging/archive/2013/04/30/interpreting-event-153-errors.aspx
You may have a disk that needs to be replaced; I would run diagnostics on the drives just to confirm.
EventID: 140, http://social.technet.microsoft.com/Forums/en-US/85dda2c8-485a-45f1-b438-80720fb10a7e/ntfs-warning-i... also points to a hard drive issue.
EventID: 14 - System, volsnap; this looks consistent with the previous two errors. Since it detects errors on the drive, it aborts writing the shadow copies.
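If it helps, those events can be pulled together for review from an elevated prompt; a rough sketch, filtering on the event IDs only (so unrelated events that happen to share the same IDs may also show up):

rem Dump the 20 most recent System events with IDs 153, 140 or 14, newest first
wevtutil qe System /q:"*[System[(EventID=153 or EventID=140 or EventID=14)]]" /f:text /rd:true /c:20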
These are brand new SAS disks. If there was a problem, surely the RAID controller would have detected it, wouldn't it?
How do I check individual disks when they are part of a RAID array?
You can use the online diagnostic package, which will test the drives individually. It can be found here:
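If Dell OpenManage Server Administrator (OMSA) is installed on the server, you can also check the state of the individual physical disks behind the controller without taking anything offline; a sketch, assuming the PERC shows up as controller 0 (confirm with the first command):

rem List the storage controllers and their IDs
omreport storage controller

rem Show each physical disk on controller 0, including its state (Online, Failed, Predictive Failure, etc.)
omreport storage pdisk controller=0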
Thanks Geoff, I have downloaded that and installed the update to 'Dell 64 Bit uEFI Diagnostics', but how do I run this online? Also, what happens to the RAID array if I test one disk offline? I assume it will then need to be rebuilt, will it?
In the meantime we have run a Disk Consistency check on the RAID array, which failed with:
The Check Consistency found inconsistent parity data. Data redundancy may be lost.: Virtual Disk 0 (Virtual Disk 0) Controller 0 (PERC H310 Adapter)
The RAID controller is now resyncing; it has been running for about 17 hours and is 75% of the way through.
The issue is that the RAID controller was reporting that all disks were operating fine; without running the RAID controller's Disk Consistency check we wouldn't have known there was a problem. This is a brand new server, so how can the RAID array not be in sync? Would it not have been synchronised before it left the factory?
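For what it's worth, if OMSA is installed the consistency check can also be started and monitored from within the OS rather than from the controller BIOS; a sketch, assuming controller 0 and virtual disk 0 as reported in the message above (confirm the IDs with omreport first):

rem Start a consistency check on virtual disk 0 of controller 0
omconfig storage vdisk action=checkconsistency controller=0 vdisk=0

rem Show the virtual disk state and the progress of the running check
omreport storage vdisk controller=0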
This problem is being caused by Windows Server Backup trying to back up the Hyper-V host partition. Backup from a virtualised server works fine; it is just the Hyper-V host that has the problem.
We have also identified that this only happens when backing up to a non-RAID disk in the same disk chassis as the RAID array disks. Backing up to an external disk works fine.
Backup fails as follows and then leaves the RAID array corrupted!
Backup failed as shadow copy on source volume got deleted. This might be caused by high write activity on the volume. Please retry the backup. If the issue persists consider increasing shadow copy storage using 'VSSADMIN ShadowStorage' command.
EventID: 140 – System, Microsoft-Windows-Ntfs
The system failed to flush data to the transaction log. Corruption may occur in VolumeId: V:, DeviceName: \Device\HarddiskVolume9.
(The I/O device reported an I/O error.)
EventID: 153 - System, disk
The IO operation at logical block address c60 for Disk 0 (PDO name: \Device\00000043) was retried.
EventID: 157 - System, disk
Disk 2 has been surprise removed.
EventID: 517 – Application, Microsoft-Windows-Backup
The backup operation that started at '2014-03-20T15:37:24.984150300Z' has failed with following error code '0x8007045D' (The request could not be performed because of an I/O device error.). Please review the event details for a solution, and then rerun the backup operation once the issue is resolved.
This is obviously an incompatibility between the PERC H310 RAID controller and Windows Server Backup on Server 2012 R2.
Any known fixes?
You will need to increase the VSS cache size for backups. The error that is occurring ("Backup failed as shadow copy on source volume got deleted. This might be caused by high write activity on the volume. Please retry the backup. If the issue persists consider increasing shadow copy storage using 'VSSADMIN ShadowStorage' command.") means that the shadow copy data held in the cache is being deleted before it can be written to the backup location. Because the data is deleted before it has been written and acknowledged as written, you get what is known as dirty cache. When the cache buffer fills up it does a forced flush, which is supposed to flush out only the old data that has already been written to the drive, but for some reason it is flushing most or all of the data in the cache instead of checking to flush just the old cache data.
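For reference, the shadow copy storage can be checked and resized from an elevated command prompt with vssadmin; just a sketch, where V: is the volume named in the NTFS error above and the size is up to you:

rem Show the current shadow copy storage associations and limits
vssadmin list shadowstorage

rem Let shadow copy storage for V: grow as needed (or set a fixed size such as /maxsize=20GB)
vssadmin resize shadowstorage /for=V: /on=V: /maxsize=UNBOUNDED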
Let us know how it works.
All drives are set to unlimited space for Shadow Copies. The latest event log errors are below. It seems the root of the problem is that VSS is causing 'Disk 3 has been surprise removed'; any idea why that is happening?
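One way to narrow this down (a sketch, using only the event IDs already quoted above) is to pull the timestamps of the surprise-removal and backup events and see whether the removals line up with backup start times:

rem Most recent 'surprise removed' events (EventID 157) from the System log, newest first
wevtutil qe System /q:"*[System[(EventID=157)]]" /f:text /rd:true /c:10

rem Most recent Windows Server Backup events (EventID 517) from the Application log
wevtutil qe Application /q:"*[System[(EventID=517)]]" /f:text /rd:true /c:10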
I just found a message elsewhere which again involved a Dell PERC H310 controller. As no one other than PERC H310 owners seems to be reporting this problem, this strongly suggests it is down to the PERC H310 or its driver. Are there any driver or firmware updates in the pipeline which might cure this?