December 6th, 2010 16:00

PERC 4e/DI - Event ID 570 - Reported Predictive Failure - RAID5 with 3 Drives and newly added hot spare drive


Our Exchange server has a drive that is showing a predictive failure. The server is configured as RAID 5 with three drives. After the drive started reporting the predictive failure, we purchased a new drive and inserted it into the server. The new drive is now configured as a hot spare and shows as Ready under Type in OpenManage Array Manager.

Q: How long does it take for the rebuild onto the hot spare to start automatically? Does it start on its own at all?

Q: Does the predictive-failure drive need to be taken offline to prompt the use of the hot spare?

Any help is greatly appreciated. 

 


9.3K Posts

December 7th, 2010 08:00

A predictive failure doesn't mean the drive has failed (yet). You did what one should do and got a new drive, but to make the array fail over to it, you either have to wait for the actual failure or trigger one yourself (i.e., pull the suspect drive).

 

Before you pull the suspect drive, I highly recommend that you run a background verify on the RAID 5. This makes the RAID controller double-check the RAID parity, and if something isn't 100% consistent, it can fix it (it wouldn't be able to fix parity issues if it only ran into them during a rebuild).
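
For reference, a minimal sketch of kicking that verify off from the OS, assuming OpenManage Server Administrator's omconfig/omreport CLI is installed; the controller and virtual disk IDs (0 and 0) are placeholders, so confirm yours first:

```python
# Sketch: start a consistency check ("background verify") on the RAID 5 virtual
# disk using Dell OpenManage Server Administrator's CLI. Assumes omreport and
# omconfig are on PATH; controller 0 / vdisk 0 are assumed IDs -- check yours
# with "omreport storage vdisk" before running.
import subprocess

CONTROLLER = "0"  # assumed controller ID
VDISK = "0"       # assumed virtual disk ID

# Show the current virtual disk layout/state first
subprocess.run(["omreport", "storage", "vdisk", f"controller={CONTROLLER}"], check=True)

# Kick off the consistency check on the virtual disk
subprocess.run(
    ["omconfig", "storage", "vdisk", "action=checkconsistency",
     f"controller={CONTROLLER}", f"vdisk={VDISK}"],
    check=True,
)
```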

7 Posts

December 7th, 2010 10:00

Is a background verify the same as checking the consistency of the virtual disk?

7 Posts

December 7th, 2010 11:00

Now getting Event ID 671 - PERC 4e/DI Controller 0, Array Disk 0:0, Sense Key = 3, Sense Code = 11, Sense Qualifier = 1. If this disk is part of a non-redundant virtual disk, the data for this block cannot be recovered. The disk will require replacement and data restore. If this disk is part of a redundant virtual disk, the data in this block will be reallocated.

Should I take the suspect disk offline? It is part of the RAID 5.

9.3K Posts

December 7th, 2010 13:00

The consistency check is indeed a background verify.

 

I'd try to run the consistency check in the hope that it can finish before the drive actually fails.

The problem is that if there is RAID parity corruption (not likely, but possible) and a rebuild occurs, the rebuild will fail to complete. The only way to prevent this is to run the consistency check first.
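
If it helps to keep an eye on it while it runs, here's a rough status pass, assuming the OMSA omreport CLI is available; controller 0 and pdisk 0:0 stand in for the real IDs of your controller and the suspect disk:

```python
# Sketch: re-run this periodically while the consistency check is in progress
# and watch the virtual disk State/Progress and the suspect drive's status.
# Assumes OMSA's omreport CLI; controller 0 and pdisk 0:0 are placeholder IDs.
import subprocess

CONTROLLER = "0"        # assumed controller ID
SUSPECT_PDISK = "0:0"   # assumed connector:ID of the predictive-failure disk

# Virtual disk report (shows whether the check is still running)
subprocess.run(["omreport", "storage", "vdisk", f"controller={CONTROLLER}"], check=True)

# Physical disk detail for the suspect drive (state, failure prediction)
subprocess.run(
    ["omreport", "storage", "pdisk", f"controller={CONTROLLER}", f"pdisk={SUSPECT_PDISK}"],
    check=True,
)
```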

7 Posts

December 7th, 2010 14:00

Running the consistency check now. 140 GB. Probably going to take a while...

7 Posts

December 7th, 2010 15:00

Consistency check complete.  Event 570 persists.  I guess the next step is to take the suspect drive offline? 
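
If taking the suspect drive offline does turn out to be the next step (that is what forces the hot spare to kick in), one way to do it through OMSA instead of physically pulling the disk might look like the sketch below; controller 0 and pdisk 0:0 are placeholders for the actual IDs of the drive from Event 570:

```python
# Sketch: force the suspect disk offline so the hot spare takes over and the
# rebuild starts, then check on the rebuild. Assumes OMSA's omconfig/omreport
# CLI; controller 0 and pdisk 0:0 are placeholder IDs for the suspect drive.
import subprocess

CONTROLLER = "0"        # assumed controller ID
SUSPECT_PDISK = "0:0"   # assumed connector:ID of the suspect disk

# Take the suspect drive offline (this triggers failover to the hot spare)
subprocess.run(
    ["omconfig", "storage", "pdisk", "action=offline",
     f"controller={CONTROLLER}", f"pdisk={SUSPECT_PDISK}"],
    check=True,
)

# Confirm that the rebuild onto the hot spare has started
subprocess.run(["omreport", "storage", "vdisk", f"controller={CONTROLLER}"], check=True)
```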


1.8K Posts

December 8th, 2010 07:00

Aside from the consistency check, I would run a manual Patrol Read, which checks the entire surface of every disk in the array for errors; a consistency check only covers the portions of the disks that contain data. In a rebuild situation, most array failures result from errors that have built up in the unused areas of the disk surfaces, which are not normally checked unless Patrol Read runs automatically or is started manually. With today's larger disks and array sizes, those unused areas can be huge, so there is a much greater chance of a large number of errors accumulating over time; during a rebuild, the controller cannot handle multiple errors and simply fails the array. Very high-end RAID systems have better mechanisms in place to deal with this, but the common RAID controllers used by smaller businesses lag behind in that respect. What we really need is more robust error checking built into the RAID adapters.
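
As a rough illustration of starting that manually, assuming the OMSA omconfig CLI is installed (the action keywords below may vary between OMSA versions, and controller 0 is a placeholder):

```python
# Sketch: switch Patrol Read to manual mode and start a pass so the controller
# scans the entire disk surfaces, including the unused areas a consistency
# check skips. Assumes OMSA's omconfig CLI; controller 0 is a placeholder and
# the action keywords may differ between OMSA releases.
import subprocess

CONTROLLER = "0"  # assumed controller ID

# Put Patrol Read in manual mode (instead of the automatic schedule)
subprocess.run(
    ["omconfig", "storage", "controller", "action=setpatrolreadmode",
     f"controller={CONTROLLER}", "mode=manual"],
    check=True,
)

# Start a Patrol Read pass now
subprocess.run(
    ["omconfig", "storage", "controller", "action=startpatrolread",
     f"controller={CONTROLLER}"],
    check=True,
)
```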

 

7 Posts

December 15th, 2010 15:00

After the consistency check I was able to replace the drive. Thank you!

Unfortunately, the other two drives in the array are now giving me Event ID 671, so I have purchased more drives in the hope that I can replace them as well. The new drives arrived today, and the consistency check failed two of the three times I ran it.

What now?
