RAID5 Issue

Question

Hello,

We have a PowerEdge 2950 with a PERC 6/i RAID-5 array consisting of three 300GB 15K SAS disks. One disk failed and was replaced with a 600GB 15K SAS disk. After the rebuild, we noticed the disk's activity LED was steady green and OpenManage showed a "Failure Predicted" alert. Two days later the online LED turned amber and the server became very slow.

We replaced the disk again with another 600GB 15K SAS disk but got the same "Failure Predicted" alert. The rebuild took a long time to finish, with a steady green activity LED and a green online LED. Normally the activity LED should be blinking or off, not steadily on.

Any ideas? Should I use a disk of the same size as the other two?

Thank You

DELL-Charles R · Answer

Hello mahmoud.jundi,

You should be OK with the replacement drives you used. Replacing SAS with SAS, at the same capacity or larger, is confirmed to work.

While it does not happen often, you could have received a drive that was already in predictive failure or close to it, and it went into predictive failure during the rebuild.

From the description, it sounds possible that the array has been punctured.

See if you can run a Consistency Check on the Virtual Disk from OMSA.

Check the controller log for "Puncture" entries.

Export the PERC Controller Log via OpenManage Server Administrator

https://dell.to/3VvrPBI

If you do have a puncture: Check the status of your backup. Try to get a file level backup. If you can't get a backup and this is critical information then you may need to engage a data recovery company.

Reference: How to Handle Puncturing (Bad Blocks) on Virtual Disks for PowerEdge servers

https://dell.to/41ZBnY7
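Once the controller log has been exported, the puncture check can be scripted rather than done by eye. This is only a minimal sketch: it assumes the export is a plain text file and that a puncture is reported on a line containing the word "puncture" in some form. The function names and the file path are hypothetical, not part of OMSA.

```python
import re

# Assumption: a puncture shows up as a line containing "puncture",
# "punctured" or "puncturing" (any case) in the exported text log.
PUNCTURE_RE = re.compile(r"punctur", re.IGNORECASE)


def find_puncture_lines(log_text):
    """Return (line_number, line) pairs that mention a puncture."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if PUNCTURE_RE.search(line):
            hits.append((number, line.strip()))
    return hits


def scan_log_file(path):
    """Read an exported log file and return its puncture-related lines."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_puncture_lines(handle.read())
```

Calling `scan_log_file` with the path of the exported log returns an empty list when no line mentions a puncture. An empty result only means the keyword was not found, so it is still worth reading the log around the time of the rebuild.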

mahmoud.jundi · Answer

Hello, Many thanks for your reply

I did check the PERC controller logs and there is no puncture; please see below.

What I need is to coordinate with support to get a brand-new disk and check the status.

Thanks again

Controller Information PERC 6/i Integrated

Properties : Ok

ID;Name;State;Firmware Version;Driver Version;Number of Connectors;Rebuild Rate;BGI Rate;Check Consistency Rate;Reconstruct Rate;Abort check consistency on error ;Allow Revertible Hot Spare and Replace Member;Load balance;Auto replace member on predictive failure;Cache Memory Size;Patrol Read Mode;Patrol Read State;Patrol Read Iterations;Controller Tasks;
0;PERC 6/i Integrated;Ready;6.1.1-0047;2.23.00.32;2;30;30%;30%;30%;Disabled;Enabled;Auto;Disabled;256 MB;Auto;Stopped;726;Available Tasks;Execute

Virtual Disks

Status;Name;State;Layout;Size;Device Name;Type;Read Policy;Write Policy;Stripe Element Size;Disk Cache Policy
Ok;Virtual Disk 0;Ready;RAID-5;557.75GB;Windows Disk 0;SAS;No Read Ahead;Write Back;64 KB;Disabled
Ok;backup;Ready;RAID-0;931.00GB;Windows Disk 1;SATA;No Read Ahead;Write Back;64 KB;Enabled

Battery

Status;Name;State;Predicted Capacity Status;Learn State;Next Learn Time;Maximum Learn Delay;Learn Mode
Ok;Battery 0;Ready;Ready;Idle;20 days 3 hours;7 days 0 hours;Auto

Connectors

Status;Name;State;Connector Type
Ok;Connector 0;Ready;SAS Port RAID Mode
Ok;Connector 1;Ready;SAS Port RAID Mode

Enclosures

Status;Name;State;Connector ;Firmware Version;Service Tag;SAS Address
Ok;Backplane;Ready;0;1.05;92J00EW;50022090C7A3F700
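The report above is semicolon-delimited, so each section can also be checked with a short script instead of reading it column by column. A minimal sketch, assuming each section is passed in as its header line followed by its data lines; the function name `non_ok_rows` is hypothetical, and the sample text is the Virtual Disks section copied from the report.

```python
import csv
import io


def non_ok_rows(section_text):
    """Parse one semicolon-delimited section (header line plus data lines)
    and return the rows whose Status column is not "Ok"."""
    reader = csv.DictReader(io.StringIO(section_text), delimiter=";")
    return [row for row in reader if row.get("Status", "").strip() != "Ok"]


virtual_disks = """Status;Name;State;Layout;Size;Device Name;Type;Read Policy;Write Policy;Stripe Element Size;Disk Cache Policy
Ok;Virtual Disk 0;Ready;RAID-5;557.75GB;Windows Disk 0;SAS;No Read Ahead;Write Back;64 KB;Disabled
Ok;backup;Ready;RAID-0;931.00GB;Windows Disk 1;SATA;No Read Ahead;Write Back;64 KB;Enabled
"""

print(non_ok_rows(virtual_disks))  # both rows report Ok, so this prints []
```

The Controller Information section has no Status column, so it would need a different check (for example on its State column).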

DELL-Erman O · Answer

Hi, based on the logs it looks okay to me. I wouldn't expect the size to cause the issue directly, as long as the RPM and other features are consistent with the smaller drives. I would first suspect a failure of the drive itself. (In the unlikely event that bad blocks have propagated to other disks, it may be necessary to create a new RAID and restore the data from backup.)
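On the size question, it may help to see why a larger replacement is harmless from a capacity point of view: in RAID-5 every member contributes only as much space as the smallest disk, and one disk's worth of space goes to parity. A small sketch of that arithmetic follows. The per-disk figure of 278.875 is an assumption, derived from the 557.75GB virtual disk in the report divided by two, and the function name is hypothetical.

```python
def raid5_usable_gb(disk_sizes_gb):
    """Usable RAID-5 capacity: (number of disks - 1) * smallest member."""
    if len(disk_sizes_gb) < 3:
        raise ValueError("RAID-5 needs at least three disks")
    return (len(disk_sizes_gb) - 1) * min(disk_sizes_gb)


# Three matching disks (assumed 278.875 GB each as seen by the controller).
print(raid5_usable_gb([278.875, 278.875, 278.875]))  # 557.75

# One member swapped for a disk twice the size: the usable capacity is unchanged.
print(raid5_usable_gb([278.875, 278.875, 557.75]))   # 557.75
```

The extra space on the larger disk is simply left unused by the existing virtual disk, which is consistent with the advice that same-size or larger replacements are fine.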

Dylank · Answer

Just curious, but by any chance is the part number of the 600GB drive you got W347K? Those drives have an absolutely insane failure rate, so I wouldn't be surprised if it ended up giving you issues that quickly after replacing.

mahmoud.jundi · Answer

Not sure about the part number, because the second disk was replaced today with a 300GB one. I will check after the rebuild and restart the server to see how the storage shows in OpenManage.

mahmoud.jundi · Answer

Thank you all. All went well after the disk was replaced with a 300GB one and the server was restarted.

Maybe the second 600GB disk was OK, but OpenManage was showing the "Predicted fail" message incorrectly and the server just needed a restart.