I know "anything is possible", but I can't believe what happened with my aging (4 years old) PowerEdge 2900 with a PERC 6/i. The machine had eight 300GB drives in a RAID-5 configuration. One Wednesday morning, post Microsoft Update Tuesday, it reported a failure with one of the drives, so my RAID-5 was degraded. I called Dell, but there basically aren't any replacement drives available due to the Thailand flooding, and I don't have any spares either. So I offloaded all the data on the machine and made the decision to rebuild the entire machine as a RAID-6 with 7 disks, and then reload the data.
But after recreating the virtual disk and starting to initialize, 2 more drives reported in as failed. How can this be? I rebooted the machine, tried to force the failed drives online, and recreated the VD, but nothing seems to work.
Suggestions, or did I just get "lucky"?
Sounds like you are unlucky anyway ... could be the drives are failing, could be the backplane/cable or controller, or it could be firmware related.
What I would do is reseat the RAID controller, the drives, and the cables that run from the controller to the backplane (both ends). Then, run 32-bit diagnostics (bootable - from support.dell.com) on the drives - not the quick tests, but the full diagnostics. Get rid of any drives that fail; the ones that pass, keep.
Update the BIOS, ESM/BMC, RAID, and HDD firmware. Then, try again.
Thanks for the reply. To keep it short, I ended up with 5 working drives (they all have the latest firmware) out of the 8 that were apparently working just 4 days ago. I've just never experienced 3 drives failing in such a short amount of time on the same machine. Truly bizarre. These were Dell/Seagate ST3300656SS 15K 300 GB drives if anyone experiences a similar scenario.
It's extremely rare, but not unheard of: drives made at the same time, under the same conditions, using the same parts and processes, and used under the same load and environment, can logically be susceptible to failing at the same time from a given failure or defect.
If the drives were all from the same lot and the firmware was bad, then yes, they could all go bad within a short amount of time. For the ones that didn't go bad yet, be prepared...
"If the drives were all from the same lot and the firmware was bad, then yes, they could all go bad within a short amount of time."
Agree, but even if every part was manufactured from the same lots on the same day, the odds of multiple disks failing on the same day/time would be astronomical. I have seen instances of a failing component on one drive causing multiple drives to "fail" on the same day. The problem with this is finding the drive causing the problem, especially if the component causing the issue fails intermittently. Then again, it could be a failing component on the controller.
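To put a rough shape on "astronomical", here is a minimal Python sketch. It assumes the drives fail independently of each other and uses a made-up 3% annualized failure rate (AFR); that figure is purely illustrative, not a Dell or Seagate spec, and the function name is my own.

```python
from math import comb


def prob_at_least_k_failures(n_drives, k, afr, window_days):
    """Probability that at least k of n_drives fail within window_days,
    assuming independent drives and a constant failure rate."""
    # Convert the annualized failure rate into a per-window probability.
    p = 1 - (1 - afr) ** (window_days / 365)
    # Binomial tail: sum P(exactly i failures) for i = k..n_drives.
    return sum(
        comb(n_drives, i) * p**i * (1 - p) ** (n_drives - i)
        for i in range(k, n_drives + 1)
    )


# Hypothetical inputs: 8 drives, 3% AFR (assumed), 4-day window.
p3 = prob_at_least_k_failures(8, 3, 0.03, 4)
print(f"P(at least 3 of 8 fail in 4 days) = {p3:.2e}")
```

Under those assumptions the result is tiny, which is why three "independent" failures in four days looks so unlikely. A shared cause (backplane, cable, controller, firmware, or defects that were already there but undetected) breaks the independence assumption, and that is the more plausible explanation.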
On the off chance that Patrol Read was manually turned off, this would explain multiple drive failures at the same time. If Patrol Read was off, multiple physical disk defects could have built up over a long period, only to be found during initialization.