I pulled the bad hot-swap drive and reseated it, hoping it would rebuild. However, since doing this, three of the four drives in the RAID10 array are flashing green and amber, which means Predicted Drive Failure.
How can I resolve this issue?
I have received the replacement drive but don't remember which one was bad, since I now have three flashing green and amber.
I have run the Dell EMC Server Update Utility to verify drivers and firmware. See screenshot below. As you can see, there are many updates I could perform, but I'm worried such a maneuver would crash my server.
Any help would be greatly appreciated.
Hello JeffACC,
It sounds like you could have a puncture with all the pred fail drives.
Virtual disk bad blocks are due to the presence of unrecoverable bad blocks on one or more member physical disks.
First, I recommend confirming that you have a validated backup.
You can search the controller log for "Puncture" to see what shows up.
To get the controller log: in OpenManage, expand Storage and select the controller.
Click the Information/Configuration hyperlink at the top.
Use the ‘Controller Task’ drop-down, select Export Log, click Execute, then click Export on the next page.
The log will be in the Windows folder with the file name LSI_####.log
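If the exported log is large, the search for "Puncture" can be scripted. The following is a minimal Python sketch, not a Dell tool: the function names are made up for illustration, and the folder you pass in (for example C:\Windows) is an assumption based on the "Windows folder" location mentioned above. It matches the term regardless of case and reports the line numbers where it appears.

```python
import re
from pathlib import Path


def find_matches(log_path, term="puncture"):
    """Return (line_number, text) pairs for lines containing term, ignoring case."""
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    hits = []
    # errors="replace" keeps the scan going if the log contains unexpected bytes
    with open(log_path, "r", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if pattern.search(line):
                hits.append((number, line.rstrip()))
    return hits


def scan_exported_logs(folder, term="puncture"):
    """Scan every LSI_*.log file in folder; map file name to its list of hits."""
    results = {}
    for path in sorted(Path(folder).glob("LSI_*.log")):
        results[path.name] = find_matches(path, term)
    return results


# Example call (the folder is an assumption; adjust it to where your log was saved):
# for name, hits in scan_exported_logs(r"C:\Windows").items():
#     print(name, "->", hits if hits else "no matches")
```

An empty list for a file means the term was not found in that log.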
Basic steps for clearing virtual disk bad blocks:
1. Get file level, not block level, backup of the data
2. Update the firmware of the hard drives
3. Run a Consistency Check on the virtual disk (via OMSA)
4. Run a patrol read and Clear the bad block table for the virtual disk (via OMSA)
(Then monitor for any new bad blocks or other issues)
5. Run diagnostics: Boot to F11 on Dell Splash screen, selecting Boot Manager -> System Utilities -> Launch Dell Diagnostics. Note any messages and continue testing.
You will need to replace any drive that still shows pred fail.
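For anyone who prefers the OMSA command line over the web interface, steps 3 and 4 can be sketched as below. This is only a dry-run helper that prints commands; it does not execute anything. The omreport/omconfig action names are written from memory of the OMSA storage CLI and should be treated as assumptions: confirm them with the built-in help (for example "omconfig storage vdisk -?") on your own server first. The controller and vdisk IDs of 0 are placeholders.

```python
def omsa_commands(controller=0, vdisk=0):
    """Build (but do not run) the OMSA CLI commands for the check and clear steps.

    The action names are assumptions to verify against your OMSA version.
    """
    ctrl = f"controller={controller}"
    vd = f"vdisk={vdisk}"
    return [
        # List physical disks so you can see which ones report a predicted failure
        ["omreport", "storage", "pdisk", ctrl],
        # Step 3: consistency check on the virtual disk
        ["omconfig", "storage", "vdisk", "action=checkconsistency", ctrl, vd],
        # Step 4a: start a patrol read on the controller
        ["omconfig", "storage", "controller", "action=startpatrolread", ctrl],
        # Step 4b: clear the bad block table for the virtual disk
        ["omconfig", "storage", "vdisk", "action=clearvdbadblocks", ctrl, vd],
    ]


def as_text(commands):
    """Join each command into a single line for review or copy and paste."""
    return [" ".join(parts) for parts in commands]


for line in as_text(omsa_commands(controller=0, vdisk=0)):
    print(line)
```

Printing the commands rather than running them keeps the backup and firmware steps (1 and 2) as a manual gate before anything touches the array.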
I'll supply some other reference information for you:
How to Fix a RAID Puncture
Double Faults and Punctures in RAID Arrays
How to Handle Puncturing (Bad Blocks) on Virtual Disks for PowerEdge servers
Dell -Charles R
Social Media and Communities Professional
Dell Technologies | Enterprise Support Services
#IWork4Dell
Did I answer your query? Please click on ‘Accept as Solution’. ‘Kudo’ the posts you like!
Hi Charles, Thanks for the detailed instructions.
Sorry for the delay, I was out of the office until this morning.
I exported the LSI log, searched it for "puncture", and found none.
Do I still continue on with your original instructions?
Hello JeffACC,
That is good to see. Yes, you can continue with the action plan.
Please let me know how it goes.
Dell -Charles R
Will do, Thanks Charles.
SERVICE TAG (removed by moderator)
TSR Uploaded
Please let me know if you need anything further. Thanks for all your help!
Hello JeffACC,
The log does not indicate a puncture and that is good.
You can continue with the action plan, replacing each pred fail drive one at a time:
1. Run a consistency check
2. Offline the pred fail drive and replace it
3. After the rebuild completes, run a consistency check again and repeat for each pred fail drive
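Because three drives are flagged, the order of operations matters: only one drive should be out of the array at a time. The sketch below simply writes that order out as a checklist. It is an illustration only; the function name and the drive IDs in the example are hypothetical, and it reads "repeat" as going back to the offline-and-replace step, with each post-rebuild check also serving as the check before the next drive.

```python
def replacement_plan(pred_fail_drives):
    """Return an ordered checklist for replacing pred fail drives one at a time."""
    steps = ["Run consistency check on the virtual disk"]
    for drive in pred_fail_drives:
        steps.append(f"Offline drive {drive} and replace it")
        steps.append(f"Wait for the rebuild of drive {drive} to complete")
        # The post-rebuild check must finish before the next drive is touched
        steps.append(f"Run consistency check after rebuilding drive {drive}")
    return steps


# Hypothetical drive IDs; use the IDs your controller actually reports.
for number, step in enumerate(replacement_plan(["0:1:0", "0:1:2"]), start=1):
    print(f"{number}. {step}")
```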
Dell -Charles R
Thanks Charles,
Will do!
Hi Charles,
Can I run the Drive Consistency checks during the day with people logged in to the server?
JeffACC,
Yes, you can run it while the server is live. You can also schedule it via OpenManage, as seen here. I mention this because it is recommended to run a consistency check about every 30 days, so scheduling the task is easier.
Let me know if this helps.