Unsolved
Predicted Drive Failure
I pulled the bad hot-swap drive and reseated it, hoping the array would rebuild. Since then, however, three of the four drives in the RAID 10 array are flashing green and amber, which means Predicted Drive Failure.
How can I resolve this issue?
I have received the replacement drive but don't remember which drive was bad, since three are now flashing green and amber.
I have run the Dell EMC Server Update Utility to verify drivers and firmware. See the screenshot below. As you can see, there are many updates I could perform, but I'm worried that applying them would crash my server.
Any help would be greatly appreciated.
DELL-Charles R
Moderator
May 25th, 2022 13:00
Hello JeffACC,
With that many drives in pred fail, it sounds like you could have a puncture.
Virtual disk bad blocks are caused by unrecoverable bad blocks on one or more member physical disks.
First, I recommend confirming that you have a validated backup.
You can search the controller log for "Puncture" and see what shows up.
To get the controller log:
1. In OpenManage, expand Storage and select the controller.
2. Click the Information/Configuration hyperlink at the top.
3. In the ‘Controller Task’ drop-down, select Export Log, click Execute, then click Export on the next page.
The log will be saved in the Windows folder with a file name of the form LSI_####.log
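If the exported log is long, a small script can do the "Puncture" search. The following is only a minimal sketch, not a Dell tool: it assumes the export is a plain-text file, the function names are made up for illustration, and the path to the LSI_####.log file must be supplied by you on the command line.

```python
import sys


def find_keyword(lines, keyword="puncture"):
    """Return (line_number, text) pairs for lines containing keyword, ignoring case."""
    needle = keyword.lower()
    return [
        (number, line.rstrip("\r\n"))
        for number, line in enumerate(lines, start=1)
        if needle in line.lower()
    ]


def scan_log(path, keyword="puncture"):
    """Scan a text log file; undecodable bytes are replaced rather than raising."""
    # Assumption: the exported controller log is readable as text.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_keyword(handle, keyword)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_log.py <path to the exported LSI_####.log>
    matches = scan_log(sys.argv[1])
    for number, text in matches:
        print(f"{number}: {text}")
    if not matches:
        print('No lines containing "puncture" were found.')
```

An empty result only means the word does not appear in that export. It does not replace the rest of the steps below.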
Basic steps for clearing virtual disk bad blocks:
1. Get a file-level (not block-level) backup of the data
2. Update the firmware of the hard drives
3. Run a Consistency Check on the virtual disk (via OMSA)
4. Run a patrol read and clear the bad block table for the virtual disk (via OMSA)
(Then monitor for any new bad blocks or other issues)
5. Run diagnostics: press F11 at the Dell splash screen, then select Boot Manager -> System Utilities -> Launch Dell Diagnostics. Note any messages and continue testing.
You will need to replace any drive that still shows pred fail.
I'll supply some other reference information for you:
How to Fix a RAID Puncture
https://dell.to/3lJUe5S
Double Faults and Punctures in RAID Arrays
https://dell.to/3wMMExR
How to Handle Puncturing (Bad Blocks) on Virtual Disks for PowerEdge servers
https://dell.to/3wS62bh
JeffACC
May 31st, 2022 08:00
Hi Charles, thanks for the detailed instructions.
Sorry for the delay; I was out of the office until this morning.
I exported the LSI log, searched it for "puncture", and found no matches.
Should I still continue with your original instructions?
DELL-Charles R
Moderator
May 31st, 2022 08:00
Hello JeffACC,
That is good to see. Yes, you can continue with the action plan.
Please let me know how it goes.
JeffACC
May 31st, 2022 08:00
Will do. Thanks, Charles.
JeffACC
June 1st, 2022 07:00
Hi Charles,
Can I run the Drive Consistency checks during the day with people logged in to the server?
DELL-Chris H
Moderator
June 1st, 2022 07:00
JeffACC,
Yes, you can run it while the server is live. You can also schedule it via OpenManage, as seen here. I mention that because it is recommended to run a consistency check about every 30 days, so scheduling the task is easier.
Let me know if this helps.
JeffACC
June 1st, 2022 10:00
I ran it, and the three virtual drives are still displaying Predicted Drive Failure. I will need to update the firmware on the drives and RAID card, but I'm a bit leery of installing these firmware patches: I know they will need a reboot, and I'm not 100% sure the server will come back online.
DELL-Chris H
Moderator
June 1st, 2022 11:00
I would suggest exporting a TSR, so that we can get a better idea as to what is occurring.
You can find the instructions for the TSR here.
After you export it, upload the TSR to upload.dell.com, then private message Charles and me the service tag used for the upload, so that we can locate it.
Thanks.
JeffACC
June 1st, 2022 14:00
SERVICE TAG (removed by moderator)
TSR Uploaded
Please let me know if you need anything further. Thanks for all your help!
DELL-Charles R
Moderator
June 2nd, 2022 05:00
Hello JeffACC,
Thank you for the service tag. I will collect the log, review and update you.
Please note I removed the service tag from your post as that is private information.
Have you run diagnostics? Press F11 at the Dell splash screen, then select Boot Manager -> System Utilities -> Launch Dell Diagnostics. Note any messages and continue testing.
A firmware update will not fix a drive that is already in pred fail. It can only help prevent pred fail if it is applied before the drive reaches that state. You will need to replace any drive that still shows pred fail.
Make sure you have a validated backup.
JeffACC
June 2nd, 2022 10:00
Thanks Charles,
Will do!
DELL-Charles R
Moderator
June 2nd, 2022 10:00
Hello JeffACC,
The log does not indicate a puncture, and that is good.
You can continue with the action plan, replacing each pred fail drive one at a time:
1. Run a consistency check.
2. Take the pred fail drive offline and replace it.
3. After the rebuild completes, run a consistency check again, then repeat for each remaining pred fail drive.
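To keep track of where you are when several drives need replacing, the sequence above can be written out as a checklist. This is just an illustrative sketch: the function name and the drive labels are hypothetical, and it only prints the order of operations. It does not talk to the controller or OMSA.

```python
def replacement_plan(pred_fail_drives):
    """Return the ordered steps for replacing pred fail drives one at a time."""
    if not pred_fail_drives:
        return []
    check = "Run a consistency check on the virtual disk"
    plan = [check]  # initial check before touching any drive
    for drive in pred_fail_drives:
        plan.append(f"Take {drive} offline and replace it")
        plan.append(f"Wait for the rebuild onto the replacement for {drive} to complete")
        plan.append(check)  # re-check before moving on to the next drive
    return plan


if __name__ == "__main__":
    # Hypothetical labels; use the drive identifiers shown for your own system.
    for number, step in enumerate(replacement_plan(["drive A", "drive B", "drive C"]), start=1):
        print(f"{number}. {step}")
```

The point of the structure is that the next drive is never pulled until the previous rebuild has finished and a consistency check has run.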