Unsolved

8 Posts

May 25th, 2022 09:00

Predicted Drive Failure

I pulled the bad hot-swap drive and reseated it, hoping it would rebuild. Since doing this, however, three of the four drives in the RAID 10 array are flashing green and amber, which means Predicted Drive Failure.

How can I resolve this issue?

I have received the replacement drive, but I don't remember which drive was bad, since three are now flashing green and amber. (Attached screenshots: Dell1.PNG, Dell2.PNG, Dell3.PNG)

 

I have run the Dell EMC Server Update Utility to verify drivers and firmware; see the screenshot below. As you can see, there are many updates I could perform, but I'm worried that doing so would crash my server.

 

(Attached screenshot: Dell4.PNG)

Any help would be greatly appreciated.

Moderator

 • 

3.3K Posts

May 25th, 2022 13:00

Hello JeffACC,

 

With that many pred fail (predicted failure) drives, it sounds like you could have a puncture.

Virtual disk bad blocks are caused by unrecoverable bad blocks on one or more member physical disks.

 

First, I recommend confirming that you have a validated backup.

 

You can search the controller log for "Puncture" and see what shows up.

To get the controller log in OpenManage, expand Storage and select the controller.

Click the Information/Configuration hyperlink at the top.

From the ‘Controller Task’ drop-down,

select Export Log, click Execute, and then click Export on the next page.

The log will be in the Windows folder with the file name LSI_####.log
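
If the exported log is large, a short script can do the search for you. The following is only a minimal Python sketch, not a Dell tool: the function names are made up for illustration, and you pass in the path to whatever log file the export produced. It matches "puncture", "punctured", and "puncturing" in any letter case.

```python
import re
import sys
from typing import Iterable, List, Tuple

# Case-insensitive match for "puncture", "punctured", "puncturing", etc.
PUNCTURE = re.compile(r"punctur", re.IGNORECASE)


def find_puncture_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line that mentions a puncture."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if PUNCTURE.search(line)
    ]


def scan_log(path: str) -> List[Tuple[int, str]]:
    """Scan one exported controller log file (hypothetical helper)."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, "r", errors="replace") as handle:
        return find_puncture_lines(handle)


if __name__ == "__main__" and len(sys.argv) > 1:
    hits = scan_log(sys.argv[1])
    for number, text in hits:
        print(f"{number}: {text}")
    if not hits:
        print("No puncture entries found.")
```

Run it with the path to the exported log as the only argument. An empty result only means the word does not appear in that log; it is not a full health check.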

 

 

Basic steps for clearing virtual disk bad blocks:

 

1. Get a file-level (not block-level) backup of the data

2. Update the firmware of the hard drives

3. Run a consistency check on the virtual disk (via OMSA)

4. Run a patrol read and clear the bad block table for the virtual disk (via OMSA)

   (Then monitor for any new bad blocks or other issues)

5. Run diagnostics: press F11 at the Dell splash screen, then select Boot Manager -> System Utilities -> Launch Dell Diagnostics. Note any messages and continue testing.

 

You will need to replace any drive that still shows pred fail.
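
If it helps to see which drives are flagged without relying on the LEDs, here is a minimal Python sketch that picks the pred fail drives out of a text report. It assumes the report resembles the physical disk listing from the OMSA command line (for example `omreport storage pdisk controller=0`), with an `ID` line and a `Failure Predicted` line for each drive. The sample text and the controller number are made up for illustration, so check the layout against the output on your own system first.

```python
from typing import List

# Made-up sample in the assumed "Key : Value" layout; not real output.
SAMPLE_REPORT = """\
ID                : 0:1:0
Status            : Ok
State             : Online
Failure Predicted : No

ID                : 0:1:1
Status            : Non-Critical
State             : Online
Failure Predicted : Yes

ID                : 0:1:3
Status            : Non-Critical
State             : Online
Failure Predicted : Yes
"""


def predicted_failures(report: str) -> List[str]:
    """Return the IDs of drives whose 'Failure Predicted' field is 'Yes'."""
    failing: List[str] = []
    current_id = None
    for line in report.splitlines():
        # Split on the first colon only, because drive IDs contain colons.
        key, separator, value = line.partition(":")
        if not separator:
            continue
        key, value = key.strip(), value.strip()
        if key == "ID":
            current_id = value
        elif key == "Failure Predicted" and value.lower() == "yes" and current_id:
            failing.append(current_id)
    return failing


if __name__ == "__main__":
    print(predicted_failures(SAMPLE_REPORT))
```

Writing the flagged IDs down before pulling any drive avoids losing track of which slot was the original bad one.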

 

 

I'll supply some other reference information for you:

 

How to Fix a RAID Puncture

https://dell.to/3lJUe5S

 

Double Faults and Punctures in RAID Arrays

https://dell.to/3wMMExR

 

How to Handle Puncturing (Bad Blocks) on Virtual Disks for PowerEdge servers

https://dell.to/3wS62bh

 

Moderator

 • 

3.3K Posts

May 31st, 2022 08:00

Hello JeffACC,

 

That is good to see. Yes, you can continue with the action plan.

Please let me know how it goes.

8 Posts

May 31st, 2022 08:00

Hi Charles, Thanks for the detailed instructions.

Sorry for the delay, I was out of the office until this morning.

I exported the LSI log and searched it for "puncture" but found nothing.

 

Should I still continue with your original instructions?

8 Posts

May 31st, 2022 08:00

Will do, Thanks Charles.

8 Posts

June 1st, 2022 07:00

Hi Charles,

Can I run the consistency checks during the day while people are logged in to the server?

Moderator

 • 

8.4K Posts

June 1st, 2022 07:00

JeffACC, 

 

Yes, you can run it while the server is live. You can also schedule it via OpenManage, as seen here. I mention that because it is recommended to run it about every 30 days, so scheduling the task is easier.

 

Let me know if this helps.

 

8 Posts

June 1st, 2022 10:00

I did run it, and the three virtual drives are still displaying Predicted Drive Failure. I will need to update the firmware on the drives and RAID card, but I am a bit leery of installing these firmware patches, as I know they will need a reboot and I am not 100% sure the server will come back online.

Moderator

 • 

8.4K Posts

June 1st, 2022 11:00

I would suggest exporting a TSR, so that we can get a better idea as to what is occurring.

You can find the instructions for the TSR here.

After you export it, upload the TSR to upload.dell.com, then private message Charles and me the service tag used for the upload, so that we can locate it.

 

Thanks.

 

 

8 Posts

June 1st, 2022 14:00

SERVICE TAG (removed by moderator)

TSR Uploaded

Please let me know if you need anything further.  Thanks for all your help!

Moderator

 • 

3.3K Posts

June 2nd, 2022 05:00

Hello JeffACC,

 

Thank you for the service tag. I will collect the log, review and update you.

Please note I removed the service tag from your post as that is private information.

 

Have you run diagnostics? Press F11 at the Dell splash screen, then select Boot Manager -> System Utilities -> Launch Dell Diagnostics. Note any messages and continue testing.

 

A firmware update will not fix a drive that is already in pred fail. It can only prevent pred fail if it is applied before the drive reaches that state. You will need to replace any drive that still shows pred fail.

 

Make sure you have a validated backup.

8 Posts

June 2nd, 2022 10:00

Thanks Charles,

Will do!

Moderator

 • 

3.3K Posts

June 2nd, 2022 10:00

Hello JeffACC,

 

The log does not indicate a puncture, which is good.

You can continue with the action plan, replacing each pred fail drive one at a time.

 

1. Run a consistency check.

2. Offline the pred fail drive and replace it.

3. After the rebuild completes, run a consistency check again, then repeat for each pred fail drive.
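
To keep the order straight with several drives, the sequence above can be written out as a checklist. This is only a sketch in Python under one reading of the steps: a single consistency check up front, then offline, replace, rebuild, and re-check for each drive in turn. The function name and the drive labels are hypothetical, and nothing here talks to the controller.

```python
from typing import Iterator, List


def replacement_plan(pred_fail_drives: List[str]) -> Iterator[str]:
    """Yield the steps in order, handling exactly one drive at a time."""
    if not pred_fail_drives:
        return
    yield "Run a consistency check on the virtual disk"
    for drive in pred_fail_drives:
        yield f"Offline drive {drive} and replace it"
        yield f"Wait for the rebuild onto the replacement for {drive} to complete"
        yield "Run a consistency check on the virtual disk again"


if __name__ == "__main__":
    # Hypothetical slot labels; use the IDs shown for your own drives.
    for number, step in enumerate(replacement_plan(["slot 1", "slot 3"]), start=1):
        print(f"{number}. {step}")
```

The point of the loop is that the next drive is never touched until the previous rebuild and check have finished.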
