Unsolved

November 30th, 2017 05:00

R720XD Correctable Disk Media Errors

Hello,

I'm having constant correctable disk media errors in 3 R720XDs with PERC H710P HBAs. The exact error that appears in the storage logs is:

A disk media error on Disc 0 Backplane 1 of Integrated RAID Controller 1 was corrected during recovery.

We opened cases with Dell and they replaced Disk 0 in each server but the errors continue. I understand these errors are being corrected, but why are we seeing so many correctable errors? Is this normal? We see hundreds of these errors during the patrol read operations.

It's not an issue with the drive, as it's been replaced and we were still getting errors related to disk 0 (in all 3 servers).

We recently updated the iDRAC to v2.50.50.50 and now the same errors appear but on different disks, not just disk 0. So perhaps there is an issue at the controller or even the motherboard? Could this be an indication of an imminent hardware failure?

While Dell tells us that nothing is wrong as the errors are being corrected and no disk has had a physical failure, I'm a bit worried as we've had these PERC controllers corrupt data before and we don't want a repeat of that.

Thanks in advance.

Moderator

8.5K Posts

November 30th, 2017 07:00

Omar Baez,

This could just be that the server, or even OpenManage, is out of date and needs to be updated. You have the iDRAC done, so I would start with the BIOS, then the H710P firmware and the H710P driver, and then go to the SAS Drive category and find the firmware update for your hard drives.

Lastly, make sure that you're using the latest OpenManage as well.

After all that is updated then let me know if you still see anything reported. 

December 7th, 2017 07:00

Hello Chris,

Thanks for the quick reply. We have most of the FW updated (BIOS, iDRAC + Lifecycle Controller, PERC controller). The only thing we hadn't tried updating was the hard disks, as the Nautilus utility did not seem to work for us. Today I realized that it's a UEFI boot image and we were trying to boot it from BIOS (it would lock up), but it works perfectly from the UEFI boot menu.

We will find some downtime to update the hard disk FW and see how it behaves after that.

Not sure the H710P driver will make a difference. Just to give you some background, these Dell systems are all running the same OS, kernel, drivers, etc., as set by our company standards. They are all running Debian and the same version of the H710P driver (module), so the problem is unlikely to be there, as only 3 systems are exhibiting the constant errors while the rest (30+ systems) are not. The hard drives, on the other hand, are not standard; they come from different manufacturers and are at different FW levels, so that seems like a good place to start.

As for the OMSA version, we are using v7.1.0, which I know is old, but it seems to be one of the latest versions available for Debian. The newer ones are for Red Hat and SuSE from what I can see. And again, all systems have the same OMSA, so in theory I don't think that's causing the issue.

Anyways, I will post again once we see how they behave after the updates.

January 4th, 2018 09:00

Hi Chris,

Even after updating the FW on all the hard drives (using Nautilus) we are still getting the disk media errors on these 3 servers. So all FW is up to date without a change in symptoms. 

Could this be due to a faulty PERC controller? I say that because we have replaced disk 0 in one of the servers, and the disk media errors were always for disk 0, and they continue even after the disk replacement.

January 18th, 2018 08:00

Does anyone have any ideas why we might be seeing these media errors so frequently?

January 29th, 2018 10:00

BUMP

1 Message

June 26th, 2018 06:00

I have the same error, but on a PERC H310. Hello Dell, help us!!!

We changed Disk 0 and the error remains.

All firmware is up to date, including on all disks.

1 Message

September 11th, 2018 05:00

Same here:

System Host Name: server

Event Message: A disk media error on Disk 0 in Backplane 1 of Integrated RAID Controller 1 was corrected during recovery.

Date/Time: Fri, 07 Sep 2018 20:22:27 -0500

Severity: Informational

 

Detailed Description: This message is generated after a disk media error is corrected on a physical disk.

Recommended Action: No response action is required.

Message ID: PDR54

 

System Model: PowerEdge R730xd

Service Tag: xxxxxxx

Power State: ON

System Location: Datacenter, Aisle 1, Rack 1, Slot 1 (2 U)
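
In case it helps anyone compare servers, here is a minimal Python sketch for tallying these corrected media errors per disk from exported log lines. It assumes the messages use one of the two wordings quoted in this thread; the function name `tally_media_errors` and the sample lines are only illustrative, and other firmware or OMSA versions may word the event differently.

```python
import re
from collections import Counter

# Matches the two wordings quoted in this thread (assumption: no other variants):
#   "... on Disc 0 Backplane 1 of Integrated RAID Controller 1 was corrected ..."
#   "... on Disk 0 in Backplane 1 of RAID Controller in Slot 6 was corrected ..."
PATTERN = re.compile(
    r"disk media error on Dis[ck] (?P<disk>\d+) (?:in )?Backplane (?P<bp>\d+) "
    r"of (?P<ctrl>.+?) was corrected",
    re.IGNORECASE,
)

def tally_media_errors(lines):
    """Count corrected media errors per (controller, backplane, disk)."""
    counts = Counter()
    for line in lines:
        m = PATTERN.search(line)
        if m:
            key = (m.group("ctrl"), int(m.group("bp")), int(m.group("disk")))
            counts[key] += 1
    return counts

# Sample lines copied from the messages posted in this thread.
sample = [
    "A disk media error on Disc 0 Backplane 1 of Integrated RAID Controller 1 was corrected during recovery.",
    "A disk media error on Disk 0 in Backplane 1 of Integrated RAID Controller 1 was corrected during recovery.",
    "PDR54: A disk media error on Disk 0 in Backplane 1 of RAID Controller in Slot 6 was corrected during recovery.",
    "VDR47: A disk media error was corrected on Virtual Disk 0 on RAID Controller in Slot 6.",
]

for (ctrl, bp, disk), n in tally_media_errors(sample).most_common():
    print(f"{ctrl} / backplane {bp} / disk {disk}: {n}")
```

The VDR47 virtual-disk line is deliberately not counted, since it does not name a physical disk. A per-disk count like this only shows where the messages cluster; it does not say whether the disk, backplane, or controller is at fault.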

4 Posts

November 11th, 2018 14:00

Similar case on H710 Mini in an R520.

Interesting that you point out it is always your Disk 0. In our case it is 0:1:0. Patrl Read kicks in and finds many "Patrl read corrected a media error" messages interspersed with "Unexpected Sense" on the same disk.

We updated the BIOS, the PERC driver, and the drive firmware to the latest versions. If anything, we get more of these messages now. We have a hot spare configured and, as others have said, keep expecting the drive to fail over. But it doesn't, and this makes me suspect the messages could be a red herring. The Virtual Disk is not even flagged as Virtual Disk Bad Blocks in OMSA, which I would have expected, but maybe I don't understand how that works.

In any case Dell seems to have gone dark on this thread, even though a number of parties are reporting similar behaviour.

 

P.S. I get that Patrl is misspelled. Got a message that the correctly spelled word cannot be used in this community? What the...

1 Message

December 4th, 2018 02:00

Same here.
Dell R520 with H710p controller. 
"disk media error on Disc 1 Backplane 1 of Integrated RAID Controller 1 was corrected during recovery"

Changed Disk 1 but the error messages are still there.

Any solution?

30 Posts

May 24th, 2019 19:00

Dell PowerEdge T320 with PERC H310 is also showing similar symptoms, which only started after reconfiguring the RAID5 VD from 3 PDs to 5 PDs. Maybe a firmware bug that can't handle more than 3 PDs in a RAID5 array?

PDR54: A disk media error on Disk 0 in Backplane 1 of RAID Controller in Slot 6 was corrected during recovery.

VDR47: A disk media error was corrected on Virtual Disk 0 on RAID Controller in Slot 6.

 

Should I be concerned?

30 Posts

May 25th, 2019 10:00

More details of the issues I'm seeing are over (here).
