March 27th, 2018 13:00

OMSA 5.1 Falsely Reporting Disk Predictive Failure

I'm running OMSA 5.1 on a PowerEdge 2850 server with RHEL 3. I started getting SCSI sense errors, then a predictive failure warning for disk 0:6 on an external SAS RAID-10 in an MD1000, which is connected via a PERC 5/E controller. I swapped out the hard drive and the RAID-10 rebuilt OK; it is now Optimal, with no more SCSI sense errors.

However, once a day I still get a warning that the same disk is reporting a predictive failure.

In the OMSA web GUI, all disks show no predictive failure, but something seems to be stuck in this once-a-day reporting process.

I tried restarting the dataeng service, but that didn't fix the problem. Is there any other service that I could restart? Perhaps whatever runs this check once per day and generates this false system log/SNMP message? The warning arrives at around the time of day the original disk first reported a predictive failure, but it gets a few minutes later each day. This has gone on for about 2-3 weeks now.

Or is there a command to clear out the disk fault queue or something? I see now that I probably didn't follow the correct procedure in replacing the drive. I've always just yanked failed drives out and put new drives in to start rebuilding. However, this original drive had only predictively failed, but it was throwing SCSI sense errors, so I wanted it gone sooner rather than later. When I yanked it, the alarm sounded, but nothing turned red as it typically does (i.e. the server and MD1000 stayed blue), and the alarm didn't stop until the RAID-10 was Optimal again. The system is mounted high in the rack, so I couldn't tell whether the MD1000 or the server was making the noise. Everything was fine again after the rebuild finished, though (except for this annoying error report that won't go away).

We rarely reboot this system and would like to avoid doing so if possible.

Moderator

6.9K Posts

March 28th, 2018 08:00

Hello wbancks,

The first thing is to make sure that you have a valid, tested backup. Once you have that, we would need to review a controller log from your PERC card to make sure that there is no puncture. Here is a link to DSET, which you can install to collect the controller log: https://www.dell.com/community/PowerVault/OMSA-5-1-Falsly-Reporting-Disk-Predictive-Failure/m-p/6048911#M28419%2Fjump-to%2Ffirst-unread-message

Whenever a drive reports a predictive failure, you need to take the drive offline before removing it.
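As a rough sketch of that offline step from the OMSA command line: the snippet below only prints the two commands (list the physical disks, then offline one) so they can be reviewed before anything is run. The controller ID `1` and disk ID `0:0:6` are placeholders assumed for illustration, not values confirmed in this thread, and the exact keywords can differ between OMSA releases, so check the CLI help for your version first.

```shell
#!/bin/sh
# Sketch only. CONTROLLER_ID and PDISK_ID are assumed placeholder values;
# read the real ones from the "omreport storage pdisk" output on your system.
CONTROLLER_ID="${CONTROLLER_ID:-1}"
PDISK_ID="${PDISK_ID:-0:0:6}"

# Print the commands instead of running them (a dry run).
offline_cmds() {
    # 1. List the physical disks on the controller to confirm the disk ID.
    echo "omreport storage pdisk controller=$CONTROLLER_ID"
    # 2. Take the suspect disk offline before physically pulling it.
    echo "omconfig storage pdisk action=offline controller=$CONTROLLER_ID pdisk=$PDISK_ID"
}

offline_cmds
```

Once the printed commands look right for your controller and disk, they can be run by hand as root.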

Please let us know if you have any other questions.

5 Posts

April 5th, 2018 13:00

Tried to post this reply yesterday, but I think that my session must have timed out or something.

I have a DSET v2.0 report from the server, but how and where do I send it to be analyzed for any potential RAID problems?

Moderator

6.9K Posts

April 6th, 2018 08:00

Hello wbancks,

I will send you an email that you can reply to with the log attached.

Please let us know if you have any other questions.
