February 26th, 2013 05:00

E1618 Predictive Failure on both PSUs, how to troubleshoot

Hi,

I have a PowerEdge R300 server and the following warnings:

PS 1 Status: Power Supply sensor for PS 1, predictive failure was asserted

PS 2 Status: Power Supply sensor for PS 2, predictive failure was asserted

- I have updated OMSA to 5.5 (RedHat 4.7), DRAC5 firmware to 1.60 and BMC firmware to 2.50.

- I have swapped both units, also checked the system with one PSU at a time

- I measured the voltages (with the chassis open): a stable 3.3V, 12.2V and 15.1V during boot, power-off and normal operation

- I checked if there is any damage inside one of the PSUs and there is none

What else could I check to make sure that the PSUs are faulty (or not)?

Cheers,

Daniel Andrzejewski

990 Posts

February 27th, 2013 07:00

Thanks for the additional information. If this server is still under warranty, I would contact support and get the power supplies replaced. If they aren't under warranty, I would either replace them or use them until they fail. You have done all the necessary troubleshooting steps.

Regards,

990 Posts

February 26th, 2013 07:00

Good morning, Daniel.

I would clear the logs from OMSA, then power down the server, pull both power cords, and drain the remaining power off the board. Wait a few minutes, then reconnect the power cords and power up. Then check OMSA to see whether the error persists.

Regards,

5 Posts

February 26th, 2013 07:00

Good Morning Geoff,

Thank you for your quick response.

I have already cleared the logs from OMSA. I have also restarted the machine several times and powered it down for longer periods, but that did not resolve the issue. Although the message disappears from the LCD for some time, the error is still reported by OMSA and later shows up on the LCD again.

I would be glad to see some other options. Could there be some problems with the server itself instead of the power supply units?

Thanks again,

Daniel Andrzejewski

990 Posts

February 26th, 2013 07:00

Make sure the BIOS is current at 1.5.2 and the DRAC is updated to 1.65. This is a critical DRAC5 update: the release is aimed at upgrading expiring DRAC5 certificates, adding features and fixing known issues. The BMC is current at 2.50. There may be a mismatch between the BIOS, DRAC and BMC levels that is giving a false positive on the power supplies.
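A quick way to check the three firmware levels against the ones named here can be sketched in Python. This is only an illustration: the function names are made up for this example, and it assumes the version strings are dot-separated numbers that can be compared component by component.

```python
# Firmware levels recommended in this thread for the PowerEdge R300.
RECOMMENDED = {"BIOS": "1.5.2", "DRAC5": "1.65", "BMC": "2.50"}


def parse_version(text):
    """Turn a string such as '1.5.2' into a tuple of ints: (1, 5, 2)."""
    return tuple(int(part) for part in text.split("."))


def outdated(installed, recommended=RECOMMENDED):
    """Return the components whose installed level is unknown or older."""
    result = []
    for name, wanted in recommended.items():
        have = installed.get(name)
        if have is None or parse_version(have) < parse_version(wanted):
            result.append(name)
    return result


# Levels reported in the first post (the BIOS level was not stated there).
print(outdated({"DRAC5": "1.60", "BMC": "2.50"}))
```

Any component the function returns would be a candidate for an update before trusting the power supply warning.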

Regards,

5 Posts

February 27th, 2013 01:00

Hi Geoff,

BIOS is at 1.5.2, DRAC5 updated to 1.65, BMC at 2.50.

I cleared the logs, powered down and waited 10 minutes. After that I booted the machine up and the warnings reappeared in the logs:

Status: Non-Critical 1353 Wed Feb 27 09:59:55 2013 Instrumentation Service Power supply detected a warning Sensor location: PS 1 Status Chassis location: Main System Chassis Previous state was: Unknown Power Supply type: AC Power Supply state: Presence detected, Predictive failure

Status: Non-Critical 1353 Wed Feb 27 09:59:55 2013 Instrumentation Service Power supply detected a warning Sensor location: PS 2 Status Chassis location: Main System Chassis Previous state was: Unknown Power Supply type: AC Power Supply state: Presence detected, Predictive failure
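To compare entries like the two above side by side, the labelled fields can be pulled apart with a short Python sketch. The label list is taken from the alert text as pasted here; the function name is hypothetical, and other OMSA versions may format alerts differently.

```python
import re

# Field labels exactly as they appear in the pasted alert text.
LABELS = [
    "Sensor location",
    "Chassis location",
    "Previous state was",
    "Power Supply type",
    "Power Supply state",
]


def parse_alert(entry):
    """Split one single-line alert entry into a dict of labelled fields."""
    pattern = "|".join(re.escape(label) for label in LABELS)
    # A capturing group makes re.split keep the labels in the result:
    # [header, label1, value1, label2, value2, ...]
    parts = re.split(rf"({pattern}):\s*", entry)
    fields = {"header": parts[0].strip()}
    for label, value in zip(parts[1::2], parts[2::2]):
        fields[label] = value.strip()
    return fields


entry = (
    "Status: Non-Critical 1353 Wed Feb 27 09:59:55 2013 Instrumentation Service "
    "Power supply detected a warning Sensor location: PS 1 Status "
    "Chassis location: Main System Chassis Previous state was: Unknown "
    "Power Supply type: AC Power Supply state: Presence detected, Predictive failure"
)
for key, value in parse_alert(entry).items():
    print(f"{key}: {value}")
```

Parsed this way, it is easier to see that both entries carry the same timestamp and the same previous state (Unknown). That alone does not show whether the cause is in the power supplies or in the server.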

Any idea what I could conclude from this? Are the power supplies about to fail, is it something on the motherboard, or is it just a software problem?

Cheers,

Daniel Andrzejewski

5 Posts

February 27th, 2013 07:00

I would like to add that I have enabled SNMP and used ipmitool to restart BMC and clear ESM logs.

However, as soon as I restarted the server, the errors/warnings mentioned above reappeared.
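For reference, the BMC restart and log clearing described above map onto standard ipmitool subcommands. The exact commands used are not given in this post, so the ones in this Python sketch are an assumption. The wrapper only prints the commands by default; actually running them needs ipmitool installed, root access and the IPMI kernel drivers loaded.

```python
import shlex
import subprocess

# Standard ipmitool subcommands. Whether these are the exact commands used
# in the post above is an assumption.
COMMANDS = {
    "reset_bmc": ["ipmitool", "mc", "reset", "cold"],
    "clear_sel": ["ipmitool", "sel", "clear"],
    "list_sel": ["ipmitool", "sel", "elist"],
    "psu_sensors": ["ipmitool", "sdr", "type", "Power Supply"],
}


def run(name, dry_run=True):
    """Return the command line (dry run) or the command's output."""
    cmd = COMMANDS[name]
    if dry_run:
        return shlex.join(cmd)
    # Raises FileNotFoundError if ipmitool is missing and
    # CalledProcessError if the command exits with an error.
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return done.stdout


for name in COMMANDS:
    print(run(name))
```

Listing the power supply sensors straight from the BMC with the last command gives a reading that does not go through OMSA, which may help tell a software reporting problem from a hardware one.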

Daniel A.

5 Posts

February 27th, 2013 23:00

Thanks Geoff,

Yes, it is out of warranty. The problem with replacing the power supply units is that it does not guarantee the error will go away. As I said before, we cannot rule out a problem with the server itself.

Cheers,

Daniel A.
