December 11th, 2012 09:00

PowerEdge 4400 Critical fault Peripheral Bay Backplane +5V/+12V

I have recently installed OMSA 4.5 on a Linux system to try to identify why the front panel error lights are blinking.  The summary report is:


$ omreport chassis
Health
Main System Chassis
SEVERITY : COMPONENT
...
Non-Critical : Fans
...
Critical : Voltages
...


I figured out how to correct the fans issue, but it is not so clear how to approach the Voltages issue on such an old platform.


$ omreport chassis volts
Health : Critical
...
Index : 21
Status : Critical
Probe Name : Peripheral Bay Backplane +5V
Reading : 6.630 V
Minimum Warning Threshold : 4.750 V
Maximum Warning Threshold : 5.250 V
Minimum Failure Threshold : 4.500 V
Maximum Failure Threshold : 5.500 V

Index : 22
Status : Critical
Probe Name : Peripheral Bay Backplane +12V
Reading : 10.550 V
Minimum Warning Threshold : 11.520 V
Maximum Warning Threshold : 12.480 V
Minimum Failure Threshold : 11.400 V
Maximum Failure Threshold : 12.600 V
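
To make the two Critical statuses concrete, here is a minimal Python sketch that checks each reading against the thresholds reported above. The `Probe` class, the `classify` function, and the "Ok" label are illustrative names, and the rule itself (outside the failure band is Critical, outside the warning band is Non-Critical) is an assumption about how the status is derived, not documented OMSA behavior.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    """One voltage probe, with values copied from the omreport output above."""
    name: str
    reading: float
    min_warn: float
    max_warn: float
    min_fail: float
    max_fail: float


def classify(p: Probe) -> str:
    """Assumed rule: failure band first, then warning band, else Ok."""
    if p.reading < p.min_fail or p.reading > p.max_fail:
        return "Critical"
    if p.reading < p.min_warn or p.reading > p.max_warn:
        return "Non-Critical"
    return "Ok"


probes = [
    Probe("Peripheral Bay Backplane +5V", 6.630, 4.750, 5.250, 4.500, 5.500),
    Probe("Peripheral Bay Backplane +12V", 10.550, 11.520, 12.480, 11.400, 12.600),
]

for p in probes:
    print(f"{p.name}: {p.reading:.3f} V -> {classify(p)}")
```

Under that assumed rule both probes come out Critical: the +5V reading is above its maximum failure threshold and the +12V reading is below its minimum failure threshold.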


Any ideas?  Is it more practical to move the system to different hardware than to attempt resolution of these issues?  Also, what is the Peripheral Bay Backplane, is it field replaceable, and does it have a part number useful for locating a replacement?  I have another 4400 that is in an unknown state (no OS, etc.), so conceptually I might be able to rob parts off of it, or transfer the RAID containers to it.

10 Posts

December 17th, 2012 11:00

The Peripheral Bay Backplane voltages problem was resolved by replacing the 2-bay SCSI cage assembly (328WD Rev A00 backplane) with a cage assembly (4575D Rev A00 backplane) from another PE4400.

I'm guessing that the part number difference on the backplane was not a factor, but I have no real way of knowing that without getting another 328WD backplane, and I don't see a point in doing that.

10 Posts

December 11th, 2012 10:00

The suggestion to check disks is helpful.  That possibility did not come to mind.  The operating system is on the two-disk array (in a mirror configuration), so it should be relatively easy to boot the system without the 8-disk array mounted (/home) and then run omreport with those drives pulled.  I suppose with patience, one might even pull one disk at a time (after letting the array rebuild between attempts) though that could be a pain, and is no help if more than one disk is sub-optimal.

December 11th, 2012 10:00

Thanks for your great post on getting OMSA to run. It's possible this is a backplane or disk issue. The part number for the 2x4 backplane option is 9403R and the 1x8 backplane option is 94243. I don't think we have any of these around anymore as the server is quite old, so if you have a matching one in the other server to try out, that may be the way to go. Since these probes refer to backplane power rails, it would be good, if possible, to remove the disks and boot the OMSA Live 4.5 CD from linux.dell.com/.../omsa-45-live to check whether you get the same report with the disks removed, to ensure it's not actually a disk problem. Let us know how it goes.

10 Posts

December 11th, 2012 10:00

The referenced live CD does not boot.  It references missing /dev/hda, /dev/hdb, /dev/hdc, and /dev/hdd devices.  I tried using the live CD before I got OMSA running on the hosted operating system.

December 11th, 2012 11:00

Thanks for the quick response. I was able to boot the OMSA disk (on a different system) without trouble, so the disk messages you saw may be related to a problem with the disks or the backplane. To clarify my suggestion, you could remove all disks at once and attempt a single live CD boot to see if the problem persists, rather than going through the trouble and RAID wear of rebuilding several times. If you can just go with the 2-disk OS mirror for testing, that would be good too.
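
As a rough illustration of that comparison, the Python sketch below parses two saved `omreport chassis volts` reports and lists which probes stay Critical with the disks removed. The function names are hypothetical, the parser assumes the plain "Key : Value" layout shown earlier in the thread, and `WITHOUT_DISKS` is only a placeholder, since no report with the disks pulled was actually captured here.

```python
def parse_probes(report: str) -> list[dict[str, str]]:
    """Split 'Key : Value' lines into one dict per probe (each starts at 'Index')."""
    probes: list[dict[str, str]] = []
    current = None
    for line in report.splitlines():
        if ":" not in line:
            continue  # skip '...' and blank lines
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Index":
            current = {}
            probes.append(current)
        if current is not None:  # ignore header lines before the first probe
            current[key] = value
    return probes


def critical_probe_names(report: str) -> set[str]:
    """Names of all probes whose Status is Critical."""
    return {
        p.get("Probe Name", "")
        for p in parse_probes(report)
        if p.get("Status") == "Critical"
    }


# Excerpt of the report posted above (thresholds omitted for brevity).
WITH_DISKS = """\
Health : Critical
Index : 21
Status : Critical
Probe Name : Peripheral Bay Backplane +5V
Reading : 6.630 V
Index : 22
Status : Critical
Probe Name : Peripheral Bay Backplane +12V
Reading : 10.550 V
"""

# HYPOTHETICAL placeholder, not real data: replace with the report
# captured after the disks have been removed.
WITHOUT_DISKS = WITH_DISKS

persisting = critical_probe_names(WITH_DISKS) & critical_probe_names(WITHOUT_DISKS)
cleared = critical_probe_names(WITH_DISKS) - critical_probe_names(WITHOUT_DISKS)

print("Still critical without disks:", sorted(persisting))
print("Cleared by removing disks:", sorted(cleared))
```

If a probe remains in `persisting` once real data is substituted, the disks are an unlikely cause and the backplane becomes the more probable suspect.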

10 Posts

December 17th, 2012 10:00

Nothing I do results in an ability to boot the live CD.  I tried pulling all the drives.  The following occurs:

...
Freeing unused kernel memory : 184k freed
mount: /dev/hda is not a valid block device
mount: /dev/hdb is not a valid block device
mount: /dev/hdc is not a valid block device
mount: /dev/hdd is not a valid block device
FATAL---LIVECD NOT FOUND
sh-4.00#

At this point I have a very minimal shell and can't do much with it.  If I ^D out of this shell, a kernel panic occurs.

Since I can't get that to work, I looked at the spare 4400 I have and swapped the 8-bay SCSI backplane assembly between the two systems, but that showed me it is not the "Peripheral Bay Backplane".  After the swap, the critical voltages are not cleared, and I get a new BIOS startup message that says:

PE4400
Embedded system management firmware revision: 5.50
System backplane firmware revision: 5.34
!!******: Warning : Firmware is out-of-date...
System Peripheral Backplane firmware version 1.30
Power supply paralleling board firmware version 2.41

Running the ESM update confirms that the warning is for the System backplane (since the out-of-date message only occurred when the 8-bay SCSI backplane assembly was replaced), so I still don't know what the System Peripheral Backplane is.

In clear view are a Control Panel Board (04442C) and the main motherboard.  After pulling the CPU fan assemblies, I see a backplane behind the 2-bay SCSI bay. Maybe that is the offending part.
