I got an error message on the front panel of my PE2900: E1410 CPU1 was asserted (CPU1 Status: Processor sensor for CPU1, IERR was asserted). Restarting the computer fixed the problem yesterday. However, it happened again today and the machine will not boot any more. Could anyone help me?
DRAC 5 A00 1.32 (07.12.22)
You have a hardware failure - most likely CPU 1, though the fault could be elsewhere (the motherboard, perhaps). The Hardware Owner's Manual says that this error occurs when the specified CPU (CPU 1 in your case) reports an internal error.
If you have a hardware warranty, call Dell in the morning.
If you don't have a hardware warranty, then you are on your own.
If this is a single CPU machine, you could try to obtain a replacement CPU, which will be much easier with a single CPU machine, as you won't be trying to match the specification of an existing CPU. However, you must keep to the processor compatibility of your machine - the 5400 series processors only work in 2900 III machines, and 5300 series processors only work in 2900 II and III machines. I believe support for the older processors was dropped in the newer machines.
If this is a dual CPU machine, you could remove and store CPU 1 and move CPU 2 to the CPU 1 socket to see if the machine will now boot.
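To make the processor compatibility rule above concrete, here is a minimal Python sketch. It encodes only the two rules stated in this post, which have not been checked against Dell's documentation; the names `SUPPORTED_GENERATIONS` and `is_compatible` are made up for illustration, and anything not covered returns `None` rather than a guess.

```python
# Compatibility rules exactly as stated in the post above
# (an assumption: not verified against Dell's own documentation).
# Keys are Xeon series; values are the PE2900 generations said to accept them.
SUPPORTED_GENERATIONS = {
    "5400": {"2900 III"},
    "5300": {"2900 II", "2900 III"},
}


def is_compatible(cpu_series, generation):
    """Return True or False for a series covered above, or None if no rule is given."""
    allowed = SUPPORTED_GENERATIONS.get(cpu_series)
    if allowed is None:
        # The post is unsure about older series, so do not guess.
        return None
    return generation in allowed


if __name__ == "__main__":
    print(is_compatible("5400", "2900 II"))
    print(is_compatible("5300", "2900 II"))
```

Treat this as a checklist aid only; confirm the exact supported processor list for your machine before buying a replacement CPU.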
Thanks for your reply. I bought this machine in Feb 2008, so I believe it is still under maintenance. I'm not sure it's a hardware failure. The error message appeared when the room temperature was higher than 30C. However, the machine works fine at 28C. Is that normal?
I also believe there is something wrong with the PE2900. However, when I called the Dell service engineer, they told me that it was only because of the higher temperature and that everything is OK. Are there any other tools I could use to check the PE2900 for potential errors? Please advise.
Thanks for your help.
This error occurs when the CPU asserts its Internal ERRor (IERR) pin.
There are several possible reasons for this. There are a couple of listed processor errata that can cause it, but your failure to boot in what are not unreasonable ambient temperatures is of concern.
You could adopt a 'watch and wait' approach. However, if it happens again, I'd want an engineer out, certainly to reseat the heatsink for that CPU and preferably to swap the CPU for another. As snapohead says, it's a real and legitimate error.
The PE2900 died today after I got message E122C this morning (System Board CPU Power Fault: Voltage sensor for System Board, state asserted was asserted). It cannot be started any more. A Dell engineer will replace the CPU and motherboard. Because of the weekend, I need to wait until next Monday. My question is: why did it fail? Do you have any experience of this?
Hardware failure can easily be just one of those things. It could be that something like the voltage regulation components for the CPU failed on the motherboard; the first sign of this might be some sort of blip that caused the IERR.
A new motherboard and CPU should get you going - how unfortunate it failed on a Friday so that your service call isn't until Monday.
Many thanks for your recent comments. I'm just a little depressed, because I thought the PE2900 would be very stable. Do you have any suggestions about the operating environment for the PE2900? I do not want another hardware failure. Or any suggestions for a redundant system? I can do nothing without this server.
Thanks again for your help.
It's probably just "one of those things" - you could join the service engineer on Monday to see if there's any obvious component damage on the motherboard near the CPUs. I wouldn't be surprised if there's nothing to see, though.
There is a fair amount of redundancy in these servers (typically they have RAID disks and redundant power supplies), but it isn't possible to duplicate everything.