

12 Posts


March 3rd, 2008 00:00

Critical Server Error

Hello,

 

We lost connectivity to a PE2950 earlier today and when we arrived at the data centre the front-panel LCD was orange and saying the following: "E1420 CPU Bus PERR".

 

I power cycled the machine and it came back up with no problems and currently SEEMS to be running fine.

 

When I loaded up Dell Server Administrator, it showed everything with green check marks and doesn't report any issues. When I look at the log section, it shows that there was a critical error (red x) and reports a similar error to the front-panel LCD.

 

My questions now are:

1) What is the error? Is it something I should be worried about?

2) Why is the front-panel LCD still amber and showing that error when Server Administrator is reporting no issues?

 

Please advise, thanks

 

 

1.2K Posts

March 4th, 2008 02:00

Clearing the logs will change the LCD to blue and remove the error. But I suspect the error may appear again at a later stage. Were there any errors indicating a particular CPU? If not, it may require a replacement MB at some stage.

See how it runs, and if the problems persist then log a call with Dell.

12 Posts

March 4th, 2008 12:00

Hi Snapohead,

 

Thanks for your reply.

 

Do you know what that error actually means? What is PERR? Some sort of parity error?

 

The strange thing is that the server has been working perfectly fine since the incident. Not sure what could have caused it. I'm worried because it's an important server.

 

I'm using a PERC RAID controller (RAID 1 on the system drives, RAID 10 on the data drives). If I pull those drives and put them into an identical PE2950, will the RAID still work? (Does the config get written to the drives, or does the controller have to move with them?)

 

Thanks

1.2K Posts

March 7th, 2008 02:00

Yes, it's a parity error on the CPU bus, although I'm not sure what would cause this. It may well be a one-off (hopefully). The config is written to both the disks and the controller, so swapping to an identical system will work.

You'll need to import the config from the disks though, as it will be detected as foreign.
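On the "what is a parity error" question, here is a rough sketch of the general idea in Python. It only illustrates even parity over a data word; the real check on the CPU bus is done in hardware, and the function names and 8-bit word here are made up for illustration, not anything taken from Dell or Intel documentation.

```python
def parity_bit(word: int) -> int:
    """Even-parity bit for a data word: 1 if the number of 1 bits is odd, else 0."""
    return bin(word).count("1") % 2


def check(word: int, stored_parity: int) -> bool:
    """True if the received word still matches the parity bit sent with it."""
    return parity_bit(word) == stored_parity


# Hypothetical 8-bit value "on the bus" (illustrative only).
data = 0b10110010
p = parity_bit(data)

# Simulate a single bit flipping in transit.
corrupted = data ^ 0b00001000

print(check(data, p))       # intact word passes the check
print(check(corrupted, p))  # single-bit flip is detected
```

A single parity bit can only tell you that an odd number of bits changed, not which ones or why, which is consistent with the error message not pointing at a specific cause.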

12 Posts

March 11th, 2008 13:00

 

Phainlen,

 

I think you may be on to something. The server it happened on for me is also running Virtual Server 2005 R2!!!

 

Hmm... Do any Dell support people here know what's going on?

14 Posts

March 11th, 2008 13:00

I am also getting the "CPU Bus PERR" on my 1950 servers.  The strange thing is that it is only happening on servers that I am running Microsoft Virtual Server 2005 R2 on.  This error has happened on three different 1950 servers so far for me (all running MVS 2005 R2).  We had the motherboard replaced on one of them through Dell support and two weeks later the same issue came up.  We then replaced the entire server with a new 1950.  Two weeks later, the same issue happened again.

9 Posts

March 12th, 2008 12:00

I am also having this problem when running OpenVZ on my PE2950.

14 Posts

March 13th, 2008 13:00

Interesting thought about the PERC.  We are booting from the SAN so we aren't even using the PERC.  Given that info, I would almost eliminate the storage side as a potential issue since we are both having the same issue with different types of storage.

23 Posts

March 13th, 2008 13:00

The crashing box is

 

BIOS Information: Manufacturer Dell Inc., Version 2.2.6, Release Date 02/05/2008

 

 

The non-crashing box with Datacenter is

 

BIOS Information: Manufacturer Dell Inc., Version 1.3.7, Release Date 03/26/2007

9 Posts

March 13th, 2008 13:00

Do you know what BIOS revision is on each box? I will hopefully be working on our 2950 today. I read somewhere on another Dell forum to try turning on the virtualization option (under the advanced screen, then under processor), so I will be trying that today.

 

23 Posts

March 13th, 2008 13:00

So glad I found this thread.

 

Here's my situation.

 

I have the following

 

  • Dell PowerEdge 2950, 10 GB RAM, Perc 5 RAID, Windows 2003 x64 Enterprise, Virtual Server R2 x64
  • Dell PowerEdge 2950, 24 GB RAM, Perc 5 RAID, Windows 2003 x64 Datacenter, Virtual Server R2 x64
  • Dell PowerEdge 2950, 28 GB RAM, Perc 6 RAID, Windows 2003 x64 Datacenter, Virtual Server R2 x64

Oh, the other difference is the processors.

 

OKAY, so the first 2 boxes ran great. For months. Box number 3 I just received and set up; it ran great for 2 weeks, and BAM: CPU BUS PERR E1420 on the front LCD. The machine was turned off when I got into work. I powered it on and there were no entries in the Windows event logs, but Dell ESM showed that error.

 

I called Dell and they said it might be a mis-seated CPU. I wasn't happy with that, since the machine had run fine, so they offered to replace the processors. I didn't want them replaced because the box was new, so they sent me a new box.

 

I rebuilt the new box and 2 days later had the SAME issue, but this time the machine actually rebooted. We are confused. The only thing that stayed the same was the RAM, so we ran a memory test and everything passed. Now what?

 

I spoke to Dell again, and then I found this post as I was talking to level 3 support. The tech said he has a few documented cases like this, and Virtual Server seemed to be a common factor. But it is running fine on my other boxes, so I was thinking it's either the PERC or the processor architecture. The machine has been running okay for a few days now. They don't know why this error occurs, but it really has me in a bind.

 

If anyone finds anything out, let me know. I will DEFINITELY keep you all posted on progress.

23 Posts

March 13th, 2008 13:00

No, I totally agree. What about your CPUs? What OS are you running? 32-bit or 64-bit?

 

I think it's Virtual Server trying to access something out of processor range.

12 Posts

March 13th, 2008 13:00

HI GUYS! I'm so glad I started this thread - now I know I'm not going crazy!

 

I got two 2950s back in November and they've been running fine. Last week I had the error happen on one of them and it just shut off. Since then it hasn't happened again, but I'm freaked that it may, as it's a production machine.

 

I'm running

-Windows Server 2003 R2 Enterprise

-Virtual Server 2005 R2 32bit

-8GB RAM

-PERC RAID CONTROLLER:

     -RAID 1: 2x36GB SAS (OS DRIVE)

     -RAID 10: 4x300GB SAS (DATA DRIVE)

 

I will keep everybody updated on any progress I have - hope you all will do the same. 

 

 

 

23 Posts

March 13th, 2008 13:00

Cool, I'm running 64-bit, so that eliminates 32-bit vs. 64-bit as the problem. It also eliminates the storage, as the other guy is using a SAN.


I think now we should discuss the processors themselves. Steppings etc.

14 Posts

March 13th, 2008 14:00

I have dual quad-core 3.0 GHz Intel Xeons. I'm beginning to agree with the theory that MS Virtual Server is the problem.

14 Posts

March 13th, 2008 14:00

My servers are running Windows Server 2003 R2 Standard SP2 x64.
