February 16th, 2009 06:00

Help with Poweredge server crashing/freezing/failing

Here at work we have a Dell Poweredge 1800. This is the problem I've been experiencing.

One weekend last November, one of my clients tried to connect to the server remotely over VPN. It didn't work. I went to the site of the server and found it frozen (blank screen, no response to input). I powered the server off and on again, but it would not start up.

I went through the usual routine, pulling everything until I was down to the bare hardware. The server still refused to boot: it either starts to boot and then restarts, or sometimes doesn't attempt to boot at all.

Eventually, it worked. And since then, this exact issue has happened more than 5 times, and it's getting annoying.

I'd like to get to the source of the problem. I don't have any spare hardware to swap in, this being a Xeon server, so the best I could do was look at the Dell server logs. They contain hundreds of entries like these:

  Sat Jan 10 13:51:35 2009  PROC_1 VCORE voltage sensor state deasserted
  Sat Jan 10 13:51:39 2009  PROC_1 VCORE voltage sensor state asserted
  System Boot  BMC 1.5V PG voltage sensor state asserted
  System Boot  BMC 1.5V PG voltage sensor state deasserted
  System Boot  BMC PROC VTT voltage sensor state asserted
  System Boot  BMC PROC VTT voltage sensor state deasserted
  System Boot  BMC VRD 1 Temp temperature sensor detected a warning (8.0 C)
  System Boot  BMC VRD 1 Temp temperature sensor returned to normal (10.0 C)
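
With hundreds of entries, it may help to tally them per sensor before guessing at a cause. Below is a minimal Python sketch, assuming the log has been exported to plain text in the same layout as the lines above. The regex, the `tally_events` helper, and the choice to treat "BMC" as part of the sensor name are assumptions made for illustration, not a documented Dell log format.

```python
import re
from collections import Counter

# Sample lines copied from the log excerpt in this post.
SAMPLE_LOG = """\
Sat Jan 10 13:51:35 2009  PROC_1 VCORE voltage sensor state deasserted
Sat Jan 10 13:51:39 2009  PROC_1 VCORE voltage sensor state asserted
System Boot  BMC 1.5V PG voltage sensor state asserted
System Boot  BMC 1.5V PG voltage sensor state deasserted
System Boot  BMC PROC VTT voltage sensor state asserted
System Boot  BMC PROC VTT voltage sensor state deasserted
System Boot  BMC VRD 1 Temp temperature sensor detected a warning (8.0 C)
System Boot  BMC VRD 1 Temp temperature sensor returned to normal (10.0 C)
"""

# Assumed layout (hypothetical, inferred only from the lines above):
#   <timestamp or "System Boot"><2+ spaces><sensor name> <voltage|temperature> sensor <event>
LINE_RE = re.compile(
    r"^(?P<when>.+?)\s{2,}(?P<sensor>.+?) "
    r"(?P<kind>voltage|temperature) sensor (?P<event>.+)$"
)


def tally_events(log_text):
    """Count how often each (sensor, event) pair appears in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # lines that do not fit the assumed layout are skipped
            counts[(match["sensor"], match["event"])] += 1
    return counts


if __name__ == "__main__":
    # Print the most frequent (sensor, event) pairs first.
    for (sensor, event), n in tally_events(SAMPLE_LOG).most_common():
        print(f"{n:5d}  {sensor}: {event}")
```

Run against the full log, a tally like this would show whether one sensor (for example PROC_1 VCORE) accounts for most of the entries, which could help narrow the search to one part of the board.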

Now, I suspect this has something to do with the VRM or the processor, but I can't be sure because I don't know what those errors mean.

I'd appreciate it if anyone can help me out with this. Thanks!

6 Posts

April 12th, 2009 21:00

A shot in the dark: Replace the Power Supply.

1 Message

December 1st, 2011 17:00

I had a similar situation and it turned out to be the motherboard. In my case there were no temperature warnings. I replaced the power supply twice and also swapped the original one into another PE1800, but the problem stayed with the original server. Eventually the server would not boot at all. I had another PE1800 that I had bought for parts, and I configured it using the disks and RAM from the bad server.

I later replaced the motherboard with a used one I bought on eBay, and that server is the one I am posting from now. PE1800 motherboards can be found on eBay for less than $100.
