Unsolved


12 Posts


February 9th, 2006 00:00

PowerEdge 1750 becomes unresponsive under high load

We have two Dell 2650 servers and one 1750, all running Windows Server 2003. Both of the 2650s seem to be fine, but the 1750 has been hanging a lot lately. It seems that whenever there is a high load on the server it becomes unresponsive, and all I can do is a hard reboot. The major problem is that there is no log of any kind after the reboot, so I am not sure whether it is a hardware problem or a software problem. Has anyone experienced the same problem?

February 9th, 2006 10:00

No, but I have used a tool that helps determine whether a server has a software or hardware issue. Here is a link to download it. You will need to extract the .zip file and then burn the ISO image to a CD. Then restart the server with the CD-R in the CD drive; the disc is bootable. Choose the GUI option and then select the express test. I tried the extended test once and it ran for well over an hour.
 
LINK:
 
 
Instructions on how to set up the CD-R:
 
Custom Instructions for EI5061A0.ZIP:

NOTE: In order to use this file, you need a CD-R or CD-RW and appropriate software to create a CD from the downloaded image.

Download

1. Select the download method you want to use to retrieve the file - either HTTP or FTP.
2. If the Export Compliance Disclaimer window appears, read the agreement and then select either "I agree" or "I disagree".
3. When the File Download window appears, click Save.
4. From the Save As dialog, select a location and a file name then click "Save".
5. If a Download Complete window appears, click Close.

Installation

1. Unzip the .ISO file from the downloaded .ZIP file and save to your hard drive.
2. Follow your CD writing software vendor's instructions to create a CD from the saved .ISO file.

Running the Diagnostics from a CD

1. Insert the CD into your CD-ROM, CD-RW, or DVD drive.
2. Boot to the CD.
3. Follow the on-screen instructions.

12 Posts

February 9th, 2006 15:00

OK, I will try that tonight, but I have heard some people say that it might be the RAID controller or the network card. I am not sure which one it is, because after a reboot everything goes back to normal.
 
Which diagnostic should I run? I think most of them deal with memory.

Message Edited by faifai9394 on 02-09-2006 11:48 AM

6 Operator


1.8K Posts

February 9th, 2006 20:00

One possible cause of such a problem is failing to reboot after Microsoft updates, even when a reboot is not required.
I would update the firmware and drivers on the RAID card; it is very unlikely to be a NIC or NIC driver problem.
You might want to "print screen" the Task Manager screens for a couple of weeks, sorted by memory usage, and save them in a file to check for a memory leak.
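The memory-leak check suggested above could also be scripted instead of done with screenshots. The following is only a minimal sketch, assuming snapshots are collected as text from the Windows `tasklist /FO CSV /NH` command (image name in the first column, memory usage in the fifth). The function names, the sample data, and the 10,000 KB growth threshold are hypothetical choices for illustration, not anything suggested by the posters.

```python
import csv
import io


def parse_tasklist_csv(text):
    """Return {image_name: total_kb} from `tasklist /FO CSV /NH` style output."""
    usage = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 5:
            continue  # skip blank or malformed lines
        name, mem = row[0], row[4]
        # "120,000 K" -> 120000; also tolerates other thousands separators
        digits = "".join(ch for ch in mem if ch.isdigit())
        if not digits:
            continue  # e.g. a header row when /NH was not used
        usage[name] = usage.get(name, 0) + int(digits)
    return usage


def steadily_growing(snapshots, min_growth_kb=10_000):
    """Names whose memory never shrinks between snapshots and grows overall.

    min_growth_kb is an arbitrary illustrative threshold (assumption).
    """
    if len(snapshots) < 2:
        return []
    suspects = []
    for name in snapshots[0]:
        series = [snap.get(name) for snap in snapshots]
        if None in series:
            continue  # process was not present in every snapshot
        never_shrinks = all(b >= a for a, b in zip(series, series[1:]))
        if never_shrinks and series[-1] - series[0] >= min_growth_kb:
            suspects.append(name)
    return sorted(suspects)


# Hypothetical snapshots taken on three different days.
DAY_1 = '"sqlservr.exe","1234","Console","0","120,000 K"\n"svchost.exe","800","Console","0","15,000 K"\n'
DAY_2 = '"sqlservr.exe","1234","Console","0","180,000 K"\n"svchost.exe","800","Console","0","14,500 K"\n'
DAY_3 = '"sqlservr.exe","1234","Console","0","260,000 K"\n"svchost.exe","800","Console","0","15,200 K"\n'

if __name__ == "__main__":
    snaps = [parse_tasklist_csv(day) for day in (DAY_1, DAY_2, DAY_3)]
    print(steadily_growing(snaps))
```

A process that only ever grows is not proof of a leak (a database server may simply be filling its cache), so the result is a list of candidates to look at, nothing more.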
 

12 Posts

February 9th, 2006 21:00

Do you guys think upgrading the RAID firmware would mess up SQL? That server runs a SQL application, so I am afraid the upgrade will mess it up. It is also the only server that has the application.

6 Operator


1.8K Posts

February 10th, 2006 12:00

There is always a slim chance that a firmware update will go wrong (about 1 in 200), but I have not had this happen in years. Obviously, back up first, document the RAID setup, have the old firmware disk on hand, and make sure the server is on a battery backup unit. Not trying to scare you; it should go fine.
 
What disks do you have (model, manufacturer)?
 

Message Edited by pcmeiners on 02-10-2006 08:33 AM

12 Posts

February 10th, 2006 15:00

I think those are Seagates. The problem is that I never set up that RAID, and my supervisor who set it up is no longer here.

Message Edited by faifai9394 on 04-11-2006 11:39 AM

6 Operator


1.8K Posts

February 10th, 2006 16:00

"my supervisor who had set that up just got laid......" :smileyhappy: Nice, your supervisor just got "some".
 
The reason I asked is that Seagate had an issue, under heavy load, with disk firmware below revision 006 on certain 10K and 15K drives. What revision do you have? I am not sure whether the Dell software will show the revision, but if you shut down, you could pull a disk, check the revision, then put the disk back in and restart the server.
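Once the revision labels have been read off the drives, the "below revision 006" check can be written down as a small helper. This is a rough sketch only: the label formats (`"006"`, `"Rev 0005"`), the slot names, and both function names are assumptions made for illustration, since the thread does not say how the revision is printed on the drive.

```python
import re


def revision_number(label):
    """Extract the first run of digits from a label such as '006' or 'Rev 0005'.

    The label format is an assumption; real drive labels may differ.
    """
    match = re.search(r"\d+", label)
    if match is None:
        raise ValueError(f"no digits in revision label: {label!r}")
    return int(match.group())


def drives_below_minimum(revisions, minimum=6):
    """Return the slots whose firmware revision is lower than `minimum`.

    `revisions` maps a slot name to the label read from that drive.
    """
    return sorted(
        slot for slot, label in revisions.items()
        if revision_number(label) < minimum
    )


if __name__ == "__main__":
    # Hypothetical labels for three drives.
    print(drives_below_minimum({"0": "006", "1": "0005", "2": "rev 4"}))
```

Any drive the helper reports would still need to be checked against the drive model, since the issue described applies only to certain 10K and 15K drives.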
 
I still think the RAID firmware is the most likely cause.
 

12 Posts

February 10th, 2006 16:00

I am not sure what the revision is right now; I think I need to pull a disk out. So far I have read a lot about the RAID having problems, but most of those reports are for Linux, not Windows. I think our firmware might be old because we got that system about two years ago. The bad part is that my supervisor never tried the system and left it lying around for almost half a year before we started using it, so no more Dell support.

 

Message Edited by faifai9394 on 02-10-2006 01:09 PM

12 Posts

February 10th, 2006 19:00

Do you have to reconfigure the RAID after the firmware upgrade? That is the only thing I am worried about, because that server has an important application that we use every day.

6 Operator


1.8K Posts

February 10th, 2006 19:00

You can still get drivers/firmware from the Dell site's download center.
I would definitely update the RAID firmware. A few months back I was getting corruption, disk errors, etc. until a new firmware flash.
 
 

February 13th, 2006 10:00

I have to admit, this is the funniest post I have ever read (quoting the earlier post: "I think those are segates, the problem is that I had never setup that raid and my supervisor who had set that up just got laid......").
 
RECONFIGURING YOUR RAID. You should never have to do this unless you choose to. Before you update the firmware on your PERC controller, make sure you know which one it is, and make sure that you don't have a failed drive.
 
Did you get a chance to run the Dell 32-bit diagnostics? That will tell you if there is a hardware issue and will generate log files too. You should run this and let us know how it goes.

Message Edited by Joseph_Pickell on 02-13-2006 06:14 AM

12 Posts

February 13th, 2006 15:00

I have run the diagnostic already, and I got this error:

IPMI-IPMI System Event log Check

Test Result: Fail

Error Code 2900:0221

Msg: : power supply: power supply sensor power supply A/C Lost
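When several diagnostic runs need to be compared, a result like the one pasted above can be turned into a small record. The sketch below assumes the layout is exactly what appears in this post (a test name line, then `Test Result:`, `Error Code`, and `Msg:` lines); it is not a documented Dell format, and the function name and dictionary keys are hypothetical.

```python
REPORT = """IPMI-IPMI System Event log Check

Test Result: Fail

Error Code 2900:0221

Msg: : power supply: power supply sensor power supply A/C Lost
"""


def parse_report(text):
    """Parse one pasted diagnostic result into a dict.

    Assumes the four-field layout seen in the post; unknown lines are ignored.
    """
    record = {"test": None, "result": None, "error_code": None, "message": None}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Test Result:"):
            record["result"] = line.split(":", 1)[1].strip()
        elif line.startswith("Error Code"):
            record["error_code"] = line[len("Error Code"):].strip()
        elif line.startswith("Msg:"):
            # Drop the stray leading ": " that appears in the pasted output.
            record["message"] = line[len("Msg:"):].strip(" :")
        elif record["test"] is None:
            record["test"] = line  # first unrecognised line is taken as the test name
    return record


if __name__ == "__main__":
    for key, value in parse_report(REPORT).items():
        print(f"{key}: {value}")
```

Here the message field points at a power supply A/C loss event recorded in the IPMI system event log, which is why the next question in the thread is about the power supplies.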

February 13th, 2006 15:00

How many power supplies do you have in the system?
 
You should check for the failure indicator (amber light) on the power supply. If it is plugged into a UPS, you may want to bypass the UPS while running the diagnostics. You might also want to replace the power cord going to it. These are just a few ideas.

12 Posts

February 13th, 2006 15:00

I think it has only one, and yes, it is connected to a UPS, but I don't think power is the major problem.

April 4th, 2006 19:00

Hi,

I am having the same kind of problem with our PowerEdge 1750 server. It becomes unresponsive and reboots itself twice a week. We recently added memory and a hard disk to the server. The memory diagnostics did not come up with anything, so I am planning to run the complete diagnostics on all devices.

Let me know if you have found a solution to your problem and what you did to get there.

Thanks,
Vineet