October 18th, 2011 08:00

PowerEdge 1900, PERC 5/i RAID 1+0 Dilemma

Good Morning Everyone,

Client is running a PE 1900, PERC 5/i with a RAID 1+0 configuration (4 physical disks, 500GB each) used as an internal file server (heavily used, I might add).

Starting this past weekend the server slowed to a crawl - extremely slow response times, etc.  The system event log is filling up with the following:

  • Event ID: 2271 "The Patrol Read corrected a media error.:  Physical Disk 0:0 Controller 0, Connector 0"
     
  • Event ID: 2095 "SCSI sense data Sense key:  3 Sense code: 11 Sense qualifier:  0:  Physical Disk 0:0 Controller 0, Connector 0"

over and over again.  I remoted in yesterday, opened OMSA (as an example of how slow the server has become, it took nearly 30 minutes to fully launch OMSA) ... it shows no errors, everything healthy.  After COB yesterday we initiated a restart of the box - that restart took over 2 hours!
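For anyone who wants to tally entries like these, here is a minimal Python sketch that counts events per disk from exported log lines. The regular expression and the `tally_events` name are assumptions built only around the two message formats quoted above; a real log export may be formatted differently.

```python
import re
from collections import Counter

# Sample lines copied from the two events quoted above. A real log export
# would be read from a file instead (this list is only an illustration).
SAMPLE_EVENTS = [
    'Event ID: 2271 "The Patrol Read corrected a media error.:  '
    'Physical Disk 0:0 Controller 0, Connector 0"',
    'Event ID: 2095 "SCSI sense data Sense key:  3 Sense code: 11 '
    'Sense qualifier:  0:  Physical Disk 0:0 Controller 0, Connector 0"',
]

# Assumed pattern: event ID first, then the disk and controller near the end.
EVENT_RE = re.compile(
    r"Event ID:\s*(\d+).*?Physical Disk\s+(\d+:\d+)\s+Controller\s+(\d+)"
)


def tally_events(lines):
    """Count events per (controller, physical disk, event ID)."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            event_id, disk, controller = match.groups()
            counts[(int(controller), disk, int(event_id))] += 1
    return counts


if __name__ == "__main__":
    # Repeat the sample three times to mimic the "over and over" pattern.
    for key, n in sorted(tally_events(SAMPLE_EVENTS * 3).items()):
        controller, disk, event_id = key
        print(f"controller {controller} disk {disk} event {event_id}: {n}")
```

If one disk accounts for nearly all of the 2271 and 2095 entries, that supports the suspicion that a single drive is the problem.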

So it appears to me (correct me if I'm wrong) that physical disk 0 is in the process of failing, it just hasn't completely failed yet - and that the PERC 5/i is continually scanning/attempting to repair the errors, which is slowing the system down to a crawl.  Assuming this is correct and given that this is a RAID 1+0, would it be wise to go ahead and manually fail disk 0, replace it, then allow the PERC to rebuild the array?  If so - and again, given the configuration - I should be able to perform the operation without taking the box all the way down, correct?

Any and all advice/recommendations are welcome and most appreciated.

Best,

Ken

7 Technologist

16.3K Posts

October 18th, 2011 09:00

Disk 0 may be failing, or it simply encountered an error, as all drives occasionally do.  However, seeing as how your system is at a crawl now, I would recommend some steps to determine if the drive(s) are good or bad.

Online Diagnostics can be run from Windows and can report within 2 minutes (per drive) on its health.  These diagnostics are 95% accurate, but they obviously cannot be 100% in that short amount of time.  These can be run while the system is up and running, without affecting the users/data.  Just make sure to select the Quick Test box.

If you need a more reliable/exhaustive test, 32-bit Diagnostics can be run outside of Windows on the drives.

In actuality, I would imagine disk 0 will fail, requiring a replacement.  If that is the case, yes, you can Prepare To Remove/Offline the drive, replace it, then rebuild it without taking the server down or rebooting.
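To illustrate why a single drive can be taken offline in a RAID 1+0 without losing the array, here is a rough Python sketch. The mirror pairing (disks 0+1 and 2+3) is an assumption for illustration only; the actual span layout on this controller is not stated in the thread and should be checked in OMSA before pulling anything.

```python
# Assumed layout: two mirrored pairs striped together. Which disks are
# actually paired is NOT given in the thread; (0, 1) and (2, 3) are hypothetical.
MIRROR_PAIRS = [(0, 1), (2, 3)]


def array_survives(offline_disks, mirror_pairs=MIRROR_PAIRS):
    """Return True if every mirror pair still has at least one online disk."""
    offline = set(offline_disks)
    return all(
        any(disk not in offline for disk in pair) for pair in mirror_pairs
    )


if __name__ == "__main__":
    for scenario in [{0}, {0, 2}, {0, 1}]:
        state = "survives" if array_survives(scenario) else "is lost"
        print(f"disks offline {sorted(scenario)}: array {state}")
```

Under that assumed layout, taking disk 0 offline leaves the array running on its mirror partner, but the array has no redundancy for that pair until the rebuild finishes.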

Moderator

8.4K Posts

October 18th, 2011 11:00

Thank you Ken,

The controller is current on updates. As Flash1932 suggested, you will want to run Online Diags on the drives. They can be run from the OS and don't require a reboot.

See if the drive is failing the diagnostic test.

Online Diags- 

7 Technologist

16.3K Posts

October 18th, 2011 20:00

Kill it.  There is no point in letting it continue, as it is obviously hung.  I would plan to simply replace the drive.

Moderator

8.4K Posts

October 18th, 2011 08:00

Ken,

The PERC 5/i does support forcing the drive offline and rebuilding it while the server is live. You will see a slight performance drop during the rebuild.

As for the drive, the 3-11-00 sense data decodes to a Medium Error with an Unrecovered Read Error, which doesn't indicate a hardware failure specifically, but may be pointing in that direction.
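As a rough illustration of how that triplet breaks down into sense key, sense code, and sense qualifier, here is a small Python sketch. The tables hold only a handful of standard SCSI entries, not the full list; the function names are hypothetical, and the values are treated as hexadecimal, which is an assumption about how OMSA prints them.

```python
# A few standard SCSI sense keys (partial table, for illustration only).
SENSE_KEYS = {
    0x0: "No Sense",
    0x1: "Recovered Error",
    0x2: "Not Ready",
    0x3: "Medium Error",
    0x4: "Hardware Error",
    0x5: "Illegal Request",
    0x6: "Unit Attention",
}

# A few additional sense code / qualifier pairs (partial table).
ADDITIONAL_SENSE = {
    (0x11, 0x00): "Unrecovered read error",
    (0x5D, 0x00): "Failure prediction threshold exceeded",
}


def decode_sense(key, asc, ascq):
    """Turn numeric sense key / code / qualifier into readable text."""
    key_text = SENSE_KEYS.get(key, f"unknown sense key {key:X}h")
    code_text = ADDITIONAL_SENSE.get(
        (asc, ascq), f"unlisted code {asc:02X}h/{ascq:02X}h"
    )
    return f"{key_text}: {code_text}"


def decode_triplet(text):
    """Decode a 'key-code-qualifier' string such as '3-11-00' (hex assumed)."""
    key, asc, ascq = (int(part, 16) for part in text.split("-"))
    return decode_sense(key, asc, ascq)


if __name__ == "__main__":
    print(decode_triplet("3-11-00"))
```

The sense key gives the broad category (here a media problem rather than a hardware fault), while the code and qualifier narrow it down to a read that the drive could not recover.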

I would like to ask a couple of questions to verify that the drive needs to be replaced.

Is the drive flashing just green, amber, or amber and green?

What is the PERC 5/i showing for Driver and Firmware versions in OMSA?

11 Posts

October 18th, 2011 09:00

Thanks so much for the reply.

PERC 5/i driver and firmware info:

Firmware Version 5.2.2-0072

Driver Version 2.24.00.32

Storport Driver Version 5.2.3790.4173

Regarding disk 0 flashing, no amber or green flashing.  These drives are all mounted within the internal cage (this isn't a rack server), so we have no visible light indicators.

Best,

Ken

11 Posts

October 18th, 2011 14:00

Thank you.  I've downloaded and installed the online diagnostics - launched a quick scan on the suspect drive ... the scan reached 90% and has been stuck there for over 30 minutes.  How long should I wait?  Is that an indicator that the drive is likely bad?

Best,

Ken

7 Technologist

16.3K Posts

October 18th, 2011 14:00

It is a strong indication.  What OS are you running?  If 2003, does it have Service Pack 2?

11 Posts

October 18th, 2011 15:00

Yes, Server 2003 with SP2.

Moderator

8.4K Posts

October 18th, 2011 16:00

Has it moved forward? What test did it halt on?

11 Posts

October 18th, 2011 18:00

Amazingly, yes - the test has been running now for 4 hours and 24 minutes and it's still stuck at 90%.  It's the disk self test (quick).

Advice?

Thank you!

Ken

11 Posts

October 19th, 2011 08:00

Appreciate all of the assistance - the drive finally died completely late last night ... that was definitely the issue.

October 6th, 2012 05:00

Dear Ken,

I have a PERC 5e installed with driver version 2.23.00.32 on a PowerEdge 1950 server, and it is now out of date. When I try to download the recommended firmware version 2.24.00.32 from the Dell website, I cannot find it. Could you help with this issue?

Syed Sarfraz Gillani

7 Technologist

16.3K Posts

October 7th, 2012 21:00

PERC 5/i or 5/E?  And do you need the firmware or the driver?  (Driver, not firmware, is 2.24.00.32.)
