GeforceXP
1 Nickel

RE: Degraded RAID 1 Array:PowerEdge 2800


I've tried refreshing the page and clearing the history/cache, but it still doesn't show... (two minutes later) The rescan feature worked! Hurrah.

So, I'm now back to square one, where I have a RAID1 system with one predicted disk failure.

What would you do from here on in?

Furthermore, I can't test those drives I purchased as my other PE2800 has all 8 slots taken by 10krpm drives, and my other server has 15k drives. Common sense says I cannot mix-match 10k/15k rpm drives!

So, should I just go out and purchase another 73GB drive? Any thoughts would be very welcome.

Edit: Dell OpenManage Server Administrator Version 6.2.0

Thank you.

 

theflash1932
5 Iridium


"So, I'm now back to square one, where I have a RAID1 system with one predicted disk failure.  What would you do from here on in?"

Another thing that many people don't know is that a drive flagged with a predicted failure (pf) should be forced offline before it is replaced. Otherwise, the pf flag can be assigned to the new disk (until an actual 'rescan' is performed), which can result in some odd behavior (the controller is more sensitive to potential failure of pf drives).

"Common sense says I cannot mix-match 10k/15k rpm drives!"

You CAN mix/match speeds and/or sizes (there is nothing wrong with it 'technically'), but in real-world scenarios, you would never want to mix speeds (depending on the configuration, it would degrade the speed of the entire array).  One thing you CAN'T do is mix U320 and U160 SCSI drives on a backplane/controller.

"What would you do from here on in?"

I guess where you go from here depends on what you feel most comfortable doing.  There is a chance that the pf drive will operate just fine until you are done with the server (or until you can get another replacement, if you feel another replacement would be better than the ones you currently have), although it could cause additional issues if its status degrades.  There is a chance that the same problem could happen again if you try to replace it, but I believe that is unlikely.

If it were me (and it's important to remember that it's not :)), I would force offline disk 0 and replace it "hot" with the other 146GB drive.  Another thing you could do, which would probably feel less risky, is to simply insert the 146GB drive into an open slot, assign it as a hot-spare, then force offline disk 0, which will cause a rebuild to start automatically on the hot-spare.

 

GeforceXP
1 Nickel


Please forgive my lack of response (I'm in bed on my iPhone as it's nearly midnight here)

Firstly, I wasn't aware that one should force offline a PF disk, but that does make sense.

Just going back to what you said last about inserting the drive into one of the open slots, which I do have in this PE2800: what you're saying is, the PE will automatically rebuild the array to any hot spare that's available in the system once it detects a degraded array?

If I'm wrong, please tell me I am.

Also, can I assume then that if I make a disk 'offline', it's essentially a 'good copy' of the data as it was at the time the disk was taken offline?

I'm just thinking that if something were to go horribly wrong, it'd be nice to have a disk that's a copy of the OS that I can just pop back in and force online, even if it was degraded/PF.

I should be very honest at this point and say that I've not had any experience with hot spares or what their function is.

Very, very early start for me tomorrow - a downed mail server would be catastrophic, as you can probably imagine.

Thanks again for your replies!!

theflash1932
5 Iridium


"Just going back to what you said last about inserting the drive in to one of the open slots which I do have in this PE2800, what you're saying is, the PE will automatically rebuild the array to any hotspare that's available in the system once it detects a degraded array?"

That's right.

"Also can I assume then, if I make a disk 'offline' essentially that's a 'good copy' of the said data at that time before it was taken offline?"

I wouldn't make that assumption, especially about a disk with a predicted failure, but it could be used in its previous state to get the system back online.  If you ever do this though, you won't "force online" the drive ... you will need to boot to CTRL-M, Configure, View/Add and choose Disk View and save on exit to essentially "import" the configuration from the disk.  Only if it is the only disk available in the RAID 1 and currently showing FAILED should "force online" be used.

Hot spares are simply drives that are dedicated to automatically take over and rebuild should a disk fail.  They are very helpful for systems that are not actively monitored, or that are located in an area where an audible or visual indication of a failed drive cannot be heard/seen.  Oftentimes, people will go for days, weeks, or even months with a failed drive, then they lose everything when the second drive dies, because they had no idea that the first had failed.  A hot-spare helps mitigate data loss - or downtime - in those types of situations by simply rebuilding when one fails.  It is also the ONLY way to rebuild a drive in some situations (when a drive does not show as 'failed' - you cannot 'rebuild' a 'ready' drive - only a 'failed' drive).
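To make the hot-spare behaviour described above concrete, here is a toy Python model. It is only a sketch of the idea, not how the PERC firmware works: the class names, the state names, and the slot labels ("0:0", "0:1", "0:4") are all made up for illustration, and it assumes a forced-offline member is treated the same as a failed one.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Disk:
    slot: str                 # hypothetical slot label, e.g. "0:0"
    state: str = "ready"      # toy states: ready, online, rebuilding, failed


@dataclass
class Raid1:
    members: List[Disk]
    hot_spare: Optional[Disk] = None
    log: List[str] = field(default_factory=list)

    def assign_hot_spare(self, disk: Disk) -> None:
        # Only an unused ('ready') disk can be set aside as a spare.
        if disk.state != "ready":
            raise ValueError("only a 'ready' disk can become a hot spare")
        self.hot_spare = disk
        self.log.append(f"{disk.slot} assigned as hot spare")

    def force_offline(self, disk: Disk) -> None:
        # Toy assumption: forcing a member offline marks it as failed.
        disk.state = "failed"
        self.log.append(f"{disk.slot} forced offline")
        self._start_rebuild_if_possible(disk)

    def _start_rebuild_if_possible(self, failed: Disk) -> None:
        if self.hot_spare is None:
            self.log.append("array degraded, no spare: waiting for a replacement")
            return
        # The spare takes the failed member's place and starts rebuilding.
        spare, self.hot_spare = self.hot_spare, None
        spare.state = "rebuilding"
        self.members[self.members.index(failed)] = spare
        self.log.append(f"rebuild started on {spare.slot}")

    def finish_rebuild(self) -> None:
        for disk in self.members:
            if disk.state == "rebuilding":
                disk.state = "online"

    @property
    def status(self) -> str:
        states = {disk.state for disk in self.members}
        return "optimal" if states == {"online"} else "degraded"


# Walk through the "spare first, then force offline" option.
disk0 = Disk("0:0", "online")   # the pf drive in this example
disk1 = Disk("0:1", "online")
spare = Disk("0:4")             # new drive in an open slot

array = Raid1([disk0, disk1])
array.assign_hot_spare(spare)
array.force_offline(disk0)
status_during_rebuild = array.status
array.finish_rebuild()
status_after_rebuild = array.status

for line in array.log:
    print(line)
print(status_during_rebuild, "->", status_after_rebuild)
```

The point of the model is the ordering: because the spare is already assigned when disk 0 goes offline, the rebuild starts with no one having to notice the failure first.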

Get some sleep 🙂

GeforceXP
1 Nickel


Of course, logically the term "Hot Spare" means it's a spare that's ready to take over as and when a drive fails. Duh!

I can share the experience of having servers that are not actively monitored, apart from logging in via OMSA now and again, and in some cases I've had, like you said, drives that had been failed for days, or even a few weeks, but not months.

So, I think I'm at the point where I'm going to load the other 146GB drive I purchased (as I bought two) into a spare slot and see what happens. I can assign it as a global Hot Spare through OMSA, and then I have the choice of either waiting for the PF drive to fail or manually taking it offline.

In this scenario, would you say the best practice is to pull the PF drive whilst "hot", or go through the OMSA to take it offline? (not even sure if it can be done this way)

Thanks.

Edit:

I should add that the server can support 146GB SCSI drives, as the drives in 1:2 and 1:3 are both 146GB 15k drives (Hitachis).

The ones I've purchased are Seagate Cheetah 15k.5, model number ST3146855, which of course are U320s.

The theory is, these drives *should* work, but after my initial experience with the first one, no wonder I'm extremely hesitant!

theflash1932
5 Iridium


"In this scenario, would you say the best practice is to pull the PF drive whilst "hot", or go through the OMSA to take it offline?"

It can be done either way, but best practice with a pf drive is to force it offline from OMSA before removing it.

"I should add that the server can support 146GB SCSI drives as the drives in 1:2 and 1:3 are both 146GB"

Good to know, since you just referred to them as "something" in  your first post 🙂

GeforceXP
1 Nickel


Ha ha ha, yes you're correct I did indeed refer to them as "something"...!

I presume to take a disk offline it's under PERC 4e/Di > Connector 0 (RAID) > Enclosure (Backplane) > Physical Disks > "Disk in question" > Drop down box > "Offline ...." ?

This is the option I'm looking for after my array has successfully rebuilt, right?

I owe you a beer or two.. or three if this works 🙂

 

theflash1932
5 Iridium


"I presume to take a disk offline it's under PERC 4e/Di > Connector 0 (RAID) > Enclosure (Backplane) > Physical Disks > "Disk in question" > Drop down box > "Offline ...." ?"

You got it 🙂  Although, taking the drive offline is the option you will be looking for first (after assigning the hot-spare), not "after [the] array has successfully rebuilt" ... after taking it offline, the spare will begin to rebuild, then you can remove and toss disk 0.
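For anyone who would rather script this than click through the OMSA GUI, the same order of operations can be sketched with the OMSA command-line tools. The Python below only builds and prints the commands; it does not run them. The controller ID (0) and the pdisk IDs ("0:4" for the new drive, "0:0" for the pf drive) are placeholders, not values from this thread, and whether these actions are offered for a PERC 4e/Di under OMSA 6.2.0 is an assumption, so check the IDs with the omreport line first and confirm the syntax with the CLI's own help before using any of it.

```python
from typing import List


def hot_spare_then_offline(controller: int, spare_pdisk: str, pf_pdisk: str) -> List[List[str]]:
    """Build the commands in the order discussed: list disks, assign spare, force pf disk offline."""
    return [
        # 1. List the physical disks to confirm IDs and states.
        ["omreport", "storage", "pdisk", f"controller={controller}"],
        # 2. Assign the new drive as a global hot spare.
        ["omconfig", "storage", "pdisk", "action=assignglobalhotspare",
         f"controller={controller}", f"pdisk={spare_pdisk}", "assign=yes"],
        # 3. Force the predicted-failure drive offline; the spare should then rebuild.
        ["omconfig", "storage", "pdisk", "action=offline",
         f"controller={controller}", f"pdisk={pf_pdisk}"],
    ]


# Placeholder IDs for illustration only.
commands = hot_spare_then_offline(controller=0, spare_pdisk="0:4", pf_pdisk="0:0")
for cmd in commands:
    print(" ".join(cmd))
```

Only pull disk 0 once the rebuild onto the spare has finished, which you can check by re-running the omreport line.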

coolguy3384
1 Copper


Where can I get storport driver version 5.2.3790.4173?

theflash1932
5 Iridium


You can only get it from Microsoft, but Dell has links to it on their Support site. If you want a link, we need all the info about your situation - server, controller, OS, etc.
