1 Rookie

 • 

6 Posts

6065

August 11th, 2020 10:00

RAID 5: Replacing a single failed HDD

I have a PowerEdge R900 server with a RAID 5 array of 3 HDDs.
It is currently running on only 2 HDDs because 1 HDD failed. What is the correct way to install the new HDD (not hot swap)?
As I understand it, I have to shut down the server and insert the new HDD, but what procedure should I follow in the BIOS Configuration Utility to ensure a successful installation?

[Attached screenshot: aquiros_0-1597168769535.png]

 

4 Operator

 • 

2.9K Posts

August 12th, 2020 08:00

The import foreign and clear commands are generally more for when an existing virtual disk has some sort of sync error. Clearing and importing (depending on the situation) are ways to bring a member disk back into the array. The new disk you have isn't a member at this time, so I wouldn't expect those options to show up. For example, say someone has a RAID 5 and something causes one of the drives to fall out of sync, like a timeout condition or a power issue. In that case I'd expect a possible need to import. The key is that the drive would already have a RAID metadata stamp written to it, an identifier indicating that it is supposed to be an active member of an array.

4 Operator

 • 

2.9K Posts

August 11th, 2020 15:00

Hello,

 

Using the PERC utility won't be necessary; the rebuild can be checked and confirmed from within the OS. You'll just need to shut the server down, remove the bad drive, and install the replacement, and the controller should automatically begin to rebuild onto it. If OpenManage Server Administrator is installed, you can confirm there that the rebuild has started. If it hasn't, you'll just need to mark the replacement drive as a hot spare. You can take both of these steps within the PERC utility as well, but it isn't necessary.

1 Rookie

 • 

6 Posts

August 12th, 2020 08:00

Thank you very much for your help.
I understand what you are saying. So with the new HDD in place, when the server turns on there is no need to add it to the disk group? Sometimes I see that the HDDs show Import Foreign or Clear; should either of these options be selected for the new drive?

 

1 Rookie

 • 

6 Posts

August 12th, 2020 09:00

Ok, got it, thank you very much for your help and information.

1 Rookie

 • 

6 Posts

August 13th, 2020 10:00

Thank you. As a hot spare, does it affect the operation of the RAID 5 at all? Can data be compromised?
This server has two HDDs that have to be replaced. This is the first one I changed, and then I will have to change another.

4 Operator

 • 

2.9K Posts

August 13th, 2020 10:00

It looks like the controller doesn't like the drive. Have you tried marking it as a hot spare? If you've already tried that, I'd recommend exporting a controller log so that we can get a better idea of why the controller is behaving this way.

1 Rookie

 • 

6 Posts

August 13th, 2020 10:00

Thank you for your help,

I'm on site now. I have already replaced the HDD with the new one and the server boots, but the new HDD doesn't seem to be rebuilding:

 

------------------------------------------------------------------------------
EID:Slt  DID  State  DG  Size        Intf  Med  SED  PI  SeSz  Model        Sp
------------------------------------------------------------------------------
32:0     0    Onln   0   136.125 GB  SAS   HDD  N    N   512B  ST3146356SS  U
32:1     1    Onln   0   136.125 GB  SAS   HDD  N    N   512B  ST3146356SS  U
32:2     2    UGood  -   136.125 GB  SAS   HDD  N    N   512B  ST3146356SS  U
------------------------------------------------------------------------------

I'm trying to set it online, but I get this error:

[root@icedb1 perccli]# ./perccli64 /c0/e32/s2 set online
Controller = 0
Status = Failure
Description = Set Drive Online Failed.

Detailed Status :
===============

------------------------------------------------------
Drive       Status   ErrCd  ErrMsg
------------------------------------------------------
/c0/e32/s2  Failure  255    Operation not allowed.
------------------------------------------------------
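
For anyone who wants to script this check, here is a minimal Python sketch. It assumes the drive listing uses the same column order as the output above and picks out drives that are not assigned to any drive group (DG shown as `-`). A drive in that situation, like slot 32:2 here, is not an array member, which is likely why `set online` is refused. The names `parse_drives` and `unassigned` are hypothetical helpers, not part of perccli.

```python
from typing import List, NamedTuple


class Drive(NamedTuple):
    slot: str         # "enclosure:slot", e.g. "32:2"
    state: str        # e.g. "Onln", "UGood"
    drive_group: str  # drive group number, or "-" if unassigned


# Rows copied from the listing in this thread.
SAMPLE = """\
EID:Slt DID State DG Size Intf Med SED PI SeSz Model Sp
32:0 0 Onln 0 136.125 GB SAS HDD N N 512B ST3146356SS U
32:1 1 Onln 0 136.125 GB SAS HDD N N 512B ST3146356SS U
32:2 2 UGood - 136.125 GB SAS HDD N N 512B ST3146356SS U
"""


def parse_drives(text: str) -> List[Drive]:
    """Extract slot, state and drive group from whitespace-separated rows."""
    drives = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a numeric "enclosure:slot" field; this skips
        # the header row and the dashed separator lines.
        if len(fields) < 4 or not fields[0].replace(":", "").isdigit():
            continue
        drives.append(Drive(fields[0], fields[2], fields[3]))
    return drives


def unassigned(drives: List[Drive]) -> List[str]:
    """Return the slots of drives that belong to no drive group."""
    return [d.slot for d in drives if d.drive_group == "-"]


drives = parse_drives(SAMPLE)
print(unassigned(drives))
```

With the listing from this thread, only the replacement drive in slot 32:2 is reported as unassigned.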

 

4 Operator

 • 

2.9K Posts

August 13th, 2020 11:00

Doing anything to the replacement drive at this point will not affect the array, because it isn't an array member. Flagging the new drive as a hot spare only tells the controller that it can be used to 'heal' any array in a degraded state. Since there is only one degraded array, the controller would use the drive to rebuild the RAID 5.
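
As a rough illustration of the hot spare step, here is a Python sketch that only assembles the command lines and does not run them. It assumes perccli follows the StorCLI-style syntax `add hotsparedrive` and `show rebuild` on a drive path, and it reuses the `/c0/e32/s2` address from the output earlier in this thread. The helper names are hypothetical, and whether this particular controller accepts these commands is not confirmed here.

```python
from typing import List


def drive_path(controller: int, enclosure: int, slot: int) -> str:
    """Build a drive address such as /c0/e32/s2."""
    return f"/c{controller}/e{enclosure}/s{slot}"


def hot_spare_commands(controller: int, enclosure: int, slot: int,
                       binary: str = "./perccli64") -> List[List[str]]:
    """Return the argument lists for flagging a drive as a hot spare
    and then checking rebuild progress (assumed StorCLI-style syntax)."""
    path = drive_path(controller, enclosure, slot)
    return [
        [binary, path, "add", "hotsparedrive"],  # assumption: global hot spare
        [binary, path, "show", "rebuild"],       # assumption: progress query
    ]


for cmd in hot_spare_commands(0, 32, 2):
    print(" ".join(cmd))
```

Printing the commands first, rather than executing them, leaves room to review them against the controller's own documentation before touching the array.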

 

That said, if the same array has 2 drives in need of replacement, the conversation is a bit different. The array may not be able to recover, even if the drives are just predicted to fail. For example, if drive X and drive Y both have bad blocks within the same data stripe, then that stripe would be bad and would not be able to rebuild. This would be a double fault condition. I'll link to an article below that will go into detail on this.
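
To make the double fault point concrete, here is a small Python sketch of XOR parity on a single stripe of a three-drive RAID 5 (two data blocks plus one parity block). The byte values are made up for illustration, and a real controller works on much larger blocks with rotating parity; this only shows the arithmetic.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: two data blocks and their parity (illustrative values).
d0 = b"\x11\x22\x33\x44"
d1 = b"\xaa\xbb\xcc\xdd"
parity = xor_blocks(d0, d1)

# Single fault: d1 is unreadable, so rebuild it from d0 and parity.
rebuilt = xor_blocks(d0, parity)
print(rebuilt == d1)

# Double fault: d0 and d1 are both unreadable in this stripe. Parity alone
# is ambiguous, because a different pair of blocks gives the same parity.
other0 = b"\x00\x00\x00\x00"
other1 = parity
print(xor_blocks(other0, other1) == parity)
```

Both checks print `True`: one missing block can be recomputed, but with two blocks missing in the same stripe the parity no longer identifies the original data.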

 

Also, sending you a private message. I'm going to provide you with my email address. If you'd like me to look over your controller log, you can email it to me directly.

1 Rookie

 • 

6 Posts

August 13th, 2020 12:00

Thank you for your reply, and not really. 

No, no changes have been made to that storage chain.

I'm shutting down the server to insert the HDD in another available slot, just to test.

If I press F2 on the VD in the PERC 6/1 BIOS utility, I have the option to set the HDD as:

Manage Ded. HS.

I'm sharing what I see in the PERC utility with you in a private message.

 

4 Operator

 • 

2.9K Posts

August 13th, 2020 12:00

It appears there's an additional underlying issue.

Event Description: Enclosure PD 20(c None/p0) phy bad for slot 2

This would indicate that a cable or connector has failed. I'm also seeing some entries showing failed commands and bus resets. These might be addressed by updating the PERC firmware, but the PHY bad message is concerning. Has any part of the storage chain behind the drives been replaced recently?
