April 26th, 2021 08:00

Replacement on PE R710 and Perc 6

I have a R710 and PERC 6i running ESXi.  What is the method to use to hot-swap a failing drive?  It is not running OM.

2.9K Posts

April 27th, 2021 13:00

I would 1,000% agree with Joerg. OMSA would provide you a GUI to take these steps in. However, OMSA installation in an ESXi environment *DOES* require a reboot, so you'll want to take that into account, if you do choose to install the tool. You can find the latest version of OMSA for your version of ESXi using the support page linked below. I'd have linked the latest version of OMSA, but I didn't see your ESXi version posted.

 

https://dell.to/3gFpS3f

 

https://dell.to/3vtfqQD

 

To answer your question directly, you can shut the server off, remove the disks, power the server back up, boot into the PERC BIOS, install the new drives, and use the management tools there to view the state of the replacement disk.

Moderator

 • 

8.4K Posts

April 30th, 2021 12:00

That is not correct. The process you listed could have corrupted the virtual disk, because you should never force a blank drive online.

 

You need to do the following:

 

1. Power down the server

 

2. Remove the Predicted Failure drive

 

3. Power the server on

 

4. Access the controller

 

5. Insert the replacement

 

      5b. If the drive doesn't start rebuilding automatically, then assign it as a hot spare.

 

6.  Boot to OS and allow the drive to rebuild

 

7. Let us know how it goes.
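
The steps above can be sketched as a small checklist with the one branch at step 5b made explicit. This is only an illustration of the ordering: the function name `cold_replacement_steps` and the `rebuild_started` flag are hypothetical, and nothing here talks to the controller.

```python
def cold_replacement_steps(rebuild_started: bool) -> list[str]:
    """Return the power-down replacement steps from the post, in order.

    rebuild_started is a hypothetical flag: what you observe in the
    controller utility after inserting the replacement drive.
    """
    steps = [
        "Power down the server",
        "Remove the predicted-failure drive",
        "Power the server on",
        "Access the controller",
        "Insert the replacement",
    ]
    if not rebuild_started:
        # Step 5b: do not force a blank drive online; make it a hot spare.
        steps.append("Assign the replacement as a hot spare")
    steps.append("Boot to the OS and allow the drive to rebuild")
    return steps


if __name__ == "__main__":
    for number, step in enumerate(cold_replacement_steps(rebuild_started=False), 1):
        print(f"{number}. {step}")
```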

 

 

Moderator

 • 

8.4K Posts

April 26th, 2021 12:00

GM_XISS,

 

If the drive is FAILED and OFFLINE, you would simply pull the drive and insert a replacement drive about 20 seconds later. If the drive is FAILING and is still ONLINE, you would need to force the drive offline first, then remove the drive and insert the replacement. Since you have ESXi and no OM, you could do this via RACADM commands, where the command would be


racadm storage forceoffline:Disk.Bay. #

# being the drive in question.
 Let me know if this helps.
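
The two cases described in this post can be written down as a small decision sketch. It only restates the advice given here; the function name `hot_swap_plan` and its parameters are hypothetical, and any drive state the post does not cover is rejected instead of guessed at.

```python
def hot_swap_plan(state: str, online: bool) -> list[str]:
    """Map the drive condition described in the post to the order of actions.

    state  -- "failed" or "failing" (predicted failure); hypothetical labels
    online -- whether the controller still shows the drive as online
    """
    if state == "failed" and not online:
        return [
            "Pull the drive",
            "Wait about 20 seconds",
            "Insert the replacement",
        ]
    if state == "failing" and online:
        return [
            "Force the drive offline",
            "Remove the drive",
            "Insert the replacement",
        ]
    # The post gives no guidance for other combinations.
    raise ValueError(f"no guidance for state={state!r}, online={online}")


if __name__ == "__main__":
    for step in hot_swap_plan("failing", online=True):
        print(step)
```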
 

20 Posts

April 27th, 2021 05:00

So are you saying login to my ESXi ssh and run this command?  Even when doing esxiscsi commands, it only saw one drive. 

It's in predictive failure state.  It's drive 2 of 4.  So technically it's drive 1 (0-3).  Does this command use actual numbers or does 0=1?
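
The numbering question can be made concrete with a tiny sketch. It assumes, as the 0-3 range above suggests, that bays are counted from zero; whether the controller and the command really count this way has to be confirmed on the system itself. The function name `bay_index` is hypothetical.

```python
def bay_index(position: int, total_bays: int = 4) -> int:
    """Convert a 1-based drive position to a 0-based bay index.

    Assumes zero-based bay numbering, which is not confirmed in this thread.
    """
    if not 1 <= position <= total_bays:
        raise ValueError(f"position must be between 1 and {total_bays}")
    return position - 1


if __name__ == "__main__":
    print(bay_index(2))  # the second of four drives
```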

I want to make sure to get this right.

Also, I thought we could hot swap, meaning, if I remove the drive, wait for at least 30 seconds (for it to be marked offline), I could then insert a new one.  Please let me know if this is the case.

 

Thanks so much for the reply!

 

Moderator

 • 

8.4K Posts

April 27th, 2021 06:00

You would need to install the iDRAC tools on the system. I am not certain of your ESXi version, so this is just an example.

 

With the command, the drive's designation is based on the drive's FQDD (Fully Qualified Device Descriptor).

 

You can do that with a normal failure, but a predicted-failure drive that is still online can have its bad blocks rebuilt onto the replacement. Forcing it offline avoids that.

 

The other way to offline the drive would be to shut down the server, then remove the drive once it is powered down, then power back up before you install the replacement.

 

 

20 Posts

April 27th, 2021 08:00

Alright, I have the ability to shut down the system.  So can you please outline the exact steps I'd take to replace the drive (given I don't have the tools installed).

Pardon my naivety but I am not able to install such tools at this time.

Thanks!

4 Operator

 • 

3K Posts

April 27th, 2021 08:00

@DELL-Chris H , the RACADM command mentioned might not be supported, as this is an R710 server with iDRAC6. That command is supported from iDRAC7 onward.

 

@GM_XISS  As Chris mentioned, you can remove the drive after shutting down the server.

20 Posts

April 27th, 2021 11:00

I'm going to power down, remove the drive, power on, then wait until it's fully booted and then insert the new replacement - is that correct?

 

 

4 Operator

 • 

1.7K Posts

April 27th, 2021 12:00

You should install OM(SA) for ESXi

Regards,
Joerg

20 Posts

April 27th, 2021 14:00

Thanks for all the advice, but having a RAID controller, it should be able to offline a drive and rebuild the array with a new one inserted. 

I'm not sure how to install OMSA or whatever.  This is an old R710 with ESXi 6.5.  There's very little support.  I don't even use OMSA, so that's something else I'd have to learn.  I need the easiest way to do this in the shortest amount of time.

4 Operator

 • 

3K Posts

April 27th, 2021 20:00

What RAID level do you have on the virtual disk in which you are trying to replace a drive? Is the disk already in a failed state? Is there a dedicated or global hot spare configured on the controller?

The method you described looks correct. After inserting the new drive, you can go to the PERC BIOS configuration utility (press Ctrl+R during boot when prompted) and check whether the new drive is detected and a rebuild is in progress for the virtual disk. If not, you can add the new drive as a dedicated hot spare for the virtual disk to initiate the rebuild.

Note: It is recommended to back up your data before carrying out any maintenance tasks.

20 Posts

April 30th, 2021 12:00

Hello all,

I tried finding the OMSA utility, but there is no build for the version of ESXi that I have installed.  I'm going to have to pull the drive.  So just to recap: pull the drive, power down, then power up.  Then log in to ESXi and issue the shutdown.  Then I go to the PERC configuration with CTL+R during POST, insert the new drive, and mark it online.  Does this sound about right?  I haven't done it this way in years, and on Supermicro systems you can just pull a failing drive, wait a minute, and plug the replacement drive in, all without powering off the system.

Could someone please verify this before I end up messing up the RAID array?

 

Thanks,

Gabriel

4 Operator

 • 

1.7K Posts

April 30th, 2021 14:00

Sounds absolutely right. The key is to mark the new disk as a hotspare.

@GM_XISS 

About the OMSA. Here is a link for OMSA 9.4 for ESXi 6.5u3 https://dl.dell.com/FOLDER05993176M/1/OM-SrvAdmin-Dell-Web-9.4.0-3776.VIB-ESX65i_A00.zip

The problem is that, since OMIVV doesn't support the old 11th generation any more, I am not 100% sure whether OMSA 9.4 supports the R710. But I can send the links to the previous versions if needed.

By the way, ESXi 6.0 is the last supported version on paper, but in reality 6.5 runs on the R710, and with an Intel Westmere CPU so does 6.7.

 

Regards,
Joerg

20 Posts

May 1st, 2021 07:00

No need for OMSA as we are not going to be using that.  I will manually power down the server, remove the drive, boot into CTL+R (controller config), then insert the new disk and see if it starts to rebuild.  Otherwise mark as hot spare. 

Fingers crossed!

 

I am wondering, though: I was advised by another tech that you could actually hot-swap the bad for the good.  How come no one suggested this (without using OMSA)?

20 Posts

May 2nd, 2021 09:00

OK, well, it all went horribly!  As soon as I powered off the machine, Drive 4 died also... so, two dead disks in a RAID 5 = bad news for the boss.

Things happen though, and lesson learned.  We're going to add the VMs to the monthly backups.

My final question though, does the Perc 6 support RAID 10?  We're going to go either RAID10 or RAID6 + HS.
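
To weigh the layouts mentioned here, a rough sketch of usable capacity and guaranteed fault tolerance per RAID level may help. It uses the standard textbook figures, not anything specific to the PERC 6. The function name `raid_summary`, the 1 TB drive size, and the minimum drive counts are assumptions for illustration, and a hot spare would be one extra drive outside the array members counted here.

```python
def raid_summary(level: str, drives: int, drive_tb: float) -> tuple[float, int]:
    """Return (usable capacity in TB, drive failures always survived).

    drives counts array members only, not hot spares.
    Minimum drive counts are the conventional ones (an assumption here).
    """
    if level == "RAID5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * drive_tb, 1
    if level == "RAID6":
        if drives < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        return (drives - 2) * drive_tb, 2
    if level == "RAID10":
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, 4 or more")
        # A second failure is survivable only if it hits a different mirror
        # pair, so just one failure is guaranteed.
        return (drives // 2) * drive_tb, 1
    raise ValueError(f"unknown RAID level: {level}")


if __name__ == "__main__":
    for level in ("RAID5", "RAID6", "RAID10"):
        usable, tolerated = raid_summary(level, drives=4, drive_tb=1.0)
        print(f"{level}: {usable} TB usable, survives {tolerated} failure(s)")
```

With four 1 TB members, RAID 5 gives 3 TB but is guaranteed to survive only one failure, which is the situation described above. RAID 6 and RAID 10 both give 2 TB, and RAID 6 survives any two failures.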
