Replacing non-failed drive in mirror

Hi,

I'm trying to work out the correct procedure to replace a drive in a mirror that has had a SMART failure prediction.  The drive is still running and the array is still showing as optimal but the disk is showing as "failure predicted."  

The server is an R720 with embedded PERC H710P Mini.  It's running Linux and we have Dell OMSA and MegaCli64 installed.  I need to do this without downtime.

We have 2x 900GB disks in a RAID-1 volume (virtual disk 0) and 7x 600GB in RAID-6 (virtual disk 1).  We are not using hot spares.  The problem disk is the second disk in the RAID-1 array: 0:1:1 (or [32:1] according to MegaCli).

I've found a walkthrough guide but I'm not sure if this is the correct procedure and I can't afford to "test" it or get it wrong:

Walkthrough: Change/replace a drive

Set the drive offline, if it is not already offline due to an error

MegaCli64 -PDOffline -PhysDrv [32:1] -a0

Mark the drive as missing

MegaCli64 -PDMarkMissing -PhysDrv [32:1] -a0

Prepare drive for removal

MegaCli64 -PDPrpRmv -PhysDrv [32:1] -a0

Change/replace the drive

Pull out the problem disk and insert the replacement disk.

Re-add the new drive to your RAID virtual drive and start the rebuilding

MegaCli64 -PdReplaceMissing -PhysDrv [32:1] -Array0 -row1 -a0
MegaCli64 -PDRbld -Start -PhysDrv [32:1] -a0

Some notes/questions about this:

1) Auto-rebuild is enabled on the controller.  Are any of the above steps unnecessary?

2) Any idea how long this might take?  I know it's highly dependent on the rebuild rate setting and the amount of activity on the server, but I'm just after some real-world examples on similar hardware.

3) With the -PDReplaceMissing command, I've based the values of -ArrayN and -rowN on the following output (cut to show the relevant disk):

# ./MegaCli64 -CfgDsply -aALL

==============================================================================
Adapter: 0
Product Name: PERC H710P Mini
==============================================================================
Number of DISK GROUPS: 2

DISK GROUP: 0
Number of Spans: 1
SPAN: 0
Span Reference: 0x00
Number of PDs: 2
Number of VDs: 1


Physical Disk: 1
Enclosure Device ID: 32
Slot Number: 1
Drive's position: DiskGroup: 0, Span: 0, Arm: 1
Enclosure position: 1
Device Id: 1

From what I've read, the -ArrayN number is the Span Reference from the above output, but examples I've seen show a '0' in the command rather than the 0x00 that I see in my output.
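On the 0x00 versus 0 question: 0x00 is just hexadecimal notation for 0, so both refer to the same value. As a minimal sketch (not an official Dell or LSI tool), the values could be pulled out of the -CfgDsply output with a small helper. It assumes, as the post does, that -ArrayN takes the Span Reference and -rowN takes the Arm value; the function name pd_position is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `MegaCli64 -CfgDsply -aALL` output on stdin and
# prints the -ArrayN / -rowN flags for one enclosure:slot pair.
# ASSUMPTION: -ArrayN is the Span Reference and -rowN is the Arm value.
pd_position() {
    awk -v enc="$1" -v slot="$2" '
        /^Span Reference:/      { ref = $3 }   # e.g. 0x00
        /^Enclosure Device ID:/ { e = $4 }     # e.g. 32
        /^Slot Number:/         { s = $3 }     # e.g. 1
        /position: DiskGroup:/ {
            # The last field on this line is the Arm number.
            if (e == enc && s == slot) print ref, $NF
        }
    ' | while read -r ref arm; do
        # Shell arithmetic converts the hex Span Reference to decimal.
        echo "-Array$(($ref)) -row${arm}"
    done
}

# Usage on the server (not run here):
#   ./MegaCli64 -CfgDsply -aALL | pd_position 32 1
```

The helper only prints flags and never calls MegaCli64 itself, so the result can be checked by eye before it goes into a -PdReplaceMissing command.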

Thanks for your assistance,

Mark

Moderator

RE: Replacing non-failed drive in mirror

Hi,

You don’t need to do all of those steps. Offlining the drive is the only step you need to do before swapping it:

MegaCli64 -PDOffline -PhysDrv [32:1] -a0

When you put the new drive in, it should rebuild automatically.

Thanks,
Josh Craig
Dell EMC Enterprise Support Services
Get support on Twitter @DellCaresPRO
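Since offlining the wrong member of a two-disk mirror would take the volume down, the single-step procedure above could be wrapped in a quick check that the other disk is healthy first. The following is only a sketch: the helper names fw_state and is_online are hypothetical, [32:0] is assumed to be the healthy mirror partner (the post only identifies [32:1]), and the "Firmware state:" line is assumed to appear in the -PDInfo output in the form shown in the comments.

```shell
#!/bin/sh
# Hypothetical helpers that parse `MegaCli64 -PDInfo` output read on stdin.
# ASSUMPTION: the output contains a line like "Firmware state: Online, Spun Up".

# Print the text after "Firmware state:" (first match only).
fw_state() {
    sed -n 's/^Firmware state:[[:space:]]*//p' | head -n 1
}

# Succeed only if the firmware state on stdin begins with "Online".
is_online() {
    case "$(fw_state)" in
        Online*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Usage on the server (not run here). The slot is quoted so the shell does not
# treat the square brackets as a glob pattern.
#   if MegaCli64 -PDInfo -PhysDrv '[32:0]' -a0 | is_online; then
#       MegaCli64 -PDOffline -PhysDrv '[32:1]' -a0
#   fi
# After the swap, rebuild progress can be checked with:
#   MegaCli64 -PDRbld -ShowProg -PhysDrv '[32:1]' -a0
# The controller's auto-rebuild setting can be displayed with:
#   MegaCli64 -AdpAutoRbld -Dsply -a0
```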