VNX: How to replace or proactively replace a failed or failing vault drive on a VNX2 array. (Dell EMC Correctable)

Summary: How to replace or proactively replace a failed or failing vault drive on a VNX2 array.


Symptoms

Issue:
A vault drive on a VNX MCx array has failed and has been spared to another location. If the customer built user LUNs across the vault drives and wants that data moved back to the vault drive, how is the data moved back?

With permanent sparing there is no automatic equalize operation. When a regular drive fails and is replaced, data is not automatically equalized from the permanent spare back to the replacement drive. The spare that took over for the failed drive is now a permanent member of the RAID Group.

When a failed vault drive is replaced, the new drive is formatted and its private space is rebuilt from the other vault drives. However, if the customer created a RAID Group and LUNs on the vault drives, the LUN data is not copied back; it remains on the drive to which it was rebuilt. To manually copy the data back to its original location, use the naviseccli copytodisk command.

Info:
Vault drives on next-generation VNX (VNX2) arrays are the first four drives in the array: 0_0, 0_1, 0_2, and 0_3 (enclosure 0 on bus 0, slots 0 through 3).
Each vault drive needs approximately 300 GB of private system space to hold the MCx code and other array-related data.

Although placing customer LUNs on the vault drives is not recommended, some customers do.
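
When scripting around these procedures, it can help to check whether a drive ID refers to a vault slot. The following is a minimal sketch, not a Dell EMC tool. It assumes the three-part bus_enclosure_slot notation used in the copytodisk example in this article (for example 0_0_2) and assumes that the vault drives sit in enclosure 0 on bus 0. The names DiskId, parse_disk_id, and is_vault_drive are hypothetical.

```python
from typing import NamedTuple


class DiskId(NamedTuple):
    """A drive location in bus_enclosure_slot form (assumed notation)."""
    bus: int
    enclosure: int
    slot: int


def parse_disk_id(text: str) -> DiskId:
    """Parse a string such as '0_0_2' into its three numeric parts."""
    parts = text.strip().split("_")
    if len(parts) != 3 or not all(p.isdecimal() for p in parts):
        raise ValueError(f"expected bus_enclosure_slot, got {text!r}")
    return DiskId(*(int(p) for p in parts))


def is_vault_drive(text: str) -> bool:
    """True for slots 0 through 3 of bus 0, enclosure 0 (assumption)."""
    disk = parse_disk_id(text)
    return disk.bus == 0 and disk.enclosure == 0 and disk.slot in range(4)


print(is_vault_drive("0_0_2"))  # a vault slot
print(is_vault_drive("0_1_5"))  # a slot in another enclosure
```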

Cause

Next-generation VNX does not automatically equalize customer data back to a replaced vault drive once that data has been rebuilt to a permanent spare. When a vault drive is replaced, the new drive is formatted and its private space is rebuilt from the other vault drives, but the customer's LUN data is not copied back. To manually copy the data back to its original location, use the naviseccli copytodisk command.

Resolution

Scenario 1: A vault drive has faulted and has already been permanently spared to another drive on the array. To equalize customer data back to its original vault location, do the following:
The naviseccli copytodisk command copies data from a configured drive (one that is part of a RAID Group) to an unbound drive. It can copy data from any bound disk to any unbound disk, not only from a permanent spare to a replacement drive.
In this example, data is copied from drive 0_1_5 to drive 0_0_2:

naviseccli -h <ipaddress>  copytodisk 0_1_5 0_0_2
WARNING: The data from Source  disk 0_1_5 will be copied to Destination disk 0_0_2. This process cannot be aborted and may take a long time to complete.

would you like to continue the copy? (y/n) y

The copy-back operation is then initiated.
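
Because the copy cannot be aborted once confirmed, it may be worth validating the source and destination before the command is typed. The sketch below only builds and prints the command line shown above; it does not run naviseccli or answer the confirmation prompt. The function name build_copytodisk_command is hypothetical, and 192.0.2.10 is a documentation placeholder for the storage processor address.

```python
import shlex
from typing import List


def build_copytodisk_command(sp_address: str, source: str, destination: str) -> List[str]:
    """Return the argument list for the copytodisk call shown in this article."""
    for disk in (source, destination):
        parts = disk.split("_")
        # Assumed notation: bus_enclosure_slot, for example 0_1_5.
        if len(parts) != 3 or not all(p.isdecimal() for p in parts):
            raise ValueError(f"expected bus_enclosure_slot, got {disk!r}")
    if source == destination:
        raise ValueError("source and destination must be different drives")
    return ["naviseccli", "-h", sp_address, "copytodisk", source, destination]


# Placeholder address; substitute the real SP address before use.
argv = build_copytodisk_command("192.0.2.10", "0_1_5", "0_0_2")
print(shlex.join(argv))  # naviseccli -h 192.0.2.10 copytodisk 0_1_5 0_0_2
```

Reviewing the printed command before pasting it into a terminal gives a second check that the bound (source) drive comes first and the unbound (destination) drive second.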

Scenario 2: Messages indicate that the drive is failing. How to proactively replace a failing vault drive in slot 0, 1, 2, or 3 on bus 0:

  1. Remove all unbound drives from the array. (Any unbound drive can become a permanent spare (hot spare) on a VNX2 array.)
  2. Check the drives in slots 0, 1, 2, and 3 in Unisphere or with naviseccli, and make sure that there is no double fault on this set of drives before continuing.
  3. Remove the faulted or suspect drive from its slot and wait at least 5 minutes.** The drive must stay out for at least 5 minutes so that a full rebuild of the customer LUNs on the vault drive occurs.
  4. Once the 5-minute timer has elapsed, insert the new drive in the slot.
  5. The new drive comes online, and the rebuild of its user LUNs (if user LUNs were configured on the vault drives) starts from the other vault drives.
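
The 5-minute wait in the steps above is easy to misjudge during a replacement. The following sketch shows one way to compute the earliest time at which the new drive may be inserted, given the time the old drive was pulled. The helper names are hypothetical, and the comparison is strict (more than 5 minutes) as a conservative reading of the note that follows.

```python
from datetime import datetime, timedelta

# Minimum time the old drive must stay out of the slot, per this article.
MIN_REMOVAL = timedelta(minutes=5)


def earliest_insert_time(removed_at: datetime) -> datetime:
    """Return the moment the 5-minute timer expires."""
    return removed_at + MIN_REMOVAL


def safe_to_insert(removed_at: datetime, now: datetime) -> bool:
    """True only once strictly more than 5 minutes have passed."""
    return now - removed_at > MIN_REMOVAL


removed = datetime(2025, 6, 17, 10, 0, 0)  # example timestamp only
print(earliest_insert_time(removed))
print(safe_to_insert(removed, datetime(2025, 6, 17, 10, 3, 0)))
print(safe_to_insert(removed, datetime(2025, 6, 17, 10, 6, 0)))
```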


Note ** It is very important to leave the drive removed for at least 5 minutes.
FLARE allows a drive in a redundant RAID Group to be offline for up to 5 minutes while write I/O to that drive is logged. The I/Os themselves are not logged; a bitmap keeps track of which address ranges on the drive are dirty. If the same drive becomes accessible again within the 5-minute limit, the rebuild log is used to do a quick rebuild of the drive, which is referred to as a differential rebuild. Once the drive has been removed for more than 5 minutes, a full rebuild of the LUNs from the other vault drives occurs. If no user LUNs are configured on the vault drives, there are no user LUNs to rebuild.
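
The difference between a differential and a full rebuild can be illustrated with a small model of a dirty-range bitmap. This is an illustrative sketch only and does not reflect the array's internal implementation: the class name RebuildLog, the 1 MiB tracking granularity, and the method names are all assumptions made for the example.

```python
from typing import List

CHUNK_SIZE = 1024 * 1024      # hypothetical tracking granularity in bytes
LOG_WINDOW_SECONDS = 5 * 60   # the 5-minute limit described above


class RebuildLog:
    """Tracks which chunks were written while a drive was offline."""

    def __init__(self, drive_bytes: int) -> None:
        chunks = (drive_bytes + CHUNK_SIZE - 1) // CHUNK_SIZE
        self.dirty = [False] * chunks

    def record_write(self, offset: int, length: int) -> None:
        """Mark every chunk touched by a write as dirty; no data is stored."""
        if length <= 0:
            return
        first = offset // CHUNK_SIZE
        last = (offset + length - 1) // CHUNK_SIZE
        for chunk in range(first, last + 1):
            self.dirty[chunk] = True

    def chunks_to_rebuild(self, seconds_offline: float) -> List[int]:
        """Dirty chunks only within the window, every chunk after it."""
        if seconds_offline > LOG_WINDOW_SECONDS:
            return list(range(len(self.dirty)))            # full rebuild
        return [i for i, d in enumerate(self.dirty) if d]  # differential


log = RebuildLog(8 * CHUNK_SIZE)
log.record_write(CHUNK_SIZE + 10, 2 * CHUNK_SIZE)
print(log.chunks_to_rebuild(120))  # [1, 2, 3]
print(log.chunks_to_rebuild(600))  # [0, 1, 2, 3, 4, 5, 6, 7]
```

In this model a drive that returns after two minutes needs only the three touched chunks rebuilt, while a drive that has been out for ten minutes is rebuilt in full, which is the behavior the proactive replacement procedure relies on.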

Affected Products

VNX2 Series

Products

VNX5200

Article Properties

Article Number: 000029126
Article Type: Solution
Last Modified: 17 Jun 2025
Version: 4