August 14th, 2014 07:00

VNX2 Hot sparing rules

I'm curious if anyone has any direct experience with this because I've heard a lot of contradictory things.

When the VNX2 chooses a hot spare, from the MCx whitepaper, it goes through 4 steps seemingly in this order:

1. Drive type

2. Bus

3. Size

4. Enclosure

What is unclear to me (and this is where I've heard conflicting info) is whether #2 Bus actually takes priority over #3 size. 

So for example, say I have a 600GB hotspare on bus0, and a 900GB hotspare on bus1.  On bus1, I lose a 600GB drive.  Which hotspare will be invoked?  Since hotsparing is now permanent, I would hope that it is the 600GB hotspare on bus0, but the whitepaper seems to indicate otherwise. 

104 Posts

August 14th, 2014 10:00

Hi rzero,

I would say the 900GB drive will carry a penalty and the 600GB drive will be used instead. I also guess a lot of intelligent people have already discussed this and programmed it that way.

Regards

58 Posts

August 14th, 2014 18:00

The array will not select a drive of a different type as a hot spare, so I'm not sure how to take your response.

4K Posts

August 14th, 2014 18:00

When a Hot Spare is needed, the MCR code works through a list of priorities of which drive to select as the Hot Spare for the failed drive. For example if a 400GB SAS drive in DAE 0 on Bus 1 fails, the first rule processed will be to look for an unused SAS drive. If none is available it will take an NL-SAS or SSD drive. Next it will look on the same bus as the failed drive, for whichever drive type it has selected to be the replacement. The third rule it will process is to find a drive with the same size or larger, and finally it will try to find a drive in the same Disk Array Enclosure.

August 15th, 2014 06:00

Hi,

Roger is right; this is how MCx makes the system behave. One more point, though: a failed MLC drive can't be replaced by an SLC spare, but a failed SLC drive can actually be replaced by an MLC drive (even when they have the same capacity). The reason is that MLC can store more data than SLC, owing to its architecture and power utilization. If no MLC drive is available, the system will move on to the next step to choose from.

Thanks

Rakesh

58 Posts

August 15th, 2014 07:00

Based on what I'm reading in the MCx whitepaper (and my own personal experience) he is not correct and neither are you.  The MLC drives (Flash VP) are not allowed to hotspare for SLC (Flash) drives, or vice versa.  On page 42, "these two types cannot be a spare for each other."  This is because they are not allowed in the same RAID group, although they are allowed in the same storage pool (assuming they are not in the same RAID group). 

Additionally, swaps between top level types (NLSAS for a failed SAS drive) are also not allowed.  Page 43, which also contains a compatibility matrix, states, "VNX does not allow RAID groups to contain drives from different sets.  For example, SAS and NLSAS drives cannot be used within the same RAID group.  SAS Flash VP and SAS Flash are not allowed to mix either."

Still looking for a definitive answer about my question specifically regarding bus choice vs disk capacity.  I seriously cannot believe that the system would be implemented as the paper describes, such that a larger disk on the same bus would be chosen over a matching size disk on a different bus.

20.4K Posts

August 15th, 2014 07:00

Where did you see that VNX2 will allow sparing between different drive types? Look at page 9:

https://www.emc.com/collateral/software/white-papers/h10938-vnx-best-practices-wp.pdf

August 15th, 2014 08:00

You had better refer to article 170623; it might redirect you to that KB. I have a printout of it, but not a soft copy. Technically, what I believe is that an SLC drive can be replaced by an MLC drive if no SLC disk is available (this comes from outside EMC's documentation; I read it some time ago on one of the manufacturers' sites). But yes, EMC doesn't recommend it (at least according to the documents I have gone through, so apologies for being out of context).

Thanks

Rakesh

August 15th, 2014 09:00

If KBs are meant to deliver something more than white papers, then I'm correct. What I have read is that if you don't have an SLC drive for, let's say, a particular time span, you can actually force the system to choose an MLC drive to replace it (at the cost of a little performance, because reading from and writing to SLC and MLC differ in speed). Once an SLC drive is available, replace it again.

Thanks

Rakesh

August 15th, 2014 09:00

Roger W. wrote:

When a Hot Spare is needed, the MCR code works through a list of priorities of which drive to select as the Hot Spare for the failed drive. For example if a 400GB SAS drive in DAE 0 on Bus 1 fails, the first rule processed will be to look for an unused SAS drive. If none is available it will take an NL-SAS or SSD drive. Next it will look on the same bus as the failed drive, for whichever drive type it has selected to be the replacement. The third rule it will process is to find a drive with the same size or larger, and finally it will try to find a drive in the same Disk Array Enclosure.

But I am pretty sure that this statement is correct. Look at the FAQs on MCx (again, I only have the hard copy, so sorry for that): a lower-tier drive can be chosen, such as replacing a SAS drive with an NL-SAS drive (with an RPM difference, of course), but it will change the performance of the entire RAID group to that of the NL-SAS drive. This can be changed manually later on. That is the reason EMC doesn't recommend it.

Thanks

Rakesh

28 Posts

August 15th, 2014 10:00

Hi raid-zero,

When the VNX2 chooses a hot spare it goes through the 4 steps you mentioned from the MCx whitepaper:

  1. Drive Type
  2. Bus
  3. Size
  4. Enclosure

In your example (spare 600GB drive on Bus0, spare 900GB drive on Bus1, and failed 600GB drive on Bus1), the VNX2 would locate all available (unbound) SAS drives on Bus1 with a size of at least 600GB. The system would then pick one of these available drives of the same drive type (preferably in the same Enclosure, if one is available). In your example, the 900GB drive on Bus1 would be selected as the spare drive to be used and the rebuild process would be started.
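As a rough illustration only, the selection order described here (type, bus, size, enclosure) can be sketched in Python. The `Drive` class, the `pick_spare` function, and the drive names are hypothetical and are not part of any EMC tool or API. The sketch also assumes that matching type and sufficient capacity are hard requirements, and that the smallest adequate drive is preferred within a bus.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Drive:
    """Hypothetical model of a drive; not an EMC data structure."""
    name: str
    drive_type: str
    bus: int
    enclosure: int
    size_gb: int


def pick_spare(failed: Drive, spares: List[Drive]) -> Optional[Drive]:
    """Pick a spare in the order type, bus, size, enclosure (a sketch)."""
    # Assumed hard requirements: same drive type and enough capacity.
    candidates = [
        s for s in spares
        if s.drive_type == failed.drive_type and s.size_gb >= failed.size_gb
    ]
    if not candidates:
        return None
    # Lexicographic preference: same bus first, then the smallest
    # adequate size, then the same enclosure (False sorts before True).
    return min(
        candidates,
        key=lambda s: (s.bus != failed.bus,
                       s.size_gb,
                       s.enclosure != failed.enclosure),
    )


# The scenario from the original question.
failed = Drive("failed_600_bus1", "SAS", bus=1, enclosure=0, size_gb=600)
spares = [
    Drive("spare_600_bus0", "SAS", bus=0, enclosure=0, size_gb=600),
    Drive("spare_900_bus1", "SAS", bus=1, enclosure=0, size_gb=900),
]
chosen = pick_spare(failed, spares)
print(chosen.name)  # spare_900_bus1
```

Because bus is compared before size in this sketch, the 900GB drive on Bus1 wins even though an exact-size match exists on Bus0.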

The current MCx whitepaper provides this information on page 41, where a note is included that mentions "a replacement drive of the same Type and Bus of a failed drive will always be chosen before a replacement on a different bus." The latest Best Practices Guide provides hot sparing considerations on page 9, where it is recommended to "distribute unbound drives (spares) across available buses."

The current MCx whitepaper also includes Figure 35 on page 43, which is the VNX2 Drive Sparing Matrix that shows the drives that are compatible spares based on the failed drive. In this figure, you can see that SAS Flash or SATA Flash drives (using SLC technology) cannot spare for SAS Flash VP drives (using MLC technology); and SAS Flash VP drives cannot spare for SAS Flash or SATA Flash drives.
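A minimal sketch of how such a compatibility check could be expressed is shown below. The `SPARING_SET` table and the `can_spare` function are hypothetical and only encode what is stated in this thread; they are not a copy of Figure 35. In particular, treating SAS Flash and SATA Flash as interchangeable is an assumption that should be verified against the matrix itself.

```python
# Hypothetical grouping of drive types into sparing sets, based only on
# the description in this thread. Placing SAS Flash and SATA Flash in
# one set is an assumption, not something confirmed here.
SPARING_SET = {
    "SAS": "sas",
    "NL-SAS": "nl-sas",
    "SAS Flash": "flash-slc",
    "SATA Flash": "flash-slc",
    "SAS Flash VP": "flash-mlc",
}


def can_spare(failed_type: str, spare_type: str) -> bool:
    """Return True if both drive types fall in the same sparing set."""
    return SPARING_SET[failed_type] == SPARING_SET[spare_type]


print(can_spare("SAS Flash", "SAS Flash VP"))  # False
print(can_spare("SAS", "NL-SAS"))              # False
print(can_spare("SAS", "SAS"))                 # True
```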

Links to mentioned documents:

VNX MCx Multicore Everything whitepaper

VNX Unified Best Practices for Performance Applied Best Practices Guide

58 Posts

August 15th, 2014 11:00

Thank you Charles.  I have gotten some contradictory information from a source I consider authoritative so I wonder if you could tell me if you've actually seen this in action? 

Also (again this may not be something you can even venture, which is OK) do you know why it behaves this way?  Is there some benefit that I am not seeing?  I don't understand why in my example the 600GB drive would not be chosen, considering the permanent hotsparing that is in place now.  In other words, I don't understand why the rules wouldn't be this instead:

1. type

2. size

3. bus

4. enclosure
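To make the difference concrete, here is a rough Python sketch of this alternative order (type, size, bus, enclosure). The function name, dictionary keys, and drive names are hypothetical, and the sketch describes the ordering proposed in this post, not how the array actually behaves.

```python
def pick_spare_size_first(failed, spares):
    """Sketch of the proposed order: type, size, bus, enclosure."""
    # Assumed hard requirements: same drive type and enough capacity.
    candidates = [
        s for s in spares
        if s["type"] == failed["type"] and s["size_gb"] >= failed["size_gb"]
    ]
    if not candidates:
        return None
    # Size is compared before bus, so the closest capacity match wins
    # even if it sits on a different bus (False sorts before True).
    return min(
        candidates,
        key=lambda s: (s["size_gb"],
                       s["bus"] != failed["bus"],
                       s["enclosure"] != failed["enclosure"]),
    )


failed_drive = {"name": "failed_600_bus1", "type": "SAS",
                "bus": 1, "enclosure": 0, "size_gb": 600}
available = [
    {"name": "spare_600_bus0", "type": "SAS",
     "bus": 0, "enclosure": 0, "size_gb": 600},
    {"name": "spare_900_bus1", "type": "SAS",
     "bus": 1, "enclosure": 0, "size_gb": 900},
]
result = pick_spare_size_first(failed_drive, available)
print(result["name"])  # spare_600_bus0
```

Under this ordering the 600GB spare on bus0 is picked, which keeps the 900GB drive free for a future 900GB failure.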

28 Posts

August 15th, 2014 12:00

We were able to have a discussion with some engineers in this focus area and this is how they explained the hot sparing functionality to us.

20.4K Posts

August 15th, 2014 12:00

Charles,

But can you confirm that an NL-SAS drive will not spare for a SAS drive, that an NL-SAS drive will not spare for an EFD, and so on?

8.6K Posts

August 16th, 2014 13:00

Of course it won't.

261 Posts

August 16th, 2014 13:00

Correct, we have confirmed this. A specific type of drive will only be replaced by that specific type of drive: Flash for Flash, SAS for SAS, NL-SAS for NL-SAS. Flash drives also spare only to their specific type, either FAST Cache Optimized or FAST VP Optimized.

This is where the VNX2 Drive Sparing Matrix within the MCx Multicore Everything white paper is useful for determining what will spare for what. It was built from testing in the lab. The N/A color in a box means that sparing will not occur between those drives. This is the current Sparing Matrix and is subject to change as new code is released.

Sparing_Matrix.jpg
