
February 28th, 2011 14:00

Hotspares across the bus and pools

I have following setup on my secondary CX4-240...

5 DAEs - 3 FC, two SATAN

One of the FC DAEs has 2Gbit 300GB drives; when we upgraded from a CX3-20 we put a DAE on the second bus with the 2Gbit drives in it.  Reset the first bus and voila, 4Gbit.

I have a hot spare in all of the FC shelves (1x 146GB 4Gbit, 1x 300GB 4Gbit, and 1x 300GB 2Gbit).

Now that I'm on FLARE 30, I'd like to consolidate my MV/A destinations in a pool using the 2Gbit drives, adding an additional 4+1 set that is on the other DAE on the other bus.  I went through some docs and didn't see much (maybe I haven't gotten to the right one yet), but my questions are...

1) Is it OK to have the hotspare for the 2Gbit DAE on the other bus?  Are there problems/gotchas with that?

2) Any problems with the pool being spread across the two buses?  I would think not, other than the fact that one 4+1 would have 4Gbit backend connections and the other would have 2Gbit.  But this is the DR site, and I already accept that if we fail over it won't be as zippy as the primary site.  There are already a helluva lot more negatives on the performance side at the DR site (i.e. slower servers), so a little slower disk performance probably won't be noticeable.  Right now the MV/A secondaries are spread across whatever they would "fit into" when this was done... some SATA, a 3+1 R5, and a 4+1 R5... I can't see it being worse with 20 FC disks in a pool.

From what I understand, if I want the data to be spread out evenly across all of the disks, I'll want to put all of the disks that I foresee needing into the pool prior to migrating any LUNs into it.  In the end, I'm thinking I'll have four 4+1 R5's with 300GB 10K disks in this pool.

Thanks

Dan

5.7K Posts

March 1st, 2011 05:00

SATAN ! Cool... I wonder who you got this from, hehehehe

392 Posts

March 1st, 2011 06:00

The hot spares section of EMC CLARiiON Best Practices for Performance and Availability describes the algorithm for hot spare selection.  The document is available on Powerlink.

>1) Is it OK to have the hotspare for the 2Gbit DAE on the other bus?  Are there problems/gotchas with that?

Yes.  It will be slower.

>2) Any problems with the pool being spread across the two buses?

You should read the EMC CLARiiON Virtual Provisioning whitepaper before going forward; it will explain a lot.  You give VP a number of drives and it allocates them; the Wizard is in control.  It so happens that if the drives you provide are physically located on separate buses, it does a round-robin allocation between the buses.  It's all in the paper.

>3) From what I understand, if I want the data to be spread out evenly across all of the disks, I'll want to put all of the disks that I foresee needing into the pool prior to migrating any LUNs into it.  In the end, I'm thinking I'll have four 4+1 R5's with 300GB 10K disks in this pool.

Yes, this is correct.  The widest distribution of data, which provides the optimal use of the feature, occurs when you do exactly as you've described.

190 Posts

March 1st, 2011 07:00

Hot spares seem nebulous as always - but it seems from reading this that if the HS is on the other bus, it'll look for one with enough capacity on the same bus, which in my case would be a 1TB HS.  That would certainly be slower than a 2Gbit drive on a different bus.  This is my DR site, though, so it might not be an issue: 1) the vast majority of the time it sits pretty much idle as a replication target, and 2) if a drive fails it'll get replaced quickly by EMC, so any performance degradation would be counted in hours, not days or weeks.

I read the VP whitepaper and I don't see anywhere in the document a description of the bus being a factor in data placement.  In fact, if you search for "bus" in the document it appears nowhere - well, not exactly true, the word "business" comes up several times ;).  This is where things are a bit fuzzy - if I have three of the R5 groups on the second bus and one on the first, are you saying it'll go back and forth between the buses and "fill" the group on the first bus with data before it finishes filling the ones on the second?  In other words, four of these R5 groups equate to roughly 4TB.  If I write 2TB into the pool, will the disks on the first bus fill to capacity while roughly 2TB sits free in the pool on the second bus?  If this is the case, here are my two possible scenarios for configuring this...

Scenario 1 -

Two 4+1's on the first bus and two 4+1's on the second bus.  Data is striped evenly across all of the components in the pool.  A 300GB 2Gbit drive hot spare will be on the second bus.  A drive failure on the first bus would invoke the SATA drive on the first bus instead of the 300GB HS on the second bus.  A drive failure on the second bus would invoke the 300GB drive on the same bus.  Yes, I know, the backend bus speeds are different, but with only one shelf on the second bus I wouldn't think that should make much of a difference.  Every time one of the 2Gbit connected drives takes a nosedive, EMC replaces it with a 4Gbit-capable drive... eventually they'll dwindle down far enough that I'll replace the rest with 4Gbit connected drives and reset the bus.  And yes, the DAE can do 4Gbit... already checked into that.  I also suppose that I could replace the 146GB HS on the first bus with a 300GB+ hot spare, though I am wandering away from the question at hand...

Scenario 2 -

One 4+1 on the first bus and three 4+1's on the second bus.  Data is NOT striped evenly across all of the components in the pool.  No hot spare on the second bus and a 300GB 4Gbit connected HS on the first bus.  A drive failure on the second bus would HAVE to use a HS from the first bus, as there is no HS available on that bus.

474 Posts

March 1st, 2011 08:00

The hot sparing rules are specific to each drive technology, though. An FC RAID group will not spare to a SATA disk or vice versa. FLARE will pick the spare on bus 2 in Scenario 1. You must have hot spares of sufficient size for each drive technology in the array: EFD spares for EFD drives, FC spares for FC drives, SATA spares for SATA drives, etc.

The sort of nebulous thing in the way FLARE handles sparing, as you may have noticed, is that the spare does not have to be the same size as the failed disk. If, for example, you have a RAID group of 300GB drives but only 100GB per disk has been bound in the RAID group, then a 146GB hot spare would be sufficient and might be chosen for sparing. FLARE rebuilds LUNs, not entire spindles, when sparing.

20.4K Posts

March 1st, 2011 08:00

Richard Anderson wrote:

The hot sparing rules are specific to each drive technology, though. An FC RAID group will not spare to a SATA disk or vice versa.

Are you sure about that?  I read otherwise in the white paper about hot spares.

474 Posts

March 1st, 2011 09:00

I stand corrected. EFDs still require an EFD spare, but FC and SATA can apparently spare for each other. The algorithm shown in the white paper indicates that in-use capacity is the first rule, though. It will pick the smallest spare that will fit the in-use data.

So if your data drives are 300GB and you have a 300GB and a 2TB spare in the system, FLARE will pick the 300GB spare first because it's the smallest size available that fits the data. Only if the spares are of equivalent size does it then look at disk location/bus.
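
To make that selection order concrete, here's a rough Python sketch of how I read it (my own illustration only, not actual FLARE code; the drive IDs, capacities, and function names are made up):

    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class Spare:
        drive_id: str     # made-up bus_enclosure_slot style label
        tech: str         # "FC", "SATA", or "EFD"
        capacity_gb: int  # raw capacity of the spare
        bus: int

    def pick_hot_spare(spares: List[Spare], failed_tech: str,
                       in_use_gb: int, failed_bus: int) -> Optional[Spare]:
        """Pick a spare for a failed drive with in_use_gb of bound data."""
        # EFDs only spare for EFDs; FC and SATA can spare for each other.
        if failed_tech == "EFD":
            candidates = [s for s in spares if s.tech == "EFD"]
        else:
            candidates = [s for s in spares if s.tech in ("FC", "SATA")]
        # The capacity check is against the bound (in-use) data, not the whole spindle.
        candidates = [s for s in candidates if s.capacity_gb >= in_use_gb]
        if not candidates:
            return None
        # Smallest sufficient spare wins; same-bus location only breaks ties.
        candidates.sort(key=lambda s: (s.capacity_gb, s.bus != failed_bus))
        return candidates[0]

    # 300GB FC drives with ~268GB bound, failure on bus 0: the 300GB spare on
    # bus 1 beats the 1TB SATA spare on bus 0, because size is checked first.
    spares = [Spare("1_0_14", "FC", 300, 1), Spare("0_2_14", "SATA", 1000, 0)]
    print(pick_hot_spare(spares, "FC", 268, 0).drive_id)   # -> 1_0_14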

392 Posts

March 1st, 2011 09:00

I'm sorry, but I don't understand your Scenario Post.

With Virtual Provisioning (VP) you should not get wrapped around the axle with positioning the data.  If you feel you need the deterministic performance that comes with exact data placement on the platform's storage, your application is not a candidate for VP.  Use FLARE LUNs in that case; they give you that capability.

With regard to pools, the drives you provision the pools with are grouped into RAID groups by the Wizard.  The Wizard distributes the pool's RAID groups across the available buses in round-robin fashion.  If you provisioned your pool with 20 drives using RAID 5 and you have four buses (say a CX4-480), you'd have one (4+1) per bus.  If you provisioned with 25 drives using RAID 5 on four buses, one of your buses would have two of the pool's (4+1)s provisioned; the remaining buses would have one RAID group each.  Bottom line, the Wizard spreads the load across the buses automatically.
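
If it helps to picture the round-robin part, here is a trivial sketch (mine, purely to illustrate the two examples above, not anything from FLARE):

    # Toy illustration of round-robin placement of a pool's private RAID groups.
    def place_raid_groups(drive_count: int, buses: int, group_size: int = 5) -> dict:
        """Return {bus: number of private (4+1) RAID groups} in a round-robin layout."""
        groups = drive_count // group_size        # e.g. 20 drives -> 4 groups of 4+1
        layout = {bus: 0 for bus in range(buses)}
        for g in range(groups):
            layout[g % buses] += 1                # each new group goes to the next bus
        return layout

    print(place_raid_groups(20, 4))   # {0: 1, 1: 1, 2: 1, 3: 1} -> one (4+1) per bus
    print(place_raid_groups(25, 4))   # {0: 2, 1: 1, 2: 1, 3: 1} -> one bus gets two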

I hope this helps.

190 Posts

March 1st, 2011 09:00

According to the  EMC CLARiiON Global Hot Spares and Proactive Hot Sparing white paper, that's not right.  FC and SATA II drives are interchangeable.  If you read how the algorithm works, in my scenario with a 300GB HS on the 2nd bus and a 1TB SATA II drive on the first bus, it will choose the SATA II drive if there is a failure on the first bus.

Dan

190 Posts

March 1st, 2011 12:00

Eh, I wouldn't say I'm getting "wrapped around the axle" about anything.  This is just the first time I'm putting together a pool and I want to do it correctly.

I think the confusion has more to do with some comments that were actually about the provisioning process, not how the data is written in the pool.  The Wizard configures the disks, but I got the (likely mistaken) impression that data is written round-robin across the different buses in the system.  That didn't make any sense to me, probably because it isn't the case.

392 Posts

March 2nd, 2011 06:00

Please note that the private RAID groups are just one part of how data is stored and distributed throughout the pool. The distribution of data throughout the pool involves a complex algorithm, which includes handling of many exceptional cases.  The details of this algorithm are beyond the scope of this forum.  However, the distribution of data within a pool is handled at three levels:

  • Private RAID groups (described above)
  • Private LUNs
  • 'Slices'

The overall effect is that data is spread as widely over the available pool storage as is practical. Each level incorporates performance optimizations.  For example, the private RAID groups are configured to 'bus balance'.
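
As a purely illustrative simplification (mine, not the actual algorithm, which as noted is out of scope here), you can picture the slice level dealing data out across the pool's private LUNs roughly like this:

    # Purely illustrative simplification of the slice level -- not the real FLARE
    # algorithm. Pool capacity is carved into fixed-size slices, and slices for a
    # pool LUN are drawn from the private LUNs (one per private RAID group) in
    # rotation, which is what spreads the data widely.
    SLICE_GB = 1   # treat the 1GB slice size as an assumption for this sketch

    def allocate_slices(pool_lun_gb: int, private_luns: list) -> dict:
        """Return how many slices of a new pool LUN land on each private LUN."""
        placement = {plun: 0 for plun in private_luns}
        for i in range(pool_lun_gb // SLICE_GB):
            placement[private_luns[i % len(private_luns)]] += 1
        return placement

    # A 100GB pool LUN over four private (4+1) groups -> roughly 25 slices on each.
    print(allocate_slices(100, ["priv_rg0", "priv_rg1", "priv_rg2", "priv_rg3"]))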

6 Posts

March 10th, 2011 11:00

Since we are talking about the EMC CLARiiON Global Hot Spares and Proactive Hot Sparing white paper, I wanted to know if there is an updated paper of the same title covering FLARE 30. I am also interested in knowing what the rules are regarding SSD drives. The previous paper was released in 2007.

I have a few more questions that are related. Let's say that a drive fails in a RAID 5 group, and it so happens that 90% of the data on the failed drive is sitting in either the write cache or the read cache. During the rebuild, can that data sitting in cache be copied back to the replacement drive, with parity calculations used to recover the 10% that is not there? Is that possible?

Regarding probational drives and rebuild logging, where is the logging done? I was assuming that it would be done in the cache or on the SPs. Are there any white papers that explain this specific issue in much greater depth?

Finally, can we have SSD drives as vault drives, and can we have SSD drives as global hot spares?

727 Posts

March 11th, 2011 15:00

Answers inline...

Since we are talking about the EMC CLARiiON Global Hot Spares and Proactive Hot Sparing white paper, I wanted to know if there is an updated paper of the same title covering FLARE 30.

The basic technology remains the same and is applicable to the storage systems today. The most recent whitepaper on hot spares was updated in late 2009.

I am also interested in knowing what the rules are regarding SSD drives. The previous paper was released in 2007.

For performance reasons, we do not allow other drive types to hot spare for flash drives. Only a flash drive configured as a hot spare can spare for a failing flash drive in the storage system. Also, a flash drive hot spare can spare for a failing flash drive only. Hot spare rules for other drive types remain the same -- Fibre Channel and SATA drives can hot spare for each other.
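
In rough pseudo-code terms (just my shorthand for the rules above, nothing official):

    # Shorthand for the sparing compatibility rules described above -- not EMC code.
    def can_spare(spare_type: str, failed_type: str) -> bool:
        if "EFD" in (spare_type, failed_type):
            return spare_type == "EFD" and failed_type == "EFD"   # flash spares flash only
        return {spare_type, failed_type} <= {"FC", "SATA"}        # FC/SATA interchangeable

    assert can_spare("FC", "SATA") and can_spare("SATA", "FC")
    assert can_spare("EFD", "EFD") and not can_spare("FC", "EFD")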

I have a few more questions that are related. Let's say that a drive fails in a RAID 5 group, and it so happens that 90% of the data on the failed drive is sitting in either the write cache or the read cache. During the rebuild, can that data sitting in cache be copied back to the replacement drive, with parity calculations used to recover the 10% that is not there? Is that possible?

Hot sparing works with the data on the failing disk. It will copy whatever is on the drive to the hot spare (in the case of proactive hot sparing).

Regarding probational drives and rebuild logging, where is the logging done? I was assuming that it would be done in the cache or on the SPs. Are there any white papers that explain this specific issue in much greater depth?

Yes, the logging data is stored in the system cache. Rebuild logging is discussed in the hot spares whitepaper on Powerlink. Here is the link:

http://powerlink.emc.com/km/live1/en_US/Offering_Technical/White_Paper/C1069_CLARiiON_Global_Hot_Spares_ldv.pdf?

Finally, can we have SSD drives as vault drives, and can we have SSD drives as global hot spares?

As mentioned above, flash drives can be configured as hot spares.
