Start a Conversation

March 30th, 2014 08:00

V-Max FAST-VP Capacity Planning Best Practices?

My company recently deployed FAST-VP on a V-Max 10K using 3 disk pools, EFD, FC and SATA.

After running FAST-VP for 8 months, it's apparent that the FAST controller overweights the higher tiers of disk (i.e., EFD and FC).

Our EFD pool is always above 95% utilized, which is good; our FC pool hovers around 70% utilized and SATA around 30%.

So my question is at what level of FC pool utilization should we be planning to add more disk? 75%, 80%, 85%?

It seems like the FAST controller is demoting from the FC tier to the SATA tier in order to avoid running out of space.

We don't want to get into a situation where we're demoting data because we HAVE to; we only want to demote to SATA when the data is truly cold.

Thanks.

213 Posts

March 30th, 2014 15:00

tzvb32, let me first give you a brief overview of how the FAST controller works. The FAST controller makes its promotion/demotion decisions at the sub-LUN (extent) level, based on promotion and demotion thresholds. These thresholds are calculated from the collected performance metrics and the available capacity in each tier. The goal is to maximize the utilization of the highest-performing tier while demoting inactive or less frequently accessed extents to a lower-performing tier (like SATA).
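If it helps to picture the threshold idea, here is a deliberately simplified Python sketch. It is not EMC's actual FAST VP algorithm; the I/O rates and thresholds are invented purely for illustration of "busy extents go up, idle extents go down".

```python
# Toy illustration of threshold-based extent placement.
# NOT the real FAST VP algorithm -- the scores and thresholds are made up.

EXTENTS = [
    {"id": "ext-001", "io_rate": 950.0},   # hot extent
    {"id": "ext-002", "io_rate": 120.0},   # warm extent
    {"id": "ext-003", "io_rate": 2.0},     # cold extent
]

# Hypothetical promotion/demotion thresholds; in the real product these are
# derived from collected performance metrics and free capacity in each tier.
PROMOTE_TO_EFD = 500.0   # IOs/sec above which an extent "deserves" EFD
DEMOTE_TO_SATA = 10.0    # IOs/sec below which an extent can sit on SATA

def place(extent):
    """Return the tier this toy model would choose for one extent."""
    if extent["io_rate"] >= PROMOTE_TO_EFD:
        return "EFD"
    if extent["io_rate"] <= DEMOTE_TO_SATA:
        return "SATA"
    return "FC"

for ext in EXTENTS:
    print(ext["id"], "->", place(ext))
```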

For your case, I wouldn't base the decision to add new disks on FAST movements. Instead, look at the amount of allocated tracks for your TDEVs and the oversubscription ratio of your pools; those are the factors that should drive the decision to add new disks.
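To make the allocated-capacity and oversubscription check concrete, a quick sketch (the pool names and capacities below are placeholders, not numbers from the original poster's array):

```python
# Hypothetical pool figures in GB -- substitute values from your own
# pool utilisation report.
pools = {
    "FC_Pool":   {"enabled_gb": 100_000, "allocated_gb": 70_000, "subscribed_gb": 180_000},
    "SATA_Pool": {"enabled_gb": 200_000, "allocated_gb": 60_000, "subscribed_gb": 150_000},
}

for name, p in pools.items():
    allocated_pct = 100.0 * p["allocated_gb"] / p["enabled_gb"]
    oversub_ratio = p["subscribed_gb"] / p["enabled_gb"]
    print(f"{name}: {allocated_pct:.1f}% allocated, "
          f"oversubscription ratio {oversub_ratio:.2f}:1")
```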

There is a feature I highly recommend you enable: VP Allocation by FAST Policy. With this feature enabled, you will never have I/Os halted due to a lack of space in the pool your TDEVs are bound to. It allows new allocations to come from any of the thin pools included in the FAST VP policy that the thin device is associated with.

Once the feature is enabled, FAST VP attempts to allocate new writes in the most appropriate tier first, based on available performance metrics. If no performance metrics are available, it attempts to allocate to the pool the device is bound to. If the pool initially chosen is full, FAST VP then looks to the other pools contained within the policy and allocates from there. As long as there is space available in at least one pool within the policy, all new extent allocations will succeed.
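A rough sketch of that allocation order, just to make the fallback sequence explicit (the pool names and free-space figures are invented, and the real decision also weighs performance metrics):

```python
# Simplified view of "VP Allocation by FAST Policy": try the preferred tier,
# then the bound pool, then any other pool in the policy with free space.
# Pool names and capacities are made up for this example.

pools_free_gb = {"EFD_Pool": 0, "FC_Pool": 0, "SATA_Pool": 5_000}

def choose_pool(preferred, bound, policy_pools, request_gb):
    # 1. Most appropriate tier based on performance metrics (if known).
    if preferred and pools_free_gb.get(preferred, 0) >= request_gb:
        return preferred
    # 2. Pool the thin device is bound to.
    if pools_free_gb.get(bound, 0) >= request_gb:
        return bound
    # 3. Any other pool in the FAST VP policy with space left.
    for pool in policy_pools:
        if pools_free_gb.get(pool, 0) >= request_gb:
            return pool
    return None  # allocation fails only if every pool in the policy is full

print(choose_pool("EFD_Pool", "FC_Pool",
                  ["EFD_Pool", "FC_Pool", "SATA_Pool"], request_gb=10))
# -> SATA_Pool (EFD and FC are full in this made-up example)
```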

You can also control FAST movements by increasing the PRC (Pool Reserved Capacity) at the pool level.

I think FAST VP is working as expected in your case. I have two questions, though: what is your FAST VP policy configuration, and which pool are your TDEVs bound to? Is it only the FC pool, or a mix of FC and SATA pools?

Hope it helps.

Mohammed Salem @yankoora

859 Posts

March 31st, 2014 03:00

We have set the pool alert at 80% for FC and SATA; we don't care much about EFD because we don't bind any TDEVs to the EFD pool (it's there only for FAST VP purposes). Management is usually slow when it comes to approving the PO to buy more disks: it typically takes a month in our company to approve a PO and around 15-20 days for EMC to do the shipping and installation. So, just to keep a bit of cushion, we have set the alert for FC and SATA at 80%.

Also, we do a monthly check of pool utilization and look at the usage trend. This gives us a tentative date by which we will hit 100% utilization. We also track each TDEV's utilization monthly; if there is a large difference between allocated and written tracks, we reclaim that space.
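A minimal sketch of that monthly trending and reclaim check, assuming invented sample data (a straight-line fit to past utilisation points, plus a simple allocated-vs-written ratio):

```python
from datetime import date, timedelta

# Invented monthly FC-pool utilisation samples: (date, percent used).
samples = [
    (date(2014, 1, 1), 62.0),
    (date(2014, 2, 1), 65.5),
    (date(2014, 3, 1), 69.0),
]

# Simple linear trend: average growth in percentage points per day.
days = (samples[-1][0] - samples[0][0]).days
growth_per_day = (samples[-1][1] - samples[0][1]) / days

remaining_pct = 100.0 - samples[-1][1]
days_to_full = remaining_pct / growth_per_day
print("Estimated 100% full on:",
      samples[-1][0] + timedelta(days=round(days_to_full)))

# Reclaim check: flag TDEVs whose allocated tracks far exceed written tracks.
tdevs = {"01AB": (500_000, 120_000), "01AC": (300_000, 290_000)}  # allocated, written
for dev, (allocated, written) in tdevs.items():
    if written / allocated < 0.5:          # arbitrary example threshold
        print(f"TDEV {dev}: candidate for space reclamation")
```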

regards,

Saurabh

2.1K Posts

March 31st, 2014 11:00

I'm curious why your higher-performance policies exclude the option of 100% SATA. While I know it is unlikely that anything assigned to the policies with higher "performance potential" would ever use 100%, there is no harm in allowing anything that doesn't need the performance to drop down.

We have two different types of policies in play in our environment. The "traditional" policies are the ones that include all three tiers in each. They all start with 100% SATA as a base and increase the FC and EFD tier allowances as they go up. I do have a few top-tier applications that consume the full 15% EFD we allow them but barely touch the FC tier, even though they could go to 100% FC if they needed it.

Just for completeness (since I mentioned the other policy type), we also have a set of policies we are looking to implement that virtually isolate specific tiers: one that allows 100% SATA and 5% FC, one that allows 100% FC and 1% EFD, and one that allows only 100% EFD.

19 Posts

March 31st, 2014 11:00

VP Policy Configuration

T1_FAST: 35% EFD, 100% FC, 25% SATA

T2_FAST: 15% EFD, 75% FC, 75% SATA

T3_FAST: 50% FC, 100% SATA

We always bind all TDEVs to the FC pool; LUNs are never bound directly to the EFD or SATA pools.

PRC is set to 1% on EFD, 10% on FC and 1% on SATA.

Allocation by FAST policy is enabled.
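For anyone following along, those tier percentages translate into per-storage-group capacity caps: each percentage is the maximum share of the storage group's configured capacity allowed on that tier. A quick sketch (the 2 TB storage group size is just an example):

```python
# Capacity caps implied by the FAST VP policies above for one storage group.
# The 2 TB (2048 GB) storage group size is invented for illustration.

policies = {
    "T1_FAST": {"EFD": 35, "FC": 100, "SATA": 25},
    "T2_FAST": {"EFD": 15, "FC": 75,  "SATA": 75},
    "T3_FAST": {"EFD": 0,  "FC": 50,  "SATA": 100},
}

sg_size_gb = 2048  # example storage group size

for name, caps in policies.items():
    limits = {tier: sg_size_gb * pct / 100 for tier, pct in caps.items()}
    print(name, {tier: f"{gb:.0f} GB max" for tier, gb in limits.items()})
```

Note that the percentages in each policy have to add up to at least 100%, so every extent in the storage group always has somewhere it is allowed to live.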

19 Posts

March 31st, 2014 12:00

Allen,

That's a good question, we do this primarily for our Oracle environments that run monthly batch cycles.

Limiting the amount of SATA in the T1_FAST policy insulates us from performance issues associated with data that has been demoted to SATA and then needs to be promoted quickly for a reporting or batch cycle. Having said that, we're considering bumping up the amount of SATA exposure in the T1 policy to 30% or 35% based on what we're seeing in our FAST-VP compliance reports.

2.1K Posts

March 31st, 2014 12:00

I can understand that, but then I wonder why you are able to allow any of the data to drop to the SATA tier. Wouldn't allowing even 25% of it to drop potentially cause the same issue (since you don't directly control which data moves)?

I hope I'm not coming off as argumentative here... I truly want to understand since we have these types of discussions regularly around here and every bit of data/knowledge we can bring to the table helps in the end :-)

19 Posts

March 31st, 2014 13:00

Allen,

We started out with a much lower percentage of SATA in our T1_FAST policy and have been testing and slowly moving the allowed SATA percentage up, based on SG metrics and feedback from the application owners. So far we haven't seen an issue with allowing 25% in SATA, and we'll continue to increase the SATA exposure until we start seeing an impact on our month-end processing, then back it off. Unisphere for V-Max with SPA has been a great tool to help us make good decisions with respect to our T1_FAST policy.

213 Posts

April 1st, 2014 01:00

Guys, generally speaking, I would not recommend making more than one policy for the same tiers. I would make it 100/100/100 and let FAST VP take care of everything else related to data movements.

You can check Sean Cummins' own blog about FAST VP best practices. It's one of the greatest posts about FAST VP:

http://blog.scummins.com/?p=87

Hope it helps

Mohammed Salem  @yankoora

2.1K Posts

April 1st, 2014 06:00

Mohammed, can you clarify what you mean by not "making more than one policy for the same tiers"? If you only make a single 100/100/100 policy, how do you ensure that applications needing more performance can get it, while other apps that shouldn't consume your more expensive tiers can't "waste your money"?

I agree that if money were no issue and you could put unlimited amounts of EFD and FC in your array, then a 100/100/100 policy would be great. Our budgets are a bit more constrained, though.

Maybe I'm just misunderstanding your point...?

226 Posts

April 1st, 2014 07:00

Allen,

I like to think of it in the context of the old "80/20 rule" --

In cases where you have devices associated to a 100/100/100 policy and bound to the middle tier, the setup is simple, virtually self-managing, and it works well -- it takes about 20% of your personal effort to achieve 80% of the positive results.

In cases where you micro-tune things by setting up many different fast policies for workloads that are less critical, there's a lot of effort involved, and that effort generally doesn't squeeze all that much more out of the box... and unless you get everything 100% correct, and stay on top of it as workloads change over time, it can sometimes have a negative effect on your more important workloads. So while there's nothing at all "wrong" about having multiple policies, IMHO this approach requires 80% of your effort and yields about 20% of the positive results.

To be more specific -- when you restrict certain workloads to the lower tiers, you run the risk of overutilizing the shared components associated with those lower tiers (DAs and drives). Overutilization of shared components can have a negative effect on the box as a whole, including the workloads that are more important to you. You can offset this by imposing Host IO Limits on those workloads that are restricted to the lower tiers -- but that is another increase in the amount of administrative effort required on your part... So unless you've heavily automated things, I often question whether all that extra effort is really worth it...

Related to the budget thing -- using a 100/100/100 policy (or even, say, a 100/100/75 policy for reasons that tzvb23 described) doesn't mean that you must have enough EFD capacity so that all devices associated to that policy can "fit" within the EFD tier. You'll have more "demand" for EFD capacity than there is usable capacity in the EFD tier -- but this is a normal situation, and it allows FASTVP to make independent decisions about which portions of data belong in which tier at any given time.
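To put a (made-up) number on that "demand vs. usable EFD" point, a trivial sketch:

```python
# Invented numbers: total capacity of all storage groups associated to a
# 100/100/100 policy versus the usable capacity of the EFD tier.
associated_sg_capacity_tb = 120.0   # everything *could* live on EFD per the policy
efd_tier_usable_tb = 8.0            # but only this much EFD actually exists

demand_ratio = associated_sg_capacity_tb / efd_tier_usable_tb
print(f"EFD demand is {demand_ratio:.0f}x the usable EFD capacity -- "
      "normal for FAST VP; the busiest extents win the EFD slots.")
```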

Just my two cents..

Thanks,

- Sean

2.1K Posts

April 2nd, 2014 13:00

Thanks Sean. I went back and reread the detail in your referenced blog post and I can see your point, but I still have concerns about that approach in our environment. For us the 80/20 rule is more like "Get 80% of the work done for 20% of the budget that you actually need."

We aren't really artificially forcing anything to the lowest tier. We do limit the amount of data that can make it up to the EFD tier, but we rarely have anything run up against the limits of the FC tier. I do like the option of letting anything that wants to drop down to the SATA tier, though. We haven't seen any overload on the back end from this approach, and it does allow us to offer different price points to the application owners (once we complete the chargeback model setup that needs to be done in the financial area). Application owners can control their costs by defining the level of potential service they want to receive, instead of paying for whatever the application manages to consume.

467 Posts

April 2nd, 2014 18:00

M.Salem wrote:

Guys, generally speaking I will not recommend making more than one policy for same tiers. I will make it 100/100/100.FAST VP will take care of everything else related to data movements.

That doesn't work in my environment. What happens is we have large amounts of data that get pushed down to SATA because they aren't doing anything. This lasts for months and months. Then all of a sudden (end of year / end of quarter) they decide they need all of that data and they need it right now. They read, edit, overwrite, and otherwise use the data to the tune of hundreds of thousands of random, non-sequential IOs. When the data has been moved down to SATA, they suffer huge performance problems due to all the random I/O. If we go with a 100/100/1 policy for this data, it's a rather large performance boost for them.

82 Posts

May 12th, 2014 00:00

Hello all,

All the questions and replies have made this post wonderful. Thanks, all.

Here is my question,

When deciding how to configure FAST policies, what factors do we need to take into account? What is really the best way to start? Is there a standard approach to designing policies for an environment?

~dino

1.2K Posts

May 12th, 2014 01:00

The recommendations are:

  • Policy: 100% across all technologies (100%, 100%, 100%) for a self-managing, black-box solution.
  • Operation mode: automatic.
  • Workload analysis period: 7 days (168 hours), up to 4 weeks.
  • Initial analysis period: at least 24 hours; best practice is the default of 168 hours.
  • Performance time windows: 24 hours.
  • Pool reserved capacity (PRC): base it on the lowest allocation warning level for that thin pool; use 10-15% for pools with bound devices and 1% for pools with no bound thin devices.
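A sketch of those recommendations captured as a settings summary (the keys are informal labels for this thread, not actual SYMCLI or Unisphere parameter names):

```python
# Informal summary of the recommended FAST VP settings listed above.
# Keys are descriptive labels, not real SYMCLI/Unisphere parameter names.
fast_vp_recommendations = {
    "policy_tier_limits": {"EFD": 100, "FC": 100, "SATA": 100},  # self-managing
    "operation_mode": "automatic",
    "workload_analysis_period_hours": 168,   # 7 days; up to 4 weeks is acceptable
    "initial_analysis_period_hours": 168,    # at least 24; default 168 is best practice
    "performance_time_window_hours": 24,
    "pool_reserved_capacity_pct": {
        "pools_with_bound_tdevs": (10, 15),  # recommended range
        "pools_without_bound_tdevs": 1,
    },
}

for setting, value in fast_vp_recommendations.items():
    print(f"{setting}: {value}")
```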

82 Posts

May 12th, 2014 08:00


Thanks Zhang,

Your recommendations and @Sean Cummins' post give a clear picture.

Still, my question is: if we are going to customize FAST for a 3-tier environment,

1. How do we determine the upper capacity usage limit for each policy?

2. Should we associate all devices in a storage group (whether write-intensive or read-intensive) with FAST and leave the decisions to FAST for easier management, or is it better practice not to associate write-intensive devices with FAST VP?

~Dino
