Firstly, apologies if this is not a suitable subject to raise here - I have reasons (described below) for trying to clarify things before approaching a sales team!
I'm in the process of trying to specify a storage system (which we will buy two instances of) to support a pair of basically identical virtual platforms (also likely to be near-identical in workload). My outline is all about *one* of these systems. In our case we already have a VNX5500 for each platform which we bought to support the pilot system (10TB usable 15K, 64TB usable 7.2K), and "sticking with what we know" a VNX2 is a likely successor.
Perhaps in contrast to some use cases, our future workload for the platform is fairly well scoped. It will basically just be more of the same VM types as in the pilot. I have been able to dig through vSphere performance graphs to estimate the IOPS, bandwidth and capacity usage of different VM types (for many of these types the successor system will just have 8x the count of VMs compared to the current one). In the process of gathering this data, I've found some interesting things (today I discovered the Mitrend tool which has been valuable in characterising our workload):
1. Our pilot workload is currently typically 70-80% writes, with a 95th-percentile IOPS of around 1,700.
It appears that we have several DB servers hosting what are actually quite small MS SQL databases; these fit readily in server memory but have quite high change rates (and thus frequent transaction logging). The capacity of this very write-heavy data is probably only a few hundred GB.
We also have some continuous file writing from several of our applications. The load seems very consistent, though an AV component install across our server estate pushed the IOPS to around 25K for a few hours (80% utilisation on one SP, 32% on the other, so there was still headroom...)
2. The total final capacity will need to be between 180 and 350TB
Some of the continuously written material (estimated at around 150TB) will be written once, at a rate of around 400GB/day, retained for slightly over 12 months, and possibly never read again. This block of data may be removed partly or wholly from my scope, hence the range of capacity values. This 400GB/day will see very few reads.
3. The estimated bandwidth is likely to be around 500-800MByte/s
4. The estimated throughput is 3K IOPS read, 5K write (based on a calculation of the slightly changed workload mix of the final system).
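As a rough sanity check on the figures above, here is a back-of-envelope sketch; the per-spindle IOPS figure and the RAID6 write penalty are rule-of-thumb assumptions on my part, not vendor numbers:

```python
# Back-of-envelope check on the figures above. The per-spindle IOPS
# figure and the RAID6 write penalty are rule-of-thumb assumptions.

read_iops = 3_000            # estimated host read IOPS (item 4)
write_iops = 5_000           # estimated host write IOPS (item 4)
raid6_write_penalty = 6      # ~6 backend I/Os per host write in RAID6

backend_iops = read_iops + write_iops * raid6_write_penalty
print(f"Backend IOPS if everything lands on RAID6 spindles: {backend_iops}")

nl_sas_iops = 80             # assumed IOPS for one 7.2K NL-SAS drive
spindles = -(-backend_iops // nl_sas_iops)   # ceiling division
print(f"7.2K spindles needed without any cache/flash: {spindles}")

# Item 2: 400 GB/day retained for slightly over 12 months (~375 days)
retained_tb = 400 * 375 / 1000
print(f"Write-once retained data: ~{retained_tb:.0f} TB")
```

The gap between that raw spindle count and the ~120 drives that capacity alone would justify is exactly what FAST Cache or a flash tier would have to absorb, given the write-heavy mix.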
So: I have read the Best Practices for Performance guide, and my assumptions/questions are as follows:
a. Getting the capacity I need will take (say) 120 x 4TB 7.2K drives in RAID6.
b. I don't know whether I need any other spinning disks, or just an eMLC flash tier (15 disks?). I did try the VNX Skew report on Mitrend, but unfortunately all it delivered was a config summary and no heatmaps (I'll pursue that as a separate enquiry).
c. I'm assuming I would have some FAST Cache drives (how many?)
d. I don't know whether it is worth the overhead of separating the workload into different pools (I suppose the obvious case would be putting the long-tail data in an (almost?) entirely NL-SAS pool).
e. There may be an interest in staging some of the purchase - the capacity requirement will take some time to grow as the project turns on the workloads.
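For assumption (a), a quick usable-capacity sketch; the 6+2 RAID6 layout and the ~3.6 TB formatted capacity per nominal 4TB drive are assumptions, and hot spares and pool overhead are ignored:

```python
# Usable capacity for assumption (a): 120 x 4TB NL-SAS in RAID6.
# The 6+2 group layout and ~3.6 TB formatted per drive are assumptions;
# hot spares and pool overhead are ignored.

drives = 120
group_size = 8               # 6 data + 2 parity per RAID6 group
data_drives_per_group = 6
formatted_tb = 3.6           # assumed formatted capacity of a "4TB" drive

groups = drives // group_size
usable_tb = groups * data_drives_per_group * formatted_tb
print(f"{groups} x (6+2) groups -> ~{usable_tb:.0f} TB usable")
```

That works out to roughly 324 TB, which sits towards the top of the 180-350TB range in item 2.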
I realise that some of this is advice I could get through an EMC partner - however, it is likely (due to my organisation's policy) that we will need to procure this on a competitive basis from a pre-established supplier list (which contains several EMC partners). If we were to offer what I've written above to several partners would it be sufficient to produce a useful BoM without great wasted effort for the unsuccessful suppliers? Is there a process for asking EMC to transform this kind of summary into a BoM? It may be that what I really need is a small bit of consultancy with someone more knowledgeable, though I'd have to venture into the bureaucracy of procuring that!
Finally - part of my rationale for posting here is the hope that others might find the design discussion interesting.
Thanks to those who have spent their time reading this far... I shan't be offended if the length causes many of you to skip it altogether!
Without writing a book, here are a few things that come to mind:
How much FAST Cache? ... Within budget constraints: all of it. It's a good resource, particularly useful for smoothing out oddities in your usage. Tiering isn't responsive enough, in that it rebalances daily rather than minute by minute. FAST Cache is much more granular and responds quickly to changes.
I can't get a feeling for your budget here, but I have two different thoughts depending on that:
A VNX2/5600 is a good platform for this in that it has more cache and back-end channels than a 5200 or 5400, and can handle up to 500 disks. I have one configured for general VMware hosting that has 210 x 1.2TB 10K disks, 180 x 3TB NL-SAS disks, and 15 x 200GB EFDs for FAST Cache. It listed over a million, and we paid something like half of list...
But you also mentioned buying some capacity, and then upgrading. Often it is difficult to get satisfactory pricing on future upgrades. You can get 'not to exceed' pricing, but it may not be as competitive as you would like. So how about buying a smaller unit, like a 5200, with the intent to get a second one later (or as I understand it, a pair now, and a pair later...)? I could see configuring something like 68 x 4TB NL-SAS with 7 x 200GB EFDs, and 50 x 1.2TB 10k disks and getting a usable capacity of 160-200TBs out of that. The 5200 is not as robust as the 5600 (smaller cache and only 125 disk handling capacity), but it is competitive with your VNX 5500 in terms of controller performance. I have a couple of arrays somewhat similar to that which lead me to guess that you would be able to get a 'loaded' 5200 for something close to the mid $200k range.
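For what it's worth, the usable figure above works out roughly as follows; the RAID layouts (6+2 RAID6 for NL-SAS, 4+1 RAID5 for 10k) and the formatted capacities are illustrative assumptions, not a quoted configuration:

```python
# Rough usable-capacity math for the suggested 5200 build. The RAID
# layouts and formatted capacities below are assumptions, not a
# quoted configuration.

nl_drives, nl_formatted_tb = 68, 3.6        # "4TB" NL-SAS
tenk_drives, tenk_formatted_tb = 50, 1.05   # "1.2TB" 10k

nl_groups = nl_drives // 8                  # 6+2 RAID6; leftovers ~ spares
nl_usable = nl_groups * 6 * nl_formatted_tb

tenk_groups = tenk_drives // 5              # 4+1 RAID5
tenk_usable = tenk_groups * 4 * tenk_formatted_tb

total = nl_usable + tenk_usable
print(f"NL-SAS ~{nl_usable:.0f} TB + 10k ~{tenk_usable:.0f} TB = ~{total:.0f} TB")
```

Once spares and pool overhead come off, that lands close to the 160-200TB quoted above.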
A 5400 would offer you the ability to handle more disks, and the CPUs in the controllers are a bit faster, but the controller RAM is the same as the 5200's. I'm sure you can get a decent price if you configure one similarly, or with a few more spindles, but I don't have any real-world feedback on that.
If money were no object, I would get the 5600, either spec'd to your final capacity, or with some sort of 'not to exceed' upgrade plan. But money is always a consideration.
Thanks Zaphod, that's really helpful - and some food for thought.
Plenty of FAST Cache: understood, that makes sense. I assume your 15 drives were 14 drives in RAID 1 pairs plus a hot spare?
In your sample system did you just have 2 backend buses or did you use 4 or 6, for example? I'm not sure what the trigger point is for increasing the backend bus count (obviously the smaller models manage with just 2 anyway).
10K & 7.2K drives: I notice, though, that you've gone for HDDs rather than eMLC SSDs. I assume that is just a cost/GB factor? Likewise, the choice of 10K instead of 15K drives is presumably a similar argument.
2x VNX5200 instead of 1x5600 - that's an interesting thought in terms of phasing, as we could get some additional life out of one of our existing VNX5500s if we went down that route, but that's a decision others in my project may have to make dependent on how much capital expenditure they want to push out into the future. I understand the point about future expansion pricing and that is subject to the same project direction - I can see the discount advantage of buying upfront.
Thanks again for your comment - it has been really useful in pointing out what I need to focus on.
The VNX 5600 comes with two backend buses by default; I added a four-channel upgrade to bring the count to six. As configured there are 21 disk enclosures in total, so each bus has three or four enclosures on it. The 5200 is limited to two buses, and depending on drive sizes a full configuration can be from five to eight enclosures in total. I do not believe that any VNX is permitted more than eight enclosures per bus, and I do not know whether that number is lower if you are using 25-slot 2.5" enclosures.
I work in higher ed., and budget concerns are high on the list of constraints. There is a short story about how I ended up with the 5600:
For the last several years we have been buying VNX5300 and, more recently, 5200 arrays. Our growth has been such that we were adding several a year, and I now have a dozen of them. Fiscally we keep things for 5 years, so the idea was that rather than buying big units every few years we would just get the smaller ones, and they would age out incrementally, giving us a more uniform spend year over year. But now that we are up to three or more a year, we have grown enough that we can achieve the same effect by planning to buy a 5600 every year. I was able to demonstrate that I could get the 5600 for just about the same dollars as three of the smaller arrays.
In the smaller units I had been using three configuration templates:
1) Bulk storage with 120 x 2TB or 3TB disks and no advanced functions.
2) 'Pretty Good' storage with 125 x 2.5" 10k disks; 600GB or bigger. Again with no advanced functions.
3) Tiered storage with some EFDs for FAST Cache, along with a mix of 7.2K and 10K/15K disks pooled together.
Most of our workloads are quite happy on NL-SAS disks with no FAST Cache. We purchase the 10K-based units for things like Oracle that benefit from lower latencies, and the tiered units take outliers with higher IOPS requirements, or tendencies to burst high, which make them poor neighbours in the other units.
The 5600 configuration is roughly the same as one of each of the above, all in one unit. Most of the capacity is in tiered pools, and there are some NL-SAS RAID groups to provide bulk-type space.