October 12th, 2010 23:00

the “Don't specialize your storage, use general pools” discussion - part 1 - Business Case

Part I – The Business Case - “Don't specialize your storage, use general pools”

Minimizing the Cost of Storage Infrastructure for Virtualization
I have to be honest, I come from the school of over-design, over-engineering, and abundant expenses when it comes to running infrastructure.  However, it was fully justified.  Huge infrastructure capital expenses allowed me to sleep better at night and minimize the amount of labor necessary manage hardware.  Yes I may have worked for a company that gave me the luxury to do this (highly profitable and highly reliant of technology), but I am running into less and less companies that can take this stance.

Let me take a step back and look at how I would have over-designed and over-engineered storage.  Take a database as an example.  A traditional approach would be to dedicate silos of disk (RAID groups with dedicated spindles) to specific IO profiles and critical functions, allowing for optimal performance.  So where are my excessive costs in this scenario?  Largely in idle capacity.

Idle capacity is a central theme of managerial accounting, and even when it is not formally taught, business managers learn innately that it is a pivotal concern for minimizing costs.  Who wants a machine that dedicates itself to producing a single product?  In a competitive market, that kind of approach to production is for the most part fraught with failure.  In economics, the company that can produce and sell an equivalent product at the lowest cost wins.  Are we seeing any similarity here between technology infrastructure and accounting?  Now let's move forward with the analogy of a RAID group producing IOs compared to a bottle machine producing bottles.

How can we squeeze the most out of our bottle-machine investment?  Produce more than just soda bottles; produce water bottles as well.  And how can we get more out of our storage investment?  Don't specialize backend disk in the array; produce IOs that can be used for multiple workloads: a shared model.

The traditional approach: why not shared everywhere?

SLAs.  One of the most critical pieces of meeting SLAs in the storage world has always come down to physics and a basic question: how many drives do I have, and what are their IO capabilities?  Customers with money to spend have traditionally planned to the physics of storage rather than taking the opposite approach of minimizing cost at the expense of SLA guarantees.  Put another way, a customer with stringent SLA needs would typically rely purely on the physics, while someone trying to minimize costs would adopt a shared model.

So how can we maintain SLAs in a shared model?

Easy: this has traditionally been done with QoS at the LUN level, applying Quality of Service priority controls to critical workloads.  This allows a customer to leverage a shared-pool model while still ensuring SLAs.
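Conceptually, LUN-level priority QoS amounts to a weighted split of the array's deliverable IOPS among the LUNs sharing the pool.  The sketch below illustrates only the idea; the priority names, weights, and function are illustrative assumptions, not any vendor's actual interface.

```python
# Illustrative sketch of priority-based QoS at the LUN level: all LUNs
# share the same spindles, but the array's deliverable IOPS are divided
# by priority weight so critical workloads keep their SLA.
# Weights and names here are made up for illustration.

PRIORITY_WEIGHTS = {"high": 4, "medium": 2, "low": 1}

def allocate_iops(total_iops, luns):
    """luns: mapping of LUN name -> priority class.
    Returns LUN name -> IOPS share, proportional to priority weight."""
    total_weight = sum(PRIORITY_WEIGHTS[p] for p in luns.values())
    return {name: total_iops * PRIORITY_WEIGHTS[p] / total_weight
            for name, p in luns.items()}

shares = allocate_iops(10000, {"db_log": "high", "db_data": "high",
                               "file_share": "medium", "archive": "low"})
```

The point of the sketch is that no LUN owns dedicated spindles; the "high" LUNs simply draw a larger slice of the shared IO production, which is exactly the bottle-machine economics described above.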

But wait, LUN level?

From a storage array's perspective, at the moment I don't know of a vendor that is applying QoS at the VM level.  However, this is a two-sided battle.  If we include the hypervisor in the storage discussion, it adds the opportunity to decide during storage operations which VMs get priority access to which resources.  Controlling resources this way has been done for a while for VMs in virtual infrastructure generally.  What's new in this mix is the ability to limit a VM's consumption of storage resources based on fair access and hard limits.  This means we can skip QoS at the array level and simply apply fair-access policies to the virtual machines themselves.  This is truly game-changing technology, and it is called Storage IO Control within the vSphere hypervisor from VMware.  It is game-changing in the sense that it allows us to make that soda bottle machine produce water bottles while making profit-driven choices about which bottles to produce based on demand.
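Storage IO Control's actual mechanics are internal to vSphere, but the shares-plus-hard-limits idea it applies to VMs can be sketched abstractly: distribute the datastore's IOPS in proportion to each VM's shares, cap any VM at its hard limit, and redistribute the leftover to the uncapped VMs.  The VM names, share values, and `distribute_iops` function below are hypothetical illustrations, not the vSphere API.

```python
def distribute_iops(total_iops, vms):
    """vms: mapping of VM name -> (shares, hard IOPS limit or None).
    Allocates IOPS proportionally to shares, capping at each VM's
    limit and redistributing the excess to uncapped VMs."""
    alloc = {name: 0.0 for name in vms}
    active = set(vms)                 # VMs not yet at their hard limit
    remaining = float(total_iops)
    while active and remaining > 1e-9:
        total_shares = sum(vms[n][0] for n in active)
        capped, used = set(), 0.0
        for n in active:
            shares, limit = vms[n]
            give = remaining * shares / total_shares
            if limit is not None and alloc[n] + give >= limit:
                used += limit - alloc[n]   # VM hits its hard limit
                alloc[n] = limit
                capped.add(n)
            else:
                alloc[n] += give
                used += give
        remaining -= used
        active -= capped
        if not capped:
            break  # full proportional pass: everything was handed out
    return alloc

# Hypothetical example: a batch VM is limited to 300 IOPS, so its
# unused entitlement flows to the tier 1 database and web VMs.
alloc = distribute_iops(3000, {"tier1_db": (2000, None),
                               "web": (1000, None),
                               "batch": (1000, 300)})
```

In this toy run the batch VM is pinned at 300 IOPS and the surplus is re-split by shares between the other two VMs, which is the "fair access plus hard limits" behavior described above.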

Ok, so let’s minimize idle capacity

So here we have it: a fundamental shift in how we consume storage resources in a VMware environment.  We now have the technology to leverage huge pools of storage.  No longer do we need to break our storage into specific islands; we can now have a do-it-all factory while ensuring we produce our most important product first and foremost: IOs for critical apps.  Talk about elasticity: we can now bring new, less critical applications onto our highest tiers of storage without affecting tier 1 applications.  So let's break the idle-capacity discussion out a bit.  Instead of just referring to idle capacity, let's be more specific about the types: idle IO capacity, and idle storage capacity (GB).  And yes, we have the same goal for both: minimizing costs by minimizing the specialization of resources.

Part II – The Technical Case - “Don't specialize your storage, use general pools” - to be continued

How SIOC and FAST are complementary technologies that help lower costs of storage infrastructure in a VMware environment
