Why would someone consider a turnkey object storage solution instead of a home-grown one?
There are several reasons to choose a turnkey solution over piecing together your own. The main one is that a turnkey solution is vendor supported: the vendor has certified the commodity hardware as a reference architecture. You still get the cost benefit of commodity components, but you also receive the level of support you would expect when buying a SAN. Deployment is also much quicker, because the trial and error of architectural design has already been done for you. As a result, more IT time can go to building applications instead of designing infrastructure.
Many IT organizations like to build what they can themselves. What are some of the biggest challenges to designing a home-grown object-storage platform?
Two of the biggest challenges are managing a home-grown environment as it scales, and providing geo-scaling and data protection. I’ve worked with customers who designed their own object store using open source software and commodity storage, only to find that once the environment scaled it was nearly impossible to manage. One customer literally had to have staff walk the data center floor to determine which commodity racks had failed disks that needed replacing. Turnkey vendor solutions like EMC ECS Appliance are tightly integrated with a software-defined management layer that notifies you of disk failures or other issues automatically. I mention the disk example because commodity disks are generally high-capacity SATA drives that will eventually fail and need to be replaced. Depending on the size of the environment there may be thousands of disks, and a task such as determining root cause, which is simple with a solution like ECS Appliance, is not that straightforward in a home-grown solution. Before making a decision, consider how well you will be able to manage a home-grown object store as it scales.
Another concern is how to implement geographic protection and ensure availability. Will you be able to implement this in your home-grown object store? A product like EMC ECS Appliance provides full geo-protection against site failure should a disaster force an entire site offline. ECS software protects data across geo-distributed sites and ensures that applications continue to function seamlessly in the event of a site failure.
How does ECS protect data within the system and how does it compare to other vendors or an OpenStack distribution?
ECS uses an erasure coding scheme to provide storage efficiency without compromising data protection or access. ECS first writes data triple mirrored for high performance, then later splits the data into 12 data fragments and 4 coding fragments, with the resulting 16 fragments dispersed across nodes at the local site. The storage engine can reconstruct a chunk from a minimum of 12 fragments. In this way, ECS can tolerate node failures and still deliver service while also marrying the high ingest performance of mirrored writes with the low storage overhead of erasure coding.
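To put rough numbers on that trade-off, here is a small back-of-the-envelope sketch of the 12 + 4 scheme described above, compared with keeping three full mirrored copies. The 300 TB figure and the function names are hypothetical and chosen only for illustration. The sketch counts lost fragments, not failed nodes, because how many node failures a system survives also depends on how fragments are spread across nodes.

```python
# Fragment counts taken from the description above; everything else is illustrative.
DATA_FRAGMENTS = 12
CODING_FRAGMENTS = 4
MIRROR_COPIES = 3


def raw_capacity(user_tb, data_fragments, coding_fragments):
    """Raw capacity consumed to store user_tb of data under erasure coding."""
    total_fragments = data_fragments + coding_fragments
    return user_tb * total_fragments / data_fragments


def tolerated_fragment_losses(total_fragments, minimum_needed):
    """Number of fragments that can be lost while a chunk stays reconstructable."""
    return total_fragments - minimum_needed


user_tb = 300  # hypothetical amount of user data
erasure_raw_tb = raw_capacity(user_tb, DATA_FRAGMENTS, CODING_FRAGMENTS)
mirror_raw_tb = user_tb * MIRROR_COPIES
losses = tolerated_fragment_losses(DATA_FRAGMENTS + CODING_FRAGMENTS, DATA_FRAGMENTS)

print(f"Erasure coded: {erasure_raw_tb:.0f} TB raw for {user_tb} TB of data")
print(f"Triple mirror: {mirror_raw_tb} TB raw for {user_tb} TB of data")
print(f"Fragments that can be lost per chunk: {losses}")
```

In other words, 16 fragments for every 12 fragments of data works out to about 1.33 times the user data in raw capacity, compared with 3 times for triple mirroring, and a chunk can still be rebuilt after losing up to 4 of its 16 fragments.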
ECS provides strong consistency for data, even when accessed from multiple geographies. In a strongly consistent system, the application is always guaranteed to read the most recent version of the data, and ECS has been optimized to quickly determine where the data resides and return it from the location closest to the application. Most other cloud systems implement eventual consistency schemes, where updates to data propagate lazily and applications may read prior versions of the data for an unspecified period of time, requiring additional and complex application logic to ensure correct behavior.
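The difference between the two models is easier to see in a toy example. The sketch below is purely illustrative: the class names, site names and the `propagate` step are all hypothetical, and the "update every site before acknowledging" approach is just one simple way to model strong consistency. It is not a description of how ECS or any other product implements it.

```python
class EventualStore:
    """Toy two-site store: a write lands on one site and is copied to the other later."""

    def __init__(self):
        self.sites = {"site_a": {}, "site_b": {}}
        self.pending = []  # (target_site, key, value) updates not yet propagated

    def write(self, site, key, value):
        self.sites[site][key] = value
        for other in self.sites:
            if other != site:
                self.pending.append((other, key, value))

    def read(self, site, key):
        return self.sites[site].get(key)

    def propagate(self):
        """Apply queued updates in order, simulating lazy background replication."""
        for site, key, value in self.pending:
            self.sites[site][key] = value
        self.pending.clear()


class StrongStore:
    """Toy two-site store: a write completes only after every site has the new value."""

    def __init__(self):
        self.sites = {"site_a": {}, "site_b": {}}

    def write(self, site, key, value):
        for replica in self.sites.values():
            replica[key] = value

    def read(self, site, key):
        return self.sites[site].get(key)


eventual = EventualStore()
eventual.write("site_a", "report", "v1")
eventual.propagate()
eventual.write("site_a", "report", "v2")
stale_read = eventual.read("site_b", "report")   # site_b has not seen v2 yet
eventual.propagate()
caught_up_read = eventual.read("site_b", "report")

strong = StrongStore()
strong.write("site_a", "report", "v1")
strong.write("site_a", "report", "v2")
strong_read = strong.read("site_b", "report")

print(stale_read, caught_up_read, strong_read)
```

With the eventual store, an application reading from the second site between the write and the propagation sees the older version, which is exactly the case that forces extra application logic. With the strong store, the read returns the latest version regardless of which site serves it.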
OK. What about open source options on commodity hardware? How do these solutions compare to a vendor supported, turnkey solution? Experts? Anyone?