Welcome to the EMC Support Community Ask the Expert conversation. Today EMC announced ScaleIO 2.0, which builds on the unique value delivered by ScaleIO by further increasing performance, enhancing scalability, and improving operations, making ScaleIO even more resilient and secure than before. Our seasoned experts have extensive experience with ScaleIO and are here to answer any and all of your questions. If you missed the live announcement, view it here and ask your questions.
Meet Your Experts:
Product Manager - EMC ScaleIO
Jason is a Product Manager on the ScaleIO product and a true technologist at heart. He works on all aspects of ScaleIO and is always interested in people's thoughts on storage, networking, technology, and the ways all of these intersect. In previous roles, Jason has been a Technical Trainer, Corporate Systems Engineer, IT Manager, and, once upon a time, a Support Tech. Twitter: @osaddict.
Technical Marketing Engineer - EMC ScaleIO
A native of the Greater Seattle area, David has been in the storage industry since his start at Digital Computer Corporation and "cut his virtual teeth" at multiple start-ups. He is an Isilon Proven Professional as well as a ScaleIO one, and a builder of virtual labs for learning software-defined storage.
Product Manager - EMC ScaleIO
I currently lead the ScaleIO product management team. I have over 10 years of experience in the technology sector, including product management, software engineering, hardware engineering, and performance engineering, as well as investment banking and consulting. I have several patents and papers on storage systems and clustering. I am an outdoor enthusiast and usually spend my free time hiking and camping with my wife and our dog, a 16 lb Schnoodle. Twitter: @navinsharma101.
Principal Product Marketing - EMC ScaleIO
Jason is currently a Product Marketing Manager for EMC ScaleIO. In previous roles he was a Product Manager for EMC Centera, EMC Atmos, and EMC ViPR/ECS. Jason is a member of EMC Elect 2016 and can be found on Twitter @FelixNU98.
| INTERESTED IN A PARTICULAR ATE TOPIC? SUBMIT IT TO US |
This discussion will take place Mar. 30th - Apr. 12th. Get ready by bookmarking this page or signing up for e-mail notifications.
Share this event on Twitter or LinkedIn:
>> Ask The Expert – Introducing ScaleIO 2.0 http://bit.ly/1Mz9MVc #EMCATE <<
This Ask the Expert session is now open for questions. For the next couple of weeks our Subject Matter Experts will be around to reply to your questions, comments or inquiries about our topic.
Let’s make this conversation useful, respectful and entertaining for all. Enjoy!
AFAIK (please correct me if I'm wrong), ScaleIO keeps 2 copies of each data block distributed between fault sets (i.e. in 2 separate fault sets), and this is always the case and not configurable.
Q1: Is there any documentation available on how the algorithm distributes data between more than 2 fault sets and/or between SDSs that are members of a fault set and those that aren't?
Q2: 2 copies of data is not very redundant, so can you please share your experiences with how customers overcome this limitation?
Q3: Do you know about any plans/time frames to extend this to 3 copies in at least 3 separate fault sets?
Thanks a lot!
Thanks for the questions Groer.
You are correct, we are doing 2x mesh mirroring. Answers to your questions below:
Q1: Fault sets are a delineation of a set of nodes that could fail together. Example: by placing the nodes in a rack within a fault set, the algorithm makes sure that all mirrors go to nodes outside of that fault set (i.e., a different rack). That way, if multiple nodes within a fault set fail, the data is still available. By default, each node is its own fault set. The minimum number of fault sets is 3, but be aware that fault sets can affect usable space, especially if you have a small number of them.
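To make the placement constraint concrete, here is a minimal sketch of choosing a mirror location outside the primary copy's fault set. This is an illustration of the constraint described above, not ScaleIO's actual placement algorithm; the node and fault-set names are made up.

```python
import random

def place_mirror(primary_node, nodes):
    """Pick a mirror node whose fault set differs from the primary's.

    `nodes` maps node name -> fault set name. Illustrative sketch only,
    not ScaleIO's real data-layout algorithm.
    """
    primary_fs = nodes[primary_node]
    candidates = [n for n, fs in nodes.items() if fs != primary_fs]
    if not candidates:
        raise RuntimeError("no node outside the primary's fault set")
    return random.choice(candidates)

# Three racks, each defined as a fault set (hypothetical topology)
nodes = {"node1": "rack-A", "node2": "rack-A",
         "node3": "rack-B", "node4": "rack-C"}

mirror = place_mirror("node1", nodes)
# The mirror always lands in a different rack than the primary
assert nodes[mirror] != nodes["node1"]
```

Note that `node2` is never chosen as the mirror for `node1`: sharing a rack means sharing a fault set, so a whole-rack failure would take out both copies.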
Q2: When a device or node fails, the rebuild is done in a massively parallel fashion, as every node storing data within that storage pool works to redistribute the data. Since so many nodes are involved and the data is mirrored, these rebuilds happen in a matter of minutes, so the window of risk is very low. The key is to understand that because the rebuilds are so fast (as long as you follow some basic best practices), you actually have a lower probability of DU/DL than with typical RAID-6 and even 3-way mirroring (which suffers from really, really long rebuild times). We have an internal tool that will give the availability of a specific configuration based on all the components and the network connectivity to the nodes. What we find is that we have very high availability with 2x mesh mirroring, and we have very large service providers using this for critical applications. Risk can additionally be mitigated by creating protection domains and storage pools as clusters grow to larger and larger sizes, as well as by using fault sets.
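A back-of-envelope calculation shows why the parallel rebuild is so fast: the failed node's data is rebuilt by all surviving nodes at once, so aggregate rebuild bandwidth scales with cluster size. The figures below (10 TB failed, 49 surviving nodes, 5 Gb/s of rebuild bandwidth per node) are illustrative assumptions, not ScaleIO measurements.

```python
def rebuild_minutes(failed_tb, surviving_nodes, per_node_gbps):
    """Estimate rebuild time when the failed node's data is
    redistributed in parallel across all surviving nodes.
    Inputs are illustrative assumptions, not measured values."""
    total_gbps = surviving_nodes * per_node_gbps      # aggregate rebuild bandwidth
    bytes_per_sec = total_gbps * 1e9 / 8              # Gb/s -> bytes/s
    seconds = failed_tb * 1e12 / bytes_per_sec
    return seconds / 60

# 10 TB failed node, 49 surviving nodes, 5 Gb/s each -> a few minutes
parallel = rebuild_minutes(10, 49, 5)
# Same data funnelled through a single spare node -> dozens of times longer
single = rebuild_minutes(10, 1, 5)
print(round(parallel, 1), round(single, 1))
```

The single-node case models the hot-spare bottleneck of a traditional RAID rebuild; the many-to-many case is what keeps the mirrored window of risk short.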
Q3: Today we do 2x mesh mirroring, but we are exploring other data layout options for the future. However, availability has not been an issue driving these discussions.
Are there any plans to use some kind of (simple) compression on data blocks? Most modern CPUs could do this with almost no performance impact.
Will there be a product/update to get more historical info about the performance?
Thanks for the question BaDMaN,
Great questions. We are exploring both of these options. If you want more details, reach out and we can discuss the roadmap.
What would be the state of the cluster if a fault set fails and there is not enough space for a rebuild?
For example: a 4-rack cluster where each rack is a fault set, and let's assume one fault set fails. It might be difficult to persuade a customer to reserve the capacity of an entire rack as spare capacity.
I have a question about ScaleIO performance:
1. How many factors have an impact on performance?
(disk type, network, server spec, number of SDSs, anything else?)
2. Does the number of devices in each SDS have an impact on performance?
(e.g., 3 SDSs with 3 disks each vs. 3 SDSs with 6 disks each)
3. The devices added to an SDS can be a disk or an unmounted partition.
Is performance different if we use a raw disk versus several unmounted partitions?
(e.g., adding a 1 TB raw disk vs. adding 10 × 100 GB partitions from the same disk)
Do you have, or are you willing to share, data about the performance impact of integrity features like checksums, zero padding, and the background scanner? Even orders of magnitude would be interesting, i.e. are we talking tenths of a percent, 1 percent, or 10 percent?
Thanks a lot!