Thanks for the reply, Rob. We are looking to scale past the 2 PB mark, so this certainly looks like the best solution; whether we can afford it is another conversation entirely. A quick side note: you came out to San Diego for a customer executive dinner event in 2012, and I still reference some of the things you said there. It was an excellent talk that left an impression on me, so thank you!
Thanks Peter, I know "it depends". Would be great to have some hallmarks for concrete scenarios though. Can you quantify EMC's understanding of the "needs of the platforms they run on", specifically for HD400?
I work with the backup division, and in the past we used to have a lot of problems performing NDMP restores with Isilon. Backups ran perfectly and fast, but when we had to restore using NDMP, the performance was not so good.
In the latest version, with snapshot-based backups, this has been "solved", but would we still have problems if we continue using plain NDMP?
Hi Pablo. I had to get some consultation to answer this, so hopefully I do it justice. There are multiple use cases here that can affect the answer, and the dataset plays a key part. Three-way NDMP restores are inherently slow, since the data travels over the front-end network. Next, if there are small files in the dataset to restore, it will be slower, because our writes on many small files don't perform as well as on larger files (and three-way NDMP worsens this). One of the things we did in OneFS 7.1.1 was introduce parallel restores, which are multi-threaded restores; prior to OneFS 7.1.1, restores were single-stream. Some performance evaluations have shown about a 2x performance increase with local restores using backup accelerators and the parallel restore feature.
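To give a feel for why the parallel restore feature helps with small-file datasets, here is a minimal sketch in plain Python (not OneFS code, just an illustration of the concept): a single-stream restore copies one file at a time, while a multi-threaded restore keeps several copies in flight at once, hiding per-file latency.

```python
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def restore_serial(files, dest):
    # Single-stream restore: one file at a time, so per-file
    # latency adds up linearly across the whole dataset.
    for f in files:
        shutil.copy(f, Path(dest) / f.name)

def restore_parallel(files, dest, workers=8):
    # Parallel (multi-threaded) restore: several small files are
    # copied concurrently, keeping more I/O in flight at once.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda f: shutil.copy(f, Path(dest) / f.name), files))

if __name__ == "__main__":
    # Create a hypothetical small-file dataset and restore it.
    src, dst = Path(tempfile.mkdtemp()), Path(tempfile.mkdtemp())
    files = []
    for i in range(100):
        p = src / f"file{i}.dat"
        p.write_bytes(b"x" * 1024)
        files.append(p)
    restore_parallel(files, dst)
    print(len(list(dst.iterdir())))  # 100
```

The benefit shows up most on datasets dominated by many small files, which matches the slow case described above.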
Would it become economical in certain scenarios to push data from Isilon to ECS via the upcoming (and already demoed in 2014) Isilon CloudPools feature?
Does 7.2 bring any new reporting options, especially around SmartQuotas?
IIQ 3.1 has new SmartQuotas reports. Are there any specific reports you are looking for?
Can we check which users are using a specific file type, such as pictures, video, or .bak files? And which users are exceeding advisory and soft quotas?
Sure! And thanks for asking, Niki. ECS fits as a data lake foundation when the customer wants object storage at the core of their platform. ECS is object-based and provides a very low-cost, highly dense storage platform for Hadoop. It consists of low-cost commodity components, architected and designed to deliver an enterprise-class storage platform that can scale to hundreds of petabytes, even exabytes. The benefit for customers is that they get the economics of commodity hardware along with the reliability, availability, and serviceability (RAS) typical of integrated storage platforms, fully supported by EMC.
EMC has a couple of customers using ECS as a large, multi-site archive for unstructured content. The HDFS data service on ECS allows them to bring analytics to those deep archives; metadata querying is one example. They can also support web, mobile, and cloud apps written to object storage APIs such as Amazon S3, OpenStack Swift, and EMC Atmos. ECS addresses some of the limitations of traditional HDFS with a hybrid distributed erasure coding mechanism that protects data across multiple sites with very low overhead, and it can maintain that low overhead without sacrificing accessibility of the data. In fact, data can be read from and written to any site in a multi-site environment, even in the event of a site failure. ECS also supports multi-tenancy, so an enterprise or a service provider can offer Hadoop-as-a-Service (HaaS).
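As a rough illustration of the overhead point: traditional HDFS triple replication stores three full copies of every byte, while a k+m erasure code stores k data fragments plus m coding fragments, for a raw-to-usable ratio of (k+m)/k. The 12+4 parameters below are hypothetical, chosen only to show the arithmetic, not ECS's actual coding scheme.

```python
def storage_overhead_replication(copies):
    # Raw bytes stored per byte of user data with n-way replication.
    return float(copies)

def storage_overhead_erasure(data_fragments, coding_fragments):
    # Raw bytes stored per byte of user data with a (k+m) erasure code.
    return (data_fragments + coding_fragments) / data_fragments

# Traditional HDFS triple replication: 3.0x raw-to-usable.
hdfs = storage_overhead_replication(3)

# Illustrative (hypothetical) 12+4 erasure code: about 1.33x.
ec = storage_overhead_erasure(12, 4)

print(hdfs, round(ec, 2))  # 3.0 1.33
```

The gap between those two numbers is why erasure coding at scale is so much cheaper per usable petabyte than replication, provided the coding scheme keeps the data readable across sites.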