December 4th, 2014 12:00

Isilon Nodes IOPS

The Isilon literature says only that with S nodes "up to 200 GB/sec are possible".  That would apply only to the largest possible system.  I can't find any IOPS information either, or a way to determine it.  It would be good to have statistics for a minimal 3-node configuration of S nodes, and the same for X nodes.  Is there a chart of IOPS for all node types?


Thank you,


Saeid

254 Posts

December 4th, 2014 13:00

IOPS, as in disk I/Os, is not a particularly good measurement for a NAS system and is much more relevant to block or SAN systems.  One of the big reasons is that in a NAS environment, the clients don't really generate disk IOPS directly.  They make calls via a protocol which eventually map down to disk IOPS, but the mapping is not necessarily one-to-one, or even one-to-X with X a constant ratio, so disk IOPS is not a good indicator of performance.

Compare this to block systems, where the host sees the storage as a disk and everything is a read or a write.  Reads are essentially 1:1 and writes are 1:N, where N is constant and depends on the RAID type.  Host IOPS are easier to measure, and translating them to disk IOPS is a straightforward process.  With NAS, we have more than just data read and write operations, because we have things like metadata operations that people don't always count as throughput but that still happen.  A classic example:  Have you ever copied a directory with a ton of small files and noticed how the throughput stinks compared to a few large files that add up to the same size?  That's because the small-file case has to do metadata work, which becomes a significant part of the total work but doesn't always count with respect to "data moved".
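To illustrate the block-storage side of that comparison, here is a minimal Python sketch of the host-to-disk IOPS translation.  The write penalties are the commonly quoted textbook values for each RAID type, not Isilon figures, and the function name and workload numbers are made up for the example.

```python
# Commonly quoted back-end disk I/Os per host write for each RAID type.
# These are textbook values for block arrays, NOT Isilon/OneFS numbers.
RAID_WRITE_PENALTY = {"raid0": 1, "raid10": 2, "raid5": 4, "raid6": 6}


def backend_disk_iops(host_iops: float, read_fraction: float, raid: str) -> float:
    """Estimate disk IOPS for a block workload: reads 1:1, writes 1:N."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    penalty = RAID_WRITE_PENALTY[raid]  # the constant N for this RAID type
    reads = host_iops * read_fraction
    writes = host_iops * (1.0 - read_fraction)
    return reads + writes * penalty


if __name__ == "__main__":
    # Hypothetical workload: 10,000 host IOPS, 70% reads, on RAID 5.
    print(round(backend_disk_iops(10_000, 0.7, "raid5")))
```

There is no equivalent fixed table for NAS: the number of disk operations behind one protocol operation depends on the operation type, caching, and file layout, which is the point made above.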

This is why you don't see standard NAS benchmarks (of which SPEC SFS is the most popular) talk about disk IOPS.  They talk about protocol IOPS, and they define the exact operation mix and sizes for the benchmark.  This makes it possible to do a reasonable comparison between NAS systems, because everything else in the environment has been fixed.  So when a NAS system is rated for IOPS, it typically means protocol IOPS and, more often than not, it means specifically the SPEC SFS workload.  If it is not SPEC SFS, then the workload needs to be well-defined for the benchmark to have any useful meaning.

So, to loop this back to your question:  Another commonly seen performance metric for NAS systems is throughput.  Unlike disk IOPS, throughput can be measured from both the client and the server, which is why you see things like X GB/s.  As far as numbers for an arbitrary number of nodes, I would work with your Isilon SE to look at a given config.  They have internal tools that can show throughput for a single stream (to one node, which can vary with cluster size, since that changes the number of potential disks) as well as aggregate throughput for protocols such as NFS or SMB (which is another consideration, as the protocols differ in performance as well).

This is probably a long-winded answer that doesn't give you a simple number and for that I apologize, but my goal is to get you a meaningful number and like much of our industry, the answer is "It depends...."

December 4th, 2014 14:00

I agree with Adam - it depends.  I've got 28 S200 nodes in a cluster that has hit >1M NFS operations per second with sub-millisecond latency.  However, we were very heavy metadata read at the time.  If I tried to do 1M NFS writes on those same nodes, you'd probably smell them burning from where you are.

You really, really, really need to understand your application workload.  If you don't have it, you're just taking a wild guess.

For some workloads, the new L3 cache is awesome.  For EDA workloads, it doesn't buy us anything.  NetApp has similar trade-offs where hybrid aggregates are awesome for VMware environments but do nothing for EDA.

This is no different than buying a car.  My car can likely go 80mph.  So can a tractor trailer.  But put 40 tons of freight in my car and it won't get out of the driveway.

There are graphs out there that have some information, but they're really only useful for comparisons of node types for the IDENTICAL workload.

Note that it also depends on which version of OneFS you're running because the benchmarks definitely vary from release to release.

For a "home directory" mix, eyeballing from a graph, an S200 is about 15K NFS ops per second.  An X400 is about 14K.  For peak aggregate throughput, per node, an S200 is about 810MB/sec read and 450MB/sec write.  An X400 is very slightly higher.  Ask your sales rep if you want to see the chart, and work with an SE to analyze your workloads and make the appropriate suggestions.  You might want to make tradeoffs for price or rack space.  The S200 easily wins in performance per rack unit, but gets clobbered in TB per rack unit.
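As a rough illustration of how per-node figures like these might be turned into a first-pass cluster estimate, here is a small Python sketch.  It assumes throughput scales linearly with node count, scaled by a hypothetical efficiency factor; that is an assumption made for the example, not something the chart or OneFS guarantees.  Only the S200 numbers quoted above are used.

```python
# Per-node figures eyeballed from the graph mentioned above
# ("home directory" mix for NFS ops; peak aggregate throughput per node).
S200_PER_NODE = {"nfs_ops": 15_000, "read_mb_s": 810, "write_mb_s": 450}


def naive_cluster_estimate(per_node: dict, nodes: int, efficiency: float = 1.0) -> dict:
    """Scale per-node numbers by node count.

    ASSUMPTION: linear scaling times a made-up efficiency factor.
    Real clusters depend on workload, OneFS release, and balance.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return {name: value * nodes * efficiency for name, value in per_node.items()}


if __name__ == "__main__":
    # Minimal 3-node S200 cluster, with no derating applied.
    for name, value in naive_cluster_estimate(S200_PER_NODE, nodes=3).items():
        print(f"{name}: {value:,.0f}")
```

Treat the output as a ceiling for comparing node types on the identical workload, not as a sizing answer.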

You really, really, really need to understand your application workload.  If you don't have it, you're just taking a wild guess and no amount of benchmarking will help you find the right solution.

10 Posts

December 4th, 2014 18:00

Thank you Adam,

I have a prospective customer who is considering a 3-node X200 cluster, to be used for edit-in-place with DNxHD 145 or XDCAM 50 on Windows Adobe Premiere systems. They want to know how many edit stations can edit directly on the Isilon cluster with that configuration.

Other manufacturers have charts with IOPS and transfer rates; I couldn't find anything like that for Isilon.

Is there any chart showing the maximum bandwidth of the Isilon nodes?

Thanks,

Saeid

10 Posts

December 4th, 2014 18:00

Thank you so much Ed!

Saeid

10 Posts

December 5th, 2014 11:00

Hello,

Thank you!

254 Posts

December 5th, 2014 11:00

I don't have a chart per se.  But I have some *rough* numbers.  Again, mileage will definitely vary.

A 3-node X200 cluster doing SMB2 should be able to do an *aggregate* throughput of *around* 1200 MB/s (read XOR write).  That is in a lab, and I'm sure the load was very well balanced, etc.  Real life can and probably will vary.  I believe this test had 1 SSD in each node.  But I would still work with your Isilon SE to look more deeply at your particular workload.  Numbers off a chart are still a bit of a guess without seeing the bigger picture.  The more you understand the workflow, the better you can architect the solution.
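For a back-of-the-envelope feel for what that rough number means in edit streams, here is a small Python sketch.  It assumes nominal video bitrates of 145 Mbit/s and 50 Mbit/s for the two codecs mentioned earlier in the thread, 1 MB = 8 Mbit, one stream per calculation unit, and an arbitrary 50% headroom factor to allow for audio, container overhead, multi-stream timelines, and imperfect load balancing.  All of those are assumptions for illustration, not Isilon guidance.

```python
import math

# Nominal video bitrates in Mbit/s (ASSUMED; audio and container overhead
# are not included and would raise the real per-stream requirement).
CODEC_MBIT_S = {"DNxHD 145": 145.0, "XDCAM 50": 50.0}


def max_streams(aggregate_mb_s: float, codec_mbit_s: float, headroom: float = 0.5) -> int:
    """Concurrent streams that fit in a fraction of the aggregate throughput.

    headroom is the usable fraction of the lab number; 0.5 is an
    arbitrary illustrative value, not a recommendation.
    """
    per_stream_mb_s = codec_mbit_s / 8.0  # Mbit/s -> MB/s
    usable_mb_s = aggregate_mb_s * headroom
    return math.floor(usable_mb_s / per_stream_mb_s)


if __name__ == "__main__":
    lab_aggregate_mb_s = 1200.0  # rough 3-node X200 SMB2 figure quoted above
    for codec, mbit_s in CODEC_MBIT_S.items():
        print(codec, max_streams(lab_aggregate_mb_s, mbit_s))
```

An editor usually pulls more than one stream at a time, so the number of edit stations will be lower than the stream count, and only a test with the real workflow will confirm it.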
