Re: Isilon storage performance issue
A couple of thoughts and suggestions:
is really worth reading to learn more about the filesystem layout. And questions similar to yours
have been discussed here recently
With all that information, it is really fun to examine a file's disk/block layout as reported by isi get -DD "file".
> 5.If I change filesystem block size from 8K to 32K , does it means my stripe unit will be now of 16X32K
I don't think you can do so - which exact setting are you referring to?
> 6.OneFS uses 16 contiguous block to create one stripe unit , can we change 16 to some other value?
I can't imagine so, but the access pattern parameter controls whether a larger or smaller number of disks per node is used (under the constraint of the chosen protection level).
> 7.Can we access Isilon storage cluster from compute node (install RHEL) using SMB protocol, as I read in performance benchmark from storage council that SMB performance is almost double compare to NFS in terms of IOPS?
In benchmarks, SMB IOPS appear higher than NFS IOPS because the set of protocol operations is different, even for identical workloads, not to mention that different workloads are often used. You cannot compare the resulting values.
For your original test, you might max out the disk IOPS (xfers), but you could also get stuck at a certain rate of your "application's IOPS" while seeing little or no disk activity at all(!), because your data is mostly or entirely in the OneFS cache. Check the "disk IOPS" or xfers, including the average size per xfer, with
isi statistics drive -nall -t --long --orderby=OpsOut
and cache hit rates for data (level 1 & 2) with:
isi_cache_stats -v 2
With very effective caching, the IOPS will NOT be limited by disk transfers (so all that filesystem block size reasoning doesn't apply).
Instead the limit is imposed by CPU usage, by network bandwidth, or by protocol (network + execution) latency, even if CPU and bandwidth are below 100%.
In the latter case, doing more requests in parallel should still be possible (it seems you are on the right track anyway with multiple jobs).
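To put rough numbers on the latency-bound case, here is a minimal sketch based on Little's law (throughput = requests in flight / latency). The function names are made up for illustration, and the latency is assumed to be given in microseconds; check the units of the values you actually read from your cluster before plugging them in.

```python
def latency_bound_iops(avg_latency_us: int, outstanding_requests: int) -> float:
    """Upper bound on ops/s when each in-flight request takes avg_latency_us.

    Hypothetical helper: assumes latency stays constant as parallelism grows,
    which only holds until CPU, network, or disks saturate.
    """
    if avg_latency_us <= 0 or outstanding_requests < 1:
        raise ValueError("latency must be > 0 and at least one request in flight")
    return outstanding_requests * 1_000_000 / avg_latency_us


def requests_needed(target_iops: int, avg_latency_us: int) -> int:
    """Smallest number of parallel requests needed to reach target_iops."""
    if target_iops <= 0 or avg_latency_us <= 0:
        raise ValueError("target_iops and latency must be > 0")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-target_iops * avg_latency_us // 1_000_000)


if __name__ == "__main__":
    # One synchronous job at 2 ms per operation cannot exceed 500 ops/s.
    print(latency_bound_iops(2000, 1))
    # Parallel requests needed to reach 4000 ops/s at the same latency.
    print(requests_needed(4000, 2000))
```

So if a single job plateaus at a rate close to 1 / latency, it is latency-bound, and adding jobs (or outstanding requests per job) is the lever to pull, until one of the other limits kicks in.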
To check protocol latencies, use "isi statistics client" as before and add --long:
isi statistics client --orderby=Ops --top --long
This will show latency times as: TimeMax TimeMin TimeAvg (also useful for --orderby=... !)