1 Rookie
•
77 Posts
11
44632
August 6th, 2013 13:00
Ask the Expert: Isilon Performance Analysis
Welcome to this EMC Support Community Ask the Expert conversation. This is an opportunity to learn about and discuss the best practices for Isilon performance analysis, including:
- Client work-flow considerations
- Network considerations
- Using OneFS to identify and break down causes of latency or contention
- Tools that help identify work-flow bottlenecks
- Sizing and tuning protocol OPS to disk IOPS
This discussion begins on August 12 and concludes on August 23. Get ready by bookmarking this page or signing up to receive email notifications.
Your host:
John Cassidy has spent three decades developing and supporting complex solutions and simplifying problems. He brings the following to bear on OneFS performance or complex work-flow issues:
* Work-Flow profiling
* Simplification methods
* Measurement tools
* Deterministic Results


JonK1
2 Intern
•
247 Posts
0
August 22nd, 2013 07:00
I've got a question on the data access pattern and how to change it.
We've recently migrated a large amount of PACS data to an Isilon system (currently 3-node NL400). Folders hold a large number of files, which can be anything between 44KB and several tens or hundreds of megabytes. There is only a single server using this directory tree and I'd like to make sure it has the absolute best performance.
The folder has been configured with the concurrency data access pattern, since this is the default for this cluster. With a single server, my assumption is that a streaming pattern would be better for performance, both read and write.
When changing the DAP from the Filesystem Explorer I can see the changed setting (now Streaming). This is also reflected when running an 'isi get' on the CLI. However, as soon as I move one folder down, it's back on Concurrency for all files and folders. I made extra sure to check the "Apply data access pattern setting to contents" checkbox, but to no avail.
What's the best method of changing it? Would the following command be OK to run, or would you advise a different approach?
isi set -R -l streaming
-R for recursive, -l to change the file layout optimization setting to streaming.
There's also "-r -g" with the options 'repair/reprotect/rebalance/retune'; would I need this? My gut feeling says I'll need it to restripe existing data, and that I'd need the retune goal, but to be honest I'm flying blind here and could use a double check.
Thanks!
paul_ford
2 Posts
0
August 22nd, 2013 08:00
John,
A Federal customer has requested assistance developing a performance test plan. They are trying to replicate the workflow on their production environment onto the test Isilon cluster without access to the production software and systems. The difficulty is that the production systems are classified and we are trying to use free/commercial tools and unclassified data to replicate the environment.
My thought is to use isi statistics to profile the access patterns and levels then replicate.
The environment is a mix of high rates of data ingest and dissem, large NUI containers, VMware guests, VDI, GIS servers, and who knows what else – They make up the requirements as they go along.
I have researched this situation and reached out internally over the past couple of weeks, and no good documentation has been located. Do you have a guideline on how to replicate an environment when trying to do an internal repro?
Thank you in advance,
Paul
cassij
19 Posts
0
August 22nd, 2013 08:00
Peter,
I wouldn't be overly concerned. However, we should see uniform inode balance. If you have OFF_HOURS to allow the "autobalancelin" job, I would suggest you run that. If it does not better balance the inodes on the drives, please open up a support case. Let me know the case number and I can assist.
Best,
John
cassij
19 Posts
1
August 22nd, 2013 10:00
Jon,
My thought here is that at some point the STREAMING was manually set.
https:// :8080/SPSettings
You will notice an option under both
Protection management
[ ] Including files with manually-managed protection settings
I/O optimization management
[ ] Including files with manually-managed I/O optimization settings
I suggest that you ensure that your defaults match your ideals. Allow the smartpools job to run, and it will re-assert the changes over all of OneFS. You should notice that the directories you mentioned will be addressed.
NOTE: As a litmus test, once you check the above two options you can run
isi smartpools process -dv
The -d option means change the layout but don't move blocks, meaning defer the heavy lifting to the job engine smartpools process running across all nodes.
If the path is one of your noted directories in STREAMING but CONCURRENCY is the default you should see from
isi get -D
| head -1 # that the directory now matches your layout goals.
Thanks,
John
cassij
19 Posts
0
August 22nd, 2013 11:00
Paul,
Presuming that you have the ability to gather data whilst onsite but are not allowed to take said data off site, I would suggest the following:
* Map out the WORK-FLOW in a pseudo network topology diagram, where you detail the type of work or expected operations. Talk to the expected work-flow behaviors and tolerance for latency per operation. It is important to know where the work-flow will contend for resources over time. The simple way I would do this is on A4/Letter size pieces of paper with coloured pencils '-) low tech.
* As to profiling, I would highly recommend setting up InsightIQ. It is a very useful tool to profile work-flows, especially if you can map out the network segments and work-flow and then build custom reports to size and measure gains as you tweak or change the resources and tuning.
* VMWARE or VDI setup: you will have templates and clone images where each instantiation will be, in VMware terms, dedup'ing the GuestOS as it executes. It would be ideal to ensure that the base clone images have the needed resources for both READ and WRITE I/O. Meaning you will have, say, 100 clients all with a reference to a few templates or clones. The clones track many changes as a result of the 100 shared GuestOS's. An advanced strategy would be to put the entire clone image into SSD (data+metadata). You might use an S200 and increase the L1 drop-behind cache, avoiding even IB lookups. Meaning have all the VMware hosts map to the same S200's (large memory, SAS, SSD). Very ideal for VDI.
* NUI containers are OBJECT files. They are not unlike MXF files. There will almost certainly be a WRITE sequence and a READ sequence in the creation of the NUI container. The work-flow for the files used in creation of the pre files and then the NUI will be such that there is RANDOM I/O. I would run tests with the CONCURRENCY layout and measure performance. I expect latency on 7.x for writes will be good, and I expect concurrency will be close to optimal. You could simulate tests with RANDOM@16 with prefetch on but coalescer off. Unless the gains are substantial, I tend to favor the least number of changes to maintain in setup. Meaning if the out-of-the-box setup meets needs, KISS, keep it simple s...
* Mimic environment. Once you have a work-flow map and an understanding of I/O latency tolerance, I would use
isi statistics protocol --proto=nfs3 --totalby=node,class,op --orderby=node,op
Repeat this for each proto in use, such as SMB. From the output you will want to ensure that there is uniform or good scale-out to all nodes participating in the work-flow. What you will get from the CLASS/OPS breakdown is the rate, average size and response time for the operations. This will give you a clue on how to mimic.
* InsightIQ will allow you to create more custom reports where you filter on the separate work-flows. What you want to be mindful of is mixing too many bursty work-flows on the same nodes. Meaning you want to avoid node hot-spots where WORK-FLOW X is contending on node 1 for resources (NET/MEMORY/CPU) with WORK-FLOW Y. A strategy may be, from early on, to create SMARTCONNECT zones that allow you to control which nodes are part of a given work-flow. Typically it's consistency that is sought over PEAK and LOW performance, hence the strategy to allocate resources and avoid NODE hot-spots.
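As a rough illustration of the scale-out check described above, here is a minimal Python sketch. It assumes you have already copied per-node operation rates by hand out of the isi statistics output into a dictionary; the function name `find_hot_nodes`, the sample numbers and the 1.5x threshold are all hypothetical and only there to show the idea.

```python
from statistics import mean


def find_hot_nodes(ops_per_node, hot_ratio=1.5):
    """Return {node: ratio_to_mean} for nodes well above the cluster average.

    ops_per_node maps a node number to its observed ops/sec.
    hot_ratio is an assumed threshold, not an Isilon recommendation.
    """
    if not ops_per_node:
        return {}
    avg = mean(ops_per_node.values())
    if avg == 0:
        return {}
    return {
        node: rate / avg
        for node, rate in ops_per_node.items()
        if rate / avg > hot_ratio
    }


# Hypothetical rates for a 3-node cluster; node 3 carries most of the load.
sample = {1: 4200.0, 2: 3900.0, 3: 12600.0}
hot = find_hot_nodes(sample)
for node, ratio in hot.items():
    print(f"node {node} is at {ratio:.2f}x the cluster average")
```

A node that shows up here repeatedly across test runs is a candidate for moving one of the competing work-flows onto different nodes.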
I would pursue follow-up with the partners that own the VDI setup and NUI container creation. Both parties will have QA labs that test the environments. A few well-placed conversations and joint planning meetings will allow you to mimic and design the optimal solution. This is where your A4/Letter drawings will be very useful: print out the data from isi statistics protocol in a table to capture the service response times for each work-flow type, and assign each a GREEN/YELLOW/RED rating. For example, if the tolerable WRITE latency for work-flow X is 100ms and you see you are at 60ms, I would call that YELLOW; potentially you may need more DISK resources. Adding a NODE to a SMARTCONNECT zone and repeating the test should change the 60ms to, say, 40ms, which may mean the high end of GREEN.
In simulation you want to run 10% above whatever the expectation is in terms of protocol operations, and you want to be in the middle of YELLOW latency times for long 24-hour simulation runs, meaning you handle the bursts in work-flow and still have some headroom. You want the headroom for variance in the data set and for handling failures, whether of disk spindles or of VMware ESXi hosts that fail and put more load on the surviving ESXi hosts.
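The GREEN/YELLOW/RED rating can be sketched in a few lines of Python. The band boundaries below (GREEN up to 40% of the tolerable latency, YELLOW up to 80%) are assumptions chosen only so that the 100ms example above comes out the same way (40ms at the high end of GREEN, 60ms in the middle of YELLOW); they are not published thresholds, and `rate_latency` is a hypothetical helper name.

```python
def rate_latency(observed_ms, tolerable_ms, green_max=0.40, yellow_max=0.80):
    """Rate an observed latency against the tolerable latency for a work-flow.

    green_max and yellow_max are assumed fractions of the tolerable latency.
    """
    if tolerable_ms <= 0:
        raise ValueError("tolerable_ms must be positive")
    fraction = observed_ms / tolerable_ms
    if fraction <= green_max:
        return "GREEN"
    if fraction <= yellow_max:
        return "YELLOW"
    return "RED"


# Work-flow X from the example: tolerable WRITE latency of 100ms.
for observed in (40, 60, 95):
    print(f"{observed}ms of 100ms tolerable -> {rate_latency(observed, 100)}")
```

Agree on the band boundaries with the work-flow owners before the test runs, so the colours mean the same thing to everyone reading the table.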
If you need more follow-up on the above, please open a support case and let me know a time to have a follow-on conversation.
Best,
John
dynamox
11 Legend
•
20.4K Posts
•
87.4K Points
0
August 22nd, 2013 11:00
John,
What tools do you use that are also available to customers to perform workload simulation? On the block side I always used IOmeter and iozone, but on NAS it's so different. Something that could simulate hundreds of CIFS users, etc.
Thank you
cassij
19 Posts
0
August 22nd, 2013 15:00
I can offer two approaches:
a) model your work-flow as is using tools like
WINDOWS: Process Monitor (formerly Sysinternals Filemon) # enable Advanced Output.
Unix: strace, ltrace or the platform equivalents: HP-UX (tusc), SunOS (truss)...
From the system calls, which include system execution times, you would profile a real application's elapsed time. From this, build a chart for the following data points:
Mix of READ and WRITE operations (min/avg/max) per second or minute.
Size of Operations
Think-time between operations.
Number of execution threads competing for same file targets.
Another tool to use is tcpdump and then wireshark (tshark from the CLI) or netmon. From these tools you can build a Service Response Time chart that shows you the operation class, number of calls, and MIN, AVG and MAX times.
"isi statistics protocol --totalby=proto,class,op --orderby=proto,ops" # is like a Wireshark Service Response Time chart.
THEN
i) Write your own model tool in .c or .net or perl/python as skillset allows <<< then you know.
ii) Use off the shelf tools like fileop (iozone.org), or filebench and on both windows/unix 'fio'
b) Pick a tool that is close to your work-flow as well as industry recognized.
IOmeter
IOzone
hammerDB
sqlio benchmark
bonnie++
fileop
fio
filebench
I wouldn't pick too many, you can get lost as you begin to look at more and more synthetic work-flows.
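If you go the route of approach (a) and write your own tooling, the Service Response Time chart (operation class, number of calls, MIN, AVG and MAX) is simple to build once you have per-call timings. A minimal Python sketch follows; it assumes you have already parsed your strace or capture output into (operation, elapsed milliseconds) pairs, and the function name and sample trace are hypothetical.

```python
from collections import defaultdict


def service_response_chart(samples):
    """Summarise (op_class, elapsed_ms) pairs into calls/min/avg/max per op."""
    grouped = defaultdict(list)
    for op_class, elapsed_ms in samples:
        grouped[op_class].append(elapsed_ms)
    return {
        op_class: {
            "calls": len(times),
            "min": min(times),
            "avg": sum(times) / len(times),
            "max": max(times),
        }
        for op_class, times in grouped.items()
    }


# Made-up timings standing in for a parsed trace.
trace = [("read", 2.0), ("read", 4.0), ("write", 10.0),
         ("read", 6.0), ("write", 14.0)]
chart = service_response_chart(trace)
for op_class in sorted(chart):
    row = chart[op_class]
    print(f"{op_class:6s} calls={row['calls']} min={row['min']:.1f} "
          f"avg={row['avg']:.1f} max={row['max']:.1f}")
```

The same structure extends to operation size and think-time if you record those per call as well.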
Key thing to remember in SCALE-OUT: many of the above tools are single-threaded or act on a single file. Now, I think you should know what a single thread and single stream will achieve, but you need to run the tests at scale. OneFS as a technology truly shines at scale, meaning run hundreds of threads over many connections with many file actors.
All of the above tools allow you to perform side-by-side comparisons in a set, repeatable way, meaning the data points are valuable. However, they may not be anything like your work-flow.
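To make the "many file actors" point concrete, here is a toy Python sketch in which each thread writes and reads back its own file and reports its elapsed time. It is an illustration only: a local temporary directory stands in for an NFS/SMB mount, the actor count and file size are arbitrary assumptions, and a real test would run many such processes across many clients and connections.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def one_actor(directory, index, size_bytes):
    """Write then read back one private file; return (bytes_read, seconds)."""
    path = os.path.join(directory, f"actor_{index}.dat")
    payload = os.urandom(size_bytes)
    start = time.perf_counter()
    with open(path, "wb") as handle:
        handle.write(payload)
    with open(path, "rb") as handle:
        read_back = handle.read()
    return len(read_back), time.perf_counter() - start


def run_actors(directory, actors=8, size_bytes=64 * 1024):
    """Run all actors concurrently, one file each, and collect their results."""
    with ThreadPoolExecutor(max_workers=actors) as pool:
        futures = [pool.submit(one_actor, directory, i, size_bytes)
                   for i in range(actors)]
        return [future.result() for future in futures]


# A scratch directory stands in for the mounted export under test.
with tempfile.TemporaryDirectory() as scratch:
    results = run_actors(scratch)

total_bytes = sum(count for count, _ in results)
slowest = max(seconds for _, seconds in results)
print(f"{len(results)} actors, {total_bytes} bytes, slowest {slowest:.4f}s")
```

Comparing the slowest actor against the single-thread number is a quick way to see where contention starts as you raise the actor count.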
OneFS is not a closed box solution, we let you login and allow for kernel sizing and tuning. We allow the storage or work-flow architect to change ondisk layouts, optimizing for STREAMING, CONCURRENCY or RANDOM I/O.
Google Searches: OneFS best practice site:emc.com
OneFS solutions site:emc.com
These will give you quick links to the best recipes for setting up for your type of work-flow.
We do offer advanced training for administration, or you can consult solution architects, who can be reached through your account team, to work through proof-of-concept tactics and strategy.
Thanks,
John
Peter_Sero
6 Operator
•
1.2K Posts
0
August 22nd, 2013 22:00
>I wouldn't be overly concerned. However, we should see uniform inode balance. If you have OFF_HOURS to allow the "autobalancelin" job, I would suggest you run that. If it does not better balance the inodes on drives. Please open up a support case. Let me know the case number and I can assist.
The current AutoBalanceLin is now 30% done, and since started yesterday,
those few disks low on inodes (~0,5 Mio) have lost ~10000 inodes, while
other disks in the same pool (~2 Mio inodes/disk) have gained inodes,
as also new data is written to the cluster. I will definitely watch this to its end
and, if no miracle appears, open a case, and keep you posted.
Thanks a lot!
-- Peter
Peter_Sero
6 Operator
•
1.2K Posts
0
August 23rd, 2013 01:00
> what tools do you use that are also available to customers to perform workload simulation. On block side i always used IOmeter, iozone but on NAS it's so different. Something that could simulate hundreds of cifs users ..etc.
In addition to John's excellent advice, I'd suggest using dsh or pdsh
for actually distributing the load generators to a larger number of clients.
Of course you can also use a distributed batch system (Grid Engine etc),
in particular if it already exists in your environment, or if you build
a dedicated heavy-duty test lab. Otherwise the installation might be overkill.
(Another -- pretty simple, but stable -- batch distribution
tool is http://cs.anu.edu.au/~bdm/autoson/)
-- Peter
dynamox
11 Legend
•
20.4K Posts
•
87.4K Points
0
August 23rd, 2013 06:00
It may not be a specific work-flow like genetics but something more generic, like simulating thousands of CIFS connections, something that you would see where Isilon is used as a traditional office file share.
cassij
19 Posts
0
August 23rd, 2013 11:00
Nice tips, thank you,
John
cassij
19 Posts
0
August 23rd, 2013 12:00
I typically use Windows IOzone and set up SSH between the hosts to drive a modest load. However IOzone is more about measuring IOtypes and not really a synthetic approximation of real work-flows. As a tip, a way around Cygwin/Windows admin permissions is the following and it's worth noting if you ever want to scale out IOzone on windows with the least know-how:
# Create mount junction points.
cd c:/
mklink /d node1_client1 \\172.91.240.17\data\client1
mklink /d node2_client1 \\172.91.240.18\data\client2
...
mklink /d node10_client10 \\172.91.240.50\data\client10
You would then create your iozone -+m config file, mapping to as many clients as you have, and then set the thread count to match:
-t 50
-+m file
winhost1 c:\node1_client1
winhost2 c:\node2_client1
...
winhost50 c:\node10_client10
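Writing fifty of those lines by hand gets tedious, so here is a small Python sketch that generates the client list in the two-column form shown above. The function name and the round-robin assignment of hosts to link paths are my own assumptions. As I understand iozone's documentation, each line of the -+m file also takes a third field, the path to the iozone executable on that client, so the sketch accepts it as an optional argument; check the documentation for your iozone version before relying on either form.

```python
def build_client_list(hosts, mount_paths, iozone_exe=None):
    """Return text for an iozone -+m client file, one line per host.

    Hosts are assigned to mount_paths round-robin (an assumption made here).
    iozone_exe, if given, is appended as a third field on every line.
    """
    if not mount_paths:
        raise ValueError("at least one mount path is required")
    lines = []
    for index, host in enumerate(hosts):
        fields = [host, mount_paths[index % len(mount_paths)]]
        if iozone_exe:
            fields.append(iozone_exe)
        lines.append(" ".join(fields))
    return "\n".join(lines) + "\n"


# Hypothetical hosts and link paths in the style of the listing above.
hosts = [f"winhost{n}" for n in range(1, 5)]
paths = [r"c:\node1_client1", r"c:\node2_client1"]
text = build_client_list(hosts, paths)
print(text, end="")
```

Write the returned text to the file you pass to -+m, and keep the -t value equal to the number of lines.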
I know there are lightweight VMs using Windows PE, but I have found that setup to be costly and I avoid it as too expensive on my time '-). I tend to look for the easiest and biggest hammer I can find that is relevant to the testing or measurements you want.
IOzone will create 1 file to pound on. Fileop, which I haven't run on Windows, allows you to create a more varied work-flow. I tend to run fileop more as a contention generator than a real-world simulator.
Now, as has been pointed out, there are GRID engines and you can run whatever you want under them. It comes down to how you want to monitor the results. I tend to like benchmark tools that measure the CLIENT side performance, as well as having a set of instrumentation and measurement on the server, i.e. isi statistics and InsightIQ for OneFS. In file serving you want to measure the CLIENT + NETWORK + SERVER. You then want to determine what your technology latencies are and where serialization is occurring.
I am not as savvy on multi-client Windows connectathon testing; it gets a bit complicated on Windows with all the fiddly GUI things, and all I want are raw numbers and simple CLI execution '-). Historically I would use Connectathon for Linux/NFS, just an example of the sort of thing to find (http://www.linux-nfs.org/wiki/index.php/Connectathon_test_suite)
Best,
John
mcbris
1 Rookie
•
77 Posts
0
August 23rd, 2013 13:00
This Ask the Expert event has now concluded, but the conversations continue in the Isilon Support Forum. Thanks to John Cassidy for all his advice and answers!
This discussion will be locked while John prepares a quick summary of the discussion highlights, after which other subject matter experts on Isilon are welcome to post their thoughts and experiences on the topics. Thank you to all those who participated in this event!
MRWA
83 Posts
0
August 26th, 2013 08:00
I have unlocked the thread, and while we wait for John to summarize, the discussion can continue.
JonK1
2 Intern
•
247 Posts
0
September 19th, 2013 23:00
Superb, thanks for your reply!