Unsolved
3 Posts
0
1570
April 22nd, 2008 08:00
LUN performance issues
Hi, I am wondering whether the performance issue on a 2TB file-share LUN is caused by the reported 360,753,830 Stripe Crossings. This LUN does not have the alignment offset, which I know is a best practice, but I would like to be sure that performance will noticeably improve before moving the data to a new LUN that does have the alignment offset.
Can anyone give some feedback on determining the root cause of the performance issue for this LUN? Some additional information: Read Throughput is 170 [IO/s], Write Throughput is 109 [IO/s], and utilization is 94.93%.


kelleg
6 Operator
•
4.5K Posts
0
April 22nd, 2008 10:00
1. Raid type
2. number of disks in the Raid Group
3. number of LUNs in the Raid Group
4. Total IO throughput for all LUNs in the RG
5. Queue Length for each of the LUNs in the RG
Stripe Crossings is a count of the times that your data crosses a stripe when you write. If you have a 4+1 R5, the stripe is 64KB (each disk) * 4 (data disks) = 256KB. If you write 1MB of data, you'll get 4 stripe crossings. So it is not always a good indication of a performance issue.
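To make the arithmetic above concrete, here is a minimal Python sketch that computes the stripe size and counts how many stripes a single write lands on. It is only an illustration: the helper names are made up for this example, the 63-sector starting offset is an assumed example of an unaligned partition, and how Navisphere itself counts a crossing is not specified in this thread.

```python
KB = 1024

def stripe_size(element_kb: int, data_disks: int) -> int:
    """Full-stripe size in bytes: element size per disk times data disks."""
    return element_kb * KB * data_disks

def stripes_touched(offset: int, length: int, stripe: int) -> int:
    """Number of distinct stripes an I/O of `length` bytes at `offset` lands on."""
    if length <= 0:
        return 0
    first = offset // stripe
    last = (offset + length - 1) // stripe
    return last - first + 1

# 4+1 RAID 5 with a 64KB element per disk, as in the post above.
stripe = stripe_size(64, 4)

# A 1MB write that starts exactly on a stripe boundary.
aligned = stripes_touched(0, 1024 * KB, stripe)

# The same write shifted by 63 sectors of 512 bytes (assumed example offset).
shifted = stripes_touched(63 * 512, 1024 * KB, stripe)

print(stripe // KB, aligned, shifted)
```

With these numbers the aligned 1MB write lands on 4 stripes and the shifted one on 5, which is why large sequential writes rack up a high count even when nothing is wrong.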
The offset alignment affects disk crossings, which will affect performance. The Best Practices guides have information on the types of IO that are more affected by this than others.
Some on this forum can probably provide you with specific examples of what they have seen.
Please see the below for additional information about Best Practices.
EMC CLARiiON Best Practices for Fibre Channel Storage: FLARE Release 26 Firmware Update - Best Practices Planning
http://powerlink.emc.com/km/live1/en_US/Offering_Technical/White_Paper/H2358_clariion_best_prac_fibre_chnl_wp_ldv.pdf
EMC CLARiiON Fibre Channel Storage Fundamentals - Technology Concepts and Business Considerations
http://powerlink.emc.com/km/live1/en_US/Offering_Technical/White_Paper/H1049_emc_clariion_fibre_channel_storage_fundamentals_ldv.pdf
regards,
glen kelley
Kiran3
410 Posts
0
April 22nd, 2008 21:00
mikel2031
3 Posts
0
April 23rd, 2008 07:00
1. Raid type is RAID 5
2. number of disks in the Raid Group is 9
3. number of LUNs in the Raid Group is 2, but the 2nd one has low host i/o.
4. Total IO throughput for all LUNs in the RG is similar to the single LUN
5. Queue Length for each of the LUNs in the RG: Where do I find this?
The SP cache has been enabled; where do I check whether the LUN cache is enabled?
Thanks
kelleg
6 Operator
•
4.5K Posts
1
April 23rd, 2008 15:00
Queue length is normally viewed when you view an Analyzer archive file.
In a Raid 5 using 9 disks, this is an 8+1. If each disk is a 10K FC disk, then the IO/s for the raid group would be 120 IOPS (the limit on each disk for best performance) * 8 (the number of non-parity disks) = 960 IOPS.
If your IOPS are exceeding this, then you will see higher queue lengths and higher response times.
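As a rough sketch of that estimate, the Python below compares the host IO figures quoted earlier in the thread (170 reads and 109 writes per second) against the 960 IOPS figure. The function names are hypothetical. The `write_penalty` parameter is an assumption that is not part of the estimate above: a value of 4 is the commonly cited rule of thumb for small random writes on RAID 5, and it is shown only to illustrate how backend disk load can be higher than host IO.

```python
def raid_group_iops_estimate(disks: int, parity_disks: int, per_disk_iops: int) -> int:
    """Rule-of-thumb IOPS for the raid group: non-parity disks times per-disk IOPS."""
    return (disks - parity_disks) * per_disk_iops

def backend_iops(read_iops: int, write_iops: int, write_penalty: int = 1) -> int:
    """Disk-side IOPS; write_penalty=1 means host IO is counted as-is."""
    return read_iops + write_iops * write_penalty

# 8+1 RAID 5 on 10K FC disks at 120 IOPS each, as described above.
limit = raid_group_iops_estimate(disks=9, parity_disks=1, per_disk_iops=120)

# Host IO reported in the original question.
host = backend_iops(170, 109)

# Same host IO with an assumed RAID 5 small-write penalty of 4.
with_penalty = backend_iops(170, 109, write_penalty=4)

print(limit, host, with_penalty)
print("over estimate" if with_penalty > limit else "under estimate")
```

On these figures the LUN stays under the 960 IOPS estimate either way, so queue lengths and response times from Analyzer are still needed to confirm where the time is going.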
glen
h1pan1
18 Posts
0
July 15th, 2010 07:00
Good Glen!
Henry
RyanP2
261 Posts
0
July 15th, 2010 09:00
The utilization statistic is a bit misleading, so be careful about saying there is an issue because of it. Utilization by definition "Describes the fraction of a certain observation period that the system component is busy serving incoming requests." (from Navisphere help). Basically, if the LUN had at least 1 IO to complete during every poll that was done, it shows as 100% busy.
By chance do you have the license for Navisphere Analyzer? If so, then you can review many other things that might help you figure out what is happening.
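A simplified Python model of that description shows why the number can mislead. It assumes utilization is just the share of polls that found at least one outstanding IO, which is a paraphrase of the explanation above and not necessarily how Navisphere computes it internally. The sample data is invented.

```python
def utilization(outstanding_per_poll: list[int]) -> float:
    """Fraction of polls where the component had at least one IO to complete."""
    if not outstanding_per_poll:
        return 0.0
    busy = sum(1 for n in outstanding_per_poll if n > 0)
    return busy / len(outstanding_per_poll)

# Invented samples: outstanding IOs seen at each of 20 polls.
light = [1] * 20          # one IO outstanding at every poll
heavy = [32] * 20         # a deep queue at every poll
mostly = [1] * 19 + [0]   # idle at a single poll

print(utilization(light), utilization(heavy), utilization(mostly))
```

The lightly loaded and heavily loaded cases both come out as 1.0 (100% busy), so utilization alone cannot tell them apart.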
First off you can check the queuing on the LUN and the drives under the LUN. If there is queuing on the drives, then as Glen was mentioning, you may be exceeding the physical capabilities of the drives. If there is queuing, then consider reviewing the response time and service time values. Response time is the average total time an IO saw for the chosen device, and the service time is the average time the operation took to complete. The difference is that response time factors in time spent in a queue and service time doesn't. If the values have a large difference, then the queuing is having a large impact.
The 120 mark for the 10k RPM drives along with bandwidth specs for each drive speed can be seen in the EMC Best Practices Guide. The 120 IOPS for 10k RPM drives was a benchmark test under very specific parameters, so the actual point where the drive bottlenecks depends on the type of IO, the locality of the IO, and the size of the IO. Also in there is a section on "Storage System Sizing and Performance Planning" which has a lot of good info. Other performance info can be found in Navisphere help under the section "Analyzing storage-system performance using Analyzer".
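The comparison between response time and service time can be sketched in a few lines of Python. The function names and the sample values (40 ms response time, 8 ms service time) are hypothetical and only illustrate the subtraction described above.

```python
def queue_time_ms(response_ms: float, service_ms: float) -> float:
    """Average time an IO spent waiting in a queue: response time minus service time."""
    return max(response_ms - service_ms, 0.0)

def queue_share(response_ms: float, service_ms: float) -> float:
    """Fraction of the response time that was spent queued."""
    if response_ms <= 0:
        return 0.0
    return queue_time_ms(response_ms, service_ms) / response_ms

# Hypothetical Analyzer readings for one LUN.
waited = queue_time_ms(40.0, 8.0)
share = queue_share(40.0, 8.0)

print(waited, share)
```

In this made-up case most of the 40 ms is spent waiting rather than being serviced, which is the pattern that points to queuing as the main contributor.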