LUN performance issues

Question

Hi, I am wondering if a performance issue on a 2TB file-share LUN is caused by the reported 360,753,830 Stripe Crossings. This LUN does not have the alignment offset, which I know is a best practice, but I would like to be sure that performance will be noticeably improved before moving the data to a new LUN that has the alignment offset.
Can anyone give some feedback on determining the root cause of the performance issue for this LUN? Some additional information: Read Throughput is 170 [IO/s], Write Throughput is 109 [IO/s], and utilization is 94.93%.

kelleg · Answer

Will need a bit more data:
1. RAID type
2. Number of disks in the RAID group
3. Number of LUNs in the RAID group
4. Total IO throughput for all LUNs in the RG
5. Queue length for each of the LUNs in the RG

Stripe Crossings is a measurement of the times that your data crosses a stripe when you write. If you have a 4+1 R5, the stripe is 64KB (each disk) * 4 (data disks) = 256KB. If you write 1MB of data, you'll get 4 stripe crossings. So a high count is not always a good indication of a performance issue.

The offset alignment affects disk crossings, which will affect performance. The Best Practices guides have information on the types of IO that are more affected by this than others.

Some on this forum can probably provide you with specific examples of what they have seen.

Please see the below for additional information about Best Practices.

EMC CLARiiON Best Practices for Fibre Channel Storage: FLARE Release 26 Firmware Update - Best Practices Planning
http://powerlink.emc.com/km/live1/en_US/Offering_Technical/White_Paper/H2358_clariion_best_prac_fibre_chnl_wp_ldv.pdf

EMC CLARiiON Fibre Channel Storage Fundamentals - Technology Concepts and Business Considerations
http://powerlink.emc.com/km/live1/en_US/Offering_Technical/White_Paper/H1049_emc_clariion_fibre_channel_storage_fundamentals_ldv.pdf

regards,
glen kelley
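To make the stripe arithmetic concrete, here is a rough Python sketch. It is not how Navisphere counts crossings (that is not confirmed here); it simply counts stripe boundaries that fall inside a single IO. The function names and the 63-sector starting offset (the usual unaligned partition start) are my own assumptions for illustration.

```python
KB = 1024

def stripe_size_bytes(element_kb: int, data_disks: int) -> int:
    """Full stripe size: per-disk element size times the number of data disks."""
    return element_kb * KB * data_disks

def stripe_boundaries_crossed(offset: int, length: int, stripe: int) -> int:
    """Count stripe boundaries that fall inside an IO of `length` bytes starting at `offset`."""
    if length <= 0:
        return 0
    first_stripe = offset // stripe
    last_stripe = (offset + length - 1) // stripe
    return last_stripe - first_stripe

# 4+1 RAID 5 with a 64 KB element per disk, as in the example above
stripe = stripe_size_bytes(64, 4)

# 1 MB write starting exactly on a stripe boundary
aligned = stripe_boundaries_crossed(0, 1024 * KB, stripe)

# Same write shifted by 63 sectors of 512 bytes (assumed unaligned start)
misaligned = stripe_boundaries_crossed(63 * 512, 1024 * KB, stripe)

print(stripe // KB, aligned, misaligned)
```

Under this way of counting, the aligned 1MB write touches four stripes, and the shifted one spills into a fifth. Large writes cross stripes no matter what, which is why the raw counter on its own says little.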

Kiran3 · Answer

Have the LUN cache and SP cache been enabled?

mikel2031 · Answer

Here's the information requested:
1. RAID type is RAID 5
2. Number of disks in the RAID group is 9
3. Number of LUNs in the RAID group is 2, but the 2nd one has low host I/O.
4. Total IO throughput for all LUNs in the RG is similar to that of the single LUN
5. Queue length for each of the LUNs in the RG: Where do I find this?

The SP cache has been enabled. Where do I check if the LUN cache is enabled?
Thanks

kelleg · Answer

To see the cache setting for a LUN, right-click on the LUN, select Properties, and then open the Cache tab.

Queue length is normally viewed in a Navisphere Analyzer archive file.

A RAID 5 using 9 disks is an 8+1. If each disk is a 10K FC disk, then the IO/s for the RAID group would be 120 IOPS (the limit on each disk for best performance) * 8 (the number of non-parity disks) = 960 IOPS.

If your IOPS are exceeding this, then you will see higher queue lengths and higher response times.
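As a quick check of that arithmetic against the numbers posted in the question, a minimal sketch (the function name is hypothetical, and 120 IOPS per disk is the rule-of-thumb figure quoted above, not a measured value):

```python
def raid5_group_iops(total_disks: int, per_disk_iops: int = 120) -> int:
    """Rule-of-thumb ceiling: per-disk IOPS times the number of non-parity disks."""
    data_disks = total_disks - 1  # RAID 5 gives up one disk's worth of capacity to parity
    return per_disk_iops * data_disks

ceiling = raid5_group_iops(9)   # 8+1 group of 10K FC disks
observed = 170 + 109            # read + write IO/s reported for the LUN
headroom = ceiling - observed

print(ceiling, observed, headroom)
```

Note that 170 and 109 are host-side figures for the LUN. The load the disks actually see can be different, so treat the comparison as a first pass only.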

glen

h1pan1 · Answer

Good Glen! Henry

RyanP2 · Answer

The utilization statistic is a bit misleading, so be careful about saying there is an issue because of it. Utilization by definition "Describes the fraction of a certain observation period that the system component is busy serving incoming requests." (from Navisphere help). Basically, if the LUN had at least 1 IO to complete during every poll that was done, it shows as 100% busy.

By chance, do you have the license for Navisphere Analyzer? If so, then you can review many other things that might help you figure out what is happening.
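A toy illustration of why this number can mislead, assuming utilization is simply the share of polls that found at least one IO outstanding (my simplification of the definition above, with made-up poll data):

```python
from typing import Sequence

def utilization(outstanding_per_poll: Sequence[int]) -> float:
    """Fraction of polls in which the device had at least one IO to complete."""
    if not outstanding_per_poll:
        return 0.0
    busy_polls = sum(1 for n in outstanding_per_poll if n > 0)
    return busy_polls / len(outstanding_per_poll)

# Hypothetical samples: one LUN always has a single IO outstanding,
# the other always has 32 queued up.
lightly_loaded = utilization([1] * 100)
heavily_loaded = utilization([32] * 100)

print(lightly_loaded, heavily_loaded)
```

Both cases come out as fully busy, even though only the second one is actually backed up. That is why queue length and response time are the better indicators.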

First off, you can check the queuing on the LUN and on the drives under the LUN. If there is queuing on the drives, then as Glen was mentioning, you may be exceeding the physical capabilities of the drives. If there is queuing, then consider reviewing the response time and service time values. Response time is the average total time an IO saw for the chosen device, and service time is the average time the operation took to complete. The difference is that response time factors in time spent in a queue and service time doesn't. If the two values differ by a large amount, then the queuing is having a large impact.

The 120 mark for the 10k RPM drives, along with bandwidth specs for each drive speed, can be seen in the EMC Best Practices Guide. The 120 IOPS for 10k RPM drives was a benchmark result under very specific parameters, so the actual point where the drive bottlenecks depends on the type, locality, and size of the IO. That guide also has a section on "Storage System Sizing and Performance Planning" which has a lot of good info. Other performance info can be found in Navisphere help under the section "Analyzing storage-system performance using Analyzer".
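The response time versus service time comparison can be sketched like this. The 40 ms and 8 ms values are invented for illustration and are not taken from this LUN; the function names are hypothetical too.

```python
def queue_wait_ms(response_ms: float, service_ms: float) -> float:
    """Average time an IO spent waiting in the queue (response minus service)."""
    return max(response_ms - service_ms, 0.0)

def queue_share(response_ms: float, service_ms: float) -> float:
    """Fraction of the response time that was spent queued rather than being serviced."""
    if response_ms <= 0:
        return 0.0
    return queue_wait_ms(response_ms, service_ms) / response_ms

# Hypothetical Analyzer readings for one device
wait = queue_wait_ms(40.0, 8.0)
share = queue_share(40.0, 8.0)

print(wait, share)
```

If the share is large, most of the latency is coming from waiting in line rather than from the drives doing the work, which points back at the queuing.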
