
23 Posts


May 24th, 2009 04:00

Confused on IOPS count in Analyzer

Hello All,

I'm a bit confused on IOPS count in my analyzer readings. I'm on a CX-500 (2.19 flare). Upon opening some analyzer archive files, I am seeing numbers like 7 and 14 million I/O/sec as average sample points, whereas I believe the CX-500 is rated at 120,000 IOPS total.

Is there a difference between I/O/sec and IOPS? Or do I have to divide by my sample interval or some other number? My interval is 600, so this would make sense if I were to divide by 600, but since the sample points seem to be indicated in I/O per sec, I took this to mean that was the number at that very second.

Thanks in advance and sorry if a dumb question.

261 Posts

May 26th, 2009 07:00

If you review the SP logs and find trespasses occurring every few seconds, then that is probably part of the problem, and maybe more.

I know that with PVLinks every setting needs to be correct to fully stop trespassing. If you can pull up Primus, here are a few articles to take a look at:

emc21180, emc143405, emc3038, emc44736, and emc99467.

-Ryan

261 Posts

May 24th, 2009 18:00

Not a dumb question at all.

First off, if you go into help in Navisphere there is a section called "analyzing storage system performance using analyzer" (or something along those lines). In there is a basic information section that has background info on the counters.

As for the I/O/s and IOPs, they are the same (both are IO per second).

The archive interval of 600 will give a pretty good broad look at the performance, and the points in there do represent IOPs, so no dividing of anything is needed. One thing to keep in mind when reviewing these is that each point represents 10 minutes of averaging, so they may not give as good a picture as you may want.

Now, as far as the 7 million IOPs goes, this is a "feature" of Analyzer :) . Some things that can cause these huge numbers are a trespass during that 10-minute interval, or just a missed poll due to the busyness of the array (IO takes priority over Analyzer). I can't speak to any known issues in Analyzer as I don't know your patch level (if you pull up the Analyzer release notes you can check into those).

I would disregard the points that just don't seem possible.
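As a rough illustration of that kind of filtering, here is a minimal Python sketch. It assumes the sample points have already been exported from the archive into a plain list of numbers, and it uses the 120,000 IOPS figure quoted earlier in this thread as the cutoff. The function name and the sample values are made up for the example (only 1536, 7 million and 14 million appear in the thread).

```python
# Cutoff taken from the rating quoted in the thread; treat it as an assumption,
# not a verified specification for the array.
RATED_MAX_IOPS = 120_000


def split_plausible(samples, ceiling=RATED_MAX_IOPS):
    """Split samples into (plausible, implausible) lists based on a ceiling."""
    plausible = [s for s in samples if 0 <= s <= ceiling]
    implausible = [s for s in samples if not (0 <= s <= ceiling)]
    return plausible, implausible


# Hypothetical I/O/sec sample points, one per 600-second archive interval.
samples = [850.0, 1536.0, 7_000_000.0, 1210.0, 14_000_000.0]

kept, dropped = split_plausible(samples)
print("kept:", kept)
print("dropped:", dropped)
```

If a large share of the points ends up in the dropped list, that suggests something systematic (such as frequent trespasses) rather than the occasional missed poll.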

-Ryan

23 Posts

May 25th, 2009 05:00

Hmmm....

There are just too many points to ignore. I am running 2 HP-UX hosts with PVLinks as opposed to Powerpath, and I do get many trespasses even though I have set PV timeout to 180.

I wonder if that is screwing with the IOPS count?

23 Posts

May 28th, 2009 06:00

Thanks much Randy.

I have seen these tech notes before, but will look into changing other values beyond the PVLinks timeouts.

23 Posts

June 18th, 2009 23:00

Hi Everyone. I just wanted to mention that I solved the issue. I figured that something had to be polling the paths every so often for them to be trespassing so much. I then noticed that 'proactive polling' was turned on (as it is by default). I turned it off by running (in HP-UX 11.11) pvchange -p n /dev/dsk/cXtXdX and voila! No more trespasses, none!

I waited a few days and then checked my archive and was happy to find that I was only peaking at 1536 I/O/sec.

My users have also noticed an improvement in performance, which is also a plus.

After searching again through powerlink, I noticed that this issue was documented in emc143405. Interestingly, I did not have the -p option on my dev/test box until I happened to upgrade to the patches mentioned in that powerlink article.

Hopefully this might help someone.

Take care and thanks for all the help!

1.3K Posts

February 10th, 2010 04:00

Proactive polling was added in an LVM patch several months ago. Unfortunately, it was turned on automatically for every disk in the system. For internal disks, it's fine as the LVM code periodically polls all the disks to see if they are still alive. The idea is to possibly catch an error condition before real data is affected. So for JBOD (non-array disks like internal) it is fine.

However, for smart disk arrays such as EMC, Hitachi, IBM, Axiom, HP, etc., for path managers such as PowerPath, DynaPath, etc., and for specialized SAN appliances such as products from FalconStor, this proactive polling interferes with performance and is a waste of CPU and I/O cycles. In a smart array, all management of disk failures is completely hidden from the host (as it should be).

You can remove proactive polling with pvchange -p n . The pvchange command will not have the -p option if the LVM patch level is quite old. You can do it during production, but for volumes with lots of alternate paths it can take 3-6 seconds per lvol. This is a one-time change. Unfortunately, there is no way to change the default setting, so any new disks (LUNs) you add will require turning the polling off.
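Since the setting has to be applied to each disk separately, one way to keep track is to generate the commands for review before running anything. The sketch below is only an illustration: it assumes you already have a list of physical volume paths, the paths shown are hypothetical placeholders, and the only command form used is the pvchange -p n invocation given in this thread. It prints the commands and does not execute them.

```python
import shlex


def polling_off_commands(pv_paths):
    """Build one 'pvchange -p n <path>' command string per physical volume path."""
    return [
        " ".join(["pvchange", "-p", "n", shlex.quote(path)])
        for path in pv_paths
    ]


# Hypothetical device paths; substitute the array LUN paths from your own system.
pv_paths = ["/dev/dsk/c4t0d1", "/dev/dsk/c6t0d1"]

for command in polling_off_commands(pv_paths):
    # Printed for review only; nothing is run against the system here.
    print(command)
```

Printing rather than executing leaves room to check the list against the volume group layout first, and to leave internal disks alone, where the polling is described above as harmless.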
