QFULL and Execution Throttle / queue depth

Question

Hello everybody, I'm in a long and painful discussion with a colleague of mine about preventing QFULLs from the storage ports. For those who don't know what QFULLs are: they are small messages sent by a storage port when its outstanding IOs hit the port's ceiling of 1600 (on VNX). If a new IO reaches a port that has already hit its limit, the port sends a QFULL message back to the initiator, and the OS that owns that HBA has to deal with it. In the old days OSs might have responded with one of those nice BSODs, but nowadays IOs are paused and, after a little while, the host slowly starts sending them to the storage ports again.

I noticed that a few years ago the (QLogic) 'execution throttle' had a max value of 256, but nowadays I'm seeing values as high as 64k (!!!). Bear in mind that a single storage port on a Clariion or VNX can only deal with 1600 outstanding IOs. So if a server sends out too many IOs that end up in the 'outstanding IO queue' on the CX/VNX, these nasty little QFULLs start flowing in again. I've seen VNX 5700s getting hundreds of them every few minutes or so (in Analyzer real-time view), so I can imagine that the customer will notice delays all the time.

The way we can avoid triggering this annoying QFULL protection mechanism is to set the HBA's 'execution throttle' to a more sensible level like 32 or 16, depending on the number of servers attached to each port and the number of LUNs each server actually uses. For VMware there are 2 knowledge base articles you might want to read: http://kb.vmware.com/kb/1267 and http://kb.vmware.com/kb/1268.

My question here is: does anyone other than me actually use this setting? I know customers who do, but also a few who don't, and I'm looking for a piece of documentation I can use to convince my colleague to start using the setting.
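To make the sizing argument in the question concrete, here is a rough Python sketch of the worst-case arithmetic. It assumes the host-side limit behaves as a per-LUN queue depth and that every server fills every LUN queue at the same moment; both are simplifying assumptions, and the function names and example numbers are hypothetical, not taken from any EMC or QLogic document. Only the 1600 port limit comes from the post.

```python
PORT_QUEUE_LIMIT = 1600  # outstanding IOs per storage port, as stated in the post


def worst_case_outstanding(servers, luns_per_server, queue_depth):
    """Worst-case outstanding IOs on one port if every LUN queue is full.

    Assumes queue_depth applies per LUN on each server (an assumption).
    """
    return servers * luns_per_server * queue_depth


def max_safe_queue_depth(servers, luns_per_server, port_limit=PORT_QUEUE_LIMIT):
    """Largest per-LUN queue depth that cannot overrun the port in the worst case."""
    return port_limit // (servers * luns_per_server)


if __name__ == "__main__":
    # Hypothetical example: 10 servers zoned to one port, 5 LUNs each.
    print(worst_case_outstanding(10, 5, 256))  # 12800, far above 1600
    print(max_safe_queue_depth(10, 5))         # 32
```

In practice servers rarely fill all their queues at once, so this is a conservative upper bound rather than a prediction of when QFULLs will actually appear.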

jpveen · Answer

Hi RRR,

In most cases the VNX port queue depth of 1600 is not a bottleneck. If you run ALUA across 4 paths and distribute LUNs across both SPs, you effectively have a max queue depth of 6400 (4 x 1600) for your ESX cluster(s). Personally, I don't see this as a bottleneck in most of my large customer configs.

However, there is another source of queue full conditions: the LUN queue full. The queue depth is also limited per LUN, which means that a deep enough queue on a single LUN can cause a QFULL to be reported to the host. In my experience, QFULL conditions on servers are mostly caused by LUN queue full conditions, not by port queue fulls.

LUN queue full conditions occur quite quickly. With Flare 31 and R5 4+1 pools, the LUN queue full is triggered with 88 IOs in the queue. So even with a pool of 100 drives, the max queue depth of a LUN in this pool is 88. With Flare 32 this number increases, depending on the number of data drives in the pool. For example, with 20 drives in a pool this number rises to 224 IOs. See also emc204523. Especially with a small number of large LUNs in a VMFS config, you can hit this limit sooner than the port queue limit.

And finally, what are we doing with VMware? I always make customers aware of the possibility of queue fulls. But I usually don't recommend lowering the HBA queue depth settings across every environment. What should you set them to? In my opinion it's better to monitor QFULL conditions and take action on specific servers/environments if needed.

Adaptive queue depth throttling was introduced in ESX 3.5. In my opinion this is a better approach than lowering HBA queue depths globally. In ESXi 5 it's even possible to configure adaptive queue depth on an individual disk basis.

So to summarize my opinion:

- Monitor for Queue Full conditions
- Be aware of the lun queue limits
- In ESX environments use adaptive queue depth rather than limiting all HBA instances.
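As a rough illustration of what adaptive queue depth means, the sketch below models a host that halves a LUN's queue depth when the array reports QFULL and then raises it by one after a run of good completions. This is a simplified toy model, not VMware's actual algorithm; the class name, the `recover_after` parameter, and the default values are all hypothetical.

```python
class AdaptiveQueueDepth:
    """Toy model: halve the queue depth on QFULL, recover one step at a time."""

    def __init__(self, max_depth=32, min_depth=1, recover_after=4):
        self.max_depth = max_depth          # configured (maximum) queue depth
        self.min_depth = min_depth          # never throttle below this
        self.recover_after = recover_after  # good completions needed per +1 step
        self.depth = max_depth              # current effective queue depth
        self._good = 0                      # good completions since last change

    def on_qfull(self):
        """The array returned QFULL: cut the effective depth in half."""
        self.depth = max(self.min_depth, self.depth // 2)
        self._good = 0

    def on_good(self):
        """An IO completed normally: creep back up toward the maximum."""
        if self.depth >= self.max_depth:
            return
        self._good += 1
        if self._good >= self.recover_after:
            self.depth += 1
            self._good = 0


if __name__ == "__main__":
    q = AdaptiveQueueDepth(max_depth=32, recover_after=4)
    q.on_qfull()
    q.on_qfull()
    print(q.depth)  # 8
    for _ in range(4):
        q.on_good()
    print(q.depth)  # 9
```

The point of the design is that only the LUNs that actually hit a queue full get throttled, and only for as long as the condition lasts, instead of every HBA being capped permanently.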

RRR · Answer

Thanks for explaining this. I didn't know these things about VMware.

dynamox · Answer

Can I see the LUN queue in Analyzer?

kelleg · Answer

In Analyzer you can look at the Queue Length or the Average Busy Queue Length (the latter is better, as it shows what the queue is when the object is busy). This is the queue referred to above, with a limit of (14 * data disks) + 32. For example, R5 <4+1> would be (14 * 4) + 32 = 88.
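The formula above is easy to check in a couple of lines of Python. The function name is hypothetical, and the sketch only restates the (14 * data disks) + 32 rule as given in this post for an R5 <4+1> group; it makes no claim about how larger pools or later Flare releases are calculated.

```python
def lun_queue_limit(data_disks):
    """LUN queue limit per the formula in this thread: (14 * data disks) + 32."""
    return 14 * data_disks + 32


if __name__ == "__main__":
    # R5 <4+1>: 4 data disks plus 1 parity disk
    print(lun_queue_limit(4))  # 88
```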

glen

dynamox · Answer

Thank you. I never understood what "optimal" and "nonoptimal" next to the counter mean?

kelleg · Answer

That's supposed to show when access to the data goes over the CMI bus, as with an ALUA trespass. It's not working yet. Optimal path = direct path; non-optimal path = ALUA using the CMI bus. glen

RRR · Answer

Wow, it hardly happens that you don't know some specific thing

RRR · Answer

Thanks. I knew the queue length for a LUN was 88, but I never knew a LUN queue FULL was an actual status as well.

dynamox · Answer

RRR wrote: Wow, it hardly happens that you don't know some specific thing

happens all the time

RRR · Answer

That can only mean 1 thing: you’re not a bot after all

alkhvo · Answer

Hi all! And what about the LUN queue length on MCx?

RRR · Answer

Hello alkhvo: as far as I know it remains the same with MCx.
