1 Rookie • 103 Posts

March 31st, 2009 11:00

Host=Windows 2003 64bit
Array=6.26

2 Intern • 1.3K Posts

March 31st, 2009 11:00

Which OS?

2.2K Posts

March 31st, 2009 15:00

Glen,
When did the storport recommendation change? We have been deploying 950903 based on the E-Lab and support recommendations from a month or so ago. Now when I look in E-Lab, all the recommendations are for 943545.

That sucks; I am not looking forward to going back, removing the new storport, and installing the older one.

4 Operator • 4.5K Posts

March 31st, 2009 15:00

Make sure that you get the QLogic drivers and installation guide from the EMC section on QLogic's web site - the drivers are set up for CLARiiON, and the install guide has EMC's recommended settings. Also, be sure to install Microsoft hotfix 943545 as recommended on the site - this is very important for performance. Finally, check to see if you have 950903 installed - if so, you should remove it; we've seen issues with that hotfix.

glen

45 Posts

April 1st, 2009 07:00

I am not crazy about the high EMC NVRAM default value for the QLogic execution throttle. I discovered the issue on QLA2340s when I started using Invista (because it sends alerts for queue-full events).

The CLARiiON will respond with a queue-full for two different events. This information comes directly from the CLARiiON Best Practices Planning guide for FLARE 26.

"A high degree of request concurrency is usually desirable, and results in a good return on investment. However, if an array's queues are too full, it will respond with a queue-full flow control command. The CX3 front-end port drivers return a queue-full status command under two conditions:
- The total number of concurrent host requests at the port is 1,984 (internally, the port value is 2048, but 64 requests are reserved for special commands)
- The total number of requests for a given LUN at a given port is (14 * (the number of data drives in the LUN)) + 32.

The host response to a queue-full is HBA-dependent, but it typically results in a suspension of activity for more than one second. Though rare, this can have serious consequences on throughput if this happens repeatedly."


I was having serious issues with some of my high-I/O servers at the default of 255. I had some LUNs on 4+4 R1/0, which has four data drives, so the calculation is 14*4+32=88. Therefore, whenever I had more than 88 requests for a given LUN on a given port, I was getting queue-full events.

I have since manually changed the execution throttle on all of my QLogic cards down to 64 and no longer get queue-full events. I have had only good results from this change. If I understand and remember it correctly, the QLogic waits a random amount of time, up to 1 second, before retrying I/Os after getting a queue-full. Can anyone confirm or deny this?

I have notified a few people in EMC support that the default EMC NVRAM setting of 255 is too high for most environments and should be set lower, but I have not seen a change. You would need 16 data drives in a RAID group to NOT hit the queue-full message if you were using one CLARiiON port. Unfortunately, the max number of spindles in a RG is 16, and at least one of those must be a parity or mirror drive, so it's impossible to have 16 data spindles in a RG. This means your LUN would need to be a metaLUN across at least two RAID groups to avoid this threshold. I am not sure why EMC changed the value to 255 from QLogic's default of 16 (as found in the QLA2340 help document).

Please use the above calculation to determine what your execution throttle should be set to, and make it a little bit lower. Or just set it to 64 and you'll be good to go in most cases (e.g., 4+1, 4+4, and anything larger). If you're using a bunch of 1+1s, you'll need to go lower.
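To make the arithmetic concrete, here's a quick back-of-the-envelope script - a minimal sketch that just restates the formulas quoted above (the names are mine, not from any EMC tool):

```python
# Sketch of the FLARE 26 queue-full limits quoted above; the names here
# are mine, not from any EMC tool.

# Condition 1: port-wide ceiling (2048 internal minus 64 reserved).
PORT_LIMIT = 2048 - 64
print(PORT_LIMIT)           # 1984 concurrent requests per front-end port

# Condition 2: per-LUN, per-port limit.
def lun_queue_limit(data_drives):
    return 14 * data_drives + 32

print(lun_queue_limit(4))   # 88  -> a 4+4 R1/0 LUN hits queue-full past 88
print(lun_queue_limit(16))  # 256 -> only 16 data drives would tolerate
                            #        the NVRAM default throttle of 255
```

Take the smallest limit among the LUNs your host touches and set the throttle a bit below that.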

I found that there is a setting on the QLogic called "Enable Extended Error Logging" which is supposed to log queue-full events in the event log. There is also a counter in NaviAnalyzer on the SP port that is supposed to show queue-full events, but I have had issues with it not working properly on some versions of FLARE (sorry, I can't recall which ones it did and didn't work on).

I hope this information can help you and others out there who are having poor disk performance due to queue full events.

4 Operator • 4.5K Posts

April 1st, 2009 07:00

An erratum was posted in the Jan. ESM to cover the change until the Feb. ESM was published.

glen

1 Rookie • 103 Posts

April 1st, 2009 09:00

Thanks, Glen, I will look into this.

1 Rookie • 103 Posts

April 1st, 2009 11:00

shewitt,

Great info!

I'm going to see if changing the execution throttle makes a difference. Where is "Enable Extended Error Logging" located? Can it be adjusted in SANsurfer?

Here's another question for you all... based on the (14 * (number of data drives in a LUN)) + 32 rule, would I still use it if my server is configured as follows?

DB lun=2x4+1
indexes/misc db=6+1
tran logs=3+3
tempdb/temp logs=2+2
backup drive=6+1

If I added this correctly, it's about 566? Isn't the max execution throttle 256?

45 Posts

April 1st, 2009 18:00

The way I understand it, you need to set the execution throttle to work for the smallest LUN (smallest as in the fewest data disks in the RG). In your example, your smallest is the 2+2, so 14*2+32=60 (see the sketch below).
As far as I can tell, on the QLA2340 cards I have, this is a global setting for the entire HBA, so it needs to be set low enough to handle the "smallest" LUN. The Emulex cards I use (LPe11000) specifically say the setting is per LUN - "Outstanding Requests on a per Lun or Target Basis (see QueueTarget)". In either case, it still needs to be sized for the "smallest" LUN.
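To put numbers on your whole list (a quick sketch - the data-drive counts are just my reading of your configs):

```python
# Apply (14 * data drives) + 32 to each LUN from the list above, then
# size the global HBA throttle for the smallest result, not the sum.
luns = {
    "DB (4+1)":        4,
    "indexes (6+1)":   6,
    "tran logs (3+3)": 3,
    "tempdb (2+2)":    2,
    "backup (6+1)":    6,
}

limits = {name: 14 * drives + 32 for name, drives in luns.items()}
for name, limit in limits.items():
    print(name, limit)

print(min(limits.values()))  # 60, from the 2+2 - the value to stay under
```

That's also why adding the limits up to ~566 doesn't apply - the global throttle has to fit the most restrictive LUN.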

If I were in your situation, I would change the value to 60 on the QLogic card and see if you notice any performance changes.
Also, check NaviAnalyzer for queue-full errors on the SP ports. As mentioned before, I did not see them in Analyzer even though I was getting them.
The "Enable Extended Error Logging" option on the QLA2340 is a checkbox on the "Settings" tab under "Advanced HBA Port Settings" on the left side.

Is there anyone else out there who has seen results similar to mine? Can anyone confirm my understanding of this setting?
It made a huge, positive impact in my environment, so I'm hoping it helps you as well.

75 Posts

May 5th, 2011 13:00

Hi,

I have a CX4-480 running FLARE 30 patch 509 with 35 ESX vSphere 4.1 hosts attached.

I have ALUA and Round Robin set; the hosts can see:

1) 3x datastores made of 3x extents as follows: 15x 300 GB 15K HDDs (three 4+1 RAID 5 groups), divided into three and striped across the groups with a metaLUN (stripe multiplier 4)

2) 4x datastores of 6x 146 GB 15K drives in RAID 10

3) 2x datastores made of 2x RAID 10 groups of 4x 450 GB 15K drives, striped with a metaLUN

With all of these disks, I see a lot of queuing on my SPs when I do operations in my VMware environment (Storage vMotion, VM cloning, etc.).

I have the Execution Throttle set to 256 on all of my hosts... could it be too high?

75 Posts

May 5th, 2011 15:00

Thanks for your answer.

Could this setting be too low for ESX?

I saw that Primus, but could this change create some kind of bottleneck on my ESX hosts' HBAs?

4 Operator • 4.5K Posts

May 5th, 2011 15:00

See Support Solution emc204523, "What is the cause of high queuing on CLARiiON drives?", on PowerLink.

The Execution Throttle for ESX should be 32 (the default); 256 is way too high.
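For reference, plugging the layouts described above into the (14 * data drives) + 32 formula from earlier in the thread (a rough sketch - the data-drive counts are my reading of those configs, so treat them as assumptions):

```python
# Rough check of the earlier queue-full formula against the layouts above;
# the data-drive counts are assumptions based on my reading of the configs.
def lun_queue_limit(data_drives):
    return 14 * data_drives + 32

print(lun_queue_limit(12))  # 200 - metaLUN across three 4+1 R5 groups
print(lun_queue_limit(3))   # 74  - 6-drive RAID 10 (3 data drives)
print(lun_queue_limit(4))   # 88  - metaLUN of two 4-drive RAID 10 groups
```

Even the largest limit here is 200, so a throttle of 256 can overrun every one of these LUNs, while 32 stays safely below all of them.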

glen

4 Operator • 4.5K Posts

May 9th, 2011 14:00

If the Execution Throttle for QLogic HBAs on ESX is set to 256, that is too high - set it to 32, then test performance.

glen
