72 Posts

February 17th, 2014 07:00

I did install SANHQ and am keeping a close eye on it. Firmware and drivers are a little tricky at the moment, since I need to migrate everything off the existing storage and onto EQL and I don't want to make things worse. I'll check the VMware HCL and the versions as soon as I can. Thanks for the suggestion.

72 Posts

February 17th, 2014 07:00

I have followed that document: I disabled Delayed ACK and set LRO to 0. As for Round Robin, I'm using the Dell EQL routed PSP.

That document was a little confusing, but I think I'm okay with the Dell PSP and don't have to adjust Round Robin (the document seems to indicate that if I'm not using MEM, I should use Round Robin and adjust it).

1 Rookie

 • 

88 Posts

February 17th, 2014 07:00

Hi,

Have you checked that you have the latest NIC firmware and also the latest drivers from the VMware support site?

Install EqualLogic SANHQ to see if this offers any further clues.

7 Technologist

 • 

729 Posts

February 17th, 2014 07:00

You should ensure that the settings on all your ESX hosts are configured as per this EqualLogic and ESX Best Practices document:

en.community.dell.com/.../20434601.aspx

Typically this is caused by misconfigured Delayed ACK and/or LRO settings.

Also, you should check that MPIO is set to Round Robin with the IOs per path at 3. (Note that the default for RR is 1000 IOs before it actually starts to switch paths.)
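
As a rough sketch of the Round Robin adjustment, the small Python helper below only builds the `esxcli` command line that sets the IOPS limit for one device, so it can be reviewed before being run on a host. It applies to the native VMware Round Robin PSP, not to the Dell MEM PSP. The helper name `rr_iops_command` and the device ID `naa.EXAMPLE` are made up for illustration; substitute the real `naa.*` identifier of an EqualLogic volume.

```python
import shlex


def rr_iops_command(device_id, iops=3):
    """Build the esxcli argument list that sets the Round Robin IOPS limit for one device."""
    if iops < 1:
        raise ValueError("iops must be at least 1")
    return [
        "esxcli", "storage", "nmp", "psp", "roundrobin", "deviceconfig", "set",
        "--type=iops", "--iops={}".format(iops), "--device={}".format(device_id),
    ]


# Hypothetical device identifier, used only to show the resulting command line.
cmd = rr_iops_command("naa.EXAMPLE")
print(shlex.join(cmd))
```

The printed line can be pasted into an ESXi shell once per volume. Check the result afterwards on the host before assuming the setting took effect.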

-joe

72 Posts

February 17th, 2014 09:00

The VMs only have 1 disk, and the 1 or 2 with multiple disks do have multiple controllers. It seems like the spikes are too quick for SANHQ to pick up a lot of the time. With esxtop, I can see the disk (and guest) read/write latency jump to 30ms and then back down. I also noticed that DQLEN changes between 32 and 128. Is this normal?

Also, when the queue depth reported in SANHQ goes up, is that a bad thing?

72 Posts

February 17th, 2014 10:00

I'm looking at the "u" disk device view and I can see the disk/kernel/VM values. Kernel seems minimal; it's the disk and VM values that spike.

SANHQ seems to be in the 5ms range, with only a couple of spikes to 10 or 20ms. Do I need to increase the polling frequency to catch the spikes (and if so, how)? In the live view, I can see spikes to 60, 100... I even saw a 200ms.

As for Delayed ACK, I have rebooted each host after changing the value, so I would assume I'm OK.

Are my expectations too high? Are spikes like this normal as long as they're not sustained for more than a few seconds?
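
To illustrate why a tool that averages over a polling window could report about 5ms while a live view shows 200ms, here is a minimal sketch using made-up numbers. The sample values and the 60-sample window are assumptions chosen for illustration only; they are not SANHQ's actual polling interval or real measurements from this environment.

```python
from statistics import mean


def polled_averages(samples_ms, window):
    """Average consecutive latency samples over fixed-size polling windows."""
    return [mean(samples_ms[i:i + window]) for i in range(0, len(samples_ms), window)]


# Made-up data: 120 one-second samples at 4 ms with a single 200 ms spike.
samples = [4.0] * 120
samples[30] = 200.0

live_peak = max(samples)                         # what a per-second view would show
polled_peak = max(polled_averages(samples, 60))  # what a 60-sample average would show

print("live peak: {:.1f} ms".format(live_peak))
print("polled peak: {:.1f} ms".format(polled_peak))
```

With these numbers the single spike is diluted to a single-digit average in its window, which is the same pattern as the gap between the two tools.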
