November 1st, 2012 13:00

ESXi 5.1, Intel X520, PS6110XV - Latency issue

Hello,

Configuration

Setting up our first R710 host running ESXi 5.1, with a dual-port Intel X520 10GbE card for sw iSCSI, connected to a PS6110XV through Force10 S4810 switches.

DCB is disabled on the PS array. No warnings/errors in the logs. It's running the 5.2.5 firmware.

The EQ MEM 1.1.1 is installed with storage heartbeat properly configured on the ESXi host, and appears to be working fine. DelayedAck is disabled, and iSCSI login timeout is set to 60s.
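For reference, a rough sketch of how those two settings can be checked and set from the ESXi shell -- vmhba37 is just a placeholder for the software iSCSI adapter on this host, and I'm assuming DelayedAck and LoginTimeout match the key names that esxcli reports:

    # show adapter-level parameters (DelayedAck, LoginTimeout, etc.)
    esxcli iscsi adapter param get --adapter=vmhba37

    # disable DelayedAck and set the iSCSI login timeout to 60 seconds
    esxcli iscsi adapter param set --adapter=vmhba37 --key=DelayedAck --value=false
    esxcli iscsi adapter param set --adapter=vmhba37 --key=LoginTimeout --value=60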

The Force10 switches are stacked, with jumbo frames enabled and flow control set to rx on, tx on.
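Roughly, the per-port config on the S4810s looks like the sketch below -- the interface number and description are examples, not our exact config:

    ! iSCSI-facing port on each S4810 (example interface number)
    interface TenGigabitEthernet 0/4
     description iSCSI
     mtu 12000
     flowcontrol rx on tx on
     switchport
     no shutdown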

A test virtual machine is set up and running on a test volume.

Problem

I'm seeing high latency on the host. Using the vSphere Client to watch read/write latency on the sw iSCSI HBA, the max has shot up to 7000 ms in the past hour, with a couple of other bumps in the 1000-2000 ms range. In between these peaks, latency generally stays low, less than 10 ms. While configuring the Intel NICs, I noticed I cannot enable flow control. The default setting is autonegotiate on, rx off, tx off. I've tried a couple of ways to enable it, and even added it to the startup script, but it still shows as disabled when viewed with the ethtool -a vmnicX command.
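Roughly what I have been trying from the ESXi shell, with vmnic4 standing in for one of the X520 ports:

    # try to force pause frames on (this is also what went into the startup script)
    ethtool --pause vmnic4 autoneg off rx on tx on

    # check the result -- it still reports rx off / tx off for me
    ethtool -a vmnic4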

Can someone point me in the right direction on where to look for the problem? This is my first run with 10GbE; our current production setup uses ESX 4.1u2 at 1GbE and runs fairly well, certainly with no latency above 100 ms.


December 11th, 2012 12:00

Did you see that the OP's issue was resolved by Force10?

It looks like we have resolved the issue, and by we, I mean Force10 tech support.

To recap: after setting up our new EqualLogic array, connected to two Force10 S4810 switches and an ESXi 5.1 host using an Intel X520 NIC, we were having connection issues. 5-8% retransmits (as measured from SAN HQ), high latency spikes on the ESXi host, EqualLogic MPIO not working properly (usually only one connection per volume at a time), frequent reconnections to the VMFS volume, and pings to the SAN dropping; basically it was unusable.

We began by troubleshooting the ESXi host, but that led nowhere, other than a helpful driver update that allowed me to enable flow control on the NIC.

Eventually we figured out that if we shut off one of the switches, the problem went away. Then we figured out that if we left both switches up but pulled the plug on one of the Port Channel connections, the problem went away.

The F10 S4810 switches are connected to our 1 GbE SAN via a 10 GbE port channel across two interfaces, one on each F10 switch. By looking at each stack's config, the F10 tech support agent figured out that the Cisco 3750Es had a dynamic port-channel config while the F10s had a static config. So he rebuilt the port channel on the F10s as dynamic, and for the last couple of hours it has been working correctly: no retransmits, no latency spikes, no connection drops. I never would have guessed the port channel wasn't working right, because it never dropped a link. But the Force10 tech was very methodical and eventually figured out the port channel was to blame.
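For anyone hitting the same thing, the change is roughly the difference between the two configs sketched below (interface and port-channel numbers are examples from memory, not our exact setup):

    ! static port channel on the S4810s (what we had) -- members are added directly
    interface Port-channel 1
     channel-member TenGigabitEthernet 0/48
     no shutdown

    ! dynamic (LACP) port channel (what the Force10 tech rebuilt it as)
    interface TenGigabitEthernet 0/48
     port-channel-protocol LACP
      port-channel 1 mode active
     no shutdown

    ! the Cisco 3750E side was already dynamic, along the lines of:
    !  interface TenGigabitEthernet1/0/1
    !   channel-group 1 mode active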

Anyway, I hope this helps someone out there someday. It was definitely a strange problem that required turning switches on and off and pulling cables to track down.

December 11th, 2012 12:00


Having the same issue here; I've tried all of the above methods of resolution, with no luck yet...

Has anyone come across any certified/confirmed fixes yet?


December 26th, 2012 09:00

Hello Searching...

I do not know anything about managing PowerConnect switches; I recommend you follow up with a Dell support ticket.

I have been told that the latest Force10 firmware allows you to mix 10GbE and 1GbE traffic with flow control enabled and no performance issues. So perhaps migrating all of your connections to the Force10 equipment would be a good option for you.

As for creating a 1 GbE sw iSCSI adapter and a 10 GbE sw iSCSI adapter, no, we have not done that; VMware only supports one software iSCSI initiator per host. We are migrating hosts to 10 GbE, and during the transition we will have two EqualLogic groups, one with all 1 GbE arrays and another with all 10 GbE arrays. The 1 GbE hosts will continue to connect to the 1 GbE arrays, and the migrated hosts will only connect to the 10 GbE arrays. We'll have two clusters of hosts until we are all migrated to 10 GbE.

See this for more information http://pubs.vmware.com/vsphere-50/index.jsp?topic=/com.vmware.vsphere.storage.doc_50/GUID-99BB81AC-5342-45E5-BF67-8D43647FAD31.html


December 26th, 2012 09:00

Re: SW iSCSI adapter. If you had the 1 GbE and 10 GbE arrays on different IP subnets and not in the same group, you could in effect do that, since the SW iSCSI adapter binds VMkernel ports that each have a physical NIC assigned to them. You can create VMkernel ports (which iSCSI traffic is driven through) that use the 10 GbE NICs for the 10 GbE arrays and the 1 GbE NICs for the 1 GbE arrays.

Again, not for mixed 10GbE/1GbE groups.
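As a rough sketch of that binding, assuming vmk1 sits on a 10GbE NIC/subnet, vmk2 on a 1GbE NIC/subnet, and vmhba37 is the software iSCSI adapter (all placeholders):

    # bind a VMkernel port on the 10GbE subnet to the software iSCSI adapter
    esxcli iscsi networkportal add --adapter=vmhba37 --nic=vmk1

    # bind a VMkernel port on the 1GbE subnet to the same adapter
    esxcli iscsi networkportal add --adapter=vmhba37 --nic=vmk2

    # verify the bindings
    esxcli iscsi networkportal list --adapter=vmhba37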
