January 25th, 2012 20:00

Multiple iSCSI paths (aka Multipath) on ESXi v5 with PS6100XS

Configuration:

  • EqualLogic PS6100XS with 5.1.2 firmware, with half (2) of the ports on each CM plugged into one switch and half into the other switch. A dedicated management network is also configured
  • 2 ProCurve 2910al 24-port switches, W.14.69 firmware. LACP connects the two switches using 4 ports (2 on each switch). Half of each switch is dedicated to data traffic and the other half to iSCSI traffic.
  • HP DL380 G7 server with 2 NICs for data and 2 NICs for iSCSI, running ESXi 5.0.0 build 474610 on VMware Essentials (not Plus)
I've followed the steps here  http://yourmacguy.wordpress.com/2009/11/09/vsphere-iscsi-multipathing/ to enable multipathing. I've tried many variations, but the closest I can get is to first disable one of the NICs (via the ProCurve web UI) before following the steps. If I do this, I can actually get ESXi to show me two Active connections in the Manage Paths interface on my LUNs. However, if I enable the second NIC, I lose all access to the SAN. I can't even ping it from the ESXi server. Either NIC by itself (with the other disabled) works just fine with the SAN.
At first I thought it was a licensing limitation (Essentials vs. Essentials Plus), but I've confirmed that Essentials is capable of multipathing. I'm wondering if it's a setting on the switches. I've got flow control and jumbo frames enabled. I've also been able to enable NIC teaming on the data side of the server (I know this is not to be done on the iSCSI side, and it's not).

61 Posts

January 26th, 2012 16:00

Just to close the loop on this:

Tech support had me do several things:

  1. Fixed the MTU (set it to 9000 - I had done this before, but had forgotten it the last time I recreated the network); there's a rough command sketch at the end of this post
  2. Changed the IP address of the heartbeat network to be an IP address lower than the IPs for the iSCSI traffic (I don't think I'd read this anywhere else)
  3. That still didn't fix it, so we removed the whole network and started over from scratch
  4. That fixed it if I had both NICs plugged into the same switch, but not if they were separated across two switches
  5. I changed the LACP connection to a Trunk and tagged all the VLANs we use on that trunk (I couldn't find a way to put VLAN tags on an LACP connection)
Success! I can now take even drastic steps, such as yanking the power to one of the switches, and everything stays up with nary a single ping lost. Very pleased. Thanks, Don, for your help.
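
For anyone finding this thread later, the MTU piece from step 1 was roughly the following (vSwitch1 and vmk2 are the names from my setup; yours may differ).

Set jumbo frames on the iSCSI vSwitch:

#esxcfg-vswitch -m 9000 vSwitch1

Set jumbo frames on each iSCSI VMkernel port:

#esxcli network ip interface set --interface-name=vmk2 --mtu=9000

Then verify with:

#esxcfg-vmknic -l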

5 Practitioner • 274.2K Posts

January 25th, 2012 21:00

Sounds like a spanning tree problem. On general principle, upgrade the array firmware to 5.2.0.

I would first try just using one switch.  Remove the LACP link cables as well.

Even an incorrect iSCSI MPIO configuration won't cause what you are seeing.

You should also upgrade ESXi to the current build, 515841. That has two important iSCSI fixes: one for a problem that can cause restoring iSCSI connections to take hours, and another that lets you change the timeout value for iSCSI logins.

There's a Tech Report on the EqualLogic support site that will show you how to configure iSCSI MPIO with ESXi v5. The link you provided was for ESX v4.x. It's a lot easier to do in v5 since you can use the ESX GUI for everything.
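
If you'd rather do it from the command line, the binding itself boils down to something like this. The vmhba33 adapter name is just an example; check yours with esxcli iscsi adapter list, and use your own iSCSI VMkernel port numbers.

Enable the software iSCSI adapter if it isn't already:

#esxcli iscsi software set --enabled=true

Bind each iSCSI VMkernel port to the software iSCSI adapter (each one must have exactly one active uplink, everything else Unused):

#esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk1

#esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk2

Then rescan so the paths show up:

#esxcli storage core adapter rescan --adapter=vmhba33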

One question: is the array on the same subnet as the VMkernel ports? Once you enable MPIO, you can't route on those VMK ports any more.

One thing I'm not positive about is whether the HP 2910al uses the ProVision chipset or not. That's required to run both flow control and jumbo frames at the same time. It's not related to this issue, but it could lead to a performance problem in the long run.

Regards,

61 Posts

January 26th, 2012 10:00

I know that there can be only one NIC per VMkernel port, and I thought I had it configured that way. I read about the storage heartbeat, but it sounded as though that was a best practice rather than a requirement, so I hadn't worried about it yet. Here's the output of those commands:

~ # esxcfg-vmknic -l

Interface  Port Group/DVPort   IP Family IP Address                              Netmask         Broadcast       MAC Address       MTU     TSO MSS   Enabled Type                

vmk0       Management Network  IPv4      10.0.0.23                               255.255.255.0   10.0.0.255      b4:99:ba:01:b2:34 1500    65535     true    STATIC              

vmk1       iSCSI1              IPv4      10.9.9.23                               255.255.255.0   10.9.9.255      00:50:56:7d:79:18 9000    65535     true    STATIC              

vmk2       iSCSI2              IPv4      10.9.9.24                               255.255.255.0   10.9.9.255      00:50:56:7a:84:a3 1500    65535     true    STATIC              

~ # esxcfg-route -l

VMkernel Routes:

Network          Netmask          Gateway          Interface      

10.0.0.0         255.255.255.0    Local Subnet     vmk0          

10.9.9.0         255.255.255.0    Local Subnet     vmk1          

default          0.0.0.0          10.9.9.1         vmk1          

~ # esxcfg-vswitch -l

Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks  

vSwitch0         128         4           128               1500    vmnic0,vmnic1

 PortGroup Name        VLAN ID  Used Ports  Uplinks  

 VM Voice              210      0           vmnic0,vmnic1

 VM Network            200      0           vmnic0,vmnic1

 Management Network    0        1           vmnic0,vmnic1

Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks  

vSwitch1         128         5           128               9000    vmnic2    

 PortGroup Name        VLAN ID  Used Ports  Uplinks  

 VM Network 2          0        0           vmnic2    

 iSCSI2                0        1           vmnic3    

 iSCSI1                0        1           vmnic2    

~ #

5 Practitioner • 274.2K Posts

January 26th, 2012 10:00

One thing you need for sure is the "Storage Heartbeat" VMK port I mentioned. If you pull the cable associated with vmk1, pings will fail, since that's the default port for that subnet.

The second issue is that vmk2 isn't set for jumbo frames; the MTU is at 1500. That will cause problems as well.

Dell Support can help you rebuild the configuration to add the heartbeat VMK port in. You'll end up having to unbind vmk1 so it can become the Storage Heartbeat, and create a vmk3 to replace vmk1. The lowest VMK port in a given subnet becomes the default port used.
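
Very roughly, that rebuild looks something like this from the CLI. The adapter name (vmhba33), the port group name, and the IP address are placeholders; support will walk you through the exact values for your setup.

Unbind vmk1 from the software iSCSI adapter so it can serve as the Storage Heartbeat:

#esxcli iscsi networkportal remove --adapter=vmhba33 --nic=vmk1

Fix the MTU on vmk2:

#esxcli network ip interface set --interface-name=vmk2 --mtu=9000

Create a new port group and VMkernel port (it will come up as vmk3) and bind it in place of vmk1:

#esxcfg-vswitch -A iSCSI3 vSwitch1

#esxcfg-vmknic -a -i 10.9.9.25 -n 255.255.255.0 -m 9000 iSCSI3

#esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk3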

5 Practitioner • 274.2K Posts

January 26th, 2012 10:00

You can only have one active NIC per iSCSI VMkernel port. The other(s) HAVE to be set to Unused; otherwise it won't use that VMK for iSCSI. It has to be a 1:1 relationship.
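
From the command line, that per-port-group override looks something like this (the port group and vmnic names here are just examples; use your own).

Give each iSCSI port group exactly one active uplink:

#esxcli network vswitch standard portgroup policy failover set --portgroup-name=iSCSI1 --active-uplinks=vmnic2

#esxcli network vswitch standard portgroup policy failover set --portgroup-name=iSCSI2 --active-uplinks=vmnic3

And check the result with:

#esxcli network vswitch standard portgroup policy failover get --portgroup-name=iSCSI1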

Which Tech Report did you use?  You have to use the one for ESXi v5.  It also talks about creating a "Storage Heartbeat" VMK port, and NOT using VMK0 for iSCSI.

Send the output of the following and I can take a quick look at the configuration.

#esxcfg-vmknic -l

#esxcfg-route -l

#esxcfg-vswitch -l

Best bet for quick resolution may be to open a support case with Dell/EQL to get your configuration reviewed.

Regards,

Don

61 Posts

January 26th, 2012 10:00

Thanks, Don-

I followed all of your suggestions: updated the array firmware to 5.2.0, upgraded ESXi to build 515841, read through the Tech Report, and disabled the second switch. I confirmed that the 2910 switches can do both jumbo frames and flow control. Also, to answer your question: yes, the VMkernel ports are on the same subnet as the array. I also installed and began using MEM.

So, after all that, I have good news and bad news. The good news is that if I disconnect the second switch and plug both NICs from the server into the main switch, I can still ping the SAN (previously, having both NICs plugged in disabled my ability to ping). The bad news is that it still seems to be using only one of the ports, because if I disconnect the one it's using, I can no longer ping the SAN.

The strange new development is that the second pNIC on my iSCSI vSwitch now reports "Not used" as its status. The other one still says "1000/full". It seems as though it's somehow determining that it can't/shouldn't use both NICs.

Since I'm only using one switch, I don't think that spanning tree is the issue here, right?

5 Practitioner • 274.2K Posts

January 26th, 2012 16:00

You are most welcome.

FYI: It's not the IP address that has to be lower; that won't make any difference. The Storage Heartbeat (SHB) must be on the lowest VMkernel PORT (i.e., vmk1) in use on the iSCSI subnet.

If you run #esxcfg-route -l you will see which VMK port is the default for each subnet and which VMK port is associated with the default gateway.

If the SHB isn't set up correctly and a switch fails or restarts, connectivity might not be restored in a timely fashion, which could cause VMs to crash.

Regards,

5 Practitioner • 274.2K Posts

January 27th, 2012 11:00

FYI:  Here's a link to the updated iSCSI config guide for ESXi v5 and EQL

Configuring iSCSI connectivity with VMware vStorage 5 and Dell EqualLogic PS Series Storage

en.community.dell.com/.../19997606.aspx

This Technical Report will explain how to configure and connect a Dell™ EqualLogic™ PS Series SAN to a VMware® vSphere™ 5 Environment using the software iSCSI initiator.
