1 Rookie

 • 

61 Posts

125508

January 25th, 2012 20:00

Multiple iSCSI paths (aka Multipath) on ESXi v5 with PS6100XS

Configuration:

  • EqualLogic PS6100XS with 5.1.2 firmware, with half (2) of the ports on each CM plugged into one switch and half into the other switch. A dedicated management network is also configured.
  • 2 ProCurve 2910al 24-port switches, W.14.69 firmware. LACP connects on 4 ports (2 on each switch). Half of each switch is dedicated to data traffic and the other half is for iSCSI traffic.
  • HP G7 DL380 server with 2 NICs for data and 2 NICs for iSCSI, running ESXi 5.0.0, 474610 on VMware Essentials (not Plus).

I've followed the steps here http://yourmacguy.wordpress.com/2009/11/09/vsphere-iscsi-multipathing/ to enable multipathing. I've tried many variations, but the closest I can get is by first disabling (via the ProCurve web UI) one of the NICs before following the steps. If I do this, I can actually get ESXi to show me two Active connections in the Manage Paths interface on my LUNs. However, if I enable the second NIC, I lose all access to the SAN. I can't even ping it from the ESXi server. Either NIC by itself (with the other disabled) works just fine with the SAN.

At first I thought it was a limitation of Essentials (as opposed to Essentials Plus), but I've confirmed that Essentials is capable of multipathing. I'm wondering if it's a setting on the switches. I've got flow control and jumbo frames enabled. I've also been able to enable NIC Teaming on the data side of the server (I know this is not to be done on the iSCSI side, and it's not).

January 26th, 2012 16:00

Just to close the loop on this:

Tech support had me do several things:

  1. Fixed the MTU (to 9000 - I had done this before, but had forgotten the last time I recreated it)
  2. Changed the IP address of the heartbeat network to be an IP address lower than the IPs for the iSCSI traffic (I don't think I'd read this anywhere else)
  3. That still didn't fix it, so we removed the whole network and started over from scratch
  4. That fixed it if I had both NICs plugged into the same switch, but not if they were separated across two switches
  5. I changed the LACP connection to a Trunk and tagged all the VLANs we use on that trunk (I couldn't find a way to put VLAN tags on an LACP connection)

Success! I can now take even drastic steps, such as yanking the power to one of the switches, and everything stays up with nary a single ping lost. Very pleased. Thanks, Don, for your help.
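
For anyone repeating steps 1 and 2, here is a minimal Python sketch of two sanity checks. It is an illustration under stated assumptions, not anything tech support supplied. The first function computes the largest ICMP payload that fits in one unfragmented IPv4 packet at a given MTU, which is the size usually passed to a don't-fragment ping (for example `vmkping -d -s <size>`) to confirm jumbo frames end to end. The second checks the ordering rule from step 2 as described in this post. The heartbeat address 10.9.9.22 is hypothetical; the two iSCSI addresses are the ones shown in the `esxcfg-vmknic -l` output elsewhere in this thread.

```python
import ipaddress

IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def heartbeat_is_lowest(heartbeat_ip, iscsi_ips):
    """True if the heartbeat address sorts below every iSCSI address."""
    hb = ipaddress.ip_address(heartbeat_ip)
    return all(hb < ipaddress.ip_address(ip) for ip in iscsi_ips)


# 10.9.9.22 is a made-up heartbeat address used only for illustration.
print(max_ping_payload(9000))                                        # 8972
print(heartbeat_is_lowest("10.9.9.22", ["10.9.9.23", "10.9.9.24"]))  # True
```

If a don't-fragment ping at that payload size fails while a default-size ping succeeds, some hop on the path is probably still at a smaller MTU.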

January 26th, 2012 10:00

I know that there can be only one NIC per VMkernel port, and I thought I had it configured that way. I read about the storage heartbeat, but it sounded as though that was a best practice, but not required, so I hadn't worried about it yet. Here's the output of those commands:

~ # esxcfg-vmknic -l
Interface  Port Group/DVPort   IP Family IP Address                              Netmask         Broadcast       MAC Address       MTU     TSO MSS   Enabled Type
vmk0       Management Network  IPv4      10.0.0.23                               255.255.255.0   10.0.0.255      b4:99:ba:01:b2:34 1500    65535     true    STATIC
vmk1       iSCSI1              IPv4      10.9.9.23                               255.255.255.0   10.9.9.255      00:50:56:7d:79:18 9000    65535     true    STATIC
vmk2       iSCSI2              IPv4      10.9.9.24                               255.255.255.0   10.9.9.255      00:50:56:7a:84:a3 1500    65535     true    STATIC

~ # esxcfg-route -l
VMkernel Routes:
Network          Netmask          Gateway          Interface
10.0.0.0         255.255.255.0    Local Subnet     vmk0
10.9.9.0         255.255.255.0    Local Subnet     vmk1
default          0.0.0.0          10.9.9.1         vmk1

~ # esxcfg-vswitch -l
Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
vSwitch0         128         4           128               1500    vmnic0,vmnic1

 PortGroup Name        VLAN ID  Used Ports  Uplinks
 VM Voice              210      0           vmnic0,vmnic1
 VM Network            200      0           vmnic0,vmnic1
 Management Network    0        1           vmnic0,vmnic1

Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
vSwitch1         128         5           128               9000    vmnic2

 PortGroup Name        VLAN ID  Used Ports  Uplinks
 VM Network 2          0        0           vmnic2
 iSCSI2                0        1           vmnic3
 iSCSI1                0        1           vmnic2

~ #
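
One thing that is easy to miss in the `esxcfg-vmknic -l` output above is that vmk2 is at MTU 1500 while vmk1 and vSwitch1 are at 9000. The following Python sketch shows one way such a mismatch could be flagged automatically. The rows are copied from the output above with whitespace collapsed; the function names and the choice to match on the 10.9.9.0/24 subnet are assumptions made for illustration.

```python
import ipaddress

# Rows copied from the `esxcfg-vmknic -l` output in this post.
VMKNIC_ROWS = """\
vmk0 Management Network IPv4 10.0.0.23 255.255.255.0 10.0.0.255 b4:99:ba:01:b2:34 1500 65535 true STATIC
vmk1 iSCSI1 IPv4 10.9.9.23 255.255.255.0 10.9.9.255 00:50:56:7d:79:18 9000 65535 true STATIC
vmk2 iSCSI2 IPv4 10.9.9.24 255.255.255.0 10.9.9.255 00:50:56:7a:84:a3 1500 65535 true STATIC
"""


def parse_vmknic(text):
    """Return one dict per vmk interface with its port group, IP and MTU."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].startswith("vmk"):
            continue  # skip headers and blank lines
        # Port group names may contain spaces, so find the "IPv4" column
        # and index the remaining columns relative to it.
        fam = fields.index("IPv4")
        rows.append({
            "interface": fields[0],
            "portgroup": " ".join(fields[1:fam]),
            "ip": fields[fam + 1],
            "mtu": int(fields[fam + 5]),
        })
    return rows


def mtu_mismatches(rows, subnet, expected_mtu):
    """Interfaces inside `subnet` whose MTU differs from `expected_mtu`."""
    net = ipaddress.ip_network(subnet)
    return [r["interface"] for r in rows
            if ipaddress.ip_address(r["ip"]) in net and r["mtu"] != expected_mtu]


print(mtu_mismatches(parse_vmknic(VMKNIC_ROWS), "10.9.9.0/24", 9000))  # ['vmk2']
```

The sketch only handles IPv4 rows in the layout shown here; other column layouts would need a different parser.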

January 26th, 2012 10:00

Thanks, Don-

I followed all of your suggestions: updated to 5.2.0, upgraded to 515841, read through the Tech Report, and disabled the second switch. I confirmed that 2910 switches can do both jumbo frames and flow control. Also, to answer your question: yes, the VMkernel ports are on the same subnet as the array. I also installed and began using MEM.

So, after all that, I have good news and bad news. The good news is that, if I disconnect the second switch and plug both NICs from the server into the main switch, I can still ping the SAN (previously having both NICs plugged in disabled my ability to ping). The bad news is that it still seems to be only using one of the ports, because if I disconnect the one it's using, I can no longer ping the SAN.

The strange new development is that the second pNIC on my iSCSI vSwitch now reports "No used" as its status. The other one still says "1000/full". It seems as though it's somehow determining that it can't/shouldn't use both NICs.

Since I'm only using one switch, I don't think that spanning tree is the issue here, right?
