
8 Posts


January 23rd, 2012 00:00

VNXe has no ALUA

Good Day

First off, well done to the experts on this forum.

Question: I have found out that the VNXe currently does not support ALUA. Is that correct?

Reason/Cause: Because ALUA is not supported/enabled, ESXi suffers system downtime, as no traffic is being sent via switch 2.

Scenario:

1x VNXe 3300

1x iSCSI server (on SPA) assigned two IPs.

2x HP ProCurve 2910al (not stacked, as they cannot be). Each acts as a standalone switch, as we do not want a stacked switch (basically a single unit).

ESXi hosts with multiple VMkernel ports assigned to the iSCSI vSwitch. Two links per switch.

Jumbo frames enabled.

If we connect everything to a single switch, we get 4 paths per datastore and everything works 100%.

  • Ports 5-8 (EMC VNXe SPA) connect to switch 1, and ports 9-12 (EMC VNXe SPB) should connect to switch 2
    • This is for High Availability
  • For ESXi server 02, the following iSCSI vmk IPs have been assigned:
    • Quad NIC card 01:
      • ESX01: Port13 -10.1.1.30 to switch 1
      • ESX01: Port14 -10.1.2.30 to switch 2
    • Quad NIC card 02:
      • ESX01: Port15 -10.1.1.31 to switch 1
      • ESX01: Port16 -10.1.2.31 to switch 2
  • All paths have been configured redundantly, and all 4 iSCSI paths on the ESXi host load-balance 100%, as seen in the single-switch configuration (ports 13-16)
  • EMC LACP on the switch works 100% (ports 5-8)
  • EMC SPB (ports 9-12) does not receive any traffic at all, because only a single iSCSI target has been configured and it resides on EMC VNXe SPA
  • SAN iSCSI configuration: EMC has confirmed the configuration below is correct. A support call has been logged with EMC, reference SR#XXXX (request it from me if required).

[Diagram of the iSCSI cabling was attached here.]

Ignore the extra line on the right of the ESX hosts in the diagram (a mistake); there are of course only two lines.

Problem being experienced: As soon as we introduce switch 2 for redundancy, no traffic is sent over switch 2 via the connected vmknics. If we rescan the datastores, we suddenly see only 2 paths instead of 4. If we move all connections to switch 2, everything works.

  1. If we connect vmknic 5 & vmknic 7 to switch 1 (how it should be)
    1. We can only vmkping iSCSI target 10.1.1.10
    2. We cannot vmkping iSCSI target 10.1.2.10
  2. If we connect vmknic 4 & vmknic 6 to switch 2 (how it should be)
    1. We cannot ping anything
    2. If I manually fail over the iSCSI target IP 10.1.2.10 to SPB (switch 2), I can suddenly vmkping iSCSI target 10.1.2.10
  3. If we connect vmknic 5 & vmknic 4 to switch 1
    1. We can ping iSCSI targets 10.1.1.10 & 10.1.2.10
  4. If we connect vmknic 6 & vmknic 7 to switch 2, we cannot ping anything
    1. Even if we manually fail over to SPB (switch 2), we still cannot ping anything.

The above does not make sense at all. On a single switch we can vmkping via all iSCSI vmk ports, yet as soon as we split the configuration for HA we lose paths. This causes ESXi to panic and freeze the whole server, which causes downtime, as I suspect it cannot connect properly to the iSCSI targets.

TAKE NOTE: In the single-switch configuration it works 100%. We can remove three of the iSCSI cables from the ESXi hosts one by one and it fails over to the remaining active paths. When we plug them back in, everything returns to normal. The same applies to the EMC VNXe.
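To make the reachability tests reproducible, here is a minimal sketch of the checks run from the ESXi shell. The vmk names (vmk1/vmk2) are assumptions for this setup, so substitute your own bound vmkernel ports; the -I option to pin the source interface is only available on newer ESXi builds, otherwise vmkping picks the outgoing vmk from the routing table:

    # Basic reachability from a specific iSCSI vmkernel port
    vmkping -I vmk1 10.1.1.10
    vmkping -I vmk2 10.1.2.10

    # Jumbo frame check: 8972-byte payload with don't-fragment set, so it fails
    # if any hop (vSwitch, physical switch, SP port) is not running MTU 9000
    vmkping -I vmk1 -d -s 8972 10.1.1.10

    # Brief listing of the storage paths ESXi actually sees
    # (should show 4 paths per datastore when both fabrics are healthy)
    esxcfg-mpath -b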

I hope what I am trying to show makes sense.

Thank you

4 Posts

January 23rd, 2012 10:00

I don't believe the VNXe uses ALUA. It does have an internal Fail Safe Network so traffic can be routed to SPA from an interface on SPB. If the ports on SPA go down (or switch 1 fails), the IP addresses will migrate to SPB, though the iSCSI service will remain on SPA.

If we connect vmknic 5 & vmknic 7 to switch 1 (how it should be)

We can only vmkping iSCSI target 10.1.1.10

We cannot vmkping iSCSI target 10.1.2.10

This is a problem. It sounds like these vmknics don't know they can reach the 10.1.2 network.

If we connect vmknic 4 & vmknic 6 to switch 2 (how it should be)

We cannot ping anything

With just 4 & 6 connected to switch 2 (and no connections to switch 1), this is expected.

If we connect vmknic 5 & vmknic 4 to switch 1

We can ping iSCSI targets 10.1.1.10 & 10.1.2.10

Then this is what you should do. Also connect vmknic 6 & 7 to switch 2; vmknic 6 & 7 will only be used if switch 1 fails.

If you want traffic to travel over switch 1 and switch 2 in normal operation, you should break up the LACP and connect half of the ports on SPA to switch 1 and half to switch 2, and do the same for SPB.
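As a rough sketch of that layout (generic port numbers rather than the exact VNXe interface labels; the key points are that each SP gets one leg on each switch and that port 1 of both SPs lands on the same switch):

    SPA port 1 (10.1.1.x) -> switch 1
    SPA port 2 (10.1.2.x) -> switch 2
    SPB port 1            -> switch 1
    SPB port 2            -> switch 2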

2 Intern


666 Posts

January 24th, 2012 04:00

I moved this post to the VNXe Support Forum from the Ask the Expert Forum. This ensures your question is in the right community and gets visibility from all VNXe Support community members.

Regards,

Mark

8 Posts

January 24th, 2012 04:00

Hi

Thank you for your reply.

"This is a problem. It sounds like these vmknics don't know they can reach the 10.1.2 network."

The weird thing is that if all ports are on a single switch, all IPs can be reached, which just does not make sense to me.

"Then this is what you should do. Also connect vmknic 6 & 7 to switch 2; vmknic 6 & 7 will only be used if switch 1 fails."

How can it fail over if ESXi only sees the 2 online paths? If everything is connected to a single switch, we can see 4 paths.

"If you want traffic to travel over switch 1 and switch 2 in normal operation, you should break up the LACP and connect half of the ports on SPA to switch 1 and half to switch 2, and do the same for SPB."

Can you please explain this in more detail?

8 Posts

January 24th, 2012 05:00

Thank you.

4 Posts

January 24th, 2012 06:00

"This is a problem. It sounds like these vmknics don't know they can reach the 10.1.2 network."

The weird thing is that if all ports are on a single switch, all IPs can be reached, which just does not make sense to me.

ESX thinks it can reach the 10.1.2 network only over port 14 and port 16. If those are plugged into switch 1, you'll be able to ping the 10.1.2 iSCSI targets. If they're plugged into switch 2, you won't (because SPA is only connected to switch 1).

How can it fail over if ESXi only sees the 2 online paths? If everything is connected to a single switch, we can see 4 paths.

You only need the two paths. If you break up the LACP and run SPA to switch 1 and switch 2, then those two paths will go over both switches. If you don't break up the LACP, the redundancy will come from Failsafe Networking, which will route traffic from the SPB ports (on switch 2) to SPA via the VNXe's internal network if all the ports on SPA go down.

Regardless, I think there is an ESX configuration issue. You want ESX to think it can reach both the 10.1.1 and 10.1.2 networks over switch 1 (and that switch 1 should be preferred, with failover to switch 2).

Why do you have multiple IP addresses for the iSCSI target if it's all going over the same switch? Unless you need the LACP for another reason (CIFS or NFS bandwidth, for example), you should run SPA-port1 (10.1.1) to switch 1, SPA-port2 (10.1.2) to switch 2, SPB-port1 to switch 1, and SPB-port2 to switch 2.
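To double-check the binding from the ESXi side after re-cabling, something like the following should do (vmhba33 is only a placeholder for your software iSCSI adapter name; the first command is the ESX/ESXi 4.x syntax, the second the ESXi 5.x equivalent):

    # List the vmknics bound to the software iSCSI adapter (ESX/ESXi 4.x)
    esxcli swiscsi nic list -d vmhba33

    # Same information on ESXi 5.x
    esxcli iscsi networkportal list -A vmhba33

    # Each datastore should then show one path per bound vmknic
    esxcfg-mpath -b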

8 Posts

January 25th, 2012 01:00

Thank you for the feedback

We created the LACP for higher throughput.

If I break the LACP, I will only have 2x 1Gb/s throughput. To achieve more, I would then have to add more iSCSI targets, for example ones that reside on SPB. Currently we have only implemented one.


8 Posts

January 25th, 2012 23:00

From reading various support articles and the comments in this support forum, I conclude the following:

VNXe does not support ALUA

When using iSCSI, do not use LACP. Rather, implement more iSCSI servers and present them to ESXi

To achieve HA from a SAN perspective, use stacked switches. Separate switches will not work, unless you link both switches via a trunk (sized so as not to create a bottleneck)
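If we go the route of more iSCSI servers instead of LACP, I assume the load balancing on the ESXi side would then come from the round-robin path selection policy per device rather than from link aggregation. A minimal sketch (the naa ID is a placeholder; the first command is the ESX/ESXi 4.x syntax, the second the ESXi 5.x equivalent):

    # Set the path selection policy to round robin for one device (ESX/ESXi 4.x)
    esxcli nmp device setpolicy --device naa.xxxxxxxxxxxxxxxx --psp VMW_PSP_RR

    # ESXi 5.x equivalent
    esxcli storage nmp device set --device naa.xxxxxxxxxxxxxxxx --psp VMW_PSP_RR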

4 Posts

February 10th, 2012 06:00

Separate switches will work -- but SPAport1 needs to be connected to the same switch as SPBport1, and SPAport2 needs to be connected to the same switch as SPBport2, etc.

Stacking works fine, but be aware that in some implementations upgrading the software on the stack may require an outage for the whole stack.

1 Rookie


75 Posts

February 14th, 2012 10:00

I'm not sure if I got all the ports and vmknics figured out correctly. Are those switches connected together?

Ducan123 wrote:

We created the LACP for higher throughput.

If I break the LACP, I will only have 2x 1Gb/s throughput. To achieve more, I would then have to add more iSCSI targets, for example ones that reside on SPB. Currently we have only implemented one.

Even with a 4-port LACP, your maximum throughput will still be the throughput of two ports.

These might be helpful:

Ask the Expert: VNXe front-end Networks with VMware

Ask The Expert Wrap Up - VNXe Front-end Networks with VMware

@henriwithani


8 Posts

February 14th, 2012 22:00

Hi

The vmknics are configured correctly, as has been confirmed by other team members.

The two switches have not yet been connected together, but that is the next test I will do during the next scheduled maintenance.

Will update the post as soon as I am done.
