AndMar2
1 Nickel

Failover mode and Initiator Type best Practices for VNX5300 VSphere 5 iSCSI

Hello guys,

I am configuring a VNX5300 to connect to vSphere 5.0 over iSCSI. According to the following link, the best practice is to configure the VNX this way, with the native multipathing plugin:

NMP / VMW_SATP_DEFAULT_AA / VMW_PSP_FIXED

http://partnerweb.vmware.com/comp_guide2/detail.php?deviceCategory=san&productid=19518

I tried several configurations on the VNX, but I never get VMW_SATP_DEFAULT_AA on the vSphere server; I always get VMW_SATP_CX.

Could you please help me understand how to configure the Failover Mode and the Initiator Type for the vSphere host on the VNX array?

Thanks in advance

Andrew

12 Replies
christopher_ime
4 Ruthenium


Firstly let me direct you to two very good documents regarding vSphere and CX/VNX connectivity.  They are both available from PowerLink via the breadcrumb trails below:

1) EMC Host Connectivity Guide for VMware ESX Server
Home > Support > Technical Documentation and Advisories > Host Connectivity/HBAs > Installation/Configuration

2) TechBook: Using EMC VNX Storage with VMware vSphere

Home > Support > Technical Documentation and Advisories > TechBooks


SATP (Storage Array Type Plug-in) and (Unisphere) Host initiator Failover Modes
================================================================
Firstly, VMW_SATP_* is a reference to an SATP (Storage Array Type Plug-in), and the name changes depending on the vendor's implementation of the extension to the PSA (Pluggable Storage Architecture) framework.  When connected to a VNX/CLARiiON you will only use/see one of two (and will never see the generic VMW_SATP_DEFAULT_AA):

1) VMW_SATP_CX
2) VMW_SATP_ALUA_CX

These are defined by the host initiators' "Failover Mode" as set in "Connectivity Status" within Unisphere.  With your environment, there are only two possible choices for ESXi (and ESX for 4.x).

1) Failover Mode 1 (FM1) = PNR (Passive Not Ready)


2) Failover Mode 4 (FM4) = ALUA (Asymmetric Logical Unit Access)

In summary, the Failover Mode defines how the array responds to I/O arriving via a path to the non-owning SP.  Without going into it here, I'd like to refer you to an old but still relevant white paper; its diagram showing the redirection should speak the proverbial "thousand words":

http://www.emc.com/collateral/hardware/white-papers/h2890-emc-clariion-asymm-active-wp.pdf

It is fair to say that where supported, ALUA (simulating an ACTIVE/ACTIVE array model reducing the trespass requirements) is the best choice but the array and the host have to support it.  In regards to ESX/ESXi, ALUA is supported with:

1) ESX/ESXi 4.x (this is when it was first introduced by VMware)

2) FLARE 28.5 patch 704 (or newer)

a) ALUA array support was actually released with FLARE 26 (this is when EMC first introduced it, but the initial implementation only supports SCSI-2)

b) With the VNX, ALUA has been supported since its initial release

Therefore, from your comment about always getting VMW_SATP_CX, this means that your Failover Mode is set to 1 (or it was changed to 4 and the ESXi 5 servers weren't yet rebooted as is required when changing from one to another but I'm assuming this is not the case for you).  So your first consideration should be to change the Failover Mode to 4 (ALUA) since you meet/exceed the requirements above.  This is possible within Unisphere:

1) Using the "Failover Wizard" in the menu to the right when clicking into "Hosts"

2) Or, for each path associated with the registered host in "Connectivity Status":

a) Highlight the registered host (will select all paths associated with it or can select individual paths)

b) Use same settings as before but only modify the "Failover Mode"

Then when updated:

3) Reboot the ESXi 5 hosts

4) Confirm the host recognizes the modified settings (Storage Array Type = VMW_SATP_ALUA_CX)
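The relationship between the Failover Mode and the SATP the host reports can be written down as a small lookup. This is only a sketch of the mapping described in this reply; the function names and the wording of the hints are illustrative assumptions, not anything produced by Unisphere or ESXi.

```python
# Mapping described in this thread: the Failover Mode set for the host
# initiators in Unisphere decides which SATP the ESX/ESXi host reports.
FAILOVER_MODE_TO_SATP = {
    1: ("PNR", "VMW_SATP_CX"),
    4: ("ALUA", "VMW_SATP_ALUA_CX"),
}


def expected_satp(failover_mode):
    """Return the SATP name expected for a given Failover Mode."""
    if failover_mode not in FAILOVER_MODE_TO_SATP:
        raise ValueError(f"Failover Mode {failover_mode} is not covered in this thread")
    return FAILOVER_MODE_TO_SATP[failover_mode][1]


def check_host(configured_mode, observed_satp):
    """Compare the SATP seen on the host with what the array setting implies.

    The hint text is hypothetical wording that summarizes the advice above.
    """
    expected = expected_satp(configured_mode)
    if observed_satp == expected:
        return "ok"
    return (
        f"expected {expected} but host reports {observed_satp}: "
        "check the Failover Mode on every path and reboot the host"
    )


if __name__ == "__main__":
    print(check_host(4, "VMW_SATP_ALUA_CX"))
    print(check_host(4, "VMW_SATP_CX"))
```

The second call models the situation in the original question: the array is set to Failover Mode 4 but the host still shows VMW_SATP_CX, which points at a missed path or a pending reboot.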

Also, from the host's perspective the array advertises itself as an ACTIVE/ACTIVE architecture (even though there is still LUN ownership, and requests arriving at the non-owning SP are forwarded by the upper redirector across the CMI channel, as you will have read), so all paths will now show as "Active" instead of half "Inactive" (as was the case with Failover Mode = 1) when viewed within the vSphere Client.  However, by default the host will only use the optimal paths, as described below.


NMP (Native Multipathing Plugin)
==========================
Then, once set to ALUA and confirmed (after a reboot) that the SATP has been updated, you then have two choices for the PSP (Path Selection Policy):

1) Round-Robin
2) Fixed (default in ESXi 5)

NOTE: in ESX 4.x (not relevant to this conversation), there was the introduction of a PSP called: "FIXED with ARRAY PREFERENCE".  The observed behavior was as follows:

The ARRAY PREFERENCE from my experience aligned more with the “optimized” path, meaning the current owner and not the default owner.  Thus if you have 30 hosts all booting up at different times they would choose the optimized path which could be different depending on when the host was rebooted and what the current owner of the path was.  Without AP and FM4 any path that responds the quickest would be chosen.  Of course having paths chosen by default owner would be nice, but I don’t believe that was part of it.

However, this PSP is no longer in ESXi 5; it basically made choices for you in regards to the "Preferred Path".  Also, its decision tree made no effort to balance the paths, so it was possible that, per SP and per host, the same preferred path was always chosen.  This also happens to be the default PSP in ESX/ESXi 4 when the SATP is set to VMW_SATP_ALUA_CX.

Depending on which perspective you consider (as discussed later), one may be preferred over the other, but it is incorrect to state that EMC or VMware only supports one or the other.  They are both valid choices (when configured with ALUA), but each has its own management concerns.


PROS/CONS
==========

ROUND-ROBIN better balances the load across the paths than one can ever do with FIXED; guaranteed.  RR is sending I/O down one (optimal) path then the next but not simultaneously.  It is possible, though, that if the trespassed LUNs are not managed, for example, after an NDU or code upgrade everything is now running on one SP thus overloading it.  The solution would of course be to manage the trespassed LUNs over time and keeping in mind that seeing a trespassed LUN isn’t a result of it being RR, just that it doesn’t have a mechanism to fail-back.  Whatever condition that prompted the trespass originally would have occurred with either FIXED or RR.
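The "one optimal path, then the next, never simultaneously" behaviour can be illustrated with a toy selector. This is a simplified model under stated assumptions, not VMware's implementation: the class and field names are invented for the example, and the `use_ano` flag stands in for the useANO setting discussed in the CONFIGURATION notes.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    optimized: bool      # True if the path lands on the SP that currently owns the LUN
    alive: bool = True


class RoundRobinSelector:
    """Toy model: send iops_limit I/Os down one path, then move to the next."""

    def __init__(self, paths, iops_limit=1000, use_ano=False):
        self.paths = list(paths)
        self.iops_limit = iops_limit
        self.use_ano = use_ano
        self._index = 0
        self._sent = 0

    def _candidates(self):
        alive = [p for p in self.paths if p.alive]
        optimized = [p for p in alive if p.optimized]
        # Non-optimized paths are used only if requested or if nothing else is left.
        if self.use_ano or not optimized:
            return alive
        return optimized

    def next_path(self):
        candidates = self._candidates()
        if not candidates:
            raise RuntimeError("no live paths to the LUN")
        if self._sent >= self.iops_limit:
            self._index += 1
            self._sent = 0
        path = candidates[self._index % len(candidates)]
        self._sent += 1
        return path


if __name__ == "__main__":
    paths = [Path("A0", True), Path("A1", True), Path("B0", False), Path("B1", False)]
    rr = RoundRobinSelector(paths, iops_limit=3)
    print([rr.next_path().name for _ in range(7)])
```

With a limit of 3, the selector alternates between the two optimized paths in runs of three and never touches the paths to the peer SP while an optimized path is alive.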

More than once, I've either heard the statement "ROUND-ROBIN causes trespass storms" or been asked whether it does.  I wanted to share my thoughts about it and make some points about the possible native Path Selection Policies (PSP) when PowerPath/VE is not used.

More often than not, what clients running ROUND-ROBIN call a “trespass storm” is simply that over time the LUNs have explicitly/implicitly (ALUA) trespassed and, without a failback mechanism, remained on the peer SP (unlike FIXED, which eventually restores them to the assigned preferred path).  A client who hasn’t been monitoring their trespassed LUNs in an RR configuration suddenly sees many/all of their LUNs on the peer SP and calls that a “trespass storm”.  Technically, a true “trespass storm” would be seen in Unisphere as the LUN bouncing continuously back and forth between the default and peer SP.  However, barring this scenario, FIXED actually causes more trespasses than RR when taken literally.  Under normal conditions, if a LUN were going to trespass originally, again it isn’t because of FIXED or RR; the question is whether it reverts to the original owner when the original path(s) are available again.  With FIXED it would trespass once more (back to what would be the original default owner in a properly configured environment), but with RR it would remain on the peer SP; so in this literal example FIXED produces twice as many trespasses.  One can even take it further and suggest that FIXED can cause trespass storms if the original issue that caused the trespass is intermittent.
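The "twice as many trespasses" argument can be checked with a very small model. It is a deliberately simplified sketch: it tracks a single LUN, assumes every path failure hits the default owner's paths, and treats FIXED as failing back on every restore; the function name and event labels are made up for the illustration.

```python
def count_trespasses(path_events, policy):
    """Count trespasses of one LUN for a sequence of path events.

    path_events: iterable of "fail" / "restore", describing the paths to the
                 LUN's default-owner SP.
    policy:      "FIXED" (fails back to the preferred path) or "RR" (no failback).
    Returns (number_of_trespasses, final_owner).
    """
    if policy not in ("FIXED", "RR"):
        raise ValueError(f"unknown policy: {policy}")
    owner = "default"
    trespasses = 0
    for event in path_events:
        if event == "fail" and owner == "default":
            owner = "peer"          # I/O moves to the peer SP
            trespasses += 1
        elif event == "restore" and owner == "peer" and policy == "FIXED":
            owner = "default"       # failback to the preferred path
            trespasses += 1
    return trespasses, owner


if __name__ == "__main__":
    single_outage = ["fail", "restore"]
    flapping = ["fail", "restore"] * 3
    for name, events in (("single outage", single_outage), ("flapping", flapping)):
        for policy in ("FIXED", "RR"):
            print(name, policy, count_trespasses(events, policy))
```

In this model a single outage costs FIXED two trespasses and RR one, but RR ends on the peer SP and stays there until someone trespasses the LUN back. The flapping case shows how an intermittent fault turns FIXED's failback into repeated trespasses.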

Personally when I am with a client, I mention both possible choices: RR or FIXED (of course with anything ESX 4.0+ and at least FLARE 28.5 patch 704 they should be running ALUA, but there is never an argument there).  It would be, imho, a disservice not to mention both options, each of which is valid, and leave it up to the client to choose.  Even our documentation mentions both solutions, and for every example where FIXED is recommended, there is an equal number of statements where ROUND-ROBIN is suggested.  With the failback mechanism, many will suggest that FIXED is best practice, but in a well-managed environment ROUND-ROBIN can most certainly be implemented (except in an MSCS environment).

CONFIGURATION
=============

1) ROUND-ROBIN
a) Better distributes the load across the fabric than anyone can do manually with FIXED (guaranteed) by sending by default 1000 I/O down one optimal path then 1000 I/O down the other (never simultaneously though)

b) However, you will need to manage trespassed LUNs.  I don’t expect clients to do it in the GUI, which is cumbersome, so I leave them with the following commands:

naviseccli -h <SPA> trespass mine
naviseccli -h <SPB> trespass mine

c) I also remind clients, as tempting as it may be, not to enable “useANO=1” (use Active-Non Optimized); they will eventually read about it.
By setting this, you are telling your hosts to include the non-optimal paths even in a healthy environment where all configured paths from the host to the VNX are available for I/O.  A non-optimal path is a path from the host to the owning SP's peer, then through the CMI (CLARiiON Messaging Interface) to the owning SP.  If you leave it at 0 (default), the unoptimized paths reported by the array won't be included in the RR path selection until the optimized paths become unavailable.  With ALUA configured all paths will show ACTIVE; however, only the optimal paths, meaning those associated with the current SP owner, will show ACTIVE (I/O).

d) Also, there is a way of changing the default 1000 I/Os of RR.  I’m indifferent about it personally, but the client will eventually read about it.  EMC has a good white paper about the results of changing from 1000 to 1 and the effects on different I/O profiles.  I'll supply the command (ESX 4.x syntax) for the sake of completeness.

esxcli nmp roundrobin setconfig --device <device UID> --type iops --iops <I/O count>

e) Make RR the default PSP (path selection policy) for the ALUA SATP (storage array type plugin)
- Currently, FIXED is the default PSP when ALUA is used
- Depending on the version of ESX, the command to change this behavior is as follows:

ESX 4.x (reboot required):

esxcli nmp satp setdefaultpsp --satp=VMW_SATP_ALUA_CX --psp=VMW_PSP_RR

ESX 5.x (reboot not required):

esxcli storage nmp satp set -s VMW_SATP_ALUA_CX -P VMW_PSP_RR

f) Install the Path Management feature of the Virtual Storage Integrator plug-in available from PowerLink via the following breadcrumb trail:

Home > Support > Software Downloads and Licensing > Downloads T-Z > Virtual Storage Integrator (VSI)

- In bulk, can manage the PSP for “EMC devices” (versus manually modifying individually per LUN on each host)
- Keep in mind (whether or not you agree with the behavior) that this only affects what is currently presented.  For instance, unless you change the default PSP with the commands above, any new LUNs that are presented will use FIXED and the admin will need to rerun the plugin (or, of course, change the default behavior).

While you are in PowerLink, you may also want to install the other relevant VSI features:

- Storage Viewer

- Unified Storage Management


2) FIXED
a) Unlike RR, this has a mechanism to failback (preferred path)

b) However, FIXED has its management concerns as you have to manually select the preferred path (VSI Path Management feature does not assist with the preferred path):
- You need to make sure the path corresponds to the default owner or else you force a trespass unintentionally
- Furthermore, you need to do this for each host and for each LUN
- Also, the assumption is that the client is manually balancing the LUNs as best they can so that one path (per host) is not utilized more than the other
- It is fair to say that for each example where clients weren’t managing their trespassed LUNs with RR and ended up running entirely on one SP, there are just as many examples of clients with misconfigured preferred paths.  For instance, imagine a scenario where host 1 has a preferred path on SPA for a LUN and host 2 has a preferred path on SPB for the same LUN.
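The preferred-path bookkeeping that FIXED requires can be audited with a simple comparison. This is a minimal sketch that assumes you have already collected each LUN's default owner from the array and each host's preferred path from vSphere; the host and LUN names and the dictionary layout are hypothetical.

```python
def find_preferred_path_problems(default_owner, preferred_sp):
    """Return (host, lun, preferred SP, default owner) for every mismatch.

    default_owner: {lun: "SPA" | "SPB"}          as configured on the array
    preferred_sp:  {(host, lun): "SPA" | "SPB"}  SP behind each host's preferred path
    """
    problems = []
    for (host, lun), sp in sorted(preferred_sp.items()):
        owner = default_owner[lun]
        if sp != owner:
            # A preferred path to the non-owning SP forces an unintended trespass.
            problems.append((host, lun, sp, owner))
    return problems


if __name__ == "__main__":
    # Hypothetical inventory: host2 prefers SPB for a LUN whose default owner is SPA.
    default_owner = {"LUN_3": "SPA", "LUN_4": "SPB"}
    preferred_sp = {
        ("host1", "LUN_3"): "SPA",
        ("host2", "LUN_3"): "SPB",
        ("host1", "LUN_4"): "SPB",
        ("host2", "LUN_4"): "SPB",
    }
    for host, lun, sp, owner in find_preferred_path_problems(default_owner, preferred_sp):
        print(f"{host}: preferred path for {lun} is on {sp}, default owner is {owner}")
```

Two hosts that disagree about the preferred SP for the same LUN, as in the host 1 / host 2 scenario above, always show up here because at least one of them must differ from the default owner.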


So what would be the best solution?  It would be one that load balances and has visibility to the queue on the SPs, can send I/O down the optimized (pool of) paths simultaneously, and has a fail-back mechanism.  Of course, PowerPath/VE offers this, but it is not a solution for all as it requires a vSphere Enterprise/Enterprise Plus license and a PowerPath/VE license (trial is available: emc.com/powerpath-ve-trial).

christopher_ime
4 Ruthenium

Sorry, a few corrections:

1) "ALUA (Asymmetric Logical Unit Access)"

Not the mess in the beginning of the response.

2) Firstly, "VMW_SATP_* is reference..."

Typo in just the beginning but correctly listed throughout

Also, I meant to make a quick note above that ALUA (Failover Mode 4) is also one of the prerequisites for VAAI (vStorage APIs for Array Integration) support required by the host.  This hardware acceleration/offload feature and its primitives have been discussed in detail in other posts.

AndMar2
1 Nickel

Hello Chris,

Thank you for your detailed answer and the URLs provided. To keep it short, I did not write down the whole story behind my question...

The point is, I've opened an SR with VMware because I received a lot of disconnection warning messages from the VNX, and I found that the cause of these disconnections is the hosts (ESXi 5). (The server is a Cisco UCS C200M2 with a Broadcom NetXtreme II 5709 quad-port NIC with iSCSI HBA/TOE.)

They told me that ESXi 5 only supports the option mentioned above (hearing that was weird because ESX 4.1 supported ALUA and other pretty stuff); that's why I mentioned only VMW_SATP_CX. I was not considering ALUA as "configurable" in this circumstance.

Anyway, imho the problem is related to the hardware and not to the configuration (and your answer helps me to state that), because I've tested all the available SATP and PSP options, but I always get the same error messages in the vmkernel log.

Moreover, I found that I'm not the first one experiencing problems with that NIC...

2012-03-14T17:14:40.459Z cpu0:4804)bnx2i::0x41001360ad10: bnx2i_conn_stop::vmnic9 - sess 0x41000d70af48 conn 0x41000d70b2d0, icid 31, cmd stats={p=0,a=1,ts=20653,tc=20652}, ofld_conns 8

2012-03-14T17:14:40.459Z cpu0:4804)iscsi_linux: [vmhba40: H:8 C:0 T:1] session blocked

2012-03-14T17:14:40.459Z cpu8:5297)WARNING: LinScsi: SCSILinuxAbortCommands:1798:Failed, Driver bnx2i, for vmhba40

2012-03-14T17:14:40.569Z cpu0:4804)bnx2i::0x41001360ad10: bnx2i_ep_disconnect: vmnic9: disconnecting ep 0x41001321a0b0 {31, 14dc00}, conn 0x41000d70b2d0, sess 0x41000d70af48, hba-state 1, num active conns 8

2012-03-14T17:14:40.570Z cpu6:4102)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x412400ec1c80) to dev "naa.60060160caf02e00b4845fd80148e111" on path "vmhba40:C0:T1:L3" Failed: H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL

2012-03-14T17:14:40.570Z cpu6:4102)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.60060160caf02e00b4845fd80148e111" state in doubt; requested fast path state update...

2012-03-14T17:14:40.570Z cpu6:4102)ScsiDeviceIO: 2316: Cmd(0x412400ec1c80) 0x2a, CmdSN 0x4f0 to dev "naa.60060160caf02e00b4845fd80148e111" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

2012-03-14T17:14:45.404Z cpu0:4804)<1>bnx2i::0x41001360ad10: conn update: icid 32 - MBL 0x40000 FBL 0x0MRDSL_I 0x20000 MRDSL_T 0x10000

2012-03-14T17:14:45.405Z cpu0:4804)iscsi_linux: [vmhba40: H:8 C:0 T:1] session unblocked

If you have any ideas about that, I would be grateful if you could share them with me.

Thanks again

Andrew

christopher_ime
4 Ruthenium

Thanks for the clarification.

I was able to dig up the following from VMware Communities.  As you mentioned, there are many people having issues when using the Broadcom iSCSI driver (configured as hardware iSCSI).  While not ideal, the iSCSI Software Adapter is always an option with any dedicated iSCSI HBA/TOE card, and not surprisingly people have reported it as a workaround.  If you choose this option, remember to configure the iSCSI VMkernel port bindings, per best practice, as you would if they were generic 1GbE NICs.  ESXi 5 now provides a GUI to perform this task, but in 4.x it could only be performed via the CLI: esxcli swiscsi nic add

NOTE: in ESXi 5.x, VMkernel Port binding can still be done via CLI via the slightly modified syntax:  esxcli iscsi networkportal add

It seems that Broadcom recently acknowledged the issue with a test driver, as suggested by one of the later responses in that thread, from 3/10/2012 (5 days ago):

http://communities.vmware.com/thread/276107?start=0&tstart=0

[...]

I wrote a lot of emails from broadcom in this case. And?
Finally a solution!

I got a new test driver for the Broadcom iSCSI adapter.

Now everything works as it should offloading and properly supports iscsi.

[...]

Finally, probably just a reminder, even though I only addressed the specific question you had regarding initiator settings (failover mode) and PSP, make sure you reference the two guides I noted earlier regarding iSCSI connectivity.  For instance the usual best practices for iSCSI connectivity:

1) Separate subnets for each adapter and the corresponding SP ports they will connect to (also separate from the "mgmt" network)

2) Disable Delayed Ack

3) Review single vSwitch (multiple VMkernel ports while still maintaining separate subnets of course) vs. separate vSwitch for each NIC/VMkernel ports

4) VMkernel Port binding (VMkernel to network adapters mapping)

etc.
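The subnet separation in item 1 is easy to get wrong when several ports are configured by hand, so a quick sanity check can help. The sketch below uses only Python's standard `ipaddress` module; the interface names, addresses, and the three specific checks are assumptions chosen to mirror the list above, not an EMC or VMware tool.

```python
import ipaddress


def check_iscsi_subnets(vmk_interfaces, sp_port_ips, mgmt_network):
    """Return a list of human-readable problems (empty list means none found).

    vmk_interfaces: {"vmk1": "10.1.1.10/24", ...}  iSCSI VMkernel ports
    sp_port_ips:    ["10.1.1.100", ...]            iSCSI ports on SPA/SPB
    mgmt_network:   "192.168.0.0/24"               management network
    """
    problems = []
    mgmt = ipaddress.ip_network(mgmt_network)
    seen = {}
    for name, cidr in vmk_interfaces.items():
        net = ipaddress.ip_interface(cidr).network
        if net in seen:
            problems.append(f"{name} shares subnet {net} with {seen[net]}")
        else:
            seen[net] = name
        if net.overlaps(mgmt):
            problems.append(f"{name} subnet {net} overlaps management network {mgmt}")
        if not any(ipaddress.ip_address(ip) in net for ip in sp_port_ips):
            problems.append(f"{name} subnet {net} contains no SP iSCSI port")
    return problems


if __name__ == "__main__":
    # Hypothetical addressing: one subnet per adapter, one SP port pair per subnet.
    vmk = {"vmk1": "10.1.1.10/24", "vmk2": "10.1.2.10/24"}
    sp_ports = ["10.1.1.100", "10.1.1.101", "10.1.2.100", "10.1.2.101"]
    print(check_iscsi_subnets(vmk, sp_ports, "192.168.0.0/24"))
```

An empty list means each VMkernel port has its own subnet, that subnet contains at least one SP port, and none of them overlaps the management network.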

christopher_ime
4 Ruthenium

AndMar wrote:

The point is, I've opened an SR with VMware because I received a lot of disconnection warning messages from the VNX, and I found that the cause of these disconnections is the hosts (ESXi 5). (The server is a Cisco UCS C200M2 with a Broadcom NetXtreme II 5709 quad-port NIC with iSCSI HBA/TOE.)

They told me that ESXi 5 only supports the option mentioned above (hearing that was weird because ESX 4.1 supported ALUA and other pretty stuff); that's why I mentioned only VMW_SATP_CX. I was not considering ALUA as "configurable" in this circumstance.

Interesting. If they are suggesting that only a Failover Mode of 1 (PNR) (or just VMW_SATP_CX) is supported with ESXi 5, then I will only say that this is not true as a general statement.  That would suggest that you couldn't benefit from the hardware offload of VAAI, as ALUA is one of the prerequisites.  However, I definitely don't want to second-guess their comment, so I'll assume there is something specific to your environment that disqualifies it (but I am personally not seeing the culprit).  Without looking through the VMware compatibility guides, from EMC's perspective the iSCSI adapters we support are as follows:

"All 1 Gb/s or 10 Gb/s NICs for iSCSI connectivity, as supported by the Server/OS vendor."

However, still search the "ESM by Host" PDF for any reference to this specific adapter (NOTE: you won't find it separately when building results from "Advanced Query" or "Solutions and Wizards" as you will with FC HBAs):

https://elabnavigator.emc.com/vault/pdf/esm_by_host.pdf

You'll only find the following comment:

[..]

Broadcom iSCSI boot is supported with following adapters

- Broadcom 57710 based cards

- Broadcom 57711 based cards

- Broadcom 5708

- Broadcom 5709

[..]

christopher_ime
4 Ruthenium

Simply wanted to mention that an EMC KB article is "In Progress" and should be made public shortly.  It makes mention of the ESX Driver: bnx2i (which the OP is using as noted in the pasted error logs above)

emc290457: "iSCSI Logout Info=0x0120071d [Target NopTimeout] with ESX 5.0 Host and NetXtreme II NIC"

[...]

When installing ESX(i) 5 and using the NetXtreme II NIC with TOE based on Broadcom 57711 chip (ESX Driver bnx2i) make sure to download and install latest Driver CD for Broadcom NetXtreme II Network/iSCSI/FCoE Driver. This Driver CD is available in the download Section from VMware.

[...]
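When comparing behaviour before and after a driver update, it can help to count the NMP command failures per path in the vmkernel log. The sketch below is one possible way to do that; the regular expression is written against the NMP line pasted earlier in this thread and is an assumption about the log format, not an official parser.

```python
import re
from collections import Counter

# Pattern written against the nmp_ThrottleLogForDevice line pasted in this thread.
NMP_FAILURE = re.compile(
    r'Cmd (?P<opcode>0x[0-9a-fA-F]+) \(0x[0-9a-fA-F]+\) to dev "(?P<device>[^"]+)" '
    r'on path "(?P<path>[^"]+)" Failed: '
    r'H:(?P<host>0x[0-9a-fA-F]+) D:(?P<dev>0x[0-9a-fA-F]+) P:(?P<plugin>0x[0-9a-fA-F]+)'
)


def parse_nmp_failure(line):
    """Return the fields of an NMP failure line, or None if the line does not match."""
    m = NMP_FAILURE.search(line)
    if m is None:
        return None
    return {
        "opcode": int(m["opcode"], 16),
        "device": m["device"],
        "path": m["path"],
        "host_status": int(m["host"], 16),
        "device_status": int(m["dev"], 16),
        "plugin_status": int(m["plugin"], 16),
    }


def failures_per_path(lines):
    """Count NMP command failures for each path name."""
    counts = Counter()
    for line in lines:
        parsed = parse_nmp_failure(line)
        if parsed is not None:
            counts[parsed["path"]] += 1
    return counts


# Two lines copied from the log pasted earlier in the thread.
SAMPLE = [
    '2012-03-14T17:14:40.459Z cpu0:4804)iscsi_linux: [vmhba40: H:8 C:0 T:1] session blocked',
    '2012-03-14T17:14:40.570Z cpu6:4102)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a '
    '(0x412400ec1c80) to dev "naa.60060160caf02e00b4845fd80148e111" on path '
    '"vmhba40:C0:T1:L3" Failed: H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL',
]

if __name__ == "__main__":
    print(failures_per_path(SAMPLE))
```

In practice you would pass the lines of the vmkernel log file instead of `SAMPLE`; failures concentrated on paths behind one vmhba would point at that adapter or its driver rather than at the array.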

Baif
2 Iron

vSphere 5.1 and EMC Storage Multipathing

kb.vmware.com/kb/2034799 Load balancing using Round Robin multipathing policy on EMC VNX arrays on vsphere 5.1

kb.vmware.com/kb/2034797 Load balancing using Round Robin multipathing policy on EMC Symmetrix arrays on vsphere 5.1

PowerPath/VE fails to load on vSphere 5.1

kb.vmware.com/kb/2034796 VMware and EMC have identified two issues with PowerPath/VE 5.7 and VMware vSphere 5.1.

christopher_ime
4 Ruthenium

Yes, now that vSphere 5.1 is out, that changes a few things when talking about vSphere Native Multipathing, as Round-Robin is now the default PSP for ALUA/failover mode 4 (or rather for the VMW_SATP_ALUA_CX SATP), whereas the default was FIXED in 5.0 and FIXED "with array preference" in 4.x.
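The defaults across versions, together with the commands given earlier in the thread for changing them, can be summarized in a small lookup. Treat this as a sketch: the version keys are coarse labels chosen for the example, and `VMW_PSP_FIXED_AP` is my assumption for the identifier behind "FIXED with array preference".

```python
# Default PSP for VMW_SATP_ALUA_CX as described in this thread.
DEFAULT_PSP_FOR_ALUA_CX = {
    "4.x": "VMW_PSP_FIXED_AP",   # assumed identifier for FIXED "with array preference"
    "5.0": "VMW_PSP_FIXED",
    "5.1": "VMW_PSP_RR",
}

# Commands quoted earlier in the thread for making RR the default.
SET_RR_DEFAULT_COMMAND = {
    "4.x": "esxcli nmp satp setdefaultpsp --satp=VMW_SATP_ALUA_CX --psp=VMW_PSP_RR",
    "5.0": "esxcli storage nmp satp set -s VMW_SATP_ALUA_CX -P VMW_PSP_RR",
}


def command_to_default_to_rr(version):
    """Return the command that makes RR the default PSP, or None if it already is."""
    if version not in DEFAULT_PSP_FOR_ALUA_CX:
        raise ValueError(f"version {version!r} is not covered in this thread")
    if DEFAULT_PSP_FOR_ALUA_CX[version] == "VMW_PSP_RR":
        return None
    return SET_RR_DEFAULT_COMMAND[version]


if __name__ == "__main__":
    for version in ("4.x", "5.0", "5.1"):
        print(version, DEFAULT_PSP_FOR_ALUA_CX[version], command_to_default_to_rr(version))
```

On 5.1 the function returns `None`, reflecting that no change is needed to get Round-Robin for newly presented LUNs.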

RR PSP also now provides proper path rebalancing when used with VNX OE 32 (and a mentioned backport to FLARE 30 in the comments section of the following article from Chad).  Baif, please review the following articles for more detail.

http://virtualgeek.typepad.com/virtual_geek/2012/08/vmworld-2012-vmware-emc-storagethe-best-gets-bet...

http://velemental.com/2012/09/07/fixedround-robin-in-5-1-and-a-simple-powercli-block-pathing-module/

What I find interesting is that in some cases before these recent enhancements, it was identified (as seen in Chad's video) that with Fixed the paths may not have been rebalanced properly after all, and manually trespassing LUNs back was required.  Therefore, our earlier argument that Fixed provided failback wasn't always true.  Interesting.

A1exp1
1 Copper

I've got a cx3-40 that I can only run in failover mode 1 (vmw_satp_cx).

Having read all the info about round robin in 5.1 I can't see any issues with swapping to RR but I can't find anybody that mentions it's okay for active/passive, all talk is about ALUA arrays.

Are there any issues with going to RR with vmw_satp_cx using 5.1?
