I have a customer that uses large ESX clusters (32 nodes, ESX 4.1 and 5). Historically they have presented all storage (500 GB to 2 TB striped metavolumes from DMX/VMAX) via four FA paths. Each ESX node has two HBA ports, each going to a separate fabric, where two of the FA paths reside.
Their IO load has grown to the point where we need to add additional FA paths to support this.
We are looking at splitting the LUN count into two separate FA groups to increase their capabilities, but the client is concerned that this approach will not work. They have stated that past experience with ESX showed it would not recognize a LUN as a valid resource unless it was presented to all ESX nodes from the same storage port.
In my limited knowledge/expertise, I was only aware of the requirement that every LUN presented to ESX must be presented with the same LUN number to all the ESX nodes in the cluster (not necessarily from the same storage port).
I am attaching a JPG that shows where they are today, and two proposed options, each with some benefits...however zone counts are a significant concern for them (over a thousand right now just for ESX in each fabric - good old SIST zoning :-( )
Can anyone shed additional light on this topic? I have done some research with EMC docs and vmware, but haven't stumbled across the details or situation as described.
Since ESX 3.5 (I think U2), the LUN number is no longer used to identify paths to devices. As long as the array is compliant (which these are), the WWN of the device is used, and it does not change across FAs. So ESX (4/5) will have no issue seeing these as valid paths to these devices, no matter how you pair the FAs with the host initiators.
So it's not an issue if the device is presented to the host as LUN ID 20 on my FAs 4a:0 and 13a:0, and as host LUN ID 25 from FAs 8a:0 and 9a:0?
Correct. With the versions of ESX indicated, as well as the Enginuity levels, LUN numbers are not used to identify valid paths. We always recommend consistent LUN numbers, but since ESX changed how it identifies device paths to use the WWN, that is mostly for ease of management and legacy support.
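To picture what "identified by WWN rather than LUN number" means, here is a minimal sketch in Python. It is not how ESX is implemented; it only shows that grouping paths by a device identifier yields one device with four paths even when the host LUN IDs differ per FA pair. The initiator names and the NAA value are made up for illustration; the FA ports and LUN IDs are the ones from the question above.

```python
from collections import defaultdict

# Hypothetical path records: (host initiator, array FA port, host LUN ID, device NAA ID).
# The NAA value is invented for this example.
DEVICE_NAA = "naa.60000970000000000000000000000001"
paths = [
    ("vmhba1", "FA-4a:0", 20, DEVICE_NAA),
    ("vmhba2", "FA-13a:0", 20, DEVICE_NAA),
    ("vmhba1", "FA-8a:0", 25, DEVICE_NAA),
    ("vmhba2", "FA-9a:0", 25, DEVICE_NAA),
]

def group_paths_by_device(path_records):
    """Group paths by device identifier (NAA), ignoring the host LUN ID."""
    devices = defaultdict(list)
    for hba, fa_port, lun_id, naa in path_records:
        devices[naa].append((hba, fa_port, lun_id))
    return dict(devices)

if __name__ == "__main__":
    for naa, device_paths in group_paths_by_device(paths).items():
        print(naa, "has", len(device_paths), "paths:", device_paths)
```

Keyed on the LUN ID instead, the same four records would split into two apparent devices (20 and 25), which is the older behavior the customer remembers.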
Thank you for the replies. Looking at the SAN config guide for ESX 4.1, it states that LUN addressing still has an impact on multipathing?
Fibre Channel SAN Configuration Guide ESX 4.1
For multipathing to work properly, each LUN must present the same LUN ID number to all ESX/ESXi
To ensure that the ESX/ESXi system recognizes the LUNs at startup time,
provision all LUNs to the appropriate HBAs before you connect the SAN to the
VMware recommends that you provision all LUNs to all ESX/ESXi HBAs at the
same time. HBA failover works only if all HBAs see the same LUNs.
For LUNs that will be shared among multiple hosts, make sure that LUN IDs
are consistent across all hosts. For example, LUN 5 should be mapped to host
1, host 2, and host 3 as LUN 5.
I guess I should have added a caveat to what I said. There is an issue with performing a vMotion of a VM with RDMs that do not have consistent LUN numbers between the hosts and paths. It will fail until you fix it, or temporarily remove the LUNs and re-add them after the vMotion.
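Given that RDM caveat, it may be worth auditing LUN numbering before splitting the FA groups. Below is a rough Python sketch of such a check. It assumes you have already collected, per host, the set of LUN IDs under which each device (keyed by NAA ID) is visible; how you collect that is up to you. The host names, NAA values, and the `find_inconsistent_luns` helper are all hypothetical.

```python
# Hypothetical inventory: host -> {device NAA ID -> set of host LUN IDs seen on its paths}.
host_views = {
    "esx01": {"naa.example0001": {20}, "naa.example0002": {21}},
    "esx02": {"naa.example0001": {20, 25}, "naa.example0002": {21}},
}

def find_inconsistent_luns(views):
    """Return devices that appear under more than one LUN ID across hosts and paths."""
    seen = {}
    for devices in views.values():
        for naa, lun_ids in devices.items():
            seen.setdefault(naa, set()).update(lun_ids)
    return {naa: sorted(ids) for naa, ids in seen.items() if len(ids) > 1}

if __name__ == "__main__":
    for naa, lun_ids in find_inconsistent_luns(host_views).items():
        print(naa, "is presented under multiple LUN IDs:", lun_ids)
```

Any device this flags that backs an RDM is a candidate for the vMotion failure described above.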
Okay, based on some of this discussion, I stumbled across this KB, which appears to indicate that 3.5 U5 and up does not rely on the LUN ID but on NAA (Network Address Authority) support. NOTE that Symmetrix supports NAA usage as long as the SPC2 bit is set (as documented).
This behavior is prevented through the support and use of NAA for SAN LUNs with VMware ESX 3.5 Update 5 and later. We did introduce the supportability for NAA ID's with ESX 3.5 GA, however we stopped using the LUN ID to reference the device from Update 5 onwards.
If the SAN array or LUNs do not support NAA, the SAN presentation of LUN IDs must be uniform or consistent across all hosts/initiators. Thus versions of VMware ESX prior to 3.5 Update 5 require uniform LUN IDs and storage presentation. See VMFS resignaturing in the Additional Information section for more information.
Any additional comments or suggestions from anyone?
So, for multiple ESX nodes prior to 3.5 Update 5, a uniform LUN ID is required. Is this LUN ID the one mapped on the Symmetrix FA, which can be designated by the storage admin from SymmWin?
From symcli, it could be either the address on the FA or, if you are using dynamic LUN masking, the available LUN ID, which is not necessarily the address on the FA.
Could you please comment on how other host OSs/multipathing software identify LUNs, e.g. Veritas DMP, Windows, Linux, etc.? Do they need to see the same LUN ID after being rebooted, or if LUNs are removed and re-added?
I am not too worried about clustered servers; my concern is for a DR environment where I have to remove the R2s and add gold clones to the DR hosts as needed. I am planning to use nested SGs, so the LUN ID may change when I remove the R2s and add the gold clones.