The host loses paths.
The ESXi host may stop responding and require a reboot to recover.
Message from the ESXi vmkernel log:
2020-08-30T03:52:23.501Z cpu187:66638)WARNING: lpfc: lpfc_els_unsol_buffer:8330: 0:(0):0115 Unknown ELS command x7f26e705 received from NPORT x1f04c0
2020-08-30T03:52:28.325Z cpu187:66638)WARNING: lpfc: lpfc_els_unsol_buffer:8330: 0:(0):0115 Unknown ELS command x7effc405 received from NPORT x1f04c0
Message from the VPLEX firmware logs:
event fc/4: "This port has discovered the departure of the indicated port from the fabric."
128.2XX.2XX.37/cpu0/log:5988:W/"006016abc83a153324-2":36008:<6>2020/08/30 03:39:07.65: fc/4 A0-FC02.0: port 200000109b59a55d:100000109b59a55d:330fc0
(spn Emulex PPN-10:00:00:10:9b:59:a5:5d) (snn Emulex LPe16002B-M6 FV12.2.299.27 DV12.2.373.1 HN:localhost OS:VMware ESXi 6.5.0) (speed <unsupported by fabric>) departed
128.2XX.2XX.37/cpu0/log:5988:W/"006016abc83a153324-2":36009:<4>2020/08/30 03:39:07.65: stdf/18 FCP connection lost. IT: [Host1_vmhba1 (0x100000109b59a55d)
A0-FC02 (0xc00144879a780200)]
event fc/3: "This port has discovered the arrival of the indicated port on the fabric."
128.2XX.2XX.37/cpu0/log:5988:W/"006016abc83a153324-2":36020:<6>2020/08/30 03:40:37.66: fc/3 A0-FC02.0: port 200000109b59a55d:100000109b59a55d:330fc0
(spn Emulex PPN-10:00:00:10:9b:59:a5:5d) (snn Emulex LPe16002B-M6 FV12.2.299.27 DV12.2.373.1 HN:localhost OS:VMware ESXi 6.5.0) (speed <unsupported by fabric>) arrived
128.2XX.2XX.37/cpu0/log:5988:W/"006016abc83a153324-2":36027:<4>2020/08/30 04:03:28.34: stdf/17 FCP connection established. IT: [Host1_vmhba1 (0x100000109b59a55d)
A0-FC02 (0xc00144879a780200)]
VPLEX logs the same events for RecoverPoint HBAs, but those HBAs do not reestablish the FCP connection to VPLEX.
128.2XX.2XX.38/cpu0/log:5988:W/"0060167206e212230-2":576811:<6>2021/05/04 23:20:32.37: fc/4 B0-FC03.0: port 5001248000642e75:5001248100642e75:394980 (spn ?) (snn ?) (speed 8Gb FC) departed
128.2XX.253.38/cpu0/log:5988:W/"0060167206e212230-2":576962:<4>2021/05/04 23:20:32.38: stdf/18 FCP connection lost. IT: [RPA1_P1 (0x5001248100642e75) B0-FC03 (0x50001442b0753a03)]
128.2XX.2XX.38/cpu0/log:5988:W/"0060167206e212230-2":577128:<6>2021/05/04 23:22:02.42: fc/3 B0-FC03.0: port 5001248000642e75:5001248100642e75:394980 (spn ?) (snn ?) (speed 8Gb FC) arrived
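The gap between a departure (fc/4) and the matching arrival (fc/3) can be measured directly from the firmware log. The following is a minimal sketch, not a supported tool: the regular expression assumes the line layout shown in the excerpts above, and it assumes that the second value in the port triple is the initiator port name (it matches the initiator address in the stdf/18 lines). The names parse_events and outage_seconds are hypothetical.

```python
import re
from datetime import datetime

# Assumed layout, taken from the excerpts above:
#   <date> <time>: fc/<3|4> <VPLEX port>: port <name>:<name>:<FCID>
EVENT_RE = re.compile(
    r"(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"fc/(?P<code>[34]) (?P<vplex_port>\S+): "
    r"port (?P<node>[0-9a-f]{16}):(?P<port>[0-9a-f]{16}):(?P<fcid>[0-9a-f]{6})"
)

def parse_events(lines):
    """Extract (timestamp, event code, VPLEX port, remote port name) tuples."""
    events = []
    for line in lines:
        m = EVENT_RE.search(line)
        if m:
            ts = datetime.strptime(m["ts"], "%Y/%m/%d %H:%M:%S.%f")
            events.append((ts, int(m["code"]), m["vplex_port"], m["port"]))
    return events

def outage_seconds(events):
    """Pair each fc/4 (departed) with the next fc/3 (arrived) for the same port."""
    departed = {}
    gaps = []
    for ts, code, vplex_port, remote in sorted(events):
        key = (vplex_port, remote)
        if code == 4:
            departed[key] = ts
        elif key in departed:
            gaps.append((key, (ts - departed.pop(key)).total_seconds()))
    return gaps

# Lines copied from the firmware log excerpts above.
SAMPLE_LINES = [
    '128.2XX.2XX.37/cpu0/log:5988:W/"006016abc83a153324-2":36008:<6>2020/08/30 03:39:07.65: fc/4 A0-FC02.0: port 200000109b59a55d:100000109b59a55d:330fc0',
    '128.2XX.2XX.37/cpu0/log:5988:W/"006016abc83a153324-2":36020:<6>2020/08/30 03:40:37.66: fc/3 A0-FC02.0: port 200000109b59a55d:100000109b59a55d:330fc0',
    '128.2XX.2XX.38/cpu0/log:5988:W/"0060167206e212230-2":576811:<6>2021/05/04 23:20:32.37: fc/4 B0-FC03.0: port 5001248000642e75:5001248100642e75:394980 (spn ?) (snn ?) (speed 8Gb FC) departed',
    '128.2XX.2XX.38/cpu0/log:5988:W/"0060167206e212230-2":577128:<6>2021/05/04 23:22:02.42: fc/3 B0-FC03.0: port 5001248000642e75:5001248100642e75:394980 (spn ?) (snn ?) (speed 8Gb FC) arrived',
]

for (vplex_port, remote), seconds in outage_seconds(parse_events(SAMPLE_LINES)):
    print(f"{vplex_port} {remote}: departed for {seconds:.2f} s")
```

For both the ESXi HBA and the RecoverPoint HBA in the excerpts, the gap is approximately 90 seconds.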
Change:
Zone activation.
HBA ports and VPLEX front-end ports are not involved in the zoning changes.
VPLEX performs fabric discovery on its Fibre Channel ports (front-end, back-end, and FC-WANCOM) every 90 seconds using the "Get All Next" (GA_NXT) name server command. This discovery runs on its own schedule, independently of any RSCN received from the switch or PLOGI received from a zoned HBA.
Due to Cisco issue CSCvw75655, if VPLEX performs its fabric discovery on a front-end port while a zone set activation or commit is underway, there is a small chance that the discovery returns only the port's own Fibre Channel address (FCID). VPLEX then assumes that every HBA logged in to that port is no longer connected to the fabric and sends a logout (PLOGO) to each HBA zoned to it.
VPLEX logs an fc/4 event for every HBA that it logs out. On the next 90-second fabric discovery, when it receives the correct information from the switch name server, it logs an fc/3 event for each of those HBAs.
How the HBA handles this logout depends on its driver or firmware. The ESXi host in this example became unavailable and required a reboot.
Note: Periodic fabric discovery ensures that VPLEX has up-to-date fabric data, because some RSCNs from the fabric might not reach VPLEX.
Workaround:
On the Connectrix switch, disable the shared database feature between the name server and the zone server as follows:
switch# no zoneset capability active mode shared-db vsan <vsan-id>
Note: The zone set shared-db function is an optimization in which the name server and the zone server share information. Disabling the feature has no negative impact on the environment.
Cisco has confirmed that the change is local to the switch, not global. Run this command on every switch that has VPLEX attached to it.
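Because the change is local, it can help to list the command for each switch and VSAN before starting. The following is a minimal sketch that only builds the command strings from the workaround above. The switch names and VSAN IDs are hypothetical placeholders, and which VSANs to include (assumed here to be those containing VPLEX ports) should be confirmed for the environment.

```python
# Command text taken from the workaround above; only the VSAN ID varies.
WORKAROUND = "no zoneset capability active mode shared-db vsan {vsan_id}"

def workaround_commands(vsans_by_switch):
    """Map each switch name to the workaround command for each of its VSANs."""
    return {
        switch: [WORKAROUND.format(vsan_id=v) for v in sorted(vsans)]
        for switch, vsans in vsans_by_switch.items()
    }

# Hypothetical inventory: switch name -> VSAN IDs with VPLEX ports attached.
plan = workaround_commands({"mds-a": {10, 20}, "mds-b": {11}})
for switch, commands in plan.items():
    for command in commands:
        print(f"{switch}: {command}")
```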
Resolution:
NX-OS 8.4(2c)
Products:
Cisco MDS 9000 NX-OS and SAN-OS Software
Known Affected Releases:
NX-OS 8.3(2)
VPLEX Fabric Discovery:
Example:
Host 1, Host 2, and Host 3 zoned to a single VPLEX front-end port.
VPLEX front-end port: FCID 0x200b20
Host 1: FCID 0x340000
Host 2: FCID 0x340020
Host 3: FCID 0x340040
Working.
Cisco issue CSCvw75655.
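The two cases can be sketched with a simplified model that uses the FCIDs from the example. This is an illustration under stated assumptions, not VPLEX or NX-OS code. It assumes that GA_NXT returns the next higher registered FCID visible to the requester and wraps around to the lowest, and that the discovering port walks the list until the answer wraps back to its own FCID. The functions ga_nxt, discover, and discovery_cycle are hypothetical names.

```python
from bisect import bisect_right

VPLEX_FE = 0x200B20
HOSTS = {0x340000, 0x340020, 0x340040}  # Host 1, Host 2, Host 3

def ga_nxt(visible_fcids, after_fcid):
    """Simplified name server: next higher FCID, wrapping to the lowest."""
    ordered = sorted(visible_fcids)
    idx = bisect_right(ordered, after_fcid)
    return ordered[idx % len(ordered)]

def discover(own_fcid, visible_fcids):
    """Walk GA_NXT from our own FCID until the answer wraps back to it."""
    found = set()
    current = ga_nxt(visible_fcids, own_fcid)
    # The second condition guards against looping if own_fcid is never returned.
    while current != own_fcid and current not in found:
        found.add(current)
        current = ga_nxt(visible_fcids, current)
    return found

def discovery_cycle(own_fcid, logged_in, visible_fcids):
    """Return (departed, arrived) relative to the currently logged-in HBAs."""
    seen = discover(own_fcid, visible_fcids)
    return logged_in - seen, seen - logged_in

# Working: the name server answers with every zoned port.
departed, arrived = discovery_cycle(VPLEX_FE, HOSTS, HOSTS | {VPLEX_FE})
print("working:", sorted(map(hex, departed)), sorted(map(hex, arrived)))

# Cisco issue CSCvw75655: only the port's own FCID comes back,
# so every logged-in host looks departed (fc/4) and is logged out.
departed, arrived = discovery_cycle(VPLEX_FE, HOSTS, {VPLEX_FE})
print("faulty: ", sorted(map(hex, departed)), sorted(map(hex, arrived)))

# Next discovery: correct data again, so the hosts are seen arriving (fc/3).
departed, arrived = discovery_cycle(VPLEX_FE, set(), HOSTS | {VPLEX_FE})
print("next:   ", sorted(map(hex, departed)), sorted(map(hex, arrived)))
```

In the faulty cycle all three hosts appear in the departed set, and in the following cycle all three appear in the arrived set, which matches the fc/4 and fc/3 pairs in the firmware log.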