PowerPath/VE for VMware fails to claim (all) VPLEX LUNs at boot
Summary: After a host reboot, NMP manages some or all VPLEX LUNs instead of PowerPath/VE.
Symptoms
Environment:
Operating system: VMware ESXi 6.0.0 Update 2 (build-3620759, build-4192238)
EMC software: PowerPath/VE for VMware vSphere 6.0
EMC software: PowerPath/VE for VMware vSphere 6.0 SP1
EMC software: PowerPath/VE for VMware vSphere 6.1
Server: HP ProLiant BL460c Gen9
Host bus adapter: Emulex Corporation Emulex OneConnect OCe14000, FCoE initiator: 650FLB CNA
HBA Driver: lpfc 11.1.145.18-1OEM.600.0.0.2768847 EMU VMwareCertified 2016-12-04
Product: VPLEX (5410, 5520)
From vmkernel.log:
2017-05-16T08:06:50.035Z cpu21:33912)ScsiClaimrule: 1165: The current claimrules indicate that path vmhba0:C0:T0:L1 should be claimed by plugin PowerPath.
2017-05-16T08:06:50.035Z cpu21:33912)ScsiClaimrule: 1169: Path vmhba0:C0:T6:L1 which appears to refer to the same physical media as path vmhba0:C0:T0:L1 is already claimed by plugin NMP.
2017-05-16T08:06:50.035Z cpu21:33912)ScsiClaimrule: 1171: If neither of these paths is being masked by ESX, this condition indicates a problem with the claimrules.
2017-05-16T08:06:50.035Z cpu21:33912)WARNING: ScsiPath: 608: Path vmhba0:C0:T0:L1 claims to be a VVol PE but has a version of 4 (expected 5 or higher). Not treating it as a PE.
2017-05-16T08:06:50.036Z cpu21:33912)ScsiPath: 5549: Plugin 'NMP' claimed path 'vmhba0:C0:T0:L1'
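The mismatch in the excerpt above (claim rules name PowerPath, yet NMP claims the path) can be searched for across a whole vmkernel.log. The following is a minimal sketch, assuming the two message formats appear exactly as in the excerpt; the function name `find_misclaimed_paths` and the regular expressions are illustrative and not part of any VMware or PowerPath tool.

```python
import re

# Patterns modeled on the two vmkernel.log messages quoted above (assumption:
# other ESXi builds may word these messages differently).
SHOULD_CLAIM = re.compile(r"path (\S+) should be claimed by plugin (\w+)")
CLAIMED = re.compile(r"Plugin '(\w+)' claimed path '([^']+)'")


def find_misclaimed_paths(lines):
    """Return {path: (expected_plugin, actual_plugin)} for every mismatch."""
    expected = {}
    actual = {}
    for line in lines:
        m = SHOULD_CLAIM.search(line)
        if m:
            expected[m.group(1)] = m.group(2)
        m = CLAIMED.search(line)
        if m:
            actual[m.group(2)] = m.group(1)
    return {
        path: (plugin, actual[path])
        for path, plugin in expected.items()
        if path in actual and actual[path] != plugin
    }


sample = [
    "2017-05-16T08:06:50.035Z cpu21:33912)ScsiClaimrule: 1165: The current "
    "claimrules indicate that path vmhba0:C0:T0:L1 should be claimed by "
    "plugin PowerPath.",
    "2017-05-16T08:06:50.036Z cpu21:33912)ScsiPath: 5549: Plugin 'NMP' "
    "claimed path 'vmhba0:C0:T0:L1'",
]
misclaimed = find_misclaimed_paths(sample)
print(misclaimed)
```

In practice the list of lines would come from reading the vmkernel.log file in the support bundle rather than from the two sample lines used here.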
Cause
During boot, the SCSI inquiry commands sent to the affected devices fail.
Resolution
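In this case, the host vendor replaced the two HBA cards in the two servers, moving from FLB 650 to FLB 630.
After the servers were rebooted, no further issues were seen. PowerPath/VE manages the devices correctly.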
Additional Information
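The HBA model can be found in the localcli_storage-core-adapter-list.txt output.
The server make and model can be found in esxcfg-info_-a.txt.FRAG-00000.txt.
The VMware version can be found in vmware_-vl.txt.
The array model and firmware can be found in localcli_storage-core-device-list.txt.
To troubleshoot the issue properly, an engineering-only build was used.
PowerPath relies on SCSI inquiry commands (cmd=0x12) to claim paths. In the logs from the engineering debug build, the inquiry commands initially fail with HOST_RETRY (0xc) or HOST_NO_CONNECT (0x1) errors. Later, when ESXi presents the device again, they succeed and PowerPath claims the device. Without the engineering build, this second sequence is not seen.
In response to the HOST_RETRY error, PowerPath even retries the inquiry command internally several times, at roughly 0.1-second intervals. However, the logs show that the host adapter is still unable to execute the command.
Inquiry fails at the start: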
2016-12-14T11:53:51.561Z cpu24:33396)PowerPath:Claiming path vmhba0:C0:T1:L0
2016-12-14T11:53:51.561Z cpu24:33396)PowerPath:PowerPlatformScsiIoErrorIsRetryable: cmd=0x12 Failed H: 0xc S: 0x0 P: 0x0 Path=vmhba0:C0:T1:L0
2016-12-14T11:53:51.663Z cpu24:33396)PowerPath:PowerPlatformScsiIoErrorIsRetryable: cmd=0x12 Failed H: 0xc S: 0x0 P: 0x0 Path=vmhba0:C0:T1:L0
2016-12-14T11:53:51.765Z cpu24:33396)PowerPath:PowerPlatformScsiIoErrorIsRetryable: cmd=0x12 Failed H: 0xc S: 0x0 P: 0x0 Path=vmhba0:C0:T1:L0
2016-12-14T11:53:51.867Z cpu24:33396)PowerPath:PowerPlatformScsiIoErrorIsRetryable: cmd=0x12 Failed H: 0xc S: 0x0 P: 0x0 Path=vmhba0:C0:T1:L0
2016-12-14T11:53:51.969Z cpu24:33396)PowerPath:PowerPlatformScsiIoErrorIsRetryable: cmd=0x12 Failed H: 0xc S: 0x0 P: 0x0 Path=vmhba0:C0:T1:L0
2016-12-14T11:53:56.772Z cpu46:33491)ALERT: PowerPath:MpxRecognize failed. Path vmhba0:C0:T1:L0 not claimed
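The failure lines above carry the SCSI opcode (`cmd`), the host status (`H`), and the path, so the retry pattern can be extracted mechanically. The following is a minimal sketch, assuming the `PowerPlatformScsiIoErrorIsRetryable` message format shown above; the helper names `parse_failures` and `retry_gaps` are illustrative, and the status table only covers the two codes named in this article.

```python
import re
from datetime import datetime

# Host status names as used in this article; any other code is reported as hex.
HOST_STATUS = {0xC: "HOST_RETRY", 0x1: "HOST_NO_CONNECT"}

# Matches the PowerPath failure lines shown in the excerpt above.
FAILURE = re.compile(
    r"^(\S+) .*PowerPlatformScsiIoErrorIsRetryable: cmd=(0x[0-9a-fA-F]+) "
    r"Failed H: (0x[0-9a-fA-F]+) S: 0x[0-9a-fA-F]+ P: 0x[0-9a-fA-F]+ "
    r"Path=(\S+)"
)


def parse_failures(lines):
    """Return one dict per failed command: time, opcode, host status, path."""
    failures = []
    for line in lines:
        m = FAILURE.search(line)
        if not m:
            continue
        host = int(m.group(3), 16)
        failures.append({
            "time": datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S.%fZ"),
            "cmd": int(m.group(2), 16),
            "host_status": HOST_STATUS.get(host, hex(host)),
            "path": m.group(4),
        })
    return failures


def retry_gaps(failures):
    """Seconds between consecutive failures, rounded to milliseconds."""
    times = [f["time"] for f in failures]
    return [round((b - a).total_seconds(), 3) for a, b in zip(times, times[1:])]


PREFIX = "cpu24:33396)PowerPath:PowerPlatformScsiIoErrorIsRetryable: "
DETAIL = "cmd=0x12 Failed H: 0xc S: 0x0 P: 0x0 Path=vmhba0:C0:T1:L0"
sample = [
    f"2016-12-14T11:53:51.561Z {PREFIX}{DETAIL}",
    f"2016-12-14T11:53:51.663Z {PREFIX}{DETAIL}",
    f"2016-12-14T11:53:51.765Z {PREFIX}{DETAIL}",
]
failures = parse_failures(sample)
gaps = retry_gaps(failures)
print(failures[0]["host_status"], hex(failures[0]["cmd"]), gaps)
```

Applied to the first three failure lines of the excerpt, the gaps come out at about 0.1 seconds each, which matches the internal retry interval described above.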
The inquiry succeeds at the end and PowerPath claims the device. This sequence does not occur with a regular GA build.
2016-12-14T11:54:08.542Z cpu12:34080)PowerPath:Claiming path vmhba0:C0:T1:L0
2016-12-14T11:54:08.545Z cpu12:34080)PowerPath:Path Claim: Successfully claimed path vmhba0:C0:T1:L0
At the same time, the lpfc driver reports link down and link up events, delayed port discovery, and similar messages.
2016-12-14T11:53:46.586Z cpu30:33490)WARNING: lpfc: lpfc_mbx_cmpl_read_topology:3271: 0:1305 Link Down Event x5 received Data: x5 x20 x800220 x0
2016-12-14T11:53:46.704Z cpu4:33493)WARNING: lpfc: lpfc_mbx_cmpl_read_topology:3271: 1:1305 Link Down Event x5 received Data: x5 x20 x800220 x0
2016-12-14T11:53:49.334Z cpu30:33490)WARNING: lpfc: lpfc_mbx_cmpl_read_topology:3247: 0:1303 Link Up Event x6 received Data: x6 x0 x5 x0 x0
2016-12-14T11:53:52.337Z cpu25:33493)WARNING: lpfc: lpfc_mbx_cmpl_read_topology:3247: 1:1303 Link Up Event x6 received Data: x6 x0 x5 x0 x0
2016-12-14T11:53:52.452Z cpu25:33493)WARNING: lpfc: lpfc_sli4_async_fip_evt:5702: 1:2546 New FCF event, evt_tag:x7, index:x0
2016-12-14T11:53:52.479Z cpu24:33396)PowerPath:PowerPlatformScsiIoErrorIsRetryable: cmd=0x12 Failed H: 0xc S: 0x0 P: 0x0 Path=vmhba0:C0:T1:L0
2016-12-14T11:53:52.505Z cpu25:33493)WARNING: lpfc: lpfc_do_scr_ns_plogi:8098: 1:3334 Delay fc port discovery for 10 seconds
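To see how the adapter events and the inquiry failures interleave, the relevant lines can be reduced to a short timeline. The following is a minimal sketch, assuming each log line starts with an ISO timestamp as in the excerpts; the marker strings are copied from the messages quoted in this article, while the labels and the function name `build_timeline` are illustrative.

```python
# (substring to look for, short label) pairs; substrings come from the
# log messages quoted in this article, the labels are illustrative.
EVENT_MARKERS = [
    ("Link Down Event", "link down"),
    ("Link Up Event", "link up"),
    ("Delay fc port discovery", "discovery delayed"),
    ("PowerPlatformScsiIoErrorIsRetryable", "inquiry failed"),
    ("MpxRecognize failed", "path not claimed"),
    ("Successfully claimed path", "path claimed"),
]


def build_timeline(lines):
    """Return (timestamp, label) tuples sorted by timestamp."""
    events = []
    for line in lines:
        timestamp = line.split(" ", 1)[0]
        for marker, label in EVENT_MARKERS:
            if marker in line:
                events.append((timestamp, label))
                break
    # ISO 8601 timestamps in the same format sort correctly as strings.
    return sorted(events)


sample = [
    "2016-12-14T11:53:52.505Z cpu25:33493)WARNING: lpfc: "
    "lpfc_do_scr_ns_plogi:8098: 1:3334 Delay fc port discovery for 10 seconds",
    "2016-12-14T11:53:46.586Z cpu30:33490)WARNING: lpfc: "
    "lpfc_mbx_cmpl_read_topology:3271: 0:1305 Link Down Event x5 received "
    "Data: x5 x20 x800220 x0",
    "2016-12-14T11:53:52.479Z cpu24:33396)PowerPath:"
    "PowerPlatformScsiIoErrorIsRetryable: cmd=0x12 Failed H: 0xc S: 0x0 "
    "P: 0x0 Path=vmhba0:C0:T1:L0",
    "2016-12-14T11:53:49.334Z cpu30:33490)WARNING: lpfc: "
    "lpfc_mbx_cmpl_read_topology:3247: 0:1303 Link Up Event x6 received "
    "Data: x6 x0 x5 x0 x0",
]
timeline = build_timeline(sample)
for timestamp, label in timeline:
    print(timestamp, label)
```

A timeline like this only shows ordering; it does not by itself prove that a given link event caused a given inquiry failure.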
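For a path to be claimed, the inquiry must succeed. Because host adapter errors during boot cause the inquiry to fail, PowerPath does not claim the device.
This is not a PowerPath issue.
Our recommendation is to contact VMware or the adapter vendor to determine the cause of these transient failures: host retry (0xc) and no connect (0x1) errors during host boot.
Once these transient adapter-related errors are fixed, PowerPath should have no problem claiming the devices.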