We have the VAAI primitive HardwareAcceleratedMove disabled on 2 of our ESX clusters. This corrected the issue covered below in my Feb forum post and in the recently released EMC KB (attached as PDF).
We have been unable to mount datastores replicated by RecoverPoint on the DR cluster. We have tried both SRM and manual methods. The hosts try to mount the volumes but cannot. The behavior is similar to the SCSI reserve issues we have seen on older setups: datastores appear and disappear from the host, and the VMs cannot start. Having HardwareAcceleratedMove disabled is the only difference between the ESX clusters where RP/SRM works fine and this cluster, where it does not. We have upgraded RecoverPoint from 3.5 to 4.0 and replaced an RPA, still no luck. Does anyone know if RecoverPoint will have issues with this VAAI primitive disabled?
My earlier post on the VNX forum, containing the details on the VAAI issue:
2 Replies Latest reply: Feb 6, 2013 11:35 PM by Gearoid Griffin
spaceman Feb 6, 2013 5:14 PM
Brand new VNX installs, 2 separate datacenters, 12 node ESXi clusters on HP BL460cG8. ESXi 5.0U1, RP 3.5SP1, VNX 7500 5.32.000.5.011, PP/VE 5.7 P02, Brocade 5300s with HP/Brocade modules, FOS 7.0.2. Pretty standard stuff...
Getting datastore latency alarms, datastore connectivity alarms, and general all-around badness whenever we svMotion or clone. The svMotion fails, users complain, and RP also loses its connection to ports and LUNs. It almost brings the whole datacenter to a halt, very scary. Acts much like SCSI reserve or APD...but we are mode 4 ALUA, VAAI.
I've seen some blog entries on this with older Block OE, but I thought all of this was resolved in current block OE and splitter code. Does anyone know where all this stands and what needs to be done to prevent this from occurring, if this is in fact what I assume it to be?
Can you provide some ESX logs which may help pinpoint the issue?
So, more than likely you are hitting a newly documented issue, emc313487.
It may not be customer viewable yet, but it will be shortly (it's just going through final approval).
A Service Request would be required for us to verify this.
But anyway, the issue is more than likely with VAAI and Virtual Provisioning.
Operations which use the VAAI primitive HardwareAcceleratedMove are considerably slower, and extreme spikes in latency can be seen.
Examples include:
Deploying templates with VMware
Any ESX operation which is used to copy or migrate data within the same physical array
EMC has determined that there is an issue with the usage of VAAI by Virtual Provisioning within VNX Block OE Release 32. This can occur any time an ESX host using VAAI issues a request for a Storage vMotion, for deploying a VM from a template, or for cloning a virtual machine, all of which utilise the Extended Copy (XCOPY) SCSI command. The area of concern is pools, and it applies to both fully provisioned (thick) and thinly provisioned (thin) pool LUNs. The issue may occur when the request to create a template or clone via the VAAI HardwareAcceleratedMove feature is sent to a mapped LUN on a non-owning storage processor. If the issue is encountered, the symptom will be response time spikes in vCenter and/or longer than expected completion times for the templates or clones.
A memory allocation problem can lead to Virtual Provisioning timeouts, which in turn lead to timeouts and high response times on the host.
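The trigger conditions described above can be written down as a simple check. This is only a sketch in Python: the `CopyRequest` type and its field names are made up for illustration and do not correspond to any EMC or VMware API. The values would have to come from your own records of LUN ownership and pathing.

```python
from dataclasses import dataclass


@dataclass
class CopyRequest:
    """Hypothetical description of one copy/clone/svMotion request."""
    uses_xcopy: bool          # host offloads the copy via HardwareAcceleratedMove
    target_is_pool_lun: bool  # thick or thin pool LUN
    owning_sp: str            # SP that owns the target LUN, e.g. "A" or "B"
    receiving_sp: str         # SP that the request is sent to


def is_exposed(req: CopyRequest) -> bool:
    """Return True when the request matches the conditions in the description:
    an XCOPY-based operation against a pool LUN via the non-owning SP."""
    return (
        req.uses_xcopy
        and req.target_is_pool_lun
        and req.owning_sp != req.receiving_sp
    )


if __name__ == "__main__":
    clone = CopyRequest(uses_xcopy=True, target_is_pool_lun=True,
                        owning_sp="A", receiving_sp="B")
    print(is_exposed(clone))
```

The check only restates the conditions in the text; it says nothing about how likely the latency spikes are once a request matches.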
Disable the VAAI Primitive HardwareAcceleratedMove (other VAAI Primitives do not need to be changed)
See VMWare KB http://kb.vmware.com/kb/1033665 for instruction on how to disable VAAI Primitives within ESX
Avoid sending ExtendedCopy/XCOPY requests for a LUN to the non-owning SP, and avoid cloning or deploying a VM from a template to a LUN on the non-owning SP.
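For the first workaround, here is a small Python sketch that only builds the per-host command lines and does not run anything. It assumes ESXi 5.x, where the VMware KB referenced above documents `esxcli system settings advanced set` with the option path `/DataMover/HardwareAcceleratedMove`. The host names and function names are placeholders; check the KB for your ESX version before applying anything.

```python
# Advanced-setting path for the XCOPY (full copy) primitive, per VMware KB 1033665.
# Only this primitive is touched; the other VAAI primitives are left alone.
HW_ACCEL_MOVE = "/DataMover/HardwareAcceleratedMove"


def set_command(enabled: bool) -> str:
    """Build the esxcli command that enables (1) or disables (0) the primitive."""
    value = 1 if enabled else 0
    return (
        "esxcli system settings advanced set "
        f"--int-value {value} --option {HW_ACCEL_MOVE}"
    )


def list_command() -> str:
    """Build the esxcli command that shows the current value, for verification."""
    return f"esxcli system settings advanced list --option {HW_ACCEL_MOVE}"


def plan_for_hosts(hosts, enabled=False):
    """Return (host, command) pairs: one 'set' and one 'list' per host."""
    plan = []
    for host in hosts:
        plan.append((host, set_command(enabled)))
        plan.append((host, list_command()))
    return plan


if __name__ == "__main__":
    # Placeholder host names; replace with the hosts in the affected cluster.
    for host, cmd in plan_for_hosts(["esx01.example.com", "esx02.example.com"]):
        print(f"{host}: {cmd}")
```

Printing the plan first makes it easy to review exactly what would be run on each host, and the same function with `enabled=True` gives the commands to turn the primitive back on once a fixed VNX OE release is installed.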
This issue is due to be fixed in a future release of VNX OE, scheduled for Q1 2013.
Note as well that you are on Rel 32 P11, which is also vulnerable to ETA EMC308914.