I have a customer who is attempting a test failover on VMware SRM, and it's failing with the following error:
"Failed to recover datastore 'V_PLAT_01'. VMFS volume residing on recovered devices '"60:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:x:xx:xx:xx"' cannot be found."
They are using RecoverPoint 4.1 to replicate from source to target, and the storage is VNX at both ends. There is only one LUN in the CG, and the datastore holds a single VM, which is the one they are trying to recover in the test.
The LUN is added to the correct ESX storage groups on both the source and target sides and is visible. Everything looks good from a storage and RP perspective, but the error still occurs every time they retry.
I think they have already tried changing the host rescan value, setting the Hardware Accelerated Lock to 0, etc., and it did not help.
If anyone has seen this behavior in the past and got it fixed, please share your thoughts. Appreciate your help.
I am having the same problem right now as well. I just hit it and I'm still in the middle of troubleshooting, but I've seen this before on another install I did a few years back, and it came down to the WAN between the sites not being sufficient. What was happening is that even though RP shows the remote copy in "Image Access" mode (as in my case), SRM times out because it takes too long for the image to become available to the host. I now see new advanced settings in SRM that might allow me to extend the mount time for the replica copy, so I'm going to try that first. A few years back these settings weren't available and we ended up having the customer increase their WAN bandwidth, after which the problem went away. I'll let you know if changing the settings fixes it. In the meantime, please let us know if you have found the solution to your problem so we can learn from it as well.
I figured out the problem. Well, at least what was happening with mine, but I think you are having the same problem. Some SRAs (in this case RP) take a little more time to enable access to the replica for presentation to the host, so that the host can see and mount the VMFS datastore. The default wait for this in SRM is 0, so you need to change the setting to a value between 20 and 180 seconds, which gives RP time to enable remote access so that the host can mount the datastore. It's actually mentioned on pg 140 of the SRM Admin guide for 5.8. Here's the excerpt that describes your problem.
Rescanning Datastores Fails Because Storage Devices are Not Ready
When you start a test recovery or a recovery, some SRAs send responses to Site Recovery Manager before a
promoted storage device on the recovery site is available to the ESXi hosts. Site Recovery Manager rescans
the storage devices and the rescan fails.
If storage devices are not fully available yet, ESXi Server does not detect them and Site Recovery Manager
does not find the replicated devices when it rescans. This can cause several problems.
--Datastores are not created and recovered virtual machines cannot be found.
--ESXi hosts become unresponsive to vCenter Server heartbeat and disconnect from vCenter Server. If this happens, vCenter Server sends an error to Site Recovery Manager and a test recovery or real recovery fails.
--The ESXi host is available, but rescanning and disk resignaturing exceed the Site Recovery Manager or vCenter Server timeouts, resulting in a Site Recovery Manager error.
The storage devices are not ready when Site Recovery Manager starts the rescan.
To delay the start of storage rescans until the storage devices are available on the ESXi hosts, increase the
storageProvider.hostRescanDelaySec setting to a value between 20 and 180 seconds. See “Change Storage
Provider Settings,” on page 108.
NOTE: In Site Recovery Manager 5.1 and earlier, you might have used the storageProvider.hostRescanRepeatCnt parameter to introduce a delay in recoveries. Use the storageProvider.hostRescanDelaySec parameter instead.
To make the change, you have to modify the advanced settings for the site in SRM:
1. In the vSphere Web Client, click Site Recovery > Sites, and select a site.
2. On the Manage tab, click Advanced Settings.
3. Click Storage Provider.
4. Click Edit and set storageProvider.hostRescanDelaySec to a value between 20 and 180.
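
For anyone wondering why a delay of 0 fails, here is a rough Python sketch of the timing. It is only a toy model: the function names and the 45-second "device ready" figure are made up for illustration and are not part of any SRM or RecoverPoint API. The 20 and 180 bounds are the ones from the admin guide excerpt above.

```python
# Toy model of the rescan race. All names here are hypothetical and
# do not correspond to any SRM, SRA, or RecoverPoint API.
RECOMMENDED_MIN_SEC = 20   # lower bound quoted from the SRM admin guide
RECOMMENDED_MAX_SEC = 180  # upper bound quoted from the SRM admin guide


def rescan_finds_device(host_rescan_delay_sec: int, device_ready_after_sec: int) -> bool:
    """Return True if the rescan starts at or after the moment the device is ready."""
    return host_rescan_delay_sec >= device_ready_after_sec


def in_recommended_range(host_rescan_delay_sec: int) -> bool:
    """Check a proposed delay against the 20 to 180 second range from the guide."""
    return RECOMMENDED_MIN_SEC <= host_rescan_delay_sec <= RECOMMENDED_MAX_SEC


if __name__ == "__main__":
    device_ready_after_sec = 45  # assumed figure, purely for illustration
    for delay in (0, 20, 60, 180):
        found = rescan_finds_device(delay, device_ready_after_sec)
        ok_range = in_recommended_range(delay)
        print(f"delay={delay}s device_found={found} in_recommended_range={ok_range}")
```

In this model a delay shorter than the time the replica takes to become visible means the rescan runs too early and finds nothing, which matches the "VMFS volume ... cannot be found" symptom. In practice you don't know the exact ready time, so you would start somewhere in the recommended range and increase the value if the test recovery still fails.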