NVP-vProxy: Image recovery to VxRail failing when VMware stops the operation
Summary: An NVP-vProxy image recovery to a VxRail vSAN datastore fails because VMware stops the operation when the datastore is marked as out of space.
Symptoms
The NetWorker VMware Protection (NVP) integration is configured with the vProxy Appliance. A Virtual Machine (VM) image-level restore fails when the selected target is a VxRail vSAN datastore. The vSphere Web Client reports a disk space alert for the vProxy VM, even though the VxRail vSAN datastore has enough space for the restore.
MM/DD/YYYY HH:MM:SS TRACE: [@(#) Build number: 362] Finished setting up resources for VMDK restore. Transport mode used is "hotadd".
MM/DD/YYYY HH:MM:SS NOTICE: [@(#) Build number: 362] Moving data ...
The NetWorker Management Console (NMC) reports that the restore timed out, and the vProxy processing the work order becomes unavailable. The NetWorker server daemon.raw shows:
- Linux: /nsr/logs/daemon.raw
- Windows (default): C:\Program Files\EMC NetWorker\nsr\logs\daemon.raw
- To render the file, see the article "NetWorker: How to use nsr_render_log to render .raw log files"
MM/DD/YYYY HH:MM:SS NetWorker_Server nsrdisp_vproxy NSR info libCURL: function "curl_easy_perform" returned error 28: "Connection timed out after 90000 milliseconds"
MM/DD/YYYY HH:MM:SS NetWorker_Server nsrd NSR info VM proxy Warning event: vProxy 'vProxy_name' is unavailable.
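To check a rendered daemon log for this pattern, a small script along the following lines may help. It is only a sketch: the function name find_restore_timeouts is hypothetical, the patterns are copied from the two messages shown above, and it assumes the log has already been rendered to plain text.

```python
import re

# Patterns built from the two daemon.raw messages quoted in this article.
CURL_TIMEOUT = re.compile(r'"curl_easy_perform" returned error 28')
PROXY_DOWN = re.compile(r"vProxy '([^']+)' is unavailable")


def find_restore_timeouts(lines):
    """Scan rendered log lines (hypothetical helper, not a NetWorker tool).

    Returns a tuple: (lines reporting the libCURL timeout,
                      names of vProxies reported as unavailable).
    """
    timeouts = []
    proxies = []
    for line in lines:
        if CURL_TIMEOUT.search(line):
            timeouts.append(line.strip())
        match = PROXY_DOWN.search(line)
        if match:
            proxies.append(match.group(1))
    return timeouts, proxies


# Sample input using the placeholder timestamps and names from the article.
sample = [
    'MM/DD/YYYY HH:MM:SS NetWorker_Server nsrdisp_vproxy NSR info libCURL: '
    'function "curl_easy_perform" returned error 28: '
    '"Connection timed out after 90000 milliseconds"',
    "MM/DD/YYYY HH:MM:SS NetWorker_Server nsrd NSR info VM proxy Warning "
    "event: vProxy 'vProxy_name' is unavailable.",
]

timeouts, proxies = find_restore_timeouts(sample)
print(len(timeouts), proxies)
```

To scan a real log, pass an open file object for the rendered text file in place of the sample list. A timeout line followed by an "unavailable" event for the same vProxy matches the symptom described here, but it does not by itself confirm the cause.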
Cause
The available free space is not balanced across the VxRail vSAN datastore, so the restore fills one of the disks in the array. VMware pauses the vProxy VM when this disk runs out of space. The timeout errors noted in NetWorker are the result of the vProxy VM being paused when this condition occurs.
Resolution
After the failure, the restore VMDKs remain mounted on the vProxy, so they must be manually removed from the vProxy and deleted. The vProxy has no control or influence over the storage policy configuration on the vSAN. You can modify the vSAN storage policy settings to remove Failures to Tolerate (FTT) and increase the Stripe Width (SW). This spreads the recovery data across more disks and prevents a single disk from filling.
SW = 9: Changing the stripe width to 9 spreads the recovery over more disks in the array.
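The effect of these two settings can be illustrated with some rough arithmetic. The sketch below is a simplified model, not a vSAN sizing tool: it assumes RAID-1 mirroring (FTT + 1 full copies of the data), reads "remove FTT" as FTT = 0, ignores witness components and any other overhead, and uses a hypothetical 900 GB VMDK. The function name restore_footprint is also hypothetical.

```python
def restore_footprint(vmdk_gb, ftt, stripe_width):
    """Approximate how a restored VMDK is laid out under a storage policy.

    Simplified model: RAID-1 mirroring keeps ftt + 1 copies, and each copy
    is split into stripe_width components placed on different disks.
    """
    if ftt < 0 or stripe_width < 1:
        raise ValueError("ftt must be >= 0 and stripe_width must be >= 1")
    copies = ftt + 1
    return {
        "copies": copies,
        "components": copies * stripe_width,
        "gb_per_component": vmdk_gb / stripe_width,
        "total_gb": vmdk_gb * copies,
    }


# Hypothetical 900 GB VMDK under two policies.
mirrored = restore_footprint(900, ftt=1, stripe_width=1)
striped = restore_footprint(900, ftt=0, stripe_width=9)

print(mirrored)  # 2 copies, each written as one 900 GB component
print(striped)   # 1 copy, written as nine 100 GB components
```

Under this model, the first policy must place a full 900 GB component on a single disk, twice, while the second writes only about 100 GB to each of nine disks. That is why a disk with little free space is less likely to fill during the restore.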