3 Posts

November 20th, 2012 19:00

Specifying a non-replicated datastore for transient data using array-based replication and VMware SRM

Hi all,

Apologies if this has been asked already, but I've searched and come up with nothing on this.

The basic problem is that it does not appear to be possible to take quiesced virtual machine snapshots with the EqualLogic Host Integration Tools for VMware (HIT/VE) in an environment that is set up to NOT replicate "transient data" (virtual machine swap files and Windows paging files). In such an environment, a replication schedule configured in HIT/VE will run as specified, but not take VM level snapshots of the protected VMs, hence replicas aren't quiesced at the file system and application levels, and recovery is done from a replica which is, at best, crash consistent.

I've been testing SRM 5.0 with Dell EqualLogic firmware 5.2.5. In production I'll eventually be deploying SRM 5.1 and firmware 6.0.1. From my reading, I'll almost certainly face the same issue with these versions, as I believe it is a limitation of the EqualLogic Host Integration Tools for VMware, and I am already using the latest version (3.1.2).

I want to avoid replicating the transient data to save bandwidth. The SRM 4.0 Administration guide included a comprehensive explanation of how to avoid replicating both vswp files and Windows paging files. The guide for SRM 5.1 only has the procedure for placing the vswp on a non-replicated datastore, and is qualified by the statement:

"CAUTION Under normal circumstances, you should keep the swap files in the same datastore as the other virtual machine files. However, you might need to prevent replication of swap files to avoid excessive consumption of network bandwidth. Also, some storage vendors recommend that you do not replicate swap files. Only prevent replication of swap files if it is absolutely necessary."

For HIT/VE to take a quiesced snapshot at the VM level in sync with the replication snapshot at the array level, the entire VM, including all of its transient data, must reside on one datastore. If the vswp file resides on a different datastore, the HIT/VE "Create Smart Copy Replica" wizard shows the warning "VM [virtual machine] is located on 2 datastores. VM snapshot will not be created". As I stated above, a smart copy replication schedule created in HIT/VE will run without any warnings, and simply not snapshot the VMs.

It would appear that we are forced to choose between properly quiesced VM replicas, and saving bandwidth by excluding useless data from replication. Has anyone else encountered this situation? If so, what was your choice? Better chance of a smooth recovery at the expense of a higher RPO is presumably preferable. Is there a workaround?

3 Posts

November 28th, 2012 15:00

Thanks for your suggestion, Don. I didn't realise it had been renamed. I will test that and see if what I want to do is possible.

1 Message

December 11th, 2012 02:00

Hi There, 

Did you manage to isolate a page file into a separate non-replicated datastore and get VSM to smartcopy it successfully!?

3 Posts

December 20th, 2012 23:00

Sorry it's taken me so long to reply here - had some production issues that delayed my testing.

My test environment consists of vSphere 5.1, EqualLogic firmware 6.0.1 and EqualLogic VSM 3.5.

So, some good news: setting a separate non-replicated datastore (SAN volume) as the VM swap file location no longer causes VSM to throw the error "VM [vm name] is located on 2 datastores. VM snapshot will not be created", which is the behavior shown by previous versions of ASM/VE. ESXi takes a quiesced snapshot as expected, before replication of the volume begins. So the disclaimer in the SRM 5.1 admin guide that "some storage vendors recommend that you do not replicate swap files" would appear not to apply to EqualLogic any more.

My replicated test volume contained 1 Windows 2008 R2 VM with its .vswp file on a non-replicated datastore. I promoted the replica and added the VM to the inventory of an ESXi host at my remote recovery site and it booted without issue, with the snapshot available. However, if the virtual machine has virtual disks that reside on more than one datastore (such as if a separate virtual disk is used for the Windows paging file, as described in the SRM 4.0 admin guide), VSM will still throw the above-quoted error. Essentially the new version of VSM solves half my problem.

I have tried attaching an RDM as a potential location for the paging file, but VSM refuses to snap the VM with the error "VM [VM name] will not be snapped: VM has RDM attached". When I had that idea I really thought it might work :-(
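To summarise the behaviour I have observed so far, here is a rough Python sketch of the rules as they appear from my testing. It is only a model of what I saw, not how VSM is actually implemented: the `VmLayout` class, the `snapshot_blocker` function and the `swap_counts` flag are all names I made up for illustration. `swap_counts=True` approximates the older HIT/VE 3.1.2 behaviour (the .vswp location counts towards the datastore total), and the default approximates VSM 3.5. I only saw the RDM message on VSM 3.5, so applying it in both modes is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VmLayout:
    """Hypothetical description of where a VM's files live."""
    name: str
    disk_datastores: List[str]  # datastore holding each virtual disk
    swap_datastore: str         # datastore holding the .vswp file
    has_rdm: bool = False       # True if a raw device mapping is attached


def snapshot_blocker(vm: VmLayout, swap_counts: bool = False) -> Optional[str]:
    """Return the reason the VM snapshot would be skipped, or None.

    swap_counts=True  -> the swap file location counts (older behaviour).
    swap_counts=False -> only virtual disks count (behaviour seen on VSM 3.5).
    """
    if vm.has_rdm:
        return f"VM {vm.name} will not be snapped: VM has RDM attached"

    datastores = set(vm.disk_datastores)
    if swap_counts:
        datastores.add(vm.swap_datastore)

    if len(datastores) > 1:
        return (f"VM {vm.name} is located on {len(datastores)} datastores. "
                "VM snapshot will not be created")
    return None


if __name__ == "__main__":
    vm = VmLayout("test-vm", ["replicated-ds"], "swap-ds")
    print(snapshot_blocker(vm))                    # swap file ignored
    print(snapshot_blocker(vm, swap_counts=True))  # swap file counted
```

With the swap file on its own datastore, the first call reports no blocker and the second reports the "2 datastores" warning, which matches the difference I saw between the two tool versions.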

My only other idea to avoid replicating Windows paging files is to provision a PS volume for each protected VM, connect directly to that volume using the Windows iSCSI initiator within the protected guest OS, and set the paging file on it. This is problematic as the iSCSI initiator relies on a service that starts as the OS boots, so Windows is without a paging file when it first starts up, and creates a temporary one on the system disk even if that disk is set to have no paging file. Manually setting a very small (~200 MB) paging file on the system disk, and a normal-size one on the guest iSCSI-attached volume, seems to overcome this problem, but I am going to do some very thorough testing of this concept before considering applying it to production for the sake of a better RPO.
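For anyone wanting to script the two-paging-file layout, here is a minimal sketch of the entries involved. As far as I know, Windows stores the paging file configuration in the `PagingFiles` multi-string value under `HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management`, with one entry per file in the form `path initial_MB maximum_MB`. The function name, the `P:` drive letter and the 4096 MB size are placeholders I picked for illustration, not values from my test setup. The sketch only builds the strings and does not touch the registry.

```python
from typing import List


def paging_file_entries(system_drive: str = "C",
                        system_mb: int = 200,
                        iscsi_drive: str = "P",
                        iscsi_mb: int = 4096) -> List[str]:
    """Build paging file entries in 'path initial_MB maximum_MB' form.

    The first entry is the small fallback file on the system disk, the
    second is the normal-size file on the guest iSCSI-attached volume.
    Initial and maximum sizes are set equal, giving fixed-size files.
    """
    return [
        f"{system_drive}:\\pagefile.sys {system_mb} {system_mb}",
        f"{iscsi_drive}:\\pagefile.sys {iscsi_mb} {iscsi_mb}",
    ]


if __name__ == "__main__":
    for entry in paging_file_entries():
        print(entry)
```

The same settings can of course be made by hand in the Virtual Memory dialog; the sketch is just a way to keep the layout consistent across several protected VMs.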

I've not yet tested with SRM, and I do not think the SRM version will have any bearing on the outcome as I am dealing with the interaction between vSphere/ESXi and the array, via the VSM (VMware Host Integration Tools).
