Unsolved

February 22nd, 2010 10:00

RecoverPoint 3.2 - Offset (in MB) from current.........

When I fail over my ESX environment with SRM, RecoverPoint shows an "Offset (in MB) from current snapshot to accessed snapshot" value under the system monitoring tab > groups. It climbs steadily during the failover and has a limit of 98. Can someone tell me what this value means and whether the limit can be changed? During a DR test it reached 98, which basically rendered the LUNs inaccessible.

TIA

5 Practitioner

274.2K Posts

February 23rd, 2010 06:00

I assume you were running a "Test Recovery Plan" operation and not a real Recovery Plan, meaning production was still up and running.

For a test failover, SRM uses RecoverPoint's virtual image access mode.

In image access mode, all writes at the DR site go to the TSP portion of the DR journal.

Since the DR VMs are up and running, they are writing to the TSP.

The TSP portion is normally 20% of the DR journal size.

When the TSP portion fills up, RecoverPoint has to take away image access at the DR site.

Another possibility is that the replication portion of the journal is filling up because of continued writes from the production-site VMs, but the explanation for that would be different.

To allow more time for a test, you can either:

1. Increase the size of the Target Journal (add more journal devices)

2. Increase the TSP portion of the journal. It can be raised to as much as 80%, but that leaves you with less journal history for replication.
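To put rough numbers on that trade-off, here is a back-of-the-envelope sketch in Python. It assumes the TSP area is a plain percentage of total journal capacity; the function name and the 1000 GB journal are hypothetical and do not come from any RecoverPoint tool.

```python
def split_journal(journal_gb, tsp_fraction=0.20):
    """Return (tsp_gb, replication_gb) for a journal of the given size.

    Assumption: the TSP area is a simple fraction of the journal,
    20% by default and at most 80%, as described in the post above.
    """
    if journal_gb <= 0:
        raise ValueError("journal_gb must be positive")
    if not 0.0 < tsp_fraction <= 0.80:
        raise ValueError("tsp_fraction must be greater than 0 and at most 0.80")
    tsp_gb = journal_gb * tsp_fraction
    replication_gb = journal_gb - tsp_gb
    return tsp_gb, replication_gb


if __name__ == "__main__":
    # Hypothetical 1000 GB journal at the default and at the maximum setting.
    for fraction in (0.20, 0.80):
        tsp, history = split_journal(1000, fraction)
        print(f"TSP {fraction:.0%}: {tsp:.0f} GB for test writes, "
              f"{history:.0f} GB left for replication history")
```

Both options buy test time, but only the second one costs you replication history, which is why the two numbers are returned together.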

This shouldn't happen during a "real" SRM failover.

Regards,

Niv

92 Posts

March 11th, 2010 04:00

I have been having this same issue and wanted to report what I am seeing. First, I increased the target journal size and also increased the TSP to 80%. From what I just read, I gathered that even if the virtual access log fills up, image access at the target would continue as long as the journal was big enough. That is not the case. Here is the error I got:

Target-side log or virtual buffer is full -- writing by hosts at copy is disabled.

My log was nowhere close to 800 GB full. The journal log was less than 100 GB. So it looks like no matter what, when the virtual buffer fills up, you are screwed and target writes cease.

This is a HUGE problem for testing a recovery plan. This was not an issue prior to RecoverPoint 3.2 and it needs to be fixed ASAP. Now the only way I can get enough time to test is to bag SRM, manually present the LUNs via physical access in RecoverPoint, and manually import all the virtual machines into vCenter. So much for saving time using SRM for a test. Ridiculous.

117 Posts

April 7th, 2010 04:00

In RP 3.2 and below, SRM test functions use virtual access mode, which is limited to 40 GB of change tracking. This does not use the 20% of the journal mentioned in the previous post; the 20% TSP area is only used in logged access mode. If you upgrade to RP 3.3, your SRM test will use logged access, and your change-tracking capacity will therefore grow with the amount of journal space allocated to the CG copy.
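If you want a feel for how long a test can run under that 40 GB ceiling, a rough estimate like the one below may help. It is only a sketch: it assumes a constant average write rate at the DR site, which real workloads will not have, and the function and parameter names are made up for illustration.

```python
VIRTUAL_ACCESS_LIMIT_GB = 40  # figure quoted in this post for RP 3.2 and below


def minutes_until_limit(avg_write_mb_per_s, already_written_gb=0.0,
                        limit_gb=VIRTUAL_ACCESS_LIMIT_GB):
    """Estimate minutes of test time left before the change limit is hit.

    Assumption: DR-side writes arrive at a steady average rate.
    """
    if avg_write_mb_per_s <= 0:
        raise ValueError("avg_write_mb_per_s must be positive")
    remaining_mb = max(limit_gb - already_written_gb, 0) * 1024
    return remaining_mb / avg_write_mb_per_s / 60


if __name__ == "__main__":
    # Hypothetical test: 20 GB already consumed at power-up, then 10 MB/s.
    print(round(minutes_until_limit(10, already_written_gb=20), 1), "minutes left")
```

The `already_written_gb` argument is there because a burst of writes at power-up can use a large share of the limit before any real testing starts.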

One word of caution. When I see this issue, I always check that the SRM best practice of relocating the default swap file storage location to a non-replicated VMFS LUN is being followed. See the SRM release notes for details. When a VM powers on, VMware zeroes out the swap file. If you have 10 VMs, each with 2 GB of memory, the default swap file is 2 GB per VM, so that is 20 GB of change at power-up during your test.
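The same arithmetic is easy to run against your own VM list. The sketch below assumes each swap file equals the VM's configured memory minus any memory reservation, and that the whole file is written at power-on as described above; the function name and sample values are hypothetical.

```python
def swap_change_gb(vm_memory_gb, reservations_gb=None):
    """Total swap-file writes (GB) expected when all listed VMs power on.

    Assumption: swap file size = configured memory - memory reservation,
    and the full file is zeroed at power-on.
    """
    if reservations_gb is None:
        reservations_gb = [0] * len(vm_memory_gb)
    if len(reservations_gb) != len(vm_memory_gb):
        raise ValueError("one reservation value is needed per VM")
    return sum(max(mem - res, 0)
               for mem, res in zip(vm_memory_gb, reservations_gb))


if __name__ == "__main__":
    # The example from this post: 10 VMs with 2 GB of memory each.
    total = swap_change_gb([2] * 10)
    print(f"{total} GB of swap writes at power-up, against a 40 GB limit")
```

Running this for the VMs in a recovery plan shows how much of the change-tracking limit is spent before the test even begins if the swap files sit on a replicated LUN.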

ALWAYS use this best practice at a minimum on your target SRM site, but ideally apply it at both sites. Otherwise you are constantly adding load to your replication infrastructure with no value added for DR recovery.

Making this change benefits all RP/SRM installations, regardless of the RP/SRM versions in use.
