I would like to make some application changes on a CG replicated by RPA. As a contingency, replication will be paused during the change. If the application change fails, we need to roll back. My question: what happens if the journal fills up during the pause period? Can I still perform a restore when the journal is full?
I am worried the journal will fill up, as under normal operation it only holds about three days of history. If the application change is very large, the journal may fill up.
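To make the worry concrete: if the journal holds roughly three days of history at the usual write rate, a heavier change-window write rate shortens that window proportionally. A tiny illustrative sketch (the numbers are hypothetical, not from any sizing tool):

```python
def journal_fill_time_days(normal_retention_days, rate_multiplier):
    """If the journal holds `normal_retention_days` of history at the usual
    write rate, estimate how long it lasts when writes run at
    `rate_multiplier` times the normal rate."""
    return normal_retention_days / rate_multiplier

# 3 days of retention, but the maintenance drives 4x the usual write rate:
print(journal_fill_time_days(3, 4))  # -> 0.75 (the journal fills in ~18 hours)
```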
Before the application change:
1. Apply a bookmark.
2. Pause the transfer.
If we need to roll back, can I still roll back to the bookmark if the change set is too large?
I'd be more inclined to take a bookmark and, at the same time, define a consolidation policy for the snapshot, say to "Survive Weekly" consolidations. That way, even if the journal fills and consolidation is run, the bookmark/snapshot will be preserved.
This way you can leave replication running, roll back to the bookmark if required, and if everything goes smoothly your journal history will still be available.
Just to clarify a couple of things. First, the production journal cannot fill up: it contains a bitmap of the changes performed on the production volumes, on a per-production-copy basis.
When replication is paused, we enter marking mode, which records which blocks have changed. When replication is resumed, the changes are transferred to the target side; if the changes are larger than the target journal, we enter a long resync, which causes loss of journal history.
The default behavior is to allow that long resync and keep replication operational and continuous. The "allow long resync" parameter can be disabled through the CLI, which will pause replication when a long resync is due, allowing the user to image-access any of his PiTs before they are deleted; replication then resumes only with manual user intervention.
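The resume-time decision described above can be modeled as a small sketch (the function and threshold names are hypothetical; the real logic lives inside the RPA):

```python
def on_resume_transfer(marked_bytes, journal_capacity_bytes, allow_long_resync):
    """Model of what happens when transfer resumes after marking mode.

    marked_bytes: total size of the blocks flagged in the production bitmap.
    Returns a short label for the resulting behavior.
    """
    if marked_bytes <= journal_capacity_bytes:
        # Changes fit in the target journal: short init, history preserved.
        return "short_init"
    if allow_long_resync:
        # Default: full sweep of the marked regions; journal history is wiped.
        return "long_resync_history_lost"
    # "allow long resync" disabled via CLI: replication pauses so the user can
    # image-access existing PiTs before resuming manually.
    return "paused_for_user_intervention"

GIB = 1024 ** 3
print(on_resume_transfer(50 * GIB, 100 * GIB, True))    # -> short_init
print(on_resume_transfer(200 * GIB, 100 * GIB, True))   # -> long_resync_history_lost
print(on_resume_transfer(200 * GIB, 100 * GIB, False))  # -> paused_for_user_intervention
```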
Another thing to note: if snapshot consolidation is not configured, the bookmark-survival setting has no effect. Furthermore, if a long resync occurs, the entire journal history is lost.
Going back to the use case at hand: if the load during the application maintenance is expected to be high, pausing replication will ensure that the target PiTs are not lost. If replication must keep running during the maintenance, I would recommend either sizing the journal to account for the write burst or disallowing long resync. For the latter, just keep in mind that replication will be disrupted if the changes are larger than the journal history.
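A rough back-of-the-envelope check for the "account for the write burst" option. This is only an illustrative sketch: the reserve percentage and all numbers are assumptions, not RecoverPoint defaults.

```python
def journal_headroom_ok(journal_gib, reserve_pct, burst_mib_s, window_hours):
    """Check whether an expected write burst fits in the usable journal space.

    Some journal space is typically held in reserve (e.g. for image access),
    so only the remainder can absorb incoming snapshots. All values hypothetical.
    """
    usable_gib = journal_gib * (1 - reserve_pct / 100)
    burst_gib = burst_mib_s * 3600 * window_hours / 1024
    return burst_gib <= usable_gib, burst_gib

# 500 GiB journal, 20% reserved, maintenance writes 40 MiB/s for 4 hours:
ok, burst = journal_headroom_ok(500, 20, 40, 4)
print(ok, round(burst, 1))  # -> False 562.5  (562.5 GiB burst vs 400 GiB usable)
```

If the check comes back False, either grow the journal for the window or plan on pausing transfer as discussed above.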
Hope that helps,
RecoverPoint Corporate Systems Engineer
Tech Staff Engineering Technologist - Data Protection
I just performed a quick test in the lab. The purpose of this exercise: due to space constraints, we can only provision about 10% of the desired journal volume capacity. The customer is attempting to perform database changes and may need to recover or revert them. Pausing the transfer seems to be the best option here.
1. Scenario 1: Create a bookmark. Enable image access at DR. Try to fill up the journal by repeatedly overwriting the production volume. The entire CG goes into an error state once the journal fills up. If I then attempt to recover production using the bookmark, I am unable to do so.
2. Scenario 2: Create a bookmark. Pause the transfer. Try to fill up the journal; since the transfer is paused, the journal never fills up. I can successfully recover production using the bookmark.
3. Scenario 3: Create a bookmark. Enable image access at DR and also pause the transfer. Try to fill up the journal; since the transfer is paused, the journal never fills up. I tried to recover production, but first I had to disable the test copy. However, once I do that, the CG automatically goes from paused to init state, and if the changes are too large, the previous bookmark is gone.
We are having network-drop issues on the WAN link, and many of the CGs frequently go into long-resync mode. I am confused between two cases: is the long resync due to data corruption / the journal being marked dirty because of packet drops, or is it because the changes to be transferred are larger than the target journals? Is there any way to tell these two scenarios apart?
The two things are related. When the network is having problems, the link pauses and marking takes place. When the link resumes, the marked metadata is read, and the associated data on the array is read and replicated (a short init, although it may not seem that short!). If the accumulated write data is larger than the replica copy's journal, a long resync will occur. You can confirm this by checking your replica copy's journal history, as it will have been wiped.
Issues such as these are best dealt with by manually pausing the less business-critical CGs, because competition over resources can compound the problem.
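That triage can be expressed as a simple prioritisation: given a limited WAN budget, pause the lowest-priority CGs until the remaining backlog can drain within a deadline. A toy sketch (all CG names, priorities, and numbers are hypothetical):

```python
def cgs_to_pause(cgs, wan_mib_s, drain_deadline_hours):
    """cgs: list of (name, priority, backlog_gib); higher priority = more critical.

    Pause the least critical CGs until the remaining marked backlog fits the
    amount of data the WAN can push before the deadline."""
    budget_gib = wan_mib_s * 3600 * drain_deadline_hours / 1024
    active = sorted(cgs, key=lambda c: c[1])  # least critical first
    paused = []
    while active and sum(c[2] for c in active) > budget_gib:
        paused.append(active.pop(0)[0])  # pause the least critical remaining CG
    return paused

cgs = [("cg_payroll", 3, 120), ("cg_test", 1, 300), ("cg_reports", 2, 150)]
# 20 MiB/s WAN, 4-hour window -> ~281 GiB budget; pausing cg_test is enough.
print(cgs_to_pause(cgs, wan_mib_s=20, drain_deadline_hours=4))  # -> ['cg_test']
```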
Consultant Corporate Systems Engineer - RecoverPoint & VPLEX (EMEA)
Data Protection and Availability Solutions
EMC Europe Limited