SyncIQ - clean up after failback
Hello guys/gals,
Below is a copy/paste from the OneFS WebAdmin guide. I have gone through multiple failovers/failbacks on my virtual cluster without issue; the steps are very simple. One thing that I don't understand is why, after we complete the failback process, we don't get rid of the _mirror policy. It did its job, it pushed new/changed data back to the primary cluster, so why do we need it? Is it an oversight in the documentation, or is there a good reason to keep this policy around?
Thank you
**********************************************
Procedure
- On the primary cluster, click Data Protection > SyncIQ > Policies .
- In the SyncIQ Policies table, in the row for a replication policy, from the Actions column, select Resync-prep. SyncIQ creates a mirror policy for each replication policy on the secondary cluster. SyncIQ names mirror policies according to the following pattern: <policy name>_mirror
- On the secondary cluster, replicate data to the primary cluster by using the mirror policies. You can replicate data either by manually starting the mirror policies or by modifying the mirror policies and specifying a schedule.
- Prevent clients from accessing the secondary cluster and then run each mirror policy again. To minimize impact to clients, it is recommended that you wait until client access is low before preventing client access to the cluster.
- On the primary cluster, click Data Protection > SyncIQ > Local Targets .
- In the SyncIQ Local Targets table, from the Actions column, select Allow Writes for each mirror policy.
- On the secondary cluster, click Data Protection > SyncIQ > Policies .
- In the SyncIQ Policies table, from the Actions column, select Resync-prep for each mirror policy.
After you finish
Redirect clients to begin accessing the primary cluster.
**********************************************
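For readers skimming the quoted steps, the failback flow can be restated as a toy Python sketch. This is illustrative only - the function names are made up, not OneFS APIs - but the mirror-policy naming follows the pattern quoted above, and "upgrade_ir" is the policy name used later in this thread.

```python
# Toy sketch of the documented failback flow. Not OneFS code; the
# function and variable names here are made up for illustration.

def resync_prep(policy):
    """Steps 1-2: Resync-prep on the primary creates a mirror policy
    on the secondary, named <policy name>_mirror."""
    return policy + "_mirror"

def failback(policy):
    mirror = resync_prep(policy)
    actions = [
        f"secondary: run {mirror} to push changes back",   # step 3
        "stop client access to the secondary cluster",     # step 4
        f"secondary: run {mirror} again",                  # step 4
        f"primary: Allow Writes for {mirror}",             # steps 5-6
        f"secondary: Resync-prep {mirror}",                # steps 7-8
        "redirect clients back to the primary cluster",    # after you finish
    ]
    return mirror, actions

mirror, actions = failback("upgrade_ir")
print(mirror)  # upgrade_ir_mirror
```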
bhalilov1
August 10th, 2015 13:00
When the *_mirror policy is created on the DR cluster, a protection domain is created on your PROD cluster. You can see it if you run:
isi_classic domain list
When you're finished and have failed back to PROD, the domain is in the Writable state. If you delete the *_mirror policy, the domain also gets deleted. The next time you fail over and run "isi sync recovery resync-prep" to fail back, you will have to wait for that domain mark job to run again - it may take a long time.
On the other side of the coin, if you decide to keep the mirror policy and the protection domain on your PROD cluster, you will not be able to mv files into the directory that's under the domain, even if it's in the "Writable" state.
bhalilov1
August 10th, 2015 13:00
Yes,
Step 6 changes your protection domain on the PROD cluster from "Write Disabled" to "Writable".
If you run the forward policy at this point, it will fail, since your DR side is still Writable as well.
In step 8, by running the resync-prep from PROD, you change the state of the domain on DR to "Write Disabled".
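The domain-state transitions just described can be captured in a minimal model. Again, this is an illustrative sketch, not OneFS code:

```python
# Protection-domain write states during failback, per the explanation
# above. Illustrative model only.
domains = {"PROD": "Write Disabled", "DR": "Writable"}  # state while failed over

# Step 6: Allow Writes on the mirror's local target makes PROD writable.
domains["PROD"] = "Writable"

# Running the forward (primary -> DR) policy now would fail, because the
# DR side is still writable too.
forward_policy_would_fail = domains["DR"] == "Writable"

# Step 8: Resync-prep on the mirror policy write-disables the DR domain,
# making DR a valid replication target again.
domains["DR"] = "Write Disabled"

print(domains)                    # {'PROD': 'Writable', 'DR': 'Write Disabled'}
print(forward_policy_would_fail)  # True
```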
dynamox
August 10th, 2015 13:00
Hi Burhan,
I guess I don't understand what happens in step 8. Why do we need to run resync-prep? My data has been copied from DR back to the primary cluster via the mirror policy, and now I am ready to restart my primary --> DR replication. Does step 8 do something on the DR cluster to prepare it to become a replication target again (make it read-only)?
Thank you
sluetze
August 11th, 2015 00:00
the mirror policy also leaves behind a snapshot (at least under 7.0.2.x) that grows like hell (the forum censors even harmless words like that) - in our case to several TB in a few weeks - and it is not needed, since it is the snapshot that gets created on the source of a replication. It is deleted automatically after one year. We started deleting it right after failing back, so as not to waste cluster space.
sluetze
August 11th, 2015 02:00
"SIQ-%ID-latest"
The %ID is the ID of the policy.
I had an SR with EMC and they said these snapshots are useless. Currently we are trying either to use our DR process tool to delete these snaps automatically, or we will file a feature request to have OneFS delete them automatically.
Regards
--sluetze
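As a side note, the SIQ snapshot names quoted throughout this thread follow a few recognizable shapes. Here is a rough classifier; the patterns are inferred from the examples in this thread, not from official documentation, so treat them as assumptions:

```python
import re

# Rough classifier for the SyncIQ snapshot names quoted in this thread.
# Patterns inferred from the examples here, NOT from official docs.
PATTERNS = [
    (re.compile(r"^SIQ-[0-9a-f]{32}-latest$"),
     "policy-ID snap (source of replication)"),
    (re.compile(r"^SIQ-[0-9a-f]{32}-restore-latest$"),
     "restore snap (failover/failback)"),
    (re.compile(r"^SIQ-Failover-.+-\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2}$"),
     "failover snap on target"),
    (re.compile(r"^SIQ-.+-latest$"),
     "cluster/policy 'latest' snap"),
    (re.compile(r"^SIQ-.+-\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2}$"),
     "cluster/policy timestamped snap"),
]

def classify(name):
    # First matching pattern wins; order goes from most to least specific.
    for pattern, label in PATTERNS:
        if pattern.match(name):
            return label
    return "unknown"

print(classify("SIQ-3d510f0edc31e18bdb2100dc1441306e-latest"))
print(classify("SIQ-w2isilonpoc-upgrade_ir_mirror-latest"))
```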
dynamox
August 11th, 2015 02:00
Burhan, this makes sense, but is there any reason to keep the mirror policy around after completing step 8?
dynamox
August 11th, 2015 02:00
Do you remember the name of the policy? I want to check on my 7.1.x cluster.
johnsonka
August 11th, 2015 06:00
Hello Dynamox,
I conferred with the subject matter expert for SyncIQ in Support, and he stated that the reason we do not recommend removing the mirror policy is that doing so destroys the LIN map it uses. So, if you have to redo the failover process, SyncIQ has to recreate the LIN map for the mirror policy from scratch. This can be a HUGE time penalty when your failover window is tight and there are millions of files to scan to build the new LIN map used by the new mirror policy once failover is complete.
I hope this answers some of your question, but please let me know if there is any other information that I can get for you! I am more than happy to help.
dynamox
August 11th, 2015 17:00
Katie,
I am still not following you; maybe you can work through the steps below with me and explain what's happening with these SIQ snapshots, and how the mirror policy plays its role.
here is a test scenario:
policy name = upgrade_ir
1) I am replicating from the primary cluster to the secondary cluster. When I check on the primary cluster I see snapshot SIQ-3d510f0edc31e18bdb2100dc1441306e-latest, and on the secondary cluster I see snapshot SIQ-Failover-upgrade_ir-2015-08-11_19-18-35.
This is what i consider my normal state.
************
2) I am failing over from the primary cluster to the secondary cluster. Upon completing that step, on the primary cluster I still see the same snapshot, but on the secondary cluster I now see two snapshots:
SIQ-3d510f0edc31e18bdb2100dc1441306e-restore-latest
SIQ-Failover-upgrade_ir-2015-08-11_19-33-34
***********
3) I am failing back from secondary to primary. I go to the primary cluster, select my policy, and hit "Resync Prep". At this point I look on the primary cluster and see these snapshots:
SIQ-3d510f0edc31e18bdb2100dc1441306e-latest
SIQ-w2isilonpoc-upgrade_ir_mirror-2015-08-11_19-37-17
SIQ-w2isilonpoc-upgrade_ir_mirror-latest
The secondary cluster has only this snapshot:
SIQ-0050568f48352c87ca553e2004ae341d-latest
Now I go to the secondary cluster and run the upgrade_ir_mirror policy. When I look on the primary cluster I see these snapshots:
SIQ-3d510f0edc31e18bdb2100dc1441306e-latest
SIQ-Failover-upgrade_ir_mirror-2015-08-11_19-41-39
and on secondary
SIQ-0050568f48352c87ca553e2004ae341d-latest.
Next, on the primary cluster I go to Local Targets and select Allow Writes for the upgrade_ir_mirror policy. At this point on the primary I see these snapshots:
SIQ-3d510f0edc31e18bdb2100dc1441306e-latest
SIQ-0050568f48352c87ca553e2004ae341d-restore-latest
SIQ-Failover-upgrade_ir_mirror-2015-08-11_19-48-25
and on secondary
SIQ-0050568f48352c87ca553e2004ae341d-latest
Finally, I go back to the secondary cluster and select Resync-prep for the upgrade_ir_mirror policy. The primary cluster has this snapshot:
SIQ-3d510f0edc31e18bdb2100dc1441306e-latest
and secondary
SIQ-0050568f48352c87ca553e2004ae341d-latest
SIQ-n2isilonpoc-upgrade_ir-2015-08-11_19-51-57
SIQ-n2isilonpoc-upgrade_ir-latest
So the question is: why does the mirror policy need to stay, and why do I end up with 3 SIQ snapshots on the secondary cluster compared to my "normal" state in 1)?
Thank you very much for your time
johnsonka
August 13th, 2015 06:00
Hello dynamox!
In reading through your question, I have some information that I hope helps you understand the mirror policy and the snapshots. The mirror policy is left in place to preserve the LIN map for the policy, so that any future failover/failback will be faster and more efficient. This reduces the time to recovery in the event of an unforeseen catastrophic failure or a planned migration between the clusters.
As for the snapshots, this is what I received from the SyncIQ SME here in Support in regards to your question:
All of the SIQ- snaps interact with yet another database used by the active cluster (whether that's the primary or the secondary) to sync up the LIN map on the other side. That is why you need the SIQ- snaps at all times. If you delete the snaps that a mirror policy references, you have destroyed that mirror policy.
With regard to the snaps pasted at the end of the question [based on naming convention]:
SIQ-0050568f48352c87ca553e2004ae341d-latest --- explained as part of the SIQ- explanation above.
SIQ-n2isilonpoc-upgrade_ir-2015-08-11_19-51-57 -- This one comes from a manually enabled option configured on the primary's policy. It has nothing to do with the mirror. That option can be disabled to prevent creating this snap.
SIQ-n2isilonpoc-upgrade_ir-latest -- Same reasoning as above.
If you have any more questions or there is anything I can clarify, please do let me know.
dynamox
August 13th, 2015 08:00
Katie,
thank you for trying, but your SME's replies are still very ambiguous.
I don't understand why the mirror policy needs to be there, especially for failover. If you look at my "normal" state, I have a snapshot on the primary cluster and an SIQ-Failover snapshot on the secondary. I can fail over to the secondary within seconds, so why the need to keep this mirror policy in place for "future" failovers?
I realize we need snapshots for failover/failback operations, but you have yet to explain why, at the end of failover/failback, I am not back to my original state of one snapshot on the source and one snapshot on the target. In the reply above, why is the SIQ-n2isilonpoc-upgrade_ir-2015-08-11_19-51-57 snapshot not deleted? Why is it left behind and not cleaned up by the platform? Sloppy implementation, or is there a good reason for it?
Thank you
sluetze
August 13th, 2015 09:00
Katie,
also some additional questions:
EMC told me in one of my SRs (I can give you the number if you want) that the SIQ-PolicyID snaps are safe to delete, and I deleted them on ALL of my clusters. Now you write that doing so destroys the mirror policy. Also, the SIQ-PolicyID snaps have an expiration date of one year... so do I have to fail over / fail back every year to preserve this snap?
The snap GROWS depending on the file modifications on the cluster. It grows to several TB (and I do not have a fast-changing environment!).
When I fail over / switch over to the secondary site, the SIQ snap is deleted and a new one is created (as far as I could see). The snap is only valid and useful on the SOURCE of a replication.
Regards
Steffen
johnsonka
August 13th, 2015 12:00
Hello dynamox and Steffen,
Thanks for replying here, we have been looking at this question today and my SME has discovered a couple more things:
In this case, dynamox, you are correct. Engineering added 2 extra redundant snaps for protection in case something unexpected happens during a failback.
As for the mirror policy, it is not required to stay. One keeps the mirror to achieve faster failovers after the first one.
If you choose to keep the mirror policy, then the SIQ snaps should remain, because the mirror policy uses them once it becomes active. Without them the mirror policy will break. If you choose to delete the mirror policy, then the snap below can be deleted as well.
SIQ-0050568f48352c87ca553e2004ae341d-latest -- This one cannot be deleted on its own, since deleting it will destroy the mirror relationship. You can delete it, if you want to, by deleting the mirror policy.
As for the following snaps, they can be removed safely; they are not related to the mirror and are not needed here. As to why they are not removed automatically - that appears to be a design decision.
SIQ-n2isilonpoc-upgrade_ir-2015-08-11_19-51-57 -- This one is fine to delete since it's a manual snap. It's not really tied to the internals of SyncIQ.
SIQ-n2isilonpoc-upgrade_ir-latest -- This one is also fine to delete, for the same reason.
sluetze,
I would like to see the SR number where you were told the SyncIQ snapshots would be OK to remove; I'd like to double-check the information you were given. You can leave it here, or private message me the information if you'd prefer.
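The deletability rules from the reply above can be condensed into a small helper. This is purely illustrative: the policy-ID pattern is inferred from the snapshot names in this thread, and you should verify against your OneFS version before deleting anything.

```python
import re

# Rule of thumb from the discussion above: a SIQ-<policy-id>-latest (or
# -restore-latest) snap backs the mirror policy and must stay while that
# policy exists; the other SIQ- snaps were described as safe to delete.
# Illustrative only - the 32-hex-char pattern is inferred from this thread.
def safe_to_delete(snap_name, keeping_mirror_policy=True):
    if re.match(r"^SIQ-[0-9a-f]{32}-(restore-)?latest$", snap_name):
        # Deleting this snap destroys the mirror relationship, so it is
        # only safe once you have decided to drop the mirror policy too.
        return not keeping_mirror_policy
    # Per the SME, other SIQ- snaps are not tied to the mirror policy.
    return True

print(safe_to_delete("SIQ-0050568f48352c87ca553e2004ae341d-latest"))  # False
print(safe_to_delete("SIQ-n2isilonpoc-upgrade_ir-latest"))            # True
```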
dynamox
August 13th, 2015 12:00
Katie,
none of the snaps I listed were created manually by me; they were all created during failover/failback. So when you say "manual" snap and "not tied to the internals of SyncIQ", I am even more perplexed and confused. If SyncIQ created these snapshots, how can they not be tied to SyncIQ?
I am not doing anything exotic here; feel free to recreate my steps in your environment and see what happens.
sluetze
August 14th, 2015 01:00
Hi Katie,
SR#69012210
I still want to know why the snapshots have an expiration date set when they are so important.
Appreciate your Feedback.
Regards
Steffen