November 4th, 2013 15:00

SRDF Desired states for powering off a vmax

Greetings,

We are moving our entire Disaster Recovery data center to a new facility and I have some questions about the SRDF operations, experience with powering down a target array, and any best practices therein.

I have about 200 R1-R2 pairs (dynamic) SRDF (synchronous).  After the databases and servers are shut down, I was planning on suspending SRDF.  Then resuming SRDF once the equipment has been moved and powered up.

Questions:

1.)  Is there a command that will suspend/resume an entire SRDF link or SRDF group?

     a.) If not, is it preferable to suspend one symdg at a time or create a device file with all 200 R1/R2 pairs?

2.)  Will the state of the R1 change with the target VMAX being powered off and losing the RA connections?

    a.) If so, what are the requirements to get the correct state changes when the target VMAX powers back up?

Thanks In Advance,

-jq

November 5th, 2013 14:00

Hi Julianne,

From your answer the scenario looks pretty clear, since the first thing to determine was whether you are working on mainframe or open systems.

1 - There is no real advantage to using a CG versus a DG or a file; they are just different ways of working. Some storage administrators (SAs) are used to working with CGs, and others use the -file option for specific actions.

You can use the split action not only with the -file option but also with a CG or DG.

But keep in mind the following:

About the power issues, I am not totally sure, but remember that if you are unplugging the VMAX, the box needs to be in "rest mode" for a certain period of time before you plug it in again. Check this with official EMC support for your company; this task will most probably be done by them anyway, right? Still, I think it is good for you to know.

2 - Regarding Solutions Enabler, remember to include all your devices. If you use a DG, it has to include all devices except some special ones. For instance, in my installation there are a couple of devices (coupling devices, mainframe stuff) that always remain with invalid tracks, so I do not include them in any SRDF task.

3 - Take a look at this command too. It is best practice, and it gives you a consistent picture at all times.

     symrdf -sid XXXX -file filename.txt query -summary -rdfg xx

The -summary option shows the number of Symmetrix devices in each box (depending on the Symmetrix ID you issue it against) and their state (synchronized, partitioned, etc.).

Here you have an example attached

Now going back to your questions

1) It sounds like I need to use the command file option to split the entire SRDFgroup at one time without needing to split each individual symdg (I have about 30 of them).  Do I understand that correctly?

It depends on how you have built your DGs. If one DG has all the devices you want to include in your task, you could use that DG; otherwise create a specific file containing all your Symmetrix devices for this particular task.

If you have multiple DGs in your installation and your idea is to split or resume all the devices included in all of them, doing it per DG would be messy and difficult to control. In that case I suggest creating a file instead. Does that make sense?

In my personal experience, when I face a challenge like yours, I use the file option, because it is a one-of-a-kind task and not a repetitive one. For a repetitive task I would use a DG, but you can use whichever you feel comfortable with.

2) what is the benefit of doing a SPLIT instead of RESUME?

With a split, you make sure link traffic is cut the whole time, because both boxes stop seeing each other.

Something I forgot to mention is that sometimes you have to use the -force or -symforce flag with split.

Split is frequently used in cases like this, where traffic to the R2 site is cut but the R1 site is still up and in production, which I think is your case.

Take a look at this:

A) Split: leaves both R1 and R2 in a write-enabled state. Actions: 1. Suspends the RDF link. 2. Write enables the R2. Replication traffic on the RDF link stays stopped until you establish the pairs again.

In a nutshell:

Split - The R1 and the R2 are currently ready to their hosts, but the links are not ready or write disabled.

Suspended - The SRDF links have been suspended and are not ready or write disabled. If the R1 is ready while the links are suspended, any I/O will accumulate as invalid tracks owed to the R2.

Finally,

It all depends on how you are handling your R1 site, not only the one you are relocating

5 Practitioner

 • 

274.2K Posts

November 5th, 2013 01:00

Hi Julianne,

1. Yes the SYMRDF SUSPEND|RESUME can be used to change the state of all devices in an SRDF group.

2. I believe so, when the remote array is powered down I suspect you will see PARTITIONED as opposed to SUSPENDED.

After powering back up the RDF state will be dependent on whether the PREVENT RA's ONLINE UPON POWER ON is set to YES|NO.

If it is set to YES then after powering up the RA's would need to be forced online. With this setting I believe it will only change from PARTITIONED to SUSPENDED after the RA's are brought online.

If it is set to NO then after powering up the RA's will come online automatically, and thus a query will report as SUSPENDED.

I should also mention PREVENT AUTOMATIC RDF LINK RECOVERY, which dictates what happens at a device level with respect to RDF. With this enabled, when all the links go down and are then restored, the RDF devices remain suspended on the RDF link and require manual intervention to resume them.

With this disabled, if all links go down and then recover, the devices automatically recover on the link.

So the settings of the two parameters above will dictate which RDF states you see at different points in the process.
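The two settings described above can be read as a small decision table. The following is a minimal sketch in Python, assuming the behaviour is exactly as stated in this reply (which is itself hedged with "I believe"). The function name, argument names, and returned strings are made up for illustration; they are not Solutions Enabler output, and the sketch only covers pairs that were suspended before the power-down.

```python
def expected_after_power_up(prevent_ras_online, ras_forced_online,
                            prevent_auto_link_recovery):
    """Return (reported_state, follow_up) as described in the reply above.

    All three arguments are booleans and hypothetical names:
      prevent_ras_online         -> PREVENT RA's ONLINE UPON POWER ON is YES
      ras_forced_online          -> someone has already forced the RAs online
      prevent_auto_link_recovery -> PREVENT AUTOMATIC RDF LINK RECOVERY is enabled
    """
    # RAs are online if they come up automatically or have been forced online.
    ras_online = (not prevent_ras_online) or ras_forced_online
    if not ras_online:
        return ("PARTITIONED", "RAs must be forced online")
    if prevent_auto_link_recovery:
        return ("SUSPENDED", "resume the devices manually")
    return ("SUSPENDED", "devices recover on the link automatically")


if __name__ == "__main__":
    # Walk through every combination of the three inputs.
    for prevent in (True, False):
        for forced in (True, False):
            for no_auto in (True, False):
                print(prevent, forced, no_auto,
                      expected_after_power_up(prevent, forced, no_auto))
```

The actual settings on a given array should be confirmed with EMC support before relying on any of these outcomes.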

regards

Evan

92 Posts

November 5th, 2013 08:00

Hi Evan,

Thank you for the response. 

Do you know what the command is for the symrdf suspend for an entire RDFgroup?  I know what it is for an individual symdg:

    symrdf -g symdg suspend

I do not see a command for the entire srdf group ....

November 5th, 2013 11:00

Hi Julianne,

It is good to hear from you and to share your questions within the entire group.

At the company I work for we have Symmetrix VMAX SRDF in an R1-R2 configuration, with over 2000 devices at each site. I think I can give you a couple of tips, since I recently took down the entire data center to test parallel clone, so SRDF was an important issue to take care of.

What you are describing can be achieved in several ways; it depends on what you are used to working with.

So giving you a straight answer depends on many things.

1) Are you using mainframe or open systems? If you are using mainframe, you probably have EMCRDF and EMCSCF installed on your system, both as started tasks (STC), and you can achieve what you are asking using EMCRDF. But it looks like you are using Solutions Enabler - SYMCLI (Symmetrix Command Line Interface).

In that case, as you know, you need to use the symrdf command to perform RDF-related operations.

symrdf can be used in three variants:

A) Are you using CGs (Composite Groups)?

B) Are you using DGs (Device Groups)?

C) Are you using the command file option?

Let's focus on option C.

First of all, I suggest getting "a picture" of the current situation.

To do that, create a .cmd or .txt file containing the list of devices whose SRDF link you plan to cut.

For instance:

SYMRDF_DEVICES.TXT should look like:

     0120 0120
     0121 0121
     0123 0123

The first column is the device at the primary site (R1) and the second column is its partner at the remote site (R2).
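With a couple of hundred pairs, it can help to sanity-check the file before handing it to symrdf. The following is a minimal sketch in Python that only reads the two-column text described above; it does not talk to the array. The function names are hypothetical, and the assumption that a device number is 3 to 5 hex digits is mine, so adjust the pattern to match your own device numbering.

```python
import re
from collections import Counter

# Assumption: device numbers are hex strings such as 0120 (3 to 5 digits).
DEVICE_ID = re.compile(r"^[0-9A-Fa-f]{3,5}$")


def parse_pair_file(text):
    """Return a list of (r1, r2) tuples from two-column device file text."""
    pairs = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError(f"line {lineno}: expected 2 columns, got {len(fields)}")
        for field in fields:
            if not DEVICE_ID.match(field):
                raise ValueError(f"line {lineno}: {field!r} is not a device number")
        pairs.append((fields[0].upper(), fields[1].upper()))
    return pairs


def duplicate_r1s(pairs):
    """Return the sorted R1 device numbers that appear more than once."""
    counts = Counter(r1 for r1, _ in pairs)
    return sorted(dev for dev, n in counts.items() if n > 1)


if __name__ == "__main__":
    sample = "0120 0120\n0121 0121\n0123 0123\n"
    pairs = parse_pair_file(sample)
    print(len(pairs), "pairs; duplicate R1s:", duplicate_r1s(pairs))
```

The pair count it reports can then be compared against the number of R1-R2 pairs you expect.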

So,

Issue this: symrdf -sid xxxx -file SYMRDF_DEVICES.TXT query -rdfg XX > LIST_A.txt

This command gives you a list of all your devices before you make any move. It is good practice, since in case of failure you have an idea of what the previous situation looked like. Does that make sense?

Secondly

If you are powering off the entire VMAX, suspend is not the best option. Instead you could SPLIT the SRDF link traffic, because the VMAX is going to be unplugged and reconnected in its new location.

So you could type:

symrdf -sid xxx -file SYMRDF_DEVICES.TXT split -rdfg XX

This command will cut the link, and the devices will turn into:

     Site R1  RW
     LNK      NR
     Site R2  RW

Finally, once the VMAX is up again, you will see invalid tracks on one side against the other, and you will have to establish to synchronize both "boxes":

symrdf -sid xxx -file SYMRDF_DEVICES.TXT establish -rdfg XX

This will perform an incremental synchronization (only the differences that could exist in one box against the other).
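The query, split, and establish steps above can be written down ahead of time as a dry-run checklist. The following is a minimal sketch in Python that only builds and prints the command lines; it never executes them. It uses just the symrdf forms shown in this post, and the function name and the sample SID, file name, and RDF group values are placeholders chosen for illustration.

```python
def build_runbook(sid, device_file, rdf_group):
    """Return (description, command) tuples in the order described above."""
    base = f"symrdf -sid {sid} -file {device_file}"
    return [
        ("Before the move: record the current state",
         f"{base} query -rdfg {rdf_group}"),
        ("Before power-down: split the pairs",
         f"{base} split -rdfg {rdf_group}"),
        ("After power-up: incremental resynchronization",
         f"{base} establish -rdfg {rdf_group}"),
    ]


if __name__ == "__main__":
    # Placeholder values, not real array identifiers.
    for description, command in build_runbook("xxxx", "SYMRDF_DEVICES.TXT", "XX"):
        print(f"# {description}")
        print(command)
```

Printing the commands first makes it easy to review them with the CEs before move day. Whether -force or -symforce is needed is something only the live array will tell you.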

Hope it helps

If you have any other questions, do not hesitate to ask.

Regards,

Pablo Sanzoni

92 Posts

November 5th, 2013 13:00

Hello Pablo,

Thank you for your response and sharing your experience. 

To answer your questions, I am using Open Systems (mostly AIX).  Also, I manage my srdf replication on each host using device groups (symdgs).

It sounds like I need to use the command file option to split the entire SRDFgroup at one time without needing to split each individual symdg (I have about 30 of them).  Do I understand that correctly?

This is a good tool for me to use during the move - especially during the restart - so I can monitor the invalid tracks with one command (as opposed to running the commands for each individual symdg).

Additionally, what is the benefit of doing a SPLIT instead of RESUME?  Just curious.

Thanks again for all of your insight.

Regards,

Julianne

92 Posts

November 6th, 2013 09:00

Hello Pablo,

Thanks again for all of the experience and information you are sharing with me (and others).


I have created a file to use the -file option.  That is a good check against how many R1-R2 pairs I am dealing with.  Although it was a bit cumbersome, since the symrdf list command includes meta members and the symrdf query summary rdfg command won't work with meta members.  I have modified the list and feel confident all of R1-R2 pairs are included, since I came up with the same count in three different methods. 


Thanks for including that command - since it will be the one I use and it also inspired me to do a full count including meta members. 
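One way to cross-check two such device files (one with meta members, one without) is to compare them as sets. The following is a minimal sketch in Python, assuming both files use the two-column R1/R2 layout from earlier in the thread; the function names and the sample device numbers are hypothetical.

```python
def read_pairs(text):
    """Parse two-column R1/R2 device file text into a set of (r1, r2) tuples."""
    pairs = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:  # ignore blank or malformed lines
            pairs.add((fields[0].upper(), fields[1].upper()))
    return pairs


def compare_device_files(with_members_text, heads_only_text):
    """Summarize how the full file differs from the file without meta members."""
    full = read_pairs(with_members_text)
    heads = read_pairs(heads_only_text)
    return {
        "full_count": len(full),
        "heads_count": len(heads),
        # Pairs present only in the full file: expected to be the meta members.
        "members_only": sorted(full - heads),
        # Pairs in the reduced file but not the full one: should be empty.
        "missing_from_full": sorted(heads - full),
    }


if __name__ == "__main__":
    full_text = "0120 0120\n0121 0121\n0122 0122\n"   # made-up sample data
    heads_text = "0120 0120\n0122 0122\n"
    print(compare_device_files(full_text, heads_text))
```

If "missing_from_full" is not empty, one of the two files was probably edited by hand and is worth a second look.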


As for unplugging the VMAX, I have two CEs that will be onsite doing the power-down, and dismantling (uncabling, etc) of the VMAX.  It will then be packed up, loaded into a truck and driven 30 miles to the new Disaster Recovery site.  I'll have to ask them about the "rest mode" and how long that will be.


1.) Symdgs vs device files. 

Thanks again, I still have the individual symdgs for each database, and now I've got two comprehensive device files (one with metamembers and one without them).  I feel very prepared.


2.) I meant to ask you what is the benefit of doing a SPLIT vs a SUSPEND.  Am I correct in understanding that the SUSPEND is only one part of a SPLIT?


Also, the primary site will remain up and running.  I'm curious what the state of the R1s on the source array will be when the target with the R2s is powered off....


Thanks so much for sharing all of your expertise!


-julianne


5 Practitioner

 • 

274.2K Posts

November 6th, 2013 10:00

Hi Julianne,

I believe it will report as PARTITIONED while the remote array is powered down.

Regards

Evan

92 Posts

November 6th, 2013 11:00

Hi Evan,

Will the source array report as PARTITIONED even if I split all R1-R2 pairs first?

Thanks,

Julianne

November 7th, 2013 00:00

Hi Julianne,

I am back again. I am writing from Argentina and today was a holiday, so sorry for my late response.

1 - Going back to your comments regarding meta heads and meta members, you can use this command to get a clean list of the metas in your Symmetrix for control and checking: symdev -sid xxxx list -meta. If you want to save the result, redirect it: symdev -sid xxxx list -meta > METAS.TXT

2 - Keep in mind that using the -file option is good practice, and you can create as many "file groups" as you want and feel comfortable with. For instance:

You could have one file containing VDEVs, another including R1+MIR, etc. The possibilities are countless.

3 - On the other hand, something very important that you need to check before and after the VMAX relocation is the following:

When this type of task is done on a Symmetrix, the CE taking care of it frequently dials into the box remotely a couple of days before "D-day" and sets a lock on the Symmetrix. Are you familiar with locks?

What I mean is:

Be sure that after the relocation is done there are no locks set on the Symmetrix, and if there are, ask the CE to remove them. If a lock is accidentally left in place, you won't be able to make any change to the Symmetrix until it is released. To check that, issue these two commands:

3.1 - The first one covers entire Symmetrix arrays (all the boxes you have):

       symcfg list -lock. If you see a lock on any VMAX, add the -v (verbose) flag for further details, since there are plenty of possible locks.

       The second one covers device locks:

       symdev -lock list. You should see something like "No devices are locked".

4 - About the "rest mode": it depends, but I think it has more to do with the way a given CE works than with a specific technical note backing it up. When I had to deal with this, our VMAX had to wait 12 hours before it could be reconnected, but when I looked for information supporting that behavior I found nothing. It also depends on your deadline for making your VMAX operational again. I suggest you ask the CE and please share the answer with me; it is a point I would like more information on, since there is not much in Powerlink, support.emc.com, etc.

5 - Split vs Suspend

     If you split the SRDF link, the R1 (primary site) will be RW enabled, so the R1 is ready to the host. On the other hand, if the R1 is ready while the links are suspended, any I/O will accumulate as invalid tracks owed to the R2.

   

6 - Julianne, it is my belief that when a VMAX is totally unplugged, the site that is still alive (R1) won't see the other box, so any attempt to send a command to the other box will report nothing. Your communications guys will almost certainly unplug FICON as well, so the most likely situation is that all new I/O received at the R1 site will accumulate as tracks owed to the R2, and once both Symmetrix arrays see each other again, a synchronization will be needed.

So the R1 will probably show "PARTITIONED" and you won't see any state for the R2.

You're welcome, Julianne. It is good to share this with the entire group and to get to know people around the globe!

See you buddy

5 Practitioner

 • 

274.2K Posts

November 7th, 2013 01:00

Hi Julianne

Yes, Partitioned is the expected state. If SYMAPI is unable to communicate with a remote Symmetrix from an RA group, devices in that RA group will be marked as being in the Partitioned state.

Hope this helps

Regards

Evan

92 Posts

November 7th, 2013 09:00

Hi Evan,

Thanks for the response.

One more thing, when the target array is powered back up and I am ready to re-synch the R1-R2 pairs, will the PARTITIONED state impact my ability to perform an establish?

According to the SRDF documentation I have (which I could be misinterpreting), it is not allowed to go from PARTITIONED to establish.

What would be the SRDF operation to allow me to move from PARTITIONED to a state that will allow an establish?  Will I need to change to SUSPEND first before the establish?

I'm thinking my operations will be this:

    symrdf -sid xxxx -file filename SPLIT       R1s WR, R2s WR, pair state = SPLIT
    Power down the target VMAX                  R1s WR, R2s ?,  pair state = PARTITIONED ?
    Power up the target VMAX                    R1s WR, R2s ?,  pair state = PARTITIONED ?
    symrdf -sid xxxx -file filename SUSPEND     R1s WR, R2s WR, pair state = SUSPENDED
    symrdf -sid xxxx -file filename establish   R1s WR, R2s WR, pair state = SYNCHRONIZED

Does that look accurate?

Thanks!

Julianne

92 Posts

November 7th, 2013 11:00

Greetings Pablo!

I hope you had a nice holiday. 

I'm just trying to clarify, I still don't understand why I should use SPLIT as opposed to SUSPEND before the power down.

Thanks for all of your excellent advice, though.  I've always just used symdgs (device groups) for all my SRDF and TimeFinder controls.  I like having everything in one big device file.  It will simplify my process on the move day.

Much gratitude from Oregon!

-julianne

5 Practitioner

 • 

274.2K Posts

November 18th, 2013 08:00


Hi Julianne,

Sorry for the delay. When the array is powered back up the RA's will be either offline or online. This depends on the setting I mentioned in my original post, "PREVENT RA's ONLINE UPON POWER ON". Assuming they are offline, the group will be reported as partitioned for as long as they stay offline. As soon as they are brought back online (presumably by EMC field folks) the state will change accordingly (suspended, split, etc.) and you will be able to resume remote replication from there.

Regards

Evan
