April 9th, 2010 02:00

How to recover from the Transidle state?

Hi,

We have SRDF/A configured with two dynamic devices at each site. Initially all devices are in the "Consistent" state. But when the FC link between the two arrays goes down, the device state shows as "Transidle".

My questions are:

1. Why doesn't it go directly to the "Partitioned" state instead of "Transidle"?

2. How do we recover from the "Transidle" state? Or, what are the steps (procedure) to change the state from "Transidle" to "Partitioned"?

3. Where can we get good docs on the state transitions, especially for SRDF/A?

Thanks in advance.

--Shubh

62 Posts

April 9th, 2010 04:00

https://powerlink.emc.com/nsepn/webapps/btg548664833igtcuup4826/km/live1//en_US/Offering_Technical/Technical_Documentation/300-002-940_a04_elccnt_0.pdf

My original post assumes you want to use Transmit Idle as an option. The linked document describes how to remove it or to check that it is operating properly.

Trust this helps

62 Posts

April 9th, 2010 04:00

Hi

The point of using SRDF/A Transmit Idle is to provide the maximum level of resilience to unplanned replication link outages. Transmit Idle allows SRDF/A to continue processing during periods of SRDF link disruption. Switching away from Transmit Idle would therefore defeat the purpose of maintaining replication activity to DSE when the link is unexpectedly down. When the link recovers, replication across the link resumes and transfers the data retained on DSE.

Others with more direct experience in this will, I'm sure, provide more details.

2.1K Posts

April 9th, 2010 10:00

As was already mentioned, Transmit Idle is there to help you weather short periods of network connectivity issues. If your state is "Transidle" then it is doing its job. By remaining in this state, your R2 devices stay consistent both during the link failure and after it is resolved. If you dropped to Partitioned, you would not have consistent R2 devs on recovery until you resynched.

Technically you could issue a split (or maybe it would have to be a half split) if you really wanted to, but that would defeat the purpose. If you want your pairs to drop to Partitioned during a link outage, you need to disable Transmit Idle for that RDF group. Without TI set, your state will be Partitioned as soon as the Link Limbo expires (usually 10 seconds) during a link failure.

Again I say... Transmit Idle (and DSE as an extension of that) is a good thing for you, so I'd be really curious to hear the circumstances that would justify turning it off. I'm sure they probably exist, but I can't think of them off the top of my head.

44 Posts

April 11th, 2010 22:00

Thanks all.. !!

73 Posts

April 21st, 2010 07:00

Hello Allen,

What is the Transmit Idle time?

2.1K Posts

April 21st, 2010 08:00

Hello and Welcome to the Forums!

I'm not sure if I understand your question clearly yet. Are you asking how long Transmit Idle can maintain your groups in async mode during a link outage, or are you referring to something reported while you are in Transmit Idle mode?

For the former, the answer is the same as everything else you could possibly ask... Altogether now everyone: "It Depends" :-)

Honestly though, there are several factors that can have a significant effect on how long you can run in TI mode. The more cache you have in the array the longer you can survive. The lower the Rate of Change on the mirrored devices, the longer you can survive. If you have DSE configured and enabled it will increase your time, and the more capacity allocated to DSE the more you will extend it as well. And finally, the level of activity consuming cache on the rest of the array can have an effect on how much cache is available to TI. There is a cap on the amount of cache to allocate to TI, but that doesn't prevent other processes from using up some of that cache first.
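To make those factors a bit more concrete, here is a very rough back-of-envelope sketch in Python. It is not an EMC formula: it simply assumes changed data accumulates at a steady rate into a fixed buffer made up of cache plus DSE capacity, and ignores everything else the array is doing. The function name, parameters, and figures are all hypothetical.

```python
def estimate_ti_survival_hours(cache_gb, dse_gb, change_rate_mb_per_s):
    """Crude estimate of how long Transmit Idle could buffer writes.

    ASSUMPTION: survival time ~= (cache available to TI + DSE capacity)
    divided by a constant rate of change. Real arrays cap the cache
    share for TI and compete with other workloads, so treat this as an
    optimistic illustration only.
    """
    if change_rate_mb_per_s <= 0:
        raise ValueError("change rate must be positive")
    buffer_mb = (cache_gb + dse_gb) * 1024  # GB -> MB
    seconds = buffer_mb / change_rate_mb_per_s
    return seconds / 3600


# Hypothetical figures: 36 GB of cache, no DSE, 1 MB/s rate of change.
hours = estimate_ti_survival_hours(36, 0, 1)
print(f"Roughly {hours:.1f} hours before the buffer fills")
```

The only point of the sketch is the direction of each effect: more cache or DSE capacity extends the time, a higher rate of change shortens it.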

Under the right circumstances we have seen link outages of several hours where we never dropped out of async mode and recovered seamlessly once the link was restored. This was without DSE too. Another factor to consider is that when the cache limit is exceeded you won't immediately drop all your groups (if you have multiples). You will first lose the busiest (based on cache utilization I believe) and those resources will get shared out to keep the rest of the groups running.

If your question was in reference to actually being in TI mode, the TI time reported would be how long you have been in TI mode (not how much time is left). I'm going to go dig and see if I can find where this is reported (assuming I remember correctly that it is available at all).

I hope this helps.

2.1K Posts

April 21st, 2010 09:00

OK, I've confirmed that there is a listing for the current "Transmit Idle Time" in the output from a symrdf query command. It should be on the last line before the detailed listing of the pairs and their states. This will report the current time that TI has been active (which is going to be about 10 seconds less than the actual outage since Transmit idle doesn't kick in until the Link Limbo time expires... this is normally 10 seconds).
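If you want to pull that value out of saved query output with a script, something like the following Python sketch could work. The exact layout of the line is an assumption here (a "Transmit Idle Time" label followed by HH:MM:SS); check it against the real output on your system. The sample text, function names, and the 10-second Link Limbo default are illustrative only.

```python
import re
from datetime import timedelta

# ASSUMPTION: the query output contains a line such as
#   "Transmit Idle Time    : 02:02:02"
# The real layout may differ, so adjust the pattern to match.
_TI_PATTERN = re.compile(r"Transmit Idle Time\D*?(\d+):(\d{2}):(\d{2})", re.IGNORECASE)


def parse_transmit_idle_time(query_output):
    """Return the Transmit Idle Time as a timedelta, or None if not found."""
    match = _TI_PATTERN.search(query_output)
    if match is None:
        return None
    hours, minutes, seconds = (int(part) for part in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def estimate_outage(ti_time, link_limbo_seconds=10):
    """Approximate the outage length as TI time plus Link Limbo.

    TI only starts counting once Link Limbo has expired, so the outage
    is a little longer than the counter. A zero counter means TI is not
    active, so no outage is inferred.
    """
    if ti_time == timedelta(0):
        return timedelta(0)
    return ti_time + timedelta(seconds=link_limbo_seconds)


# Hypothetical sample line, not captured from a real array.
sample = "Transmit Idle Time                      : 02:02:02\n"
ti = parse_transmit_idle_time(sample)
print(ti, estimate_outage(ti))
```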

73 Posts

April 21st, 2010 12:00

It has recovered to async again, but I am confused about what 00:00:00 means and how it helps with Transmit Idle. I checked the documentation I have and found nothing mentioned about this time.

Thanks

73 Posts

April 21st, 2010 12:00

It's clear now, thanks.

Is there any way to get the old Transmit Idle statistics?

73 Posts

April 21st, 2010 12:00

Hi,

Yes, I am asking about this value. It's 00:00:00, but what does this mean?

The last time the link dropped, the outage lasted 6 hours and a few groups were still in Transidle mode. So what do the zeros mean?

2.1K Posts

April 21st, 2010 12:00

I'm not sure exactly what you mean by that. What exactly are you looking for?

2.1K Posts

April 21st, 2010 12:00

If the group was in TI state this should have shown the time it was in that state (at least that is what I'm seeing in the documentation). I don't have a test system I can try this on to see. If you have confirmed that the group you are querying is definitely still "live" in TI (see the pairs state below that) then you should probably open a case with Support to see if they can help.

2.1K Posts

April 21st, 2010 12:00

That explains why it was showing 00:00:00... under normal operations that is what should be there. This should only change if Transmit Idle is active. Then this will increment to show the time that TI is active. Once the link is restored and you are back to Async mode this will immediately revert to 00:00:00

73 Posts

April 21st, 2010 13:00

Assume the link dropped and the Transmit Idle time was 02:02:02. My link is up now and the group is in Async mode, so the time will be reset to 00:00:00. Can I get this information back, i.e. that it took 02:02:02?

2.1K Posts

April 21st, 2010 13:00

I don't think you are going to be able to get that directly from SymCLI. I suspect you would have to use something like SPA to look back at the history, assuming the outage was long enough to be recorded. Off the top of my head I'm not sure of another way you could get this info from the array. You would likely have to get it from some other piece of infrastructure involved in the link (e.g. FC switch, link extender, etc.)
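One workaround, if you want this history in future, is to record the value yourself while the outage is happening, since the counter resets once the link is back. The Python sketch below assumes you already collect periodic (timestamp, Transmit Idle time) samples, for example from a scheduled symrdf query, and shows how past TI episodes could be reconstructed from them. The sample data and function name are hypothetical, and the peak is only as accurate as your polling interval.

```python
from datetime import datetime, timedelta


def summarize_ti_episodes(samples):
    """Group periodic samples into Transmit Idle episodes.

    samples: iterable of (datetime, timedelta) pairs in time order,
    where the timedelta is the Transmit Idle time seen at that moment.
    Returns a list of (first_seen, peak_ti_time) tuples. The peak is the
    last and largest value observed before the counter reset to zero.
    """
    episodes = []
    first_seen = None
    peak = timedelta(0)
    for when, ti_time in samples:
        if ti_time > timedelta(0):
            if first_seen is None:
                first_seen = when  # TI became active since the last sample
            peak = max(peak, ti_time)
        elif first_seen is not None:
            # Counter is back at 00:00:00, so the episode has ended.
            episodes.append((first_seen, peak))
            first_seen, peak = None, timedelta(0)
    if first_seen is not None:
        episodes.append((first_seen, peak))  # still in TI at the last sample
    return episodes


# Hypothetical samples, not taken from a real array.
t0 = datetime(2010, 4, 21, 8, 0, 0)
history = [
    (t0, timedelta(0)),
    (t0 + timedelta(minutes=5), timedelta(minutes=1)),
    (t0 + timedelta(hours=2, minutes=6), timedelta(hours=2, minutes=2, seconds=2)),
    (t0 + timedelta(hours=2, minutes=11), timedelta(0)),
]
for started, longest in summarize_ti_episodes(history):
    print(f"TI first seen at {started}, peak counter {longest}")
```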
