February 20th, 2008 10:00

rdf split fails

first event

Feb 15 11:09:55 hostname: /opt/emc/SYMCLI/V6.0.1/bin/symmir -g rdfg05 -rdf split -consistent -i 60 -c 60 -noprompt

Remote 'Split' operation execution is in progress for
device group 'rdfg05'. Please wait...


The Symmetrix configuration could not be loaded for a remotely-attached Symmetrix

from symapi log

02/15/2008 11:10:28.348 8629 1 EMC:SYMCLI issue_syscall_ranges Syscall 109. sts: SYSCALL_NON_ZERO_RC, order 17: SYSCR_REMOTE_FAILED
02/15/2008 11:10:40.056 8629 1 EMC:SYMCLI load_checksum PSCsyscodes_2. sts: SYSCALL_NON_ZERO_RC, order 17: SYSCR_REMOTE_FAILED
02/15/2008 11:10:41.074 8629 1 EMC:SYMCLI Failed to load information. remote: 1, remote_hop_num: 2, retries: 3, sts: SYMAPI_C_FAILED_REMOTE_LOAD - error msg: The Symmetrix configuration could not be loaded for a remotely-attached Symmetrix
02/15/2008 11:10:41.529 8629 1 EMC:SYMCLI SymDgBcvControl() BCV Library error failure with sts: SYMAPI_C_FAILED_REMOTE_LOAD
02/15/2008 11:10:41.529 8629 The BCV 'SPLIT' operation FAILED.
02/15/2008 11:10:41.530 8629 The BCV control operation FAILED.
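
When comparing the two failures it can help to pull just the status codes out of the symapi log. The following is a minimal Python sketch, not an EMC tool: the function names are made up for illustration, and it assumes each record starts with a "MM/DD/YYYY HH:MM:SS.mmm pid" prefix and that any line without that prefix is a wrapped continuation of the previous record, as in the excerpt above.

```python
import re

# Assumption: a symapi log record starts with "MM/DD/YYYY HH:MM:SS.mmm <pid>".
RECORD_START = re.compile(r"^\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}\.\d{3} \d+")
# Status tokens appear in the excerpt as "sts: SOME_CODE".
STATUS = re.compile(r"sts: ([A-Z][A-Z0-9_]*)")


def join_wrapped(lines):
    """Merge continuation lines (no leading timestamp) into the previous record."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += line
    return records


def status_codes(lines):
    """Return (timestamp, status) pairs for every record that carries 'sts:'."""
    out = []
    for rec in join_wrapped(lines):
        timestamp = rec[:23]  # "MM/DD/YYYY HH:MM:SS.mmm" is 23 characters
        for code in STATUS.findall(rec):
            out.append((timestamp, code))
    return out


sample = [
    "02/15/2008 11:10:41.529 8629 1 EMC:SYMCLI SymDgBcvControl() BCV Library error failure with sts: SYMA",
    "PI_C_FAILED_REMOTE_LOAD",
    "02/15/2008 11:10:41.529 8629 The BCV 'SPLIT' operation FAILED.",
]
print(status_codes(sample))
```

Run against both excerpts, this would show SYMAPI_C_FAILED_REMOTE_LOAD for the Feb 15 attempt and SYMAPI_C_RDFA_SESSION_SUSP_FAILED for the Feb 20 attempt, which makes it easier to see that the two splits failed for different reasons.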

Second phase

Feb 20 03:47:42 hostname: /opt/emc/SYMCLI/V6.0.1/bin/symmir -g rdfg05 -rdf split -consistent -i 60 -c 60 -noprompt

Remote 'Split' operation execution is in progress for
device group 'rdfg05'. Please wait...


The SRDF/A session could not be successfully suspended

Feb 20 03:49:13 hostname: mail_msg Remote GBCV-split status /opt/emc/lib/maillist /opt/emc/logs/rdfg05_gbcv_status.log
Feb 20 03:49:14 hostname: SRDFA15:MAJOR: Foreground Split failed for rdfg05.
/opt/OV/bin/OpC/opcmsg s=MAJOR a=SRDFA o=rdfg05 n=aded140p msg_g=UNIXTEST msg_t=Feb 20 03:49:14 aded140p: SRDFA15:MAJOR: Foreground Split failed for rdfg05.

from symapi log

02/20/2008 03:49:13.756 21955 1 EMC:SYMCLI SymDgBcvControl() BCV Library error failure with sts: SYMAPI_C_RDFA_SESSION_SUSP_FAILED
02/20/2008 03:49:13.756 21955 The BCV 'SPLIT' operation FAILED.
02/20/2008 03:49:13.757 21955 The BCV control operation FAILED.
02/20/2008 04:01:01.437 21744 1 EMC:SYMCLI issue_syscall_ranges Syscall 109. sts: SYSCALL_NON_ZERO_RC, order 17: SYSCR_REMOTE_FAILED

from symevent log for both times

Wed Feb 20 03:47:53 2008 RE-8C Symm RDF Informational 0x0011
All RDF links in an RDF group are now operational after an 'All Links Down' event

Fri Feb 15 10:57:47 2008 RE-9C Symm RDF Informational 0x0011
All RDF links in an RDF group are now operational after an 'All Links Down' event

The second event/phase tried to suspend the SRDF/A session, and that suspend failed. I suspect the suspend was triggered by the link-failure detection. In the symevent log of the target (R2 & GBCV) I could also see "Not Ready device" events from FA7 and FA10, where the GBCV devices are allocated.

Now the links are no longer complaining, and the "Not Ready device" events have stopped as well.

The split here is the R2+GBCV split, where both devices are on the remote DMX (2000S-M2 5671).

Let me know how severe these errors are.

2.8K Posts

February 28th, 2008 03:00

Santhosh let me try to answer your very last post :D

1) Limbo time

When all the physical RDF links go down, RDF waits a few seconds before dropping the link. This wait is called "Limbo time". It's a timeout that you configure on your RDF groups.

2) CACA.20

The DMX logs almost everything, and every logged entry has an identifier and a timestamp. CACA is the identifier for SRDF/A errors. CACA.20 is an error that translates to the following description: "RDF mirror made Not Ready on the RDF link and Tolerance mode is off." (Unfortunately I can't share the solution with you since it's restricted.) So what happened? Your RDF links went down and the DMX waited until limbo time expired. Since you had disabled the so-called "tolerance" (your SRDF/A session was running with consistency enabled), ALL the devices in the RDFG dropped to "Partitioned".

3) RTS

Usually it's the acronym for Regional Technical Specialist: a highly skilled guy who usually gets involved when things are really bad and your CE isn't enough :D .. But I may be wrong.

1.3K Posts

February 28th, 2008 03:00

Let me put it this way..

What are limbo time, RTS and the error "CACA.20"?

1.3K Posts

February 28th, 2008 08:00

The devices went into the "Suspended" state.

Is disabling (turning off) "tolerance mode" a bad idea on an SRDF/A session with consistency enabled?

How about increasing the limbo time? What are the minimum and maximum recommended values, and what is the impact of changing it?

2.8K Posts

February 28th, 2008 09:00

The devices dropped from Consistent to Partitioned (due to the unavailable link) and later came back as "Suspended" when the link came back again.

The "tolerance" mode is the other side of the same coin .. When you issue "symcfg -dg async_dg enable" you are "enabling consistency" (if you look at the Solutions Enabler side of the coin) but at the same time you are "disabling tolerance" (if you look inside the box) .. I understand that can be fuzzy .. but that's the way it works ;-)

Increasing link limbo with SRDF/A isn't a good idea .. It's better to enable the so-called "Transmit Idle" feature .. When the link limbo expires and you are not using Transmit Idle, your volumes will drop NR on the link (either Suspended or Partitioned, depending on the status of the link itself when you run the "symrdf query" command). If you enable Transmit Idle, the box will still wait for the link limbo to expire and will then invoke the Idle feature. The trick is that the R1 volumes keep recording updates to tracks that are still waiting to be sent to the other side (since the link is unavailable). When the link becomes available again, the R1 devices push all waiting tracks over the link to the R2 devices as fast as possible. The R1 devices will eat up as much cache as possible before dropping the SRDF/A session.

With DMX3 and 5772 code you also have the "DSE" (Delta Set Extension) feature, which uses a "reserved" pool of devices in the backend to give you even more "cache" in which to store tracks waiting to be sent over the RDF link.

I think that your local Solution Architect can help you understand and implement Transmit Idle and (if possible) DSE. Please tell them to contact me if they need further assistance.
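
To make the difference concrete, here is a toy Python model of the behaviour described above. It is only a sketch under stated assumptions: the function name, the state labels, and all the numbers are hypothetical, the write rate is treated as constant, and the real cache limits inside the box are not modelled.

```python
def srdfa_outcome(outage_s, limbo_s, transmit_idle, cache_tracks, write_rate):
    """Toy model of an SRDF/A session during an RDF link outage.

    outage_s      -- how long all links stay down, in seconds
    limbo_s       -- link limbo timeout configured on the RDF group
    transmit_idle -- True if Transmit Idle is enabled
    cache_tracks  -- hypothetical number of tracks the R1 side can hold
    write_rate    -- hypothetical host write rate, in tracks per second
    """
    # The link returns before limbo expires: nothing is dropped.
    if outage_s <= limbo_s:
        return "active"
    # Limbo expired and Transmit Idle is off: devices go NR on the link.
    if not transmit_idle:
        return "not_ready_on_link"
    # Transmit Idle is on: R1 keeps collecting tracks while the link is down.
    backlog = outage_s * write_rate
    if backlog > cache_tracks:
        # Cache is exhausted, so the session is dropped anyway.
        return "session_dropped"
    # The backlog fits in cache and is pushed to R2 once the link returns.
    return "catch_up"


for outage in (5, 60, 600):
    print(outage, srdfa_outcome(outage, 10, True, 50_000, 100))
```

The point of the model is only that Transmit Idle trades cache for time: it rides out outages longer than link limbo, but a long enough outage at a high enough write rate still ends with the session being dropped.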

2.8K Posts

February 28th, 2008 10:00

Unfortunately DSE is NOT available at 5771 :( .. I also have to stay at 5771 at a customer site :( .. I asked whether DSE will ever be available as a FIX for 5771, but nobody confirmed. I'm sorry.

Transmit Idle can be enabled online .. and since I already did it on my RDF groups, I can say that it works ;-) But before enabling it, talk to your local friendly CE / SA / RTS / IS / whatever EMC employee you have handy :D and ask for details about Transmit Idle.

1.3K Posts

February 28th, 2008 10:00

I am at 5771, so DSE won't be applicable to me? Can "Transmit Idle" be enabled online?