16 Posts
MirrorView/A over iSCSI Error
I am experiencing an error with MirrorView/A over iSCSI: when there is a link failure, replication doesn't resume automatically once the link comes back up. The auto-recovery policy is set to Automatic, but it doesn't seem to work. When the CLARiiON CX4-120 was running FLARE 29 it worked fine, but after upgrading to FLARE 04.30.000.5.517 we noticed this problem. EMC support has looked into the issue but has yet to come up with any tangible solution. The only answer I got was that there could be a bug in this FLARE code and that it will be fixed in the next patch release in the first quarter of 2012, but there is no official documentation or technical advisory from EMC on this issue.
What I have noticed is that the iSCSI login is not automatic when the link resumes, and according to EMC Support the only solution is to disable and re-enable the iSCSI connection from Manage Connections so that replication continues, which is a manual fix, not an automatic one.
Has anyone experienced this problem?
DanJost
190 Posts
January 17th, 2012 13:00
Same problem here. Instead of disabling/enabling iSCSI, you can just run a connectivity test and it seems to come back to life. It hasn't been that big of an issue for me, as the link is solid. Perhaps there is a way to run the connectivity test via the CLI?
Dan
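If the connectivity test can be triggered from a command line, the manual step could in principle be wrapped in a small retry loop. The sketch below is only that: nothing in this thread confirms a CLI equivalent of the connectivity test, so the command is a hypothetical placeholder you would have to supply yourself, and the attempt count and delay are arbitrary assumptions.

```python
import subprocess
import time
from typing import Callable, Sequence


def retry_command(
    cmd: Sequence[str],
    attempts: int = 5,
    delay_s: float = 60.0,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run `cmd` until it exits with status 0 or `attempts` runs have failed.

    `cmd` is whatever command performs the connectivity test in your
    environment; this sketch does not assume any particular tool or flags.
    """
    for attempt in range(1, attempts + 1):
        result = run(list(cmd), capture_output=True, text=True)
        if result.returncode == 0:
            return True  # the command reported success, stop retrying
        if attempt < attempts:
            sleep(delay_s)  # wait before the next attempt
    return False


# Hypothetical usage; the path below is a placeholder, not a real tool:
# ok = retry_command(["/path/to/your-connectivity-test"], attempts=3, delay_s=120)
```

Treating exit status 0 as success is itself an assumption; if the tool you end up using reports failures only in its output text, the check would need to inspect that output instead.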
Jsang2
16 Posts
January 18th, 2012 00:00
The problem is that the client wants everything automated and doesn't want manual intervention.
kelleg
4.5K Posts
January 24th, 2012 13:00
Do you have an open case with EMC about this issue?
Have you looked into fixing the network issues - making the link more stable?
glen
Jsang2
16 Posts
January 24th, 2012 21:00
Yes, there is an open case with EMC, who have looked at the SPCollect logs without giving any solution so far. According to them this might be a bug, which will be fixed by a FLARE code (Zeus 10) upgrade to be released by the end of March 2012.
khughes8675
6 Posts
June 26th, 2012 15:00
We started running into the exact same problem with MirrorView after we upgraded to FLARE 30. The SAN initiating the connection is on FLARE 04.30.000.5.523. The target SAN is on FLARE 04.30.000.5.517. Before the upgrade, we never experienced problems with recovery. We would get the mail home, but mirrors never became fractured. It seems the FLARE upgrade made this function more sensitive to network connectivity problems. The network errors are so minimal that we can't even capture them with the tools we are using. The problem is getting worse. Now the SP on both sides is in a "logged out" state. We have to trespass all our mirrors to the other SP. To get the problem SP logged back in, we have to reboot the SP. Then, when the other SP starts running into the same problems, we have to trespass everything back and reboot that SP. NOT A VALID WORKAROUND! Support is trying to tell us it is our network, but I pointed out that this got worse after our FLARE upgrade. Now SPB is logged out on both SANs. SPA is starting to hiccup now, and only fractures some of our mirrors, not all. We have also been able to restore connectivity in the past by pulling the cable and reseating it. Also not a valid workaround.
The original post described using this method to re-establish connectivity:
"What I have noticed is that the iSCSI login is not automatic when the link resumes, and according to EMC Support the only solution is to disable and re-enable the iSCSI connection from Manage Connections so that replication continues, which is a manual fix, not an automatic one."
How can I do this? The only option I see is to de-register/re-register the port. Also, SPA is currently in this state:
A login session to the iSCSI target device is already active. (0x712b8008)
Jsang2
16 Posts
June 27th, 2012 01:00
The problem was sorted out after the FLARE code was upgraded and a patch named Zeus 10 was applied. Now the recovery policy works perfectly without any intervention.
khughes8675
6 Posts
June 27th, 2012 09:00
Thanks Jsang. I was not sure anybody would still be using this post. You described a workaround: disable/enable the iSCSI connection. Were you de-registering and then re-registering?
Jsang2
16 Posts
June 27th, 2012 10:00
No, just disable and enable the iSCSI connection under Manage Connections. Initially I used to reboot the remote SPs, but I was advised otherwise by EMC support.
Jsang2
16 Posts
June 27th, 2012 11:00
You will have to disable the iSCSI connection, disconnect the iSCSI port, wait about 5 minutes, then reconnect it and enable the iSCSI connection.
khughes8675
6 Posts
June 27th, 2012 11:00
Hi Danjost
This used to fix the problem, but it has become ineffective. The problem is gradually getting worse. Even rebooting the SPs no longer fixes it.
DanJost
190 Posts
June 27th, 2012 11:00
Like I said in a previous post, running the connectivity test on the connections brings the MV/A connection back up. While manual, it is far less intrusive than rebooting SPs, disabling/re-enabling iSCSI connections, or pulling cables. Glad to hear that there is a patch that is supposed to fix this.
khughes8675
6 Posts
June 27th, 2012 11:00
Thanks JSang. When I try this, I get a message at the end: "The specified user account already exists". Any ideas?
khughes8675
6 Posts
June 27th, 2012 12:00
Thanks JSang
Our mirrors are fractured at this point, and I cannot get anything to replicate between the SANs across the iSCSI connection for both ports.
When I look under System Management -> Storage Connectivity Status -> MirrorView Initiators, it shows SPB is not logged in. It is missing its attributes, unlike SPA, which has its attributes. The Deregister button is not ghosted out like it is on SPA. Do you know why it is missing its attributes? Is it because it is logged out? If I deregister, is it pretty straightforward to re-register?
Also, can anybody tell me if they have seen this before and what they did to resolve it?
Under iSCSI management -> connection between storage systems -> Test, I get the results below. Is there a way to log SPA out so it re-initiates a new login? Any ideas on how I can get SPB logged back in?
A login session to the iSCSI target device is already active. (0x712b8008)
Request timed out.
Test successes: 0
Test failures: 2
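For anyone who wants to check this kind of result from a script rather than by eye, here is a minimal sketch that parses the text quoted above. It assumes the output always has the same shape as the four lines in this post (the "Test successes" and "Test failures" counters, the 0x712b8008 code, and the "Request timed out." line); the function name and the returned keys are made up for illustration.

```python
import re
from typing import Dict, Optional, Union


def parse_test_summary(output: str) -> Dict[str, Union[Optional[int], bool]]:
    """Extract counters and known error markers from connection test output."""
    summary: Dict[str, Union[Optional[int], bool]] = {}
    for key in ("successes", "failures"):
        match = re.search(rf"Test {key}:\s*(\d+)", output)
        # None means the counter line was not found in the text at all
        summary[key] = int(match.group(1)) if match else None
    # Error code quoted in this thread for an already-active login session
    summary["session_already_active"] = "0x712b8008" in output
    summary["timed_out"] = "Request timed out" in output
    return summary


sample = (
    "A login session to the iSCSI target device is already active. (0x712b8008)\n"
    "Request timed out.\n"
    "Test successes: 0\n"
    "Test failures: 2\n"
)
result = parse_test_summary(sample)
```

With the sample above, `result["failures"]` is 2 and both error flags are set, which matches the state described in this post: one SP reporting a stale session and the other timing out.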
khughes8675
6 Posts
June 27th, 2012 12:00
It is going to be a while before I can update the FLARE code, so anything I can do manually to get us by would really help.
lfmarrero
4 Posts
May 27th, 2014 07:00
I know it has been a while since this was posted, but I am curious as to what fixed the issue, since I am experiencing something very similar.