257 Posts

June 5th, 2007 05:00

Hi Jake

Is there any pattern to the failures? For example, are only Windows 2000 hosts affected?
Did you also upgrade the Flare code on the Clariion?

You may want to read the EMC KnowledgeBase solution emc161580 on Powerlink.

If you are still in trouble, please open an EMC SR and we'll investigate for you.

Cheers
James

11 Posts

June 5th, 2007 09:00

James, thanks for the reply.

All the hosts (those that are working and those that are having issues) are Windows 2003 Enterprise Edition SP1.

Yes, we did upgrade to Flare 24 back in Feb 07 (as well as all the host-based software to .24: admsnap, navicli, naviagent, etc.) based on a recommendation from Support to resolve a queued I/O problem we were having on the CX.

I took a look at emc161580 and we are running Solutions Enabler 6.4.0.5 on all the hosts, so I don't think that is related even though the log entries look similar. I may try Solutions Enabler 6.3.2.20 on one of the hosts just to see if that makes a difference.

If you think of anything else, please let me know. Otherwise it's off to Support!

Thanks again,

Jake

257 Posts

June 6th, 2007 00:00

Hi Jake,

Funnily enough, I just had a call with this exact error for consistently failing Sancopy jobs after upgrading to RM 5.0.2 (and upgrading navi agent/cli, etc.). All hosts except this one were working for Incr Sancopy.

I fixed it (at least this particular instance of the error) by adding the required entries to the Navisphere agent's agent.config file. The host had a blank agent.config.
Typically C:\Program Files\emc\navisphere agent\agent.config
I added the required lines, e.g.

user (username)@spA
user (username)@spB
user system@spA
user system@spB

(The user will need to be the same as all the other users in the RM configuration to ensure the clariion authorization works.)

Restart Naviagent service and then Replication Manager Client service.
Try your Incr sancopy job again.
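
For anyone checking several hosts, here is a minimal Python sketch that reports which of the "user" lines above are missing from an agent.config. It assumes the entries look exactly like the four lines shown (the word "user", a space, then name@sp) and compares them case-insensitively, which is an assumption rather than documented agent behaviour. The function name and the "rmadmin" account are hypothetical placeholders.

```python
from pathlib import Path


def missing_user_entries(config_text, username, sps=("spA", "spB")):
    """Return the 'name@sp' entries that have no 'user name@sp' line in the text."""
    present = set()
    for line in config_text.splitlines():
        parts = line.split()
        # Only lines of the exact form "user <name>@<sp>" are counted.
        if len(parts) == 2 and parts[0].lower() == "user":
            present.add(parts[1].lower())
    # Expect one entry per SP for the RM account and for "system".
    wanted = [f"{name}@{sp}" for name in (username, "system") for sp in sps]
    return [entry for entry in wanted if entry.lower() not in present]


if __name__ == "__main__":
    # Typical location quoted in the post; "rmadmin" is a placeholder account name.
    path = Path(r"C:\Program Files\emc\navisphere agent\agent.config")
    if path.exists():
        for entry in missing_user_entries(path.read_text(errors="replace"), "rmadmin"):
            print("missing: user", entry)
```

It only reads the file. You still add the lines by hand and restart the services as described above.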


Let me know how it goes for you.

Best of luck,
James.

1 Message

June 7th, 2007 11:00

Got the exact same issue. Running FLARE 24 and upgraded to RM 5.0 SP2. Tried the agent.config file and Solutions Enabler 6.3.2.20, no luck. The odd thing is that jobs are failing RANDOMLY, meaning the same job may fail and then work again.

11 Posts

June 9th, 2007 19:00

Yep, dima, I am seeing the same thing now that it has been running on 5.0.2 for a week. About 1/3 failure rate randomly.

I've double checked versions and agent.config settings on all hosts.

Do you see the same error every time it fails? Mine is always the same as in my first post.

11 Posts

June 9th, 2007 19:00

Agent.configs all look correct to me. Each host's agent.config includes any CX SPs that the host is zoned to see.

Not all hosts can see the remote SanCopy destination array, so those SPs aren't listed in the agent.config.

257 Posts

June 12th, 2007 10:00

Are ye guys making sure you are using the same username and password for the navi authentication for all RM clients?

The other thing my customer had was 1 client using Username X for navi authentication and username Y for RM authentication to the array.

257 Posts

June 17th, 2007 23:00

Guys,

We had another customer with this issue who opened a call with us, and engineering have identified the problem in the RM code. It is due to the new Flare 24 single management interface design.

As long as your issue can be confirmed to be the same, there is a hotfix available for you. If you open a Service Request and ask for Hotfix 31704 for RM 5.0.2, it will resolve your problem.

If you search using the Windows Search option
for "files containing the following words"
in the RM Client debug logs directory on the production or local mount host (C:\program Files\emc\rm\logs\client\)

and find the following:
main. CLAR Err: Error returned from Agent
main. CLAR Err: This command was sent to the SP that does not own the lun (0x71008043)

Then you can confirm it is the same issue and the hotfix will resolve this.
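
If Windows Search is awkward on a given host, the same check can be sketched in Python. This is only a plain substring search over the files in the client log directory. The function name is hypothetical, and the two marker strings are taken from the log lines quoted above.

```python
from pathlib import Path

# Substrings copied from the log lines quoted in the post.
MARKERS = (
    "CLAR Err: Error returned from Agent",
    "This command was sent to the SP that does not own the lun (0x71008043)",
)


def find_marker_lines(log_dir, markers=MARKERS):
    """Yield (file name, line number, text) for each log line containing a marker."""
    for path in sorted(Path(log_dir).iterdir()):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a log has odd bytes in it.
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if any(marker in line for marker in markers):
                    yield path.name, number, line.strip()


if __name__ == "__main__":
    log_dir = Path(r"C:\program Files\emc\rm\logs\client")
    if log_dir.is_dir():
        for name, number, text in find_marker_lines(log_dir):
            print(f"{name}:{number}: {text}")
```

It does not look into subdirectories and it does not decide anything for you. Hits on both strings are what you would quote in the Service Request.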

Thanks
James.

77 Posts

July 3rd, 2007 07:00

This is also happening on my setup after a new RM 5.0.2 install. I am waiting for Hotfix 31704 to fix the way naviseccli interprets the owner information in the listsessions output. If you have the jobs retry about 3 times at 60-second intervals, they will eventually grab the proper SP and the jobs will complete. This shouldn't be a permanent fix but will get you through until you receive the hotfix.
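
If you drive the jobs from your own script rather than from RM's retry settings, the retry pattern described above could be sketched like this in Python. Everything here is a generic illustration: `run_with_retries` and the `job` callable are hypothetical names, and nothing in it calls RM or naviseccli.

```python
import time


def run_with_retries(job, attempts=3, delay_seconds=60, sleep=time.sleep):
    """Call job() until it returns True or the attempts run out.

    Returns the number of the attempt that succeeded, or 0 if all failed.
    """
    for attempt in range(1, attempts + 1):
        if job():
            return attempt
        if attempt < attempts:
            sleep(delay_seconds)  # wait before the next try
    return 0
```

Here `job` would be whatever function starts your Sancopy job and returns True on success. The `sleep` parameter is only there so the wait can be swapped out when testing.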

257 Posts

July 5th, 2007 02:00

Hi Drozz,

I put the RM 5.0.2 hotfix on ftp://ftp.emc.com/incoming/31704 for you.

Please make sure to read the readme.

Thanks

7 Posts

August 14th, 2007 18:00

I have the same problem.
How can I get this fix (31704)?
My email is olivier.rousset@fr.netgrs.com
Thanks

257 Posts

August 15th, 2007 01:00

Hey there

You absolutely need to be using Replication Manager 5.0 SP2 - no other version will work.

I have uploaded the hotfix again for you
ftp://ftp.emc.com/incoming/31704/

Please read the readme :)

James.

7 Posts

August 16th, 2007 00:00

Thanks.
Do you know this problem?
I created two jobs, one for a clone and one for a Full Sancopy. They work separately,
but when I make the second dependent on the first I get this:
"
2007 08 16 09:36:09 000177 ERROR: Could not find an existing replica associated with job SanCopy CBSW103. Replicas mounted Read/Write without discard changes on unmount are not eligible for selection.
"
Thanks
Olivier

7 Posts

August 16th, 2007 03:00

Thanks.
If I understand correctly,
I must select "Create and mount a snap copy of the replica" in my clone job, which is my first job, and then run the Sancopy job.
Is that it?
I ask because the order of the jobs is Clone first and then Sancopy.
Thanks

257 Posts

August 16th, 2007 03:00

Hey Olivier

Yes, this is by design.

"discard changes on unmount are not eligible for selection"
This means that when you mount the local clone copy, you are actually accessing a snapshot of the local clone lun. This allows the local clone lun to stay static (i.e. no writes) while the SanCopy session is running underneath.

RM cannot do a Sancopy replication of a clone lun which is being accessed directly as read/write.


In RM 5.0, this option is in the job mount options and is called "Create and mount a snap copy of the replica". Make sure this is ticked in the mount options of your clone job and you should be fine :)

Hope this helps
James