Castromotorbox

217 Posts

5818

July 3rd, 2012 02:00

Timeout during Recovery with DataDomain

Hi,

We are experiencing a curious problem:

For some clients, when we start a recover job (from the client itself or as a directed restore, it doesn't matter), the console shows the job as running, but it just sits waiting for some 40-50 minutes...

These clients are Linux or Windows. There is no network difference between them.

Backups work perfectly, and we are using DDBoost.

Here is a log of one of these timed-out restores; it's a Windows machine.

In the middle of the log (which is pretty long, sorry, level 5), I marked where the timeout occurs.

Thanks for your help.

Greg

1 Attachment

Timed_out_Restore.txt

Responses(16)

Castromotorbox

217 Posts

1

August 27th, 2012 04:00

Hi,

I found the reason for this timeout: DFA (Direct File Access). I guess that after version 7.6.2 (?) the client tries to go directly to the DataDomain by default, which it couldn't do in our environment. We have to specify, with an empty file named "nodirectfile" (without extension) in a debug folder in the nsr folder, that this client must NOT go directly to the backup device.
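For anyone scripting this across several clients, here is a minimal sketch of creating that empty marker file. The function name `create_nodirectfile` is hypothetical, and the location of the nsr folder is an assumption that depends on where NetWorker is installed on each client, so it is passed in as a parameter rather than hard-coded.

```python
from pathlib import Path


def create_nodirectfile(nsr_dir: str) -> Path:
    """Create an empty 'nodirectfile' marker in <nsr_dir>/debug.

    nsr_dir is the NetWorker "nsr" folder on the client; its real
    location is installation-specific (assumption: the caller knows it).
    """
    debug_dir = Path(nsr_dir) / "debug"
    # Create the debug folder if it does not exist yet.
    debug_dir.mkdir(parents=True, exist_ok=True)
    marker = debug_dir / "nodirectfile"
    # touch() creates a zero-byte file with no extension.
    marker.touch(exist_ok=True)
    return marker


# Example call (the path is illustrative only, adjust to your install):
# create_nodirectfile("/nsr")
```

The file has no content; only its presence matters, as described above.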

Thanks for your help.

Greg

CarlosRojas

1.7K Posts

0

July 3rd, 2012 02:00

Hi Gregoire,

What is the NetWorker version on client and server?

I found this KB that could be useful:

https://solutions.emc.com/emcsolutionview.asp?id=esg108478

Thank you.

Carlos.

Castromotorbox

217 Posts

0

July 3rd, 2012 02:00

Hi Carlos,

So the NetWorker server is a Windows machine with 7.6.2 build 697.

The client is also a Windows machine, with NetWorker client 7.6.2.1 build 638.

I tried these restores with many sorts of files, and I use the noforce option so that the recovered files are not written to the client machine.

Thanks for your help.

Greg

CarlosRojas

1.7K Posts

0

July 3rd, 2012 02:00

Hi,

I forgot to ask: did you try to restore any other files?

Recovering daemon.raw should fail, and I wouldn't suggest testing with that type of restore, i.e. recovering NetWorker's own production files.

Thank you.

Carlos.

CarlosRojas

1.7K Posts

0

July 3rd, 2012 03:00

Hi Greg,

Could you please upgrade the clients to the same build as the NetWorker server and try again?

Was the data backed up using NetWorker server devices, a storage node, or DFA?

Thank you.

Carlos.

Castromotorbox

217 Posts

0

July 3rd, 2012 04:00

Hi Carlos,

OK, I updated my client to build 697. Server, storage node and client now have the same version.

Now it works for this client!

I'm going to try with the others and will give you feedback as soon as possible.

Thanks.

Greg

CarlosRojas

1.7K Posts

1

July 3rd, 2012 04:00

Hi Greg,

Try to have the NW server, storage node and client on the same version.

Did you configure the client to restore through the storage node or directly from the server?

Thank you.

Carlos.

Castromotorbox

217 Posts

0

July 3rd, 2012 04:00

Hi,

We back up to a DataDomain device through a storage node.

I'll upgrade the client and try again, and give you some feedback soon.

Thanks

Greg

CarlosRojas

1.7K Posts

0

July 3rd, 2012 05:00

Hi Greg,

Good to hear that.

I'll wait for your feedback. Anyway, did you restore the same files or something else?

And in the previous attempts, did the files get restored despite the timeout or not?

Thank you.

Carlos.

Castromotorbox

217 Posts

0

July 3rd, 2012 05:00

Hi,

I restored the same file, but with the noforce option. In the previous attempts, the restore was successful (with only skipped files).

Greg

CarlosRojas

1.7K Posts

0

July 4th, 2012 23:00

Hi Greg,

So is the problem solved now?

Thank you.

Carlos.

Castromotorbox

217 Posts

0

July 5th, 2012 07:00

Hi Carlos,

Unfortunately, we still have the problem on some clients (especially the Linux ones).

The version is now the same for the NetWorker server, storage node and client (7.6.2.7 build 697).

I no longer know where I'm supposed to look.

Thanks for your help

Greg

Castromotorbox

217 Posts

0

July 5th, 2012 07:00

And we are still blocked at this point of the restore:

Getting function address for ddp_filecopy_stop...1eea5b00

Getting function address for ddp_filecopy_status...1eea5bb0

DDCL_INIT: success

83219:recover: DDP LOG: ddcl_vrapid_get_host_ip: host_name bechu-bck0001.media.int has host_ip 0.0.0.0

83219:recover: DDP LOG: ddcl_vrapid_set_host_ip: host_name= bechu-bck0001.media.int with host_ip 10.157.64.77

83219:recover: DDP LOG: ddcl_vrapid_get_host_port: host_name bechu-bck0001.media.int for pgm = 100005 has port 0

After a long time (30 minutes) the restore completes successfully...
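Since the log stalls right after the lookup for program 100005 (the standard RPC program number for mountd), a quick check of name resolution and TCP reachability from the affected client may help narrow things down. Below is a minimal sketch using only the Python standard library. The helper name `probe` is hypothetical, and the port list in the usage comment (111 for the portmapper, 2049 for NFS, 2052 for mountd on the DD) is an assumption to verify against your own DD and firewall configuration.

```python
import socket
import time


def probe(host, ports, timeout=5.0):
    """Resolve host, then attempt a TCP connect to each port.

    Returns (address, {port: (status, seconds)}). If the name cannot be
    resolved, address is None and the dict is empty.
    """
    try:
        addr = socket.gethostbyname(host)
    except OSError:
        return None, {}
    results = {}
    for port in ports:
        start = time.monotonic()
        try:
            # A short timeout makes a blocked port show up in seconds
            # rather than after a long wait.
            with socket.create_connection((addr, port), timeout=timeout):
                status = "open"
        except OSError as exc:
            status = f"failed: {exc}"
        results[port] = (status, round(time.monotonic() - start, 2))
    return addr, results


# Example call from an affected client (ports are assumptions):
# print(probe("bechu-bck0001.media.int", [111, 2049, 2052]))
```

A port that reports a failure only after the full timeout (rather than an immediate refusal) would point to traffic being silently dropped between the client and the DD.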

Greg

CarlosRojas

1.7K Posts

0

July 8th, 2012 00:00

Hi Greg,

So it looks like the "delay" is taking place in the DD itself, correct?

What DD model do you have?

Are you using DD archiving?

Do you have the routing configured correctly on the DD and on the NW server/storage node?

It looks like routing is not working as expected, right?

Thank you.

Carlos.

Castromotorbox

217 Posts

0

July 9th, 2012 05:00

Hi Carlos,

Yes, it seems like that. We use a DD670 with OS 5.0.2, with no DD archiving but with DDBoost. I guess our routing is correctly configured, as the restore works fine for other clients... A question I have: on our DD we have a first interface, veth0, which serves for management, and a second one, veth1, for backup and restore. Could it be that the DD first tries veth0 for the restore and, when it sees that this isn't working, tries the second one, and that this happens only for some Linux clients?

Thanks a lot for your help.
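One thing that can be checked from the client side is which local address (and therefore which interface and route) the operating system would pick to reach a given DD address. The sketch below uses a connected UDP socket, which sends no packets but makes the kernel select a route. The function name `local_source_ip` is hypothetical, and it only shows the client's view; it says nothing about which interface the DD itself answers on.

```python
import socket


def local_source_ip(target_ip, port=2049):
    """Return the local IP the OS would use to reach target_ip.

    connect() on a UDP socket transmits nothing; it only asks the
    kernel to choose a route and source address. The port number is
    arbitrary here (2049 is just a placeholder).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect((target_ip, port))
        return s.getsockname()[0]


# Example call, using the DD address seen in the log above:
# print(local_source_ip("10.157.64.77"))
```

Comparing the result on a working client and on a failing one would show whether they leave through the same network.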

Greg
