Unsolved

January 7th, 2014 12:00

Backups fail: Server Busy (NW 8.1)

About a month ago we upgraded from 8.0 to 8.1 and since then we have random problems with backups failing to the NW server. All other storage nodes are fine. The jobs eventually time out and just give me the following:

SQL02.domain.com:index 98519:save: Unable to setup direct save with server backup01.domain.com: busy.


Looks like the indexes are timing out. We have a dedicated pool for them with media available. Any idea? Support got them working once by disabling and enabling the storage node for the NW server under Devices. The nsrmmd showed as UNKNOWN. Afterwards it changed to 8.1.0.3.Build.219. Periodically it changes back to UNKNOWN. Now nothing is working.

14.3K Posts

January 8th, 2014 00:00

Did you try to disable/enable them too?

1.3K Posts

January 8th, 2014 00:00

During the upgrade, did you uninstall 8.0 and then install 8.1, or did you upgrade in place?

February 28th, 2014 01:00

Hi. Same symptoms on Solaris 10/NW 8.1.0.4, upgraded from 7.6.1.2.

We have seen some groups stalled for one or two days. The group log says:

--- Unsuccessful Save Sets ---

* backup:index suppressed 1122 bytes of output.

* backup:index 98519:save: Unable to setup direct save with server backup: busy.

Any ideas?

1.3K Posts

February 28th, 2014 02:00

Alberto,

Did you check whether there are any active tape alerts for any other pool?

14.3K Posts

February 28th, 2014 03:00

Try to disable direct save. Actually, if this is not an index group, then set no index save on the group, create a separate index group with the clients in it, and set index only in its group properties. After that you can focus your tests on the index group, as that one will save the index and bootstrap only.

February 28th, 2014 03:00

Thanks! No, there haven't been any tape alerts for a long time. There were enough recyclable tapes and available drives.

February 28th, 2014 03:00

Thanks Hrvoje. The sample log that I copied earlier was from a group that only copies indexes from all clients once a week. But this error also appears in other groups with standard save sets, not only index save sets:

NetWorker savegroup: (alert) casigu3_off_dia aborted, Total 1 client(s), 1 Failed, Cloning failed. See group completion details for more information.

Failed: casilda3_redbk

Start time:  Wed Feb 26 15:27:02 2014
End time:    Fri Feb 28 09:59:09 2014

Automatic cloning of save sets to pool vctiunidiapc never started.
Cloned save sets: 0 Total, 0 Failed


--- Never Started Save Sets ---

savegrp: casilda3_redbk:index index never started


--- Unsuccessful Save Sets ---

  casilda3_redbk:/: retried 1 times.
  casilda3_redbk:/var: retried 1 times.
  casilda3_redbk:/var aborted.
  casilda3_redbk:/oracle/client: retried 1 times.
  casilda3_redbk:/oracle/JNE: retried 1 times.
  casilda3_redbk:/oracle/JNE aborted.
* casilda3_redbk:/sapmnt/JNE suppressed 296 bytes of output.
* casilda3_redbk:/sapmnt/JNE 98519:save: Unable to setup direct save with server backup: retry needed.
* casilda3_redbk:/sapmnt/JNE 98519:save: Unable to setup direct save with server backup: retry needed.
...
* casilda3_redbk:/sapmnt/JNE 98519:save: Unable to setup direct save with server backup: retry needed.
* casilda3_redbk:/sapmnt/JNE (interrupted), exiting
* casilda3_redbk:/sapmnt/JNE Termination request was sent to job 12093 as requested; Reason given: Aborted
* casilda3_redbk:/usr/sap/JNE suppressed 296 bytes of output.
* casilda3_redbk:/usr/sap/JNE 98519:save: Unable to setup direct save with server backup: retry needed.
...
* casilda3_redbk:/usr/sap/JNE 98519:save: Unable to setup direct save with server backup: retry needed.
* casilda3_redbk:/usr/sap/JNE (interrupted), exiting
* casilda3_redbk:/usr/sap/JNE Termination request was sent to job 12094 as requested; Reason given: Aborted
* casilda3_redbk:/usr/sap/trans suppressed 296 bytes of output.
* casilda3_redbk:/usr/sap/trans 98519:save: Unable to setup direct save with server backup: retry needed.

14.3K Posts

February 28th, 2014 03:00

All these clients have the same error: Unable to setup direct save with server backup.

So, to me the logical next step is to disable direct save (it is in the client properties). Try that and see if it makes any difference.

6 Posts

February 23rd, 2015 10:00

Your drives are probably unmounted. Go to Devices, right-click the appropriate device, and select Mount.

14.3K Posts

February 27th, 2015 01:00

If you reached the max number of streams, then legacy mode over the SN won't help either, as those streams would just be rerouted. If you suspect you reached that, check the DD logs or the logs in /nsr/logs/sg/ to see if there is anything more from the DDBoost API there.

"Server busy" is too general; it could be caused by many things (including the network path).

12 Posts

February 27th, 2015 01:00

Hi Hrvoje

I have a doubt here. We are using client direct save to DD with DDBoost, and we frequently get these warnings/errors.

As I understand it, client direct sends all the data directly to the DD instead of sending it through the NetWorker server or an SN. So if the DD is fully occupied with save streams or has reached max throughput, it's understandable that the DD can't accept any new sessions and drops back to legacy mode. But this message says that my backup server is busy.

I wonder, in the first place, why is my backup server that busy when no backups are going through it? And why is it unable to set up a direct save to the DD when the NW server is busy (given that it isn't involved in sending the data)?

Please help me to understand better.

Thanks!!

12 Posts

February 27th, 2015 02:00

I mean that, in general, my DD is never fully occupied with save streams, according to the DD report I got.

I'm thinking about how to avoid the server busy issue. Would increasing the RAM/CPUs of the backup server help?

I have even checked the RAM and CPU levels, and they look quite normal at the OS level.

Is there any way I can avoid these server busy issues?

14.3K Posts

February 27th, 2015 03:00

It depends on what "server busy" is really caused by in this case. It sounds more like it can't serve sessions due to parallelism settings or something else. I don't think jumping to memory/CPU would help; rather, focus on tests to find the pattern (e.g. is it the same client(s), is it always around the same time, check daemon and system logs, and similar).
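
For anyone who wants to look for such a pattern, here is a minimal Python sketch that counts the "Unable to setup direct save" messages per client and reason. It is only an illustration: the regular expression is modelled on the log lines quoted earlier in this thread, the function names are made up for this example, and other NetWorker versions may word the message differently.

```python
import re
from collections import Counter

# Pattern modelled on the messages quoted in this thread, for example:
#   * casilda3_redbk:/sapmnt/JNE 98519:save: Unable to setup direct save with server backup: retry needed.
# Assumption: other versions or locales may format the line differently.
DIRECT_SAVE_RE = re.compile(
    r"^\s*\*?\s*(?P<client>[^:\s]+):(?P<saveset>\S+)\s+98519:save: "
    r"Unable to setup direct save with server (?P<server>\S+): "
    r"(?P<reason>.+?)\.?\s*$"
)


def parse_direct_save_failures(lines):
    """Yield (client, saveset, server, reason) for each matching log line."""
    for line in lines:
        match = DIRECT_SAVE_RE.match(line)
        if match:
            yield (match["client"], match["saveset"],
                   match["server"], match["reason"])


def summarize(lines):
    """Count failures per (client, reason) so repeat offenders stand out."""
    counts = Counter()
    for client, _saveset, _server, reason in parse_direct_save_failures(lines):
        counts[(client, reason)] += 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from the posts above; replace them with the
    # lines of your own group completion report.
    sample = [
        "SQL02.domain.com:index 98519:save: Unable to setup direct save "
        "with server backup01.domain.com: busy.",
        "* casilda3_redbk:/sapmnt/JNE 98519:save: Unable to setup direct "
        "save with server backup: retry needed.",
        "* casilda3_redbk:/sapmnt/JNE suppressed 296 bytes of output.",
    ]
    for (client, reason), count in summarize(sample).most_common():
        print(f"{client}: {count} x {reason}")
```

Feeding it the completion reports from several nights should show whether the failures cluster on the same clients. Both the "busy" and the "retry needed" wording are captured as the reason, so the counts can be compared across versions.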

35 Posts

February 7th, 2016 11:00

I guess this is a known issue on 8.1.3.0. Support says that going to 8.2.2.4 would help in fixing the issue.

146 Posts

January 17th, 2017 10:00

This issue isn't related to the version of NetWorker you are using. Sorry for rehashing an older thread, but I battle with this issue constantly with our SQL backups! After two months of battling this problem with a highly skilled EMC resident, multiple SRs, and finally a full upgrade from NetWorker 8.2.1.4 to 8.2.3.8, the same issue persists. The only difference is this:

Networker 8.2.1.4

Unable to setup direct save with server XXXXX.com: busy.

Networker 8.2.3.8

Unable to setup direct save with server XXXXX.com: retry needed.

So, the issue remains, but they changed it to read "retry needed", from "busy".

Has anyone found out what is causing this? EMC support cannot seem to get to the bottom of it. We have a lot of horsepower running our backup environment, so I should not be pushing anything past its limits. We have submitted many logs; there does not seem to be a particular time when this happens, and it does not affect the same clients each time. They have even gone through my Data Domain support bundles, and we have yet to discover a pattern.

Also, while this is occurring, other backups seem pretty much unaffected, but I do see the occasional failure of an Exchange backup or other backup job while the problem is happening. This leads me to believe that the issue is not specific to SQL, but since SQL transaction backups run much more frequently than other backups, the issue may just manifest itself more with those. Then again, when this does happen, it also kills my SQL full backups that are running. My only way to solve the problem is a full services restart, which kills everything else, too.
