nsrjobd.exe on NetWorker server crashes when savepnpc Oracle backups are running
We have a Windows Server 2008 R2 SP1 server running NetWorker version 7.6.5.3.Build.1195.
Two or three times a month, nsrjobd.exe crashes while savepnpc backups of Oracle on AIX are running.
I have already applied several NetWorker updates over the last few months, and the Windows server is patched to the latest level. Nothing has helped.
Here is the report.wer from last night:
Version=1
EventType=BEX64
EventTime=130222957890058334
ReportType=2
Consent=1
ReportIdentifier=58274b8f-110a-11e3-adaa-f4ce46b94044
IntegratorReportIdentifier=58274b8e-110a-11e3-adaa-f4ce46b94044
Response.type=4
Sig[0].Name=Application Name
Sig[0].Value=nsrjobd.exe
Sig[1].Name=Application Version
Sig[1].Value=7.6.5.3
Sig[2].Name=Application Timestamp
Sig[2].Value=5155db4a
Sig[3].Name=Fault Module Name
Sig[3].Value=StackHash_1dc2
Sig[4].Name=Fault Module Version
Sig[4].Value=0.0.0.0
Sig[5].Name=Fault Module Timestamp
Sig[5].Value=00000000
Sig[6].Name=Exception Offset
Sig[6].Value=0000000000000000
Sig[7].Name=Exception Code
Sig[7].Value=c0000005
Sig[8].Name=Exception Data
Sig[8].Value=0000000000000008
--------------------------
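For anyone who wants to read a report.wer like the one above without squinting at it, here is a minimal Python sketch. It pairs up the Sig[n].Name/Sig[n].Value lines and converts EventTime, assuming that field is a Windows FILETIME (100 ns ticks since 1601-01-01 UTC). The helper names parse_wer and filetime_to_utc are made up for this example, and only a subset of the report is embedded.

```python
from datetime import datetime, timedelta, timezone

# Subset of the report.wer posted above (only the fields used here).
REPORT = """\
EventType=BEX64
EventTime=130222957890058334
Sig[0].Name=Application Name
Sig[0].Value=nsrjobd.exe
Sig[3].Name=Fault Module Name
Sig[3].Value=StackHash_1dc2
Sig[7].Name=Exception Code
Sig[7].Value=c0000005
"""

# Only the one code seen in this thread; extend as needed.
KNOWN_CODES = {0xC0000005: "STATUS_ACCESS_VIOLATION"}


def parse_wer(text):
    """Return (plain key/value fields, {signature name: signature value})."""
    fields, names, values = {}, {}, {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line
        if key.startswith("Sig[") and key.endswith("].Name"):
            names[key[4:-6]] = value      # index between "Sig[" and "].Name"
        elif key.startswith("Sig[") and key.endswith("].Value"):
            values[key[4:-7]] = value     # index between "Sig[" and "].Value"
        else:
            fields[key] = value
    sigs = {names[i]: values.get(i, "") for i in names}
    return fields, sigs


def filetime_to_utc(ticks):
    """Convert 100 ns ticks since 1601-01-01 UTC (assumed format) to a datetime."""
    epoch = datetime(1601, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(microseconds=ticks // 10)


fields, sigs = parse_wer(REPORT)
code = int(sigs["Exception Code"], 16)
crash_utc = filetime_to_utc(int(fields["EventTime"]))

print(sigs["Application Name"], "faulted in", sigs["Fault Module Name"])
print("exception code:", hex(code), KNOWN_CODES.get(code, "unknown"))
print("event time (UTC):", crash_utc.isoformat())
```

Under that FILETIME assumption the event time comes out as 2013-08-30 00:23:09 UTC, which lines up with the 2:23:09 AM entries in the daemon.log if the server clock is at UTC+2 (also an assumption). Exception code c0000005 is an access violation.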
And the daemon.log:
8/30/2013 2:23:09 AM savegrp Session channel closed by nsrjobd, exit code: (unknown) 076641
8/30/2013 2:23:09 AM savegrp Session channel closed by nsrjobd, exit code: (unknown) 039078
8/30/2013 2:23:09 AM savegrp RPC error: Connection lost to nsrjobd76641
8/30/2013 2:23:09 AM savegrp Session channel closed by nsrjobd, exit code: (unknown) 076641
8/30/2013 2:23:09 AM savegrp Session channel closed by nsrjobd, exit code: (unknown) 076641
8/30/2013 2:23:09 AM savegrp Session channel closed by nsrjobd, exit code: (unknown) 076641
8/30/2013 2:23:09 AM savegrp Session channel closed by nsrjobd, exit code: (unknown) 039078
8/30/2013 2:23:09 AM savegrp RPC error: Connection lost to nsrjobd39078
8/30/2013 2:23:09 AM savegrp RPC error: Connection lost to nsrjobd39078
8/30/2013 2:23:09 AM savegrp RPC error: Connection lost to nsrjobd39078
8/30/2013 2:23:09 AM savegrp RPC error: Connection lost to nsrjobd42506
8/30/2013 2:23:09 AM nsrd networker daemons warning: nsrjobd exited with status code 255
39078 8/30/2013 2:23:09 AM savegrp RPC error: Connection lost to nsrjobd0
8/30/2013 2:23:09 AM nsrjobd NetWorker
0 8/30/2013 2:23:09 AM nsrjobd 7.6.5.3.Build.1195
0 8/30/2013 2:23:09 AM nsrjobd 1195
0 8/30/2013 2:23:09 AM nsrjobd Fri Mar 29 10:17:07 2013
0 8/30/2013 2:23:09 AM nsrjobd Build arch.: ntx64
0 8/30/2013 2:23:09 AM nsrjobd DBG=0,OPT=
39074 8/30/2013 2:23:09 AM nsrjobd JOBS notice: Opening RAP database
42506 8/30/2013 2:23:11 AM nsrd networker daemons info: Successfully restarted nsrjobd with PID 6832
39074 8/30/2013 2:23:11 AM nsrd JOBS notice: All scheduled jobs which were in progress were killed
---------------
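To pull every nsrjobd crash out of a longer rendered daemon.log, something like the following Python sketch might help. It only looks for the "nsrjobd exited with status code" line that nsrd writes in the excerpt above and assumes the timestamp format shown there (M/D/YYYY h:mm:ss AM/PM). The function name find_nsrjobd_exits is hypothetical, and the two sample lines are copied from the excerpt.

```python
import re
from datetime import datetime

# Matches lines such as:
#   8/30/2013 2:23:09 AM nsrd networker daemons warning: nsrjobd exited with status code 255
EXIT_RE = re.compile(
    r"(\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M)\s+nsrd\b.*"
    r"nsrjobd exited with status code (\d+)"
)


def find_nsrjobd_exits(lines):
    """Return a list of (timestamp, status code) for each nsrjobd exit line."""
    exits = []
    for line in lines:
        match = EXIT_RE.search(line)
        if match:
            when = datetime.strptime(match.group(1), "%m/%d/%Y %I:%M:%S %p")
            exits.append((when, int(match.group(2))))
    return exits


# Two lines taken from the daemon.log excerpt above.
SAMPLE = [
    "8/30/2013 2:23:09 AM nsrd networker daemons warning: nsrjobd exited with status code 255",
    "42506 8/30/2013 2:23:11 AM nsrd networker daemons info: Successfully restarted nsrjobd with PID 6832",
]

exits = find_nsrjobd_exits(SAMPLE)
for when, status in exits:
    print(when.isoformat(sep=" "), "nsrjobd exited with status", status)
```

For a real file, pass an open file handle instead of SAMPLE, for example find_nsrjobd_exits(open(path, errors="replace")), where path points at your rendered log.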
I'm running out of ideas. Can someone help?
Herman Regterschot.
Herman1978
August 30th, 2013 02:00
Jobsdb size = 60 MB (I raised it from 40 MB this morning).
Retention = 3 days
Job inactivity timeout: 0
I have deleted the jobsdb and the tmp directories many times, but unfortunately it doesn't help.
It always happens when savepnpc Oracle/AIX backups are running.
We have never had this problem while Windows/NMM (Exchange) backup jobs are running.
CarlosRojas
August 30th, 2013 02:00
Hi Herman,
Does nsrjobd always crash at around the same time? If so, at what time, and which processes or operations are running then? Maybe nsrim is running?
Could you please attach a rendered copy of the daemon.log so we can see what is going on when the crash occurs?
Thank you,
Carlos
Herman1978
August 30th, 2013 02:00
Hi Carlos,
Here are the timestamps of the most recent crashes (day-month, time):
30-8 2:23
7-8 2:18
3-8 2:50
1-8 2:16
17-7 3:38
3-7 3:39
24-6 23:22
3-6 23:22
17-5 2:18
15-5 23:22
The daemon.log from the last few days is attached.
Thanks.
Herman.
1 Attachment
daemon - Copy.log
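To see at a glance whether the crash times listed above cluster around particular hours, a quick tally is enough. This is a minimal Python sketch over exactly the ten timestamps posted above; the helper name parse_crashes is made up for the example, and no year is assumed because the list does not give one.

```python
from collections import Counter

# Crash list as posted above: day-month, then hour:minute (24-hour clock).
CRASHES = """\
30-8 2:23
7-8 2:18
3-8 2:50
1-8 2:16
17-7 3:38
3-7 3:39
24-6 23:22
3-6 23:22
17-5 2:18
15-5 23:22
"""


def parse_crashes(text):
    """Return a list of (day-month, hour, minute) tuples."""
    crashes = []
    for line in text.splitlines():
        day_month, clock = line.split()
        hour, minute = (int(part) for part in clock.split(":"))
        crashes.append((day_month, hour, minute))
    return crashes


crashes = parse_crashes(CRASHES)

# How many crashes fall in each hour of the day.
by_hour = Counter(hour for _, hour, _ in crashes)
# How often the exact same clock time repeats.
by_time = Counter(f"{hour}:{minute:02d}" for _, hour, minute in crashes)

for hour, count in sorted(by_hour.items()):
    print(f"{hour:02d}:00-{hour:02d}:59  {count} crash(es)")
print("repeated times:", {t: n for t, n in by_time.items() if n > 1})
```

With the list above, five crashes fall in the 02:00 hour, two in the 03:00 hour and three at exactly 23:22, so comparing those times against the group start times and any scheduled maintenance might be worthwhile.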
CarlosRojas
August 30th, 2013 02:00
Hello Herman,
What is the jobsdb size and retention values you have set?
Have you tried deleting the jobsdb? To do so, stop the NetWorker services on the NetWorker server, then delete the /nsr/res/jobsdb folder and, while the services are still stopped, the /nsr/tmp folder.
Thank you,
Carlos
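The cleanup described above could be scripted roughly as follows. Note that this Python sketch renames the two folders with a timestamp suffix instead of deleting them, which is a swapped-in, more cautious variant of the advice, so the old jobsdb can still be inspected later. It assumes the NetWorker services are already stopped and that the caller passes the real install root; the function name set_aside is hypothetical, and the demo at the bottom runs against a throwaway temporary directory, not a real /nsr tree.

```python
import tempfile
import time
from pathlib import Path


def set_aside(nsr_root, stamp=None):
    """Rename res/jobsdb and tmp under nsr_root; return the new paths.

    Assumes the NetWorker services are stopped before this is called.
    """
    root = Path(nsr_root)
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    moved = []
    for relative in ("res/jobsdb", "tmp"):
        source = root / relative
        if source.is_dir():
            target = source.with_name(f"{source.name}.{stamp}")
            source.rename(target)  # same volume, so this is a plain rename
            moved.append(target)
    return moved


# Demo on a temporary stand-in for the nsr directory (not a real install).
with tempfile.TemporaryDirectory() as fake_nsr:
    (Path(fake_nsr) / "res" / "jobsdb").mkdir(parents=True)
    (Path(fake_nsr) / "tmp").mkdir()
    moved = set_aside(fake_nsr, stamp="demo")
    moved_names = [path.name for path in moved]
    originals_gone = not (Path(fake_nsr) / "res" / "jobsdb").exists()
    print("renamed to:", moved_names)
```

Once the services are started again and the server behaves, the renamed folders can be removed by hand.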
CarlosRojas
September 9th, 2013 02:00
Hello,
Can you please check the maintenance tasks with the Oracle DBA? For example, how often does he or she run RMAN cross-checks?
RMAN cross-checks are very resource-intensive on the NetWorker server side, so many cross-checks running alongside many backup/restore jobs could be a potential cause of the issue. Otherwise, I think it would be better to raise a case with support, in case the nsrjobd crash is a known issue that has already been addressed, either as a binary or in a new build.
Thank you,
Carlos