November 2nd, 2012 05:00

emc networker probe job had unrecoverable failure(s) this job is being abandoned

We run NetWorker 7.5.2, and when we try to run the Oracle backups we get this message: 'emc networker probe job had unrecoverable failure(s) this job is being abandoned'

I ran savegrp -vv -p -c ODBPROD -G against the client for each Oracle savegroup, and these are the results:

H:\>savegrp -vv -p -c ODBPROD -G OracleERPProdDailyHotBackup

32451:savegrp: odbprod:/archlogs/arch                    level=1

32451:savegrp: odbprod:/u01/oracle/proddb                level=1

32451:savegrp: odbprod:/u02/backups                      level=1

7236:savegrp: Group will not limit job parallelism

32493:savegrp: odbprod:probe                                 started

savefs -s alcbackup01 -c odbprod -g OracleERPProdDailyHotBackup -p -l full -R -v -F /archlogs/arch /u01/oracle/proddb /u02/backups

40473:savegrp: command ' savefs -s alcbackup01 -c odbprod -g OracleERPProdDailyHotBackup -p -l full -R -v -F /archlogs/arch /u01/oracle/proddb /u02/backups' for client odbprod exited with return code 1.

32496:savegrp: job (164285) host: odbprod savepoint: odbprod:Probe had SEVERE indication(s) at completion.

savegrp OracleERPProdDailyHotBackup: Warning - the job output is NULL

7340:savegrp: odbprod:probe abandoned.

7076:savegrp: --- Probe Summary ---

odbprod:Probe                        level=full, dn=-1, mx=0, vers=pools, p=1

odbprod:Probe                        level=full, pool=OraclePool, save as of 11/2/2012 7:34:52 AM

odbprod:/archlogs/arch               level=1, dn=-1, mx=0, vers=pools, p=1

odbprod:/archlogs/arch                 level=1, pool=OraclePool, save as of 10/29/2012 8:46:54 AM

odbprod:/u01/oracle/proddb           level=1, dn=-1, mx=0, vers=pools, p=1

odbprod:/u01/oracle/proddb             level=1, pool=OraclePool, save as of 10/29/2012 8:46:57 AM

odbprod:/u02/backups                 level=1, dn=-1, mx=0, vers=pools, p=1

odbprod:/u02/backups                   level=1, pool=OraclePool, save as of 10/29/2012 8:46:56 AM

odbprod:index                        level=1, dn=-1, mx=0, vers=pools, p=1

odbprod:index                           level=1, pool=OraclePool, save as of 11/1/2012 3:16:24 AM

7241:savegrp: nsrim run recently, skipping

March 19th, 2013 06:00

Hi Alf,

Of course, if that server is not fully up and running, the NetWorker services on it won't be either, so there is no nsrexecd on the client for the backup server to communicate with.

Thank you,

Carlos.

November 2nd, 2012 07:00

Hi,

What is the Oracle version on that client, and what is the OS?

Most probably this is a network connection or name resolution issue. Check the following.

From the backup server to the NetWorker client (using both the short name and the FQDN):

  • check the hosts file to see if you can find the client (IP address, FQDN, short name) and ensure there are no duplicate entries
  • ping
  • nslookup
  • rpcinfo -p

If all of this works, do the following from the client to the backup server (using both the short name and the FQDN):

  • check the hosts file to see if you can find the backup server
  • check that the backup server is listed in the servers file under the nsr directory
  • ping
  • nslookup
  • rpcinfo -p

Also double-check the aliases of the client and make sure they include all possible aliases (short name, FQDN, IP address).
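
For the hosts-file check in the lists above, one rough way to spot duplicate or conflicting entries is to scan the file for any name that is mapped to more than one address. The following is only a sketch: the function name `find_conflicting_hosts` and the sample addresses (taken from the 192.0.2.0/24 documentation range) are made up for illustration and are not from this thread.

```python
from collections import defaultdict


def find_conflicting_hosts(hosts_text):
    """Return {name: sorted addresses} for names mapped to more than one address."""
    addresses_by_name = defaultdict(set)
    for raw_line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        for name in names:
            # Host names are case-insensitive, so compare them in lower case.
            addresses_by_name[name.lower()].add(address)
    return {
        name: sorted(addresses)
        for name, addresses in addresses_by_name.items()
        if len(addresses) > 1
    }


# Hypothetical hosts-file content, for illustration only.
sample_hosts = """
192.0.2.10  odbprod.example.com odbprod
192.0.2.11  ODBPROD              # stale entry
192.0.2.20  alcbackup01
"""

for name, addresses in find_conflicting_hosts(sample_hosts).items():
    print(f"{name} is mapped to several addresses: {', '.join(addresses)}")
```

To run it against a real file, read the file's text and pass it in; the file is usually /etc/hosts on Linux and C:\Windows\System32\drivers\etc\hosts on Windows. Note that a name such as localhost is often listed for both 127.0.0.1 and ::1, so it may be reported even though that is normal.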

Hope this helps. Waiting for your updates.

Ahmed Bahaa

November 2nd, 2012 08:00

Thank you for the response.

We use Oracle version 11.5.10.2 and the OS is Red Hat Linux.

"check hosts-file if you can find the client (IP Address, FQDN, Shortname) and ensure there is no duplicated entries"

How do I check this?

That was the only one I was unable to figure out; the rest is as follows:

Ping works, and nslookup ODBPROD returns the server name and IP address, but when I run rpcinfo -p ODBPROD I receive this message:

H:\>rpcinfo -p ODBPROD

rpcinfo: can't contact lgtomapper: 352:Remote system error - No connection could be made because the target machine actively refused it.

Program    vers    proto    port
100000     2       tcp      111
100000     2       udp      111
100024     1       udp      802
100024     1       tcp      805
100021     1       udp      1026
100021     3       udp      1026
100021     4       udp      1026
100021     1       tcp      32772
100021     3       tcp      32772
100021     4       tcp      32772
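
The "actively refused" message above means a TCP connection attempt to the client was rejected, which usually means nothing is listening on that port. A quick way to test a single port from the backup server is sketched below. The helper name `tcp_port_open` is made up for illustration, and the port numbers in the note after the code are an assumption rather than something stated in this thread.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For example, `tcp_port_open("odbprod", 7938)` would test one port on the client. NetWorker's client daemon and its portmapper are commonly documented as listening on 7937 and 7938 by default; treat those numbers as an assumption and confirm them for your installation. A result of False for both would be consistent with nsrexecd not running on the client.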

November 2nd, 2012 09:00

I just found some additional information, and I am wondering if this could be the cause. I went back to our server room and looked at the Oracle station, and it is showing ODBPROD as not fully loaded; see the screenshot (ODBPRODScreenshot1.jpg).

Since it still shows as loading, the logical explanation would be that the EMC commands have yet to load. The IT department here is going to reboot ODBPROD again and see if that clears the issue.

March 19th, 2013 05:00

In most cases a probe failure is due to connection issues to the client or to DNS resolution problems.
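
To check the DNS side, it can help to resolve the client name to its addresses and then resolve each address back to a name, since a mismatch between the two directions is a typical name resolution problem. This is a minimal sketch using Python's standard socket module; the function name `check_name_resolution` is made up for illustration, and the resolver functions are passed in as parameters only so that the logic can be tried without a live DNS server.

```python
import socket


def check_name_resolution(name, forward=socket.gethostbyname_ex,
                          reverse=socket.gethostbyaddr):
    """Resolve name to addresses, then resolve each address back to names.

    Returns a list of (address, reverse_names, consistent) tuples, where
    consistent is True when the short form of `name` appears among the
    names returned by the reverse lookup.
    """
    short = name.split(".")[0].lower()
    _, _, addresses = forward(name)
    results = []
    for address in addresses:
        try:
            primary, aliases, _ = reverse(address)
            reverse_names = [primary, *aliases]
        except OSError:
            # No reverse record, or the reverse lookup failed.
            reverse_names = []
        reverse_shorts = {n.split(".")[0].lower() for n in reverse_names}
        results.append((address, reverse_names, short in reverse_shorts))
    return results
```

Calling `check_name_resolution("odbprod")` on the backup server, and the equivalent with the backup server's name on the client, gives one tuple per address. Any tuple whose last element is False points at an address whose reverse lookup does not lead back to the expected host. If the forward lookup itself fails, the function raises the resolver's error.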
