Index backup not starting

Question

Hi, I back up a database to disk with RMAN first, then send those backup files to Legato with a second RMAN script. For a while now, after the backup pieces have been written to the Legato server, no index backup starts, so the green 'OK' indicator never appears on the group tab. I can't see anything useful in daemon.log; it just says 'nsrd: write completion notice: Writing to volume TTS.Gunluk.08 complete'. When I run 'savegrp -vvvv TTS' the backup starts and I see the nsrnmo command that launches it, but nothing after that.

If I back up by specifying the filesystem path instead of the RMAN script, nothing goes wrong and the index backup completes successfully. Because of this I think the problem is related to RMAN on the client side.

Some time ago I tried to clone the backup by using the 'set duplex' command in the RMAN script. I started the backup from the client and sent the backup pieces to the clone pool as well as the backup pool. It failed anyway. I don't know whether that is related to my problem. What could be the problem? Thanks.

xvan_egmond · Accepted Answer

Kenan, open a case with EMC and ask them whether the patch available for nsrnmo resolves this 100% CPU usage. If so, ask them to provide you with the patch. Xander

ble1 · Answer

Please describe your setup (including OS and application versions) and include the RMAN script. An example of what you describe, including daemon.log, would be nice.

If you check the process list, do you see whether the index backup was initiated? When RMAN is triggered from the client, the index won't be saved; the index save is triggered only for a server-initiated backup, and only if the pool settings allow it.

kenanerdey · Answer

After I start the group from the command line with the -vvvv option, daemon.log shows:

01/24/07 09:22:08 nsrd: savegroup info: starting foobar (with 1 client(s))
01/24/07 09:22:08 nsrd: savegroup info: foobar running on anatolia
01/24/07 09:22:11 nsrd: savegroup info: foobar running on anatolia
01/24/07 09:26:11 nsrd: anatolia:/path/kenan_backup_sbt.rman saving to pool 'TTS' (TTS.Gunluk.02)
01/24/07 09:26:21 nsrd: anatolia:/path/kenan_backup_sbt.rman done saving to pool 'TTS' (TTS.Gunluk.02) 256 KB
01/24/07 09:26:29 nsrd: anatolia:/path/kenan_backup_sbt.rman saving to pool 'TTS' (TTS.Gunluk.02)
01/24/07 09:26:33 nsrd: anatolia:/path/kenan_backup_sbt.rman done saving to pool 'TTS' (TTS.Gunluk.02) 14 MB
01/24/07 09:27:11 nsrd: savegroup info: foobar running on anatolia
01/24/07 09:27:15 nsrd: write completion notice: Writing to volume TTS.Gunluk.02 complete
01/24/07 09:32:12 nsrd: savegroup info: foobar running on anatolia
01/24/07 09:37:12 nsrd: savegroup info: foobar running on anatolia
01/24/07 09:42:12 nsrd: savegroup info: foobar running on anatolia
01/24/07 09:47:12 nsrd: savegroup info: foobar running on anatolia
01/24/07 09:52:12 nsrd: savegroup info: foobar running on anatolia
01/24/07 09:57:12 nsrd: savegroup info: foobar running on anatolia
01/24/07 10:02:12 nsrd: savegroup info: foobar running on anatolia
01/24/07 10:07:12 nsrd: savegroup info: foobar running on anatolia
01/24/07 10:12:12 nsrd: savegroup info: foobar running on anatolia
01/24/07 10:17:13 nsrd: savegroup info: foobar running on anatolia
01/24/07 10:22:13 nsrd: savegroup info: foobar running on anatolia
01/24/07 10:27:13 nsrd: savegroup info: foobar running on anatolia
01/24/07 10:32:13 nsrd: savegroup info: foobar running on anatolia
01/24/07 10:35:09 nsrd: savegroup alert: foobar aborted, total 1 client(s), 0 Hostname(s) Unresolved, 1 Failed, 0 Succeeded. (anatolia Failed)
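The stall is easier to see when the timestamps are compared directly. The following is a minimal Python sketch, assuming the daemon.log line layout shown above (`MM/DD/YY HH:MM:SS nsrd: message`). The function names `parse` and `stall_after_last_write` are illustrative only and are not part of NetWorker.

```python
from datetime import datetime

# A few lines copied from the daemon.log excerpt above.
LOG = """\
01/24/07 09:26:33 nsrd: anatolia:/path/kenan_backup_sbt.rman done saving to pool 'TTS' (TTS.Gunluk.02) 14 MB
01/24/07 09:27:11 nsrd: savegroup info: foobar running on anatolia
01/24/07 09:27:15 nsrd: write completion notice: Writing to volume TTS.Gunluk.02 complete
01/24/07 09:32:12 nsrd: savegroup info: foobar running on anatolia
01/24/07 10:35:09 nsrd: savegroup alert: foobar aborted, total 1 client(s), 0 Hostname(s) Unresolved, 1 Failed, 0 Succeeded. (anatolia Failed)
"""


def parse(line):
    """Split one log line into (timestamp, message)."""
    # The timestamp occupies the first 17 characters, followed by a space.
    stamp, message = line[:17], line[18:]
    return datetime.strptime(stamp, "%m/%d/%y %H:%M:%S"), message


def stall_after_last_write(log_text):
    """Return the time between the last write completion notice and the
    savegroup alert, or None if either line is missing."""
    last_write = alert = None
    for line in log_text.splitlines():
        if not line.strip():
            continue
        when, message = parse(line)
        if "write completion notice" in message:
            last_write = when
        elif "savegroup alert" in message:
            alert = when
    if last_write is None or alert is None:
        return None
    return alert - last_write


print(stall_after_last_write(LOG))
```

Run against the excerpt, this reports a gap of a little over an hour between the last write to the volume and the abort, during which the group only reports that it is still running.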

oracle version : Oracle9i 9.2.0.6
nmo version : 4.1
networker server: version 7.2.2, windows 2003
client : hp-ux 11.11

This is a sample RMAN script that takes an online backup:

connect catalog xxx@rman;
connect target xxx@sid;
run {
allocate channel t1 type 'SBT_TAPE' parms 'ENV=(NSR_SERVER=ltoserver, NSR_CLIENT=anatolia, NSR_DATA_VOLUME_POOL=TTS)' CONNECT 'xxx@sid';
backup tablespace tools;
release channel t1;
}

After the backup has finished writing but the group has not ended, this is the ps -ef | grep nsr output from the client:

root 4304 4303 0 11:33:34 ? 0:00 nsrnmostart -s ltoserver -g foobar -LL -m anatolia -t 116962358
root 4303 4302 0 11:33:34 ? 0:00 /bin/sh /opt/networker/bin/nsrnmo -s ltoserver -g foobar -LL -m
root 3901 1 0 11:33:10 ? 0:00 /opt/networker/bin/nsrexecd
root 4301 3903 0 11:33:34 ? 0:00 /opt/networker/bin/nsrexecd
root 6701 29941 0 11:48:41 pts/3 0:00 grep nsr
root 4302 4301 0 11:33:34 ? 0:00 /bin/sh /opt/networker/bin/nsrnmo -s ltoserver -g foobar -LL -m
root 3903 3901 0 11:33:10 ? 0:00 /opt/networker/bin/nsrexecd
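To check whether the scheduled backup chain is still alive, the parent/child relationships in that listing can be traced. The following is a minimal Python sketch, assuming the usual `ps -ef` column order (UID PID PPID C STIME TTY TIME CMD). The helper names `parse_ps` and `ancestry` are made up for this illustration.

```python
# ps -ef output copied from the post above (UID PID PPID C STIME TTY TIME CMD).
PS_OUTPUT = """\
root 4304 4303 0 11:33:34 ? 0:00 nsrnmostart -s ltoserver -g foobar -LL -m anatolia -t 116962358
root 4303 4302 0 11:33:34 ? 0:00 /bin/sh /opt/networker/bin/nsrnmo -s ltoserver -g foobar -LL -m
root 3901 1 0 11:33:10 ? 0:00 /opt/networker/bin/nsrexecd
root 4301 3903 0 11:33:34 ? 0:00 /opt/networker/bin/nsrexecd
root 6701 29941 0 11:48:41 pts/3 0:00 grep nsr
root 4302 4301 0 11:33:34 ? 0:00 /bin/sh /opt/networker/bin/nsrnmo -s ltoserver -g foobar -LL -m
root 3903 3901 0 11:33:10 ? 0:00 /opt/networker/bin/nsrexecd
"""


def parse_ps(text):
    """Map each PID to a (PPID, command) tuple."""
    table = {}
    for line in text.splitlines():
        # Seven whitespace-separated columns, then the full command line.
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue
        table[int(fields[1])] = (int(fields[2]), fields[7])
    return table


def ancestry(table, pid):
    """Follow parent PIDs upward for as long as the parent is in the listing."""
    chain = []
    while pid in table:
        chain.append(pid)
        pid = table[pid][0]
    return chain


table = parse_ps(PS_OUTPUT)
# Start from the nsrnmostart process and walk up to the top-level nsrexecd.
start = next(pid for pid, (_, cmd) in table.items() if cmd.startswith("nsrnmostart"))
for pid in ancestry(table, start):
    print(pid, table[pid][1])
```

For the listing above, the walk goes from nsrnmostart through the two nsrnmo shell wrappers up to nsrexecd, which shows that the whole server-initiated chain is still waiting rather than having exited.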

ble1 · Answer

As far as I can see, the NMO backup is still running. Please change the RMAN script to the following and try again:

connect target user/pass@inst
connect rcvcat user/pass@inst

run {
allocate channel t1 type 'SBT_TAPE';
send channel t1 "NSR_ENV=(NSR_SERVER=ltoserver, NSR_CLIENT=anatolia, NSR_DATA_VOLUME_POOL=TTS)";
allocate channel t2 type 'SBT_TAPE';
send channel t2 "NSR_ENV=(NSR_SERVER=ltoserver, NSR_CLIENT=anatolia, NSR_DATA_VOLUME_POOL=TTS)";
sql 'ALTER SYSTEM ARCHIVE LOG CURRENT';
change archivelog all validate;
backup archivelog all skip inaccessible;
release channel t1;
release channel t2;
resync catalog;
}

First try running this from the HP-UX box (rman cmdfile) and, if it works, then try from the server. If that works, we can proceed to the tablespace.

xvan_egmond · Answer

Hi Kenan, there is a known bug in NMO 4.1 for HP-UX systems. If you check on the client, you will probably see a high CPU load after the RMAN backup finishes. If so, open a case with EMC and ask for a patch to resolve this issue. Worked for us. Xander

ble1 · Answer

Hi Xander, do you have the LGTpa for that one? I guess it would be fixed in 4.2, as we have Oracle on HP-UX (both RISC and Itanium) and haven't seen that one yet.

kenanerdey · Answer

Hrvoje, I tried the script you sent, but nothing was different. The processes on the client are still running, but the CPU load is normal. We have another HP-UX box with the same Oracle, OS and NMO versions, and nothing is wrong with it.

ble1 · Answer

Try NMO 4.2. Be aware that a patch could be required for that version if a cluster is running. If not, make sure it is down (cmhaltcl).

kenanerdey · Answer

I upgraded NMO to 4.2 and tried again, and the result is the same: the processes don't terminate. Could it be caused by wrong permissions on a directory or file, so that the processes can't write? I looked in the /nsr directory, but nothing is different from the other HP-UX box.
Or could it be caused by a configuration change in RMAN?

ble1 · Answer

Could you please try the following: go to that client, log in as oracle and run that RMAN script again. I wish to see at which point it breaks and whether there is any error (or wait) at the RMAN level too. This could also be related to Oracle.

kenanerdey · Answer

If I run it from the client, it doesn't break and doesn't give any error. If I run it from the NetWorker server, it works successfully up to the point where the index backup should be taken.

ble1 · Answer

Well, the index backup hasn't started yet and it is unrelated to your issue. Check the ps output once again. What we see is that nsrnmostart is still there, meaning that the index part hasn't started yet. If the client-initiated backup works, then it is either something to do with an environment variable or something else. To see what, you will need to enable debug level for the scheduled backup. You may wish to proceed alone with that, but I would suggest involving support.

kenanerdey · Answer

Hi Hrvoje, when I looked at why nsrnmostart is not ending, I found that its wait status is PIPE in the GlancePlus tool. When I looked at nsrnmostart.log I saw these lines at the end of the file. Perhaps they are related:

nwora_run_RMAN: The envp is:
nwora_run_RMAN: Spawning the RMAN session.
nwora_spawn_RMAN: Creating the communication pipe.
nwora_spawn_RMAN: Making the pipe non-blocking.
nwora_spawn_RMAN: Spawning the RMAN process.
nwora_spawn_RMAN: Spawned the RMAN process 12827.
rman returned success
Leaving Function nwora_nsrnmostart_rman
Post-processing command succeeded.

ble1 · Answer

That part looks fine and is error-free from what I see. That post-processing part: is that your comment or part of the logs? Do you use post-processing at all within NMO? Is there anything in the RMAN log?

sarpydog · Answer

Dear all, I faced the same problem too. The backup server could not write the client's index after the group backup finished. But the backup server wrote the index immediately if I used a wrong RMAN script, and the job failed. The cause was that my competitor had given me a hard question: my catalog DB could not be opened, and I had no idea how to force-unregister the catalog DB. So please check that the catalog DB is running and working fine. Best regards, Dennis

NetWorker
