6 Operator • 14.4K Posts • 56.2K Points

April 18th, 2015 13:00

nsrpush and 8.2.1.2 @ Linux

After updating the server from 8.2.1.1 to 8.2.1.2, nsrpush seems to be broken (argh!).

Any operation with it just hangs.  For example:

# nsrpush -s -t

91512:nsrpush: Unable to authenticate with server ' ' during RAP bind operation: Timed out

91512:nsrpush: Unable to authenticate with server ' ' during RAP bind operation: Timed out

Trying to list the products I have in the inventory gives the same result, so this is pretty much a server-side issue.

I already tried restarting the server after cleaning up /nsr/tmp and /nsr/res/jobsdb, but the result was the same.  The next step was to remove cpdb, since nsrpush won't run if that is corrupted.  So: stop NW, remove NSR's tmp, jobsdb and cpdb, and start again.

Now the nsrpush inventory worked, but many hosts reported errors while copying scripts, and the server itself started to have problems, e.g.:

savegrp RPC severe 2 %s 1 49 80 80720 9 %s%s - %s 3 24 0  11 23 352:Remote system error 11 18 Connection refused

savegrp RAP critical 94 Unable to return a list of resources from a database: the resource database for query is null. 0

savegrp RAP warning 39 Cannot find the %s resource named '%s'. 2 11 15 33243:NSR group 26 20

savegrp RPC severe 2 %s 1 49 80 80720 9 %s%s - %s 3 24 0  11 23 352:Remote system error 11 18 Connection refused

savegrp RAP critical 94 Unable to return a list of resources from a database: the resource database for query is null. 0

savegrp RAP warning 39 Cannot find the %s resource named '%s'. 2 11 15 33243:NSR group 26 20

nsrd NSR notice 58 Scheduled resource list: %s %s exited with return code %d. 3 0 9 NSR group 0 20 1 1 4

savegrp RPC severe 2 %s 1 49 80 80720 9 %s%s - %s 3 24 0  11 23 352:Remote system error 11 18 Connection refused

savegrp RAP critical 94 Unable to return a list of resources from a database: the resource database for query is null. 0

savegrp RAP warning 39 Cannot find the %s resource named '%s'. 2 11 15 33243:NSR group 26 20

nsrd NSR notice 58 Scheduled resource list: %s %s exited with return code %d. 3 0 9 NSR group 0 20 1 1 4

nsrd NSR notice 58 Scheduled resource list: %s %s exited with return code %d. 3 0 9 NSR group 0 20 1 1 4

I decided to restart the server again since nsrwatch would not come back.  At this point it was clear that a running nsrpush was able to bring down the server, so perhaps I should do the inventory more carefully.  Since I had the full nsrpush inventory output, I decided to extract the names of the hosts where it failed and run nsrpush individually for each, with sleep 30 in between.

Meanwhile I decided to check the core area and I found:
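The one-host-at-a-time run described above could be sketched roughly like this. It is only a sketch: inventory_hosts, PUSH_CMD and DELAY are made-up names for illustration, the host list file (one client name per line) is assumed to have been extracted by hand, and "nsrpush -i <client>" is assumed to be the inventory call.

```shell
#!/bin/sh
# Hypothetical helper, not part of NetWorker: run one inventory per host
# with a pause in between, and report which hosts still fail.
# PUSH_CMD and DELAY are illustrative variables, not NetWorker settings.
PUSH_CMD="${PUSH_CMD:-nsrpush -i}"
DELAY="${DELAY:-30}"

inventory_hosts() {
    # $1 = file with one client name per line
    failed=""
    while IFS= read -r host; do
        # Skip empty lines in the host list.
        [ -n "$host" ] || continue
        # </dev/null keeps the command from eating the rest of the list.
        if ! $PUSH_CMD "$host" </dev/null; then
            failed="$failed $host"
        fi
        sleep "$DELAY"
    done < "$1"
    echo "still failing:$failed"
}
```

Calling inventory_hosts with the path to the host list runs through the hosts in order; the final line lists whatever needs another attempt.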

total 44740

-rw------- 1 root root 12718080 Apr 18 17:37 core.15294

-rw------- 1 root root 12713984 Apr 18 14:24 core.24283

-rw------- 1 root root 12722176 Apr 18 15:45 core.26429

-rw------- 1 root root 12722176 Apr 18 17:19 core.4171

These are all from before, while I was trying to do an inventory or list the repository with the cpdb from 8.2.1.2.  So it seems that this thing core dumped each time.  I can't say exactly when it happened, though: when I started nsrpush, when I did nsr_shutdown, or when I killed the process with kill.  I just forgot to check.

Anyway, out of 175 clients 21 failed, and I ran those with a 30-second delay in between.  In this run 4 failed, of which I fixed 3 by running again.  One failed again and again: the backup server itself.  The cpdb log shows:
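One way to narrow down which step produces a core file is to touch a marker file before each step and then look for cores newer than the marker. A minimal sketch, assuming the core files are named core.<pid> as in the listing above; new_cores is a made-up helper name and the directory argument is whatever your core area is.

```shell
#!/bin/sh
# Hypothetical helper: list core files written after a marker file was touched.
new_cores() {
    # $1 = directory holding the core files
    # $2 = marker file touched just before the step being tested
    find "$1" -type f -name 'core.*' -newer "$2"
}
```

For example: touch a marker, run nsrpush -s -t, call new_cores on the core directory with that marker, then touch the marker again before nsr_shutdown and repeat. Empty output means no new core was written by that step.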

04/18/2015 08:17:48 PM  srcpd SYSTEM critical Unable to open resource database file /nsr/res/cpdb/sd_products.res: No such file or directory

04/18/2015 08:17:48 PM  nsrcpd NSR severe Product resource file "/nsr/res/cpdb/sd_products.res" cannot be opened.

04/18/2015 08:17:48 PM  nsrcpd RAP critical Unable to return a list of resources from a database: the resource database for query is null.

04/18/2015 08:17:48 PM  nsrcpd NSR warning Using hard-coded product list.

04/18/2015 08:17:50 PM  nsrcpd NSR notice Starting Inventory Operation on selected clients

04/18/2015 08:18:00 PM  nsrcpd NSR notice Creating client type job for client .

04/18/2015 08:18:00 PM  nsrcpd NSR severe Copying inventory scripts to client .

04/18/2015 08:18:00 PM  nsrcpd RPC severe Unable to start command "nsrrcopy -t /tmp/cp_ /" on host ' ': RPC send operation failed; peer = 127.0.0.1:7937, errno = Broken pipe

04/18/2015 08:18:00 PM  nsrcpd RPC severe RPC send operation failed; peer = 127.0.0.1:7937, errno = Broken pipe

04/18/2015 08:18:00 PM  nsrcpd RPC severe RPC send operation failed; peer = 127.0.0.1:7937, errno = Broken pipe

04/18/2015 08:18:00 PM  nsrcpd NSR severe Error copying scripts to client .

04/18/2015 08:18:05 PM  nsrcpd NSR warning The operation failed for the following clients...

04/18/2015 08:18:05 PM  nsrcpd NSR warning

I see two things above.  One is that sd_products.res is missing.  This file is supposed to be inside the cpdb folder, which I removed previously.  It should have been recreated once nsrpush ran again, so I have no idea why it is not there.  However, looking at the rest of the logs, I can see this is also reported for all clients where nsrpush was successful, so that is not it.  Which brings us to the error itself, where we see nsrrcopy failing.

I have no idea what /tmp/cp_ / is, but I assume it is some temp scripts to run the inventory or something.  For some reason that fails, and without having those it is hard to reproduce this in more detail.  One may argue that this is not needed for the backup server, since you won't use nsrpush to update the backup server anyway.  Fair enough, but it worked so far, plus I can always use nsrpush to inventory clients and keep my inventory up to date.

And then, out of the blue, it worked when I ran it once again during a commercial break on TV (I was watching while writing this).  If you use nsrpush, you have probably noticed that you have to run the inventory twice to get modules listed as well (the first run seems to pick up only the client's version; this is still the case).

My final test, and the reason I got myself into this in the first place, was to update all the Linux clients I have.  This seems to work fine, as before.

So, if you find yourself in this situation, you might recognize (and use) the above.
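To pull only the severe and critical entries out of a log in the format quoted above, something like the following may help. It is a sketch: the function names are made up, and the log file path is left as an argument because the location of the cpdb log is not given here.

```shell
#!/bin/sh
# Hypothetical helpers for a log whose lines look like:
#   04/18/2015 08:18:00 PM  nsrcpd RPC severe <message>

severe_entries() {
    # $1 = path to the log file; keep only severe and critical lines
    grep -E ' (severe|critical) ' "$1"
}

summarize_severe() {
    # Strip the leading timestamp so identical messages group together,
    # then count them, most frequent first.
    severe_entries "$1" \
        | sed -E 's|^[0-9/]+ [0-9:]+ [AP]M +||' \
        | sort | uniq -c | sort -rn
}
```

summarize_severe makes it easier to see which errors show up for every client and which ones are specific to the failing one.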

34 Posts

August 18th, 2015 03:00

Hi Hrvoje,

I had a similar problem.

NetWorker Server 8.2.1.6.

During an upgrade of a client, the client rebooted. I couldn't tell whether the reboot was due to the upgrade or not. The upgrade process timed out.

Since then, issuing nsrpush -i XXX failed with this error:

91512:nsrpush: Unable to authenticate with server ' ' during RAP bind operation: Timed out

91512:nsrpush: Unable to authenticate with server ' ' during RAP bind operation: Timed out

Following your write-up above, these commands helped:

nsr_shutdown

cd /nsr/tmp

rm -R *

cd /nsr/res/cpdb

rm -R *

cd /nsr/repository

rm -R *

reboot

Did not have to remove jobsdb.
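Since "cd" followed by "rm -R *" deletes from whatever directory you happen to be in if the cd fails, a slightly more defensive variant of the cleanup above could look like this. It is a sketch; clear_dir is a made-up helper name, and like the original commands it leaves dotfiles alone.

```shell
#!/bin/sh
# Hypothetical helper: empty a directory but keep the directory itself,
# and refuse to do anything if the directory does not exist.
clear_dir() {
    dir="${1:?usage: clear_dir DIRECTORY}"
    if [ ! -d "$dir" ]; then
        echo "skip: $dir not found" >&2
        return 1
    fi
    # Same effect as "cd $dir; rm -R *", without depending on the cd.
    rm -rf -- "$dir"/*
}
```

After nsr_shutdown this would be called once each for /nsr/tmp, /nsr/res/cpdb and /nsr/repository, followed by the reboot.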

Thanks for your good documentation!

Regards,

Jens

2.4K Posts

August 19th, 2015 14:00

Hi Hrvoje,

your observations don't surprise me at all - IMHO nsrpush does not work reliably at all. This is not only due to the product - it also has a lot to do with all the other environment-specific parameters for your clients, like

  - internal firewalls (especially on UNIX/Linux)

  - insufficient rights (especially to copy the install files to the Windows clients)

and a bunch more that I do not remember right now. In general there are plenty of potential problems.

So I have a 'golden' rule: Try a second attempt. But if it still fails, simply log on locally and run the install that way.

6 Operator • 14.4K Posts • 56.2K Points

August 31st, 2015 13:00

Hi bingo,

In my case, as noted in the first line, the issue is with cpdb on the server itself - no client communication is involved.

1 Rookie • 116 Posts

August 31st, 2015 22:00

Hi,

I also have many problems with nsrpush, and I have not yet been able to find out what combination of circumstances breaks it.

After server upgrades nsrpush very likely gets stuck: it is not able to list the products in the repository, or it is not able to run an inventory on clients. It looks like nsrcpd is waiting for something indefinitely.

I used to remove the repo, restart the server (if possible), kill the stuck jobs (jobkill), and then I could add packages again - if I was lucky enough. Then it might break again after some push installs or inventories.

It's totally unreliable. Next time I'll try the cpdb removal too.

Best regards,

Istvan

5 Posts

October 25th, 2015 10:00

Hi,

The process suggested by Jens worked perfectly for me as well.

Something is wrong with the nsrcpd process. I used it in two environments, where the servers were Linux and Windows.

The same issue in both.

I hope EMC will take care of it. It is unacceptable to have to recreate the repository again and again.

Mati

6 Operator • 14.4K Posts • 56.2K Points

October 26th, 2015 02:00

This has changed a bit in NW9, but I haven't tested it yet - I suspect the focus of engineering will be more on the NW9 side due to the changes made.

1 Message

July 7th, 2017 12:00

Having the same issue with nsrpush:

     nsrpush: Unable to authenticate with server ' ' during RAP bind operation: Timed out

What worked for me was:

1. Kill the nsrcpd process (/usr/sbin/nsrcpd).

2. Rename /nsr/res/cpdb or delete it.

After that, you only need to start over building your software repository and inventory of clients.

Note: You do not need to restart /usr/sbin/nsrcpd or recycle the NetWorker server.
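For the rename option in step 2, a minimal sketch that moves the directory aside under a timestamped name, so the old cpdb can still be inspected or restored later. set_aside and the naming scheme are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: rename a directory to <name>.<timestamp>.bak
# and print the new name.
set_aside() {
    dir="${1:?usage: set_aside DIRECTORY}"
    if [ ! -d "$dir" ]; then
        echo "nothing to do: $dir not found" >&2
        return 1
    fi
    backup="$dir.$(date +%Y%m%d%H%M%S).bak"
    mv -- "$dir" "$backup" && echo "$backup"
}
```

With nsrcpd killed first as in step 1, this would be called as set_aside /nsr/res/cpdb.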


Scott

2.4K Posts

July 10th, 2017 05:00

btw ... if you run a NW/Windows server:

  Since 8.2.4.5 you can directly load the unpacked Linux packages from a local Windows drive.

  There is no longer a need to 'import' them via a 'pseudo backup' from a 'proxy' client of the same OS family.

  I guess it will also be possible in the other direction but I have not verified it yet.

Very well done.

2 Intern • 203 Posts

July 12th, 2017 15:00

Except that it does not appear to work, even though it would be a huge improvement not to require any cross-platform proxy anymore... I haven't tested it myself on multiple systems, but the first one I tried fails to add any Windows win_x64 packages to its repository on a Red Hat Linux backup server running NW 8.2.4.5:

# nsrpush -a -W -p NetWorker -v 8.2.4.5 -P win_x64 -m /nsr/nwpackages/nw8245/win_x64/

Failed to add product from location /nsr/nwpackages/nw8245/win_x64: Error spawning the uasm process. 4901:uasm: Unable to get size of extended attribute value : No such file or directory

4896:uasm: Warning: Problem getting value of extended attribute

Add to repository status: retryable

# nsrpush -a -W -p NetWorker -v 8.2.4.5 -P win_x64 -m /nsr/nwpackages/nw8245/win_x64/

Failed to add product from location /nsr/nwpackages/nw8245/win_x64: Error spawning the uasm process. 4901:uasm: Unable to get size of extended attribute value : No such file or directory

4896:uasm: Warning: Problem getting value of extended attribute

Add to repository status: failed

2.4K Posts

July 13th, 2017 12:00

I just tried it using the wizard, and it failed.

Obviously due to issues with the extended attributes.

2 Intern • 203 Posts

July 13th, 2017 14:00

I wonder how much this has to do with the method of unpacking. I unzipped the nw8245_win_x64.zip file on Red Hat Linux 6.8; possibly that unzip does not pick up the extended attributes. I tried unpacking on Windows itself first and then repacking it with 7zip as a .tar file, to be able to unpack it again on Linux with "tar -xvpf". However, I don't know yet whether I have to pack it on Windows with 7zip in a specific way to also save any extended attributes. So, more or less, I'm reintroducing the Windows cross-platform client (or proxy, as they call it in the new nsrpush man page) just to try to fulfill requirements that I have not (yet) found described anywhere. So just a bit of trial and error.
