NetWorker multithreading?

Question

Hi, there are a few NetWorker processes that would benefit from being multithreaded to increase overall performance. Are there any plans to make NetWorker multithreaded in the near future? Thanks, Rickard

ble1 · Answer

Last time I checked no such plans existed yet.

Siobhan1 · Answer

One of the best processes to make multi-threaded, taking a leaf from Celerra, would be the save process. This isn't exactly new technology either; OpenVMS introduced it many years ago.

You have one thread to find the next file to back up, and the other thread to move the data. That would really solve many of the issues around million+ object file systems and the timeouts that occur because the incremental takes so long to find another file.
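To make the idea concrete, here is a minimal sketch of that two-thread split in Python. It is not how save works; the function names, the bounded queue and the "read the file" stand-in for moving data are all assumptions made for illustration.

```python
import os
import queue
import threading

SENTINEL = None  # marks the end of the file list


def find_files(root, since, work):
    """Walker thread: queue every file modified after `since`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.stat(path).st_mtime > since:
                    work.put(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
    work.put(SENTINEL)


def move_data(work, copied):
    """Mover thread: read each queued file (stand-in for sending it to a device)."""
    while True:
        path = work.get()
        if path is SENTINEL:
            break
        try:
            with open(path, "rb") as fh:
                copied[path] = len(fh.read())
        except OSError:
            continue  # file disappeared between the walk and the read


def incremental_backup(root, since):
    """Run the walker and the mover concurrently; return {path: bytes read}."""
    work = queue.Queue(maxsize=1024)  # bounded so the walker cannot run far ahead
    copied = {}
    walker = threading.Thread(target=find_files, args=(root, since, work))
    mover = threading.Thread(target=move_data, args=(work, copied))
    walker.start()
    mover.start()
    walker.join()
    mover.join()
    return copied
```

The point of the split is that the mover never sits idle while the walker is still checking unchanged files, as long as the queue has something in it.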

Does bugger all for restore though :-(

R_Friberg · Answer

That's a very good idea. There could be plenty of time saved while doing PAX backups of Celerra.

In this case I was looking more at the server processes like nsrd, nsrmmdbd and nsrjobd. When I asked a year ago there were no such plans, only improvements to the existing processes. I guess nothing has changed since then.

Vlado1_ca98b7 · Answer

Starting with NW73 we have moved in the direction of multithreaded development, and some NetWorker processes like nsrjobd and nsrexecd are multithreaded while others are not.
Also, some processes like nsrindexd or savegrp use a multiprocess approach to handle parallel requests.

There is no clear advantage in making a process multithreaded if threads have to wait for each other all the time.
That is the reason why MMDB is not multithreaded: each update must be atomic, so having multiple threads wait for a lock would result in a slowdown, not a performance gain.
Making save multithreaded would not create a gain either, as testing on current OS platforms shows that handling frequent thread locks costs more than any possible gain. Platforms like OpenVMS are an exception as they handle filesystem calls very differently.
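A rough way to put numbers on this is Amdahl's law, which is not mentioned in the post but captures the same argument: if only a fraction of the work can run in parallel, extra threads help very little. The sketch below is illustrative only; the fractions are made up, and the formula ignores locking overhead, so real results would be worse than this upper bound.

```python
def amdahl_speedup(parallel_fraction, threads):
    """Upper bound on speedup when only `parallel_fraction` of the work
    can be spread over `threads`; the rest stays serial (e.g. held under a lock)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


# Hypothetical workloads: mostly serialized vs. mostly parallel.
for fraction in (0.1, 0.5, 0.9):
    print(fraction, round(amdahl_speedup(fraction, 8), 2))
```

With a process that is almost entirely serialized on atomic updates, eight threads barely move the result above 1x, which is the situation described for MMDB.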

Making NW behave better in high-concurrency situations is the current goal and a lot of work is being done in that area, but just making a process multithreaded does not make it faster.

Regarding Celerra - the number of PAX threads is defined on the Celerra side and by default it's limited to 4, which also defines the maximum number of parallel backup sessions per datamover before performance starts to drop. This is configurable by the user.
Do note that there is no save process (or any NW process for that matter) running inside Celerra - it's up to the filer to prepare the data and send it to NW.

R_Friberg · Answer

Thanks for a good answer. Will NW be optimized for the Sun T2 processor architecture, or will it be more suited to fewer threads/cores in the upcoming releases?

When will the new high-concurrency enhancements be available for end customers? Are they mainly going into 7.6.1 or will it be a later release?

Thanks
Rickard

Vlado1_ca98b7 · Answer

For very high-density filesystems, there is some gain from using multiple threads to walk the filesystem, but it's not consistent.

I agree that high-density filesystems are a big problem for customers and we are looking at how to solve that problem most efficiently in the near future. Sorry I cannot go into details on what's planned, but the good thing is that work is already ongoing.

Re: T2 - Yes, when I said T2 is the best Solaris platform I was referring to the T5xxx series. The original T1000 and the bigger T2000 definitely had problems on the backplane side, which limited the actual work possible.

Siobhan1 · Answer

So are you saying that having a thread find the next file to back up whilst another thread is moving data is not going to improve performance?

What about high density file systems where you spend lots more time finding a file than you do moving data? Surely multi-threading would help there? That must be better than find-a-file, does it need to be backed up, find the next file, does it need backup, find a file does it need backup, find a file does it need backup, find a file does it need backup, Yes - back it up!

As for the T2000, we found it had a couple of issues.

1) The individual cores themselves couldn't shift much data so large file systems became slow

2) The back plane isn't so hot, so when you move a large amount of data through the network, the TCP/IP stack consumed loads of CPU. (The T5220 and T5240 are much better)

Vlado1_ca98b7 · Answer

NW server behaves very well on T2 architecture - when it comes to Solaris, it is the best platform available.

Regarding optimizations - a big part of the optimizations will be in NW 7.6 SP1, but we don't plan to stop there.

R_Friberg · Answer

Alright, that sounds good. We're using a couple of T5220s (8 cores and 8 threads per core) as backup servers, and what we see is a problem with the heavier NetWorker processes like nsrd, nsrmmdbd and nsrjobd, as each of them can only get up to 1/64th of the whole processor. That means that 1.56% is the maximum a process can get. During backups we are very close to that, although the server as a whole is not that heavily loaded. Also, when doing clones (large jobs with 500 save sets or more) the nsrmmd processes also reach 1.5%, so there is a limiting factor on clone performance too. (Small clone jobs with few save sets do not have this issue though.) Therefore I believe that the T2 architecture might not be the best for NetWorker and maybe M-series servers would be a better fit.
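The 1.56% figure follows directly from the thread count. A small sketch of the arithmetic, assuming the monitoring tool reports CPU usage as a share of all hardware threads and that the process is bound by one thread (the function name is just for illustration):

```python
def max_share_of_one_thread(cores, threads_per_core):
    """Largest share of total CPU (in percent) that a process limited to
    a single hardware thread can show."""
    hardware_threads = cores * threads_per_core
    return 100.0 / hardware_threads


# T5220 as described above: 8 cores x 8 threads per core = 64 hardware threads.
share = max_share_of_one_thread(8, 8)
print(f"{share:.2f}%")  # 1.56%
```

So a process sitting at roughly 1.5% on such a machine may already be saturating the one hardware thread it runs on, even though the system as a whole looks idle.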

Vlado1_ca98b7 · Answer

What you describe surprises me a little bit, since with NetWorker on T5xxx hardware we commonly see quite uniform CPU usage.

That is especially true of daemons which are already heavily multithreaded, like nsrjobd (although the thread architecture changed in NW752 compared to earlier releases), which scale directly to an entire CPU core and not just to a thread within it.

As for nsrmmd, it is normally I/O bound; its CPU usage is due to interrupt processing and doesn't relate to the number of savesets. For interrupt-bound processes there is some tuning possible by binding interrupts to specific CPU cores, but that's going a bit too far for normal usage.

Also, the Niagara-2 CPU architecture improves even typical single-threaded performance by performing more efficient instruction pre-fetching compared to standard UltraSPARC.

I would suggest looking at your Solaris settings more closely to make sure that nothing is blocking the application from taking full advantage of the system.

Also, for proper Niagara-2 optimizations you need at least Solaris 10 Update 5.

koltean · Answer

Heavily multithreaded like nsrjobd? nsrjobd is often at 100% CPU. I don't see multiple processes as there are for nsrmmd.

NetWorker 7.5.3 on an M3000 with T5220 SN running 1000 clients. nsrjobd and nsrmmdbd are causing bottlenecks.

Jason20 · Answer

nsrjobd is multithreaded (see ps -efL etc. on Solaris), but I have seen a single thread running at 100% continuously on one CPU and bottlenecking everything else.

Vlado1_ca98b7 · Answer

That would be the purging thread.

If you have issues where a single thread is taking all the CPU time, I would suggest opening a support case and requesting a newer NW version; the purging thread has recently been redesigned in both NW7533 and NW7611.

ble1 · Answer

Hi Vlado, I'm using 7.5.3.3 and it still breaks down with 100% during purge. We checked the jobd fix which is in 7.5.3.4, but this does not help. Another extremely annoying issue with nsrjobd is that NMC drives it to 100% when it is open and used to monitor sessions (and this has been an issue since nsrjobd was implemented). This issue is on the engineering table now.
