2.4K Posts

August 12th, 2013 11:00

Comparing the headline with your issues ...

  - Long before the index disk becomes full, you should see a message like 'file system for client ... is getting full'.

    It appears as soon as your largest CFI can no longer be copied to the same disk, which usually happens long before

    the disk itself is actually full. Just do not ignore the message.

  - Yes, there is now one "nsrindexd ADD" sub-process for each active backup stream.

    Each of them, of course, consumes RAM of its own.

  - Increasing parallelism does not necessarily improve performance. It also depends on a lot of other factors which you

    have to observe and adjust as well. At the very least, more streams need more RAM.

  - The retention time just defines how long the info shall remain in the db - I usually set it to 2 weeks (1 week is the usual

    backup cycle). The reason is that I can then easily compare the last and the previous backup (very easy with NW 8).

  - More frequent purges of a db's obsolete data are better, as each purge task finishes earlier.

    In NW 8, the purge of the jobs db runs once every hour.
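The first point above can be turned into a small check. This is a hypothetical helper, not a NetWorker tool; the layout assumed here (one index directory per client under /nsr/index) is common but should be verified on your own system.

```shell
# Hypothetical sketch: warn when the largest client file index (CFI)
# no longer fits into the free space of the index file system - which
# is roughly the point where NetWorker starts to report that the
# file system for a client is getting full.
check_index_space() {
    index_dir=$1    # e.g. /nsr/index (assumed layout: one dir per client)

    # size in KB of the largest per-client index directory
    largest=$(du -sk "$index_dir"/* 2>/dev/null | sort -n | tail -1 | awk '{print $1}')

    # free KB on the file system holding the indexes
    free=$(df -Pk "$index_dir" | awk 'NR==2 {print $4}')

    if [ "${largest:-0}" -gt "${free:-0}" ]; then
        echo "WARNING: file system for client indexes is getting full"
    else
        echo "OK: largest CFI (${largest:-0} KB) still fits in ${free:-0} KB free"
    fi
}

# Example (path is an assumption, adjust to your layout):
# check_index_space /nsr/index
```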

Meanwhile we are running NW 8.0.1, but as far as I remember, we had no issues with NW 7.6.4.

Before you do anything else, monitor your RAM on server and storage nodes and ensure you have plenty.
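A minimal way to watch what those per-stream sub-processes cost in RAM - assuming they show up as nsrindexd in the process table, which you should confirm on your host:

```shell
# Rough sketch: sum the resident memory (RSS) of all processes with a
# given name, so you can see how much RAM the per-stream nsrindexd
# sub-processes consume in total while a group is running.
sum_rss_kb() {
    # $1 = exact process name, e.g. nsrindexd
    ps -eo rss=,comm= | awk -v name="$1" '$2 == name { total += $1 } END { print total + 0 }'
}

# Example:
# sum_rss_kb nsrindexd   # total resident KB of all nsrindexd processes
```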

38 Posts

August 12th, 2013 18:00

Thanks Bingo for the detailed reply...

However, frequent purging of the jobs db generates a high number of IOPS, doesn't it?

I think there is a tech note in the EMC community which mentions that in NW 8 the jobsdb-purge-related IOPS have decreased by about 79% compared to previous versions - that explains how you can afford to run that purge every hour.

Also, I would like to highlight that the said nsrindexd ADD processes were topping out.

Therefore, I increased server parallelism, but ironically more and more nsrindexd sessions were being established with no corresponding save streams to account for them. Any clue on this?

Besides, there seems to be one group of about 15 clients that stalls the whole system. I have never observed it first hand, but according to the discussions I had, it stalls all the other backups and we need to stop that group so that the other groups can run...

Interestingly, this is the first group that kicks off our backup window...

2.4K Posts

August 12th, 2013 21:00

To give you some numbers: our system creates about 4.8k save sets every day, so each hourly purge deletes the data of about 200 save jobs - with no impact on our backups. (Again, this is NW 8.0.1.)
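The arithmetic behind those numbers: with hourly purges, each run only has to clear one hour's worth of job records.

```shell
# 4.8k save sets per day, purged every hour
save_sets_per_day=4800
purges_per_day=24
echo $((save_sets_per_day / purges_per_day))   # prints 200
```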

"Also, I would like to highlight that the said nsrindexd ADD processes were topping out.

Therefore, I increased server parallelism, but ironically more and more nsrindexd sessions were being established with no corresponding save streams to account for them. Any clue on this?"

What do you expect? Allowing more streams will usually result in more processes. However, I cannot tell why you might have 'orphans'. Once the system is idle, they should all be gone.

With respect to the group: there is the "savegroup parallelism" attribute, which allows you to control how many streams this group can open. This ensures that a group which starts later still finds some streams left.

If this does not help, move clients (one by one) to another group to find out which one is responsible.
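For reference, group attributes like this can typically be inspected and changed with nsradmin. The exact attribute name ("parallelism" on the NSR group resource), the group name "FirstGroup", and the value below are assumptions for illustration - check them against your NetWorker release before changing anything:

```
# nsradmin -s backup_server
nsradmin> . type: NSR group; name: FirstGroup
nsradmin> show parallelism
nsradmin> update parallelism: 8
```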

4 Operator • 14.4K Posts

August 26th, 2013 12:00

I have 40k ssids created per day (so at least that many records in the jobs db, though I suspect more). I would say most of our performance issues were addressed by adding faster disk for /nsr and keeping the jobsdb retention at 8 hours (this is NW 7.6.5.x, but the same applies to earlier versions). The impact of fast disk is simply incredible (I started to use a fast tier from VMAX, while before I used regular FC on DMX).
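A quick-and-dirty way to feel that disk difference yourself - an illustration, not a proper benchmark: small synchronous writes are close to what the index and jobs databases do. This assumes GNU dd (the oflag=dsync flag); the /nsr path in the example is likewise an assumption.

```shell
# Write 256 x 4 KB blocks synchronously into a directory and let dd
# report the achieved rate; a slow /nsr disk shows up immediately here.
sync_write_test() {
    dir=$1
    f="$dir/.latency_probe.$$"
    dd if=/dev/zero of="$f" bs=4k count=256 oflag=dsync
    rm -f "$f"
}

# Example (path is an assumption):
# sync_write_test /nsr
```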
