NetWorker: busy Linux NetWorker server reports the message "nsrd RPC critical Unable to accept client connection: Too many open files"
Summary: A busy Linux NetWorker server reports the message "nsrd RPC critical Unable to accept client connection: Too many open files"
Symptoms
The NetWorker server becomes unresponsive. Symptoms include:
- The NMC console hangs at a progress bar
- The nsradmin command does not return
- The nsrwatch command does not return
- The clients lose connections
The daemons on a NetWorker for Linux server start with an open file limit that is too low. This happens because the daemons do not inherit per-process limits at startup, so the limit falls back to 1024, which may be insufficient in larger data zones.
Cause
The open files limit on the NetWorker server is too low.
- Get the PID of the nsrd process:
$ ps aux | grep nsrd | grep -v grep | grep -v disp | awk '{ print $2; }'
4021
- The PID is part of the path /proc/<nsrd_PID>/limits, which can be reviewed with the cat command:
$ cat /proc/4021/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            10485760             unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             63833                63833                processes
Max open files            1024                 1024                 files
Max locked memory         32768                32768                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       63833                63833                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
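Only the "Max open files" row matters for this issue. As a minimal sketch, the soft and hard values can be pulled out of the limits file with awk. The function name open_file_limits is hypothetical and is not a NetWorker command; the example reads the limits of the current shell, so substitute the nsrd PID for "self" on a real server.

```shell
#!/bin/sh
# open_file_limits is a hypothetical helper name used only for illustration.
# Print the soft and hard "Max open files" values from a limits file.
open_file_limits() {
    # $1: path to a limits file, for example /proc/4021/limits
    # Fields 1-3 are the words "Max open files"; 4 and 5 are soft and hard.
    awk '/^Max open files/ { print $4, $5 }' "$1"
}

# Demonstrate on the current shell; replace "self" with the nsrd PID.
open_file_limits /proc/self/limits
```

For the sample output shown above, the function would print the soft limit followed by the hard limit on one line.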
Resolution
Create a separate startup script for the NetWorker servers with heavy loads by enabling the following environment variable before the NetWorker services start:
To set the limits at the operating system level:
Open file descriptors: Change the open file descriptors parameter to a minimum of:
- 8192 (small NetWorker environment)
- 16384 (medium NetWorker environment)
- 32768 (large NetWorker environment)
The definitions of a small, medium, or large NetWorker server can be found in the NetWorker Performance Optimization and Planning Guide.
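The size-to-limit mapping above can be expressed as a small lookup, sketched here under the assumption that the environment size is already known from the planning guide. The function name nofile_minimum is hypothetical; the comparison uses the soft limit of the current shell, not that of the running nsrd process.

```shell
#!/bin/sh
# nofile_minimum is a hypothetical helper name used only for illustration.
# Map a NetWorker environment size to the minimum open file limit listed above.
nofile_minimum() {
    case "$1" in
        small)  echo 8192 ;;
        medium) echo 16384 ;;
        large)  echo 32768 ;;
        *)      echo "unknown size: $1" >&2; return 1 ;;
    esac
}

# Compare the current shell's soft limit with the minimum for a medium environment.
required=$(nofile_minimum medium)
current=$(ulimit -n)
if [ "$current" = "unlimited" ] || [ "$current" -ge "$required" ]; then
    echo "open file limit $current meets the minimum of $required"
else
    echo "open file limit $current is below the minimum of $required"
fi
```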
Max Open Files
On a Linux NetWorker server, add ulimit -n 8192 in the .bash_profile file and restart the current session.
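A limit set with ulimit applies to the shell that runs it and to the processes that shell starts afterwards. The following is a minimal sketch of how a startup wrapper might raise the soft limit before launching anything else. The function name raise_nofile is hypothetical, and the sketch deliberately does not start any NetWorker service.

```shell
#!/bin/sh
# raise_nofile is a hypothetical helper name, not part of NetWorker.
# Raise the soft open file limit of the current shell to at least $1.
raise_nofile() {
    target=$1
    soft=$(ulimit -Sn)
    hard=$(ulimit -Hn)
    # Nothing to do when the soft limit is already high enough.
    if [ "$soft" = "unlimited" ] || [ "$soft" -ge "$target" ]; then
        return 0
    fi
    # The soft limit cannot be set above the hard limit.
    if [ "$hard" != "unlimited" ] && [ "$hard" -lt "$target" ]; then
        echo "hard limit $hard is below $target" >&2
        return 1
    fi
    ulimit -Sn "$target"
}

# Processes started by this shell after this point inherit the new limit.
if raise_nofile 8192; then
    echo "soft open file limit: $(ulimit -Sn)"
fi
```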
To set the minimum (soft) and maximum (hard) file descriptor limits per process: Red Hat 7, SLES 12, SLES 15
prlimit --pid <pid_of_the_process> --nofile=<min_limit>:<max_limit>
Example:
prlimit --pid 12345 --nofile=1024:4096
To set minimum and maximum file descriptors per process: Red Hat 6
echo -n "Max open files=min_limit:max_limit" > /proc/pid_of_the_process/limits
Example:
echo -n "Max open files=4096:16384" > /proc/1208/limits
TCP Parameters
Add the following TCP parameters when the NetWorker server runs with a heavy load (concurrent runs with many socket requests being made on the server application ports):
- On a Linux NetWorker server, add the following TCP parameters in the /etc/sysctl.conf file and run the sysctl --system command:
net.ipv4.tcp_fin_timeout = 30
net.ipv4.ip_local_port_range = 15000 65535
net.core.somaxconn = 1024
- On a Linux NMC server, update the file-max value to 65536 to ensure Postgres database connectivity when the NetWorker server runs with heavy loads:
echo 65536 > /proc/sys/fs/file-max
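To confirm which values the running kernel is using, the settings above can be read back from /proc/sys, where each sysctl key maps to a path with the dots replaced by slashes. This is a minimal sketch; the function name sysctl_path is hypothetical, and keys that are not exposed on a given system (for example inside some containers) are reported instead of read.

```shell
#!/bin/sh
# sysctl_path is a hypothetical helper name used only for illustration.
# Convert a sysctl key such as net.core.somaxconn to its /proc/sys path.
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Print the running value of each parameter named above, when readable.
for key in net.ipv4.tcp_fin_timeout net.ipv4.ip_local_port_range \
           net.core.somaxconn fs.file-max; do
    path=$(sysctl_path "$key")
    if [ -r "$path" ]; then
        echo "$key = $(cat "$path")"
    else
        echo "$key is not readable on this system"
    fi
done
```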