NetWorker: Best Practices for Network Configuration
Summary: This article provides a simple baseline of ideal and standard network tunables for NetWorker hosts.
Symptoms
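- Errors related to network or host connectivity, including but not limited to:
- Backup failures in which the actual data transfer appears to have completed
- General resource exhaustion or communication breakdown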
GSS warning Session information (number hex:hex) registered by user for nsrexecd has expired because a NetWorker daemon had not requested it after 120 minutes
GSS error Session information (number hex:hex) was requested by nsrmmd but the session has expired
RPC severe Unable to query NSR database for list of configured devices: RPC receive operation failed; peer = ip_addr:port, errno = Connection timed out
RPC severe Unable to query NSR database for list of configured devices: RPC send operation failed; peer = ip_addr:port, errno = Broken pipe
NSR notice Chunking ssid ssid failed, because saveset was aborted
ddp_open_file_ext() failed for File: //mtree/vol_dir/nn/nn/long_ssid, Err: 5004-nfs lookup failed (nfs: No such file or directory) ).
NSR critical Connectivity check request is failed for: SN_CONN_REPORT_DD type data_domain device
RPC error RPC client handle: No route to host.
RPC error RPC client handle: Connection refused.
RPC error Unable to create the connection with 'portmapper' to host 'hostname' with address 'ip_addr' at port number 7938.
RPC critical Aborting client connection from ip_addr: Connection timed out.
RPC critical Check whether the firewall is blocking the client ports on the host 'hostname'.
RPC critical Check whether the client services are running on the host 'hostname'.
Cause
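NetWorker is an application that creates many sockets on local and remote hosts during normal operation. Servers and storage nodes usually create the most, but client configuration can also affect job success.
Keepalives: Every socket created by a NetWorker caller process connects to a listener daemon process, and these sockets can be interrupted by network devices seeking to reclaim resources if they stay idle for too long. This generally requires keepalives to be enabled by default on NetWorker servers and nodes, and on clients that experience issues. NetWorker has its own internal keepalive handling for some, but not all, binaries. The operating system also has keepalives, which should be enabled by default.
Port availability: Every socket that NetWorker sets out to establish needs a port in the ephemeral range to communicate. By default this range is restricted on all operating systems, and it should be opened up as far as possible to avoid artificially limiting communication. With nsrauth enabled by default, a single required socket needs at least 3 ports, and each failure may be retried quickly, leaving ports in the TIME_WAIT state until the connection succeeds. The maximum number of available ports should therefore be raised and, preferably, the TIME_WAIT delay lowered.
Other long-running sockets can also be hardened with specific internal software variables for greater resilience or improved buffering.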
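The port arithmetic described above can be made concrete. The following is a minimal sketch, not a NetWorker tool: the function name `max_sockets` is hypothetical, and the divisor of 3 is the nsrauth minimum stated above, so real capacity is lower once retries and ports held in TIME_WAIT are taken into account.

```shell
#!/bin/sh
# Rough upper bound on concurrently established sockets for an ephemeral range.
# max_sockets LOW HIGH [PORTS_PER_SOCKET]
#   LOW/HIGH: inclusive ephemeral port range
#   PORTS_PER_SOCKET: defaults to 3, the nsrauth minimum cited in this article
max_sockets() {
    low=$1
    high=$2
    per_socket=${3:-3}
    # The range is inclusive, so add 1; integer division rounds down.
    echo $(( (high - low + 1) / per_socket ))
}

# On Linux, the live range can be read without root privileges.
if [ -r /proc/sys/net/ipv4/ip_local_port_range ]; then
    read -r cur_low cur_high < /proc/sys/net/ipv4/ip_local_port_range
    echo "current range ${cur_low}-${cur_high}: at most $(max_sockets "$cur_low" "$cur_high") sockets"
fi

# Range used in the Linux settings in the Resolution section (10000 64000).
echo "recommended range 10000-64000: at most $(max_sockets 10000 64000) sockets"
```

For the 10000 to 64000 range this gives 18000 sockets as a best case, which is why both a wider range and a shorter TIME_WAIT delay are suggested.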
Resolution
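The following are commonly recommended settings, with the commands that implement them, grouped by operating system and host class. As always, applicability varies: settings considered universally desirable are left uncommented, while those whose applicability varies more widely are commented out but available when needed. These settings are offered in good faith as general recommendations, but an operating system administrator should review them before implementation. They are considered the best-practice defaults for servers and storage nodes. Client applicability may vary with the configuration and role in a given environment, so consider the settings carefully before use. Different application server roles may conflict with the recommended settings, and in those cases the settings required by the role take precedence.
Linux: All appropriate settings should be placed in the /nsr/nsrrc file, which must have global read/execute permissions (755) so that it is executed at service startup. Default standard entries are uncommented; non-standard or situational options are commented. Add or remove the # prefix on the relevant lines to change the availability of a setting. Tailor the file to the NetWorker client, node, or server, depending on where you will deploy it. A service restart is required after making changes.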
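As a minimal sketch of the permission requirement described above, the hypothetical helper `check_nsrrc_mode` reports whether a file carries mode 755. It assumes GNU `stat` (the `-c %a` option), as found on typical Linux distributions, and it only reads the file mode. The demonstration runs against a scratch file rather than the live /nsr/nsrrc.

```shell
#!/bin/sh
# check_nsrrc_mode FILE
# Returns 0 when FILE exists with octal mode 755, non-zero otherwise.
# Assumes GNU coreutils stat (-c %a prints the octal permission bits).
check_nsrrc_mode() {
    file=$1
    if [ ! -f "$file" ]; then
        echo "$file: not found" >&2
        return 2
    fi
    mode=$(stat -c %a "$file") || return 2
    if [ "$mode" = "755" ]; then
        echo "$file: mode $mode OK"
        return 0
    fi
    echo "$file: mode $mode, expected 755 (fix with: chmod 755 $file)" >&2
    return 1
}

# Demonstration on a scratch file: fails first, passes after chmod 755.
demo=$(mktemp)
chmod 644 "$demo"
check_nsrrc_mode "$demo" || chmod 755 "$demo"
check_nsrrc_mode "$demo"
rm -f "$demo"
```

On a real host the call would be `check_nsrrc_mode /nsr/nsrrc`, run before restarting the NetWorker services.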
### LINUX - For all NetWorker hosts - Clients, Nodes and Server
NSR_KEEPALIVE_WAIT=10
export NSR_KEEPALIVE_WAIT
NSR_EXEC_MAX_AUTH_THREADS=50
export NSR_EXEC_MAX_AUTH_THREADS
# NSR_SOCK_BUF_SIZE=65536 # (262144 for 10 Gb ETH NICs)
# export NSR_SOCK_BUF_SIZE
# NetWorker internal keepalive settings for some, but not all binaries - 4.5 minutes to ensure keepalives are passed before the increasingly common 5 minute router idle socket kill timer
NW_TCP_KEEPIDLE_SECS=270
export NW_TCP_KEEPIDLE_SECS
NW_TCP_KEEPINTVL_SECS=30
export NW_TCP_KEEPINTVL_SECS
NW_TCP_KEEPCNT=10
export NW_TCP_KEEPCNT
# OS-level keepalive values - also set to 4.5 minutes for the same reason
sysctl -w "net.ipv4.tcp_keepalive_intvl=30"
sysctl -w "net.ipv4.tcp_keepalive_probes=10"
sysctl -w "net.ipv4.tcp_keepalive_time=270"
# Set kernel limits to ensure core dump generation
ulimit -Sn 262144
ulimit -Sc unlimited
### For NetWorker Storage Nodes and Server
# Set kernel limits to provide maximum file descriptor availability
ulimit -Hn 262144
ulimit -Hc unlimited
# Globally disable IPv6, if it is not necessary for operation:
# sysctl -w "net.ipv6.conf.all.disable_ipv6=1"
# Disable dynamic TCP window scaling - requires compatible equipment in the data path, as well as ECN
sysctl -w "net.ipv4.tcp_window_scaling=0"
sysctl -w "net.ipv4.tcp_ecn=0"
# Raise connection backlog (hash tables) to the maximum value allowed if desired
# sysctl -w "net.ipv4.tcp_max_syn_backlog=8192"
# sysctl -w "net.core.netdev_max_backlog=8192" # (For 10 Gb Eth use the value = 30000)
# Raise memory size available for TCP buffers as needed
# sysctl -w "net.core.rmem_default=262144"
# sysctl -w "net.core.wmem_default=262144"
# sysctl -w "net.core.rmem_max=16777216"
# sysctl -w "net.core.wmem_max=16777216"
# sysctl -w "net.ipv4.tcp_rmem=8192 524288 16777216"
# sysctl -w "net.ipv4.tcp_wmem=8192 524288 16777216"
# Increase shared memory pool if required - particularly for immediate mode on Storage Nodes
# sysctl -w "kernel.shmmax=2147483648" # - in bytes, e.g. 2 GB
# sysctl -w "kernel.shmall=524288" # - in pages, e.g. 2 GB with 4 KB pages
# Available TCP client ephemeral port range increase from default:
sysctl -w "net.ipv4.ip_local_port_range=10000 64000"
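# Enable TCP Time Wait Reuse for very high load servers and nodes to increase socket reuse availability
# (tcp_tw_recycle was removed in Linux kernel 4.12; on newer kernels that line reports an unknown key and can be dropped)
sysctl -w "net.ipv4.tcp_tw_recycle=0"
# (tcp_tw_reuse: 1 enables reuse for all outgoing connections, 2 restricts it to loopback traffic)
sysctl -w "net.ipv4.tcp_tw_reuse=2"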
# Lower TIME_WAIT delay to close connections more quickly. This may not be necessary in concert with tw_reuse.
# sysctl -w "net.ipv4.tcp_fin_timeout=30"
# NFS I/O concurrency:
sysctl -w "sunrpc.tcp_slot_table_entries=128"
sysctl -w "sunrpc.udp_slot_table_entries=128"
### For NetWorker Server only
# Settings to increase device resilience for cloud operations or other potentially high-latency devices
# NSR_DEVOP_TIMEOUT=3600
# export NSR_DEVOP_TIMEOUT
# NSR_DEVOP_POLLING_INTERVAL=600
# export NSR_DEVOP_POLLING_INTERVAL
# NSR_DEVOP_INQUIRY_TIMEOUT=900
# export NSR_DEVOP_INQUIRY_TIMEOUT
### Media database tunables
# NSR_TCP_READ_LONG_WAIT=Y
# export NSR_TCP_READ_LONG_WAIT
# NSR_MAX_MEDIADB_RETRY=10
# export NSR_MAX_MEDIADB_RETRY
# MMDB_SQLITE_CONFIGURE_MEMORY=1
# export MMDB_SQLITE_CONFIGURE_MEMORY
# MMDB_SQLITE_PAGECACHE_SIZE=65536
# export MMDB_SQLITE_PAGECACHE_SIZE
# MMDB_SQLITE_PAGE_COUNT=65536
# export MMDB_SQLITE_PAGE_COUNT
# MMDB_SQLITE_HEAP_SIZE=1073741824
# export MMDB_SQLITE_HEAP_SIZE
# MDB_SQLITE_HEAP_MIN_ALLOC_SIZE=128
# export MDB_SQLITE_HEAP_MIN_ALLOC_SIZE
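To illustrate how the keepalive values in the listing above interact, the sketch below computes when the first probe is sent and how long an unresponsive peer takes to be declared dead. The function names are hypothetical. The formula (idle time plus probe count times probe interval) reflects standard TCP keepalive behavior on Linux, and the 300 second figure is the router idle timer mentioned in the listing's comments.

```shell
#!/bin/sh
# keepalive_detect_secs IDLE INTERVAL PROBES
# Seconds from the last data exchange until a dead peer is detected:
# the idle period, then PROBES unanswered probes sent INTERVAL seconds apart.
keepalive_detect_secs() {
    echo $(( $1 + $2 * $3 ))
}

# keepalive_beats_timer IDLE DEVICE_TIMEOUT
# Succeeds when the first probe goes out before the device's idle timer fires.
keepalive_beats_timer() {
    [ "$1" -lt "$2" ]
}

idle=270; intvl=30; probes=10      # values from the nsrrc listing
router_idle=300                    # assumed 5 minute idle-socket timer

if keepalive_beats_timer "$idle" "$router_idle"; then
    echo "first probe at ${idle}s, before the ${router_idle}s idle timer"
fi
echo "dead peer detected after $(keepalive_detect_secs "$idle" "$intvl" "$probes")s"
```

With the listed values the first probe leaves at 270 seconds and a dead peer is detected after 570 seconds. With the Linux default idle time of 7200 seconds, the first probe would arrive long after a 300 second device timer had already dropped the connection.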
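Windows: Because there is currently no /nsr/nsrrc file on Windows, the changes must be applied with a batch file, such as nsrrc.bat, or by another deployment method. Commands are provided here wherever a command-driven option exists. The changes are global and do not need to be run repeatedly. As with the Linux nsrrc file, default standard entries are uncommented and non-standard or situational options are commented. Add or remove the REM prefix on the relevant lines to change the availability of a setting. Tailor the file to the NetWorker client, node, or server, depending on where you will deploy it. A service restart is required after making changes.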
REM ### WINDOWS - For all NetWorker hosts - Clients, Nodes and Server
REM # TCP window size tuning - greater throughput / Data Domain
REM reg add HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\AFD\Parameters /v DefaultSendWindow /t REG_DWORD /d 262144 /f
REM reg add HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\AFD\Parameters /v DefaultReceiveWindow /t REG_DWORD /d 262144 /f
REM reg add HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v GlobalMaxTcpWindowSize /t REG_DWORD /d 262144 /f
REM reg add HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v TcpWindowSize /t REG_DWORD /d 262144 /f
REM # Global keepalive registry settings - 270s to fall below common idle socket timer kills of 300s
reg add HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v KeepAliveTime /t REG_DWORD /d 270000 /f
reg add HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v KeepAliveInterval /t REG_DWORD /d 10000 /f
reg add HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v TcpMaxDataRetransmissions /t REG_DWORD /d 10 /f
REM # Global NetWorker keepalive and connectivity variables
setx /m NW_TCP_KEEPIDLE_SECS 270
setx /m NW_TCP_KEEPINTVL_SECS 30
setx /m NW_TCP_KEEPCNT 10
setx /m NSR_KEEPALIVE_WAIT 10
setx /m NSR_EXEC_MAX_AUTH_THREADS 50
REM # NSR_SOCK_BUF_SIZE: use 262144 for 10 Gb Eth NICs
REM setx /m NSR_SOCK_BUF_SIZE 65536
REM ### For NetWorker Storage Nodes and Server
REM # Standard TCP features - disable in case of disconnections
REM netsh interface tcp set global rss=disabled
REM netsh interface tcp set global autotuning=disabled
REM netsh interface tcp set global ecncapability=disabled
REM netsh interface tcp set global timestamps=default
REM # Port range availability for TCP client callers
netsh int ipv4 set dynamicport tcp start=10000 num=54000
netsh int ipv4 set dynamicport udp start=10000 num=54000
netsh int ipv6 set dynamicport tcp start=10000 num=54000
netsh int ipv6 set dynamicport udp start=10000 num=54000
REM # Global port maximum (deprecated) and TIME_WAIT window
REM reg add HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v MaxUserPort /t REG_DWORD /d 65535 /f
reg add HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v TcpTimedWaitDelay /t REG_DWORD /d 30 /f
REM # Disable IPv6 if not required
REM reg add HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip6\Parameters /v DisabledComponents /t REG_DWORD /d 0x000000ff /f
REM ### For NetWorker Server only
REM # Settings to increase device resilience for cloud operations or other potentially high-latency devices
REM setx /m NSR_DEVOP_TIMEOUT 3600
REM setx /m NSR_DEVOP_POLLING_INTERVAL 600
REM setx /m NSR_DEVOP_INQUIRY_TIMEOUT 900
REM ### Settings for media database tuning
REM setx /m NSR_TCP_READ_LONG_WAIT Y
REM setx /m NSR_MAX_MEDIADB_RETRY 10
REM setx /m MDB_SQLITE_HEAP_MIN_ALLOC_SIZE 128
REM setx /m MMDB_SQLITE_CONFIGURE_MEMORY 1
REM setx /m MMDB_SQLITE_HEAP_SIZE 1073741824
REM setx /m MMDB_SQLITE_PAGE_COUNT 65536
REM setx /m MMDB_SQLITE_PAGECACHE_COUNT 65536
REM setx /m MMDB_SQLITE_TMP path_to_temp_dir
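The Linux and Windows port settings express the same range in different forms: `net.ipv4.ip_local_port_range` takes an inclusive low and high port, while `netsh ... set dynamicport` takes a start port and a port count. The following is a minimal sketch of the conversion, with hypothetical function names, written in shell for consistency with the Linux listing.

```shell
#!/bin/sh
# range_to_netsh LOW HIGH
# Convert an inclusive LOW..HIGH port range into netsh "start= num=" form.
range_to_netsh() {
    echo "start=$1 num=$(( $2 - $1 + 1 ))"
}

# netsh_last_port START NUM
# Last port covered by a netsh dynamic port range.
netsh_last_port() {
    echo $(( $1 + $2 - 1 ))
}

range_to_netsh 10000 64000     # prints: start=10000 num=54001
netsh_last_port 10000 54000    # prints: 63999
```

The Windows commands above (start=10000 num=54000) therefore end at port 63999, one port short of the Linux upper bound of 64000, which makes no practical difference.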
Additional Information
Affected Products
NetWorker
Article Properties
Article Number: 000218894
Article Type: Solution
Last Modified: 07 Oct 2025
Version: 3