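NVP vProxy: VMware View Does Not Refresh and All Virtual Machine Backups Fail
Summary: NetWorker VMware Protection (NVP) backups fail consistently or intermittently with "End of file or no input: Operation interrupted or timed out (3600 s receive delay) (3600 s send delay)" recorded in the backup session log. The NetWorker Management Console (NMC) VMware View refresh fails consistently or intermittently with the same "End of file: Operation interrupted or timed out" error. The same error also appears in the NetWorker server's daemon.log during VMware inventory (nsrvim) operations. ...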
Symptoms
- A VMware vCenter has been added to the NetWorker server to perform NetWorker VMware Protection vProxy backups.
- The Protection > VMware View tab in the NetWorker Management Console fails to refresh:

Error fetching vCenter information for: vCenter_Name Reason(s): Unable to fetch data from vCenter: End of file or no input: Operation interrupted or timed out (3600 s receive delay) (3600 s send delay).
- The automated nsrvim process that queries the vCenter inventory fails. The daemon.log on the NetWorker server reports:
106637 MM/DD/YYYY HH:MM:SS AM/PM 1 3 0 6508 3964 0 NetWorker_Hostname nsrdisp_nwbg RAP notice job 'nsrvim' progress message: nsrvim starting on NetWorker_Hostname (process 6252).
106637 MM/DD/YYYY HH:MM:SS AM/PM 1 3 0 6508 3964 0 NetWorker_Hostname nsrdisp_nwbg RAP notice job 'nsrvim' progress message: Connecting to NetWorker on 'NetWorker_Hostname'.
106637 MM/DD/YYYY HH:MM:SS AM/PM 1 3 0 6508 3964 0 NetWorker_Hostname nsrdisp_nwbg RAP notice job 'nsrvim' progress message: Querying NSR hypervisor resource 'vCenter_Hostname'
106637 MM/DD/YYYY HH:MM:SS AM/PM 1 3 0 6508 3964 0 NetWorker_Hostname nsrdisp_nwbg RAP notice job 'nsrvim' progress message: Connecting to service at https://vCenter_Hostname/sdk
106637 MM/DD/YYYY HH:MM:SS AM/PM 1 3 0 6508 3964 0 NetWorker_Hostname nsrdisp_nwbg RAP notice job 'nsrvim' progress message: Starting session with infrastructure services daemon.
106637 MM/DD/YYYY HH:MM:SS AM/PM 1 3 0 5648 7592 0 NetWorker_Hostname nsrdisp_nwbg RAP notice job 'nsrvim' progress message: Querying for inventory at https://vCenter_Hostname/sdk
106637 MM/DD/YYYY HH:MM:SS AM/PM 1 3 0 7952 8524 0 NetWorker_Hostname nsrdisp_nwbg RAP notice job 'nsrvim' progress message: Querying for inventory at https://vCenter_Hostname/sdk
106637 MM/DD/YYYY HH:MM:SS AM/PM 1 3 0 3624 4728 0 NetWorker_Hostname nsrdisp_nwbg RAP notice job 'nsrvim' progress message: Querying for inventory at https://vCenter_Hostname/sdk
106637 MM/DD/YYYY HH:MM:SS AM/PM 1 3 0 6508 3964 0 NetWorker_Hostname nsrdisp_nwbg RAP notice job 'nsrvim' progress message: Querying for inventory at https://vCenter_Hostname/sdk
106637 MM/DD/YYYY HH:MM:SS AM/PM 1 3 0 2228 7760 0 NetWorker_Hostname nsrdisp_nwbg RAP notice job 'nsrvim' progress message: End of file or no input: Operation interrupted or timed out (3600 s receive delay) (3600 s send delay)
Linux: /nsr/logs/daemon.raw
Windows: C:\Program Files\EMC NetWorker\nsr\logs\daemon.raw
See also: NetWorker: How to use nsr_render_log
See also: NetWorker: How to automatically render daemon.raw to daemon.log in real time
- Virtual machine (VM) backups from the vCenter fail, and the backup session log reports a similar error message:
MM/DD/YYYY HH:MM:SS AM/PM Failed to run nsrvim, error: Unable to fetch data from vCenter: End of file or no input: Operation interrupted or timed out (3600 s receive delay) (3600 s send delay).
MM/DD/YYYY HH:MM:SS AM/PM Failed to get work items. Will retry in 360 seconds.
MM/DD/YYYY HH:MM:SS AM/PM Starting nsrvim.
MM/DD/YYYY HH:MM:SS AM/PM Calling the nsrvim program to collect the inventory data.
MM/DD/YYYY HH:MM:SS AM/PM Setting default timeout 1800.
MM/DD/YYYY HH:MM:SS AM/PM Using a timeout of 1800 seconds for the nsrvim request. Minimum timeout is 360 seconds. Maximum timeout is 3600 seconds.
MM/DD/YYYY HH:MM:SS AM/PM Failed to run nsrvim, error: Unable to fetch data from vCenter: End of file or no input: Operation interrupted or timed out (3600 s receive delay) (3600 s send delay).
MM/DD/YYYY HH:MM:SS AM/PM Unable to fetch data from vCenter: End of file or no input: Operation interrupted or timed out (3600 s receive delay) (3600 s send delay)
MM/DD/YYYY HH:MM:SS AM/PM Action backup vmware-vproxy 'backup' with job id 1769899 is exiting with status 'failed', exit code 1
MM/DD/YYYY HH:MM:SS AM/PM Action has finished with failures.
Linux: /nsr/logs/policy/Policy_Name/Workflow_Name
Windows: C:\Program Files\EMC NetWorker\nsr\logs\policy\Policy_Name\Workflow_Name
- The NetWorker server can reach port 443 on the vCenter Server:
Windows (PowerShell): tnc vCenter_Hostname -port 443
Linux: curl -v vCenter_Hostname:443
NetWorker command: nsrports -t vCenter_Hostname -p 443
- The errors and symptoms described may be inconsistent or intermittent.
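Because the symptoms can be intermittent, it can help to count how often the error appears in a log. The following is a minimal sketch, assuming the log has already been rendered to plain text (for example with nsr_render_log). The function name and the path in the usage comment are illustrative and are not part of NetWorker.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count lines that contain the nsrvim inventory error.
# The pattern is the error text quoted in the symptoms above.
count_nsrvim_eof_errors() {
    local log_file="$1"
    # grep -c prints the number of matching lines. It exits non-zero when
    # the count is 0, so "|| true" keeps the function's exit status clean.
    grep -c 'End of file or no input: Operation interrupted or timed out' "$log_file" || true
}

# Example (the path is an assumption about where the rendered log lives):
# count_nsrvim_eof_errors /nsr/logs/daemon.log
```

Running the count against logs from several days gives a rough view of whether the failures are constant or cluster around particular times.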
Cause
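The error returned states that the operation was interrupted or timed out. However, the timeout is 3600 seconds (1 hour), and the error appears before the 1-hour threshold has been exceeded. The process is therefore being interrupted. Possible causes:
- A network routing or firewall issue.
- The vCenter Server is closing the connection before the nsrvim inventory process completes.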
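Case 1 scenario: The firewall was configured with adaptive rules that allowed the NetWorker server to connect to vCenter over port 443 but closed the connection while NetWorker's nsrvim process was inventorying the vCenter.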
Case 2 scenario: The vCenter Server closed the connection during an nsrvim application PDU.
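The NetWorker server's nsrvim process queries the vCenter Server for VMware resources. By default, this process runs on the NetWorker server every 15 minutes, and also whenever a Refresh is performed in the NMC VMware View or a virtual machine protection job starts.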
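To confirm that a failure is an interruption and not a genuine timeout, compare the timestamp of the "nsrvim starting" log entry with the timestamp of the error entry. The sketch below assumes GNU date and the MM/DD/YYYY HH:MM:SS AM/PM timestamp layout shown in the logs. Both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; they assume GNU date (for the -d option).

# Print the number of seconds between two "MM/DD/YYYY HH:MM:SS AM|PM" timestamps.
elapsed_seconds() {
    local start_epoch end_epoch
    # -u parses both values as UTC so a DST change cannot skew the difference.
    start_epoch=$(date -u -d "$1" +%s) || return 1
    end_epoch=$(date -u -d "$2" +%s) || return 1
    echo $(( end_epoch - start_epoch ))
}

# Succeed (exit 0) when the failure came before the 3600-second timeout.
interrupted_before_timeout() {
    local elapsed
    elapsed=$(elapsed_seconds "$1" "$2") || return 2
    [ "$elapsed" -lt 3600 ]
}

# Example (timestamps are placeholders):
# interrupted_before_timeout '05/22/2024 10:01:00 AM' '05/22/2024 10:13:30 AM' \
#     && echo "failed before the timeout; the connection was likely interrupted"
```

An elapsed time well under 3600 seconds points to the connection being closed by the network or by vCenter, which matches the two scenarios described above.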
Resolution
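The network or firewall administrator must confirm whether any firewall rules block or terminate connections between the NetWorker server and the vCenter Server over port 443. If any such rules exist, temporarily disable them to see whether the issue is resolved in NetWorker. If disabling the rules allows the VMware View to refresh and VMware backups to complete, the firewall or routing rules must be changed so that they no longer interrupt the connection between the NetWorker server and the vCenter.
The required ports and network topology diagrams are detailed in the NetWorker VMware Integration Guide, which is available from the NetWorker support information on the Dell Support site.
The network administrator can also run a packet capture tool (tcpdump, Wireshark) on both the NetWorker server and the vCenter. After the issue is reproduced, review the packet captures to see whether the vCenter Server is closing the inventory session.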
Windows: https://www.wireshark.org/
Linux NetWorker server and vCenter Server: https://www.tcpdump.org/manpages/tcpdump.1.html
1. Start the packet capture. Example tcpdump command:
nohup tcpdump -i any -s 0 -C 500 -w /tmp/`hostname`_`date -I`.pcap &
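nohup keeps the command running after the terminal session ends, and the trailing & runs it in the background until its PID is terminated with kill. -i specifies the interface; use any, or a specific network interface name such as eth0. -s 0 captures entire frames (the snapshot length becomes 65535 bytes on older tcpdump releases and 262144 bytes on current ones). -C 500 limits each output file to 500,000,000 bytes; tcpdump opens a new file when that size is reached. -w specifies the output file location. The file name shown is generated automatically from the system hostname and the run date in YYYY-MM-DD format. The .pcap file can be analyzed in Wireshark.
2. While reproducing the issue in NetWorker, enable nsrdispd debugging and run the nsrvim command with debugging: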
dbgcommand -n nsrdispd Debug=9
nsrvim -D7 -d vCenter_Hostname > {Path_to_output_file} 2<&1
nve:~ # dbgcommand -n nsrdispd Debug=9
Process ID List : 14600
Processing PID:14600
nve:~ # nsrvim -D7 -d vcsa.amer.lan > /tmp/nsrvim.out 2<&1
nve:~ # ls -l /tmp | grep nsrvim
-rw------- 1 root root 60025 May 22 10:18 nsrvim.out
nve:~ #
3. The error is reported in the NetWorker server's daemon.raw:
Linux: /nsr/logs/daemon.raw
Windows: C:\Program Files\EMC NetWorker\nsr\logs\daemon.raw
See also: NetWorker: How to use nsr_render_log
See also: NetWorker: How to automatically render daemon.raw to daemon.log in real time
4. Disable nsrdispd debugging and stop tcpdump:
dbgcommand -n nsrdispd Debug=0
ps -ef | grep tcpdump
kill -9 PID_of_tcpdump
nve:~ # dbgcommand -n nsrdispd Debug=0
Process ID List : 14600
Processing PID:14600
nve:~ # ps -ef | grep tcpdump
root 29439 29267 0 10:01 pts/0 00:00:00 tcpdump -i any -s 0 -C 500 -w /tmp/nve_2024-05-22.pcap
root 29882 29267 0 10:13 pts/0 00:00:00 grep --color=auto tcpdump
nve:~ # kill -9 29439
nve:~ # ps -ef | grep tcpdump
root 29890 29267 0 10:13 pts/0 00:00:00 grep --color=auto tcpdump
[1]+ Killed nohup tcpdump -i any -s 0 -C 500 -w /tmp/`hostname`_`date -I`.pcap
nve:~ # ps -ef | grep tcpdump
root 29893 29267 0 10:13 pts/0 00:00:00 grep --color=auto tcpdump
nve:~ # ls -l /tmp | grep pcap
-rw------- 1 root root 5464064 May 22 10:13 nve_2024-05-22.pcap
nve:~ #
Review the packet capture to see whether the vCenter Server or a network device is closing the connection.
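As an alternative to opening the capture in Wireshark, tcpdump can read the .pcap file back and show only the packets that tear down a connection. The sketch below builds a standard pcap filter expression for TCP FIN and RST packets on port 443. The function name is hypothetical, and the host and file names in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a pcap filter that keeps only TCP FIN or RST
# packets exchanged with the given host on port 443.
teardown_filter() {
    local vcenter_host="$1"
    printf 'host %s and tcp port 443 and (tcp[tcpflags] & (tcp-fin|tcp-rst) != 0)\n' "$vcenter_host"
}

# Example (host and file names are placeholders):
# tcpdump -nn -r /tmp/nve_2024-05-22.pcap "$(teardown_filter vCenter_Hostname)"
```

The source address of the first FIN or RST suggests which side initiated the close. A firewall or other network device can also send resets on behalf of either end, so comparing the captures taken on the NetWorker server and on the vCenter helps tell the two cases apart. Note that the tcp[tcpflags] form matches IPv4 traffic.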
Additional Information
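NetWorker 19.10 and later allow backup administrators to change the nsrvim interval. The default interval is 15 minutes; however, it can be extended to any interval up to a maximum of 60 minutes. See also: NVP vProxy: NetWorker nsrvim process runs every 15 minutes, causing high workload on the vCenter Server and potential VPXD unavailability.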