NetWorker: nws resource on a RHEL pacemaker cluster fails to start with "nsrd NSR critical Can't start nsrd..."
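Summary: NetWorker is deployed on a Red Hat high-availability cluster using pacemaker. The NetWorker server (nsrd) service fails to start, reporting that /nsr is local and that the service must be managed by the cluster manager.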
Symptoms
- /nsr_share/nsr/logs/daemon.raw records the following errors during service startup:
73248 01/31/2023 12:57:48 PM 5 5 0 926312256 966299 0 nwrhelnodef.emclab.local nsrd NSR critical Can't start nsrd because /nsr/res (/nsr) is local, and NetWorker is configured as a cluster server. Use cluster manager to check NetWorker service status.
144354 01/31/2023 12:57:48 PM 1 5 0 130900160 963225 0 nwrhelnodef.emclab.local nsrctld NSR notice Daemon nsrd terminated.
- The node can view the cluster resources with the lcmap command:
root@NWrhelNodeF:~# lcmap
type: NSR_CLU_TYPE;
clu_type: NSR_LC_TYPE;
interface version: 1.0;
type: NSR_CLU_VIRTHOST;
hostname: 192.168.25.28;
local: FALSE;
owned paths: /nsr_share;
- The lcmap output matches the pcs resource configuration (pcs resource config):
root@NWrhelNodeF:~# pcs resource config
 Group: NW_group
  Resource: fs (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/sdb1 directory=/nsr_share fstype=ext4
   Operations: monitor interval=20 timeout=300 (fs-monitor-interval-20)
               start interval=0s timeout=60s (fs-start-interval-0s)
               stop interval=0s timeout=60s (fs-stop-interval-0s)
  Resource: ip (class=ocf provider=heartbeat type=IPaddr)
   Attributes: cidr_netmask=24 ip=192.168.25.28 nic=ens192
   Operations: monitor interval=15 timeout=120 (ip-monitor-interval-15)
               start interval=0s timeout=20s (ip-start-interval-0s)
               stop interval=0s timeout=20s (ip-stop-interval-0s)
  Resource: nws (class=ocf provider=EMC_NetWorker type=Server)
   Meta Attrs: is-managed=true
   Operations: meta-data interval=0 timeout=10 (nws-meta-data-interval-0)
               migrate_from interval=0 timeout=120 (nws-migrate_from-interval-0)
               migrate_to interval=0 timeout=60 (nws-migrate_to-interval-0)
               monitor interval=100 timeout=1000 (nws-monitor-interval-100)
               start interval=0 timeout=300 (nws-start-interval-0)
               stop interval=0 timeout=300 (nws-stop-interval-0)
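The comparison between the lcmap output and the pcs configuration can also be scripted. The following is a minimal sketch, not a NetWorker or pcs feature: the helper names lcmap_field, pcs_attr, and lcmap_matches_pcs are hypothetical, and it assumes that both outputs were saved to files and that lcmap reports the virtual host as an IP address, as it does in this article.

```shell
# Sketch only: hypothetical helpers that compare saved lcmap output with
# saved "pcs resource config" output.

# lcmap_field KEY FILE: print the value of the first "KEY: value;" line.
lcmap_field() {
    sed -n "s/^[[:space:]]*$1:[[:space:]]*\(.*\);[[:space:]]*\$/\1/p" "$2" | head -n 1
}

# pcs_attr KEY FILE: print the value of the first KEY=value attribute.
pcs_attr() {
    grep -o "$1=[^[:space:]]*" "$2" | head -n 1 | cut -d= -f2-
}

# lcmap_matches_pcs LCMAP_FILE PCS_FILE: succeed only if the virtual host
# address and the shared path agree in both outputs.
lcmap_matches_pcs() {
    [ "$(lcmap_field hostname "$1")" = "$(pcs_attr ip "$2")" ] &&
    [ "$(lcmap_field 'owned paths' "$1")" = "$(pcs_attr directory "$2")" ]
}
```

One possible use is to redirect the output of lcmap and of pcs resource config to two temporary files and then run lcmap_matches_pcs on them; a non-zero exit status would suggest the two configurations disagree.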
Cause
The IP address does not resolve to the name used by the NetWorker cluster configuration:
root@NWrhelNodeF:~# nslookup 192.168.25.28
** server can't find 28.25.168.192.in-addr.arpa: NXDOMAIN
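nslookup queries DNS servers directly and does not read /etc/hosts, so a check that follows the system resolver order can be a useful complement. The sketch below uses getent hosts for that purpose. The helper names reverse_name, normalize_name, and names_match are hypothetical, and the comparison rules (case-insensitive, trailing dot ignored) are assumptions about what counts as a match.

```shell
# Sketch only: hypothetical helpers for checking a reverse lookup through
# the system resolver (DNS and /etc/hosts, as ordered in nsswitch.conf).

# reverse_name IP: print the first name the resolver returns for IP.
reverse_name() {
    getent hosts "$1" | awk '{ print $2; exit }'
}

# normalize_name NAME: lowercase the name and drop a trailing dot.
normalize_name() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//'
}

# names_match NAME1 NAME2: succeed if both names are equal after normalizing.
names_match() {
    [ "$(normalize_name "$1")" = "$(normalize_name "$2")" ]
}
```

For the addresses in this article, names_match "$(reverse_name 192.168.25.28)" NWrhelClusD.emclab.local would be expected to fail while the reverse lookup is broken, because reverse_name prints nothing.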
The IP should resolve to the NSR_SERVERHOST value in the /usr/lib/ocf/resource.d/EMC_NetWorker/Server file:
root@NWrhelNodeF:~# cat /usr/lib/ocf/resource.d/EMC_NetWorker/Server | grep SERVERHOST
echo "q" | nsradmin -s ${NSR_SERVERHOST} -i - > /dev/null 2>&1
echo "q" | nsradmin -s ${NSR_SERVERHOST} -i - > /dev/null 2>&1
NSR_SERVERHOST=NWrhelClusD.emclab.local
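The grep above also returns the lines that only reference the variable. As a rough sketch, the assignment alone can be extracted as follows. The helper name nsr_serverhost is hypothetical, and the sketch assumes the value is assigned on a single line of the form NSR_SERVERHOST=value, as in the output shown here.

```shell
# Sketch only: hypothetical helper that prints the NSR_SERVERHOST value
# assigned in a resource agent script.

# nsr_serverhost [FILE]: FILE defaults to the agent path used in this article.
nsr_serverhost() {
    agent="${1:-/usr/lib/ocf/resource.d/EMC_NetWorker/Server}"
    # Keep only the assignment line, strip the variable name and any quotes.
    sed -n 's/^[[:space:]]*NSR_SERVERHOST=//p' "$agent" | head -n 1 | tr -d "\"'"
}
```

Called without arguments on a cluster node, nsr_serverhost should print the name that the virtual IP is expected to resolve to.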
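This value is set when the /usr/sbin/networker.cluster script is run.
Resolution
Correct the name resolution for the VIP. The administrator can fix this either in the DNS configuration or with /etc/hosts entries on every node in the cluster.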
root@NWrhelNodeF:~# nslookup 192.168.25.28
28.25.168.192.in-addr.arpa name = NWrhelClusD.emclab.local.
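If the /etc/hosts route is chosen, the entry has to be present on each node. The following sketch shows one way to add it without creating duplicates. The helper name add_hosts_entry is hypothetical, and the short alias used in the usage note is an assumption. The first name after the IP address is the canonical one, so the fully qualified cluster name goes first. Note that nslookup does not consult /etc/hosts, so a hosts-file fix would have to be verified with a resolver-aware tool such as getent hosts.

```shell
# Sketch only: hypothetical helper that appends a hosts entry once.

# add_hosts_entry FILE IP NAME...: append "IP<TAB>NAME..." to FILE unless a
# non-comment line for IP already exists.
add_hosts_entry() {
    file=$1
    ip=$2
    shift 2
    # awk exits 0 when some line has IP as its first field.
    if awk -v ip="$ip" '$1 == ip { found = 1 } END { exit !found }' "$file"; then
        return 0
    fi
    printf '%s\t%s\n' "$ip" "$*" >> "$file"
}
```

Run as root on each node, for example add_hosts_entry /etc/hosts 192.168.25.28 NWrhelClusD.emclab.local NWrhelClusD, after taking a backup copy of the file.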
Once the name resolves correctly, the NetWorker service can be started:
root@NWrhelNodeF:~# pcs resource cleanup nws
Cleaned up fs on NWrhelNodeF.emclab.local
Cleaned up fs on NWrhelNodeE.emclab.local
Cleaned up ip on NWrhelNodeF.emclab.local
Cleaned up ip on NWrhelNodeE.emclab.local
Cleaned up nws on NWrhelNodeF.emclab.local
Cleaned up nws on NWrhelNodeE.emclab.local
root@NWrhelNodeF:~# pcs resource
* Resource Group: NW_group:
* fs (ocf::heartbeat:Filesystem): Started NWrhelNodeF.emclab.local
* ip (ocf::heartbeat:IPaddr): Started NWrhelNodeF.emclab.local
* nws (ocf::EMC_NetWorker:Server): Started NWrhelNodeF.emclab.local
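The final status check can be automated in a small way. The sketch below counts resources that are not reported as Started. The helper name count_not_started is hypothetical, and it assumes the pcs resource output was saved to a file and uses the "name (agent): State node" line format shown above; other pcs versions may format the output differently.

```shell
# Sketch only: hypothetical helper that counts resources not in the
# Started state in saved "pcs resource" output.

# count_not_started FILE: print the number of resource lines whose state,
# the first word after "): ", is anything other than Started.
count_not_started() {
    awk '/\): / {
             sub(/^.*\): */, "")      # keep only "State node"
             if ($1 != "Started") n++
         }
         END { print n + 0 }' "$1"
}
```

A result of 0 for the output shown in this article would indicate that fs, ip, and nws are all running.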
Additional Information
If lcmap does not return hostname or owned paths values, see:
Affected Products
NetWorker
Products
NetWorker Family, NetWorker Series
Article Properties
Article Number: 000208093
Article Type: Solution
Last Modified: 30 Apr 2025
Version: 5