Avamar: Checkpoint failed with result "MSG_ERR_BADTIMESYNC"
Summary: Checkpoint failed with result "MSG_ERR_BADTIMESYNC"
Symptoms
Checkpoint failed with result "MSG_ERR_BADTIMESYNC".
avmaint cpstatus shows the following error:
Every 2.0s: avmaint cpstatus 20:16:23 2022
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cpstatus
  generation-time="1663935384"
  tag="cp.20220923121551"
  status="error"
  stripes-completed="0"
  stripes-total="0"
  start-time="1663935351"
  end-time="1663935351"
  result="MSG_ERR_BADTIMESYNC"
  refcount="1"/>
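When the checkpoint status has to be checked repeatedly, the result attribute can be pulled out of the XML instead of being read by eye. The following is a minimal sketch, not an Avamar tool: the function name cp_result is hypothetical, and it assumes the cpstatus element carries a single result="..." attribute as in the output above.

```shell
# Hypothetical helper: print the value of the result="..." attribute
# found in cpstatus XML read from standard input.
cp_result() {
  sed -n 's/.* result="\([^"]*\)".*/\1/p'
}

# Example use on the utility node (assumes avmaint is on the PATH):
#   avmaint cpstatus | cp_result
```

The leading space in the pattern keeps the match from firing on a longer attribute name that merely ends in "result".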
mapall --parallel 'date' shows a node out of sync:
admin@utility:~/>: mapall --parallel 'date'
Using /usr/local/avamar/var/probe.xml
(0.0) ssh -q -x -o GSSAPIAuthentication=no admin@192.168.255.2 'date'
(0.1) ssh -q -x -o GSSAPIAuthentication=no admin@192.168.255.3 'date'
(0.2) ssh -q -x -o GSSAPIAuthentication=no admin@192.168.255.4 'date'
(0.3) ssh -q -x -o GSSAPIAuthentication=no admin@192.168.255.5 'date'
(0.4) ssh -q -x -o GSSAPIAuthentication=no admin@192.168.255.6 'date'
(0.7) ssh -q -x -o GSSAPIAuthentication=no admin@192.168.255.9 'date'
(0.6) ssh -q -x -o GSSAPIAuthentication=no admin@192.168.255.8 'date'
(0.5) ssh -q -x -o GSSAPIAuthentication=no admin@192.168.255.7 'date'
Fri Sep 23 13:05:21 UTC 2022
Fri Sep 23 13:05:21 UTC 2022
Fri Sep 23 13:07:17 UTC 2022 <---- out of sync node
Fri Sep 23 13:05:20 UTC 2022
Fri Sep 23 13:05:22 UTC 2022
Fri Sep 23 13:05:20 UTC 2022
Fri Sep 23 13:05:22 UTC 2022
Fri Sep 23 13:05:21 UTC 2022
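The comparison above can also be done numerically. The sketch below is an illustration under stated assumptions rather than a supported procedure: it assumes each node is asked for epoch seconds (for example with 'date +%s' in place of 'date'), the function name find_skewed is hypothetical, and the 5-second default tolerance is an arbitrary choice. It ignores every input line that is not a bare number, takes the median as the reference time, and prints any value that deviates from it by more than the tolerance.

```shell
# Hypothetical helper: read epoch-second timestamps (one per line) on stdin
# and print those that differ from the median by more than $1 seconds.
find_skewed() {
  grep -E '^[0-9]+$' | sort -n | awk -v limit="${1:-5}" '
    { v[NR] = $1 }                       # values arrive already sorted
    END {
      if (NR == 0) exit
      median = v[int((NR + 1) / 2)]      # lower middle element
      for (i = 1; i <= NR; i++) {
        d = v[i] - median
        if (d < 0) d = -d
        if (d > limit) print v[i], "off by " d "s"
      }
    }'
}

# Example use (assumption: mapall passes the quoted command through unchanged):
#   mapall --parallel 'date +%s' | find_skewed 5
```

Because the output of a parallel run is not guaranteed to be in node order, this only confirms that a node is skewed; identifying which one still needs a per-node check.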
Checking NTP with ntpq -pn returns "Connection refused" on the suspect node.
Output edited to show only the affected node:
admin@utility:~/>: mapall --noerror '/usr/sbin/ntpq -pn'
Using /usr/local/avamar/var/probe.xml
(0.3) ssh -q -x -o GSSAPIAuthentication=no admin@192.168.255.5 '/usr/sbin/ntpq -pn'
/usr/sbin/ntpq: read: Connection refused
Checking the service directly on the affected node shows ntpd as Active: activating (auto-restart) (Result: resources):
root@node03:~/>: systemctl status ntpd
● ntpd.service - NTP Server Daemon
   Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled)
  Drop-In: /run/systemd/generator/ntpd.service.d
           └─50-insserv.conf-$time.conf
   Active: activating (auto-restart) (Result: resources) since Fri 2022-09-23 13:22:35 UTC; 1min 58s ago
A healthy service instead reports Active: active (running):
root@node03:~/#: systemctl status ntpd
● ntpd.service - NTP Server Daemon
   Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled)
  Drop-In: /run/systemd/generator/ntpd.service.d
           └─50-insserv.conf-$time.conf
   Active: active (running) since Fri 2022-09-23 14:04:37 UTC; 26s ago
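To compare the two states in a script, the text after "Active:" can be extracted from the status output. This is a rough sketch; the function name active_state is hypothetical, and it assumes the status text contains a line of the form "Active: <state> since <timestamp>" as in the two outputs above.

```shell
# Hypothetical helper: print the state text between "Active:" and "since"
# from `systemctl status` output read on standard input.
active_state() {
  sed -n 's/.*Active: \(.*\) since .*/\1/p' | head -n 1
}

# Example use on the affected node:
#   systemctl status ntpd | active_state
```

On a healthy node the function prints "active (running)"; on the affected node it prints "activating (auto-restart) (Result: resources)".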
Attempting to start ntpd fails:
root@node03:~/#: systemctl start ntpd.service
Job for ntpd.service failed because a configured resource limit was exceeded. See "systemctl status ntpd.service" and "journalctl -xe" for details.
Reviewing journalctl -xe shows "No space left on device" messages.
df shows that /var is at 100%:
admin@node03:~/#: df -kh
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs         16G  8.0K   16G   1% /dev
tmpfs            16G     0   16G   0% /dev/shm
tmpfs            16G   50M   16G   1% /run
tmpfs            16G     0   16G   0% /sys/fs/cgroup
/dev/sda5       9.8G  2.4G  7.0G  26% /
/dev/sdg1       183G  8.3G  165G   5% /ssd01
/dev/sda1       979M   50M  878M   6% /boot
/dev/sdd1       1.9T  236G  1.6T  13% /data04
/dev/sdc1       1.9T  240G  1.6T  13% /data03
/dev/sde1       1.9T  236G  1.6T  13% /data05
/dev/sdf1       1.9T  238G  1.6T  13% /data06
/dev/sdb1       1.9T  238G  1.6T  13% /data02
/dev/sda7       2.0G  2.0G     0 100% /var <-------
/dev/sda3       1.8T  267G  1.6T  15% /data01
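On a node with many mount points, the full filesystem is easier to spot with a filter. The following is a minimal sketch: the function name full_mounts and the default threshold of 95 percent are assumptions, and it expects df output in the six-column layout shown above (the -P option of df keeps each filesystem on one line).

```shell
# Hypothetical helper: read df output on stdin and print the mount point
# and Use% of every filesystem at or above the given percentage.
full_mounts() {
  awk -v limit="${1:-95}" 'NR > 1 {
    use = $5
    sub(/%/, "", use)                    # "100%" -> "100"
    if (use + 0 >= limit + 0) print $6, $5
  }'
}

# Example use on the affected node:
#   df -P | full_mounts 95
```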
Cause
ntpd relies on /var/lib/ntp/drift/ntp.drift, which holds the latest estimate of the clock frequency error. If /var is at 100%, ntpd cannot create or update the ntp.drift file, and NTP does not function correctly.
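A quick way to confirm this condition is to ask df how much space is left on the filesystem that holds the drift file. The sketch below is illustrative only: the function names avail_kb and has_free_space are hypothetical, and a nonzero free-space figure does not by itself prove that ntpd can write the file (permissions or inode exhaustion are not checked).

```shell
# Hypothetical helpers: report the free space, in KB, on the filesystem
# that contains a given path, and test whether any space is left.
avail_kb() {
  df -Pk "$1" 2>/dev/null | awk 'NR == 2 { print $4 }'
}

has_free_space() {
  avail="$(avail_kb "$1")"
  [ "${avail:-0}" -gt 0 ]
}

# Example use on the affected node:
#   has_free_space /var/lib/ntp/drift || echo "no space left for ntp.drift"
```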
Resolution
On the affected node, investigate and resolve the 100% usage on /var. Once corrected, restart ntpd:
root@node03:~/#: systemctl restart ntpd
NOTE: A successful restart will not generate any output.
Check the status of ntpd:
root@node03:~/#: systemctl status ntpd
Results similar to the following should be observed:
root@node03:~/#: systemctl status ntpd
● ntpd.service - NTP Server Daemon
   Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2022-09-27 21:21:42 UTC; 37s ago
     Docs: man:ntpd(1)
  Process: 29442 ExecStart=/usr/sbin/start-ntpd start (code=exited, status=0/SUCCESS)
 Main PID: 29463 (ntpd)
    Tasks: 2
   CGroup: /system.slice/ntpd.service
           ├─29463 /usr/sbin/ntpd -p /var/run/ntp/ntpd.pid -g -u ntp:ntp -c /etc/ntp.conf
           └─29464 ntpd: asynchronous dns resolver
Sep 27 21:21:42 node03 ntpd[29463]: Listen normally on 3 bond0 10.241.169.52:123
Sep 27 21:21:42 node03 ntpd[29463]: Listen normally on 4 bond1 192.168.255.22:123
Sep 27 21:21:42 node03 ntpd[29463]: Listen normally on 5 lo [::1]:123
Sep 27 21:21:42 node03 ntpd[29463]: Listen normally on 6 bond0 [fe80::260:16ff:feaa:2a10%11]:123
Sep 27 21:21:42 node03 ntpd[29463]: Listen normally on 7 bond1 [fe80::260:16ff:fea9:b182%12]:123
Sep 27 21:21:42 node03 ntpd[29463]: Listening on routing socket on fd #24 for interface updates
Sep 27 21:21:42 node03 start-ntpd[29442]: Starting network time protocol daemon (NTPD)
Sep 27 21:21:42 node03 systemd[1]: Started NTP Server Daemon.
Verify ntp with ntpq:
root@node03:~/#: /usr/sbin/ntpq -pn
Results similar to the following should be observed:
root@node03:~/#: /usr/sbin/ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*10.241.216.209  10.233.131.242   2 u  966 1024  377    0.558    1.559   0.600
+192.168.255.21  10.241.216.209   3 u  401 1024  377    0.152    0.521   0.420
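In ntpq peer output, a leading asterisk marks the peer that ntpd has selected as its current time source, so its presence is a simple pass or fail signal. The sketch below checks for it; the function name has_sync_peer is hypothetical, and the check says nothing about how large the remaining offset is.

```shell
# Hypothetical helper: succeed if ntpq -pn output on stdin contains a
# line starting with "*", meaning a system peer has been selected.
has_sync_peer() {
  grep -q '^\*'
}

# Example use on the affected node:
#   /usr/sbin/ntpq -pn | has_sync_peer && echo "ntpd has a selected peer"
```

Shortly after a restart ntpd may not have selected a peer yet, so a failed check is worth repeating after a few polling intervals before treating it as a fault.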
Test the resolution by running a manual checkpoint from the utility node.
Affected Products
Avamar, Avamar Data Store, Avamar Data Store Gen3, Avamar Data Store Gen4, Avamar Data Store Gen4S, Avamar Data Store Gen4T, Avamar Data Store Gen5A, Avamar Server, Avamar Virtual Edition
Article Properties
Article Number: 000203791
Article Type: Solution
Last Modified: 18 Jul 2023
Version: 3