Avamar: Checkpoint failed with result "MSG_ERR_BADTIMESYNC."
Summary: Checkpoint failed with result "MSG_ERR_BADTIMESYNC."
This article applies to
This article does not apply to
This article is not tied to any specific product.
Not all product versions are identified in this article.
Symptoms
Checkpoint failed with result "MSG_ERR_BADTIMESYNC."
avmaint cpstatus shows the following error:
Every 2.0s: avmaint cpstatus 20:16:23 2022 <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <cpstatus generation-time="1663935384" tag="cp.20220923121551" status="error" stripes-completed="0" stripes-total="0" start-time="1663935351" end-time="1663935351" result="MSG_ERR_BADTIMESYNC" refcount="1"/>
mapall --parallel 'date' shows a node out of sync:
admin@utility:~/>: mapall --parallel 'date' Using /usr/local/avamar/var/probe.xml (0.0) ssh -q -x -o GSSAPIAuthentication=no admin@192.168.255.2 'date' (0.1) ssh -q -x -o GSSAPIAuthentication=no admin@192.168.255.3 'date' (0.2) ssh -q -x -o GSSAPIAuthentication=no admin@192.168.255.4 'date' (0.3) ssh -q -x -o GSSAPIAuthentication=no admin@192.168.255.5 'date' (0.4) ssh -q -x -o GSSAPIAuthentication=no admin@192.168.255.6 'date' (0.7) ssh -q -x -o GSSAPIAuthentication=no admin@192.168.255.9 'date' (0.6) ssh -q -x -o GSSAPIAuthentication=no admin@192.168.255.8 'date' (0.5) ssh -q -x -o GSSAPIAuthentication=no admin@192.168.255.7 'date' Fri Sep 23 13:05:21 UTC 2022 Fri Sep 23 13:05:21 UTC 2022 Fri Sep 23 13:07:17 UTC 2022 <---- out of sync node Fri Sep 23 13:05:20 UTC 2022 Fri Sep 23 13:05:22 UTC 2022 Fri Sep 23 13:05:20 UTC 2022 Fri Sep 23 13:05:22 UTC 2022 Fri Sep 23 13:05:21 UTC 2022
Verifying ntp with ntpq -pn shows connection refused on the suspect node.
Output edited to show only the affected node:
admin@utility:~/>: mapall --noerror '/usr/sbin/ntpq -pn' Using /usr/local/avamar/var/probe.xml (0.3) ssh -q -x -o GSSAPIAuthentication=no admin@192.168.255.5 '/usr/sbin/ntpq -pn' /usr/sbin/ntpq: read: Connection refused
When verifying the status directly on the affected node we see that ntpd is Active: activating (auto-restart) (Result: resources):
root@node03:~/>: systemctl status ntpd ● ntpd.service - NTP Server Daemon Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled) Drop-In: /run/systemd/generator/ntpd.service.d └─50-insserv.conf-$time.conf Active: activating (auto-restart) (Result: resources) since Fri 2022-09-23 13:22:35 UTC; 1min 58s ago
However the status should reflect Active: active (running):
root@node03:~/#: systemctl status ntpd ● ntpd.service - NTP Server Daemon Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled) Drop-In: /run/systemd/generator/ntpd.service.d └─50-insserv.conf-$time.conf Active: active (running) since Fri 2022-09-23 14:04:37 UTC; 26s ago
Attempting to start ntpd fails:
root@node03:~/#: systemctl start ntpd.service Job for ntpd.service failed because a configured resource limit was exceeded. See "systemctl status ntpd.service" and "journalctl -xe" for details.
During the review of journalctl -xe, 'No space left on the device' messages are seen.
df shows that /var is at 100%:
admin@node03:~/#: df -kh Filesystem Size Used Avail Use% Mounted on devtmpfs 16G 8.0K 16G 1% /dev tmpfs 16G 0 16G 0% /dev/shm tmpfs 16G 50M 16G 1% /run tmpfs 16G 0 16G 0% /sys/fs/cgroup /dev/sda5 9.8G 2.4G 7.0G 26% / /dev/sdg1 183G 8.3G 165G 5% /ssd01 /dev/sda1 979M 50M 878M 6% /boot /dev/sdd1 1.9T 236G 1.6T 13% /data04 /dev/sdc1 1.9T 240G 1.6T 13% /data03 /dev/sde1 1.9T 236G 1.6T 13% /data05 /dev/sdf1 1.9T 238G 1.6T 13% /data06 /dev/sdb1 1.9T 238G 1.6T 13% /data02 /dev/sda7 2.0G 2.0G 0 100% /var <------- /dev/sda3 1.8T 267G 1.6T 15% /data01
Cause
ntpd reliies on /var/lib/ntp/drift/ntp.drift which contains the latest estimate of clock frequency error. If /var is at 100% then ntpd cannot update or create the ntp.drift file and ntp will not function correctly.
Resolution
On the affected node, investigate and resolve 100% usage on /var. Once corrected restart ntpd:
Check the status of ntpd:
Results similar to the following should be observed:
Verify ntp with ntpq:
Results similar to the following should be observed:
Test the resolution by running a manual checkpoint from the utility node.
root@node03:~/#: systemctl restart ntpd
NOTE: A successful restart will not generate any output.
Check the status of ntpd:
root@node03:~/#: systemctl status ntpd
Results similar to the following should be observed:
root@node03:~/#: systemctl status ntpd ● ntpd.service - NTP Server Daemon Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled) Active: active (running) since Tue 2022-09-27 21:21:42 UTC; 37s ago Docs: man:ntpd(1) Process: 29442 ExecStart=/usr/sbin/start-ntpd start (code=exited, status=0/SUCCESS) Main PID: 29463 (ntpd) Tasks: 2 CGroup: /system.slice/ntpd.service ├─29463 /usr/sbin/ntpd -p /var/run/ntp/ntpd.pid -g -u ntp:ntp -c /etc/ntp.conf └─29464 ntpd: asynchronous dns resolver Sep 27 21:21:42 node03 ntpd[29463]: Listen normally on 3 bond0 10.241.169.52:123 Sep 27 21:21:42 node03 ntpd[29463]: Listen normally on 4 bond1 192.168.255.22:123 Sep 27 21:21:42 node03 ntpd[29463]: Listen normally on 5 lo [::1]:123 Sep 27 21:21:42 node03 ntpd[29463]: Listen normally on 6 bond0 [fe80::260:16ff:feaa:2a10%11]:123 Sep 27 21:21:42 node03 ntpd[29463]: Listen normally on 7 bond1 [fe80::260:16ff:fea9:b182%12]:123 Sep 27 21:21:42 node03 ntpd[29463]: Listening on routing socket on fd #24 for interface updates Sep 27 21:21:42 node03 start-ntpd[29442]: Starting network time protocol daemon (NTPD) Sep 27 21:21:42 node03 systemd[1]: Started NTP Server Daemon.
Verify ntp with ntpq:
root@node03:~/#: /usr/sbin/ntpq -pn
Results similar to the following should be observed:
root@node03:~/#: /usr/sbin/ntpq -pn remote refid st t when poll reach delay offset jitter ============================================================================== *10.241.216.209 10.233.131.242 2 u 966 1024 377 0.558 1.559 0.600 +192.168.255.21 10.241.216.209 3 u 401 1024 377 0.152 0.521 0.420
Test the resolution by running a manual checkpoint from the utility node.
Affected Products
Avamar, Avamar Data Store, Avamar Data Store Gen3, Avamar Data Store Gen4, Avamar Data Store Gen4S, Avamar Data Store Gen4T, Avamar Data Store Gen5A, Avamar Server, Avamar Virtual EditionArticle Properties
Article Number: 000203791
Article Type: Solution
Last Modified: 18 Jul 2023
Version: 3
Find answers to your questions from other Dell users
Support Services
Check if your device is covered by Support Services.