Avamar: RMCP Does Not Delete Checkpoints

Summary: This article describes the behavior observed when checkpoints are not deleted from Avamar even though checkpoint validation succeeded.

This article is not tied to any specific product. Not all product versions are identified in this article.

Symptoms

During maintenance activities, checkpoints are not deleted. In addition, if Avamar is integrated with Data Domain, the snapshots do not expire.

admin@av-srv-prod:~/>: cplist --full
cp.20241021171415 Mon Oct 21 13:14:15 2024   valid --- del  nodes   1/1 stripes    277
cp.20241022164600 Tue Oct 22 12:46:00 2024   valid rol del  nodes   1/1 stripes    277
cp.20241022171838 Tue Oct 22 13:18:38 2024   valid --- del  nodes   1/1 stripes    277
cp.20241022193333 Tue Oct 22 15:33:33 2024   valid rol del  nodes   1/1 stripes    277
cp.20241024164621 Thu Oct 24 12:46:21 2024   valid rol ---  nodes   1/1 stripes    277
cp.20241024171054 Thu Oct 24 13:10:54 2024   valid --- ---  nodes   1/1 stripes    277
admin@av-srv-prod:~/>:
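On a server with many checkpoints, the `cplist --full` listing can be summarized with a short script. The following is a minimal Python sketch and not part of Avamar: the function names are hypothetical, the line pattern is inferred only from the sample output above, and reading the third flag (`del`) as "marked for deletion" is an assumption that this article does not confirm.

```python
import re
from datetime import datetime

# Line layout inferred from the sample `cplist --full` output in this article.
CP_LINE = re.compile(
    r"^(?P<tag>cp\.(?P<stamp>\d{14}))\s+"
    r"(?P<listed_time>\w{3}\s+\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\d{4})\s+"
    r"(?P<state>\S+)\s+(?P<check>\S+)\s+(?P<delete>\S+)\s+"
    r"nodes\s+(?P<nodes>\S+)\s+stripes\s+(?P<stripes>\d+)"
)

def parse_cplist(text):
    """Return one dict per checkpoint line; prompts and blank lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = CP_LINE.match(line.strip())
        if match is None:
            continue
        row = match.groupdict()
        # Timestamp embedded in the tag; the article does not state its time zone.
        row["tag_time"] = datetime.strptime(row.pop("stamp"), "%Y%m%d%H%M%S")
        rows.append(row)
    return rows

def flagged_for_deletion(rows):
    """Tags whose third flag is `del` (assumed to mean 'marked for deletion')."""
    return [row["tag"] for row in rows if row["delete"] == "del"]

# Two lines copied from the listing above.
SAMPLE = """\
cp.20241022193333 Tue Oct 22 15:33:33 2024   valid rol del  nodes   1/1 stripes    277
cp.20241024164621 Thu Oct 24 12:46:21 2024   valid rol ---  nodes   1/1 stripes    277
"""

for tag in flagged_for_deletion(parse_cplist(SAMPLE)):
    print(tag)  # prints cp.20241022193333
```

Checkpoints that stay in this list across several maintenance windows match the symptom described here.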

With the mccli command, several checkpoints that were validated (rolling HFS check) are shown as "Failed":

admin@av-srv-prod:~/>: mccli checkpoint show --verbose
0,23000,CLI command completed successfully.
Tag               Time                    Validated Deletable Nodes Stripes Validation Start Time   Validation Finished Time Errors
----------------- ----------------------- --------- --------- ----- ------- ----------------------- ------------------------ ------
cp.20241021171415 2024-10-21 13:14:15 EDT           No        1     277     Not Validated           Not Validated            N/A
cp.20241022164600 2024-10-22 12:46:00 EDT Failed    No        1     277     2024-10-22 12:53:44 EDT 2024-10-22 13:09:46 EDT  1
cp.20241022171838 2024-10-22 13:18:38 EDT           No        1     277     Not Validated           Not Validated            N/A
cp.20241022193333 2024-10-22 15:33:33 EDT Failed    No        1     277     2024-10-22 15:42:07 EDT 2024-10-22 15:56:48 EDT  1
cp.20241024164621 2024-10-24 12:46:21 EDT Failed    No        1     277     2024-10-24 12:53:09 EDT 2024-10-24 13:08:04 EDT  1
cp.20241024171054 2024-10-24 13:10:54 EDT           No        1     277     Not Validated           Not Validated            N/A
admin@av-srv-prod:~/>: 
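Because the Validated column can be empty, splitting the `mccli checkpoint show --verbose` output on whitespace would shift the remaining columns. A more robust approach is to take the column boundaries from the dashed ruler line. The sketch below illustrates this in Python; the function names are hypothetical, and the table layout is assumed to match the sample output above.

```python
import re

def parse_mccli_table(text):
    """Parse the fixed-width table using the dashed ruler line for column spans."""
    lines = text.splitlines()
    ruler_index = None
    for index, line in enumerate(lines):
        stripped = line.strip()
        # The ruler is the first line made up only of dashes and spaces.
        if index > 0 and stripped and set(stripped) <= {"-", " "}:
            ruler_index = index
            break
    if ruler_index is None:
        return []
    spans = [m.span() for m in re.finditer(r"-+", lines[ruler_index])]
    header = lines[ruler_index - 1]
    names = [header[start:end].strip() for start, end in spans]
    rows = []
    for line in lines[ruler_index + 1:]:
        if not line.startswith("cp."):
            continue  # skip prompts and anything that is not a checkpoint row
        rows.append(
            {name: line[start:end].strip() for name, (start, end) in zip(names, spans)}
        )
    return rows

def failed_validation(rows):
    """Tags of checkpoints that the table reports as Failed."""
    return [row["Tag"] for row in rows if row.get("Validated") == "Failed"]
```

Feeding the captured command output to `parse_mccli_table` and then to `failed_validation` gives the list of checkpoints that mccli reports as failed, which can be compared against the `cplist --full` listing.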

When the remove checkpoint (rmcp) command is run, no checkpoints are deleted.

admin@av-srv-prod:~/>: avmaint rmcp --full --ava
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<checkpointrmlist has-approved-checkpoint="false">
  <checkpoint
    tag="cp.20241021171415"
    deleted="false"
    ddr-deleted="false"/>
  <checkpoint
    tag="cp.20241022164600"
    deleted="false"
    ddr-deleted="false"/>
  <checkpoint
    tag="cp.20241022171838"
    deleted="false"
    ddr-deleted="false"/>
  <checkpoint
    tag="cp.20241022193333"
    deleted="false"
    ddr-deleted="false"/>
  <checkpoint
    tag="cp.20241024164621"
    deleted="false"
    ddr-deleted="false"/>
  <checkpoint
    tag="cp.20241024171054"
    deleted="false"
    ddr-deleted="false"/>
</checkpointrmlist>
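The XML returned by `avmaint rmcp` can also be checked programmatically. The following Python sketch uses only the element and attribute names visible in the output above; the function name is hypothetical, and interpreting `has-approved-checkpoint` as "at least one checkpoint is approved" is an assumption based on the attribute name alone.

```python
import xml.etree.ElementTree as ET

def summarize_rmcp(xml_text):
    """Summarize an `avmaint rmcp` result document."""
    # Strip surrounding whitespace so the XML declaration is at the very start.
    root = ET.fromstring(xml_text.strip().encode("utf-8"))
    checkpoints = list(root.iter("checkpoint"))
    return {
        "has_approved": root.get("has-approved-checkpoint") == "true",
        # Checkpoints that rmcp did not delete on the Avamar side.
        "remaining": [cp.get("tag") for cp in checkpoints
                      if cp.get("deleted") != "true"],
        # Checkpoints whose Data Domain copy was not deleted.
        "ddr_remaining": [cp.get("tag") for cp in checkpoints
                          if cp.get("ddr-deleted") != "true"],
    }
```

For the output shown above, every tag would appear in both `remaining` and `ddr_remaining`, and `has_approved` would be `False`.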

On the Data Domain, the snapshots do not expire automatically. They must be expired manually:

avboost@dd-srv-prod# snapshot list mtree /data/col1/avamar-1234567890
Snapshot Information for MTree: /data/col1/avamar-1234567890
----------------------------------------------
Name                Pre-Comp (GiB)   Create Date         Retain Until        Status
-----------------   --------------   -----------------   -----------------   -------
cp.20241015171741          69287.4   Oct 15 2024 13:19   Oct 22 2024 13:13   expired
cp.20241015194118          69287.4   Oct 15 2024 15:43   Oct 22 2024 13:13   expired
...
...
cp.20241020164654          65247.4   Oct 20 2024 12:49
cp.20241020171602          65262.9   Oct 20 2024 13:18
cp.20241021164757          65257.4   Oct 21 2024 12:50
cp.20241021171415          65272.9   Oct 21 2024 13:16
cp.20241022164600          65280.0   Oct 22 2024 12:48
-----------------   --------------   -----------------   -----------------   -------
...
avboost@dd-srv-prod# 
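To list the snapshots that still need attention, the `snapshot list mtree` output can be filtered for checkpoint snapshots whose Status column is not `expired`. This is a minimal Python sketch under stated assumptions: the function name is hypothetical, columns are assumed to be separated by two or more spaces as in the sample above, and an empty Status column is taken to mean that the snapshot has not expired.

```python
import re

def unexpired_snapshots(listing):
    """Return names of cp.* snapshots whose Status column is not 'expired'."""
    names = []
    for line in listing.splitlines():
        # Dates contain single spaces, so split only on runs of 2+ spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if not fields[0].startswith("cp."):
            continue  # header, ruler, "..." and prompt lines
        # Columns: Name, Pre-Comp, Create Date, Retain Until, Status
        status = fields[4] if len(fields) >= 5 else ""
        if status != "expired":
            names.append(fields[0])
    return names
```

The resulting names can be compared with the checkpoints that Avamar still lists, so that only snapshots without a matching checkpoint are expired manually.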

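Another observed behavior is that commands run slowly on the Avamar Server. The load average remains high even though the server is not running any jobs or backups.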

 

 

Cause

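Several factors can cause this behavior. In each case, the cause was identified after a thorough analysis of the processes running on the Avamar Server (using top or ps -ef). Some scenarios include: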

  • Stale Perl processes
  • Obsolete custom replication scripts
  • Custom reports
  • Old avtar processes

In some cases, evidence can be found in the process list:

admin    15007  0.0  0.0   9664  2812 ?        Ss    2023   0:00 bash -c export TERM=${TERM:-dumb} ; /usr/bin/ssh-agent /tmp/dpnctl-run-self.14963.aux
admin    15042  0.0  0.0   9528  2192 ?        S     2023   0:00  \_ /bin/bash /tmp/dpnctl-run-self.14963.aux
admin    15043  0.0  0.0  30792   680 ?        Ss    2023   0:52      \_ /usr/bin/ssh-agent /tmp/dpnctl-run-self.14963.aux
admin    15049 99.6  0.1  81996 39340 ?        R     2023 272656:21      \_ /usr/bin/perl /usr/local/avamar/bin/dpnctl --rerun --mcs_user=root stop 
admin    26975     1  0  80   0 -  3440 -      Oct08 ?        00:00:00 bash -c ./avReplication.40 --report --csv --quiet
admin    27290 25935  0  80   0 -  3440 -      Oct08 ?        03:55:24 bash -c ./avReplication.40 --quiet --report --short-status
admin    27761 26975  0  80   0 -  3440 -      Oct08 ?        03:50:39 bash -c ./avReplication.40 --report --csv --quiet
root      9046  0.0  0.0 314212  6792 ?        SNl  Nov08   0:00 /usr/local/avamar/bin/avtar.bin --vardir=/usr/local/avamar/var --bindir=/usr/local/avamar/bin --sysdir=/usr/local/avamar/etc --sysdir="/usr/l
root     20385  0.0  0.0 314212  6624 ?        SNl  Nov08   0:00 /usr/local/avamar/bin/avtar.bin --vardir=/usr/local/avamar/var --bindir=/usr/local/avamar/bin --sysdir=/usr/local/avamar/etc --sysdir="/usr/l
root     22784  0.0  0.0 314212  6544 ?        SNl  Nov08   0:00 /usr/local/avamar/bin/avtar.bin --vardir=/usr/local/avamar/var --bindir=/usr/local/avamar/bin --sysdir=/usr/local/avamar/etc --sysdir="/usr/l
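Long-running processes like the ones above can be shortlisted by elapsed time. The sketch below is a Python illustration, not an Avamar tool. It assumes the input comes from `ps -eo pid,user,etime,args`, where `etime` has the form `[[DD-]hh:]mm:ss`. The name patterns are taken from the evidence above, and the seven-day threshold is an arbitrary example value. The result is only a list of candidates to review manually.

```python
# Substrings seen in the evidence above; adjust for the environment.
SUSPECT_PATTERNS = ("dpnctl", "avReplication", "avtar.bin")

def etime_to_seconds(etime):
    """Convert a ps `etime` value ([[DD-]hh:]mm:ss) to seconds."""
    days = 0
    if "-" in etime:
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    while len(parts) < 3:
        parts.insert(0, 0)  # pad missing hours
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds

def find_stale(ps_text, min_age_seconds=7 * 86400, patterns=SUSPECT_PATTERNS):
    """Return (pid, user, age_seconds, args) for old processes matching a pattern."""
    stale = []
    for line in ps_text.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 4 or not fields[0].isdigit():
            continue  # header line or malformed row
        pid, user, etime, args = fields
        age = etime_to_seconds(etime)
        if age >= min_age_seconds and any(p in args for p in patterns):
            stale.append((int(pid), user, age, args))
    return stale

# Usage (run on the server, then review each candidate by hand):
#   import subprocess
#   out = subprocess.run(["ps", "-eo", "pid,user,etime,args"],
#                        capture_output=True, text=True).stdout
#   for pid, user, age, args in find_stale(out):
#       print(pid, user, age // 86400, "days", args)
```

Filtering on both age and command name keeps recently started avtar or replication jobs out of the list, since those may be legitimate work in progress.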

 

Resolution

1. Log in to the Avamar Server as admin and switch to root:

su -

2. Run the following commands to analyze the running processes thoroughly:

top
ps -aux --forest
ps -ef

 

Warning: If there is any doubt about a process, do not terminate it.

 

3. Once the offending process is identified, terminate it using its process ID (PID):

kill <pid>

4. If the process does not terminate, force it:

kill -9 <pid>

5. Commands on the server should respond quickly again.

6. Run RMCP:

avmaint rmcp --full --ava

7. The following two commands now show the checkpoints correctly again:

cplist --full
mccli checkpoint show --verbose

Example:

admin@av-srv-prod:~/>: cplist --full
cp.20241024164621 Thu Oct 24 12:46:21 2024   valid rol ---  nodes   1/1 stripes    277
cp.20241024171054 Thu Oct 24 13:10:54 2024   valid --- ---  nodes   1/1 stripes    277
admin@av-srv-prod:~/>: 
admin@av-srv-prod:~/>: mccli checkpoint show --verbose
0,23000,CLI command completed successfully.
Tag               Time                    Validated Deletable Nodes Stripes Validation Start Time   Validation Finished Time Errors
----------------- ----------------------- --------- --------- ----- ------- ----------------------- ------------------------ ------
cp.20241024164621 2024-10-24 12:46:21 EDT Validated No        1     277     2024-10-24 12:53:09 EDT 2024-10-24 13:08:04 EDT  0
cp.20241024171054 2024-10-24 13:10:54 EDT           No        1     277     Not Validated           Not Validated            N/A
admin@av-srv-prod:~/>: 

8. Verify that the snapshots on the Data Domain show the "expired" status.
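The escalation in steps 3 and 4 (a plain `kill`, then `kill -9` only if the process is still present) can be sketched in Python as follows. This is an illustration only: the function names and the ten-second grace period are hypothetical, and it should be used only on a PID that has already been confirmed as safe to terminate.

```python
import os
import signal
import time

def _exists(pid):
    """True if the PID is still present (an unreaped zombie also counts)."""
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without sending anything
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but owned by another user
    return True

def stop_process(pid, grace_seconds=10.0, poll_interval=0.2):
    """Send SIGTERM, wait up to grace_seconds, then send SIGKILL if needed.

    Returns 'gone' if the PID did not exist, 'terminated' if SIGTERM was
    enough, or 'killed' if SIGKILL had to be sent.
    """
    try:
        os.kill(pid, signal.SIGTERM)  # same as `kill <pid>`
    except ProcessLookupError:
        return "gone"
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        if not _exists(pid):
            return "terminated"
        time.sleep(poll_interval)
    try:
        os.kill(pid, signal.SIGKILL)  # same as `kill -9 <pid>`
    except ProcessLookupError:
        return "terminated"  # exited just after the grace period
    return "killed"
```

Giving the process a grace period before SIGKILL lets it clean up where it can, which is the same reason the steps try a plain `kill` first.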

Affected Products

Avamar, Avamar Server
Article Properties
Article Number: 000255751
Article Type: Solution
Last Modified: 16 Apr 2025
Version:  3