Avamar: Checkpoint fails with result MSG_ERR_DDR_ERROR due to Data Domain capacity issues

Summary: Data Domain (DD) space usage in Data Collection has exceeded the 100% threshold, causing Avamar checkpoints to fail with MSG_ERR_DDR_ERROR.


Symptoms

Both scheduled and manual checkpoints are failing with MSG_ERR_DDR_ERROR.

For example:

status.dpn
Mon Aug 19 13:33:53 WEST 2019  [AV-XXX] Mon Aug 19 12:33:53 2019 UTC (Initialized Wed Feb 11 12:42:51 2015 UTC)
Node   IP Address     Version   State   Runlevel  Srvr+Root+User Dis Suspend Load UsedMB Errlen  %Full   Percent Full and Stripe Status by Disk
0.0     10.xxx.x.xx  19.1.0-38  ONLINE fullaccess mhpu+0hpu+0hpu   1 false   0.3 0 31802 52675327  11.8%  12%(onl:4148) 11%(onl:4133) 11%(onl:4135) 11%(onl:4140)  11%(onl:4138) 11%(onl:4147)
Srvr+Root+User Modes = migrate + hfswriteable + persistwriteable + useraccntwriteable

System ID: XXXXXXXX@00:1E:67:87:C4:6B

All reported states=(ONLINE), runlevels=(fullaccess), modes=(mhpu+0hpu+0hpu)
System-Status: ok
Access-Status: full

Checkpoint failed with result MSG_ERR_DDR_ERROR : cp.20190819054530 started Mon Aug 19 06:45:31 2019 ended Mon Aug 19 06:58:24 2019, completed 24840 of 24841 stripes
Last GC: finished Mon Aug 19 05:21:35 2019 after 20m 58s >> recovered 71.89 MB (MSG_ERR_DDR_ERROR)
Last hfscheck failed with result MSG_ERR_DDR_ERROR : started Mon Aug 19 05:34:23
Note: As the example above shows, the failure is not necessarily limited to checkpoints. Garbage collection and hfscheck can report the same MSG_ERR_DDR_ERROR result.
 
 

The Data Domain log (/usr/local/avamar/var/ddrmaintlogs/ddrmaint.log) reports an "I/O error":

grep -i Error /usr/local/avamar/var/ddrmaintlogs/ddrmaint.log|grep -v -i "Error not set"
Aug 19 05:15:50 av-XXX ddrmaint.bin[49665]: Error: <4710>Datadomain garbage collect operation failed.
Aug 19 05:34:18 av-XXX ddrmaint.bin[52434]: Warning: Calling DDR_CREATE_SNAPSHOT returned result code:5009 message:I/O error
Aug 19 05:34:18 av-XXX ddrmaint.bin[52434]: Error: cp-create::execute_create_checkpoint - Failed to create checkpoint for avamar-XXXXXXX to snapshot cp.20190819042140 on ddXXX, DDR result code: 5009, desc: I/O error
Aug 19 05:34:18 av-XXX ddrmaint.bin[52434]: Error: <4760>Datadomain checkpoint create operation failed.
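
To see at a glance which Data Domain result codes are behind the failures, the matching log lines can be tallied. The following is a minimal sketch, assuming the error lines follow the "DDR result code: NNNN, desc: ..." format shown above. The function name summarize_ddr_codes is hypothetical and is not part of Avamar. Lines in other formats, such as the Warning line above, are ignored.

```shell
#!/bin/sh
# summarize_ddr_codes: hypothetical helper, not an Avamar tool.
# Reads ddrmaint.log lines on stdin and prints one line per distinct
# DDR result code in the form: <count> <code> <description>
summarize_ddr_codes() {
    # Keep only lines carrying "DDR result code: NNNN, desc: text",
    # reduce them to "NNNN text", then count the duplicates.
    sed -n 's/.*DDR result code: \([0-9][0-9]*\), desc: \(.*\)$/\1 \2/p' |
        sort | uniq -c |
        awk '{ $1 = $1; print }'   # strip the padding added by uniq -c
}
```

Possible usage on the Avamar utility node:

summarize_ddr_codes < /usr/local/avamar/var/ddrmaintlogs/ddrmaint.log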

Cause

The Data Domain (DD) space has reached capacity.

This can be verified by doing the following:

1. Connect to the Data Domain. If required, see the article "Avamar: How To Access a Data Domain System" for instructions.

2. Check for any alerts:

alerts show current
Id      Post Time                  Severity   Class        Object          Message
-----   ------------------------   --------   ----------   -------------   ----------------------------------------------------------------------------
p0-87   Fri Aug 16 21:55:33 2019   CRITICAL   Filesystem   FilesysType=2   EVT-SPACE-00004: Space usage in Data Collection has exceeded 100% threshold.
-----   ------------------------   --------   ----------   -------------   ----------------------------------------------------------------------------
There is 1 active alert.
 

3. Run the "df" command:

df
Active Tier:
Resource           Size GiB   Used GiB   Avail GiB   Use%   Cleanable GiB*
----------------   --------   --------   ---------   ----   --------------
/data: pre-comp           -       15.1           -      -                -
/data: post-comp    30731.1    30608.8       122.3   100%             13.1
/ddvar                 49.1        8.7        37.9    19%                -
/ddvar/core           158.3        0.1       150.2     0%                -
----------------   --------   --------   ---------   ----   --------------
 * Estimated based on last cleaning of 2019/07/15 06:04:40.

Cloud Tier
Resource           Size GiB   Used GiB   Avail GiB   Use%   Cleanable GiB
----------------   --------   --------   ---------   ----   -------------
/data: pre-comp           -       19.0           -      -               -
/data: post-comp   33487.7*       20.4     33467.4     0%             0.0
----------------   --------   --------   ---------   ----   -------------
* Post-comp size is based on CLOUDTIER-CAPACITY license and might not be same as the cloud storage.

Total:
Resource           Size GiB   Used GiB   Avail GiB   Use%   Cleanable GiB
----------------   --------   --------   ---------   ----   -------------
/data: pre-comp           -       34.1           -      -               -
/data: post-comp    33829.9       36.4     33793.4     0%             0.0
/ddvar                 49.1        8.7        37.9    19%               -
/ddvar/core           158.3        0.1       150.2     0%               -
----------------   --------   --------   ---------   ----   -------------
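
The figure that matters in the output above is the Use% value of the "/data: post-comp" row in the Active Tier section. The following is a minimal sketch for extracting it from saved "df" output, assuming the output uses the tiered layout shown above (an "Active Tier:" heading followed by "Cloud Tier" and "Total" sections). The function name active_tier_use and the file name df_output.txt are hypothetical.

```shell
#!/bin/sh
# active_tier_use: hypothetical helper, not a Data Domain command.
# Reads saved Data Domain "df" output on stdin and prints the Use%
# of the Active Tier "/data: post-comp" row, without the % sign.
active_tier_use() {
    awk '
        /^Active Tier/        { in_active = 1; next }   # section start
        /^(Cloud Tier|Total)/ { in_active = 0 }         # section end
        in_active && /^\/data: post-comp/ {
            # The Use% column is the only field of the form NN%.
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9]+%$/) { sub(/%/, "", $i); print $i; exit }
        }
    '
}
```

Possible usage, with the "df" output saved to a local file first:

active_tier_use < df_output.txt

A value at or near 100 is consistent with the EVT-SPACE-00004 alert shown in step 2.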

Resolution

1. On Data Domain:

a. Check the file system cleaning status:

filesys clean status 
 

Example outputs:

Cleaning is not running:

Cleaning finished at 2019/08/19 21:37:46
 

Cleaning is running:

Cleaning started at 2019/08/19 06:00:02: phase 3 of 6 (pre-enumeration)
  1.6% complete,     0 GiB free; time: phase  1:26:05, total  1:48:11
 

b. If cleaning is running, wait for it to complete and then check the capacity using the df command.

c. If cleaning is not running, check the file system cleaning schedule:

filesys clean show schedule
 

Example output:

Filesystem cleaning is scheduled to run "Tue" at "0700".
 

d. If required, start a manual file system clean, and monitor it to completion using the "filesys clean watch" command, as indicated in the output below.

filesys clean start
Cleaning started.  Use 'filesys clean watch' to monitor progress.
 
Note: If the issue remains after file system cleaning, engage a Data Domain file system engineer to assist.
 
 

2. On Avamar:

a. Once the capacity issues on Data Domain have been resolved, perform a manual checkpoint:

mccli checkpoint create --override_maintenance_scheduler
Note: The mccli command can take some time to complete because it also performs a Management Console Server (MCS) backup (also known as a flush).
 

b. Monitor to completion, and verify that it is successful.

c. Monitor the grid through the next maintenance window to verify that all Avamar maintenance tasks (checkpoint, checkpoint validation (hfscheck), and garbage collection) complete successfully.
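
The verification in steps b and c can be partly scripted by scanning the status.dpn output for the error string from the Symptoms section. The following is a minimal sketch; the function name maintenance_clean is hypothetical and is not an Avamar command. It only checks for MSG_ERR_DDR_ERROR and does not replace a review of the full output.

```shell
#!/bin/sh
# maintenance_clean: hypothetical helper, not an Avamar command.
# Reads status.dpn output on stdin. Prints every line that still
# reports MSG_ERR_DDR_ERROR and returns 1 if any such line exists;
# returns 0 when none is found.
maintenance_clean() {
    if grep 'MSG_ERR_DDR_ERROR'; then
        return 1    # at least one task still reports the DDR error
    fi
    return 0        # no DDR error found in the supplied output
}
```

Possible usage on the Avamar utility node after the maintenance window:

status.dpn | maintenance_clean && echo "No MSG_ERR_DDR_ERROR reported"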

Affected Products

Avamar

Products

Avamar, Avamar Server

Article Properties

Article Number: 000046232
Article Type: Solution
Last Modified: 23 Jul 2025
Version: 6