Avamar: Garbage collection failing with MSG_ERR_TRYAGAINLATER
Summary: Starting with Avamar version 7, backups are allowed to run during the Garbage Collection (GC) maintenance job. As a result, GC can occasionally end with the message "MSG_ERR_TRYAGAINLATER".
Symptoms
The Avamar Garbage Collection (GC) maintenance job terminates with the error MSG_ERR_TRYAGAINLATER.
To verify the issue:
1. Run the status.dpn command:
admin@avamarhost:~/>: status.dpn
...
Last GC: finished Mon Dec 23 06:08:00 2013 after 03m 05s >> recovered 0.00 KB (MSG_ERR_TRYAGAINLATER)
-- Or --
2. Run the dumpmaintlogs command:
admin@avamarhost:~/>: dumpmaintlogs --types=gc --days=1
2013/12/23-12:08:00.9673 {0.0} <4202> failed garbage collection with error MSG_ERR_TRYAGAINLATER
-- Or --
3. Create a Service Request and have the Dell Technologies Avamar Support team confirm the issue.
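The first two checks above can be scripted. The following is a minimal sketch, not a supported tool: gc_tryagain_seen is a hypothetical helper name, and it only searches whatever text it receives on standard input for the error string shown in the sample output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Avamar): reads command output on stdin
# and reports whether the GC error string appears anywhere in it.
gc_tryagain_seen() {
    if grep -q 'MSG_ERR_TRYAGAINLATER'; then
        echo "GC reported MSG_ERR_TRYAGAINLATER"
        return 0
    fi
    echo "MSG_ERR_TRYAGAINLATER not found"
    return 1
}

# Intended usage on the Avamar utility node, with the commands from steps 1 and 2:
#   status.dpn | gc_tryagain_seen
#   dumpmaintlogs --types=gc --days=1 | gc_tryagain_seen
```

The function returns 0 when the string is found and 1 otherwise, so it can also be used in an if statement or a monitoring script.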
Cause
This is expected behavior and occurs as new data is added to Avamar from backups.
When a storage container, or "stripe," on Avamar splits into two, it is called "Index Stripe Splitting."
This occurs rarely and only after certain capacity intervals are reached, depending on node size, node count, version, and so forth. This maintenance task cannot occur during GC.
If an index stripe is splitting when certain GC operations are attempted, GC exits with MSG_ERR_TRYAGAINLATER.
If GC is running on an index stripe that must split, the split waits until GC operations have completed.
Index stripes on the various nodes of a grid tend to split at around the same time. Sometimes this may take a few days to complete.
Resolution
Avamar is working as designed. When index stripe splitting completes, garbage collection resumes.
As a workaround, do not run backups during GC.
Additional Information
- This behavior does not occur on a grid which is in "steady state" (has steady or decreasing capacity utilization) since all the stripes which must exist already do exist.
- This behavior does not occur on a grid which has become full and has since reduced in capacity (without having been expanded with new nodes). This is because all the stripes which can be created on a grid already exist.
- The behavior may occur after a node has been added and additional capacity exists to split stripes further.
- The issue might recur from time to time and is more likely to be seen on Avamar grids which are experiencing sustained data growth or which have recently been expanded with additional nodes.
- The behavior may persist over a series of days.
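To see over how many days the error has persisted, the GC maintenance log lines can be reduced to a list of distinct dates. This is a rough sketch only: gc_tryagain_days is a hypothetical helper name, and it assumes each log line begins with a timestamp in the YYYY/MM/DD-HH:MM:SS form shown in the dumpmaintlogs sample in the Symptoms section.

```shell
#!/bin/sh
# Hypothetical helper (not part of Avamar): reads dumpmaintlogs output on
# stdin and prints each distinct date on which GC failed with the error.
# Assumes the first field looks like 2013/12/23-12:08:00.9673.
gc_tryagain_days() {
    grep 'MSG_ERR_TRYAGAINLATER' \
        | awk '{ split($1, ts, "-"); print ts[1] }' \
        | sort -u
}

# Intended usage on the Avamar utility node. The value 7 is only an example;
# the article itself shows --days=1.
#   dumpmaintlogs --types=gc --days=7 | gc_tryagain_days
```

Piping the result through wc -l gives the number of days affected within the chosen window.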