Avamar: How to understand the output generated by the cplist command
Summary: How to understand the output generated by the Avamar cplist command.
Instructions
An Avamar checkpoint is a set of read-only directories on the Avamar data nodes.
It is like a point-in-time snapshot of the Avamar grid. It may be useful for rollback purposes should Avamar experience a serious issue that cannot be corrected.
The cplist tool lists the checkpoints which exist on an Avamar grid.
- The tool can be run by anyone with access to the Avamar Utility Node.
- Incorrect assumptions about the state of checkpoints can result in data loss or an irrecoverable Avamar system.
This article helps the reader interpret the tool's output.
Here is sample output from the cplist command:
cp.20130915110057 Sun Sep 15 12:00:57 2013 valid rol --- nodes 3/3 stripes 3530
cp.20130915110654 Sun Sep 15 12:06:54 2013 valid --- --- nodes 3/3 stripes 3530
Using the first checkpoint in the list, the meaning of each data field is as follows:
- cp.20130915110057
  - This is the identification tag for the checkpoint and corresponds to the time the checkpoint was started.
  - The tag has the format cp.YYYYMMDDHHMMSS
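- Sun Sep 15 12:00:57 2013
  - The day, date, and time when the checkpoint was created. This corresponds with the checkpoint tag, although in the samples in this article the hour differs between the two (for example, 11:00:57 in the tag and 12:00:57 here).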
- valid
  - If this field shows 'valid', the checkpoint is 'wholesome'.
  - Validity denotes whether the checkpoint is useful for rollback purposes.
  - If this field shows 'valid', it does not mean that the checkpoint has undergone HFScheck validation.
  - The validity field is superfluous when running "cplist", since by default the command shows usable checkpoints.
  - Running "cplist --full" shows all checkpoints on the grid, including any which are not usable for rollback purposes.
- rol
  - This field shows the type of HFScheck validation which was run on the checkpoint.
  - Possible types are: 'hfs', 'rdc', 'par', 'rol'
  - hfs or full - means that validation was run on all stripes in the checkpoint.
  - rol - means that validation has checked all new or modified stripes in the checkpoint.
    - Research has shown that when data integrity issues occur, usually the affected stripes are those which are newly created or recently modified.
    - For this reason, Avamar engineering recommends that rolling validation be considered practically as reliable as a lengthier full HFScheck validation.
    - Depending on the Avamar data ingestion rate, a rolling HFScheck may also check a proportion of a checkpoint's unmodified stripes.
    - This means that eventually all stripes, even those which have not been modified, may be integrity checked.
  - rdc - means that validation has been completed but that one node did not participate in the validation. The validation type is not specified.
    - The integrity of the data cannot be guaranteed for checkpoints tagged as rdc.
    - Such a check provides better confidence in the data integrity than no validation at all.
- ---
  - This field indicates whether the checkpoint can be deleted, according to the checkpoint retention settings in force on the Avamar server.
  - A value of 'del' (as in Example 3) marks a checkpoint that is eligible for deletion.
  - Checkpoint retention is controlled by the "cphfschecked" and "cpmostrecent" parameters.
  - Checkpoint retention should be left at its default unless advised otherwise by a Support Engineer.
  - Incorrect checkpoint retention settings can put an Avamar grid at risk of data loss, or can cause operating system capacity issues.
- nodes 3/3
  - The first number is the refcount.
    - This reports the number of nodes that responded to the cplist command.
    - This value does not necessarily mean the number of nodes which are online.
  - The second number is the nodecount.
    - This refers to the number of nodes which participated when the checkpoint was originally taken.
    - In other words, how many data nodes contain that particular checkpoint directory.
  - Carefully note the state of the grid (total number of nodes and number of nodes online) and how cplist was run before interpreting the output of these two fields.
- stripes 3530
  - This field displays the total number of stripes captured in the checkpoint.
  - A rolling checkpoint validation validates a subset of this number of stripes.
  - A full checkpoint validation validates all of them.
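The field layout described above can be captured in a small parser. The following is a minimal sketch, not an Avamar tool: the regular expression is modelled only on the sample lines in this article, and the names `Checkpoint`, `LINE_RE`, and `parse_cplist_line` are hypothetical. Real cplist output may differ between Avamar versions.

```python
import re
from dataclasses import dataclass

# Assumption: the pattern follows the sample lines shown in this article.
LINE_RE = re.compile(
    r"^(?P<tag>cp\.\d{14})\s+"
    r"(?P<created>\w{3} \w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4})\s+"
    r"(?P<validity>\S+)\s+(?P<validation>\S+)\s+(?P<deletion>\S+)\s+"
    r"nodes\s+(?P<refcount>\d+)/(?P<nodecount>\d+)\s+"
    r"stripes\s+(?P<stripes>\d+)\s*$"
)


@dataclass
class Checkpoint:
    tag: str          # cp.YYYYMMDDHHMMSS
    created: str      # day, date, and time as printed
    validity: str     # 'valid' or 'invalid'
    validation: str   # 'hfs', 'rdc', 'par', 'rol', or '---'
    deletion: str     # 'del' or '---'
    refcount: int     # nodes that responded
    nodecount: int    # nodes that participated in the checkpoint
    stripes: int      # stripes captured in the checkpoint


def parse_cplist_line(line):
    """Return a Checkpoint for a checkpoint line, or None for any other line."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    d = m.groupdict()
    return Checkpoint(
        d["tag"], d["created"], d["validity"], d["validation"], d["deletion"],
        int(d["refcount"]), int(d["nodecount"]), int(d["stripes"]),
    )


cp = parse_cplist_line(
    "cp.20130915110057 Sun Sep 15 12:00:57 2013 valid rol --- nodes 3/3 stripes 3530"
)
print(cp.tag, cp.validation, cp.refcount, cp.nodecount, cp.stripes)
```

Lines that do not look like checkpoint entries (for example, error messages) return None rather than raising, so the function can be applied to every line of captured output.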
Examples of cplist output
Example 1:
cp.20130914190119 Sat Sep 14 20:01:19 2013 valid rol --- nodes 1/1 stripes 1401
cp.20130914192153 Sat Sep 14 20:21:53 2013 valid --- --- nodes 1/1 stripes 1401
- This is a single node grid
- There are two 'wholesome' or usable checkpoints
- cp.20130914190119 was validated with a rolling HFScheck; the other checkpoint has not been validated
- Both of these checkpoints captured 1401 stripes
Example 2:
cp.20130911150620 Wed Sep 11 11:06:20 2013 valid rol --- nodes 9/9 stripes 121107
cp.20130911160421 Wed Sep 11 12:04:21 2013 valid --- --- nodes 9/9 stripes 121107
cp.20130912151051 Thu Sep 12 11:10:51 2013 valid --- --- nodes 8/9 stripes 121107
- Nine nodes participated in the creation of each checkpoint
- The oldest of the three checkpoints (cp.20130911150620) has been validated with a rolling HFScheck
- The most recent checkpoint (September 12) is currently inaccessible on one of the nine nodes which form that checkpoint
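The situation in Example 2, where fewer nodes respond than originally took part in a checkpoint, can be detected by comparing the two numbers in the nodes field. The sketch below assumes the line layout shown in this article; `NODES_RE` and `partially_reachable` are hypothetical names, not part of Avamar.

```python
import re

# Assumption: each checkpoint line starts with the tag and contains "nodes R/N".
NODES_RE = re.compile(r"^(cp\.\d{14})\s.*\snodes\s+(\d+)/(\d+)")


def partially_reachable(lines):
    """Return (tag, refcount, nodecount) for checkpoints with refcount < nodecount."""
    found = []
    for line in lines:
        m = NODES_RE.match(line.strip())
        if m is None:
            continue  # not a checkpoint line
        refcount, nodecount = int(m.group(2)), int(m.group(3))
        if refcount < nodecount:
            found.append((m.group(1), refcount, nodecount))
    return found


example_2 = [
    "cp.20130911150620 Wed Sep 11 11:06:20 2013 valid rol --- nodes 9/9 stripes 121107",
    "cp.20130911160421 Wed Sep 11 12:04:21 2013 valid --- --- nodes 9/9 stripes 121107",
    "cp.20130912151051 Thu Sep 12 11:10:51 2013 valid --- --- nodes 8/9 stripes 121107",
]
print(partially_reachable(example_2))
```

A result here only says that a node did not respond when the list was produced; the state of the grid still has to be checked before drawing conclusions.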
Example 3:
cp.20130915110057 Sun Sep 15 12:00:57 2013 valid rol --- nodes 3/3 stripes 3530
cp.20130915110654 Sun Sep 15 12:06:54 2013 valid --- del nodes 3/3 stripes 3530
cp.20130916053830 Mon Sep 16 06:38:30 2013 valid --- --- nodes 3/3 stripes 3530
cp.20130916060236 Mon Sep 16 07:02:36 2013 valid --- --- nodes 2/2 stripes 3530
- cp.20130915110654 is eligible to be deleted according to checkpoint retention rules
- cp.20130916060236 was taken while one of the three nodes was offline
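As an illustration of reading the deletion field in Example 3, the following sketch picks out the checkpoints marked 'del'. It assumes the whitespace-separated layout of the samples in this article, where the deletion field is the ninth token; `deletable_checkpoints` is a hypothetical helper name.

```python
def deletable_checkpoints(lines):
    """Return the tags of checkpoints whose deletion field reads 'del'."""
    tags = []
    for line in lines:
        fields = line.split()
        # Assumed token order, taken from the samples in this article:
        # tag, weekday, month, day, time, year, validity, validation, deletion, ...
        if len(fields) >= 9 and fields[0].startswith("cp.") and fields[8] == "del":
            tags.append(fields[0])
    return tags


example_3 = [
    "cp.20130915110057 Sun Sep 15 12:00:57 2013 valid rol --- nodes 3/3 stripes 3530",
    "cp.20130915110654 Sun Sep 15 12:06:54 2013 valid --- del nodes 3/3 stripes 3530",
    "cp.20130916053830 Mon Sep 16 06:38:30 2013 valid --- --- nodes 3/3 stripes 3530",
    "cp.20130916060236 Mon Sep 16 07:02:36 2013 valid --- --- nodes 2/2 stripes 3530",
]
print(deletable_checkpoints(example_3))  # ['cp.20130915110654']
```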
Example 4:
If an Avamar grid is integrated with Data Domain (DD), the cplist output can show checkpoints as invalid when the Data Domain becomes unavailable.
For example:
If the DD is online:
cplist
cp.20130830173413 Fri Aug 30 10:34:13 2013 valid hfs --- nodes 1/1 stripes 82
cp.20130831000113 Fri Aug 30 17:01:13 2013 valid hfs --- nodes 1/1 stripes 82
If the DD is offline:
cplist
cplist: ERROR: ddrmaint: <4750>Datadomain get checkpoint list operation failed.
2013/09/17-14:28:06.79970 [cplist] ERROR: <0001> ddrmaint: <4750>Datadomain get checkpoint list operation failed.
cp.20130830173413 Fri Aug 30 10:34:13 2013 invalid --- --- nodes 1/1 stripes 82
cp.20130831000113 Fri Aug 30 17:01:13 2013 invalid --- --- nodes 1/1 stripes 82
Additional Information
There are two variations of the cplist command:
- cplist --lscp (which calls "avmaint lscp")
- cplist
The difference between cplist --lscp and cplist is that with the --lscp flag, the "avmaint lscp" command is called.
- "avmaint lscp" queries checkpoint information from the running GSAN processes on the data nodes.
- If a node is unresponsive, it cannot be queried, and the refcount value does not include it among the nodes counted.
Below is an example where a node is offline, and both variations are run:
cplist
cp.20130915110057 Sun Sep 15 12:00:57 2013 valid rol --- nodes 3/3 stripes 3530
cp.20130915110654 Sun Sep 15 12:06:54 2013 valid --- del nodes 3/3 stripes 3530
cp.20130916053830 Mon Sep 16 06:38:30 2013 valid --- --- nodes 3/3 stripes 3530
cp.20130916060236 Mon Sep 16 07:02:36 2013 valid --- --- nodes 2/2 stripes 3530
cplist --lscp
cp.20130915110057 Sun Sep 15 12:00:57 2013 valid rol --- nodes 2/3 stripes 3530
cp.20130915110654 Sun Sep 15 12:06:54 2013 valid --- del nodes 2/3 stripes 3530
cp.20130916053830 Mon Sep 16 06:38:30 2013 valid --- --- nodes 2/3 stripes 3530
cp.20130916060236 Mon Sep 16 07:02:36 2013 valid --- --- nodes 2/2 stripes 3530
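The difference between the two listings above can be made explicit by comparing the refcount reported for each checkpoint. This is a minimal sketch that works on captured text from both variations; it assumes the line layout shown in this article, and `refcounts` and `refcount_differences` are hypothetical names.

```python
import re

# Assumption: each checkpoint line starts with the tag and contains "nodes R/N".
NODES_RE = re.compile(r"^(cp\.\d{14})\s.*\snodes\s+(\d+)/(\d+)")


def refcounts(output):
    """Map each checkpoint tag in the captured output to its refcount."""
    counts = {}
    for line in output.splitlines():
        m = NODES_RE.match(line.strip())
        if m:
            counts[m.group(1)] = int(m.group(2))
    return counts


def refcount_differences(plain_output, lscp_output):
    """Return {tag: (plain refcount, --lscp refcount)} where the two disagree."""
    plain = refcounts(plain_output)
    lscp = refcounts(lscp_output)
    return {
        tag: (plain[tag], lscp[tag])
        for tag in plain
        if tag in lscp and plain[tag] != lscp[tag]
    }
```

Applied to the two listings above, the first three checkpoints are reported with refcounts of 3 and 2 respectively, while cp.20130916060236 shows 2 in both and is not reported.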
Related commands:
- avmaint lscp
- avmaint cpstatus
- avmaint hfscheckstatus