August 2nd, 2016 01:00

Talking about Avamar monitor

Introduction

This article shows some simple methods for monitoring the status of Avamar in real time.

Detailed Information

Required equipment:

  • Avamar Gen4s-M1200, version 7.0.2, with three storage nodes and an attached DD860;
  • Avamar Gen4-3.9TB, version 6.0.2, with three storage nodes.

1.    Monitor Avamar with the Avamar GUI


We can use the Avamar GUI to monitor the status of Avamar, as shown in Figures 1 and 2. Figure 1 shows Avamar Gen4s-M1200, version 7.0.2; Figure 2 shows Avamar Gen4-3.9TB, version 6.0.2.


Figure 1


Figure 2

Clicking “All Failures” in Figure 1 lists all failed Avamar backups, as shown in Figure 3.


Figure 3

Clicking “Critical Events” in Figure 1 lists other Avamar activities, such as CheckPoint, HFScheck, Garbage Collection, and hardware-related errors, as shown in Figure 4.


Figure 4

Here are some common errors and solutions:


1)     A CheckPoint of server data is overdue.

This means that no new checkpoint has been created within the last 24 hours. In general, we need to check whether the CheckPoint completes within the next few hours. If it does, this error can be ignored, because Avamar's daily workflow is:

Backup -> Garbage Collection -> CheckPoint -> HFScheck -> CheckPoint -> Backup

As the sequence shows, each task is carried out in order. Each task also has a preset time window, but a task does not always finish on time. If Avamar's work is delayed by special circumstances, no CheckPoint will be generated within 24 hours. Often the CheckPoint completes within the next 2-3 hours, in which case this error can be ignored.
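The reasoning above can be sketched as a small decision helper. This is only an illustration: the function name, the status labels, and the 3-hour grace period (taken from the "2-3 hours" mentioned above) are assumptions, not part of any Avamar tool.

```python
from datetime import datetime, timedelta

# Daily maintenance order as described in this post.
WORKFLOW = ["Backup", "Garbage Collection", "CheckPoint", "HFScheck", "CheckPoint"]


def checkpoint_alert_status(last_checkpoint, now,
                            overdue_after=timedelta(hours=24),
                            grace=timedelta(hours=3)):
    """Classify a 'CheckPoint overdue' alert (hypothetical helper).

    overdue_after: age at which the alert is raised (24 hours in the post).
    grace: assumed extra time in which a delayed CheckPoint usually completes.
    """
    age = now - last_checkpoint
    if age <= overdue_after:
        return "ok"           # a checkpoint exists within the last 24 hours
    if age <= overdue_after + grace:
        return "wait"         # probably just a delayed maintenance window
    return "investigate"      # still no checkpoint after the grace period


if __name__ == "__main__":
    now = datetime(2016, 8, 2, 12, 0)
    print(checkpoint_alert_status(now - timedelta(hours=25), now))
```

If the result stays at "investigate", the maintenance windows and the logs are worth a closer look rather than ignoring the alert.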




2)     Data Integrity Alerts


This error is related to HFScheck. We often encounter errors such as MSG_ERR_HFSCHECKERRORS, MSG_ERR_DDR_ERROR, MSG_ERR_CGSAN_FAILED, and MSG_ERR_TIMEOUT.



  • MSG_ERR_HFSCHECKERRORS is caused by stripe problems on a storage node. Details can be found in KB 127269 (https://support.emc.com/kb/127269). Choose a solution according to the detailed errors in the GSAN error log (/data01/cur/err.log), the HFScheck error log (/data01/hfscheck/err.log), and the CheckPoint log (/data01/checklogs/cp.xxxxxxxxxxxxxx/err.log).


  • MSG_ERR_DDR_ERROR is caused by DD connection problems. Details can be found in KB 120996 (https://support.emc.com/kb/120996). Choose a resolution according to the HFScheck error analysis described above and the DDR log (/usr/local/avamar/var/ddrmaintlogs/ddrmaint.log).


  • MSG_ERR_CGSAN_FAILED is caused by GSAN process issues. Details can be found in KB 165409 (https://support.emc.com/kb/165409). Possible causes include hardware problems, the ASCD process, time synchronization between nodes, improperly configured licenses, and the presence of an RMCP process on Gen3.3. Any of these can cause HFScheck to fail and report the MSG_ERR_CGSAN_FAILED error.


  • MSG_ERR_TIMEOUT is similar to MSG_ERR_CGSAN_FAILED. It is caused by hardware issues, the presence of an RMCP process on Gen3.3, thread exhaustion on a single node, and similar problems. Details can be found in KB 172518 (https://support.emc.com/kb/172518).
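As a quick reference, the four errors above can be collected into a lookup table. The sketch below is a hypothetical helper, not an Avamar utility: the names TRIAGE and find_codes are made up for illustration, the KB numbers and log paths are copied from the list above, and the checkpoint directory name is left elided exactly as in the post.

```python
# Logs named in the post for HFScheck error analysis.
COMMON_LOGS = [
    "/data01/cur/err.log",                          # GSAN error log
    "/data01/hfscheck/err.log",                     # HFScheck error log
    "/data01/checklogs/cp.xxxxxxxxxxxxxx/err.log",  # CheckPoint log (name elided in the post)
]
DDR_LOG = "/usr/local/avamar/var/ddrmaintlogs/ddrmaint.log"

# Error code -> KB number and logs to review. The post names no specific
# logs for the last two codes, so their lists are left empty.
TRIAGE = {
    "MSG_ERR_HFSCHECKERRORS": {"kb": "127269", "logs": COMMON_LOGS},
    "MSG_ERR_DDR_ERROR": {"kb": "120996", "logs": COMMON_LOGS + [DDR_LOG]},
    "MSG_ERR_CGSAN_FAILED": {"kb": "165409", "logs": []},
    "MSG_ERR_TIMEOUT": {"kb": "172518", "logs": []},
}


def find_codes(alert_text):
    """Return the known error codes found in alert_text, in order of appearance."""
    hits = [(alert_text.find(code), code) for code in TRIAGE if code in alert_text]
    return [code for _, code in sorted(hits)]


if __name__ == "__main__":
    for code in find_codes("HFScheck failed: MSG_ERR_DDR_ERROR"):
        entry = TRIAGE[code]
        print(code, "-> KB", entry["kb"], entry["logs"])
```

Because find_codes uses plain substring matching, it also works when several codes are run together in one alert string.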


Once the cause of a Data Integrity Alert has been resolved, either automatically by the server or by a technical support engineer, the alert can be cleared in either of the following ways:


      1)    Clear by command line:

          mccli event clear-data-integrity-alerts --reset-code=AVAMARDATAOK


      2)     Clear by GUI

            a.     Log in to the GUI and click Administration

            b.     Click Event Management

            c.     Click Unacknowledged Events

            d.     Click Actions > Event Management > Clear Data Integrity Alert

            e.     Enter the code: AVAMARDATAOK
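For scripted use, the command-line option could be wrapped roughly as follows. This is a sketch under assumptions: the subcommand spelling and the reset code are copied from this post and should be checked against the MCCLI documentation for your Avamar version, and the function names are hypothetical. It defaults to a dry run, so nothing is executed unless you ask for it.

```python
import subprocess


def build_clear_command(reset_code="AVAMARDATAOK"):
    """Build the argument list for the clear command quoted in this post."""
    if not reset_code or any(ch.isspace() for ch in reset_code):
        raise ValueError("reset code must be non-empty and contain no whitespace")
    # Option and value are joined with '=' and no surrounding spaces.
    return ["mccli", "event", "clear-data-integrity-alerts",
            f"--reset-code={reset_code}"]


def clear_alerts(dry_run=True):
    """Return the command as text (dry run) or run it and return its output."""
    cmd = build_clear_command()
    if dry_run:
        return " ".join(cmd)
    # Requires mccli on PATH, i.e. running on the Avamar utility node.
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    print(clear_alerts(dry_run=True))
```

Clear the alert only after the underlying problem has actually been fixed; the wrapper does not check that.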

This article has introduced how to use the GUI to monitor an Avamar server, especially its maintenance jobs. You should now know how to monitor Avamar maintenance through the GUI and be familiar with some of the common error messages.





1 Message

April 17th, 2021 03:00

The KB article link for MSG_ERR_DDR_ERROR leads to the wrong page.
