Article Number: 539997


RecoverPoint with VMAX3/PowerMax - All Consistency Groups in error/paused by system state

Primary Product: RecoverPoint

Product: RecoverPoint

Last Published: 21 Apr 2020

Article Type: Break Fix

Published Status: Online

Version: 3

Article Content

Issue


All Consistency Groups replicating from a specific VMAX3/PowerMax array go into Error or Paused by System state.

Error shown in the CLI (get_system_status) and in the GUI alerts:

    Items: ERROR: One or more links of group CG1 are set to replicate snaps and an error occurred in the snap-based replication process. The following errors were received from the storage: Link = Site1 -> Site2, error = [An internal Snap or Clone error has occurred.  Please see the symapi log file]


In the SYMAPI log on the Site Control RPA (/var/symapi/log/symapi-xxxxxx.log):

01/02/2020 18:20:26.504   4002   STARTING a SNAPVX 'ESTABLISH' operation for Device List, snapshot 'RPc786dd98_xxxx_000xxxxxxxxxxxxxxx'.
01/02/2020 18:20:26.505   4002     Symm 0001xxxxxxxx  Number of Devices: xx
01/02/2020 18:20:26.505   4002       Source Devices: [ 0xxx-0xxx ]
01/02/2020 18:20:26.513   4002      6 EMC:RecoverPoint          execute_9244_syscall syscall: 0x81, 00xxx(S), SYMAPI: SYMAPI_C_SNAP_INTERNAL_ERROR, Order17: SNAPVX_FAILURE(0x4D), Extended RC: Unknown(0x000800C5), flag: 0x00
01/02/2020 18:20:26.686   4002   STARTING a SNAPVX 'ESTABLISH' operation for Device List, snapshot 'RPc786dd98_xxxx_000xxxxxxxxxxxxxxx'.
01/02/2020 18:20:26.687   4002     Symm 0001xxxxxxxx  Number of Devices: 1
01/02/2020 18:20:26.687   4002       Source Devices: [ 0xxx ]
01/02/2020 18:20:26.692   4002      7 EMC:RecoverPoint          execute_9244_syscall syscall: 0x81, 00xxx(S), SYMAPI: SYMAPI_C_SNAP_INTERNAL_ERROR, Order17: SNAPVX_FAILURE(0x4D), Extended RC: Unknown(0x000800C5), flag: 0x00
01/02/2020 18:20:27.520   4002      6 EMC:RecoverPoint          execute_9244_syscall syscall: 0x81, 00xxx(S), SYMAPI: SYMAPI_C_SNAP_INTERNAL_ERROR, Order17: SNAPVX_FAILURE(0x4D), Extended RC: Unknown(0x000800C5), flag: 0x00
...
01/02/2020 18:20:29.532   4002   The SNAPVX 'ESTABLISH' operation FAILED.
01/02/2020 18:20:29.532   4002   The Snapvx operation FAILED.
01/02/2020 18:20:29.702   4002   The SNAPVX 'ESTABLISH' operation FAILED.
01/02/2020 18:20:29.702   4002   The Snapvx operation FAILED.
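The failure signature in the log above can be picked out programmatically. The following is a minimal Python sketch, not a RecoverPoint or Solutions Enabler tool: the function name `find_snapvx_failures` and the regular expression are illustrative assumptions based only on the line layout shown in this article, and other SYMAPI versions may format these lines differently.

```python
import re

# Assumed layout, taken from the excerpt in this article:
# <date> <time> <pid> ... SYMAPI: <name>, Order17: <name>(<hex>), Extended RC: <name>(<hex>), ...
FAILURE_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{4})\s+(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<pid>\d+)\s+.*SYMAPI:\s*(?P<symapi>\w+),\s*"
    r"Order17:\s*(?P<order17>\w+)\((?P<order17_code>0x[0-9A-Fa-f]+)\),\s*"
    r"Extended RC:\s*(?P<ext_name>\w+)\((?P<ext_rc>0x[0-9A-Fa-f]+)\)"
)


def find_snapvx_failures(lines):
    """Return one dict per log line that reports Order17: SNAPVX_FAILURE."""
    failures = []
    for line in lines:
        match = FAILURE_RE.match(line)
        if match and match.group("order17") == "SNAPVX_FAILURE":
            failures.append(match.groupdict())
    return failures


# Two lines copied from the excerpt above: one failure line, one informational line.
SAMPLE_LINES = [
    "01/02/2020 18:20:26.513   4002      6 EMC:RecoverPoint          "
    "execute_9244_syscall syscall: 0x81, 00xxx(S), SYMAPI: SYMAPI_C_SNAP_INTERNAL_ERROR, "
    "Order17: SNAPVX_FAILURE(0x4D), Extended RC: Unknown(0x000800C5), flag: 0x00",
    "01/02/2020 18:20:29.532   4002   The SNAPVX 'ESTABLISH' operation FAILED.",
]

if __name__ == "__main__":
    for failure in find_snapvx_failures(SAMPLE_LINES):
        print(failure["date"], failure["time"], failure["symapi"], failure["ext_rc"])
```

To scan a real log, pass an open file object for the SYMAPI log on the Site Control RPA in place of `SAMPLE_LINES`. A match only confirms that SnapVX operations are failing; SRP capacity on the array still has to be checked separately.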

 

Cause

RecoverPoint uses SnapVX operations on VMAX3/PowerMax arrays. The Storage Resource Pool (SRP) on the array filled up, which caused all SnapVX snapshot operations to fail.
This is the same SRP-full condition described in the KBA "VMAX3, VMAX All Flash, & PowerMax: Storage Resource Pool (SRP) threshold alert".

Resolution

Workaround:

As discussed in the KBA "VMAX3, VMAX All Flash, & PowerMax: Storage Resource Pool (SRP) threshold alert", the first step is to free up space in the SRP.

Because the SnapVX snapshots are in a failed state, the CGs must be disabled and later re-enabled for RP replication to resume.
Disabling the CGs should free some space on the SRP, because the local snapshots are cleared. However, the same snapshots are recreated when the CGs are re-enabled, so the disable/enable alone is not enough: additional action must be taken to free space on the SRP.

Resolution:

To prevent this issue from occurring, always keep free space available on the local array SRP.
Monitor the space used on the SRP with VMAX tools (Solutions Enabler / Unisphere for VMAX).
The steps in the KBA "VMAX3, VMAX All Flash, & PowerMax: Storage Resource Pool (SRP) threshold alert" can be applied before the space is consumed to prevent the issue from occurring.
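The monitoring advice above amounts to a simple headroom check. As a rough sketch, assuming the used and usable SRP capacity figures have already been read from Solutions Enabler or Unisphere for VMAX, the check could look like the Python below. The function names and the 80 percent default threshold are hypothetical choices for illustration, not values taken from this article or from the referenced KBA.

```python
def srp_percent_used(used_tb, usable_tb):
    """Return SRP utilisation as a percentage of usable capacity."""
    if usable_tb <= 0:
        raise ValueError("usable capacity must be positive")
    return 100.0 * used_tb / usable_tb


def srp_needs_attention(used_tb, usable_tb, threshold_pct=80.0):
    """True when utilisation has reached the (hypothetical) alert threshold."""
    return srp_percent_used(used_tb, usable_tb) >= threshold_pct


if __name__ == "__main__":
    # Illustrative capacity figures only, not measurements from a real array.
    print(srp_percent_used(95, 100), srp_needs_attention(95, 100))
```

Pick the threshold so that there is time to act before the pool fills, since re-enabling the CGs recreates the local snapshots and consumes SRP space again.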

Notes

Article Properties

First Published

Fri Jan 03 2020 22:35:04 GMT
