RecoverPoint for VMs: Consistency Groups in Error State

Summary: RecoverPoint for Virtual Machines: Consistency Groups are in an Error state, are paused by the system, or cycle through init and error loops.


Symptoms

Consistency Groups are in an Error state, are paused by the system, or cycle through init and error loops.

Sample state from the get_group_state command:

CG:
    Enabled: YES
    Transfer source: Source
    Copy:
      Target:
        Enabled: YES
        Regulation Status: REGULATED
        Active primary RPA: RPA 1
        Journal: LONG RESYNC
        Storage access: NO ACCESS
        Max journal size: 1.09 TB
      Source:
        Enabled: YES
        Active primary RPA: RPA 1
        Storage access: DIRECT ACCESS (marking data)
    Link:
      Source->Target:
        Data Transfer: ERROR
 
Events in the UI or from the CLI get_events_log command:

  Time:                 Mon Jan 16 15:29:30 2017
  Topic:                GROUP
  Scope:                DETAILED
  Level:                ERROR
  Event ID:             4009
  Cluster:              Target_Site_vRPA
  Global links:         None
  Groups:               [CG, CG_Copy]
  Links:                [CG, CG_Prod->CG_Copy]
  Summary:              Pausing data transfer for group
  Details:              Reason=distributor error.

  Time:                 Mon Jan 16 15:31:03 2017
  Topic:                GROUP
  Scope:                DETAILED
  Level:                WARNING
  Event ID:             4001
  Cluster:              Source_Site
  Global links:         None
  Groups:               [CG, CG_Prod]
  Summary:              Minor problem in group capabilities
  Details:              Copies are linked.

RPA1:
Marking (GlobalCopy(SiteUID(0x1207d88a9b552bf9) 0) ) = Yes - side defined as source. Site=Source_Site.
Source backlog mirroring (GlobalCopy(SiteUID(0x1207d88a9b552bf9) 0) ) = NOT NEEDED
Transfer (GlobalCopy(SiteUID(0x7b8822840a7e317f) 0) ) = NO - can't maintain history - not paused on snapshot and  user volume problem - Volume issue. Site=Target_Site, RPA1, Device=[CG, CG_Copy, CG_RSET_CG_3_0_scsi].
Box and VM share ESX = SAME_ESX_NOT_SAME

RPA2:
Marking (GlobalCopy(SiteUID(0x1207d88a9b552bf9) 0) ) = Yes - side defined as source. Site=Source_Site.
Source backlog mirroring (GlobalCopy(SiteUID(0x1207d88a9b552bf9) 0) ) = NOT NEEDED
Transfer (GlobalCopy(SiteUID(0x7b8822840a7e317f) 0) ) = Yes
Journal (GlobalCopy(SiteUID(0x7b8822840a7e317f) 0) ) = Yes
Target backlog mirroring (GlobalCopy(SiteUID(0x7b8822840a7e317f) 0) ) = NOT NEEDED
Preferred = NO
Box and VM share ESX = SAME_ESX_SAME
Number of VMs in same ESX is 1


Error in replication logs (extracted.*/files/home/kos/replication/result.log):

2017/01/16 13:03:30.187 - #1 - 5508/5482 - DataCommIoRequest: Got NACK from splitter, Error code = 2 *m_kboxDataCommMessage = KboxDataCommMessage, DataCommMessage, m_multiIoId: 1804479667 m_msgId: 17968710 m_type: 1 m_lbaAndLens: 1  lengthInBlocks: 1024  m_guid: 0x69f6a49648317aec m_version: 0 m_isFastPath: 1 m_hostId: ESX 0x10f9536584e11e30 m_priority: 6
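
The NACK line above carries the two values needed for the next step: the error code and the m_hostId of the host whose splitter rejected the write. As a minimal sketch, assuming the log line format matches the sample in this article (other versions may log differently), the values could be pulled out as follows. The function name find_nack_hosts and the regular expression are illustrative and are not part of any RecoverPoint tool.

```python
import re

# Pattern derived only from the sample line in this article (assumption:
# other RecoverPoint versions may format this message differently).
NACK_RE = re.compile(
    r"Got NACK from splitter, Error code = (?P<code>\d+).*?"
    r"m_hostId: \S+ (?P<host_id>0x[0-9a-fA-F]+)"
)


def find_nack_hosts(lines):
    """Return the set of (error_code, host_id) pairs seen in NACK lines."""
    found = set()
    for line in lines:
        match = NACK_RE.search(line)
        if match:
            found.add((int(match.group("code")), match.group("host_id")))
    return found


# Abbreviated copy of the sample line above (middle fields omitted).
sample = (
    "2017/01/16 13:03:30.187 - #1 - 5508/5482 - DataCommIoRequest: "
    "Got NACK from splitter, Error code = 2 "
    "m_hostId: ESX 0x10f9536584e11e30 m_priority: 6"
)

print(find_nack_hosts([sample]))
```

The host IDs returned this way indicate which hosts' splitter logs to inspect.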


Errors in the splitter logs on the host identified in the replication log above (m_hostId: ESX 0x10f9536584e11e30):

2017/01/16 13:11:13.714 - #2 - 570188/570154 - KS: krnl:[13:11:13.594] 0/0 #0 - IoEsx_ToStorage_v_isSucceeded_i: IO 0x41132aa45240 Failed. Host_Status = 0x0, Device_Status = 0x8, dataLength = 393216
krnl:[13:11:13.594] 0/0 #0 - IoEsx_ToStorage_v_isSucceeded_i: numFSSRetries = 0, numFDSRetries = 0, absTimeoutMS = 250463060, startTC = 0
krnl:[13:11:13.594] 0/0 #0 - CommandIoDataCommWrite_v_storageEndIo_i: Failed write to storage. io_index = 0. Io status 0. Failing DataComm Write.
2017/01/16 13:11:14.828 - #2 - 570188/570154 - KS: krnl:[13:11:14.289] 0/0 #0 - IoEsx_ToStorage_v_isSucceeded_i: IO 0x41132aa41210 Failed. Host_Status = 0x0, Device_Status = 0x8, dataLength = 393216
krnl:[13:11:14.289] 0/0 #0 - IoEsx_ToStorage_v_isSucceeded_i: numFSSRetries = 0, numFDSRetries = 0, absTimeoutMS = 250463777, startTC = 0
krnl:[13:11:14.289] 0/0 #0 - CommandIoDataCommWrite_v_storageEndIo_i: Failed write to storage. io_index = 0. Io status 0. Failing DataComm Write.
2017/01/16 13:11:15.937 - #2 - 570188/570154 - KS: krnl:[13:11:15.118] 0/0 #0 - IoEsx_ToStorage_v_isSucceeded_i: IO 0x41132aa44c90 Failed. Host_Status = 0x0, Device_Status = 0x8, dataLength = 393216
krnl:[13:11:15.118] 0/0 #0 - IoEsx_ToStorage_v_isSucceeded_i: numFSSRetries = 0, numFDSRetries = 0, absTimeoutMS = 250464606, startTC = 0
krnl:[13:11:15.118] 0/0 #0 - CommandIoDataCommWrite_v_storageEndIo_i: Failed write to storage. io_index = 0. Io status 0. Failing DataComm Write.
krnl:[13:11:15.676] 0/0 #0 - IoEsx_ToStorage_v_isSucceeded_i: IO 0x4113298f6108 Failed. Host_Status = 0x0, Device_Status = 0x8, dataLength = 393216
krnl:[13:11:15.676] 0/0 #0 - IoEsx_ToStorage_v_isSucceeded_i: numFSSRetries = 0, numFDSRetries = 0, absTimeoutMS = 250465165, startTC = 0
krnl:[13:11:15.676] 0/0 #0 - CommandIoDataCommWrite_v_storageEndIo_i: Failed write to storage. io_index = 0. Io status 0. Failing DataComm Write.
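
When the splitter log is long, it can help to tally the failed writes by status pair to confirm that the same failure repeats. The following is a rough sketch, assuming the "IO ... Failed" lines follow the format shown above. The function name count_failed_ios and the regular expression are illustrative and are not part of any RecoverPoint tool.

```python
import re
from collections import Counter

# Pattern derived only from the sample splitter lines in this article
# (assumption: the format may differ between splitter versions).
FAILED_IO_RE = re.compile(
    r"IoEsx_ToStorage_v_isSucceeded_i: IO 0x[0-9a-fA-F]+ Failed\. "
    r"Host_Status = (?P<host>0x[0-9a-fA-F]+), "
    r"Device_Status = (?P<device>0x[0-9a-fA-F]+)"
)


def count_failed_ios(lines):
    """Count failed storage IOs per (Host_Status, Device_Status) pair."""
    counts = Counter()
    for line in lines:
        match = FAILED_IO_RE.search(line)
        if match:
            counts[(match.group("host"), match.group("device"))] += 1
    return counts


# Three lines copied from the sample above; only two report a failed IO.
sample_lines = [
    "krnl:[13:11:13.594] 0/0 #0 - IoEsx_ToStorage_v_isSucceeded_i: IO 0x41132aa45240 Failed. Host_Status = 0x0, Device_Status = 0x8, dataLength = 393216",
    "krnl:[13:11:13.594] 0/0 #0 - IoEsx_ToStorage_v_isSucceeded_i: numFSSRetries = 0, numFDSRetries = 0, absTimeoutMS = 250463060, startTC = 0",
    "krnl:[13:11:14.289] 0/0 #0 - IoEsx_ToStorage_v_isSucceeded_i: IO 0x41132aa41210 Failed. Host_Status = 0x0, Device_Status = 0x8, dataLength = 393216",
]

for (host_status, device_status), n in count_failed_ios(sample_lines).items():
    print(f"Host_Status={host_status} Device_Status={device_status}: {n} failed IOs")
```

A single dominant status pair, as in the sample, suggests one underlying storage condition rather than several unrelated faults.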

 

Cause

The target-side Data Stores for these Consistency Groups had no free space, so RecoverPoint could not write to them.
 

Resolution

On the target-side ESXi hosts, verify that all target Data Stores (for target hosts and journals) have free space.
For any Data Store that is out of space, free up space or increase the Data Store size so that RecoverPoint can write to it.
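
The check described above can be sketched as a small helper that flags Data Stores with little or no free space. This is only an illustration: the capacity and free-space figures would need to come from your own environment (for example, from the vSphere Client), the Data Store names and numbers below are made up, and the 10% threshold is an arbitrary assumption rather than a Dell recommendation.

```python
def datastores_low_on_space(datastores, min_free_fraction=0.10):
    """Return names of Data Stores whose free fraction is below the threshold.

    Each entry is a dict with hypothetical keys: name, size_bytes, free_bytes.
    """
    low = []
    for ds in datastores:
        size = ds["size_bytes"]
        free = ds["free_bytes"]
        # Treat a zero-sized entry as a problem instead of dividing by zero.
        if size <= 0 or free / size < min_free_fraction:
            low.append(ds["name"])
    return low


# Made-up example values; replace them with figures from your environment.
example = [
    {"name": "target_ds_01", "size_bytes": 2_000_000_000_000, "free_bytes": 0},
    {"name": "target_journal_ds", "size_bytes": 1_000_000_000_000,
     "free_bytes": 400_000_000_000},
]

print(datastores_low_on_space(example))
```

Any Data Store the helper returns is a candidate for freeing space or increasing its size.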

Affected Products

RecoverPoint for Virtual Machines

Article Properties
Article Number: 000054855
Article Type: Solution
Last Modified: 20 Dec 2025
Version:  4