RecoverPoint: Replication Process Crashes When Phase1 Cache Memory Becomes Insufficient
Summary: Replication crashes with a "phase1 cache memory not sufficient" assertion, causing reboot regulation.
Symptoms
The Consistency Group remains in the initialization state; normal distribution never starts and the CG does not transition into an active state. The replication process crashes with an assertion logged when Phase1 cache memory is not sufficient and the target-side RecoverPoint Appliances are unable to write to the target journal.

Symptoms found in the /home/kos/replication logs:

Assertion:
XXXX/XX/XX 18:59:25.693 - #2 - 17936/16776 - AssertLogSender: sending log: topic=DistributorGroupHandler, msg=Assertion failed: bIsPhase1CacheMemorySufficient Line 1825 File DistributorGroupHandlerPhase1.cc PID: 16776 Info: regular phase1 cache memory not sufficient m_GroupGridCopyRID = (groupCopyRID=(kVolSlot=XXXXXXXXXX,globalCopyID=GlobalCopy(SiteUID(0xXXXXXXXXXXXXXX) 0) ),gridCopyID=0)
XXXX/XX/XX 18:59:25.694 - #2 - 16911/16776 - RemoteLogSender: got event (uniqueId=0, eventTime=1584471565693987), EventID_KBOX_ASSERTION_FAILED(3031), SiteUID(0xxxxxxxxxxxxxxxxx), seDetails=Sender=replication, Topic=DistributorGroupHandler, msg=Assertion failed: bIsPhase1CacheMemorySufficient Line 1825 File DistributorGroupHandlerPhase1.cc PID: 16776 Info: regular phase1 cache memory not sufficient m_GroupGridCopyRID = (groupCopyRID=(kVolSlot=XXXXXXXXXX,globalCopyID=GlobalCopy(SiteUID(0xXXXXXXXXXXX) 0) ),gridCopyID=0)

Statistics showing high data flow:
XXXX/XX/XX 18:52:41.520 - #2 - 7676/7665 - AccumulatorFormatManager::printStatistics: Group Statistics for group Option( kVolSlot = XXXXXXXXXX groupUID = GroupCopy(1346840554 SiteUID(0xXXXXXXXXXXX) 0) gridID = 0): {
STATISTICS: name=InitNCOnePhaseSpeed kVolSlot = 1346840554 groupUID = GroupCopy(1346840554 SiteUID(0xXXXXXXXXXXXXX) 0) gridID = 0 description: init nc one phase speed .
STATISTICS: name=InitNCOnePhaseSpeed kVolSlot = 1346840554 groupUID = GroupCopy(1346840554 SiteUID(0xXXXXXXXXXXXXX) 0) gridID = 0 8 sec window: Average: 1.14e+03 MB/sec
STATISTICS: name=InitNCOnePhaseSpeed kVolSlot = 1346840554 groupUID = GroupCopy(1346840554 SiteUID(0xXXXXXXXXXXXXX) 0) gridID = 0 77 sec window: Average: 1.06e+03 MB/sec

The Consistency Group is in the Initialization state:
2020/03/17 18:56:05.070 - #2 - 7954/7665 - InitNCState::DistributeOnePhase: distributing one phase m_groupID = (groupCopyRID=( kVolSlot=XXXXXXXXXX,globalCopyID=GlobalCopy(SiteUID(0xXXXXXXXXXXXX) 0) ),gridCopyID=0)

The Phase1 consumer for this Consistency Group shows high consumption at the time of the assertion:
XXXX/XX/XX 18:56:05.241 - #2 - 7954/7665 - MemoryManager: viscus on assert
+ countdowns = 2413/390
+ min memory demand = 433429 (fixed 329537 flexible 103892)
+ flexible usage space = 37977/3864963
+ pool space usage = 37985/4194500 (max 143544)
>> 1160635626647715840 :phase1#22
>> (groupTaskID=(sessionID=1817723153,replicationLinkID=(kVolSlot=XXXXXXXXX,srcCopyID=GlobalCopy(SiteUID(0xXXXXXXXXXXXX)
>> 0) ,destCopyID=GlobalCopy(SiteUID

A replication StackTrace is also logged:
2020/03/17 18:56:05.278 - #0 - 7954/7665 - StackTrace: errno=0 3: /home/kos/kashya/archive/lib/libreplication_libsrelease.so(_ZNK6Kashya23DistributorGroupHandler21waitForMemoryIfNeededEv+0x5b2) [0xxxxxxxxxxxxx]
2020/03/17 18:56:05.278 - #0 - 7954/7665 - StackTrace: errno=0 4: /home/kos/kashya/archive/lib/libreplication_libsrelease.so(_ZN6Kashya23DistributorGroupHandler25addSequencesToPhase1CacheENS_9SequencesERNS_15ReplicationModeE+0x939)
2020/03/17 18:56:05.278 - #0 - 7954/7665 - StackTrace: errno=0 5: /home/kos/kashya/archive/lib/libreplication_libsrelease.so(_ZN6Kashya23DistributorGroupHandler23handleSplittedSequencesENS_9SequencesERKNS_15ReplicationModeERKb+0x20a)
2020/03/17 18:56:05.278 - #0 - 7954/7665 - StackTrace: errno=0 6: /home/kos/kashya/archive/lib/libreplication_libsrelease.so(_ZN6Kashya23DistributorGroupHandler15handleSequencesENS_9SequencesERKNS_15ReplicationModeERKb+0x577)
2020/03/17 18:56:05.278 - #0 - 7954/7665 - StackTrace: errno=0 7: /home/kos/kashya/archive/lib/libreplication_libsrelease.so(_ZN6Kashya19Distributor_AO_IMPL23continueHandleSequencesENS_9SequencesENS_15ReplicationModeEbRKNS_10GridCopyIDE+0xf7)
2020/03/17 18:56:05.278 - #0 - 7954/7665 - StackTrace: errno=0 8: /home/kos/kashya/archive/lib/libreplication_libsrelease.so(_ZN6Kashya16SequencesRequest21continueHandleRequestERNS_28JournalRegulationRequestBase14RequestHandlerE+0x30b)
2020/03/17 18:56:05.278 - #0 - 7954/7665 - StackTrace: errno=0 9: /home/kos/kashya/archive/lib/libreplication_libsrelease.so(_ZN6Kashya31JournalRegulationThread_AO_IMPL9process_iERKNS_16GroupGridCopyRIDE+0x36f)
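When triaging a case like this, the following minimal Python sketch (not a Dell EMC utility) shows one way to scan the collected replication log files for the Phase1 cache assertion and for high InitNCOnePhaseSpeed averages. The 1000 MB/sec threshold is an illustrative assumption, not a Dell EMC value; adjust it and the input files to the actual environment.

# Minimal illustrative sketch (not a Dell EMC utility): scan replication log
# files for the Phase1 cache assertion and for high InitNCOnePhaseSpeed averages.
import re
import sys

ASSERT_MARKER = "bIsPhase1CacheMemorySufficient"
SPEED_RE = re.compile(r"InitNCOnePhaseSpeed.*Average:\s*([0-9.eE+-]+)\s*MB/sec")
HIGH_FLOW_MB_PER_SEC = 1000.0   # illustrative threshold, not a Dell EMC value

for path in sys.argv[1:]:       # pass the collected replication log files as arguments
    with open(path, errors="replace") as log:
        for lineno, line in enumerate(log, 1):
            if ASSERT_MARKER in line:
                print(f"{path}:{lineno}: Phase1 cache assertion: {line.strip()}")
            match = SPEED_RE.search(line)
            if match and float(match.group(1)) >= HIGH_FLOW_MB_PER_SEC:
                print(f"{path}:{lineno}: high data flow: {line.strip()}")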
Cause
The replication process asserts (bIsPhase1CacheMemorySufficient) when Phase1 cache memory remains insufficient while data is being distributed to the target journal. Sustained high data flow during Consistency Group initialization can exhaust the Phase1 cache faster than it is freed, and the distributor thread does not wait long enough for memory before asserting.
Resolution
Workaround:
Change the value of the tweak t_phase1CacheMemoryThreadSleepTime to 5000. This increases the waiting time from 10 microseconds to 5 milliseconds, so the process will not assert until the thread has waited for memory for 5 milliseconds.

If the issue still occurs:
1. Collect the production site logs as well, since they show the amount of data being sent from the production side at the time of the issue.
2. Change the value of the tweak t_maxNoOfTriesToWaitForPhase1CacheMemory to 10.

Note: These tweaks are only relevant to versions 5.1.3 and above. If the code version is not 5.1.3 or higher, RecoverPoint MUST be upgraded to the latest code to utilize these tweaks. An illustrative sketch of how the two tweaks interact appears after the Resolution statement below.

Resolution:
Dell EMC engineering is currently investigating this issue. A permanent fix is still in progress. Contact the Dell EMC Customer Support Center or your service representative for assistance and reference this solution ID.
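To make the effect of the workaround concrete, below is a purely illustrative Python sketch of the kind of wait-then-assert loop implied by DistributorGroupHandler::waitForMemoryIfNeeded in the stack trace above. It is not RecoverPoint source code: the function wait_for_phase1_cache_memory and the is_memory_sufficient callback are hypothetical, and only the 10-microsecond default sleep and the workaround values (5000 and 10) come from this article.

# Illustrative sketch only (not RecoverPoint source code): how the two tweaks
# bound the time a distributor thread waits for Phase1 cache memory before the
# bIsPhase1CacheMemorySufficient assertion would fail.
import time

T_PHASE1_CACHE_MEMORY_THREAD_SLEEP_TIME_US = 5000       # workaround value (default per this article: 10 us)
T_MAX_NO_OF_TRIES_TO_WAIT_FOR_PHASE1_CACHE_MEMORY = 10  # workaround value (default not stated in this article)

def wait_for_phase1_cache_memory(is_memory_sufficient):
    # Retry until memory is sufficient or the configured tries are exhausted.
    # Returns False where the real process would fail the assertion and crash.
    for _ in range(T_MAX_NO_OF_TRIES_TO_WAIT_FOR_PHASE1_CACHE_MEMORY):
        if is_memory_sufficient():
            return True
        time.sleep(T_PHASE1_CACHE_MEMORY_THREAD_SLEEP_TIME_US / 1_000_000)
    return False

# In this sketch, the workaround values tolerate up to 10 tries * 5 ms = 50 ms of
# sustained memory pressure before asserting, instead of asserting almost
# immediately with the 10-microsecond default sleep.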