RecoverPoint: Replication process crashes when phase 1 cache memory is insufficient

Summary: Replication crashes with an assertion that there is not enough phase 1 cache memory, leading to reboot regulation.


Symptoms



The consistency group (CG) remains in the initializing state, but normal distribution never appears to start and the CG does not transition to the active state. When phase 1 cache memory is insufficient and the target-side RecoverPoint Appliance cannot write to the target journal, the replication process crashes and logs an assertion.

Symptoms found in the /home/kos/replication logs:

Assertion:

XXXX/XX/XX 18:59:25.693-#2-17936/16776-AssertLogSender: send log: topic=DistributorGroupHandler, msg=Assertion failed: bIsPhase1CacheMemoryEnough Line 1825 File DistributorGroupHandlerPhase1.cc PID:16776 Info: Not enough regular phase1 cache memory m_GroupGridCopyRID = (groupCopyRID=(kVolSlot=XXXXXXXXXX,globalCopyID=GlobalCopy(SiteUID(0xXXXXXXXXXXXXXX) 0) ),gridCopyID=0)
XXXX/XX/XX 18:59:25.694-#2-16911/16776-RemoteLogSender: got event (uniqueId=0, eventTime=1584471565693987), EventID_KBOX_ASSERTION_FAILED(3031), SiteUID(0xxxxxxxxxxxxxxxxxx), seDetails=Sender=replication, Topic=DistributorGroupHandler,msg=Assertion failed: bIsPhase1CacheMemoryEnough Line 1825 File DistributorGroupHandlerPhase1.cc PID:16776 Info: Not enough regular phase1 cache memory m_GroupGridCopyRID = (groupCopyRID=(kVolSlot=XXXXXXXXXX,globalCopyID=GlobalCopy(SiteUID(0xXXXXXXXXXXX) 0) ),gridCopyID=0)

Statistics showing high data throughput:

XXXX/XX/XX 18:52:41.520-#2-7676/7665-AccumulatorFormatManager::printStatistics: Group statistics for group Option( kVolSlot = XXXXXXXXXX groupUID = GroupCopy(1346840554 SiteUID(0xXXXXXXXXXXX) 0) gridID = 0):{
STATISTICS: name=InitNCOnePhaseSpeed kVolSlot = 1346840554 groupUID = GroupCopy(1346840554 SiteUID(0xXXXXXXXXXXXXX) 0) gridID = 0 description: init nc one phase speed.
STATISTICS: name=InitNCOnePhaseSpeed kVolSlot = 1346840554 groupUID = GroupCopy(1346840554 SiteUID(0xXXXXXXXXXXXXX) 0) gridID = 0 8 sec window: average: 1.14E+03 MB/sec
STATISTICS: name=InitNCOnePhaseSpeed kVolSlot = 1346840554 groupUID = GroupCopy(1346840554 SiteUID(0xXXXXXXXXXXXXX) 0) gridID = 0 77 sec window: average: 1.06e+03 MB/sec

The consistency group is in the initialization state:

2020/03/17 18:56:05.070 - #2 - 7954/7665 - InitNCState::DistributeOnePhase: distributing one phase m_groupID = (groupCopyRID=( kVolSlot=XXXXXXXXXX,globalCopyID=GlobalCopy(SiteUID(0xXXXXXXXXXXXX) 0) ),gridCopyID=0)

The phase 1 consumer for this consistency group shows high consumption at the time of the assertion:

XXXX/XX/XX 18:56:05.241-#2-7954/7665-MemoryManager: guts on assertion
+ countdown = 2413/390
+ min memory requirement = 433429 (fixed 329537 flexible 103892)
+ flexible space usage = 37977/3864963
+ pool space usage = 37985/4194500 (max 143544)
>> 1160635626647715840 : phase1#22
>> (groupTaskID=(sessionID=1817723153,replicationLinkID=(kVolSlot=XXXXXXXXX,srcCopyID=GlobalCopy(SiteUID(0xXXXXXXXXXXXX)
>> 0) ,destCopyID=GlobalCopy(SiteUID

A replication stack trace is also logged:

2020/03/17 18:56:05.278-#0-7954/7665-StackTrace: errno = 0 3: /home/kos/kashya/archive/lib/libreplication_libsrelease.so(_ZNK6Kashya23DistributorGroupHandler21waitForMemoryIfNeededEv+0x5b2) [0xxxxxxxxxxxxxxx]
2020/03/17 18:56:05.278-#0-7954/7665-StackTrace: errno = 0 4: /home/kos/kashya/archive/lib/libreplication_libsrelease.so(_ZN6Kashya23DistributorGroupHandler25addSequencesToPhase1CacheENS_9SequencesERNS_15ReplicationModeE+0x939)
2020/03/17 18:56:05.278-#0-7954/7665-StackTrace: errno = 0 5: /home/kos/kashya/archive/lib/libreplication_libsrelease.so(_ZN6Kashya23DistributorGroupHandler23handleSplittedSequencesENS_9SequencesERKNS_15ReplicationModeERKb+0x20a)
2020/03/17 18:56:05.278-#0-7954/7665-StackTrace: errno = 0 6: /home/kos/kashya/archive/lib/libreplication_libsrelease.so(_ZN6Kashya23DistributorGroupHandler15handleSequencesENS_9SequencesERKNS_15ReplicationModeERKb+0x577)
2020/03/17 18:56:05.278-#0-7954/7665-StackTrace: errno = 0 7: /home/kos/kashya/archive/lib/libreplication_libsrelease.so(_ZN6Kashya19Distributor_AO_IMPL23continueHandleSequencesENS_9SequencesENS_15ReplicationModeEbRKNS_10GridCopyIDE+0xf7)
2020/03/17 18:56:05.278-#0-7954/7665-StackTrace: errno = 0 8: /home/kos/kashya/archive/lib/libreplication_libsrelease.so(_ZN6Kashya16SequencesRequest21continueHandleRequestERNS_28JournalRegulationRequestBase14RequestHandlerE+0x30b)
2020/03/17 18:56:05.278-#0-7954/7665-StackTrace: errno = 0 9: /home/kos/kashya/archive/lib/libreplication_libsrelease.so(_ZN6Kashya31JournalRegulationThread_AO_IMPL9process_iERKNS_16GroupGridCopyRIDE+0x36f)
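To confirm that a collected log matches this article, it can help to search for the assertion signature rather than read the log by eye. The following is a minimal sketch in Python, assuming the log lines are available as plain text. The helper name find_phase1_asserts and the AssertHit record are hypothetical and not part of RecoverPoint. Only the identifier bIsPhase1CacheMemoryEnough and the PID field format come from the excerpts above.

```python
import re
from typing import Iterable, List, NamedTuple

# Signature taken from the assertion message quoted in the symptoms.
ASSERT_SIGNATURE = "bIsPhase1CacheMemoryEnough"
# PID field as it appears in the excerpt, for example "PID:16776".
PID_PATTERN = re.compile(r"PID:\s*(\d+)")


class AssertHit(NamedTuple):
    """Hypothetical record describing one matching log line."""
    line_number: int
    pid: str  # empty string when the line has no PID field
    text: str


def find_phase1_asserts(lines: Iterable[str]) -> List[AssertHit]:
    """Return every line that carries the phase 1 cache assertion signature."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if ASSERT_SIGNATURE not in line:
            continue
        match = PID_PATTERN.search(line)
        pid = match.group(1) if match else ""
        hits.append(AssertHit(number, pid, line.strip()))
    return hits
```

The function accepts any iterable of strings, so an open file object can be passed directly. The exact file names under /home/kos/replication are not stated in this article, so the path to open is left to the reader.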

Cause

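The memory manager is unable to expand the memory allocated to the phase 1 cache. This creates a temporary condition in which the phase 1 cache has no room for incoming sequences, which triggers the assertion.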
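The pattern described here, a thread that waits a bounded time for memory and then gives up, can be illustrated with a short sketch. This is not RecoverPoint code. The function wait_for_cache_memory and its parameters are hypothetical, and the assumption that the two tweaks in this article act as a per-try sleep time and a maximum number of tries is inferred from their names only.

```python
import time
from typing import Callable


def wait_for_cache_memory(
    has_room: Callable[[], bool],
    sleep_us: int,
    max_tries: int,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll has_room() up to max_tries times, sleeping between failed polls.

    Returns True as soon as room is available. Returns False when every
    try fails, which in this illustration stands in for the assertion.
    """
    for _ in range(max_tries):
        if has_room():
            return True
        # Convert microseconds to the seconds expected by time.sleep().
        sleep(sleep_us / 1_000_000)
    return False
```

In this sketch, a short sleep combined with few tries gives up almost immediately when memory is only briefly unavailable, while a longer sleep gives the memory manager time to free or expand space before the caller gives up.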

Resolution

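Workaround:

Change the value of the tweak t_phase1CacheMemoryThreadSleepTime to 5000. This increases the wait time from 10 microseconds to 5 milliseconds and ensures that the assertion is not raised until the thread has waited 5 milliseconds for memory.

If the issue persists:

1. Also collect logs from the production site. They show how much data production was sending when the issue occurred.
2. Change the value of the tweak t_maxNoOfTriesToWaitForPhase1CacheMemory to 10.

Note: These tweaks apply only to version 5.1.3 and later. If the code version is earlier than 5.1.3, RecoverPoint must be upgraded to the latest code before the tweaks can be used.

Resolution:

Dell EMC Engineering is currently investigating this issue, and a permanent fix is still in progress. For technical assistance, contact the Dell EMC Customer Support Center or your service representative and reference this solution ID.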
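Two details in the workaround are easy to get wrong: the unit of the sleep-time tweak and the minimum version. The sketch below checks both, assuming the version is available as a dotted numeric string such as "5.1.3". The helper names are hypothetical, and other version formats (for example a service pack suffix) are not handled.

```python
from typing import Tuple

# Minimum version that supports the tweaks, per the note in this article.
MIN_TWEAK_VERSION = (5, 1, 3)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version string into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def tweaks_available(version: str) -> bool:
    """Return True when the version is 5.1.3 or later."""
    return parse_version(version) >= MIN_TWEAK_VERSION


def sleep_time_ms(sleep_us: int) -> float:
    """Convert a sleep time in microseconds to milliseconds."""
    return sleep_us / 1000
```

With these helpers, sleep_time_ms(5000) evaluates to 5.0, matching the 5 milliseconds stated in the workaround, and tweaks_available("5.1.2") evaluates to False, which indicates that an upgrade is needed first.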

Affected Products

RecoverPoint

Products

RecoverPoint, RecoverPoint EX
Article Properties
Article Number: 000174142
Article Type: Solution
Last Modified: 10 Jul 2025
Version:  5