PowerFlex 3.x: MDM Panics in rpl_transmit_mgr.c

Summary: The Metadata Manager (MDM) process continuously panics due to replication.

Symptoms

In this case, the replication site was at code level 3.x and the destination site was at code level 4.x; however, the issue may affect any 3.x system.

No changes have been made on the storage side.

The MDM process continuously panics with the following stack trace:

2024/11/24 05:51:06.186359 Panic in file /data/build/workspace/ScaleIO-Common-Job/src/mdm/replication/consistency_engine/rpl_transmit_mgr.c, line 833, function rplTransmitManager_ProcessRequestsForTimelinesRFD, PID 19477.Panic Expression ALWAYS_ASSERT .
/opt/emc/scaleio/mdm/bin/mdm-3.6.400.107(mosDbg_PanicPrepare+0x13a) [0xabf1ba]
/opt/emc/scaleio/mdm/bin/mdm-3.6.400.107(rplTransmitManager_ProcessRequestsForTimelinesRFD+0x1f0) [0x880da0]
/opt/emc/scaleio/mdm/bin/mdm-3.6.400.107(consistencyEngine_AnalyzeTimelines+0x7b) [0x7f2ebb]
/opt/emc/scaleio/mdm/bin/mdm-3.6.400.107(consistencyEngine_AnalayzerUmtIteration+0x3c) [0x60d96c]
/opt/emc/scaleio/mdm/bin/mdm-3.6.400.107(consistencyEngine_AnalayzerUmtRoutine+0x33) [0x60da43]
/opt/emc/scaleio/mdm/bin/mdm-3.6.400.107(mosUmt_StartFunc+0x7a) [0x69a9fa]
/lib64/libc.so.6(+0x48190) [0x7ff82e834190]
/opt/emc/scaleio/mdm/bin/mdm-3.6.400.107(mosUmt_Init+0x129) [0x8f5e89]
[(nil)]
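
To check whether a panic in an MDM log matches the signature above, the first line of the trace can be parsed for its file, line, function, and PID. The following is a minimal sketch, assuming the panic line has the same layout as the one quoted in this article; the helper names (`parse_panic_line`, `matches_this_article`) are hypothetical and not part of any PowerFlex tooling, and other MDM builds may format the line differently.

```python
import re
from typing import Optional

# Pattern modeled on the single panic line quoted in this article.
# Assumption: other MDM builds use the same layout.
PANIC_RE = re.compile(
    r"^(?P<timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"Panic in file (?P<file>\S+), "
    r"line (?P<line>\d+), "
    r"function (?P<function>\w+), "
    r"PID (?P<pid>\d+)\."
)


def parse_panic_line(text: str) -> Optional[dict]:
    """Return the fields of a panic line, or None if the line does not match."""
    match = PANIC_RE.match(text)
    if match is None:
        return None
    info = match.groupdict()
    info["line"] = int(info["line"])
    info["pid"] = int(info["pid"])
    return info


def matches_this_article(info: dict) -> bool:
    """True when the panic originates from the file and function in this article."""
    return (
        info["file"].endswith("rpl_transmit_mgr.c")
        and info["function"] == "rplTransmitManager_ProcessRequestsForTimelinesRFD"
    )


# The panic line exactly as quoted in the Symptoms section.
SAMPLE = (
    "2024/11/24 05:51:06.186359 Panic in file "
    "/data/build/workspace/ScaleIO-Common-Job/src/mdm/replication/"
    "consistency_engine/rpl_transmit_mgr.c, line 833, function "
    "rplTransmitManager_ProcessRequestsForTimelinesRFD, PID 19477."
    "Panic Expression ALWAYS_ASSERT ."
)

if __name__ == "__main__":
    parsed = parse_panic_line(SAMPLE)
    print(parsed is not None and matches_this_article(parsed))
```

The source line number (833) is deliberately left out of the match, since it may differ between 3.x builds.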

Impact:
The MDM cluster is down, which results in data unavailability (DU).

Cause

The issue was identified as a software defect in version 3.x that causes the MDMs to panic. Because of this defect, the data transmitted during replication exceeded the enforced limit of 200 GiB. The MDMs could not keep up with the resulting volume of requests, which led to instability and ultimately to the panic.

In this specific case, the large volume of transmitted data was the result of a Windows SDC trim command; however, any large data transmission could trigger the issue.
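
For reference, the 200 GiB limit can be expressed in bytes when comparing it against a byte counter. The following is an illustrative sketch of the unit arithmetic only: this article does not describe how the MDM measures transmitted data, so the function name and its input are hypothetical.

```python
GIB = 1024 ** 3                  # bytes in one GiB (binary gigabyte)
LIMIT_GIB = 200                  # enforced limit stated in this article
LIMIT_BYTES = LIMIT_GIB * GIB    # 214,748,364,800 bytes


def exceeds_transmit_limit(transmitted_bytes: int) -> bool:
    """Hypothetical check: True when a byte count is above the 200 GiB limit."""
    return transmitted_bytes > LIMIT_BYTES


if __name__ == "__main__":
    print(LIMIT_BYTES)
    print(exceeds_transmit_limit(250 * GIB))
```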

Resolution

This software issue is resolved in later versions. To resolve the issue permanently, upgrade to 4.5.x or later before resuming replication:

  1. Stop SDRs on all nodes.
    This temporarily resolves the panic.
  2. Pause or stop all Replication Consistency Groups (RCGs) and replication pairs.
  3. Upgrade the system to the latest 4.5.x version or later.
  4. Resume the replication after completing the upgrade.

Impacted Versions:
PowerFlex 3.x

Fixed In Version:
PowerFlex 4.5

Products

PowerFlex rack RCM Software

Article Properties

Article Number: 000278514
Article Type: Solution
Last Modified: 29 Jul 2025
Version: 2