PowerFlex: Excessive Data Role-switching Causes IO Latency and Errors

Summary: This article explains how excessive data Role-switching causes IO latency and errors.

This article is not tied to any specific product. Not all product versions are identified in this article.

Symptoms

Under certain cluster state transitions, the MDM role-balance logic can produce rapid, repeated primary/secondary role switches across a large number of combs (internal data structures that track which SDS nodes store each piece of volume data). Each role switch invalidates client-side (SDC) comb maps and forces IO retries. When enough combs are affected simultaneously, the cumulative retry overhead causes IO latency spikes and IO errors on SDC hosts. Depending on the host environment, this can result in application IO timeouts, VMs entering read-only state, or filesystem unavailability.

This behavior has been observed under multiple trigger scenarios and is not limited to any single operational procedure.

 

Common Indicators

  • MDM_DATA_DEGRADED event followed by sustained IO latency lasting 1-15+ minutes
  • SDC hosts report IO errors and/or IO timeouts during the degraded window
    • VMware (ESXi): VMFS heartbeat timeouts, SCSI hardware errors (sense data: 0x4 0x0 0x0), VMs entering read-only state, potential HA failover
    • Linux: IO errors in system logs (/var/log/messages, dmesg), applications may experience IO timeouts or filesystems remounting read-only
  • MDM event logs show the system in DEGRADED state longer than expected for a single SDS loss
  • System eventually self-recovers to NORMAL state without manual intervention (in most cases)

Note: As of this writing, this issue has only been reported in environments with VMware (ESXi) and Linux SDC hosts. There are no known reports of this behavior impacting Windows SDC hosts, though the underlying defect is in MDM core logic and is not OS-specific.
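
The host-side indicators above can be screened for with basic log searches. The commands below are a generic illustration rather than a Dell-provided procedure; the log paths are common defaults and the search strings come from the patterns listed in this article, so adjust both to match your environment.

# Linux SDC host: look for IO errors and read-only remounts in the system log
grep -iE "i/o error|read-only" /var/log/messages

# ESXi SDC host: look for PowerFlex IO retries and SCSI hardware errors in the vmkernel log
grep -E "IO_FAULT_NOT_PRI|sense data: 0x4 0x0 0x0" /var/log/vmkernel.log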

 

Scenario 1: Ungraceful SDS Loss (No Maintenance Mode)

When it can happen:

This is a very rare scenario. For the rapid, repeated role-switch events to occur during an ungraceful SDS loss, several specific conditions must be present simultaneously:

  • Large-scale environment — Significant number of SDS nodes and volumes
  • Heavy production IO load — Significant IO activity at the moment the SDS fails
  • Rebuild workload exceeds processing capacity — The number of metadata rows requiring rebuild exceeds the MDM balancer's per-cycle limit of 1,024 rows

Each rebalance cycle can process up to 1,024 metadata rows. When more rows need rebuilding, the balancer cannot finish the current plan before generating the next one.

What happens:

  • The SDS abruptly decouples from the MDM (event SDS_DECOUPLED)
  • All SDCs that were connected to that SDS lose their connections → SDC disconnect events
  • The MDM marks the cluster DEGRADED (event MDM_DATA_DEGRADED)
  • Because the number of rows to rebuild is > 1,024, the MDM balancer cannot finish the current rebalance plan
  • The balancer starts a new plan while the previous one is still running, producing rapid, repeated role-switch events
  • Client SDCs see continuous IO failures (IO_FAULT_NOT_PRI, SCSI sense 0x4). After retries are exhausted, the host OS reports IO errors, timeouts, or a read-only filesystem

  • MDM trace evidence:

When the rebalance workload exceeds the 1,024-row limit, the MDM trace shows the threshold being crossed:

2026/03/28 22:43:53.246702 MED:7f1f984aedb0:balanceExec_HandleDegradedRows:00343: BALANCER: Storage Pool: 1193844800000000 - 1024 rows processed out of 1098 degraded rows. 0 allocation failures. 0 cumulative allocation failures.

 

This indicates that 1,098 rows required rebuilding, but only 1,024 could be processed in the current cycle. The remaining rows trigger a new rebalance plan before the previous one completes, starting the feedback loop.
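
A collected MDM trace can be checked for this condition by filtering the balancer lines and comparing the degraded-row count against the 1,024-row per-cycle limit. The awk sketch below is illustrative only; <mdm_trace_file> is a placeholder for the MDM trace gathered with getinfo, and the field handling assumes the exact message wording shown in the sample line above.

# Print balancer cycles where the degraded-row count exceeds the 1,024-row per-cycle limit
awk '/balanceExec_HandleDegradedRows/ {
    for (i = 2; i <= NF; i++)
        if ($i == "degraded" && $(i-1) + 0 > 1024)
            print $0
}' <mdm_trace_file>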

  • Chain of events: 
    Log Source     Event / Pattern
    MDM events     SDS_DECOUPLED — SDS formally declared dead
    MDM events     MDM_DATA_DEGRADED — Cluster enters DEGRADED state
    SDS traces     Flood of IO_FAULT_NOT_PRI — SDS received IO for a comb it is no longer primary for
    ESXi vmkernel  SCSI sense data: 0x4 0x0 0x0 — Hardware error
    MDM events     MULTIPLE_SDC_CONNECTIVITY_CHANGES — Mass SDC connectivity storm
    MDM events     SDC_DISCONNECTED_FROM_SDS_IP — SDCs losing contact with the failed SDS

 

  • Example MDM event sequence:
SDC_DISCONNECTED_FROM_SDS_IP    SDC disconnected from SDS <name>
SDS_DECOUPLED                   SDS <name> decoupled
MDM_DATA_DEGRADED               The system is now in DEGRADED state
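
When reviewing a case, the same sequence can be pulled out of an exported MDM event log with a single filter. This is a generic example; <mdm_events_export> is a placeholder for whatever file the MDM events were exported or collected into.

# Extract the Scenario 1 event sequence from an exported MDM event log
grep -E "SDC_DISCONNECTED_FROM_SDS_IP|SDS_DECOUPLED|MDM_DATA_DEGRADED" <mdm_events_export>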

 

 

Scenario 2: SDS Power-Off During PMM Entry

  • When it can happen:

This is a very rare scenario that requires two simultaneous events:

  • An SDS is in the process of entering Protected Maintenance Mode (PMM)
  • The SDS fails or is powered off before the PMM transition completes

  • What happens: 

  • The MDM receives the PMM entry command and records it as succeeded
  • The SDS unexpectedly decouples while the PMM entry is still in progress
  • The MDM marks the cluster DEGRADED
  • The role balancer enters a sustained role-switch loop throughout the PMM enter phase
  • Non-PMM data rows are repeatedly role-switched across the entire storage pool
  • The storm persists until the SDS re-joins the cluster and completes the maintenance mode transition

  • Chain of events:
    Log Source  Event / Pattern
    MDM events  CLI_COMMAND_SUCCEEDED — enter_protected_maintenance_mode command succeeded
    MDM events  SDS_DECOUPLED — SDS decoupled before maintenance mode started
    MDM events  MDM_DATA_DEGRADED — Cluster enters DEGRADED state
    SDS traces  Repeated role-switch operations across non-PMM rows

 

  • Example MDM event sequence:
    CLI_COMMAND_SUCCEEDED            Command enter_protected_maintenance_mode succeeded
    SDS_DECOUPLED                    SDS <name> decoupled
    MDM_DATA_DEGRADED               The system is now in DEGRADED state

 

  • When the SDS re-joins and PMM completes:
    SDS_MAINTENANCE_MODE_STARTED     SDS maintenance mode started
    MDM_DATA_NORMAL                 The system is now in NORMAL state

 

 

Scenario 3: SDS in Instant Maintenance Mode (IMM)

  • When it can happen:

An SDS enters or exits Instant Maintenance Mode (IMM). This scenario occurs when a single SDS is in maintenance mode and the role balancer repeatedly changes which SDS should handle IO for specific data.

  • What happens:

  • The system repeatedly changes which SDS is responsible for serving the same data
  • These constant changes mean the SDC no longer knows which SDS should receive its IO requests
  • IO is sent to the wrong SDS, causing retries and delays
  • Applications experience latency or timeouts while trying to access the affected data

  • Impact:

  • Customer impact: Applications report latency and timeouts while the SDS is in IMM
  • Duration: Continues while the SDS is in IMM state
  • Recovery: Automatic — resolves when the SDS exits IMM

  • Chain of events:
    Log Source  Event / Pattern
    SDS traces  Repeated role-switch operations on the same data
    SDS traces  Primary and secondary role switches on identical data

 

Scenario 4: SDS Exit from Protected Maintenance Mode (PMM)

  • When it can happen:

An SDS exits Protected Maintenance Mode (PMM). This scenario occurs during every PMM exit — it is not a rare event, but the severity depends on how long the maintenance mode operation lasted.

  • What happens:

  • As the SDS exits PMM, the role balancer must reassign data segments to include the returning SDS
  • The rebalance process affects the entire storage pool, not just data on the returning SDS
  • Role switches occur across many data segments during the reintegration
  • Applications may experience brief IO errors or latency as the role assignments stabilize

  • Impact:

  • Customer impact: For short maintenance windows (less than 5 seconds), the impact is barely noticeable. For extended maintenance with active IO, thousands of role switches can occur, causing sustained IO stalls
  • Duration: Continues during the reintegration phase until the rebalance completes
  • Recovery: Automatic

  • Chain of events:
    Log Source  Event / Pattern
    MDM events  Role-switch operations across the storage pool during exit
    SDS traces  Repeated role-switch operations during reintegration

 

  • Example MDM event sequence:
    SDS_MAINTENANCE_MODE_EXIT_STARTED    SDS maintenance mode exit started
    SDS_MAINTENANCE_MODE_EXIT_COMPLETED   SDS maintenance mode exit completed

 

Log Outputs

Note: The log examples below are generic representations of the patterns observed across multiple occurrences of this issue. They are not tied to any specific scenario listed above — the same patterns appear regardless of the trigger.

MDM Event Logs: The MDM event log shows the cluster-level sequence. The key indicators are the DEGRADED-state events and repeated role-switch operations during the affected window.

SDS Trace Logs: On SDS nodes, trace logs will show repeated role-switch operations during reintegration:

raidComb_SetPriTgtGenNum: combId <id> combGenNum: cur <gen> new <gen>
contCmd_SetCombState: CombId <id> devId <id> PRI->SEC Switch roles
contCmd_SetCombState: CombId <id> devId <id> SEC->PRI Switch roles

 

A high volume of Switch roles entries in a short time window (thousands or more within seconds) is the definitive SDS-side indicator of this issue.
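
To quantify this on a collected SDS trace, the Switch roles entries can be bucketed per second. The one-liner below is a generic sketch; <sds_trace_file> is a placeholder, and the cut assumes timestamps in the same YYYY/MM/DD HH:MM:SS.microseconds format as the MDM trace sample shown earlier, so adjust it if the SDS trace format differs.

# Count "Switch roles" entries per second; thousands within a single second indicates a role-switch storm
grep "Switch roles" <sds_trace_file> | cut -d'.' -f1 | sort | uniq -c | sort -rn | head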

SDC / Host Logs: VMware (ESXi) SDC IO retries showing the comb, target SDS, and fault code:

vmkernel log
PowerFlex mapVolIO_Do_CK:1496 :Mit: <addr>. Retrying IO Type WRITE. Failed comb: <id>. SDS_ID <id>. Comb Gen <gen>. Head Gen <gen>.
PowerFlex mapVolIO_Do_CK:1510 :Mit: <addr>. Vol ID <id>. Last fault Status IO_FAULT_NOT_PRI(12). Retry count (1)

 

If retries are exhausted, SCSI errors are returned to the host: 

sense data: 0x4 0x0 0x0   -- SCSI Hardware Error

Diagnostic tip: If you see IO errors on multiple SDS nodes (not just the one that had an issue), this may indicate a role-switch storm rather than normal degraded-state behavior. If IO errors are isolated to a single SDS, this is expected degraded-state behavior.
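
One way to apply this tip on an ESXi host is to group the retry messages by their SDS_ID field. The one-liner below assumes the retry message format shown above and the default vmkernel log location; the log can also be copied to a Linux host before filtering if the ESXi shell tools are limited.

# Group PowerFlex IO retries by target SDS; high counts against multiple SDS_IDs suggest a role-switch storm
grep "Retrying IO Type" /var/log/vmkernel.log | grep -o "SDS_ID [^ .]*" | sort | uniq -c | sort -rn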

 

Scenario 5: Maintenance Mode Phase Transitions 

  • When it can happen:

During the transition when an SDS enters or exits maintenance mode (IMM or PMM) — specifically at the moment the state changes from normal to MM, or from MM back to normal.

  • What happens:

  • The role balancer redistributes data responsibilities to accommodate the change
  • Brief bursts of role switches occur as the system settles into the new arrangement
  • Applications may experience short latency spikes during the transition

  • Impact:

  • Customer impact: Brief latency spikes lasting seconds to a few minutes. Usually below application timeout thresholds
  • Duration: Lasts seconds to a few minutes, then settles
  • Recovery: Automatic

  • Chain of events:
    Log Source  Event / Pattern
    SDS traces  Brief role-switch operations during phase transitions

 

Cause

A software defect in the MDM role-balance logic causes a feedback loop when the cluster transitions state due to an SDS loss or maintenance mode operation. Under certain conditions, the MDM repeatedly reassigns which SDS nodes are responsible for serving IO to affected combs. Each reassignment invalidates the SDC's cached view of where data is located, forcing IO retries. When a large number of combs are affected simultaneously, the volume of reassignments outpaces the SDCs' ability to update, resulting in sustained IO errors across multiple hosts. The storm is typically self-limiting. It resolves once the cluster stabilizes, but the duration depends on the size of the protection domain and IO load at the time of the event.

Resolution

This issue is fixed in PowerFlex Core version 4.5.6. Upgrade to this version once available. Contact Dell Support for release timeline information.

-For planned maintenance operations:
  • Do not power-cycle or reboot an SDS until the MDM logs SDS_MAINTENANCE_MODE_STARTED. Verify the SDS has fully entered maintenance mode before proceeding with physical maintenance.
  • Monitor for latency spikes when entering or exiting maintenance mode.

-For unplanned SDS outages:

  • The storm is self-limiting and typically resolves within minutes as the cluster stabilizes. If the issue is observed, collect getinfo logs from all SDS nodes in the protection domain and from all manager MDMs as soon as possible after the event, then contact Dell Support.

-In rare cases where the issue does not self-resolve, temporarily disabling and re-enabling rebuild can allow the MDM to stabilize:

scli --set_rebuild_mode --protection_domain_name <pd_name> --storage_pool_name <sp_name> --disable_rebuild

 

# Wait 5-10 seconds, then enable rebuild: 

scli --set_rebuild_mode --protection_domain_name <pd_name> --storage_pool_name <sp_name> --enable_rebuild
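
If Dell Support advises this workaround, the two commands can be wrapped in a short sequence so that the window with rebuild disabled stays as brief as possible. This is a minimal sketch of the steps above, assuming an already authenticated scli session; replace the placeholder Protection Domain and Storage Pool names with the real ones.

#!/bin/bash
# Briefly disable rebuild, pause, then re-enable it for one Storage Pool (per the steps above)
PD="<pd_name>"
SP="<sp_name>"

scli --set_rebuild_mode --protection_domain_name "$PD" --storage_pool_name "$SP" --disable_rebuild
sleep 10   # wait 5-10 seconds for the MDM balancer to settle
scli --set_rebuild_mode --protection_domain_name "$PD" --storage_pool_name "$SP" --enable_rebuild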

 

Important: The cluster has reduced redundancy while rebuild is disabled. Only disable rebuild long enough for the system to stabilize, then re-enable immediately. It is recommended to perform this action with Dell Support guidance. 
Note: Rebuild is managed at the Storage Pool level. If the affected SDS has devices in multiple Storage Pools, apply this action to each affected Storage Pool. Storage Pools that do not contain devices from the affected SDS are not impacted. The Protection Domain, Storage Pool, and SDS-to-device mapping can be identified from the scli --query_all command output. 
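
As an illustration, the relevant names can be pulled from the scli --query_all output with a simple filter. The grep patterns below are assumptions about how the output is worded; review the full output if they do not match your version.

# Capture the full inventory, then filter for Protection Domain, Storage Pool, and SDS entries
scli --query_all > query_all.txt
grep -E "Protection Domain|Storage Pool|SDS" query_all.txt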

 

Affected Products

PowerFlex rack, PowerFlex Appliance, PowerFlex rack connectivity, PowerFlex Software
Article Properties
Article Number: 000450312
Article Type: Solution
Last Modified: 01 May 2026
Version:  4