PowerFlex: Excessive Data Role-switching Causes IO Latency and Errors
Summary: Under certain cluster state transitions, the MDM role-balance logic can produce rapid, repeated primary/secondary role switches across a large number of combs (internal data structures that track which SDS nodes store each piece of volume data). Each role switch invalidates client-side (SDC) comb maps and forces IO retries. When enough combs are affected simultaneously, the cumulative retry overhead causes IO latency spikes and IO errors on SDC hosts. Depending on the host environment, this can result in application IO timeouts, VMs entering read-only state, or filesystem unavailability. This behavior has been observed under multiple trigger scenarios and is not limited to any single operational procedure.
Symptoms
Common Indicators
- MDM_DATA_DEGRADED event followed by sustained IO latency lasting 1-15+ minutes
- SDC hosts report IO errors and/or IO timeouts during the degraded window
- VMware (ESXi): VMFS heartbeat timeouts, SCSI hardware errors (sense data: 0x4 0x0 0x0), VMs entering read-only state, potential HA failover
- Linux: IO errors in system logs (/var/log/messages, dmesg), applications may experience IO timeouts or filesystems remounting read-only
- MDM event logs show the system in DEGRADED state longer than expected for a single SDS loss
- System eventually self-recovers to NORMAL state without manual intervention (in most cases)
Note: As of this writing, this issue has only been reported in environments with VMware (ESXi) and Linux SDC hosts. There are no known reports of this behavior impacting Windows SDC hosts, though the underlying defect is in MDM core logic and is not OS-specific.
Scenario 1: Ungraceful SDS Loss (No Maintenance Mode)
An SDS is decoupled unexpectedly. SDCs disconnect from the affected SDS, the cluster enters DEGRADED, and the role-balance storm begins immediately. Example MDM event sequence:
SDC_DISCONNECTED_FROM_SDS_IP SDC disconnected from SDS <name>
SDS_DECOUPLED SDS <name> decoupled
MDM_DATA_DEGRADED The system is now in DEGRADED state
IO errors begin within seconds of the decouple event across multiple surviving SDS nodes, not just the SDS that was lost.
Scenario 2: SDS Power-Off During PMM Entry
An SDS is powered off before it finishes entering Protected Maintenance Mode. The MDM records the PMM request followed by an unexpected decouple before SDS_MAINTENANCE_MODE_STARTED is logged. Example MDM event sequence:
CLI_COMMAND_SUCCEEDED Command enter_protected_maintenance_mode succeeded
SDS_DECOUPLED SDS <name> decoupled
MDM_DATA_DEGRADED The system is now in DEGRADED state
The role-balance storm persists until the SDS re-joins and completes the maintenance mode transition.
Scenario 3: SDS in Instant Maintenance Mode (IMM)
An SDS enters or exits IMM. During the transition, IO latency spikes are observed. The MDM does not report DEGRADED state in this scenario, but SDC hosts experience IO retries and latency until the IMM transition completes.
Scenario 4: SDS Exit from Maintenance Mode
An SDS exits IMM or PMM. During re-integration, IO errors may occur briefly as the role-balance logic reassigns combs to the returning SDS.
Log Outputs:
MDM Event Logs: The MDM event log shows the cluster-level sequence. The key indicators are an SDS decouple followed by DEGRADED state, with the system remaining in DEGRADED longer than expected for a single SDS loss:
SDC_DISCONNECTED_FROM_SDS_IP WARNING SDC <name> disconnected from the IP address <ip> of SDS <name>
MULTIPLE_SDC_CONNECTIVITY_CHANGES INFO Multiple SDC connectivity changes occurred
SDS_DECOUPLED ERROR SDS: <name> (ID: <id>) decoupled
MDM_DATA_DEGRADED ERROR The system is now in DEGRADED state
When the storm resolves and the system stabilizes:
MDM_DATA_NORMAL INFO The system is now in NORMAL state
SDS Trace Logs:
On the surviving SDS nodes (not the SDS that was lost), trace logs will show repeated IO fault responses for combs that should be stable. These indicate the role-balance storm is actively flipping primary/secondary assignments:
raidComb_SetPriTgtGenNum: combId <id> combGenNum: cur <gen> new <gen>
contCmd_SetCombState: CombId <id> devId <id> PRI->SEC Switch roles
contCmd_SetCombState: CombId <id> devId <id> SEC->PRI Switch roles
IO faults seen on surviving SDS nodes during the storm:
IO_FAULT_NOT_PRI -- SDS received IO for a comb it is no longer primary for
IO_FAULT_WRONG_COMB_GEN -- SDC's cached comb generation is stale
IO_HARD_ERROR -- SDS could not complete the IO (partner SDS unreachable)
A high volume of Switch roles entries in a short time window (thousands or more within seconds) is the definitive SDS-side indicator of this issue.
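One way to confirm this SDS-side indicator is to count "Switch roles" entries in a trace log window. The sketch below runs against an illustrative three-line excerpt (comb and device IDs are placeholders); on a real system, point the same grep at the actual SDS trace log path:

```shell
# Count role-switch entries in an SDS trace excerpt (sample data, placeholder IDs).
cat > /tmp/sds_trace_sample.log <<'EOF'
contCmd_SetCombState: CombId 0x1a devId 0x3 PRI->SEC Switch roles
contCmd_SetCombState: CombId 0x1a devId 0x3 SEC->PRI Switch roles
contCmd_SetCombState: CombId 0x2b devId 0x4 PRI->SEC Switch roles
EOF
# Total role switches in the window (thousands within seconds indicates the storm):
grep -c 'Switch roles' /tmp/sds_trace_sample.log
# Per-comb breakdown, busiest combs first (field 3 is the comb ID):
grep 'Switch roles' /tmp/sds_trace_sample.log | awk '{print $3}' | sort | uniq -c | sort -rn
```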
SDC / Host Logs:
VMware (ESXi) SDC IO retries showing the comb, target SDS, and fault code:
vmkernel log
PowerFlex mapVolIO_Do_CK:1496 :Mit: <addr>. Retrying IO Type WRITE. Failed comb: <id>. SDS_ID <id>. Comb Gen <gen>. Head Gen <gen>.
PowerFlex mapVolIO_Do_CK:1510 :Mit: <addr>. Vol ID <id>. Last fault Status IO_FAULT_NOT_PRI(12). Retry count (1)
If retries exhaust, SCSI errors are returned:
sense data: 0x4 0x0 0x0 -- SCSI Hardware Error
VMFS heartbeat timeouts on affected datastores:
HBX: 3089: '<datastore>': HB at offset <offset> - Waiting for timed out HB
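To gauge retry volume on an ESXi host, the vmkernel entries above can be tallied per fault code. A hedged sketch against an illustrative excerpt; the addresses, volume IDs, and the numeric code shown for IO_FAULT_WRONG_COMB_GEN are placeholders, not values from a real system:

```shell
# Tally PowerFlex IO retries per fault code in a vmkernel log excerpt (sample data).
cat > /tmp/vmkernel_sample.log <<'EOF'
PowerFlex mapVolIO_Do_CK:1510 :Mit: 0x1. Vol ID 7. Last fault Status IO_FAULT_NOT_PRI(12). Retry count (1)
PowerFlex mapVolIO_Do_CK:1510 :Mit: 0x2. Vol ID 7. Last fault Status IO_FAULT_NOT_PRI(12). Retry count (2)
PowerFlex mapVolIO_Do_CK:1510 :Mit: 0x3. Vol ID 9. Last fault Status IO_FAULT_WRONG_COMB_GEN(99). Retry count (1)
EOF
# Retries per fault code; a surge of IO_FAULT_NOT_PRI points at the role-balance storm:
grep -o 'IO_FAULT_[A-Z_]*' /tmp/vmkernel_sample.log | sort | uniq -c
```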
Linux SDC hosts -- /var/log/messages or dmesg. IO errors surfaced through the SCSI layer or filesystem:
sd <device>: [scini] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
blk_update_request: I/O error, dev <device>, sector <sector>
EXT4-fs error (device <device>): ext4_journal_check_start: Detected aborted journal
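A quick way to sweep a Linux SDC host for these signatures is a single grep over the kernel log. The sample input below is illustrative (device names and sectors are placeholders); on a live host, run the same pattern against dmesg output or /var/log/messages:

```shell
# Sweep for the three Linux-side IO-error signatures in one pass (sample data).
cat > /tmp/messages_sample <<'EOF'
kernel: sd 2:0:0:1: [scini] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
kernel: blk_update_request: I/O error, dev sdb, sector 204800
kernel: EXT4-fs error (device sdb1): ext4_journal_check_start: Detected aborted journal
EOF
grep -E 'blk_update_request|EXT4-fs error|DRIVER_SENSE' /tmp/messages_sample
```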
Key point: The distinguishing characteristic is IO errors appearing on multiple SDS nodes, not just the SDS that was lost. If IO errors are isolated to the failed SDS, the issue is expected degraded-state behavior, not this defect.
Impact
During the role-balance storm, IO latency spikes cause temporary IO stalls on affected volumes. The duration and severity of impact depends on cluster size, IO load, and the number of combs affected.
Observed impact has included:
- IO stalls lasting from approximately 15 seconds to 15+ minutes
- VMs entering read-only state
- VMFS heartbeat timeouts on ESXi hosts, potentially triggering HA failover or VM power-state alerts
- Application IO timeouts on Linux SDC hosts
- Large-scale environments may experience widespread impact across hundreds or thousands of VMs
No data loss has been observed in any reported occurrence. The system self-recovers once the role-balance storm subsides and the cluster returns to NORMAL state. The severity of impact scales with the size of the protection domain. Environments with a large number of SDS nodes, volumes, and active IO at the time of the event will experience greater impact.
Cause
A software defect in the MDM role-balance logic causes a feedback loop when the cluster transitions state due to an SDS loss or maintenance mode operation. Under certain conditions, the MDM repeatedly reassigns which SDS nodes are responsible for serving IO to affected combs. Each reassignment invalidates the SDC's cached view of where data is located, forcing IO retries. When a large number of combs are affected simultaneously, the volume of reassignments outpaces the SDCs' ability to update, resulting in sustained IO errors across multiple hosts. The storm is typically self-limiting. It resolves once the cluster stabilizes, but the duration depends on the size of the protection domain and IO load at the time of the event.
Resolution
This issue is fixed in PowerFlex Core version 4.5.6. Upgrade to this version once available. Contact Dell Support for release timeline information.
- For planned maintenance operations: …
- For unplanned SDS outages: the storm is self-limiting and typically resolves within minutes as the cluster stabilizes. If the issue is observed, collect …
- In rare cases where the issue does not self-resolve, temporarily disabling and re-enabling rebuild can allow the MDM to stabilize: disable rebuild, wait 5-10 seconds, then re-enable rebuild.
Important: The cluster has reduced redundancy while rebuild is disabled. Only disable rebuild long enough for the system to stabilize, then re-enable immediately. It is recommended to perform this action with Dell Support guidance.
Note: Rebuild is managed at the Storage Pool level. If the affected SDS has devices in multiple Storage Pools, apply this action to each affected Storage Pool. Storage Pools that do not contain devices from the affected SDS are not impacted. The Protection Domain, Storage Pool, and SDS-to-device mapping can be identified from the scli --query_all command output.
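The rebuild toggle described above can be staged in advance. The sketch below is a dry run that only writes the planned commands to a file for review; nothing is executed against the cluster. The scli flag names (--disable_rebuild, --enable_rebuild, --protection_domain_name, --storage_pool_name) and the PD1/SP1 names are assumptions for illustration: verify the exact syntax for your PowerFlex version and perform the action with Dell Support guidance.

```shell
# Dry run: stage the rebuild disable/enable pair for review before running anything.
# PD1 / SP1 are placeholder names; identify the real ones from scli --query_all output.
PD=PD1
SP=SP1
{
  echo "scli --disable_rebuild --protection_domain_name $PD --storage_pool_name $SP"
  echo "sleep 10   # wait 5-10 seconds for the MDM to stabilize"
  echo "scli --enable_rebuild --protection_domain_name $PD --storage_pool_name $SP"
} > /tmp/rebuild_toggle_plan.txt
cat /tmp/rebuild_toggle_plan.txt   # review, then run each line deliberately
```

Re-enable rebuild immediately once the cluster stabilizes; redundancy is reduced the entire time rebuild is disabled.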
Additional Info
Impacted Versions: PowerFlex Core - 4.5.x and lower
Fixed In Version: PowerFlex Core - 4.5.6 and higher