PowerFlex 3.x: SDS service continuously panics with function drl_IsClean

Summary: In rare scenarios, the SDS service may panic continuously in the function drl_IsClean. This issue has been observed when the SDS devices are larger than 2 TB.


Symptoms

SDS service continuously panics with the following stack trace:

/opt/emc/scaleio/sds/logs/exp.0

2024/07/22 21:54:33.819866 Panic in file /data/build/workspace/ScaleIO-Common-Job/src/tgt/bm/drl.c, line 1238, function drl_IsClean, PID 17253.Panic Expression !(offsetInLbs < pDrl->protectedOffsetInLbs) PANIC_ID_tgt_1497349762194.
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(mosDbg_PanicPrepare+0x13a) [0x93ab8a]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(drl_IsClean+0x5e) [0x9346ae]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(mgPhyDev_IsDrlGroupClean+0x4b) [0x93476b]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(mgPhyComb_ReadIntegrityBits+0x130) [0x906040]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(mgStorageRegion_ReadRegionIntegrity+0xb4) [0x906224]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(storageRegion_ReadDirtyRegion+0xad) [0x740f4d]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(raidComb_ReadDrl+0x7d) [0x74105d]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(ioh_ReadCombDrl+0x758) [0x5eb368]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(ioh_NewRequest+0x2084) [0x5fb4a4]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(contNet_RecvIORequest+0x2c4) [0x601534]
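
To confirm that an SDS is hitting this specific panic rather than a different one, the exception log can be searched for the panic header shown above. The following is a minimal sketch, not a Dell-provided tool. The function name count_drl_panics is hypothetical, and the default path is the exp.0 file quoted in this article.

```shell
#!/bin/sh
# Count drl_IsClean panic records in an SDS exception log.
# count_drl_panics is a hypothetical helper name, not part of PowerFlex.
# The default path is the exp.0 file quoted in this article; pass another path to override it.
count_drl_panics() {
    log_file="${1:-/opt/emc/scaleio/sds/logs/exp.0}"
    # Match only the panic header line, not the stack frames that also mention drl_IsClean.
    # grep -c prints the number of matching lines (and exits non-zero when that number is 0).
    grep -c 'Panic in file .*function drl_IsClean' "$log_file"
}
```

Running count_drl_panics with no argument reads the default log. A count that keeps growing between runs is consistent with the continuous panic described in this article.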

 

Impact 

User data unavailability may occur if another SDS is decoupled at the same time because it is in one of the following states:

  • Instant Maintenance Mode (IMM)
  • Error state
  • Ongoing rebuild

Cause

 

The SDS service panics because of large device offsets. The issue has been observed on SDS devices larger than 2 TB.
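
To see which block devices on a node are above the size at which the issue has been observed, the device list can be filtered by size. This is a minimal sketch under stated assumptions: filter_large_devices is a hypothetical helper name, the 2 TB threshold is taken as 2,000,000,000,000 bytes because the article does not say whether decimal or binary units are meant, and lsblk lists every block device on the host, not only those assigned to the SDS.

```shell
#!/bin/sh
# Read "NAME SIZE_IN_BYTES" lines on stdin and print the names of devices
# whose size exceeds a byte threshold.
# filter_large_devices is a hypothetical helper name.
# ASSUMPTION: "2 TB" is treated as 2 * 10^12 bytes; pass another limit as $1 if needed.
filter_large_devices() {
    limit="${1:-2000000000000}"
    # "+ 0" forces a numeric comparison in awk.
    awk -v limit="$limit" '$2 + 0 > limit + 0 { print $1 }'
}

# Usage on a live node (lsblk: -b bytes, -d no partitions, -n no header):
#   lsblk -b -d -n -o NAME,SIZE | filter_large_devices
```

The output still has to be compared against the devices actually configured on the SDS.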

 

Resolution

Fix:

  • PowerFlex 3.6.5 and above (end of support)
  • PowerFlex 4.5 and above
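
A version string can be checked against the fixed releases listed above with a small comparison. This is a sketch only, and is_fixed_version is a hypothetical helper name. It assumes that release 3.6.5 ships as build 3.6.500.x, by analogy with the 3.6.400.107 build string in the stack trace; this article does not confirm that mapping. It also treats 4.x releases below 4.5 as not fixed, following the list above.

```shell
#!/bin/sh
# Return 0 if a build version string is at or above a release listed as fixed.
# is_fixed_version is a hypothetical helper name.
# ASSUMPTION: release 3.6.5 corresponds to build 3.6.500.x (not confirmed by the article).
is_fixed_version() {
    version="$1"
    # Split "major.minor.patch.build" on dots into positional parameters.
    old_ifs="$IFS"
    IFS=.
    set -- $version
    IFS="$old_ifs"
    major="${1:-0}"
    minor="${2:-0}"
    patch="${3:-0}"
    case "$major" in
        3) [ "$minor" -gt 6 ] || { [ "$minor" -eq 6 ] && [ "$patch" -ge 500 ]; } ;;
        4) [ "$minor" -ge 5 ] ;;
        *) [ "$major" -gt 4 ] ;;
    esac
}
```

For example, is_fixed_version 3.6.400.107 returns a non-zero status, which matches the affected build shown in the stack trace.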

 

Workaround:

Try Option 1 first.
If Option 1 does not resolve the issue, proceed to Option 2.

 

Option 1:

    • Enter the SDS node into IMM from the scli command line or the Presentation Server UI.
      • If the SDS node cannot enter IMM, stop the SDS daemon by running the script /opt/emc/scaleio/sds/bin/delete_service.sh.
        Take the necessary precautions to prevent the cluster from entering a Data Unavailability (DU) state. Before stopping the SDS daemon, verify that no rebuild is in progress.
        If you are unsure whether stopping the daemon could cause DU, consult L2 support or a subject matter expert (SME).
    • Once the SDS is in IMM, stop the SDS service:
      /opt/emc/scaleio/sds/bin/delete_service.sh
    • Remove the shared memory on the SDS (including CloudLink shared memory).
      • List the shared memory files with the following commands, then move them to a temporary directory:
        ls -l /dev/shm | egrep -i 'EMC_sds'
        ls -l /dev/shm | egrep 'emc_scaleio_'
    • Start the SDS service
      /opt/emc/scaleio/sds/bin/create_service.sh
    • Take the SDS out of IMM using scli or the Presentation Server UI. A rebuild is expected to start. If the SDS was not in IMM, skip this step.
    • Check the output of the following command to ensure that the SDS is connected:
      scli --query_all_sds
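
The shared memory step in Option 1 can be sketched as a small helper that moves the matching files instead of deleting them, so they can be restored if needed. This is an illustration under stated assumptions, not a Dell-provided script: move_sds_shm is a hypothetical name, /tmp/sds_shm_backup is a hypothetical holding directory, and the helper should run only after the SDS service has been stopped as described above.

```shell
#!/bin/sh
# Move SDS shared-memory files from a source directory into a holding directory.
# Usage: move_sds_shm [SOURCE_DIR] [DEST_DIR]
# move_sds_shm and the default DEST_DIR are hypothetical, chosen for illustration.
move_sds_shm() {
    src="${1:-/dev/shm}"
    dest="${2:-/tmp/sds_shm_backup}"
    mkdir -p "$dest" || return 1
    for f in "$src"/*; do
        [ -e "$f" ] || continue   # the glob matched nothing
        name=$(basename "$f")
        # Same name filters as the article: EMC_sds (any case) or emc_scaleio_
        if printf '%s\n' "$name" | grep -Eiq 'EMC_sds' ||
           printf '%s\n' "$name" | grep -Eq 'emc_scaleio_'; then
            mv -- "$f" "$dest"/ && printf 'moved %s\n' "$name"
        fi
    done
    return 0
}
```

Calling move_sds_shm with no arguments works on /dev/shm. Passing two scratch directories first is a safe way to check which file names it would pick up.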


Option 2:

    • If the system is not in a Data Failure state and sufficient free or spare capacity is available, remove the SDS node from the PowerFlex cluster. Once the rebalance is complete, re-add the SDS node with all its SDS devices.

IMPORTANT:
Background Scanner (BGS) and Partial Device Error (PDE) can cause the issue to recur. If possible, disable BGS or run BGS in "report only" mode.
Persistent Checksum should not trigger the issue by itself. However, a checksum mismatch initiates a small rebuild, which may cause the issue to recur. If possible, disable Persistent Checksum.

Affected Products

PowerFlex Software

Products

PowerFlex Appliance

Article Properties
Article Number: 000228035
Article Type: Solution
Last Modified: 08 Jul 2025
Version: 9