PowerFlex 3.X: SDS crashes in the mgStorageRegion_CopyFromBuffer function during an NDU

Summary: During a non-disruptive upgrade (NDU), an SDS crashes in the function mgStorageRegion_CopyFromBuffer.


Symptoms

- An upgrade from 3.0 to 3.6 is in progress.

- The SDS is exiting Instant Maintenance Mode (IMM).

- Inflight Checksum is enabled.

The SDS crashes with the following panic:

2022/12/20 10:22:18.543129 Panic in file /data/build/workspace/ScaleIO-Common-Job/src/tgt/storage/mg_impl/mg_storage_region.c, line 4188, function mgStorageRegion_CopyFromBuffer, PID 13108.Panic Expression !(bufferSizeInBytes != ((sizeInLbs) * (512))) PANIC_ID_tgt_1517847817759.
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(mosDbg_PanicPrepare+0x13a) [0x93ab8a]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(mgStorageRegion_CopyFromBuffer+0x1a7) [0x810477]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(mgStorageRegion_CopyFromCachedBuffer+0x33) [0x810873]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(storageRegion_CopyFromCachedBuffer+0xde) [0x4d3f3e]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(mgStorageRegion_ReadFromSyncBuf+0x45) [0x4d4a85]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(raidComb_ReadFromRemoteUntilEntireRegionIsAcquired+0x451) [0x74e021]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(raidComb_ReadVolumeFromRemote+0x3ae) [0x74e8ee]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(raidReverseRebuild_LoopAndCopy+0x140f) [0x75518f]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(raidReverseRebuild_Start+0xb56) [0x7583c6]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(raidSyncPool_StartJob+0x375) [0x5e3145]
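
To check whether an SDS hit this specific panic, its log can be searched for the panic line shown above. The following is a minimal sketch, not an official tool: the function name find_copyfrombuffer_panic and the SDS_LOG variable are hypothetical, and the location of the SDS log on the node is an assumption that must be filled in for the environment.

```shell
# Print every panic line raised in mgStorageRegion_CopyFromBuffer.
# Reads the files given as arguments, or standard input when none are given.
# Exit status is 0 when at least one matching line is found, non-zero otherwise.
find_copyfrombuffer_panic() {
    grep -E 'Panic in file .*function mgStorageRegion_CopyFromBuffer,' "$@"
}

# Example (SDS_LOG is a hypothetical variable; set it to the SDS log file on the node):
# find_copyfrombuffer_panic "$SDS_LOG"
```

The pattern matches only the "Panic in file" line, not the stack frames that follow it, so each crash is reported once.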

The Inflight Checksum feature is enabled on the Storage Pool(s):

Storage Pool durpsdsc7p8pool1 (Id: 469b0f6800000007) has 7 volumes and 236.9 TB (242550 GB) free net capacity. 2.5 PB (2524 TB) volume allocation limit.
...
Inflight Checksum: Enabled
...

 

Impact

The SDS crashes and cannot exit IMM. If this occurs during an upgrade, the upgrade cannot continue.

Cause

This issue is an edge case that involves a very outdated version that is no longer supported. It is therefore essential to implement the workaround described in the Resolution section below.

Resolution

Disable Inflight Checksum on each Storage Pool that the SDS contributes devices to:

1) Query all SDSs to get the SDS name or ID:

[root@nestedsvm2 ~]# scli --query_all_sds
Query-all-SDS returned 3 SDS nodes.

Protection Domain 8eeacbf900000000 Name: pd1
SDS ID: ab471ceb00000002 Name: svm103 State: Connected, Joined IP: 15.15.15.103 Port: 7072 Version: 3.6.500
SDS ID: ab471cea00000001 Name: svm102 State: Connected, Joined IP: 15.15.15.102 Port: 7072 Version: 3.6.500
SDS ID: ab471ce800000000 Name: svm101 State: Connected, Joined IP: 15.15.15.101 Port: 7072 Version: 3.6.500
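
When a system has many SDSs, the ID and name can be pulled out of the output above with a small filter. This is a sketch that assumes each SDS line has the layout shown (SDS ID: <id> Name: <name> ...) and that SDS names contain no spaces; list_sds is a hypothetical helper name, not part of scli.

```shell
# Print "<sds_id> <sds_name>" for every SDS line in `scli --query_all_sds` output.
# Reads the files given as arguments, or standard input when none are given.
# Assumption: SDS names contain no whitespace.
list_sds() {
    awk '$1 == "SDS" && $2 == "ID:" && $4 == "Name:" { print $3, $5 }' "$@"
}

# Example, using the query command from step 1:
# scli --query_all_sds | list_sds
```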

2) Query the SDS by either name or ID and look for "Storage Pool: ":

 Note the Storage Pool(s) that the SDS contributes devices to. In this example there is only one, named "sp1".

[root@nestedsvm2 ~]# scli --query_sds --sds_id ab471ceb00000002 | grep -i 'storage pool:'
         1: Storage Pool: sp1  inflight requests factor: 115, inflight bandwidth factor 115
                Storage Pool: sp1, Capacity: 198 GB, State: Normal
                Storage Pool: sp1, Capacity: 98 GB, State: Normal
                Storage Pool: sp1, Capacity: 98 GB, State: Normal
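
Because the same Storage Pool is listed once per device, the output above can be reduced to a list of unique pool names. The sketch below assumes that pool names contain no spaces or commas and that the label is printed exactly as "Storage Pool: "; list_sds_pools is a hypothetical helper name.

```shell
# Print each distinct Storage Pool name found in `scli --query_sds` output.
# Reads standard input. Assumption: pool names contain no spaces or commas.
list_sds_pools() {
    sed -n 's/.*Storage Pool: \([^, ]*\).*/\1/p' | sort -u
}

# Example, using the query command from step 2:
# scli --query_sds --sds_id ab471ceb00000002 | list_sds_pools
```

Each name printed is a Storage Pool to check in step 3.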

3) Query the Storage Pool and look for "Inflight Checksum: ":

[root@nestedsvm2 ~]# scli --query_storage_pool --storage_pool_name sp1 --protection_domain_name pd1 | grep -i 'Inflight checksum'
        Inflight Checksum: Enabled

3.a) Inflight Checksum can be disabled using scli or in the Presentation Server. To disable it using scli, run the following command:

[root@nestedsvm2 ~]# scli --set_checksum_mode --protection_domain_name pd1 --storage_pool_name sp1 --disable_inflight_checksum
Checksum mode modified successfully

3.b) To disable it in the Presentation Server, navigate to Storage Pools, select the Storage Pool, click Modify > General, clear "Enable Inflight Checksum", and click Apply.
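
After the change, the query from step 3 can be re-run to confirm that the pool no longer reports Inflight Checksum as enabled. The following sketch wraps that check in a function; inflight_checksum_enabled is a hypothetical helper name, and it tests only for the "Enabled" value shown in step 3, treating any other output as not enabled.

```shell
# Succeed (exit status 0) if the Storage Pool query output on standard input
# reports "Inflight Checksum: Enabled"; fail otherwise.
inflight_checksum_enabled() {
    grep -q -i -E 'inflight checksum:[[:space:]]*enabled'
}

# Example, using the query command from step 3:
# if scli --query_storage_pool --storage_pool_name sp1 --protection_domain_name pd1 | inflight_checksum_enabled; then
#     echo "Inflight Checksum is still enabled on sp1"
# fi
```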

 

Impacted Versions

PowerFlex 3.5.x

PowerFlex 3.6.x

Affected Products

PowerFlex appliance connectivity

Products

PowerFlex rack, VxFlex Ready Nodes, PowerFlex custom node, PowerFlex appliance R650, PowerFlex appliance R6525, PowerFlex appliance R660, PowerFlex appliance R6625, PowerFlex appliance R750, PowerFlex appliance R760, PowerFlex appliance R7625, PowerFlex appliance R640, PowerFlex appliance R740XD, PowerFlex appliance R7525, PowerFlex appliance R840 ...
Article Properties
Article Number: 000225317
Article Type: Solution
Last Modified: 03 Feb 2025
Version:  3