PowerFlex 3.X: SDS Crashes in Function mgStorageRegion_CopyFromBuffer During NDU

Summary: The SDS crashes in the function mgStorageRegion_CopyFromBuffer.

Symptoms

- Upgrading from 3.0 to 3.6

- The SDS is exiting Instant Maintenance Mode (IMM)

- Inflight checksum is enabled

The SDS crashes with the following error:

2022/12/20 10:22:18.543129 Panic in file /data/build/workspace/ScaleIO-Common-Job/src/tgt/storage/mg_impl/mg_storage_region.c, line 4188, function mgStorageRegion_CopyFromBuffer, PID 13108.Panic Expression !(bufferSizeInBytes != ((sizeInLbs) * (512))) PANIC_ID_tgt_1517847817759.
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(mosDbg_PanicPrepare+0x13a) [0x93ab8a]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(mgStorageRegion_CopyFromBuffer+0x1a7) [0x810477]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(mgStorageRegion_CopyFromCachedBuffer+0x33) [0x810873]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(storageRegion_CopyFromCachedBuffer+0xde) [0x4d3f3e]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(mgStorageRegion_ReadFromSyncBuf+0x45) [0x4d4a85]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(raidComb_ReadFromRemoteUntilEntireRegionIsAcquired+0x451) [0x74e021]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(raidComb_ReadVolumeFromRemote+0x3ae) [0x74e8ee]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(raidReverseRebuild_LoopAndCopy+0x140f) [0x75518f]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(raidReverseRebuild_Start+0xb56) [0x7583c6]
/opt/emc/scaleio/sds/bin/sds-3.6.400.107(raidSyncPool_StartJob+0x375) [0x5e3145]

Inflight checksum is enabled on the following storage pool:

Storage Pool durpsdsc7p8pool1 (Id: 469b0f6800000007) has 7 volumes and 236.9 TB (242550 GB) free net capacity. 2.5 PB (2524 TB) volume allocation limit.
...
Inflight Checksum: Enabled
...
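
To check whether an SDS log contains this panic signature, a filter along the following lines may help. This is a minimal sketch: the function name `find_copyfrombuffer_panic` is hypothetical, and the log location is an assumption that depends on the environment, so the helper reads log text from standard input.

```shell
# Hypothetical helper (not part of PowerFlex): reads SDS log text on stdin
# and prints every panic line that names mgStorageRegion_CopyFromBuffer.
find_copyfrombuffer_panic() {
    grep -E 'Panic in file .*function mgStorageRegion_CopyFromBuffer'
}

# Example usage (the path is a placeholder; substitute the actual SDS log file):
#   find_copyfrombuffer_panic < /path/to/sds/log
```

The function exits with a nonzero status when no matching line is found, so it can also be used in an `if` condition.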

 

Impact

The SDS crashes and cannot exit IMM. If this happens during an upgrade, the upgrade cannot continue.

Cause

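This is an edge case in a very outdated version that is no longer supported, so the workaround described in the Resolution section below must be applied.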

Resolution

Disable inflight checksum on the storage pools that the SDS contributes to:

1) Query all SDSs to obtain the SDS name/ID:

[root@nestedsvm2 ~]# scli --query_all_sds
Query-all-SDS returned 3 SDS nodes.

Protection Domain 8eeacbf900000000 Name: pd1
SDS ID: ab471ceb00000002 Name: svm103 State: Connected, Joined IP: 15.15.15.103 Port: 7072 Version: 3.6.500
SDS ID: ab471cea00000001 Name: svm102 State: Connected, Joined IP: 15.15.15.102 Port: 7072 Version: 3.6.500
SDS ID: ab471ce800000000 Name: svm101 State: Connected, Joined IP: 15.15.15.101 Port: 7072 Version: 3.6.500
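
When there are many SDSs, the ID and name columns can be extracted from this output with a small filter. The sketch below assumes the line layout shown above (`SDS ID: <id> Name: <name> ...`) and an SDS name without spaces; `list_sds_ids` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `scli --query_all_sds` output on stdin and
# prints one "<sds_id> <sds_name>" pair per line.
# Assumes each SDS line starts with "SDS ID:" as in the sample output.
list_sds_ids() {
    awk '$1 == "SDS" && $2 == "ID:" { print $3, $5 }'
}

# Example usage on an MDM host:
#   scli --query_all_sds | list_sds_ids
```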

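2) Query the SDS by name/ID and look for "Storage Pool:":

 Note: This lists the storage pool of each device. In this case there is only one, the storage pool named "sp1".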

[root@nestedsvm2 ~]# scli --query_sds --sds_id ab471ceb00000002 | grep -i 'storage pool:'
         1: Storage Pool: sp1  inflight requests factor: 115, inflight bandwidth factor 115
                Storage Pool: sp1, Capacity: 198 GB, State: Normal
                Storage Pool: sp1, Capacity: 98 GB, State: Normal
                Storage Pool: sp1, Capacity: 98 GB, State: Normal
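
Because the same pool appears once per device, it can be convenient to reduce this output to a list of unique pool names. The following is a sketch under the assumption that pool names contain no spaces or commas; `list_sds_storage_pools` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `scli --query_sds` output on stdin and prints
# each distinct storage pool name once.
# Assumes the "Storage Pool: <name>" layout shown in the sample output.
list_sds_storage_pools() {
    sed -n 's/.*[Ss]torage [Pp]ool: *\([^ ,]*\).*/\1/p' | sort -u
}

# Example usage:
#   scli --query_sds --sds_id ab471ceb00000002 | list_sds_storage_pools
```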

3) Query the storage pool and look for "Inflight Checksum:":

[root@nestedsvm2 ~]# scli --query_storage_pool --storage_pool_name sp1 --protection_domain_name pd1 | grep -i 'Inflight checksum'
        Inflight Checksum: Enabled
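
For scripted checks across several pools, the state can be read from the query output rather than inspected by eye. This is a minimal sketch that assumes the `Inflight Checksum: <state>` line format shown above; `inflight_checksum_state` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `scli --query_storage_pool` output on stdin and
# prints the value of the first "Inflight Checksum:" line (e.g. "Enabled").
inflight_checksum_state() {
    awk -F': *' '/Inflight Checksum:/ {
        sub(/[[:space:]]+$/, "", $2)   # drop trailing whitespace, if any
        print $2
        exit
    }'
}

# Example usage:
#   scli --query_storage_pool --storage_pool_name sp1 --protection_domain_name pd1 \
#       | inflight_checksum_state
```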

3.a) Inflight checksum can be disabled either with scli or in the presentation server. To disable it with scli, run the following command:

[root@nestedsvm2 ~]# scli --set_checksum_mode --protection_domain_name pd1 --storage_pool_name sp1 --disable_inflight_checksum
Checksum mode modified successfully

3.b) In the presentation server, navigate to Storage Pools > select the storage pool > Modify > General > clear the "Enable Inflight Checksum" check box.
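
If the SDS contributes to more than one storage pool, the scli command from step 3.a has to be repeated for each pool. The sketch below only prints the commands so they can be reviewed before being run; `print_disable_commands` is a hypothetical helper, and it uses only the scli flags shown in step 3.a.

```shell
# Hypothetical helper: prints (does not execute) one disable command per pool.
# Arguments: <protection_domain_name> <storage_pool_name>...
print_disable_commands() {
    pd_name=$1
    shift
    for pool in "$@"; do
        printf 'scli --set_checksum_mode --protection_domain_name %s --storage_pool_name %s --disable_inflight_checksum\n' \
            "$pd_name" "$pool"
    done
}

# Example usage:
#   print_disable_commands pd1 sp1
```

After running the printed commands, repeating the query from step 3 confirms whether the setting changed.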

 

Affected Versions

PowerFlex 3.5.x

PowerFlex 3.6.x

Affected Products

PowerFlex appliance connectivity

Products

PowerFlex rack, VxFlex Ready Nodes, PowerFlex custom node, PowerFlex appliance R650, PowerFlex appliance R6525, PowerFlex appliance R660, PowerFlex appliance R6625, PowerFlex appliance R750, PowerFlex appliance R760, PowerFlex appliance R7625, PowerFlex appliance R640, PowerFlex appliance R740XD, PowerFlex appliance R7525, PowerFlex appliance R840 ...
Article Properties
Article Number: 000225317
Article Type: Solution
Last Modified: 03 Feb 2025
Version:  3