PowerFlex 3.X 在 NDU 期间执行mgStorageRegion_CopyFromBuffer功能时 SDS 崩溃
Summary: SDS 在功能mgStorageRegion_CopyFromBuffer时崩溃
This article applies to
This article does not apply to
This article is not tied to any specific product.
Not all product versions are identified in this article.
Symptoms
- 从 3.0 升级到 3.6
- SDS 退出即时维护模式 (IMM)
- 启用了动态校验和。
SDS 崩溃并出现以下死机:
2022/12/20 10:22:18.543129 Panic in file /data/build/workspace/ScaleIO-Common-Job/src/tgt/storage/mg_impl/mg_storage_region.c, line 4188, function mgStorageRegion_CopyFromBuffer, PID 13108.Panic Expression !(bufferSizeInBytes != ((sizeInLbs) * (512))) PANIC_ID_tgt_1517847817759. /opt/emc/scaleio/sds/bin/sds-3.6.400.107(mosDbg_PanicPrepare+0x13a) [0x93ab8a] /opt/emc/scaleio/sds/bin/sds-3.6.400.107(mgStorageRegion_CopyFromBuffer+0x1a7) [0x810477] /opt/emc/scaleio/sds/bin/sds-3.6.400.107(mgStorageRegion_CopyFromCachedBuffer+0x33) [0x810873] /opt/emc/scaleio/sds/bin/sds-3.6.400.107(storageRegion_CopyFromCachedBuffer+0xde) [0x4d3f3e] /opt/emc/scaleio/sds/bin/sds-3.6.400.107(mgStorageRegion_ReadFromSyncBuf+0x45) [0x4d4a85] /opt/emc/scaleio/sds/bin/sds-3.6.400.107(raidComb_ReadFromRemoteUntilEntireRegionIsAcquired+0x451) [0x74e021] /opt/emc/scaleio/sds/bin/sds-3.6.400.107(raidComb_ReadVolumeFromRemote+0x3ae) [0x74e8ee] /opt/emc/scaleio/sds/bin/sds-3.6.400.107(raidReverseRebuild_LoopAndCopy+0x140f) [0x75518f] /opt/emc/scaleio/sds/bin/sds-3.6.400.107(raidReverseRebuild_Start+0xb56) [0x7583c6] /opt/emc/scaleio/sds/bin/sds-3.6.400.107(raidSyncPool_StartJob+0x375) [0x5e3145]
在存储池上启用了动态校验和功能:
Storage Pool durpsdsc7p8pool1 (Id: 469b0f6800000007) has 7 volumes and 236.9 TB (242550 GB) free net capacity. 2.5 PB (2524 TB) volume allocation limit. ... Inflight Checksum: Enabled ...
影响
SDS 崩溃,无法退出 IMM。如果在升级过程中,升级将无法继续。
Cause
由于这是一种极端情况,而且这是一个非常过时的版本,不再受支持,因此必须实施下面的“解决方案”部分中所述的解决方法。
Resolution
在 SDS 有助于以下目标的存储池上禁用动态校验和:
1) 查询所有 SDS 以获取 SDS 名称/id:
[root@nestedsvm2 ~]# scli --query_all_sds Query-all-SDS returned 3 SDS nodes. Protection Domain 8eeacbf900000000 Name: pd1 SDS ID: ab471ceb00000002 Name: svm103 State: Connected, Joined IP: 15.15.15.103 Port: 7072 Version: 3.6.500 SDS ID: ab471cea00000001 Name: svm102 State: Connected, Joined IP: 15.15.15.102 Port: 7072 Version: 3.6.500 SDS ID: ab471ce800000000 Name: svm101 State: Connected, Joined IP: 15.15.15.101 Port: 7072 Version: 3.6.500
2) 按名称/ID 查询 SDS,并查找“Storage Pool:":
提醒:它向其提供设备的存储池;在这种情况下,只有一个,即存储池名称“SP1”。
[root@nestedsvm2 ~]# scli --query_sds --sds_id ab471ceb00000002 | grep -i 'storage pool:'
1: Storage Pool: sp1 inflight requests factor: 115, inflight bandwidth factor 115
Storage Pool: sp1, Capacity: 198 GB, State: Normal
Storage Pool: sp1, Capacity: 98 GB, State: Normal
Storage Pool: sp1, Capacity: 98 GB, State: Normal
3) 查询存储池并查找“Inflight Checksum:":
[root@nestedsvm2 ~]# scli --query_storage_pool --storage_pool_name sp1 --protection_domain_name pd1 | grep -i 'Inflight checksum'
Inflight Checksum: Enabled
3.a) 要禁用动态校验和,可以使用 scli 或在演示服务器中完成。要使用 scli 禁用它,请运行以下命令:
[root@nestedsvm2 ~]# scli --set_checksum_mode --protection_domain_name pd1 --storage_pool_name sp1 --disable_inflight_checksum Checksum mode modified successfully
3.b) 在演示服务器中,导航到“存储池”, > 选择“存储池 > 修改 > 常规 > ”,取消选中“启用动态校验和” > 应用”。
受影响的版本
PowerFlex 3.5.x
PowerFlex 3.6.x
Affected Products
PowerFlex appliance connectivityProducts
PowerFlex rack, VxFlex Ready Nodes, PowerFlex custom node, PowerFlex appliance R650, PowerFlex appliance R6525, PowerFlex appliance R660, PowerFlex appliance R6625, Powerflex appliance R750, PowerFlex appliance R760, PowerFlex appliance R7625
, PowerFlex appliance R640, PowerFlex appliance R740XD, PowerFlex appliance R7525, PowerFlex appliance R840
...
Article Properties
Article Number: 000225317
Article Type: Solution
Last Modified: 03 Feb 2025
Version: 3
Find answers to your questions from other Dell users
Support Services
Check if your device is covered by Support Services.