PowerFlex SDS panics due to memory allocation failures
Summary: The SDS process can panic, once or repeatedly, due to memory allocation failures.
Symptoms
Scenario
This problem can be caused either by insufficient memory on the SDS host (for example, the memory assigned to the SVM) or by the operating system configuration.
Symptoms
The SDS process panics with the following backtrace:
01/12 22:26:55.091827 Panic in file /data/build/workspace/ScaleIO-SLES12-2/src/mos/usr/mos_utils.c, line 235, function mos_AllocPageAlignedOrPanic, PID 11191.Panic Expression pMem != ((void *)0) .
/opt/emc/scaleio/sds/bin/sds-2.6.0.133(mosDbg_PanicPrepare+0x11d) [0x4f713d]
/opt/emc/scaleio/sds/bin/sds-2.6.0.133(mos_AllocPageAlignedOrPanic+0x2d) [0x4fa95e]
/opt/emc/scaleio/sds/bin/sds-2.6.0.133(stmp_Allocate+0x110) [0x49c063]
/opt/emc/scaleio/sds/bin/sds-2.6.0.133(phyToothMap_HardenIntern+0x37b) [0x46edcf]
/opt/emc/scaleio/sds/bin/sds-2.6.0.133(phyToothMap_HardenAll+0x39) [0x46f5a3]
/opt/emc/scaleio/sds/bin/sds-2.6.0.133(phyDev_HardenCombArr+0x34) [0x464a31]
/opt/emc/scaleio/sds/bin/sds-2.6.0.133(phyComb_ReadTooth+0x6a) [0x4b2b59]
/opt/emc/scaleio/sds/bin/sds-2.6.0.133(raidMigrate_Start+0x5f0) [0x4bb9e2]
/opt/emc/scaleio/sds/bin/sds-2.6.0.133(raidSyncPool_StartJob+0x1cf) [0x489e7f]
/opt/emc/scaleio/sds/bin/sds-2.6.0.133() [0x50d008]
Depending on the OS settings, OOM (Out-of-memory) errors can be observed in the messages file.
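One way to look for these entries is to search the log for the usual Linux kernel OOM strings. The following is a minimal sketch, not a vendor tool: the helper name find_oom_events is hypothetical, and the log path /var/log/messages is an assumption that may differ by distribution.

```shell
# Hypothetical helper: list OOM-related kernel messages in a log file.
find_oom_events() {
    # $1 = log file; matches the usual Linux kernel OOM message strings
    grep -iE 'out of memory|oom-killer' "$1"
}

# Usage (the log path is an assumption; adjust for the distribution):
# find_oom_events /var/log/messages
```

When strict overcommit is configured, allocations may simply fail without any OOM-killer entry being logged, so an empty result does not rule out memory pressure.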
Impact
This behavior can trigger a DATA_DEGRADED/DATA_FAILURE situation.
Cause
Insufficient memory on the SDS host. In a vSphere environment, check whether enough RAM is assigned to the SVM and whether that memory is reserved. If it is, check the memory configuration at the OS level:
Check the sysctl kernel parameters for overcommit of memory:
File path: /etc/sysctl.conf
# sysctl -a | grep commit
vm.overcommit_memory = 2 (default is 0)
vm.overcommit_ratio = 50 (default is 50)
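The effect of these two parameters can be estimated with simple arithmetic. In strict mode (vm.overcommit_memory = 2), the Linux kernel limits total committed memory to roughly swap plus overcommit_ratio percent of physical RAM. The sketch below assumes no huge pages are configured; the function name commit_limit_kb and the sample sizes (8 GiB RAM, no swap) are illustrative only.

```shell
# Estimate the kernel commit limit (in kB) under vm.overcommit_memory = 2.
# Simplification (assumption): huge pages are ignored.
commit_limit_kb() {
    mem_kb=$1; swap_kb=$2; ratio=$3
    echo $(( swap_kb + mem_kb * ratio / 100 ))
}

# Illustrative host: 8 GiB RAM (8388608 kB), no swap.
commit_limit_kb 8388608 0 50     # prints 4194304 (only half of RAM can be committed)
commit_limit_kb 8388608 0 100    # prints 8388608 (all of RAM can be committed)
```

On a running host, the values the kernel actually uses can be compared with this estimate by running grep -E 'CommitLimit|Committed_AS' /proc/meminfo. Allocation requests start to fail as Committed_AS approaches CommitLimit.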
Resolution
This is not a ScaleIO issue. ScaleIO is working as designed.
To check or modify the vm.overcommit settings, follow these steps:
1. Log in to the SDS using SSH as root
2. Run cat /etc/sysctl.conf | grep "vm.overcommit"
Example:
[root@sds-node logs]# cat /etc/sysctl.conf | grep "vm.overcommit"
vm.overcommit_memory = 2
vm.overcommit_ratio = 50
3. Run the following commands:
sed -i 's/vm\.overcommit_memory = .*/vm\.overcommit_memory = 2/g' /etc/sysctl.conf
sed -i 's/vm\.overcommit_ratio = .*/vm\.overcommit_ratio = 100/g' /etc/sysctl.conf
sysctl -p
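The sed commands only rewrite lines that already exist in the file. If a key is missing from /etc/sysctl.conf, nothing changes. The following is a minimal sketch of a replace-or-append variant, assuming GNU sed as in the commands above; the helper name set_sysctl_conf is hypothetical and not part of any ScaleIO tooling.

```shell
# Hypothetical helper: set "key = value" in a sysctl.conf-style file,
# replacing an existing line or appending one if the key is absent.
set_sysctl_conf() {
    key=$1; value=$2; file=$3
    # Note: dots in the key are treated as regex wildcards here, which is
    # acceptable for the vm.overcommit_* keys used in this article.
    if grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
        sed -i "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Usage (run as root, then reload the settings with sysctl -p):
# set_sysctl_conf vm.overcommit_memory 2 /etc/sysctl.conf
# set_sysctl_conf vm.overcommit_ratio 100 /etc/sysctl.conf
```

Running the helper twice leaves a single line per key, so it is safe to reapply.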
Validation
[root@sds-node logs]# cat /etc/sysctl.conf | grep "vm.overcommit"
vm.overcommit_memory = 2
vm.overcommit_ratio = 100
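The check above confirms only what is written in the file. The running kernel may still hold different values if sysctl -p was not executed or failed. As a hedged sketch, the file can be parsed with a small helper (get_conf_value is a hypothetical name) and compared with the live values reported by sysctl -n.

```shell
# Hypothetical helper: print the last uncommented value of a key
# from a sysctl.conf-style file.
get_conf_value() {
    # $1 = key, $2 = file
    awk -F= -v key="$1" '
        /^[[:space:]]*#/ { next }                  # skip comment lines
        { k = $1; gsub(/[[:space:]]/, "", k) }     # normalise the key
        k == key { v = $2; gsub(/[[:space:]]/, "", v); val = v }
        END { if (val != "") print val }
    ' "$2"
}

# Usage: both commands in each pair should print the same value.
# get_conf_value vm.overcommit_memory /etc/sysctl.conf; sysctl -n vm.overcommit_memory
# get_conf_value vm.overcommit_ratio /etc/sysctl.conf;  sysctl -n vm.overcommit_ratio
```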
Repeat these steps on all impacted SDSs in the environment to ensure that they are set to the recommended best practice settings. You do not need to place the SDS into maintenance mode to perform this operation.