CSM PowerFlex node pods are stuck at init:0/2
Summary: Container Storage Modules (CSM) PowerFlex node pods are stuck at Init:0/2 after the hosts were rebooted following changes applied to them.
Symptoms
All PowerFlex node pods are stuck at Init:0/2, and the pod description reports the following event:
Warning FailedMount 8s (x6 over 23s) kubelet MountVolume.SetUp failed for volume "scaleio-path-bin" : mkdir /bin/emc: read-only file system
The issue has been observed with CSM Operator 1.9 and 1.8.1 and Container Storage Interface (CSI) Driver for PowerFlex 2.13.1 and 2.14.
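To confirm the symptom, the event can be viewed in the description of any affected node pod; the pod name and driver namespace below are placeholders and depend on the actual installation:
oc get pods -n <driver-namespace>
oc describe pod <powerflex-node-pod> -n <driver-namespace>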
Cause
The root cause is that the CSM operator was unable to determine whether it was running in an OpenShift environment during initialization; it caches this determination for later use. It is likely that, at the time the operator started, the following command did not return the expected result:
oc get --raw /apis | jq | grep "security.openshift.io"
This suggests that the security.openshift.io Application Programming Interface (API) group may not have been available at that moment.
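As a quick cross-check (not part of the operator's own logic), the availability of the API group can also be confirmed without jq; if the group is registered, this lists its resources, such as securitycontextconstraints:
oc api-resources --api-group=security.openshift.io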
In the operator log, the message isOpenShift err false means that the operator was unable to detect the OpenShift environment:
2025-06-17T08:45:38.167Z INFO workspace/main.go:99 isOpenShift err false {"TraceId": "main"}
2025-06-17T08:45:38.168Z INFO workspace/main.go:105 Kubernetes environment {"TraceId": "main"}
The expected log message when the operator correctly detects the OpenShift environment is:
2025-06-19T00:03:14.913Z INFO workspace/main.go:138 Openshift environment {"TraceId": "main"}
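To see which of the two messages the running operator instance logged, its log can be pulled; the deployment name and namespace below are assumptions based on a default CSM Operator installation on OpenShift and may differ in your environment:
oc logs deployment/dell-csm-operator-controller-manager -n openshift-operators | grep -iE "openshift|kubernetes environment"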
Resolution
1. Manually run the following commands:
oc get --raw /apis | jq | grep "security.openshift.io"
oc auth can-i get /apis/security.openshift.io --as=system:serviceaccount:dell-csm-operator:dell-csm-operator-manager-service-account
2. The expected result is:
# oc get --raw /apis | jq | grep "security.openshift.io"
"name": "security.openshift.io",
"groupVersion": "security.openshift.io/v1",
"groupVersion": "security.openshift.io/v1",
# oc auth can-i get /apis/security.openshift.io --as=system:serviceaccount:dell-csm-operator:dell-csm-operator-manager-service-account
yes
3. If you see the above output, restart the CSM operator (a verification sketch follows these steps):
oc delete pod dell-csm-operator-controller-manager-xxx-xxx -n openshift-operators
4. If the output does not match the above, ask the customer to engage Red Hat support for further investigation.
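After the operator pod has been recreated, it may help to verify that the new instance now detects OpenShift and that the PowerFlex node pods progress past Init:0/2; the deployment name and driver namespace below are assumptions and may differ in your environment:
oc logs deployment/dell-csm-operator-controller-manager -n openshift-operators | grep -i "Openshift environment"
oc get pods -n <driver-namespace> -w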