Dell APEX Cloud Platform for Red Hat OpenShift: PowerFlex CSI pods fail to start due to FailedMount error
Summary: When running Dell APEX Cloud Platform (ACP) for Red Hat OpenShift with PowerFlex storage, you may experience an issue where PowerFlex CSI pods fail to start due to a FailedMount error caused by a read-only file system. ...
Symptoms
In ACP for Red Hat OpenShift environments running with PowerFlex storage, when you use the "oc" command to review the pod status in the "vxflexos" namespace, you may find that the "vxflexos-node-xxxxx" pods are not in a Running state.
For example:
$ oc -n vxflexos get pod
NAME                                   READY   STATUS     RESTARTS   AGE
vxflexos-controller-6b6fd787fd-j9cjb   5/5     Running    5          170m
vxflexos-controller-6b6fd787fd-lv42q   5/5     Running    5          145m
vxflexos-node-2lbwz                    0/2     Init:0/2   0          87m
vxflexos-node-4px4j                    0/2     Init:0/2   0          86m
vxflexos-node-sw9vt                    0/2     Init:0/2   0          87m
When you describe the problematic pod, the events show a "FailedMount" warning caused by a "read-only file system" error.
For example:
$ oc -n vxflexos describe pod vxflexos-node-4px4j
...skip...
Events:
  Type     Reason       Age                  From               Message
  ----     ------       ----                 ----               -------
  Normal   Scheduled    87m                  default-scheduler  Successfully assigned vxflexos/vxflexos-node-4px4j to c2-esx03.racka05.local
  Warning  FailedMount  116s (x50 over 87m)  kubelet            MountVolume.SetUp failed for volume "scaleio-path-bin" : mkdir /bin/emc: read-only file system
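The check described above can be narrowed to only the affected pods. The following is a minimal sketch, not part of the original procedure: the function name list_unready_pods is hypothetical, and it only parses the column layout of the "oc get pod" output shown above (NAME, READY, STATUS).

```shell
# Hypothetical helper: reads `oc get pod` output on stdin and prints the names
# of pods that are not fully ready or whose STATUS is not "Running".
list_unready_pods() {
  awk 'NR > 1 {                      # skip the header row
    split($2, ready, "/")            # READY column, for example "0/2"
    if (ready[1] != ready[2] || $3 != "Running") print $1
  }'
}
```

For example, "oc -n vxflexos get pod | list_unready_pods" prints one pod name per line. To list the related warnings across the namespace without describing each pod, "oc -n vxflexos get events --field-selector reason=FailedMount" can also be used. Note that the helper reports pods in a Completed state as well, which is expected for finished jobs.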
Cause
An issue in CSM and the PowerFlex CSI driver causes the CSM Operator to mistakenly identify the platform as Kubernetes instead of OpenShift.
Resolution
The workaround is to first restart the CSM Operator pod, then restart the PowerFlex Node pod.
- Restart the CSM Operator pod.
$ oc get pod -n openshift-operators
$ oc delete pod dell-csm-operator-controller-manager-<pod-random-suffix> -n openshift-operators
Note: The <pod-random-suffix> placeholder in the second command should be replaced with the value obtained from the output of the first command.
- Restart the PowerFlex Node pod.
$ oc -n vxflexos rollout restart daemonset vxflexos-node
- Check that the PowerFlex CSI pods are all in Running status.
$ oc get pod -n vxflexos
NAME                                   READY   STATUS    RESTARTS   AGE
vxflexos-controller-6b6fd787fd-jb4zd   5/5     Running   5          3d8h
vxflexos-controller-6b6fd787fd-nqxsn   5/5     Running   0          3d7h
vxflexos-node-8s89l                    2/2     Running   2          3d10h
vxflexos-node-cs7fk                    2/2     Running   2          3d10h
vxflexos-node-fk2wn                    2/2     Running   2          3d10h
- Check that the pods in the dell-acp namespace are all in Running status.
$ oc get pod -n dell-acp
NAME                                                             READY   STATUS    RESTARTS        AGE
istio-ingressgateway-79fff9fd9d-8r9qq                            1/1     Running   0               4d21h
mcp-agent-ocp-dknx9                                              2/2     Running   2               5d23h
mcp-agent-ocp-fkggp                                              2/2     Running   2               5d23h
mcp-agent-ocp-rcqg9                                              2/2     Running   2               6d
mcp-agent-ocp-tt9kj                                              2/2     Running   2               6d
mcp-agent-ocp-z2mvx                                              2/2     Running   2               5d23h
mcp-agent-ocp-zj8d6                                              2/2     Running   2               6d
mcp-agentregistry-ocp-7c66d8597d-kln44                           2/2     Running   0               4d21h
mcp-bootstrap-controller-77c466b74-cvfbh                         2/2     Running   0               4d22h
mcp-cce-55857f9b9-hms6c                                          3/3     Running   0               4d21h
mcp-certificate-management-ocp-6ccf7dcd65-gx2xc                  2/2     Running   0               4d21h
mcp-cluster-operator-ocp-6dbff499f6-rpqnl                        7/7     Running   0               4d21h
mcp-compute-cluster-operator-ocp-9fff87844-b7b7c                 2/2     Running   0               4d21h
mcp-day1-bringup-ocp-54467b76b8-pk2n9                            2/2     Running   0               4d22h
mcp-depot-manager-69b774b845-kh4z9                               2/2     Running   0               4d21h
mcp-depot-manager-image-holder-fkx6g                             2/2     Running   2               5d23h
mcp-depot-manager-image-holder-hmjx6                             2/2     Running   2               5d23h
mcp-depot-manager-image-holder-lnvkk                             2/2     Running   2               5d23h
mcp-depot-manager-image-holder-pd74s                             2/2     Running   2               6d
mcp-depot-manager-image-holder-x65kg                             2/2     Running   2               6d
mcp-depot-manager-image-holder-xlcld                             2/2     Running   2               6d
mcp-discovery-adapter-9b674f897-psvqx                            2/2     Running   0               4d21h
mcp-eservice-f5d99dfbc-wkwl4                                     2/2     Running   0               4d21h
mcp-event-distributor-ocp-57cf5f956b-r59gk                       2/2     Running   0               4d21h
mcp-event-doctor-ocp-76c8657b95-59m22                            2/2     Running   0               4d22h
mcp-event-transformer-844d5fd6d7-bwtdt                           2/2     Running   0               4d21h
mcp-event-transformer-ocp-56668b48bb-vx9wr                       2/2     Running   0               4d21h
mcp-event-transformer-pe-8584865795-kkxd6                        2/2     Running   0               4d22h
mcp-fermion-5d7d4b5b5-qmrtt                                      3/3     Running   0               4d22h
mcp-infrastructure-view-ocp-dbddd6774-h8xjf                      2/2     Running   0               4d21h
mcp-job-manager-6bf4b96df4-pvjpl                                 2/2     Running   0               4d22h
mcp-kgs-56cb5c547f-6cn5f                                         2/2     Running   0               4d21h
mcp-kgs-ocp-78698d9b97-5fb7g                                     2/2     Running   0               4d22h
mcp-lcm-orchestrator-687565898d-9ktp6                            2/2     Running   0               4d22h
mcp-log-64t8v                                                    2/2     Running   3 (4d22h ago)   5d23h
mcp-log-84m6f                                                    2/2     Running   3 (4d22h ago)   6d
mcp-log-8lpsn                                                    2/2     Running   3 (4d22h ago)   5d23h
mcp-log-mmswb                                                    2/2     Running   3 (4d21h ago)   6d
mcp-log-n8blb                                                    2/2     Running   3 (4d22h ago)   5d23h
mcp-log-q4p5f                                                    2/2     Running   3 (4d21h ago)   6d
mcp-manager-ocp-mcp-console-plugin-server-ocp-69b8c6654b-lzv25   2/2     Running   0               4d21h
mcp-manager-ocp-mcp-deployment-wizard-7dd7c9694c-pjsj9           2/2     Running   0               4d22h
mcp-mgmtsvc-operator-standalone-686c8564d8-nrrjm                 2/2     Running   0               4d22h
mcp-mq-0                                                         2/2     Running   0               4d21h
mcp-operation-lock-59b4f7866c-jfwgw                              3/3     Running   0               4d22h
mcp-operator-installer-ocp-85776fdc57-8dbb4                      2/2     Running   0               4d21h
mcp-powerflex-operation-5b54dfc8d4-dtm27                         2/2     Running   0               4d22h
mcp-rcs-564c77bbc7-84vlh                                         3/3     Running   0               4d22h
mcp-security-ocp-74bcf55b5-mbcbj                                 2/2     Running   0               4d22h
mcp-storage-ocp-7d87476b9c-k4wkb                                 2/2     Running   0               4d21h
mcp-support-76867cc955-tdbg6                                     2/2     Running   0               4d21h
mcp-telemetry-56dbc6c654-5rqpn                                   4/4     Running   0               4d22h
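The workaround steps above can be sketched as a pair of shell functions so that the pod suffix does not have to be copied by hand. This is an illustrative sketch under stated assumptions, not a Dell-provided script: the function names (csm_operator_pods, apply_workaround, verify_running) and the 10-minute timeout are hypothetical, and the namespaces, DaemonSet name, and pod-name prefix are taken from the commands in this article.

```shell
# Hypothetical helper: keeps only the CSM Operator pods from `oc get pod -o name` output.
csm_operator_pods() {
  grep '^pod/dell-csm-operator-controller-manager-'
}

# Steps 1 and 2: restart the CSM Operator pod, then the PowerFlex node pods.
apply_workaround() {
  pods=$(oc get pod -n openshift-operators -o name | csm_operator_pods) || {
    echo "CSM Operator pod not found in openshift-operators" >&2
    return 1
  }
  # $pods is intentionally unquoted so that each pod name becomes its own argument.
  oc delete -n openshift-operators $pods || return 1
  oc -n vxflexos rollout restart daemonset vxflexos-node || return 1
  # Assumed timeout; adjust to the size of the cluster.
  oc -n vxflexos rollout status daemonset vxflexos-node --timeout=10m
}

# Steps 3 and 4: report every pod whose STATUS is not "Running".
# Returns non-zero if any such pod is found in either namespace.
verify_running() {
  rc=0
  for ns in vxflexos dell-acp; do
    oc get pod -n "$ns" --no-headers |
      awk -v ns="$ns" '$3 != "Running" { print ns "/" $1 ": " $3; bad = 1 }
                       END { exit bad + 0 }' || rc=1
  done
  return "$rc"
}
```

A possible invocation is "apply_workaround && verify_running". The sketch does not wait for the replacement CSM Operator pod to become ready before restarting the DaemonSet; if the node pods still fail, confirm with "oc get pod -n openshift-operators" that the operator pod is Running and repeat the rollout restart.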