Jono_A
2 Bronze

PowerFlex controller crashloopbackoff

Hi Team,

I'm currently running the following:

CentOS 7.9, CSI: 1.4, K8s: 1.20, SDC: 3.5.1.1

I followed the instructions in the latest release of the CSI plugin but I get the following "CrashLoopBackOff" errors:

vxflexos   snapshot-controller-0                  1/1   Running            0    3h39m
vxflexos   vxflexos-controller-654d7445fc-4zbkl   4/5   CrashLoopBackOff   60   3h39m
vxflexos   vxflexos-controller-654d7445fc-tnd8w   4/5   CrashLoopBackOff   57   3h39m
vxflexos   vxflexos-node-89flx                    2/2   Running            0    3h39m
vxflexos   vxflexos-node-8dpf7                    2/2   Running            0    3h39m
vxflexos   vxflexos-node-hkkws                    2/2   Running            0    3h39m

I also noticed this:

default    csi-snapshotter-0   3/3   Running   3h44m   <--- shouldn't this be in the same namespace as the vxflexos controller?

and when I run kubectl logs csi-snapshotter-0 csi-snapshotter, I get these errors:

I0407 01:41:16.751963 1 reflector.go:255] Listing and watching *v1.VolumeSnapshotContent from github.com/kubernetes-csi/external-snapshotter/client/v4/informers/externalversions/factory.go:117
E0407 01:41:16.763315 1 reflector.go:138] github.com/kubernetes-csi/external-snapshotter/client/v4/informers/externalversions/factory.go:117: Failed to watch *v1.VolumeSnapshotContent: failed to list *v1.VolumeSnapshotContent: volumesnapshotcontents.snapshot.storage.k8s.io is forbidden: User "system:serviceaccount:default:csi-snapshotter" cannot list resource "volumesnapshotcontents" in API group "snapshot.storage.k8s.io" at the cluster scope
I0407 01:41:30.286330 1 reflector.go:255] Listing and watching *v1.VolumeSnapshotClass from github.com/kubernetes-csi/external-snapshotter/client/v4/informers/externalversions/factory.go:117
E0407 01:41:30.293416 1 reflector.go:138] github.com/kubernetes-csi/external-snapshotter/client/v4/informers/externalversions/factory.go:117: Failed to watch *v1.VolumeSnapshotClass: failed to list *v1.VolumeSnapshotClass: volumesnapshotclasses.snapshot.storage.k8s.io is forbidden: User "system:serviceaccount:default:csi-snapshotter" cannot list resource "volumesnapshotclasses" in API group "snapshot.storage.k8s.io" at the cluster scope
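For anyone decoding the same "forbidden" lines: a minimal sketch, not part of the CSI driver or the snapshotter, that pulls the service account, verb, resource, and API group out of such a message and builds the matching kubectl auth can-i check. The helper names parse_denial and can_i_command are hypothetical, and the script only prints the command rather than running it.

```python
import re

# Matches the RBAC denial text as it appears in the log lines above.
FORBIDDEN = re.compile(
    r'User "system:serviceaccount:(?P<namespace>[^:"]+):(?P<account>[^"]+)" '
    r'cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)"'
)


def parse_denial(line):
    """Return namespace, account, verb, resource and group, or None if no match."""
    match = FORBIDDEN.search(line)
    return match.groupdict() if match else None


def can_i_command(denial):
    """Build a `kubectl auth can-i` invocation that re-tests the denied access."""
    resource = denial["resource"]
    if denial["group"]:
        # Non-core resources are addressed as <resource>.<group>.
        resource = f'{resource}.{denial["group"]}'
    subject = f'system:serviceaccount:{denial["namespace"]}:{denial["account"]}'
    return ["kubectl", "auth", "can-i", denial["verb"], resource, f"--as={subject}"]


line = (
    'volumesnapshotcontents.snapshot.storage.k8s.io is forbidden: '
    'User "system:serviceaccount:default:csi-snapshotter" cannot list resource '
    '"volumesnapshotcontents" in API group "snapshot.storage.k8s.io" at the cluster scope'
)
denial = parse_denial(line)
print(" ".join(can_i_command(denial)))
```

If the printed command answers "no" on the cluster, the service account named in the log is probably missing a ClusterRole binding for that resource.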

I've downloaded the files from here: https://github.com/kubernetes-csi/external-snapshotter/tree/v4.0.0/client/config/crd

and just ran kubectl create -f on each one, e.g. kubectl create -f snapshot.storage.k8s.io_volumesnapshotclasses.yaml.

Hope someone can help resolve these issues. Thanks!

Jono

4 Replies
Flo_csI
3 Argentum

Re: PowerFlex controller crashloopbackoff

Hi @Jono_A,

Can you confirm you have created the RBAC rules with kubectl create -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v4.0.0/deploy/kubernetes/snaps... ?

The snapshotter is a common component for every CSI snapshot-controller. You only need one for your cluster, so I wouldn't put it in the vxflexos namespace.

If you want to move it to another namespace, such as kube-system, you will have to edit the two files under: https://github.com/kubernetes-csi/external-snapshotter/tree/v4.0.0/deploy/kubernetes/snapshot-contro...

FYI, in OpenShift that common snapshotter is installed in the openshift-cluster-storage-operator namespace.

 

Let us know if that solves the issue.

Jono_A
2 Bronze

Re: PowerFlex controller crashloopbackoff

Hi Flo_csI,

Yep, can confirm that the RBAC rules have been created:

[jono@K8s-Master-171 ~]$ kubectl create -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v4.0.0/deploy/kubernetes/snaps...
Error from server (AlreadyExists): error when creating "https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v4.0.0/deploy/kubernetes/snaps...": serviceaccounts "snapshot-controller" already exists
Error from server (AlreadyExists): error when creating "https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v4.0.0/deploy/kubernetes/snaps...": clusterroles.rbac.authorization.k8s.io "snapshot-controller-runner" already exists
Error from server (AlreadyExists): error when creating "https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v4.0.0/deploy/kubernetes/snaps...": clusterrolebindings.rbac.authorization.k8s.io "snapshot-controller-role" already exists
Error from server (AlreadyExists): error when creating "https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v4.0.0/deploy/kubernetes/snaps...": roles.rbac.authorization.k8s.io "snapshot-controller-leaderelection" already exists
Error from server (AlreadyExists): error when creating "https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v4.0.0/deploy/kubernetes/snaps...": rolebindings.rbac.authorization.k8s.io "snapshot-controller-leaderelection" already exists

However, I ran kubectl logs vxflexos-controller-66bcc75977-455bd -n vxflexos driver and found this error:

time="2021-04-07T09:07:07Z" level=fatal msg="grpc failed" error="rpc error: code = FailedPrecondition desc = All arrays are not working. Could not proceed further: map[PowerFlex-SantaClara:rpc error: code = FailedPrecondition desc = unable to login to VxFlexOS Gateway: Unauthorized]"

But I can confirm that the username and password are correct in my config.json file.

Anything else I should look for?

Jono_A
2 Bronze

Re: PowerFlex controller crashloopbackoff

Btw, I started from scratch again, which is why I'm getting a different error now.

Jono_A
2 Bronze

Re: PowerFlex controller crashloopbackoff

OK, solved the issue: in config.json I had to change the endpoint to use 'https' instead of 'http'.
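A quick way to catch the same mistake before deploying, sketched under assumptions: this treats config.json as a JSON list with one object per array, each carrying an "endpoint" URL. That layout, the sample values, and the helper name non_https_endpoints are illustrative only, so compare against the sample config shipped with your driver version.

```python
import json
from urllib.parse import urlparse

# Assumed layout: a JSON list with one object per array, each with an
# "endpoint" URL. The values below are placeholders, not real credentials.
SAMPLE_CONFIG = """
[
  {"username": "admin", "password": "secret", "endpoint": "http://gateway.example.com"},
  {"username": "admin", "password": "secret", "endpoint": "https://gateway2.example.com"}
]
"""


def non_https_endpoints(config_text):
    """Return every endpoint whose URL scheme is not https."""
    arrays = json.loads(config_text)
    return [
        entry.get("endpoint", "")
        for entry in arrays
        if urlparse(entry.get("endpoint", "")).scheme != "https"
    ]


print(non_https_endpoints(SAMPLE_CONFIG))
```

An empty list means every endpoint already uses https. Anything printed is worth checking if the driver log reports "unable to login to VxFlexOS Gateway: Unauthorized" despite correct credentials.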