PowerProtect: Kubernetes backup failed with error 'controller pod is not running'
Summary: PPDM Kubernetes backups fail with the error 'controller pod is not running'.
Symptoms
In the instance where this was observed, all PPDM Kubernetes backups started to fail after PPDM was recovered from its server disaster recovery backup. The issue may also occur in other situations.
The Kubernetes backup fails with the error 'controller pod is not running'.
The following errors appear in the logs:
2021-07-21T03:49:48.340Z ERROR [] [task-5011a057-340f-40fb-8cd8-12414685d058] [][][][TRACE_ID:a66ce529604914ad;JOB_ID:a9b8915af1637407][] [K8sHelperApi.isDone(90)] - Failed to wait on job com.emc.dpsg.ecdm.baseresourceservice.exception.ValidationServiceException: controller pod is not running
2021-07-21T03:50:14.065Z WARN [] [dsSource-plpd-testcluster] [][][][][] [c.e.b.c.s.p.K8sHealthMonitor.checkPodHealth(200)] - Controller Pod is down, cluster: <k8s_cluster_name>, age=PT153H49M43.065S
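The age value in the WARN line looks like an ISO-8601 style duration (PT153H49M43.065S). As a minimal sketch, assuming the field always uses the hours/minutes/seconds form seen above, the following Python snippet converts it into a readable outage length. The function name controller_down_for is hypothetical and is not part of PPDM.

```python
import re
from datetime import timedelta

# Assumption: the age field uses the "PT#H#M#S" form seen in the log above.
# Day or other components are not handled in this sketch.
_AGE = re.compile(r"age=PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?")


def controller_down_for(log_line):
    """Return the controller pod downtime as a timedelta, or None if absent."""
    match = _AGE.search(log_line)
    if match is None or not any(match.groups()):
        return None
    hours, minutes, seconds = (float(g) if g else 0.0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


line = "Controller Pod is down, cluster: <k8s_cluster_name>, age=PT153H49M43.065S"
age = controller_down_for(line)
print(age.days, "days", age.seconds // 3600, "hours")  # prints: 6 days 9 hours
```

A downtime of roughly six days is in line with the 6d12h age of the powerprotect-controller pod in the pod listing.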
Pod status for the powerprotect and velero-ppdm namespaces on that Kubernetes cluster, as reported by kubectl (for example, kubectl get pods --all-namespaces):
powerprotect powerprotect-controller-666ffccbbf-p5rwh 0/1 ImagePullBackOff 0 6d12h
velero-ppdm backup-driver-587cfcdf59-2mc8p 1/1 Running 0 49d
velero-ppdm velero-5df5fcd896-p68rw 1/1 Running 0 49d
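To spot the failing pod quickly in a longer listing, the output can be filtered for pods that are not fully ready. The sketch below assumes the column order shown above (namespace, name, ready, status, restarts, age). The helper unready_pods is a hypothetical name used only for illustration.

```python
def unready_pods(listing):
    """Return (namespace, name, status) for pods that are not fully ready."""
    problems = []
    for raw in listing.splitlines():
        fields = raw.split()
        # Skip blank lines, short lines, and an optional header row.
        if len(fields) < 6 or fields[0] == "NAMESPACE":
            continue
        namespace, name, ready, status = fields[:4]
        ready_count, _, total = ready.partition("/")
        if status != "Running" or ready_count != total:
            problems.append((namespace, name, status))
    return problems


listing = """\
powerprotect powerprotect-controller-666ffccbbf-p5rwh 0/1 ImagePullBackOff 0 6d12h
velero-ppdm backup-driver-587cfcdf59-2mc8p 1/1 Running 0 49d
velero-ppdm velero-5df5fcd896-p68rw 1/1 Running 0 49d
"""

for ns, name, status in unready_pods(listing):
    print(f"{ns}/{name}: {status}")
```

Note that this simple filter also reports pods in the Completed state, which may be normal for finished jobs.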
Cause
The powerprotect-controller pod is unable to pull the required image from the internet.
Resolution
1. Check whether the Kubernetes cluster can access Docker Hub at https://hub.docker.com/ and Quay at https://quay.io/ to pull the required images.
2. If the Kubernetes cluster cannot access these sites because of firewall or other restrictions, pull the images to a local registry that the cluster can access, and then follow this procedure:
1). Create the file /usr/local/brs/lib/cndm/config/application.properties on the PowerProtect Data Manager appliance with the following contents:
k8s.docker.registry=fqdn:port
For example, k8s.docker.registry=artifacts.example.com:8446
k8s.image.pullsecrets=secret resource name
Specify this entry only if you require an image pull secret.
2). Run cndm restart to apply the properties.
Note: See the PPDM Administration and User Guide for more details.
3. Because the Kubernetes cluster has already been added as an asset source in the PPDM GUI, run a manual discovery of the Kubernetes cluster after verifying step 1 or completing step 2.
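The reachability check in step 1 can be sketched in Python as a plain TCP connection test against the two sites named above. This is only an approximation: a successful connection shows that the port is open from the host where the script runs, not that an image pull will succeed (a proxy or TLS inspection can still interfere), and it should be run from a cluster node rather than from a workstation. The names can_connect and report are hypothetical.

```python
import socket
from urllib.parse import urlsplit

# The two sites named in step 1.
REGISTRY_URLS = ["https://hub.docker.com/", "https://quay.io/"]


def can_connect(url, timeout=5.0, connect=socket.create_connection):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        conn = connect((parts.hostname, port), timeout)
    except OSError:  # covers DNS failures, refusals, and timeouts
        return False
    conn.close()
    return True


def report(urls=REGISTRY_URLS):
    """Print one line per site; call this from a Kubernetes cluster node."""
    for url in urls:
        state = "reachable" if can_connect(url) else "NOT reachable"
        print(f"{url} {state}")
```

Calling report() on a node that is blocked by a firewall should show NOT reachable for the affected site, which points to the local-registry procedure in step 2.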
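The contents of the application.properties file from step 2 can be generated with a small helper, which makes it harder to forget the port or to add an empty pull-secret entry. This is a sketch only: the function name render_cndm_properties is hypothetical, the registry value is the example from the article, and the file still has to be placed at /usr/local/brs/lib/cndm/config/application.properties on the appliance and followed by cndm restart.

```python
def render_cndm_properties(registry, pull_secret=None):
    """Build the two property lines described in step 2.

    registry    -- local registry as fqdn:port
    pull_secret -- name of the image pull secret resource, only if required
    """
    if ":" not in registry:
        raise ValueError("registry must be given as fqdn:port")
    lines = [f"k8s.docker.registry={registry}"]
    if pull_secret:
        # Added only when an image pull secret is actually required.
        lines.append(f"k8s.image.pullsecrets={pull_secret}")
    return "\n".join(lines) + "\n"


print(render_cndm_properties("artifacts.example.com:8446"), end="")
# prints: k8s.docker.registry=artifacts.example.com:8446
```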
Affected Products
PowerProtect Data Manager
Article Properties
Article Number: 000190024
Article Type: Solution
Last Modified: 27 Aug 2022
Version: 6