PowerFlex 4.x: How to Rotate RKE2 Certificates on MVM Nodes
Summary: RKE2 certificates are normally renewed automatically, but in rare cases this renewal does not happen. This procedure covers how to rotate RKE2 certificates on MVM nodes.
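Before rotating, you can confirm that rotation is actually needed by checking the current expiry dates. The helper below is an illustrative sketch (the function name is ours, not part of the product); it assumes the default RKE2 data directory and must be run as root on an MVM:

```shell
# Illustrative helper (not part of the official procedure): print the
# expiry date of every certificate under the RKE2 server TLS directory.
# Assumes the default data dir /var/lib/rancher/rke2.
list_cert_expiry() {
  local dir="${1:-/var/lib/rancher/rke2/server/tls}"
  find "$dir" -name '*.crt' | while read -r cert; do
    printf '%s  ' "$cert"
    openssl x509 -in "$cert" -noout -enddate
  done
}
```

Certificates whose notAfter date is close or already past are the ones this procedure addresses; RKE2 normally renews near-expiry certificates on its own when the service restarts.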
Instructions
Procedure:
1. SSH into any MVM node
2. Verify that all nodes are in the Ready status:
kubectl get node
3. List all Postgres database instances and identify the pod name with the Leader role:
kubectl exec -n powerflex -c database $(kubectl get pods -n powerflex -l='postgres-operator.crunchydata.com/role=master, postgres-operator.crunchydata.com/instance-set' | grep Running | cut -d' ' -f1) -- sh -c 'patronictl list'
Example output:
+ Cluster: postgres-ha-ha +------------------------------------------+--------------+-----------+----+-----------+
| Member                  | Host                                     | Role         | State     | TL | Lag in MB |
+-------------------------+------------------------------------------+--------------+-----------+----+-----------+
| postgres-ha-cmo1-58bs-0 | postgres-ha-cmo1-58bs-0.postgres-ha-pods | Sync Standby | streaming |  7 |         0 |
| postgres-ha-cmo1-lvkz-0 | postgres-ha-cmo1-lvkz-0.postgres-ha-pods | Sync Standby | streaming |  7 |         0 |
| postgres-ha-cmo1-v7zg-0 | postgres-ha-cmo1-v7zg-0.postgres-ha-pods | Leader       | running   |  7 |           | ← This is the Leader pod
+-------------------------+------------------------------------------+--------------+-----------+----+-----------+
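To pull the Leader's member name out of the table programmatically rather than by eye, a small awk filter over the patronictl output works. The snippet below is a sketch that runs against a captured copy of the table; in practice you would pipe the live patronictl output into the same filter:

```shell
# Extract the member name from the row whose Role column says "Leader".
# "sample" is a captured copy of the patronictl table shown above.
sample='| postgres-ha-cmo1-58bs-0 | postgres-ha-cmo1-58bs-0.postgres-ha-pods | Sync Standby | streaming | 7 | 0 |
| postgres-ha-cmo1-lvkz-0 | postgres-ha-cmo1-lvkz-0.postgres-ha-pods | Sync Standby | streaming | 7 | 0 |
| postgres-ha-cmo1-v7zg-0 | postgres-ha-cmo1-v7zg-0.postgres-ha-pods | Leader | running | 7 | |'
leader=$(printf '%s\n' "$sample" | awk -F'|' '$4 ~ /Leader/ {gsub(/ /, "", $2); print $2}')
echo "$leader"
```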
4. Identify which MVM is running the PostgreSQL Leader pod:
for x in `kubectl get pods -n powerflex | grep "postgres-ha-cmo" | awk '{print $1}'`; do echo $x; kubectl get pods -n powerflex $x -o json | grep '"nodeName"' | cut -d ':' -f2; echo " "; done
Example output:
postgres-ha-cmo1-58bs-0
"pfmp-mvm3-c",
postgres-ha-cmo1-lvkz-0
"pfmp-mvm2-c",
postgres-ha-cmo1-v7zg-0
"pfmp-mvm1-c", ← This is the MVM running the Leader pod
5. SSH into the target MVM. Choose the order of nodes as follows:
・ Nodes currently in the NotReady state should be updated first.
・ The MVM running the PostgreSQL Leader pod must be updated last.
・ The other nodes can be updated in any order.
6. Switch to the root user:
sudo -s
7. Label the target MVM for maintenance:
kubectl label node <Target-MVM> cmo.maintenance.mode=true
Success example output:
node/pfmp-mvm2-c labeled
Error example output:
error: 'cmo.maintenance.mode' already has a value (false), and --overwrite is false
If this error appears, remove the existing label with the command below, then re-run the label command:
kubectl label node <Target-MVM> cmo.maintenance.mode-
You can check whether the maintenance label is set to true using the following command:
kubectl describe nodes <Target-MVM> | grep cmo.maintenance.mode
Note: Replace <Target-MVM> with the name of the target node.
Example output:
cmo.maintenance.mode=true
8. Drain the pods from the target MVM:
kubectl drain <Target-MVM> --ignore-daemonsets --delete-emptydir-data
Example output:
node/pfmp-mvm2-c cordoned
Warning: ignoring DaemonSet-managed Pods: calico-system/calico-node-6b28t, kube-system/rke2-ingress-nginx-controller-nzdw4, kube-system/rke2-multus-ds-7qj7z, powerflex/logging-fluent-bit-hg4jx, powerflex/metallb-speaker-dgsm7, powerflex/powerflex-status-service-r2phw, powerflex/powerflex-status-ui-vb4f5
evicting pod powerflex/vault-2
evicting pod powerflex/block-legacy-gateway-0
...
node/<Node Name> drained ← This message confirms successful draining
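To double-check the drain, you can list the pods still scheduled on the node; after a successful drain only DaemonSet-managed pods should remain. A sketch, using kubectl's field selector on spec.nodeName (the function name is ours):

```shell
# List every pod still running on the given node, across all namespaces.
pods_on_node() {
  kubectl get pods -A --field-selector "spec.nodeName=$1" -o wide
}
```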
9. Confirm that the node status is Ready, SchedulingDisabled:
kubectl get node
Example output:
NAME          STATUS                     ROLES                       AGE   VERSION
pfmp-mvm1-c   Ready                      control-plane,etcd,master   93d   v1.25.3+rke2r1
pfmp-mvm2-c   Ready,SchedulingDisabled   control-plane,etcd,master   93d   v1.25.3+rke2r1
pfmp-mvm3-c   Ready                      control-plane,etcd,master   93d   v1.25.3+rke2r1
10. Stop the RKE2 service on the target MVM:
・Confirm that you are logged in to the target MVM by checking the hostname:
hostname
・Stop the RKE2 service:
systemctl stop rke2-server
11. Rotate the certificates:
rke2 certificate rotate
Example output:
INFO[0000] Server detected, rotating server certificates
INFO[0000] Rotating certificates for admin service
INFO[0000] Rotating certificates for etcd service
INFO[0000] Rotating certificates for api-server service
INFO[0000] Rotating certificates for controller-manager service
INFO[0000] Rotating certificates for cloud-controller service
INFO[0000] Rotating certificates for scheduler service
INFO[0000] Rotating certificates for rke2-server service
INFO[0000] Rotating dynamic listener certificate
INFO[0000] Rotating certificates for rke2-controller service
INFO[0000] Rotating certificates for auth-proxy service
INFO[0000] Rotating certificates for kubelet service
INFO[0000] Rotating certificates for kube-proxy service
INFO[0000] Successfully backed up certificates for all services to path /var/lib/rancher/rke2/server/tls-1748242637, please restart rke2 server or agent to rotate certificates
12. Restart the RKE2 service:
systemctl start rke2-server
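After restarting the service it can take a few minutes for the node to rejoin the cluster. Rather than checking by hand, a small polling loop can wait for the Ready status; this is an illustrative sketch (the function name and the ten-minute timeout are ours):

```shell
# Poll 'kubectl get node' until the given node reports Ready,
# giving up after about 10 minutes (60 tries x 10 s). A status of
# "Ready,SchedulingDisabled" (still cordoned) also counts as Ready.
wait_node_ready() {
  local node="$1" tries=0
  until kubectl get node "$node" --no-headers 2>/dev/null \
      | awk '{print $2}' | grep -q '^Ready'; do
    tries=$((tries + 1))
    [ "$tries" -ge 60 ] && { echo "timed out waiting for $node" >&2; return 1; }
    sleep 10
  done
  echo "$node is Ready"
}
```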
13. Once the prompt returns, remove the maintenance label and uncordon the node:
kubectl uncordon <Target-MVM> ; kubectl label node <Target-MVM> cmo.maintenance.mode-
14. After waiting 10–30 minutes, verify the following:
14-1. Certificate expiration dates are updated (about 1 year ahead):
openssl x509 -in /var/lib/rancher/rke2/server/tls/etcd/client.crt -noout -dates
openssl x509 -in /var/lib/rancher/rke2/server/tls/etcd/server-client.crt -noout -dates
openssl x509 -in /var/lib/rancher/rke2/server/tls/etcd/peer-server-client.crt -noout -dates
14-2. All nodes are in Ready status:
kubectl get node
14-3. No pods are in abnormal status:
kubectl get pod -A -o wide | egrep -v "Running|Completed"
15. If all checks pass, proceed with the next MVM, starting again from Step 5.
Complete the process on one MVM before moving to the next.
The MVM running the PostgreSQL Leader pod must be updated last.
When you have completed the procedure on all three MVMs, run the command in Step 3 to verify PostgreSQL database health: one pod should hold the Leader role in the running state, and both Sync Standby members should be in the streaming state with 0 MB of lag.
Healthy Database Example:
+ Cluster: postgres-ha-ha +------------------------------------------+--------------+-----------+----+-----------+
| Member                  | Host                                     | Role         | State     | TL | Lag in MB |
+-------------------------+------------------------------------------+--------------+-----------+----+-----------+
| postgres-ha-cmo1-58bs-0 | postgres-ha-cmo1-58bs-0.postgres-ha-pods | Sync Standby | streaming |  7 |         0 |
| postgres-ha-cmo1-lvkz-0 | postgres-ha-cmo1-lvkz-0.postgres-ha-pods | Sync Standby | streaming |  7 |         0 |
| postgres-ha-cmo1-v7zg-0 | postgres-ha-cmo1-v7zg-0.postgres-ha-pods | Leader       | running   |  7 |           |
+-------------------------+------------------------------------------+--------------+-----------+----+-----------+
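The health criteria above can also be checked mechanically. The function below is an illustrative sketch (the function name is ours) that takes the patronictl table as text and reports healthy only when there is exactly one Leader and two streaming Sync Standby members; the 0 MB lag check is left to visual inspection:

```shell
# Check a 'patronictl list' table: healthy means exactly one Leader row
# and exactly two "Sync Standby ... streaming" rows.
check_patroni_health() {
  local table="$1" leaders standbys
  leaders=$(printf '%s\n' "$table" | grep -c 'Leader')
  standbys=$(printf '%s\n' "$table" | grep -c 'Sync Standby.*streaming')
  if [ "$leaders" -eq 1 ] && [ "$standbys" -eq 2 ]; then
    echo healthy
  else
    echo unhealthy
  fi
}
```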