PowerFlex 4.X: How to Perform a Graceful Reboot on PFMP MVM Management VMs
Summary: Details the graceful reboot of PowerFlex Management Platform (PFMP) VMs for version 4.X, including labeling, draining, and rebooting MVMs while keeping two nodes active and checking PostgreSQL health. In this procedure, MVM1 hosts the Postgres Leader; it is drained and rebooted last.
Instructions
Rebooting the MVM VMs can resolve some Pod problems, deployment failures, and other errors.
The commands in this procedure are run from a root bash shell. To mirror the steps below, log in to the MVMs as delladmin, then run sudo -s to switch to a root shell.
Example:
delladmin@pfmp-mvm03:~> whoami
delladmin
delladmin@pfmp-mvm03:~> sudo -s
pfmp-mvm03:/home/delladmin # whoami
root
- List all Postgres database instances and identify the Pod name with the Leader role. The Leader node should be the last node drained and rebooted:
- PFMP 4.6
kubectl exec -n powerflex -c database $(kubectl get pods -n powerflex -l='postgres-operator.crunchydata.com/role=master,postgres-operator.crunchydata.com/instance-set' | grep Running | cut -d' ' -f1) -- sh -c 'patronictl list'
Run the following command to identify which MVM is running the Postgres Leader Pod. This is the last node to be drained and rebooted:
for x in $(kubectl get pods -n powerflex | grep "postgres-ha-cmo" | awk '{print $1}'); do echo $x; kubectl get pods -n powerflex $x -o json | grep '"nodeName"' | cut -d ':' -f2; echo " "; done
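The Leader lookup can also be scripted. The helper below (find_leader, a hypothetical name, not a product command) is a sketch that reads the `patronictl list` table on stdin and prints the member with the Leader role:

```shell
# Hypothetical helper (a sketch, not part of the product): pipe the
# `patronictl list` table into it to print the Leader member name.
find_leader() {
  # In the table, column 2 is Member and column 4 is Role.
  awk -F'|' '$4 ~ /Leader/ { gsub(/ /, "", $2); print $2 }'
}
```

For example: `kubectl exec ... -- sh -c 'patronictl list' | find_leader`.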
- PFMP 4.8
kubectl exec -it -n powerflex $(kubectl get pods -n powerflex | grep postgres-monitor | awk '{print $1}') -- kubectl cnpg status postgres-ha-cnpg
Example output
delladmin@node2:~> kubectl exec -it -n powerflex $(kubectl get pods -n powerflex | grep postgres-monitor | awk '{print $1}') -- kubectl cnpg status postgres-ha-cnpg
Cluster Summary
Name: powerflex/postgres-ha-cnpg
System ID: 7570829541331841052
PostgreSQL Image: dockerrepo:30500/cnpg/cnpg-postgres:14.18-22-53.6b63004-22-9.0
Primary instance: postgres-ha-cnpg-1
Primary start time: 2025-11-09 22:33:09 +0000 UTC (uptime 3803h6m21s)
Status: Cluster in healthy state
Instances: 3
Ready instances: 3
Size: 11G
Current Write LSN: B/6211B568 (Timeline: 3 - WAL File: 000000030000000B00000062)
Continuous Backup status
Not configured
Streaming Replication status
Replication Slots Enabled
Name Sent LSN Write LSN Flush LSN Replay LSN Write Lag Flush Lag Replay Lag State Sync State Sync Priority Replication Slot
---- -------- --------- --------- ---------- --------- --------- ---------- ----- ---------- ------------- ----------------
postgres-ha-cnpg-2 B/6211B568 B/6211B568 B/6211B568 B/6211B568 00:00:00.000365 00:00:00.001507 00:00:00.001618 streaming async 0 active
postgres-ha-cnpg-3 B/6211B568 B/6211B568 B/6211B568 B/6211B568 00:00:00.000321 00:00:00.001511 00:00:00.001575 streaming async 0 active
Instances status
Name Current LSN Replication role Status QoS Manager Version Node
---- ----------- ---------------- ------ --- --------------- ----
postgres-ha-cnpg-1 B/6211B568 Primary OK Burstable 1.26.1 pfmp-mvm01
postgres-ha-cnpg-2 B/6211B568 Standby (async) OK Burstable 1.26.1 pfmp-mvm02
postgres-ha-cnpg-3 B/6211B568 Standby (async) OK Burstable 1.26.1 pfmp-mvm03
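From the Instances status table above, the MVM hosting the Primary can be picked out with a short filter. primary_node is a hypothetical helper name; pipe the `kubectl cnpg status` output into it:

```shell
# Hypothetical helper (a sketch): reads `kubectl cnpg status` output on
# stdin and prints the node hosting the Primary instance.
primary_node() {
  # In the Instances status table, the third column is the replication
  # role and the last column is the node name.
  awk '$3 == "Primary" { print $NF }'
}
```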
- Open a terminal to MVM03 (one of the non-Leader nodes). Run the following command:
kubectl get nodes
- Label MVM3 for maintenance:
kubectl label node pfmp-mvm03 cmo.maintenance.mode=true
- Drain node MVM03 so that running Pods are gracefully evicted and rescheduled on a different node. After the drain completes, reboot the node and wait for it to come back up.
- Run the following command to drain the node:
kubectl drain pfmp-mvm03 --ignore-daemonsets --delete-emptydir-data
- Once the node is drained, reboot the node:
sudo reboot
- SSH to MVM02 and run the following command to monitor the rebooted node until it reports a STATUS of Ready:
watch kubectl get nodes
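As an alternative to watching manually, the wait can be scripted. node_ready below is a hypothetical helper that parses `kubectl get nodes` output and succeeds once the named node's STATUS begins with Ready (a still-cordoned node reports Ready,SchedulingDisabled, which also counts here):

```shell
# Hypothetical helper (a sketch): reads `kubectl get nodes` output on
# stdin and exits 0 once the named node's STATUS starts with Ready.
node_ready() {
  # A freshly rebooted, still-cordoned node reports
  # "Ready,SchedulingDisabled", so match on the Ready prefix.
  awk -v n="$1" '$1 == n && $2 ~ /^Ready/ { found = 1 } END { exit !found }'
}
```

For example: `until kubectl get nodes | node_ready pfmp-mvm03; do sleep 10; done`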
- Once MVM03 reports a Ready STATUS, SSH to MVM03 and run the following command to uncordon the node and remove the maintenance label:
kubectl uncordon pfmp-mvm03 ; kubectl label node pfmp-mvm03 cmo.maintenance.mode-
Note: The trailing "-" after cmo.maintenance.mode in the command above is required; it is what removes the label from the node. Do not omit the dash.
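The label-and-drain portion of the steps above can be wrapped in one function for convenience. prep_node_for_reboot is an assumed name, not a product command; the reboot and uncordon remain manual steps:

```shell
# Hypothetical wrapper (a sketch): labels a node for maintenance and
# drains it. Reboot and uncordon are still performed manually.
prep_node_for_reboot() {
  node="$1"
  kubectl label node "$node" cmo.maintenance.mode=true || return 1
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data || return 1
  echo "$node drained; run 'sudo reboot' on $node, then uncordon it once Ready."
}
```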
- Wait 5 to 20 minutes, then run the command in step 1 to view the database cluster health. You can repeat the steps for the next MVM once the output matches the Healthy Database Example below.
- Repeat steps 3-8 on MVM02, then MVM01.
When you have completed the procedure on all three MVMs, run the command in step 1 to verify Postgres database health. One Pod should be the Leader with a state of running. There should be 0 MB of lag, and both Sync Standby members should have a state of streaming.
Healthy Database Example PFMP 4.6:
+ Cluster: postgres-ha-ha +------------------------------------------+--------------+-----------+----+-----------+
| Member                  | Host                                     | Role         | State     | TL | Lag in MB |
+-------------------------+------------------------------------------+--------------+-----------+----+-----------+
| postgres-ha-cmo1-8t2v-0 | postgres-ha-cmo1-8t2v-0.postgres-ha-pods | Leader       | running   | 10 |           |
| postgres-ha-cmo1-h4hx-0 | postgres-ha-cmo1-h4hx-0.postgres-ha-pods | Sync Standby | streaming | 10 |         0 |
| postgres-ha-cmo1-pb88-0 | postgres-ha-cmo1-pb88-0.postgres-ha-pods | Sync Standby | streaming | 10 |         0 |
+-------------------------+------------------------------------------+--------------+-----------+----+-----------+
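The healthy-state check can also be scripted against this output. db_healthy below is a hypothetical helper that reads the `patronictl list` table on stdin and succeeds only when it shows exactly one running Leader and two streaming Sync Standby members:

```shell
# Hypothetical helper (a sketch): exits 0 only when the patronictl
# table shows one running Leader and two streaming Sync Standbys.
db_healthy() {
  awk -F'|' '
    # Column 4 is Role, column 5 is State.
    $4 ~ /Leader/       && $5 ~ /running/   { leader++ }
    $4 ~ /Sync Standby/ && $5 ~ /streaming/ { standby++ }
    END { exit !(leader == 1 && standby == 2) }'
}
```

For example: `kubectl exec ... -- sh -c 'patronictl list' | db_healthy && echo OK`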