PowerFlex 4.x: PFMP MVM Management Nodes Graceful Reboot Procedure
Summary: This procedure describes how to perform a graceful reboot of the PowerFlex Management Platform (PFMP) management nodes while keeping PFMP running throughout. Maintenance is performed on one management node at a time. In this procedure, MVM01 hosts the Postgres Leader, so it is drained and rebooted last.
Instructions
Note: Exercise caution when performing this procedure. Two Management virtual machine (MVM) nodes must be up and running to maintain PFMP functionality.
Rebooting the MVMs can resolve some pod problems, deployment failures, and other errors.
The commands in this procedure are run from a root bash shell. To mirror the steps below, log in to the MVMs as delladmin, then run sudo -s to switch to a root shell.
Example:
delladmin@pfmp-mvm03:~> whoami
delladmin
delladmin@pfmp-mvm03:~> sudo -s
pfmp-mvm03:/home/delladmin # whoami
root
Procedure:
1. List all Postgres database instances and identify the pod name with the Leader role:
kubectl exec -n powerflex -c database $(kubectl get pods -n powerflex -l='postgres-operator.crunchydata.com/role=master, postgres-operator.crunchydata.com/instance-set' | grep Running | cut -d' ' -f1) -- sh -c 'patronictl list'
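For readability, the same command can be split into two stages. This is an equivalent sketch of the one-liner above; the POD variable name is only illustrative:
# Capture the name of the running Postgres instance pod that holds the master role
POD=$(kubectl get pods -n powerflex -l='postgres-operator.crunchydata.com/role=master, postgres-operator.crunchydata.com/instance-set' | grep Running | cut -d' ' -f1)
# List the cluster members and their roles from inside the database container
kubectl exec -n powerflex -c database "$POD" -- sh -c 'patronictl list'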
2. Run the following command to identify which MVM is running the Postgres Leader pod. This is the last node to be drained and rebooted:
for x in `kubectl get pods -n powerflex | grep "postgres-ha-cmo" |awk '{print $1}'` ; do echo $x; kubectl get pods -n powerflex $x -o json | grep '"nodeName"' | cut -d ':' -f2 ; echo " "; done
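The output pairs each Postgres pod with the node it runs on, similar to the following (the pod suffixes and node assignments shown here are illustrative and vary per cluster):

postgres-ha-cmo1-8t2v-0
 "pfmp-mvm01",

postgres-ha-cmo1-h4hx-0
 "pfmp-mvm02",

postgres-ha-cmo1-pb88-0
 "pfmp-mvm03",

Match the Leader pod name from step 1 against this list to find the node hosting the Leader.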
3. Open a terminal to MVM03 and run the following command:
kubectl get nodes
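All three MVMs should report a STATUS of Ready before you begin. Typical output resembles the following (the ROLES, AGE, and VERSION columns vary by PFMP release and are shown here only as placeholders):

NAME         STATUS   ROLES                       AGE   VERSION
pfmp-mvm01   Ready    control-plane,etcd,master   ...   ...
pfmp-mvm02   Ready    control-plane,etcd,master   ...   ...
pfmp-mvm03   Ready    control-plane,etcd,master   ...   ...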
4. Label MVM03 for maintenance:
kubectl label node pfmp-mvm03 cmo.maintenance.mode=true
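Optionally, confirm the label was applied (this check is not part of the original procedure). The command lists only the nodes that carry the maintenance label:

kubectl get nodes -l cmo.maintenance.mode=true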
5. Drain node MVM03 so that running pods are gracefully evicted and rescheduled onto the other nodes. When the drain completes, reboot the node and wait for it to come back up.
Note: In Linux, if you run two commands joined by && (the AND operator) and the first command fails (exits with a nonzero exit code), the second command is not run. This is due to short-circuit evaluation in the shell; see the combined example after this step.
- Run the following command to drain the node:
kubectl drain pfmp-mvm03 --ignore-daemonsets --delete-emptydir-data
- Once the node is drained, reboot the node:
sudo reboot
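Because of the short-circuit behavior described in the note above, the drain and reboot can also be issued as a single line, so that the node reboots only if the drain succeeds:

kubectl drain pfmp-mvm03 --ignore-daemonsets --delete-emptydir-data && sudo reboot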
6. SSH to MVM02 and run the following command to monitor the rebooted node until its STATUS returns to Ready:
watch kubectl get nodes
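While MVM03 reboots, it reports NotReady. Because the node is still cordoned from the drain, it returns as Ready,SchedulingDisabled and stays that way until it is uncordoned in the next step. Illustrative output (node details vary):

NAME         STATUS                        ROLES                       AGE   VERSION
pfmp-mvm01   Ready                         control-plane,etcd,master   ...   ...
pfmp-mvm02   Ready                         control-plane,etcd,master   ...   ...
pfmp-mvm03   NotReady,SchedulingDisabled   control-plane,etcd,master   ...   ...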
7. Once MVM03 reports a Ready STATUS, SSH to MVM03 and run the following command to uncordon the node and remove the maintenance label:
kubectl uncordon pfmp-mvm03 ; kubectl label node pfmp-mvm03 cmo.maintenance.mode-
Note: The "-" after
cmo.maintenance.mode in the command above is very important. Do not forget to include the DASH symbol. This is required to remove the label from the node.
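To verify the label is gone, rerun the optional label check shown under step 4; once the label has been removed, it returns "No resources found":

kubectl get nodes -l cmo.maintenance.mode=true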
8. Wait 5 to 20 minutes, then run the command from step 1 to view the database cluster health. Once the output matches the Healthy Database Example below, you can repeat the steps for the next MVM.
9. Repeat steps 3-8 on MVM02, and then on MVM01.
Note: When performing this procedure on MVM02, use MVM03 for step 6 to monitor MVM02 node status. When working on MVM01, use MVM02 for step 6 to monitor MVM01 node status. Kubectl commands DO NOT work on a node that is not in a Ready state.
Note: During this procedure, the Compliance bundle may go into an ERROR state. Log in to the PFxM UI and click Settings > Compliance Versions; if the Compliance bundle shows an ERROR state, resynchronize it.
When you have completed the procedure on all three MVMs, run the command from step 1 to verify Postgres database health. One pod should hold the Leader role with a state of running, and both Sync Standby members should have a state of streaming with a Lag of 0 MB.
Healthy Database Example:
+ Cluster: postgres-ha-ha +------------------------------------------+--------------+-----------+----+-----------+
| Member                  | Host                                     | Role         | State     | TL | Lag in MB |
+-------------------------+------------------------------------------+--------------+-----------+----+-----------+
| postgres-ha-cmo1-8t2v-0 | postgres-ha-cmo1-8t2v-0.postgres-ha-pods | Leader       | running   | 10 |           |
| postgres-ha-cmo1-h4hx-0 | postgres-ha-cmo1-h4hx-0.postgres-ha-pods | Sync Standby | streaming | 10 |         0 |
| postgres-ha-cmo1-pb88-0 | postgres-ha-cmo1-pb88-0.postgres-ha-pods | Sync Standby | streaming | 10 |         0 |
+-------------------------+------------------------------------------+--------------+-----------+----+-----------+
Affected Products
PowerFlex rack, VxFlex Ready Nodes, PowerFlex custom node, PowerFlex appliance R650, PowerFlex appliance R6525, PowerFlex appliance R660, PowerFlex appliance R6625, PowerFlex appliance R750, PowerFlex appliance R760, PowerFlex appliance R7625, PowerFlex appliance R640, PowerFlex appliance R740XD, PowerFlex appliance R7525, PowerFlex appliance R840
Article Properties
Article Number: 000225550
Article Type: How To
Last Modified: 19 Jun 2025
Version: 12