PowerEdge: vSAN reduced redundancy
Summary: Because VxRack controller clusters operate with three nodes, the vSAN cluster might not have enough resources to fully evacuate a disk group during an upgrade while maintaining full protection. ...
Instructions
Goals
Although the disk format upgrade is optional and the vSAN cluster continues to run smoothly on a previous disk format version, upgrade the objects to the latest on-disk format for best results. The latest on-disk format provides the complete feature set of vSAN.
Facts
Depending on the size of disk groups, the disk format upgrade can be time-consuming because the disk groups are upgraded one at a time. For each disk group upgrade, all data from each device is evacuated and the disk group is removed from the vSAN cluster. The disk group is then added back to vSAN with the new on-disk format.
Once the on-disk format is upgraded, you cannot roll back the software on the hosts or add certain older hosts to the cluster.
When an upgrade of the on-disk format is initiated, vSAN performs several operations that can be monitored from the Resyncing Components page in the vSphere Web Client. See Monitor the Resynchronization Tasks in the vSAN Cluster. You can also use the RVC vsan.upgrade_status <cluster> command to monitor the upgrade. Use the optional -r <seconds> flag to refresh the upgrade status periodically until you press Ctrl+C. The minimum refresh interval is 60 seconds.
You can monitor other upgrade tasks, such as device removal and upgrade, from the vSphere Web Client in the Recent Tasks pane of the status bar.
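The refresh rule for the monitoring command can be captured in a small helper. The following is a minimal Python sketch that only assembles the RVC command string described in this article. The function name build_upgrade_status_command and the constant MIN_REFRESH_SECONDS are hypothetical names used for illustration, and the sketch does not contact vCenter or run RVC.

```python
MIN_REFRESH_SECONDS = 60  # minimum refresh interval stated in this article


def build_upgrade_status_command(cluster_path, refresh_seconds=None):
    """Return the RVC command line for monitoring the on-disk format upgrade.

    cluster_path    -- RVC path of the cluster, or "." for the current directory
    refresh_seconds -- optional refresh interval passed to the -r flag
    """
    command = f"vsan.upgrade_status {cluster_path}"
    if refresh_seconds is None:
        return command  # one-shot status, no periodic refresh
    if refresh_seconds < MIN_REFRESH_SECONDS:
        raise ValueError(
            f"refresh interval must be at least {MIN_REFRESH_SECONDS} seconds"
        )
    return f"{command} -r {refresh_seconds}"


if __name__ == "__main__":
    # Builds the command for the cluster in the current RVC directory.
    print(build_upgrade_status_command(".", 60))
```

Rejecting intervals below 60 seconds before the command is typed avoids a round trip to the console for an input the tool would not accept.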
The following considerations apply when upgrading the disk format:
- If you upgrade a cluster with three hosts, and you want to perform a full evacuation, the evacuation fails for objects with a Primary level of failures to tolerate greater than zero. A three-host cluster cannot reprotect a disk group that is being fully evacuated using the resources of only two hosts. For example, when the Primary level of failures to tolerate is set to 1, vSAN requires three protection components (two mirrors and a witness), where each protection component is placed on a separate host.
- For a three-host cluster, you must choose the Ensure data accessibility evacuation mode. When in this mode, any hardware failure might result in data loss.
- You must also ensure that enough free space is available. The space must be equal to the logical consumed capacity of the largest disk group. This capacity must be available on a disk group separate from the one that is being migrated.
- When upgrading a three-host cluster or when upgrading a cluster with limited resources, allow the virtual machines to operate in a reduced redundancy mode.
- Using the --allow-reduced-redundancy command option means that certain virtual machines might be unable to tolerate failures during the migration. This lowered tolerance for failure can also cause data loss. During the upgrade, the virtual machines and their redundancy are temporarily reported as noncompliant. After the upgrade is completed and all rebuild tasks finish, vSAN restores full redundancy and the virtual machines become compliant again.
- While the upgrade is in progress, do not remove or disconnect any host, and do not place a host in maintenance mode. These actions might cause the upgrade to fail.
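The host-count and free-space considerations in the list above come down to simple arithmetic that can be checked before starting. The sketch below is illustrative only. It assumes mirrored objects, for which tolerating n failures takes 2n + 1 hosts (n + 1 mirrors plus n witnesses, which gives the three components named above for a value of 1). It also simplifies the free-space rule to the worst case of migrating the largest disk group. The function names and any figures passed in are hypothetical; nothing is read from vSAN.

```python
def hosts_required(failures_to_tolerate):
    """Hosts needed to place every protection component on its own host.

    Assumes mirroring: n + 1 mirrors plus n witnesses = 2n + 1 hosts.
    """
    return 2 * failures_to_tolerate + 1


def can_fully_evacuate(total_hosts, failures_to_tolerate):
    """True if objects stay fully protected while one host's disk group is empty."""
    hosts_left = total_hosts - 1  # the host being evacuated cannot hold the data
    return hosts_left >= hosts_required(failures_to_tolerate)


def has_enough_free_space(consumed_by_group, free_by_group):
    """Check the free-space rule for the largest disk group.

    consumed_by_group -- logical consumed capacity per disk group (one unit throughout)
    free_by_group     -- free capacity per disk group (same unit)
    The largest group's consumed capacity must fit on the other disk groups.
    """
    largest = max(consumed_by_group, key=consumed_by_group.get)
    free_elsewhere = sum(
        free for group, free in free_by_group.items() if group != largest
    )
    return free_elsewhere >= consumed_by_group[largest]


if __name__ == "__main__":
    # Three hosts, failures to tolerate = 1: only two hosts remain for three components.
    print(can_fully_evacuate(total_hosts=3, failures_to_tolerate=1))
    # Hypothetical capacities in GB for three disk groups.
    consumed = {"dg1": 800, "dg2": 500, "dg3": 600}
    free = {"dg1": 200, "dg2": 500, "dg3": 400}
    print(has_enough_free_space(consumed, free))
```

With three hosts and a failures-to-tolerate value of 1, the first check returns False, which is why a three-host cluster has to run the upgrade in reduced redundancy mode rather than with a full evacuation.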
Solution
This is a fail-forward-only operation (it cannot be rolled back) and can only be completed from the Ruby vSphere Console (RVC) on the VCSA.
Revalidate the health of vSAN before running the upgrade.
1. After confirming that vSAN is healthy, log in to the VCSA as root and start RVC:
Command> rvc administrator@vsphere.local:VMwar3!!@vcenter_fqdn
2. Change the directory to this vCenter.
cd vcenter_fqdn
3. Change the directory to the Datacenter object.
/vcenter_fqdn> cd DatacenterName/
4. Change the directory to the Computers object.
/vcenter_fqdn/DatacenterName> cd computers/
5. Change the directory to the Cluster that has the VSAN.
/vcenter_fqdn/DatacenterName/computers> cd VXRC
/vcenter_fqdn/DatacenterName/computers/VXRC>
6. Run the upgrade task with reduced redundancy mode.
vsan.ondisk_upgrade . --allow-reduced-redundancy
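The navigation in steps 2 through 5 amounts to building one inventory path. As a rough illustration, the Python sketch below joins the placeholder names used in this article into that path and appends the command from step 6. The helper names cluster_path and ondisk_upgrade_command are hypothetical, and nothing here connects to vCenter or runs RVC.

```python
def cluster_path(vcenter_fqdn, datacenter, cluster):
    """Return the RVC inventory path reached by steps 2 through 5."""
    # "computers" is the folder under the datacenter entered in step 4.
    return "/" + "/".join([vcenter_fqdn, datacenter, "computers", cluster])


def ondisk_upgrade_command(path=".", allow_reduced_redundancy=True):
    """Return the step 6 command; "." means the current RVC directory."""
    command = f"vsan.ondisk_upgrade {path}"
    if allow_reduced_redundancy:
        command += " --allow-reduced-redundancy"
    return command


if __name__ == "__main__":
    path = cluster_path("vcenter_fqdn", "DatacenterName", "VXRC")
    print(path)                      # /vcenter_fqdn/DatacenterName/computers/VXRC
    print(ondisk_upgrade_command())  # vsan.ondisk_upgrade . --allow-reduced-redundancy
```

Writing the path out once makes it easier to confirm that the datacenter and cluster names are correct before the upgrade command is run from that location.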