VxRail: Cluster Shutdown Fails to Power off the Hosts
Summary: VxRail cluster shutdown may fail to power off the hosts if the cluster was upgraded from 4.7.x to a new major version and the hostname or IP address of some hosts was changed.
Symptoms
After a cluster shutdown is initiated, all virtual machines are powered off, but the ESXi hosts are still running.
Error messages similar to the following appear in /scratch/log/shutdown_ESX.log on node one:
Graceful shutdown: 2021-08-13 02:50:41,913.913Z - INFO Start to invoke reboot help script
Graceful shutdown: 2021-08-13 02:51:22,566.566Z - INFO Command result is Begin to prepare the cluster for gracefully rebooting...
Graceful shutdown: 2021-08-13 02:51:22,567.567Z - INFO Command result is Failed to connect host <host_ip_address>, skipping it...
Graceful shutdown: 2021-08-13 02:51:22,567.567Z - INFO Command result is Time among connected hosts are synchronized.
Graceful shutdown: 2021-08-13 02:51:22,567.567Z - INFO Command result is Exiting cluster preparation.
Graceful shutdown: 2021-08-13 02:51:22,567.567Z - INFO Command result is Detected unsupported hosts ['<host_ip_address>']
Graceful shutdown: 2021-08-13 02:51:22,567.567Z - INFO Command result is Please remove above unsupported hosts and try again.
Graceful shutdown: 2021-08-13 02:51:22,567.567Z - ERROR Can not execute the shutdown process due to execute reboot_helper failed for now VC, VXM, PSC already down only hosts alive
Graceful shutdown: 2021-08-13 02:52:00,397.397Z - INFO Deleting /tmp/__shutdown
Graceful shutdown: 2021-08-13 02:52:00,397.397Z - INFO Delete Completely
If you run the reboot_helper.py script on node one with the command python /usr/lib/vmware/vsan/bin/reboot_helper.py prepare, it returns the following error:
[root@esxihost01:~] python /usr/lib/vmware/vsan/bin/reboot_helper.py prepare
Begin to prepare the cluster for gracefully rebooting...
ERROR:root:Failed to test vsan vmodl version with error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1108) on <host_ip_address>'
WARNING:root:Retry retrieving vsan vmodl version, 0
ERROR:root:Failed to test vsan vmodl version with error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1108) on <host_ip_address>'
WARNING:root:Retry retrieving vsan vmodl version, 1
ERROR:root:Failed to test vsan vmodl version with error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1108) on <host_ip_address>'
Failed to connect host <host_ip_address>, skipping it...
Time among connected hosts are synchronized.
Exiting cluster preparation.
Detected unsupported hosts ['<host_ip_address>']
Please remove above unsupported hosts and try again.
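When the cluster has many nodes, it can help to pull the affected addresses out of the script output rather than read it by eye. The following is a minimal sketch, not part of VxRail or vSAN tooling: the function name hosts_with_cert_errors and the sample address 192.0.2.11 are made up for illustration, and the pattern assumes the error lines have the same shape as the output shown above.

```python
import re
from typing import List

# Matches the host that follows the final " on " in a CERTIFICATE_VERIFY_FAILED line.
# Assumption: the line format is the same as in the sample output in this article.
CERT_ERROR = re.compile(r"CERTIFICATE_VERIFY_FAILED.*\son\s+(\S+)")


def hosts_with_cert_errors(output: str) -> List[str]:
    """Return the unique hosts reported with a certificate verification error."""
    hosts: List[str] = []
    for line in output.splitlines():
        match = CERT_ERROR.search(line)
        if match:
            # The sample output ends the address with a stray quote; strip it.
            host = match.group(1).rstrip("'")
            if host not in hosts:
                hosts.append(host)
    return hosts


# Hypothetical sample; 192.0.2.11 is a documentation-only address.
sample = (
    "Begin to prepare the cluster for gracefully rebooting...\n"
    "ERROR:root:Failed to test vsan vmodl version with error "
    "[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get "
    "local issuer certificate (_ssl.c:1108) on 192.0.2.11'\n"
    "WARNING:root:Retry retrieving vsan vmodl version, 0\n"
    "Failed to connect host 192.0.2.11, skipping it...\n"
)

print(hosts_with_cert_errors(sample))
```

The hosts it lists are the ones whose certificates the other nodes could not verify.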
Cause
On 4.7.x, the ESXi host certificate is signed by the Platform Services Controller; on 7.0.x/8.x, it is signed by vCenter.
The VxRail upgrade does not update the ESXi host certificate when upgrading from 4.7.x to 7.0.x.
vCenter does update the certificate on an ESXi host when the host is renamed or its IP address is changed, so that host's certificate differs from the certificates on the other hosts.
During the cluster shutdown process, the reboot_helper.py script establishes a connection to the other hosts and gets the certificate error.
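One way to check for this mismatch before a planned shutdown is to compare the issuer of the certificate that each host presents. The sketch below is illustrative only and rests on several assumptions: it runs from a workstation that can reach the hosts on port 443 and has the openssl command installed, and the function names and sample issuer strings are hypothetical. More than one issuer group suggests that some hosts still carry certificates from the old signer.

```python
import ssl
import subprocess
from typing import Dict, List


def fetch_issuer(host: str, port: int = 443) -> str:
    """Return the issuer line of the certificate presented by host:port.

    Not called in the demo below because it needs network access.
    """
    # With no ca_certs argument, the certificate is fetched without validation,
    # which is what we want here: we only read it, we do not trust it.
    pem = ssl.get_server_certificate((host, port))
    # Assumption: the openssl CLI is available on this machine.
    result = subprocess.run(
        ["openssl", "x509", "-noout", "-issuer"],
        input=pem, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def group_by_issuer(issuers: Dict[str, str]) -> Dict[str, List[str]]:
    """Group host names by the issuer string of their certificate."""
    groups: Dict[str, List[str]] = {}
    for host, issuer in sorted(issuers.items()):
        groups.setdefault(issuer, []).append(host)
    return groups


# Hypothetical data standing in for {host: fetch_issuer(host) for host in hosts}.
example = {
    "esxihost01": "issuer=CN=old-signer",
    "esxihost02": "issuer=CN=old-signer",
    "esxihost03": "issuer=CN=new-signer",
}

for issuer, hosts in group_by_issuer(example).items():
    print(issuer, hosts)
```

In a real check, the example dictionary would be built by calling fetch_issuer for each host in the cluster.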
Resolution
To prevent this issue, renew the certificate on each host after upgrading from 4.7.x to 7.0.x:
- Select the host in vSphere.
- In the right pane, click Configure > Certificate.
- Click the RENEW and REFRESH CA CERTIFICATES buttons.

- After completing the above steps for each host, retry the shutdown procedure.