VxRail: Node expansion validation error that the host CPU is not compatible with the cluster EVC mode
Summary: On a new node running VxRail 8.0.320 to 8.0.361, the VxRail manager virtual machine is powered on during the host's first boot and configured with the Skylake CPU mode. If the cluster EVC mode is lower than Skylake, adding the new node to the cluster reports a CPU incompatibility error. ...
Symptoms
Node expansion validation shows the error "The host CPU is not compatible with the cluster EVC mode".
You may hit this issue when all of the following are true:
- The new node runs a VxRail version from 8.0.320 to 8.0.361.
- The VxRail manager virtual machine on the new node is already powered on.
- The cluster EVC mode is lower than Skylake, for example Haswell.
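The three conditions above can be combined into a quick pre-check before starting node expansion. The following is a minimal sketch only, not a Dell-provided tool: the function names and the `EVC_ORDER` table are hypothetical, only the two EVC levels named in this article are ranked, and the version, power state, and EVC mode are assumed to have been collected by hand.

```python
import re

# Affected VxRail range as stated in this article: 8.0.320 to 8.0.361.
AFFECTED_MIN = (8, 0, 320)
AFFECTED_MAX = (8, 0, 361)

# Hypothetical ranking table. Only the two EVC levels named in the
# article are listed; other baselines would have to be added.
EVC_ORDER = {"haswell": 0, "skylake": 1}


def parse_version(text):
    """Extract the leading major.minor.patch numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def may_hit_evc_issue(node_version, manager_vm_powered_on, cluster_evc):
    """Return True only when all three conditions from the article hold."""
    in_range = AFFECTED_MIN <= parse_version(node_version) <= AFFECTED_MAX
    evc_lower = EVC_ORDER[cluster_evc.lower()] < EVC_ORDER["skylake"]
    return in_range and manager_vm_powered_on and evc_lower


if __name__ == "__main__":
    # Example inputs are illustrative, not taken from a real cluster.
    print(may_hit_evc_issue("8.0.320", True, "Haswell"))
```

A result of `True` means the node matches the symptom profile described here; it does not replace the validation performed by node expansion itself.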
Cause
For VxRail versions 8.0.320 to 8.0.361, the VxRail manager virtual machine is configured with a CPU mode set to Skylake during the host's first boot.
However, if the cluster was initially deployed on a version earlier than 8.0.320, the default cluster EVC level would have been set to Haswell. Even if the cluster has since been upgraded to a newer version, the cluster EVC level remains at Haswell unless it is changed manually. As a result, when you add a new node whose VxRail manager VM is running in Skylake CPU mode, which is higher than the cluster's EVC level (Haswell), the validation process reports a CPU incompatibility error.
Resolution
Run the following commands on each new node that is to be added, to stop the election service and power off the VxRail manager VM.
esxcli daemon control stop -s election
python /opt/vxutils/data/vxrail-primary --stop
Note: The election service will power on the VxRail manager VM on other nodes, so you need to stop the election service on each node before powering off the VxRail manager VM. Do not stop the election service through the UI; use the command instead.
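When several nodes are being added, the order of operations matters. The sketch below shows one reading of the note above: stop the election service on every new node first, then power off the VxRail manager VM on each. It only builds and prints the plan and does not connect to any host. The host names and the `plan_commands` helper are hypothetical; the two command strings are the ones given in this article.

```python
# Commands copied from the Resolution section of this article.
ELECTION_STOP = "esxcli daemon control stop -s election"
MANAGER_VM_STOP = "python /opt/vxutils/data/vxrail-primary --stop"


def plan_commands(hosts):
    """Return (host, command) pairs: all election stops, then all VM power-offs."""
    plan = [(host, ELECTION_STOP) for host in hosts]
    plan += [(host, MANAGER_VM_STOP) for host in hosts]
    return plan


if __name__ == "__main__":
    # Hypothetical host names, used for illustration only.
    for host, command in plan_commands(["new-node-01", "new-node-02"]):
        print(f"{host}: {command}")
```

Printing the plan before running anything makes it easy to confirm that no VxRail manager VM is powered off while the election service is still running on another new node.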
Additional Information
If the cluster was initially deployed on VxRail 8.0.320 to 8.0.361, the cluster default EVC level is set to Skylake, so you will not hit this issue.
In future VxRail versions higher than 8.0.361, the default EVC level for cluster deployment will be set to Haswell again, so adding a new node running 8.0.320 to 8.0.361 may also hit this issue.