VxRail: After replacing a disk, the physical view in VxRail Plug-in is showing the disk slot as unmanaged and the drive configuration empty
Summary: After a disk is replaced, the physical view in the VxRail Plug-in shows the disk slot as unmanaged and the drive configuration as empty.
Symptoms
After a disk is replaced, the physical view in the VxRail Plug-in shows the disk slot as unmanaged and the drive configuration as empty:
Issue 1: The disk is replaced using the standard VMware process instead of the VxRail physical view. This resolves the disk failure itself.
Issue 2: The VxRail physical view then shows the disk with an incorrect serial number and reports the disk as missing.
Issue 3: After the above issues are corrected, the physical view shows the disk slot as unmanaged and the drive configuration as empty.
Cause
The disk was replaced using an unsupported method. Always use the VxRail plugin to replace disks.
A complete vxnode.config file contains a disk segment, a PSU segment, a local_slot_claims segment, a disk_group_options segment, and a disk_group_type segment. At least one of local_slot_claims and disk_group_options must exist in this file. If neither exists, the missing configuration is regenerated from hardware-model-specs.json.
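As a rough illustration of the rule above, the following sketch reports whether a copy of vxnode.config mentions either of the two segments. The function name check_slot_segments is hypothetical, the file path is passed in by the caller because its location is not stated here, and the check is a plain text search that assumes the segment names appear literally in the file.

```shell
# Hypothetical helper, not part of VxRail tooling.
# $1 = path to a copy of vxnode.config (location is an assumption of the caller)
# Assumes the segment names appear literally in the file.
check_slot_segments() {
    config_file="$1"
    found=0
    for segment in local_slot_claims disk_group_options; do
        if grep -q "$segment" "$config_file"; then
            echo "present: $segment"
            found=1
        else
            echo "missing: $segment"
        fi
    done
    # Succeed only if at least one of the two segments was found.
    [ "$found" -eq 1 ]
}

# Example (path is a placeholder):
# check_slot_segments /path/to/vxnode.config
```

A non-zero return value means neither segment was found in the copy that was inspected.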
Resolution
Run the commands below on VxRail Manager as the root user to update the vxnode.config file on the ESXi host with the new disk or PSU serial number and slot information. Replace <ESXi hostname> and <ESXi root password> with the actual values.
To update Disk info:
curl -X POST --unix-socket /var/lib/vxrail/nginx/socket/nginx.sock http://127.0.0.1/rest/vxm/internal/do/v1/hosts/baseline-update -H 'Content-Type: application/json' -d '[{"hostname":"<ESXi hostname>", "username":"root","password":"<ESXi root password>", "update_disk":true}]'
To update PSU info:
curl -X POST --unix-socket /var/lib/vxrail/nginx/socket/nginx.sock http://127.0.0.1/rest/vxm/internal/do/v1/hosts/baseline-update -H 'Content-Type: application/json' -d '[{"hostname":"<ESXi hostname>", "username":"root","password":"<ESXi root password>", "update_psu":true}]'
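The two commands differ only in the final flag, so they can be wrapped in one helper that prompts for the password instead of placing it on the command line. This is a minimal sketch, not a supported tool: the function names build_baseline_payload and run_baseline_update are hypothetical, and the payload builder assumes the hostname and password contain no double quotes or backslashes.

```shell
# Hypothetical helpers, not part of VxRail tooling.

# Build the JSON body for baseline-update.
# $1 = ESXi hostname, $2 = ESXi root password, $3 = update_disk or update_psu
# Assumes $1 and $2 contain no double quotes or backslashes.
build_baseline_payload() {
    printf '[{"hostname":"%s", "username":"root","password":"%s", "%s":true}]' "$1" "$2" "$3"
}

# Prompt for the password without echoing it, then POST the payload.
# $1 = ESXi hostname, $2 = update_disk or update_psu
run_baseline_update() {
    esxi_host="$1"
    update_flag="$2"
    printf 'ESXi root password for %s: ' "$esxi_host" >&2
    stty -echo
    IFS= read -r esxi_password
    stty echo
    printf '\n' >&2
    # "-d @-" makes curl read the request body from standard input.
    build_baseline_payload "$esxi_host" "$esxi_password" "$update_flag" |
        curl -X POST --unix-socket /var/lib/vxrail/nginx/socket/nginx.sock \
            http://127.0.0.1/rest/vxm/internal/do/v1/hosts/baseline-update \
            -H 'Content-Type: application/json' -d @-
}

# Example:
# run_baseline_update "<ESXi hostname>" update_disk
# run_baseline_update "<ESXi hostname>" update_psu
```

Piping the body to curl keeps the password out of the shell history and out of the curl argument list.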
Then restart the services on VxRail Manager:
systemctl restart vmware-marvin
systemctl restart runjars
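After the restart, it can help to confirm that both services came back up. The sketch below uses systemctl is-active for that; the function name check_vxm_services is hypothetical.

```shell
# Hypothetical helper, not part of VxRail tooling.
# Prints the state of each restarted service and returns non-zero
# if either one is not reported as "active".
check_vxm_services() {
    rc=0
    for service in vmware-marvin runjars; do
        # is-active exits non-zero for inactive units, so ignore its status here
        # and judge by the printed state instead.
        state=$(systemctl is-active "$service" || true)
        echo "$service: $state"
        [ "$state" = "active" ] || rc=1
    done
    return "$rc"
}

# Example:
# check_vxm_services || echo "one or more services are not active"
```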
- If the curl command returns an error, check in iDRAC whether the power supplies are listed. If they are not (the TSR report also shows the power supplies as absent), reboot or reset the iDRAC and confirm that the power supplies are listed afterward (the next TSR report should also show them).
- If the baseline-update curl command returns 200 (success) but the vxnode.config file is not updated, review short.term.log on VxRail Manager to identify the problem. One possible cause is that the platform service on the node is not running. Reset the iDRAC and restart the platform service to bring it back up, then run the baseline-update command again.