Dell VxRail: A stale public.operation_status record prevents single node shutdown on VxRail 4.7.410
Summary: Validation for a single node shutdown does not complete, and its tasks are not displayed.
Symptoms
A single node shutdown request from the VxRail plug-in ("Monitor" -> "Physical View" -> "ACTIONS" -> "Shutdown Host") stopped at the "Validate" step.
Three of the four ESXi hosts completed the following steps without problems, but one specific host stopped at the "Validate" step without displaying the usual tasks in the "Shutdown Host - hostname" dialog box:
- VM Migration
- Validate
- Confirm
- Shutdown
Restarting all related services between VxRail Manager and ESXi did not resolve the issue.
Restarting VxRail Manager did not resolve the issue.
Restarting the ESXi host did not resolve the issue.
The management account had no problems.
In the log bundle, a stale public.operation_status record was found (SERVICE_TAG is a placeholder and must be replaced with the actual service tag).
# pwd
/VxRail_Support_Bundle_528b5b3d-d2f4-2f70-fc35-d3e15c274bcc_2022-06-01_00_23_57/vxrail_data_collection_2022-06-01_00_23_57/dump
# grep "^COPY " db_mysticmanager -n |grep public.operation_status -A1
3623:COPY public.operation_status (id, owner, state, error, progress, starttime, endtime, target, step, detail, extension) FROM stdin;
3695:COPY public.power_supply (sn, part_number, revision_number, name, manufacturer, slot, health, missing, appliance_id) FROM stdin;
# expr 3695 - 3623 - 1
71
# grep "^COPY public.operation_status" db_mysticmanager -A71 |grep "^COPY \|HOST_SHUTDOWN[[:space:]]IN_PROGRESS"|grep SERVICE_TAG
3d56845d-32be-4b67-b5a6-f10790ccedcc HOST_SHUTDOWN IN_PROGRESS \N 0 1649927129841 \N SERVICE_TAG \N \N \N
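The line-number arithmetic in the transcript above (grep -n, expr, grep -A) can be avoided with a single awk pass over the dump. The following is a minimal sketch, not a Dell-provided tool: the function name find_stale_shutdown is hypothetical, and it assumes that db_mysticmanager is a plain-text PostgreSQL dump in which COPY rows are tab-separated and each COPY block ends with a line containing only "\.". The column positions (owner = 2, state = 3, target = 8) are taken from the COPY header shown above.

```shell
# Hypothetical helper: print HOST_SHUTDOWN/IN_PROGRESS rows for one service tag
# from the public.operation_status COPY block of a plain-text database dump.
# Usage: find_stale_shutdown DUMP_FILE SERVICE_TAG
find_stale_shutdown() {
    dump_file=$1
    service_tag=$2
    awk -v tag="$service_tag" -F'\t' '
        # Start collecting at the operation_status COPY header.
        /^COPY public\.operation_status / { in_block = 1; next }
        # Stop at the end-of-data marker "\." or at any other COPY header.
        /^\\\.$/ || /^COPY / { in_block = 0 }
        # Columns: 2 = owner, 3 = state, 8 = target (per the COPY header).
        in_block && $2 == "HOST_SHUTDOWN" && $3 == "IN_PROGRESS" && $8 == tag { print }
    ' "$dump_file"
}
```

For example, running find_stale_shutdown db_mysticmanager SERVICE_TAG from the dump directory should print the same record that the grep pipeline above isolates, provided the assumptions about the dump format hold.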
Cause
A stale public.operation_status record prevented the "Validate" step from completing.
Resolution
The issue was resolved by deleting the stale public.operation_status record.
1. Take a snapshot of the VxRail Manager virtual machine in the vSphere Client.
2. Log in to VxRail Manager as mystic using SSH, and then switch to the root user (su -).
3. Run the following command to list the HOST_SHUTDOWN operations that are in the IN_PROGRESS state in the VxRail Manager database.
# psql -U postgres mysticmanager -c "select id, owner, state, error, target from public.operation_status where owner='HOST_SHUTDOWN' and state='IN_PROGRESS';"
4. Identify the "id" of the "IN_PROGRESS" operation in the output from step 3.
5. Run the following command to delete the record from the VxRail Manager database.
# psql -U postgres mysticmanager -c "delete from public.operation_status where id='ID_FROM_PREVIOUS_PSQL_COMMAND_OUTPUT' and owner='HOST_SHUTDOWN' and state='IN_PROGRESS';"
6. Run the following commands to restart the vmware-marvin and runjars services on VxRail Manager.
# systemctl restart vmware-marvin
# systemctl restart runjars
7. Retry the single node shutdown from the VxRail plug-in ("Monitor" -> "Physical View" -> "ACTIONS" -> "Shutdown Host") and check the result of the "Validate" step.
8. If the "Validate" step completes, you can delete the snapshot of the VxRail Manager virtual machine.
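Because the delete command in the resolution steps is destructive, it may help to check the pasted ID before running it. The sketch below is an illustration only, not part of the documented procedure: the function name build_delete_sql is hypothetical, and it assumes that operation IDs are always canonical 36-character UUIDs, as in the record found in the log bundle. It prints the same delete statement used in the steps above, or fails if the ID does not look like a UUID.

```shell
# Hypothetical helper: validate an operation ID and print the delete statement.
# Usage: build_delete_sql OPERATION_ID
build_delete_sql() {
    id=$1
    # Assumption: IDs are canonical UUIDs (36 characters, 8-4-4-4-12 hex groups).
    if [ "${#id}" -ne 36 ] || ! printf '%s\n' "$id" | grep -Eq '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'; then
        echo "error: '$id' does not look like a UUID" >&2
        return 1
    fi
    # Same statement as in the resolution steps, with the validated ID filled in.
    printf "delete from public.operation_status where id='%s' and owner='HOST_SHUTDOWN' and state='IN_PROGRESS';\n" "$id"
}
```

One possible use is to store the result first, for example sql=$(build_delete_sql 3d56845d-32be-4b67-b5a6-f10790ccedcc), and run psql -U postgres mysticmanager -c "$sql" only if the function succeeded. The owner and state conditions are kept so that a mistyped but well-formed ID cannot remove a record of another type.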