VxRail: vSAN Object Inaccessible, Disk Failure, Excessive I/O Latency, Disk Overall Health Red

Summary: Do not remove disks during a vSAN resync, as doing so can result in data loss.

Symptoms

This article applies to both VxRail 7.x and VxRail 8.x.

The vSAN health check reports a disk failure, or vmware-vsan-health-summary-result.log shows the physdiskoverall health test as red or yellow.

VxRail-Virtual-SAN-Cluster-xxxxxxxxx  Overall Health : red
   Group physicaldisks health : red
      Test physdiskoverall health : red
         DisksWithIssues: Host  Disk  OverallOperationHealth  Metadata  Operational  InCmmds/Vsi  OperationalState  Recommendation  Uuid
                             (Host-10, LocalToshibaDisk(Naa.50000xxxxxxxxxx), Red, Green, Red, Yes/Yes, ImpendingPermanentDiskFailure,EvacuationFailedDueToInaccessibleObjects, PleaseReferTo'Data'HealthCheckAndResolveTheInaccessibleObjects

vsandevicemonitord.log reports:

INFO vsandevicemonitord WARNING - WRITE Average Latency on VSAN device naa.50000xxxxxxxx has exceeded threshold value 2000000 us 2 times.
INFO vsandevicemonitord Tier 2 (naa.50000xxxxxxxx) as unhealthy

Cause

The Dying Disk Handling (DDH) feature of vSAN diagnoses disk or disk group health by detecting either excessive I/O latency for a vSAN disk or maximum log congestion that vSAN determines to be due to log leak issues in a vSAN disk group over an extended period. Unhealthy disks or disk groups are marked as such and are no longer used for new data placement.

When DDH detects that a disk has exceeded the I/O latency threshold during the monitoring interval, vSAN generates a VMkernel Observation (VOB) and logs a message to the vsandevicemonitord.log file in the /var/run/log directory. The log entry below is an example for a disk that must be replaced once the required data evacuation is complete and the disk is in an evacuated state:

WARNING - WRITE Average Latency on VSAN device <NAA disk name> has exceeded threshold value <IO latency threshold for disk> us <# of intervals with excessive IO latency> times.
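
When many of these entries need to be reviewed, the fields can be pulled out with a regular expression. The following is a minimal Python sketch, assuming the log lines follow the template above exactly; the function name parse_latency_warning and the returned field names are illustrative and not part of any vSAN tooling.

```python
import re

# Pattern modeled on the message template shown in this article.
# Assumption: real log lines use the same wording and single spaces.
LATENCY_RE = re.compile(
    r"WARNING - (?P<op>\w+) Average Latency on VSAN device "
    r"(?P<device>\S+) has exceeded threshold value "
    r"(?P<threshold_us>\d+) us (?P<intervals>\d+) times\."
)


def parse_latency_warning(line):
    """Return the fields of a DDH latency warning, or None if the line does not match."""
    match = LATENCY_RE.search(line)
    if match is None:
        return None
    return {
        "operation": match.group("op"),
        "device": match.group("device"),
        "threshold_us": int(match.group("threshold_us")),
        "intervals": int(match.group("intervals")),
    }


# Sample line taken from the Symptoms section of this article.
sample = (
    "INFO vsandevicemonitord WARNING - WRITE Average Latency on VSAN device "
    "naa.50000xxxxxxxx has exceeded threshold value 2000000 us 2 times."
)
print(parse_latency_warning(sample))
```

For the sample line, the threshold of 2000000 us corresponds to 2 seconds of average write latency.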

When DDH detects that a caching tier has excessive log congestion during the monitoring interval, vSAN generates a VOB and logs a message to the vsandevicemonitord.log file. Excessive log congestion messages are in this format:

WARNING - Maximum log congestion on VSAN device <NAA disk name> <current intervals with excessive log congestion>/<intervals required to be unhealthy>
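
The two counters at the end of this message can be compared to see how close a device is to being marked unhealthy. The following is a minimal Python sketch, assuming the message matches the template above; the function name, the field names, and the sample line (including its 3/4 values) are hypothetical and were made up for illustration.

```python
import re

# Pattern modeled on the log congestion template shown in this article.
# Assumption: both counters are plain integers separated by a slash.
CONGESTION_RE = re.compile(
    r"WARNING - Maximum log congestion on VSAN device "
    r"(?P<device>\S+) (?P<current>\d+)/(?P<required>\d+)"
)


def parse_congestion_warning(line):
    """Return the fields of a DDH log congestion warning, or None if no match."""
    match = CONGESTION_RE.search(line)
    if match is None:
        return None
    current = int(match.group("current"))
    required = int(match.group("required"))
    return {
        "device": match.group("device"),
        "current_intervals": current,
        "required_intervals": required,
        # True once the observed count has reached the count required
        # for the device to be considered unhealthy.
        "reached_required_count": current >= required,
    }


# Hypothetical sample line; the device name and 3/4 values are invented.
sample = "WARNING - Maximum log congestion on VSAN device naa.50000xxxxxxxx 3/4"
print(parse_congestion_warning(sample))
```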

In both of these situations, vSAN triggers the evacuation of some or all data from the affected disk or disk groups. The overall disks health section in the vSAN health monitoring UI reports any of the following operational states for the affected disk or disk groups along with recommendations for the user. The recommendations after the evacuation is complete differ depending on whether vSAN detected excessive I/O latencies or excessive log congestion.

Resolution

See VMware article 326878, Dying Disk Handling (DDH) in vSAN.

Do not remove or replace a disk while a vSAN resync is ongoing and the disk reports either of the following operational states. Doing so may cause data loss.

Impending permanent disk failure, data evacuation failed due to insufficient resources (Health state - Red)

Or

Impending permanent disk failure, data evacuation failed due to inaccessible objects (Health state - Red)

Do not remove or replace a disk while any object is inaccessible.
Object inaccessible means that all copies of the object are missing. Removing or replacing a disk in this state may cause data loss.
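
The conditions above can be written down as a simple pre-removal checklist. The following Python sketch is one conservative reading of this article and is not a vSAN API: the DiskStatus type, its fields, and the removal_blockers function are hypothetical, and the values must be filled in by the operator from the vSAN health UI. An empty result means only that none of the conditions listed in this article were found, not that removal is safe.

```python
from dataclasses import dataclass

# Operational state fragments named in this article (lowercased for matching).
EVACUATION_FAILED_MARKERS = (
    "data evacuation failed due to insufficient resources",
    "data evacuation failed due to inaccessible objects",
)


@dataclass
class DiskStatus:
    """Hypothetical container for values read manually from the vSAN health UI."""
    operational_state: str
    resync_in_progress: bool
    inaccessible_objects: int


def removal_blockers(status):
    """Return the reasons, per this article, not to remove or replace the disk."""
    reasons = []
    if status.resync_in_progress:
        reasons.append("vSAN resync is in progress")
    state = status.operational_state.lower()
    for marker in EVACUATION_FAILED_MARKERS:
        if marker in state:
            reasons.append("evacuation incomplete: " + marker)
    if status.inaccessible_objects > 0:
        reasons.append(f"{status.inaccessible_objects} inaccessible object(s)")
    return reasons


# Example values are illustrative only.
status = DiskStatus(
    operational_state=(
        "Impending permanent disk failure, "
        "data evacuation failed due to inaccessible objects"
    ),
    resync_in_progress=True,
    inaccessible_objects=2,
)
for reason in removal_blockers(status):
    print("Do not remove disk:", reason)
```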

Workaround:

  1. Engage VMware Support.
  2. If excessive I/O latency caused the capacity disk to be marked unhealthy, recover the disk by unmounting and then remounting its disk group. Remounting does not change the vSAN UUID of the disk.
esxcli vsan storage diskgroup unmount -u <disk group UUID>
esxcli vsan storage diskgroup mount -u <disk group UUID>

Affected Products

VxRail, VxRail Appliance Series, VxRail Software
Article Properties
Article Number: 000186364
Article Type: Solution
Last Modified: 17 Jun 2025
Version:  9