Article Number: 000019587

VxRail: Details on Data health – vSAN Object Health check in the vSAN Health Service


Article Content

Instructions

This article describes the Data health – vSAN Object Health check in the vSAN Health Service and provides details on why it might report an error.

These are the possible states that an object may have, including the healthy state.
They can be viewed in the vSAN web client under Monitor > Health > Data.

Data move: vSAN is building data on the ESXi hosts and storage in the cluster, either because you requested some form of maintenance mode or evacuation, or because of rebalancing activities. Objects in this state are fully compliant with their policy and are healthy, but vSAN is actively rebuilding them. The object is not at risk, so there is no cause for concern. However, a performance impact can be expected while objects are in this state. Cross-reference the re-syncing components view to learn more about active data sync activities.

Healthy: The object is in perfect condition, exactly aligned with its policy, and is not currently being moved or otherwise worked on.

Inaccessible: An object has suffered more failures (permanent or temporary) than it was configured to tolerate, and is currently unavailable and inaccessible. An ESXi host reboot is an example of a temporary failure. If the failures are not temporary, work on the underlying root cause (such as failed ESXi hosts, a failed network, or removed disks) as quickly as possible to restore availability, because virtual machines that use these objects cannot function correctly while the objects are inaccessible.

Non-availability related incompliance: This is a catch-all state that is used when none of the other states apply. An object in this state is not compliant with its policy, but it does meet the availability (NumberOfFailuresToTolerate) policy. There is currently no documented case where this state would be applicable.

Non-availability related reconfig: vSAN is rebuilding data on the ESXi hosts and storage in the cluster because you requested a storage policy change that is unrelated to availability. In other words, such an object is fully in compliance with the NumberOfFailuresToTolerate policy and the data movement is to satisfy another policy change, such as NumberOfDiskStripesPerObject. You do not need to worry about an object in this state, as it is not at risk.

Reduced availability - active rebuild: The object has suffered a failure, but it was configured to be able to tolerate the failure. I/O continues to flow and the object is accessible. vSAN is actively working on re-protecting the object by rebuilding new components to bring the object back to compliance.

Reduced availability with no rebuild: The object has suffered a failure, but vSAN was able to tolerate it. I/O is flowing and the object is accessible. However, vSAN is not working on re-protecting the object. This is not due to the delay timer (see Reduced availability with no rebuild - delay timer) but due to other reasons. There may not be enough resources in the cluster now, there may not have been enough resources in the past, or an earlier attempt to re-protect may have failed and vSAN has yet to retry. Refer to the limits health check for a first assessment of whether any resources are exhausted. Resolve the failure or add resources as quickly as possible to become fully protected against a subsequent failure again.

Reduced availability with no rebuild - delay timer: The object has suffered a failure, but vSAN was able to tolerate it. I/O is flowing and the object is accessible. However, vSAN is not yet working on re-protecting the object, as it is waiting for the 60-minute (default) delay timer to expire before issuing the re-protect.
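The arithmetic behind the delay timer can be sketched in a few lines of Python. This is an illustration only: the function names and the `delay_minutes` parameter are hypothetical and are not part of any vSAN API. The only value taken from this article is the 60-minute default.

```python
from datetime import datetime, timedelta

# Default delay stated in this article; the name is hypothetical.
DEFAULT_DELAY_MINUTES = 60


def reprotect_start(failure_time, delay_minutes=DEFAULT_DELAY_MINUTES):
    """Return the time at which the delay timer would expire."""
    return failure_time + timedelta(minutes=delay_minutes)


def minutes_remaining(failure_time, now, delay_minutes=DEFAULT_DELAY_MINUTES):
    """Return the minutes left on the delay timer, never less than zero."""
    remaining = reprotect_start(failure_time, delay_minutes) - now
    return max(0.0, remaining.total_seconds() / 60)


if __name__ == "__main__":
    failed_at = datetime(2024, 1, 1, 10, 0)
    print(reprotect_start(failed_at))
    print(minutes_remaining(failed_at, datetime(2024, 1, 1, 10, 25)))
```

If the failed entity is expected back before the computed expiry time, waiting is usually the faster path. If it is not, an immediate re-protect may be the better choice.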

You can choose to issue an explicit request to skip the delay timer and start re-protect immediately, if it is known that the failed entity cannot be recovered within the delay period.

However, if you know that the failed host is actively rebooting, or that the wrong drive was pulled and is being reinserted, it is advisable to wait for those tasks to finish, as that is the quickest way to fully re-protect the object.
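The states described above can be summarized as a small triage table. The following Python sketch is one possible way to encode them. The `Urgency` levels, the function names, and the lowercase state keys are assumptions made for illustration: the keys follow the labels used in this article and may not match the exact strings that the vSAN Health Service returns.

```python
from enum import IntEnum


class Urgency(IntEnum):
    NONE = 0     # object is not at risk
    MONITOR = 1  # failure tolerated; vSAN is rebuilding or waiting
    ACT = 2      # resolve the root cause as quickly as possible


# Keys follow the labels in this article (lowercased), not an official API.
STATE_URGENCY = {
    "healthy": Urgency.NONE,
    "data move": Urgency.NONE,
    "non-availability related reconfig": Urgency.NONE,
    # The article gives no action for this state; MONITOR is an assumption.
    "non-availability related incompliance": Urgency.MONITOR,
    "reduced availability - active rebuild": Urgency.MONITOR,
    "reduced availability with no rebuild - delay timer": Urgency.MONITOR,
    "reduced availability with no rebuild": Urgency.ACT,
    "inaccessible": Urgency.ACT,
}


def triage(state):
    """Map a state label to an urgency level; unknown labels raise ValueError."""
    key = state.strip().lower()
    if key not in STATE_URGENCY:
        raise ValueError(f"unknown object health state: {state!r}")
    return STATE_URGENCY[key]


def worst_urgency(counts):
    """Return the highest urgency among states with a non-zero object count."""
    levels = (triage(state) for state, n in counts.items() if n > 0)
    return max(levels, default=Urgency.NONE)


if __name__ == "__main__":
    report = {"Healthy": 120, "Reduced availability - active rebuild": 3}
    print(worst_urgency(report).name)
```

Raising an error for unknown labels is a deliberate choice in this sketch, so that a state that is not covered by this article is never silently treated as safe.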
