Article Number: 000168665


VxRail: vSAN Health Queries Fail with "VsanHealthPropertyProvider" Timeouts When Running Veeam ONE or Other Monitoring Tools

Article Content


Symptoms



After upgrading to VxRail 4.7.x, you may see the following errors in vCenter if certain versions of Veeam ONE, or other software that monitors vSAN, are running and providing analytics for the cluster.
  • Any interface under Cluster > Monitor > vSAN may fail to load or may report the following error:
    • The query execution timed out because of a back-end property provider 'com.vmware.vsphere.client.vsan.health.VsanHealthPropertyProvider' which took more than 120 seconds.
  • Cluster > Manage > vSAN > Physical Disks may report 0 disks & disk groups for one or more hosts
  • vSAN > Physical Disks will be blank or only report disks for certain nodes
  • vSAN > Virtual Objects will be blank or only report certain VMs
  • vSAN > Resyncing Components will not report on syncing data
  • vSAN > Health may be reporting multiple errors
    • Network - Hosts with connectivity issues
    • Network - All hosts have a vSAN vmknic configured
    • Cluster - Advanced Configuration in Sync
      • HostX reporting inconsistent configuration
    • Cluster - Disk format version
      • HostX warning that on-disk format needs to be updated
    • Limits - After 1 additional host failure
      • Component utilization - 0% (X of 0)
      • Disk space utilization - 0% (XGB of 0GB)
  • Placing any host in Maintenance Mode may fail with "General vSAN Error"
  • Each host will constantly report the following messages in the vmkernel.log:
[root@NodeX:~] tail /var/log/vmkernel.log
2018-12-06T14:55:30.958Z cpu3:2550663)MemSchedAdmit: 470: Admission failure in path: vsanperfsvc/python.2550635/uw.2550635
2018-12-06T14:55:30.958Z cpu3:2550663)MemSchedAdmit: 477: uw.2550635 (3479393) extraMin/extraFromParent: 64/64, vsanperfsvc (2360) childEmin/eMinLimit: 38886/38912
2018-12-06T14:55:30.958Z cpu3:2550663)MemSchedAdmit: 470: Admission failure in path: vsanperfsvc/python.2550635/uw.2550635
2018-12-06T14:55:30.958Z cpu3:2550663)MemSchedAdmit: 477: uw.2550635 (3479393) extraMin/extraFromParent: 64/64, vsanperfsvc (2360) childEmin/eMinLimit: 38886/38912
2018-12-06T14:55:30.958Z cpu3:2550663)MemSchedAdmit: 470: Admission failure in path: vsanperfsvc/python.2550635/uw.2550635
2018-12-06T14:55:30.958Z cpu3:2550663)MemSchedAdmit: 477: uw.2550635 (3479393) extraMin/extraFromParent: 64/64, vsanperfsvc (2360) childEmin/eMinLimit: 38886/38912
2018-12-06T14:55:30.958Z cpu3:2550663)MemSchedAdmit: 470: Admission failure in path: vsanperfsvc/python.2550635/uw.2550635
2018-12-06T14:55:30.958Z cpu3:2550663)MemSchedAdmit: 477: uw.2550635 (3479393) extraMin/extraFromParent: 64/64, vsanperfsvc (2360) childEmin/eMinLimit: 38886/38912
2018-12-06T14:55:30.958Z cpu3:2550663)MemSchedAdmit: 470: Admission failure in path: vsanperfsvc/python.2550635/uw.2550635
2018-12-06T14:55:30.958Z cpu3:2550663)MemSchedAdmit: 477: uw.2550635 (3479393) extraMin/extraFromParent: 64/64, vsanperfsvc (2360) childEmin/eMinLimit: 38886/38912


  • Each host will report the following status when attempting to restart the vsanmgmtd service:
[root@NodeX:~] /etc/init.d/vsanmgmtd status
vsanperfsvc is running
[root@NodeX:~] /etc/init.d/vsanmgmtd restart
watchdog-vsanperfsvc: Terminating watchdog process with PID 2099001
stopping timed out
vsanperfsvc started

  • Each host will have python-zdump.001, python-zdump.002, and python-zdump.003 files in /var/core that grow and rotate roughly every 10 minutes:
[root@NodeX:~] ls -lah /var/core/
total 595924
drwxr-xr-x    1 root     root           8 Dec  6 15:40 .
drwxr-xr-x    1 root     root           8 Jan  1  1970 ..
-rwx------    1 root     root        1.3M Dec  5 13:58 cmmds-tool-zdump.000
-rwx------    1 root     root       63.8M Dec  4 19:25 hostd-worker-zdump.000
-rwx------    1 root     root       67.1M Dec  4 18:48 hostd-worker-zdump.002
-rwx------    1 root     root       69.6M Dec  4 19:04 hostd-worker-zdump.003
-rwx------    1 root     root        3.9M Dec  6 15:18 localcli-zdump.000
-rwx------    1 root     root        3.3M Dec  6 08:14 localcli-zdump.002
-rwx------    1 root     root        3.9M Dec  6 15:15 localcli-zdump.003
-rwx------    1 root     root      178.0M Dec  6 15:28 python-zdump.001
-rwx------    1 root     root      178.0M Dec  6 15:37 python-zdump.002
-rwx------    1 root     root       12.6M Dec  6 15:40 python-zdump.003

  • Errors may clear after a reboot but will return after a short period.
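
The vmkernel.log symptom listed above can be checked across several hosts by counting the admission-failure lines per resource pool. The following is a minimal sketch in Python, assuming the log lines follow the format shown in the sample output; the function name `count_admission_failures` and the regular expression are illustrative and not part of any VMware or Dell tooling.

```python
import re
from collections import Counter

# Matches vmkernel.log lines of the form shown in this article, for example:
#   <timestamp> cpu3:2550663)MemSchedAdmit: 470: Admission failure in path: vsanperfsvc/python.2550635/uw.2550635
# The pattern is an assumption based on that sample only.
ADMISSION_FAILURE = re.compile(
    r"^(?P<timestamp>\S+)\s+\S+MemSchedAdmit: \d+: "
    r"Admission failure in path: (?P<path>\S+)"
)


def count_admission_failures(lines):
    """Count admission failures, keyed by the first element of the path."""
    counts = Counter()
    for line in lines:
        match = ADMISSION_FAILURE.match(line)
        if match:
            # "vsanperfsvc/python.2550635/uw.2550635" -> "vsanperfsvc"
            top_level = match.group("path").split("/", 1)[0]
            counts[top_level] += 1
    return counts
```

Passing the lines of a vmkernel.log file copied from a host (for example, `count_admission_failures(open("vmkernel.log"))`) returns a counter; a large and growing count under `vsanperfsvc` would be consistent with the symptom described in this article, although it is not proof of this cause on its own.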

Cause

These errors occur because the vsanmgmtd service runs out of memory, crashes, and then attempts to restart on each host. Each time the service crashes in this situation, a python-zdump file is created. While vsanmgmtd is unresponsive or has crashed, the host can no longer report certain vSAN statistics to vCenter. As a result, vCenter reports false or incomplete information for one or more hosts, which causes many regular management operations to time out or fail.
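
Because each crash leaves a python-zdump file behind, the spacing between the files' modification times gives a rough indication of how often the service is crashing. The following is a minimal sketch in Python, assuming the dump files have been copied to, or are readable from, a local directory; the function name `zdump_intervals` and the default `python-zdump.` prefix are illustrative choices based on the file names shown in the Symptoms section.

```python
from pathlib import Path


def zdump_intervals(core_dir, prefix="python-zdump."):
    """Return the gaps, in seconds, between consecutive dump files.

    Files are ordered by modification time, so the result approximates
    the time between successive crashes of the dumping process.
    """
    mtimes = sorted(
        path.stat().st_mtime
        for path in Path(core_dir).glob(prefix + "*")
        if path.is_file()
    )
    # Pair each timestamp with the next one and take the difference.
    return [later - earlier for earlier, later in zip(mtimes, mtimes[1:])]
```

Intervals of roughly 600 seconds would match the "roughly every 10 minutes" rotation described in the Symptoms section. The most recent file may still be growing, so the last interval can be shorter than the others.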

The vsanmgmtd service is commonly known as the vSAN Health Service or vSAN Management Service. vCenter is the primary consumer of this service and makes API calls to it for vSAN reporting on various components. VMware also allows other software vendors to use this API so that their software can provide reporting, monitoring, or analytical services. Because this API is generally updated with every version of vSAN, it is important to keep external software updated to maintain compatibility. In this situation, the version incompatibility led to the vsanmgmtd service being polled too often, which in turn led to memory exhaustion and crashes.

Customers using these tools, such as Veeam ONE, may still be running versions that only support ESXi 6.5 / vSAN 6.6.1 and below at the time of upgrading to VxRail 4.7.x. For this reason, it is imperative to verify that all external software in the vSAN environment is compatible with the target vSAN version before upgrading.

Resolution

Upgrade Veeam ONE to version 9.5 Update 3, or upgrade any other monitoring software to a version that supports ESXi / vSAN 6.7, before upgrading VxRail. Refer to each vendor's compatibility matrix to determine the supported versions.

Article Properties


Product

VxRail Appliance Family, VxRail Software

Last Published Date

10 May 2023

Version

3

Article Type

Solution