VPLEX: Automated Metadata Backup Fails due to "the Active Metadata Device is Not Healthy"

Summary: This article addresses an issue where the automated metadata backup fails with the error "The active metadata device is not healthy" and provides the remediation steps to resolve it. ...

Symptoms

Issue:
The automated metadata backup fails with the following error:

The active metadata device is not healthy

From the Client logs:

2017-12-08 23:30:11,225 WARN [DefaultCommandHandler-Thread-7977]MetadataBackupManager: The automated backup of the meta-volume could not be completed: Evaluation of <<meta-volume backup -c /clusters/cluster-2 --storage-volumes VPD83T3:600601601b003e008a7a68464d8ce511 --force>> failed.
Failed to backup the active meta-volume.
The meta-volume is unable to accept I/O.
Firmware command error.
The active metadata device is not healthy.

From the firmware logs: 

localhost:5988:null:1:<3>2018/01/12 23:59:12.776: sms/6 The automated backup of the meta-volume could not be  completed.
localhost:5988:null:1:<3>2018/01/15 23:59:19.268: sms/6 The automated backup of the meta-volume could not be  completed.
localhost:5988:null:1:<3>2018/01/16 23:59:13.657: sms/6 The automated backup of the meta-volume could not be  completed.
localhost:5988:null:1:<3>2018/01/17 23:59:18.336: sms/6 The automated backup of the meta-volume could not be  completed.
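
To see on which nights the backup failed, the firmware log lines above can be filtered with a short script. This is a minimal sketch, assuming the log lines have exactly the layout shown in the excerpt; the function name `failed_backup_times` and the regular expression are illustrative and are not part of any VPLEX tooling.

```python
import re
from datetime import datetime

# Pattern modeled on the sample lines above, e.g.
# "<3>2018/01/12 23:59:12.776: sms/6 The automated backup of the meta-volume ..."
# (assumption: other firmware versions may format these lines differently)
FAILURE_RE = re.compile(
    r"(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\.\d+: sms/6 "
    r"The automated backup of the meta-volume could not be\s+completed"
)

def failed_backup_times(lines):
    """Return the timestamps of automated meta-volume backup failures."""
    times = []
    for line in lines:
        match = FAILURE_RE.search(line)
        if match:
            times.append(datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S"))
    return times

sample = [
    "localhost:5988:null:1:<3>2018/01/12 23:59:12.776: sms/6 The automated backup of the meta-volume could not be  completed.",
    "localhost:5988:null:1:<3>2018/01/15 23:59:19.268: sms/6 The automated backup of the meta-volume could not be  completed.",
]

for ts in failed_backup_times(sample):
    print(ts.isoformat())
```

Lines that do not match the pattern are ignored, so the same function can be run over a full log file.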

The customer is using the same name for the meta-volume on both clusters:

Cluster cluster-1:

Name                                 Volume Type     Operational  Health  Active  Ready  Geometry  Component  Block     Block  Capacity  Slots
-----------------------------------  --------------  Status       State   ------  -----  --------  Count      Count     Size   --------  -----
-----------------------------------  --------------  -----------  ------  ------  -----  --------  ---------  --------  -----  --------  -----
Log-Vol_vol                          logging-volume  ok           ok      -       -      raid-1    2          5243040   4K     20G       -
Metadatavol                          meta-volume     ok           ok      true    true   raid-1    2          20971424  4K     80G       64000
metadatavol_backup_2018Jan16_235913  meta-volume     ok           ok      false   true   raid-1    1          20971424  4K     80G       64000
Metadatavol_backup_2018Jan17_235918  meta-volume     ok           ok      false   true   raid-1    1          20971424  4K     80G       64000


Cluster cluster-2:

Name                                 Volume Type     Operational  Health  Active  Ready  Geometry  Component  Block     Block  Capacity  Slots
-----------------------------------  --------------  Status       State   ------  -----  --------  Count      Count     Size   --------  -----
-----------------------------------  --------------  -----------  ------  ------  -----  --------  ---------  --------  -----  --------  -----
Log-Vol_vol                          logging-volume  ok           ok      -       -      raid-1    2          5243040   4K     20G       -
Metadatavol                          meta-volume     ok           ok      true    true   raid-1    2          20971424  4K     80G       64000
metadatavol_backup_2018Jan16_235913  meta-volume     ok           ok      false   true   raid-1    1          20971424  4K     80G       64000
Metadatavol_backup_2018Jan17_235918  meta-volume     ok           ok      false   true   raid-1    1          20971424  4K     80G       64000

Cause

The UI code uses the volume name to filter events. In this case, the meta-volume in cluster-2 has the same name as the cluster-1 meta-volume, which confuses cluster-2.

The active meta-volumes have the same name on both clusters, and the automated backups are scheduled at the same time on both clusters (23:30 VPLEX time). As a result, when cluster-2 receives cluster-1's amf/203 event, it mistakenly concludes that its own amf meta restore has completed, when in reality only the cluster-1 amf meta restore is done.
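
The effect of filtering by name alone can be illustrated with a small model. This is a hypothetical sketch and not the actual VPLEX code: the `Event` class, its fields, the function `matches_by_name`, and the names `Metadatavol_C1` and `Metadatavol_C2` are all assumptions made for illustration. Only the event code amf/203 and the name-based filtering come from the description above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    code: str     # event code; "amf/203" is the event named in this article
    cluster: str  # cluster that emitted the event (hypothetical field)
    volume: str   # meta-volume name carried by the event (hypothetical field)

def matches_by_name(event, waiting_volume):
    """Name-only filter: the cluster of origin is not checked."""
    return event.code == "amf/203" and event.volume == waiting_volume

# Same name on both clusters: cluster-2 is waiting on its own "Metadatavol",
# but the event emitted by cluster-1 also passes the filter.
event_from_c1 = Event("amf/203", "cluster-1", "Metadatavol")
print(matches_by_name(event_from_c1, "Metadatavol"))

# Unique names per cluster (illustrative names): cluster-1's event
# no longer passes the filter that cluster-2 applies.
event_from_c1_unique = Event("amf/203", "cluster-1", "Metadatavol_C1")
print(matches_by_name(event_from_c1_unique, "Metadatavol_C2"))
```

In this model the first check succeeds even though the event belongs to the other cluster, which corresponds to the confusion described above; with distinct names the second check fails as intended.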

Resolution

Workaround:
To avoid this issue, give the meta-volume on each cluster (cluster-1 and cluster-2) a unique name.
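
Whether two clusters share an active meta-volume name can be checked from saved output of the `ll /clusters/*/system-volumes/` command that appears in this article. The following is a minimal sketch, assuming the listing has the column layout shown in this article (Name, Volume Type, Operational Status, Health State, Active, ...); the function names are illustrative and the script only reads text, it does not talk to a VPLEX system.

```python
import re
from collections import defaultdict

# Matches section headers such as "/clusters/cluster-1/system-volumes:"
CLUSTER_RE = re.compile(r"^/clusters/([^/]+)/system-volumes:")

def active_meta_volumes(listing):
    """Map each active meta-volume name to the clusters that use it."""
    clusters_by_name = defaultdict(list)
    cluster = None
    for line in listing.splitlines():
        header = CLUSTER_RE.match(line.strip())
        if header:
            cluster = header.group(1)
            continue
        fields = line.split()
        # Assumed row layout: name, type, operational status, health state, active, ...
        if cluster and len(fields) >= 5 and fields[1] == "meta-volume" and fields[4] == "true":
            clusters_by_name[fields[0]].append(cluster)
    return dict(clusters_by_name)

def shared_names(listing):
    """Return active meta-volume names that appear on more than one cluster."""
    found = active_meta_volumes(listing)
    return sorted(name for name, clusters in found.items() if len(set(clusters)) > 1)

sample = """\
/clusters/cluster-1/system-volumes:
META_01                          meta-volume     ok           ok      true    true   raid-1    2          20446976  4K     78G       64000
META_01_backup_2018Jul28_203712  meta-volume     ok           ok      false   true   raid-1    1          20446976  4K     78G       64000

/clusters/cluster-2/system-volumes:
META_01                          meta-volume     ok           ok      true    true   raid-1    2          20446976  4K     78G       64000
"""

print(shared_names(sample))
```

An empty result means every active meta-volume name is used by one cluster only. Backup volumes are skipped because their Active column is `false`.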

Additional Information

This issue can also manifest as disappearing backup meta-volumes or missing backup dates, for example:

VPlexcli:/> ll /clusters/*/system-volumes/

/clusters/cluster-1/system-volumes:
Name                             Volume Type     Operational  Health  Active  Ready  Geometry  Component  Block     Block  Capacity  Slots
-------------------------------  --------------  Status       State   ------  -----  --------  Count      Count     Size   --------  -----
-------------------------------  --------------  -----------  ------  ------  -----  --------  ---------  --------  -----  --------  -----
LOGVOL_01_vol                    logging-volume  ok           ok      -       -      raid-1    2          5242880   4K     20G       -
META_01                          meta-volume     ok           ok      true    true   raid-1    2          20446976  4K     78G       64000
META_01_backup_2018Jul28_203712  meta-volume     ok           ok      false   true   raid-1    1          20446976  4K     78G       64000
META_01_backup_2018Jul30_203854  meta-volume     ok           ok      false   true   raid-1    1          20446976  4K     78G       64000    <--- Note that the Jul29 backup was skipped.


/clusters/cluster-2/system-volumes:
Name                             Volume Type     Operational  Health  Active  Ready  Geometry  Component  Block     Block  Capacity  Slots
-------------------------------  --------------  Status       State   ------  -----  --------  Count      Count     Size   --------  -----
-------------------------------  --------------  -----------  ------  ------  -----  --------  ---------  --------  -----  --------  -----
LOGVOL_01_vol                    logging-volume  ok           ok      -       -      raid-1    2          5242880   4K     20G       -
META_01                          meta-volume     ok           ok      true    true   raid-1    2          20446976  4K     78G       64000
META_01_backup_2018Jul29_010011  meta-volume     ok           ok      false   true   raid-1    1          20446976  4K     78G       64000    <--- Note a meta-data backup volume is missing.
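
Skipped backup dates such as the Jul29 gap noted above can be spotted by parsing the date embedded in the backup volume names. This is a minimal sketch, assuming the naming pattern `<name>_backup_<YYYYMonDD>_<HHMMSS>` seen in the listings and assuming one backup is expected per calendar day (the actual schedule may differ); the function names are illustrative.

```python
import re
from datetime import date, timedelta

# Matches the suffix of names such as "META_01_backup_2018Jul28_203712"
BACKUP_RE = re.compile(r"_backup_(\d{4})([A-Z][a-z]{2})(\d{2})_\d{6}$")
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}

def backup_dates(names):
    """Return the sorted backup dates found in a list of volume names."""
    dates = []
    for name in names:
        match = BACKUP_RE.search(name)
        if match and match.group(2) in MONTHS:
            year, month, day = int(match.group(1)), MONTHS[match.group(2)], int(match.group(3))
            dates.append(date(year, month, day))
    return sorted(dates)

def missing_days(names):
    """Return days between the first and last backup that have no backup."""
    dates = backup_dates(names)
    if not dates:
        return []
    present = set(dates)
    missing = []
    day = dates[0]
    while day <= dates[-1]:
        if day not in present:
            missing.append(day)
        day += timedelta(days=1)
    return missing

cluster_1 = [
    "META_01",
    "META_01_backup_2018Jul28_203712",
    "META_01_backup_2018Jul30_203854",
]

for day in missing_days(cluster_1):
    print(day.isoformat())
```

With the cluster-1 names from the listing above, the script reports 2018-07-29. A cluster with a single backup volume, as on cluster-2, produces no gaps under this check, so the number of backup volumes should be reviewed separately.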

Affected Products

VPLEX, VPLEX GeoSynchrony

Products

VPLEX VS2, VPLEX VS6

Article Properties

Article Number: 000167982
Article Type: Solution
Last Modified: 12 Dec 2025
Version:  4