VPLEX: automated backup of the metavolume could not be completed
Summary: This article explains how to re-create VPLEX metadata backups when a call home reports 0x8a4a6006 ("The automated backup of the metavolume could not be completed"), 0x8a4a6003 ("No valid backup meta-volumes exist"), or 0x8a4a6005 ("Metadata Backup does not create new backups every day"). ...
Symptoms
What is Metadata backup?
- Metadata backups are backups of the active metadata and include all the system configuration settings held in it. A metadata backup volume is a VPLEX system volume that is created when the VPLEX cluster is initially configured.
- Metadata backups are point-in-time snapshots of the current active metavolume, taken according to the schedule that was set up when the backups were initially configured. They can be activated only if the current active metavolume, or one leg of it, fails. They are meant to provide extra protection for major configuration changes, refreshes, or migrations.
- When an end user encounters a data unavailable (DU) situation due to backend array issues, the metavolume backups play an essential role in recovering the VPLEX configuration if needed.
- For redundancy, VPLEX has two metadata backup volumes, which should be created on two different arrays, just as each leg of the active metadata resides on a different array. The two backups rotate daily as per the schedule, so one backup should always be dated the day before the other. If the names show that several days have passed, the backup script either is not running or could not complete.
For example, if "Metavolume backup ("A")" is updated today, "metavolume backup ("B")" is updated the next day, and so on. See the output below for more details:
VPlexcli:/clusters/cluster-1/system-volumes> ll
Name                                Volume Type  Operational  Health  Active  Ready  Geometry  Component  Block     Block  Capacity  Slots
----------------------------        -----------  Status       State   ------  -----  --------  Count      Count     Size   --------  -----
----------------------------        -----------  -----------  ------  ------  -----  --------  ---------  --------  -----  --------  -----
meta                                meta-volume  ok           ok      true    true   raid-1    2          20971264  4K     80G       32000
meta_backup_2021Jul09_040009 ( A )  meta-volume  ok           ok      false   true   raid-1    1          20971264  4K     80G       64000
meta_backup_2021Jul10_040007 ( B )  meta-volume  ok           ok      false   true   raid-1    1          20971264  4K     80G       64000
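The rotation check described above can be scripted against captured 'll' output. The following is a minimal Python sketch, not a VPLEX tool. It assumes the backup names follow the pattern seen in the samples (`<prefix>_backup_<YYYYMonDD>_<HHMMSS>`), and the two-day threshold and the function names are illustrative choices.

```python
import re
from datetime import datetime, timedelta

# Naming pattern inferred from the sample output above
# (for example meta_backup_2021Jul09_040009); treat it as an assumption.
BACKUP_NAME = re.compile(r"_backup_(\d{4})([A-Z][a-z]{2})(\d{2})_(\d{2})(\d{2})(\d{2})")
MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}


def backup_timestamps(ll_output):
    """Return the timestamps encoded in backup volume names, oldest first."""
    stamps = []
    for year, month, day, hour, minute, second in BACKUP_NAME.findall(ll_output):
        if month in MONTHS:
            stamps.append(datetime(int(year), MONTHS[month], int(day),
                                   int(hour), int(minute), int(second)))
    return sorted(stamps)


def newest_backup_is_recent(ll_output, now, max_age=timedelta(days=2)):
    """True if at least one backup exists and the newest is within max_age of now."""
    stamps = backup_timestamps(ll_output)
    return bool(stamps) and now - stamps[-1] <= max_age
```

Pass the current time in the same time zone as the backup names. The schedule is configured in UTC, so UTC is the likely choice, but confirm this on your system.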
If a scheduled metadata backup fails, a dial-home is generated.
Samples of the dial-homes that are sent for this issue:
1. <SymptomCode>0x8a4a6006</SymptomCode>
<Category>Status</Category>
<Severity>Error</Severity>
<Status>Failed</Status>
<Component>CLUSTER</Component>
<ComponentID>SMS</ComponentID>
<SubComponent>CLUSTER-1</SubComponent>
<SubComponentID></SubComponentID>
<CallHome>Yes</CallHome>
<FirstTime>2012-07-10T00:00:01.334Z</FirstTime>
<LastTime>2012-07-09T00:00:01.334Z</LastTime>
<Count>1</Count>
<EventData><![CDATA[The automated backup of the meta-volume could not be completed.
[Versions:<code formats listed>] RCA: The automated backup of the meta-volume could not be completed.]]>
</EventData>
<Description>
The automated backup of the meta-volume could not be completed.
2. <SymptomCode>0x8a4a6003</SymptomCode>
<Severity>Error</Severity>
<Status>Failed</Status>
<Component>CLUSTER</Component>
<ComponentID>SMS</ComponentID>
<SubComponent>CLUSTER-1</SubComponent>
<SubComponentID></SubComponentID>
<CallHome>Yes</CallHome>
<FirstTime>2021-09-07T03:00:12.191Z</FirstTime>
<LastTime>2021-09-07T03:00:12.191Z</LastTime>
<Count>1</Count>
<EventData><![CDATA[No valid backup meta-volumes exist. [Versions:[code formats listed] RCA: The automated backup of metadata cannot identify the devices to be used. This is because existing backups cannot be located. The backups are rotated through being destroyed in order to be re-used.]]>
</EventData>
<Description>No valid backup meta-volumes exist.
<Status>Failed</Status>
3. <SymptomCode>0x8a4a6005</SymptomCode>
<Severity>Error</Severity>
<Status>Failed</Status>
<Component>CLUSTER</Component>
<ComponentID>unknown</ComponentID>
<SubComponent>sms</SubComponent>
<SubComponentID></SubComponentID>
<CallHome>Yes</CallHome>
<FirstTime>2017-12-04T00:00:35.420Z</FirstTime>
<LastTime>2018-09-06T23:59:02.813Z</LastTime>
<Count>1</Count>
<EventData><![CDATA[A meta-volume backup could not be destroyed. Reason: The meta-volume backup "<name of the affected metadata backup>" could not be destroyed: A meta-volume backup "<name of the affected metadata backup>" is not healthy enough to be destroyed. [Versions:[code formats listed>] RCA: Backup meta-volume could not be destroyed. Remedy: Confirm that the volumes configured to be used for the backup are in a healthy state. If the volumes are unhealthy, create new automated metavolume backups by: 1. Destroy the existing backups using the 'meta-volume destroy' command. 2. Unclaim those volumes if they are to be re-used with the 'storage-volume unclaim' command. 3. Use the 'configuration metadata-backup' command to reconfigure the backups. If the previous volumes used were not healthy enough to destroy, create the backups with new healthy devices.
]]></EventData>
<Description><![CDATA[A meta-volume backup could not be destroyed.
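When triaging several dial-homes, it can help to pull out the symptom codes and map them to the three metadata backup events shown above. The following Python sketch assumes the dial-home is available as text containing `<SymptomCode>` elements as in the samples. The function name is hypothetical, and the descriptions are copied from the sample event data.

```python
import re

# Descriptions taken from the sample dial-homes in this article.
SYMPTOMS = {
    "0x8a4a6006": "The automated backup of the meta-volume could not be completed.",
    "0x8a4a6003": "No valid backup meta-volumes exist.",
    "0x8a4a6005": "A meta-volume backup could not be destroyed.",
}

SYMPTOM_TAG = re.compile(r"<SymptomCode>\s*(0x[0-9a-fA-F]+)\s*</SymptomCode>")


def metadata_backup_symptoms(dial_home_text):
    """Return (code, description) pairs for the metadata backup codes found."""
    found = []
    for code in SYMPTOM_TAG.findall(dial_home_text):
        code = code.lower()
        if code in SYMPTOMS:
            found.append((code, SYMPTOMS[code]))
    return found
```

Codes that are not in the table are ignored, so the function can be run over a dial-home that mixes these events with unrelated ones.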
Cause
1. For SymptomCode 0x8a4a6006:
The metadata backup may have failed to capture the point-in-time copy because the volume used by the failed metadata backup is unhealthy on the backend array, or because of a possible connectivity issue between VPLEX and the backend array where the backup volume is located.
2. For SymptomCode 0x8a4a6003:
Renaming of the Metadata Backup volume components is not allowed.
/clusters/cluster-1/system-volumes:
Name                                 Volume Type  Operational  Health  Active  Ready  Geometry  Component  Block     Block  Capacity  Slots
-------------------------------      -----------  Status       State   ------  -----  --------  Count      Count     Size   --------  -----
-------------------------------      -----------  -----------  ------  ------  -----  --------  ---------  --------  -----  --------  -----
C1_Meta                              meta-volume  ok           ok      true    true   raid-1    2          20971264  4K     80G       64000
META_VOLUME_backup_2021Jun11_044501  meta-volume  ok           ok      false   true   raid-1    1          20971264  4K     80G       64000
META_VOLUME_backup_2021Jun12_044501  meta-volume  ok           ok      false   true   raid-1    1          20971264  4K     80G       64000
Here is an example of a good backup meta-volume. Notice that the component still uses the VPD83T3: ID at the metadata backup volume component context level:
Name Slot Type Operational Health Capacity
---------------------------------------- Number -------------- Status State --------
---------------------------------------- ------ -------------- ----------- ------ --------
VPD83T3:60000970000xxxxxxxxxxxxxxxxx3030 0 storage-volume ok ok 120G
/--------------------------------------------------------\
Should see this system volume ID
Here is an example of a bad Backup Meta-Volume, where the system volume id at the component context level has been changed from its VPD83T3 system ID to a human-readable name "C1_MetaBackup_1":
VPlexcli:/clusters/cluster-1/system-volumes/META_VOLUME_backup_2018Jun11_044501/components> ll
Name Slot Type Operational Health Capacity
------------------ Number -------------- Status State --------
------------------ -------- -------------- ------------- -------- --------
C1_MetaBackup_1 0 storage-volume ok ok 120G
/--------------------------\
Human readable names are not allowed at the component level,
the backup manager script does not know the metadata backup
volume by this name, only the system ID used when the backup
volume was set up.
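The good and bad examples above differ only in the component name, so the check can be sketched in a few lines of Python. This assumes the component names have already been copied out of the 'll' output of each backup volume's components context. The function name is hypothetical, and the pattern accepts any alphanumeric characters after the prefix because the IDs in this article are masked.

```python
import re

# A component name is expected to be the system ID, which starts with "VPD83T3:".
VPD_ID = re.compile(r"^VPD83T3:[0-9A-Za-z]+$")


def renamed_components(component_names):
    """Return the component names that no longer look like a VPD83T3 system ID."""
    return [name for name in component_names if not VPD_ID.match(name)]
```

An empty result means that every component still uses its system ID. Any name returned, such as "C1_MetaBackup_1" from the bad example, points to a renamed component.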
3. For SymptomCode 0x8a4a6005:
When checking the metadata backups, you may see that the backup dates are not current.
Example of checking the current date on the VPLEX:
VPlexcli:/> date
Fri Sep 7 13:30:43 UTC YYYY <<<< note the current date in this example is Sep 7
Next, check when each backup volume last ran under the system-volumes context by comparing the dates in the backup volume names with the date checked above:
VPlexcli:/> ll /clusters/cluster-1/system-volumes/
/clusters/cluster-1/system-volumes:
Name                             Volume Type     Operational  Health  Active  Ready  Geometry  Block     Block  Capacity  Slots
-------------------------------  --------------  Status       State   ------  -----  --------  Count     Size   --------  -----
-------------------------------  --------------  -----------  ------  ------  -----  --------  --------  -----  --------  -----
c1_meta                          meta-volume     ok           ok      true    true   raid-1    20971264  4K     80G       32000
c1_meta_backup_2018Aug01_030002  meta-volume     ok           ok      false   true   raid-1    20971264  4K     80G       32000
c1_meta_backup_2018Aug02_030003  meta-volume     ok           ok      false   true   raid-1    20971264  4K     80G       32000
At the storage-array context level of the VPLEX, LUNs on the storage array that hosts a backup volume show 'visibility' as 'none' and 'connectivity-status' as 'error'.
/clusters/cluster-1/storage-elements/storage-arrays/EMC-CLARiiON-CKM00000000000/logical-units/VPD83T3:6006016099xxxxxxxxxxxxxxxxx1e111:
Name Value
---------------------- --------------------
active-aao-controller [CKM00000000000.SPB]
active-aao-visibility []
alua-support none
connectivity-status error <<<< communication/connectivity issue between VPLEX and array
luns []
passive-aan-controller [CKM00000000000.SPA]
passive-aan-visibility []
storage-volume -
visibility none <<<<< not seeing the backend volume
This indicates a connectivity issue between the storage array and VPLEX. If the issue occurs while the automated backup script is running, the script cannot see the storage volume on the array, so the backup cannot complete.
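The two attributes called out above can be checked programmatically once the logical-unit listing has been saved as text. The following Python sketch parses the Name/Value layout shown in the sample and flags the combination described in this section. The function names are hypothetical, and only the values 'error' and 'none' come from this article; what a healthy unit reports is not assumed here.

```python
def parse_attributes(listing):
    """Parse a two-column Name/Value listing into a dictionary."""
    attrs = {}
    for line in listing.splitlines():
        # Drop trailing "<<<" annotations such as those in the sample above.
        line = line.split("<<<")[0].strip()
        parts = line.split(None, 1)
        # Skip blank lines, lines without a value, and dashed separator lines.
        if len(parts) != 2 or set(parts[0]) <= {"-"}:
            continue
        attrs[parts[0]] = parts[1].strip()
    return attrs


def has_connectivity_problem(attrs):
    """True if the unit shows the symptoms described in this article."""
    return (attrs.get("connectivity-status") == "error"
            or attrs.get("visibility") == "none")
```

The header row is stored as an ordinary entry (`"Name": "Value"`), which is harmless for this check.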
Resolution
A. For SymptomCodes 0x8a4a6003 and 0x8a4a6006:
NOTE: Renaming the component back to its VPD83T3: ID does not work, because the colon ":" in the ID causes problems.
To resolve the issue, follow the steps in the workaround:
Workaround:
- If the VPLEX is a Metro configuration and you need to delete the metadata backup volumes, ensure that you run the workaround on the cluster that reported the issue (see the sample call home details in the Symptoms section). To see the system-volume details, use the command:
ll /clusters/cluster-<id>/system-volumes
NOTE: You can also type the command as 'll /clusters/*/system-volumes', which lists the system-volume details for all clusters in the configuration. On a VPLEX Local, you only see the information for cluster-1.
Sample output using cluster-1:
Name                             Volume Type  Operational  Health  Active  Ready  Geometry  Component  Block     Block  Capacity  Slots
-------------------------------  -----------  Status       State   ------  -----  --------  Count      Count     Size   --------  -----
-------------------------------  -----------  -----------  ------  ------  -----  --------  ---------  --------  -----  --------  -----
C1_Meta                          meta-volume  ok           ok      true    true   raid-1    2          20971264  4K     80G       64000
Meta_backup_2018Sep07_154626     meta-volume  ok           ok      false   true   raid-1    1          20971264  4K     80G       64000
Meta_backup_2018Sep07_154649     meta-volume  ok           ok      false   true   raid-1    1          20971264  4K     80G       64000
Confirm that volumes are available that meet the requirements for metadata backup:
VPlexcli:/> configuration show-meta-volume-candidates
Name Capacity Vendor IO Status Type Array Name
---------------------------------------- -------- -------- --------- ----------- ---------------------
VPD83T3:60000970000XXXXXXXXXXXXXXXXXXXXX 187G EMC alive traditional EMC-SYMMETRIX-XXXXXXXXX
VPD83T3:60000970000XXXXXXXXXXXXXXXXXXXXX 98.5G EMC alive traditional EMC-SYMMETRIX-XXXXXXXXX
Destroy the old metadata backup volumes with the 'meta-volume destroy' command. For example:
VPlexcli:/clusters/cluster-1/system-volumes> meta-volume destroy Meta_backup_2018Sep07_154649
3. Run the 'schedule list' command from the VPlexcli to display the current "metadata backup local"
schedule and the job number associated with it.
For example:
VPlexcli:/> schedule list
[0] 30 13 * * 3 syrcollect
[2] 30 23 * * * metadata backup local
4. Remove the 'metadata backup local' schedule by running 'schedule remove [job ID]' with the job ID
found in step (3).
For example:
VPlexcli:/> schedule remove 2
Removed scheduled job 2.
5. Unclaim both of the former metadata backup volumes using the 'unclaim' command.
Example with the metadata backup name:
VPlexcli:/clusters/cluster-1/storage-elements/storage-volumes> unclaim Meta_backup_2018Sep07_154649
Example with the VPD number:
VPD83T3:60000970000292XXXXXXXXXXXXXXXXXXXXX
VPlexcli:/clusters/cluster-1/storage-elements/storage-volumes> unclaim VPD83T3:60000970000284XXXXXXXXXXXXXXXXXXXXX
6. Run the 'configuration metadata-backup' command to re-create the metadata backups.
*Note: You may see that a scheduled time is already set. To keep that scheduled time, type "Y".
Otherwise type "N", and later in the script you are prompted for the new time at which the
metadata backups should run.
Example of configuring the metadata backup:
VPlexcli:/clusters/cluster-1/system-volumes> configuration metadata-backup
A back up of the meta-data is already scheduled to occur everyday at
4:45 (UTC).
Do you want change the existing schedule? (Y/N): Y <<< Y to keep current time
Configuring Meta-data Backups
To configure meta-data backups you will need to select two unclaimed
volumes (78G or greater), preferably on two different arrays. Backups
will occur automatically each day, at a time you specify. Please note:
All times are UTC and are not based on the local time.
VPLEX is currently configured to backup metadata on the following
volumes :
VPD83T3:6000097000029XXXXXXXXXXXXXXXXXXXXX,VPD83T3:6006048000029030XXXXXXXXXXXXXXXXXXXXX
Would you like to change the volumes on which to backup the metadata? [no]: Yes
Available Volumes for Meta-data Backup
Name Capacity Vendor IO Status Type Array Name
---------------------------------------- -------- -------- --------- ----------- -----------------------
VPD83T3:6000097000029XXXXXXXXXXXXXXXXXXXXX 120G EMC alive traditional EMC-SYMMETRIX-<serial number>
VPD83T3:6006048000029030XXXXXXXXXXXXXXXXXX 120G EMC alive traditional EMC-SYMMETRIX-<serial number>
Please select volumes for meta-data backup, preferably from two
different arrays (volume1,volume2):VPD83T3:6006048000029030XXXXXXXXXXXXXXXXXXXXX,VPD83T3:6006048000029030XXXXXXXXXXXXXXXXXXXXX
VPLEX is configured to back up meta-data every day at 04:45 (UTC).
Would you like to change the time the meta-data is backed up? [no]: N
You have chosen to configure the backup of the meta-data. Please note:
All times are UTC and are not based on the local time.
Review and Finish
Would you like to run the setup process now? [yes]: yes
Scheduling the backup of metadata...
Performing metadata backup (This will take a few minutes)
Successfully performed the initial backing up of metadata
Successfully scheduled the backing up of metadata
Successfully scheduled the metadata backup
The metadata backup has been successfully scheduled.
7. To confirm that the new metadata backups have been created, and to see what the system has
named them, run "ll /clusters/cluster-<id>/system-volumes" from the VPlexcli prompt.
Sample output:
VPlexcli:/> ll /clusters/cluster-1/system-volumes/
/clusters/cluster-1/system-volumes:
Name                             Volume Type  Operational  Health  Active  Ready  Geometry  Component  Block     Block  Capacity  Slots
-------------------------------  -----------  Status       State   ------  -----  --------  Count      Count     Size   --------  -----
-------------------------------  -----------  -----------  ------  ------  -----  --------  ---------  --------  -----  --------  -----
C1_Meta                          meta-volume  ok           ok      true    true   raid-1    2          20971264  4K     80G       64000
C1_Meta_backup_2018Oct07_123208  meta-volume  ok           ok      false   true   raid-1    1          20971264  4K     80G       64000
C1_Meta_backup_2018Oct07_123208  meta-volume  ok           ok      false   true   raid-1    1          20971264  4K     80G       64000
8. Then run the 'schedule list' command again to confirm that 'metadatabackup local' is listed with
the correct time you set it to run each day.
VPlexcli:/> schedule list
[0] 56 18 * * * syrcollect
[1] 45 4 * * * metadatabackup local
9. Now that you have removed the old metadata backup volumes and re-created new ones, monitor
the backups for a couple of days to ensure that they run as scheduled. The script alternates between
the two backup volumes each time it runs, so you should see one backup dated the day after the
other. The time the backup ran is appended to the backup name. It is normal for this time to vary
slightly from the scheduled time. If this is the first run of the new backups, at least one backup
volume should show a new date.
Example:
VPlexcli:/> ll /clusters/*/system-volumes/
/clusters/cluster-1/system-volumes:
Name                             Volume Type     Operational  Health  Active  Ready  Geometry  Component  Block     Block  Capacity  Slots
-------------------------------  --------------  Status       State   ------  -----  --------  Count      Count     Size   --------  -----
-------------------------------  --------------  -----------  ------  ------  -----  --------  ---------  --------  -----  --------  -----
C1Logging_vol                    logging-volume  ok           ok      -       -      raid-0    1          2621440   4K     10G       -
C1_Meta                          meta-volume     ok           ok      true    true   raid-1    2          20971264  4K     80G       64000
C1_Meta_backup_2018Oct08_044532  meta-volume     ok           ok      false   true   raid-1    1          20971264  4K     80G       64000
C1_Meta_backup_2018Oct07_123208  meta-volume     ok           ok      false   true   raid-1    1          20971264  4K     80G       64000
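Steps 3 and 8 read the job number and run time from the 'schedule list' output by eye. The same lookup can be sketched in Python for saved output. This assumes each line has the form `[job] minute hour * * * command`, which follows the usual cron field order and is consistent with the "45 4" entry shown for a 04:45 backup. The function name is hypothetical, and both spellings of the job name that appear in this article are accepted.

```python
import re

# [job] minute hour day-of-month month day-of-week command
JOB_LINE = re.compile(r"^\[(\d+)\]\s+(\d+)\s+(\d+)\s+\S+\s+\S+\s+\S+\s+(.+)$")


def metadata_backup_jobs(schedule_output):
    """Return (job_id, 'HH:MM') for each metadata backup entry in the output."""
    jobs = []
    for line in schedule_output.splitlines():
        match = JOB_LINE.match(line.strip())
        if not match:
            continue
        job_id, minute, hour, command = match.groups()
        # Accept both "metadatabackup local" and "metadata backup local".
        if command.replace(" ", "").lower().startswith("metadatabackup"):
            jobs.append((int(job_id), f"{int(hour):02d}:{int(minute):02d}"))
    return jobs
```

The returned time is in UTC, like the schedule itself. The job number is the value that 'schedule remove' expects in step 4.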
Additional Information
Sample output showing which volumes were available for creating the metadata backup volumes:
VPlexcli:/clusters/cluster-1/storage-elements/storage-volumes> configuration show-meta-volume-candidates
Name Capacity Vendor IO Status Type Array Name
---------------------------------------- -------- -------- --------- ----------- ---------------------------
VPD83T3:60060160c9c02cXXXXXXXXXXXX 80G DGC alive traditional EMC-CLARiiON-<serial number>
VPD83T3:60060160c9c02c0XXXXXXXXXXX 80G DGC alive traditional EMC-CLARiiON-<serial number>
Sample output when you want to change the time the backups will run:
VPlexcli:/> configuration metadata-backup
A back up of the meta-data is already scheduled to occur everyday at 4:15 (UTC). Do you want change the existing schedule? (Y/N): y
Configuring Meta-data Backups
To configure meta-data backups you will need to select two unclaimed
volumes (78G or greater), preferably on two different arrays. Backups will occur automatically each day, at a time you specify. Please note: All times are UTC and are not based on the local time.
Available Volumes for Meta-data Backup
Name Capacity Vendor IO Status Type Array Name
---------------------------------------- -------- -------- --------- ----------- ---------------------------
VPD83T3:60060160c9c02cXXXXXXXXXXXX 80G DGC alive traditional EMC-CLARiiON-<serial number>
VPD83T3:60060160c9c02c0XXXXXXXXXXXX 80G DGC alive traditional EMC-CLARiiON-<serial number>
Please select volumes for meta-data backup, preferably from two different arrays (volume1,volume2):VPD83T3:60060160c9c02c00XXXXXXXXXXXX,VPD83T3:60060160c9c02c0058XXXXXXXXXXXX
VPLEX is configured to back up meta-data every day at 04:15 (UTC).
Would you like to change the time the meta-data is backed up? [no]: yes << [Here is where you are asked again whether you want to change the time the backups will run]
What hour of the day (UTC) should the meta-data be backed up? (0..23): 23
What minute of the hour should the meta-data be backed up? (0..59): 00
VPLEX is configured to back up meta-data every day at 23:00 (UTC).
Would you like to change the time the meta-data is backed up? [no]: <<
[use the default selection this time to keep the newly set time by pressing the Enter/Return key]
You have chosen to configure the backup of the meta-data. Please note:
All times are UTC and are not based on the local time.
Review and Finish
Review the configuration information below. If the values are correct,
enter yes (or simply accept the default and press Enter) to start the
setup process. If the values are not correct, enter no to go back and
make changes or to exit the setup.
Meta-data Backups
Meta-data will be backed up every day at 23:00.
The following volumes will be used for the backup
:VPD83T3:60060160c9c02XXXXXXXXXXXX,VPD83T3:60060160c9c02c005XXXXXXXXXXXX
Would you like to run the setup process now? [yes]: <<
use the default selection, just press the Enter/Return key
Scheduling the backup of metadata...
Performing metadata backup (This will take a few minutes)
Successfully performed the initial backing up of metadata
Successfully scheduled the backing up of metadata
Successfully scheduled the metadata backup
The metadata backup has been successfully scheduled.
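Because the script takes the backup time in UTC and not in local time, it can be useful to work out what a chosen UTC time means locally before answering the hour and minute prompts. The following Python sketch does this for a fixed UTC offset. The function name and the offset parameter are illustrative, and a fixed offset does not account for daylight saving changes.

```python
from datetime import datetime, timedelta, timezone


def backup_time_at_offset(hour_utc, minute_utc, offset_hours):
    """Return ('HH:MM', day_shift) for a daily UTC time at a fixed UTC offset.

    day_shift is -1, 0, or 1 when the local time falls on the previous,
    same, or next calendar day relative to the UTC date.
    """
    # The reference date is arbitrary; only the time of day matters.
    reference = datetime(2000, 1, 1, hour_utc, minute_utc, tzinfo=timezone.utc)
    local = reference.astimezone(timezone(timedelta(hours=offset_hours)))
    day_shift = (local.date() - reference.date()).days
    return local.strftime("%H:%M"), day_shift
```

For example, the 23:00 UTC schedule configured above corresponds to 18:00 on the same day at UTC-5, and to 01:00 on the following day at UTC+2.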