Metro node: Metadata backup stops functioning after upgrade to 8.0.x

Summary: After an upgrade to 8.0.x code, the metadata backup can become non-operational. This article describes the issue and provides workaround steps to restore metadata backup functionality.


Symptoms

Dell Impacted Hardware:
Metro node mn-114
Metro node mn-215
Metro node Local/Metro

Dell Impacted Software:
Metro node OS 8.0.0.0.0.267
Metro node OS 8.0.0.1.0.21
Metro node OS 8.0.1.0.0.220

Change Activities Impacted:
Post upgrade to metro node OS 8.0.x

Issue:

  1. The ndu pre-check command reports the following error for each cluster in a metro node configuration:

    Example for Cluster-1:

    VPlexcli:/> ndu pre-check
    Warning:
    During the NDU process, multiple directors will be offline for a portion of the time.
    This is non-disruptive but is dependent on a host-based multipathing solution being
    installed, configured, and operating on all connected hosts.
    ================================================================================
    Performing NDU pre-checks
    ================================================================================
    Verify NDU is not in progress..                                            OK
    Verify that the directors have been running continuously for 15 days..     OK
    Verify director communication status..                                     OK
    . . .
    Verify meta-volume backup configuration..                                  ERROR
    . . .
    ================================================================================
    Errors (x errors found)
    ================================================================================
    cluster-1
         Metadata backups are NOT created according to schedule
         Last backup: Mon Aug 19 00:00:00 UTC 20xx
         Current time: Fri Dec 13 03:41:33 UTC 20xx
         There has been no metadata backup for 116 day(s)
         Run 'metadatabackup local' on cluster-1

    Example for Cluster-2:

    VPlexcli:/> ndu pre-check
    Warning:
    During the NDU process, multiple directors will be offline for a portion of the time.
    This is non-disruptive but is dependent on a host-based multipathing solution being
    installed, configured, and operating on all connected hosts.
    ================================================================================
    Performing NDU pre-checks
    ================================================================================
    Verify NDU is not in progress..                                            OK
    Verify that the directors have been running continuously for 15 days..     OK
    Verify director communication status..                                     OK
    . . .
    Verify meta-volume backup configuration..                                  ERROR
    . . .
    ================================================================================
    Errors (x errors found)
    ================================================================================
    cluster-2
         Metadata backups are NOT created according to schedule
         Last backup: Sat Mar 16 01:30:00 UTC 20xx
         Current time: Fri Dec 13 03:41:33 UTC 20xx
         There has been no metadata backup for 272 day(s)
         Run 'metadatabackup local' on cluster-2
  2. When the ll ~system-volumes command is run, the date embedded in the metadata backup volume name reflects an old backup date.

    In the example below, the metadata backup has stopped working on both clusters in a Metro environment:

    VPlexcli:/> ll ~system-volumes
    /clusters/cluster-1/system-volumes:

    Name                                     Volume Type     Operational  Health  Active  Ready  Geometry  Component  Block     Block  Capacity  Slots
    ---------------------------------------  --------------  Status       State   ------  -----  --------  Count      Count     Size   --------  -----
    ---------------------------------------  --------------  -----------  ------  ------  -----  --------  ---------  --------  -----  --------  -----
    meta_C1_xxxxxx                           meta-volume     ok           ok      true    true   raid-1    2          20971264  4K     80G       64000
    meta_C1_xxxxxxx_backup_20xx-11-21_01-30  meta-volume     ok           ok      false   true   raid-1    1          20971264  4K     80G       64000
                            \------------/ date and time the last backup was run

    /clusters/cluster-2/system-volumes:

    Name                                     Volume Type     Operational  Health  Active  Ready  Geometry  Component  Block     Block  Capacity  Slots
    ---------------------------------------  --------------  Status       State   ------  -----  --------  Count      Count     Size   --------  -----
    ---------------------------------------  --------------  -----------  ------  ------  -----  --------  ---------  --------  -----  --------  -----
    meta_C2_xxxxxx                           meta-volume     ok           ok      true    true   raid-1    2          20971264  4K     80G       64000
    meta_C2_xxxxxxx_backup_20xx-11-20_12-43  meta-volume     ok           ok      false   true   raid-1    1          20971264  4K     80G       64000
                           \------------/ date and time the last backup was run
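The staleness check that the pre-check performs can be sketched from the backup volume naming convention alone: the name ends in `_YYYY-MM-DD_HH-MM`, so the age in days is the difference between that timestamp and the current time. The helper name and the concrete dates below are illustrative, not the actual ndu pre-check code; GNU `date -d` is assumed.

```shell
#!/bin/sh
# Sketch (not the actual ndu pre-check code): derive the age of the
# latest metadata backup from the timestamp embedded in the backup
# meta-volume name, which ends in _YYYY-MM-DD_HH-MM.
backup_age_days() {
    name="$1"       # e.g. meta_C1_xxxxxxx_backup_2024-11-21_01-30
    now_epoch="$2"  # current time as a Unix epoch
    # Pull "YYYY-MM-DD HH:MM" out of the volume name suffix.
    stamp=$(printf '%s\n' "$name" \
        | sed 's/.*_\([0-9-]\{10\}\)_\([0-9]\{2\}\)-\([0-9]\{2\}\)$/\1 \2:\3/')
    backup_epoch=$(date -u -d "$stamp" +%s)   # GNU date assumed
    echo $(( (now_epoch - backup_epoch) / 86400 ))
}

now=$(date -u -d '2024-12-13 03:41' +%s)
backup_age_days meta_C1_xxxxxxx_backup_2024-11-21_01-30 "$now"
```

With the hypothetical dates above, the function reports 22 days; the pre-check errors shown earlier correspond to ages of 116 and 272 days.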

Symptoms:

  • The metadata backup stops working on both clusters in a Metro environment.
  • The metadata backup stops working on one of the clusters in a Metro environment.
  • The metadata backup stops working in a Local cluster.

 

Cause

During the scheduled daily metadata backup, the service "daily_metadata_backup.service" occasionally gets stuck in the activating state on either director-1-1-A, director-2-1-A, or both.
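The stuck condition can be stated mechanically: a healthy backup run finishes in seconds, so a service sitting in systemd's "activating (start)" state for more than a minute matches the failure described above. The sketch below is an illustration under that assumption (the helper name and threshold are not Dell tooling); on a director the inputs could come from `sudo systemctl show daily_metadata_backup.service -p ActiveState -p SubState`.

```shell
#!/bin/sh
# Sketch: classify the service as stuck when it has been in the
# "activating (start)" state for more than a minute.
is_stuck() {
    state="$1"    # systemd ActiveState, e.g. "activating"
    substate="$2" # systemd SubState, e.g. "start"
    elapsed="$3"  # seconds spent in this state
    if [ "$state" = "activating" ] && [ "$substate" = "start" ] \
        && [ "$elapsed" -gt 60 ]; then
        echo stuck
    else
        echo ok
    fi
}

is_stuck activating start 2764800   # roughly "1 month" as in the symptom output
is_stuck inactive dead 120          # healthy state between backup cycles
```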

 

Resolution

Permanent Resolution:

Metro node Engineering is investigating this issue. When a fix is available, this article will be updated.

Workaround:

  1. At the shell prompt on an A-node (for example, director-1-1-A or director-2-1-A), check the status of the daily_metadata_backup.service unit by running sudo systemctl status daily_metadata_backup.service. If the "Active: activating (start)" attribute is present and the service has been running for longer than a minute, the service is stuck on that A-node.

    In the example below, both director-1-1-A and director-2-1-A show the "Active: activating (start)" attribute for daily_metadata_backup.service, and the service has been running for longer than a minute, which means the service is stuck on both nodes.

    director-2-1-A (cluster-2):

    service@director-2-1-a:~> sudo systemctl status daily_metadata_backup.service
    ● daily_metadata_backup.service - metronode automated daily metadata backups
         Loaded: loaded (/etc/systemd/system/daily_metadata_backup.service; static)
         Active: activating (start) since Sat 2024-10-xx 01:30:18 UTC; 1 month 3 days ago   <---------------------------
    TriggeredBy: ● daily_metadata_backup.timer
       Main PID: 22553 (daily_metadata_)
          Tasks: 1
         CGroup: /system.slice/daily_metadata_backup.service
                 └─22553 /usr/bin/python3 /opt/dell/vplex/sbin/daily_metadata_backup.py
    Oct xx 01:30:18 director-2-1-a systemd[1]: Starting metronode automated daily metadata backups...
    .
    .
    .
    <truncated>

    director-1-1-A (cluster-1):

    service@director-1-1-a:~> sudo systemctl status daily_metadata_backup.service
    ● daily_metadata_backup.service - metronode automated daily metadata backups
         Loaded: loaded (/etc/systemd/system/daily_metadata_backup.service; static)
         Active: activating (start) since Sat 2024-10-xx 01:30:18 UTC; 1 month 2 days ago   <---------------------------
    TriggeredBy: ● daily_metadata_backup.timer
       Main PID: 22553 (daily_metadata_)
          Tasks: 1
         CGroup: /system.slice/daily_metadata_backup.service
                 └─22553 /usr/bin/python3 /opt/dell/vplex/sbin/daily_metadata_backup.py
    Oct xx 01:30:18 director-1-1-a systemd[1]: Starting metronode automated daily metadata backups...
    .
    .
    .
    <truncated>
  2. Next, check the status of the daily_metadata_backup.timer unit on each A-node (director-1-1-A, director-2-1-A) by running sudo systemctl status daily_metadata_backup.timer, and confirm whether the "Trigger:" attribute shows "n/a." If it does, the timer is stuck on that A-node.

    In the example below, on both director-1-1-A and director-2-1-A the daily_metadata_backup.timer "Trigger:" attribute shows "n/a," which means the timer is stuck on both nodes.

    Cluster-1:

    service@director-1-1-a:~> sudo systemctl status daily_metadata_backup.timer
    ● daily_metadata_backup.timer - metronode automated daily metadata backups
    Loaded: loaded (/etc/systemd/system/daily_metadata_backup.timer; enabled; vendor preset: disabled)
    Drop-In: /etc/systemd/system/daily_metadata_backup.timer.d
    └─daily_backup.conf
    Active: active (running) since Wed 2024-11-20 12:46:10 UTC; 18h ago
    Trigger: n/a                       <<<<<<<<<<<<
    Triggers: ● daily_metadata_backup.service
    Nov 20 12:46:10 director-1-1-a systemd[1]: Started metronode automated daily metadata backups.
    service@director-1-1-a:~>

    Cluster-2:

    service@director-2-1-a:~> sudo systemctl status daily_metadata_backup.timer
    ● daily_metadata_backup.timer - metronode automated daily metadata backups
    Loaded: loaded (/etc/systemd/system/daily_metadata_backup.timer; enabled; vendor preset: disabled)
    Drop-In: /etc/systemd/system/daily_metadata_backup.timer.d
    └─daily_backup.conf
    Active: active (running) since Wed 2024-11-xx 12:46:10 UTC; 18h ago
    Trigger: n/a                           >>>>>>>>>>>>>>>>>>>>>>>
    Triggers: ● daily_metadata_backup.service
    Nov xx 12:46:10 director-2-1-a systemd[1]: Started metronode automated daily metadata backups.
    service@director-2-1-a:~>
  3. Once you confirm which node (or both) has the two units stuck, stop daily_metadata_backup.service and daily_metadata_backup.timer, and then start daily_metadata_backup.timer. This restores metadata backup functionality.

    NOTE: Do not use the "restart" command option.

    In the example below, because both A-nodes are affected, the services are stopped and started on each as follows:

    sudo systemctl stop daily_metadata_backup.service
    sudo systemctl stop daily_metadata_backup.timer
    sudo systemctl start daily_metadata_backup.timer
  4. Run the status command again to confirm that the service is no longer stuck.

    The examples below show the status command for daily_metadata_backup.service. Between backup cycles the service should report "Active: inactive (dead)," which signifies that the service is not running and is waiting for the next scheduled backup:

    service@director-2-1-a:~> sudo systemctl status daily_metadata_backup.service
    ● daily_metadata_backup.service - metronode automated daily metadata backups
         Loaded:  loaded (/etc/systemd/system/daily_metadata_backup.service; static)
         Active: inactive (dead) since Fri 2024-11-22 21:07:41 UTC; 1min 49s ago          >>>>>>>>>>>>
    TriggeredBy: ● daily_metadata_backup.timer
        Process: 9183 ExecStart=/opt/dell/vplex/sbin/daily_metadata_backup.py (code=exited, status=0/SUCCESS)
       Main PID: 9183 (code=exited, status=0/SUCCESS)
    Nov 22 21:07:36 director-2-1-a systemd[1]: Starting metronode automated daily metadata backups...
    Nov 22 21:07:41 director-2-1-a systemd[1]: daily_metadata_backup.service: Succeeded.
    Nov 22 21:07:41 director-2-1-a systemd[1]: Finished metronode automated daily metadata backups.
    service@director-2-1-a:~>
    service@director-1-1-a:~> sudo systemctl status daily_metadata_backup.service
    ● daily_metadata_backup.service - metronode automated daily metadata backups
         Loaded:  loaded (/etc/systemd/system/daily_metadata_backup.service; static)
         Active: inactive (dead) since Fri 2024-11-22 21:07:41 UTC; 1min 49s ago          >>>>>>>>>>>>
    TriggeredBy: ● daily_metadata_backup.timer
        Process: 9183 ExecStart=/opt/dell/vplex/sbin/daily_metadata_backup.py (code=exited, status=0/SUCCESS)
       Main PID: 9183 (code=exited, status=0/SUCCESS)
    Nov 22 21:07:36 director-1-1-a systemd[1]: Starting metronode automated daily metadata backups...
    Nov 22 21:07:41 director-1-1-a systemd[1]: daily_metadata_backup.service: Succeeded.
    Nov 22 21:07:41 director-1-1-a systemd[1]: Finished metronode automated daily metadata backups.
    service@director-1-1-a:~>

    The example below shows that daily_metadata_backup.timer should be "active (waiting)" and its "Trigger:" attribute should show the next scheduled run, signifying that the timer is now functioning as expected:

    service@director-2-1-a:~> sudo systemctl status daily_metadata_backup.timer
    ● daily_metadata_backup.timer - metronode automated daily metadata backups
         Loaded: loaded (/etc/systemd/system/daily_metadata_backup.timer; enabled; vendor preset: disabled)
        Drop-In: /etc/systemd/system/daily_metadata_backup.timer.d
                 └─daily_backup.conf
         Active: active (waiting) since Fri 2024-11-22 21:09:24 UTC; 14s ago   >>>>>>>>>>>
        Trigger: Sat 2024-11-23 01:30:00 UTC; 4h 20min left   >>>>>>>>>>>
       Triggers: ● daily_metadata_backup.service
    Nov 22 21:09:24 director-2-1-a systemd[1]: Started metronode automated daily metadata backups.
    service@director-2-1-a:~>
    service@director-1-1-a:~> sudo systemctl status daily_metadata_backup.timer
    ● daily_metadata_backup.timer - metronode automated daily metadata backups
         Loaded: loaded (/etc/systemd/system/daily_metadata_backup.timer; enabled; vendor preset: disabled)
        Drop-In: /etc/systemd/system/daily_metadata_backup.timer.d
                 └─daily_backup.conf
         Active: active (waiting) since Fri 2024-11-22 21:09:24 UTC; 14s ago   >>>>>>>>>>>
        Trigger: Sat 2024-11-23 01:30:00 UTC; 4h 20min left   >>>>>>>>>>>
       Triggers: ● daily_metadata_backup.service
    Nov 22 21:09:24 director-1-1-a systemd[1]: Started metronode automated daily metadata backups.
    service@director-1-1-a:~>
  5. Wait for the next scheduled metadata backup, then run the ll ~system-volumes command to confirm that the issue is resolved and new backups are being created.

    Example:

    VPlexcli:/> ll ~system-volumes
    /clusters/cluster-1/system-volumes:

    Name                                     Volume Type     Operational  Health  Active  Ready  Geometry  Component  Block     Block  Capacity  Slots
    ---------------------------------------  --------------  Status       State   ------  -----  --------  Count      Count     Size   --------  -----
    ---------------------------------------  --------------  -----------  ------  ------  -----  --------  ---------  --------  -----  --------  -----
    meta_C1_xxxxxx                           meta-volume     ok           ok      true    true   raid-1    2          20971264  4K     80G       64000
    meta_C1_xxxxxxx_backup_2024-11-23_01-30  meta-volume     ok           ok      false   true   raid-1    1          20971264  4K     80G       64000
    meta_C1_xxxxxxx_backup_2024-11-24_01-30  meta-volume     ok           ok      false   true   raid-1    1          20971264  4K     80G       64000

    /clusters/cluster-2/system-volumes:

    Name                                     Volume Type     Operational  Health  Active  Ready  Geometry  Component  Block     Block  Capacity  Slots
    ---------------------------------------  --------------  Status       State   ------  -----  --------  Count      Count     Size   --------  -----
    ---------------------------------------  --------------  -----------  ------  ------  -----  --------  ---------  --------  -----  --------  -----
    meta_C2_xxxxxx                           meta-volume     ok           ok      true    true   raid-1    2          20971264  4K     80G       64000
    meta_C2_xxxxxxx_backup_2024-11-23_12-43  meta-volume     ok           ok      false   true   raid-1    1          20971264  4K     80G       64000
    meta_C2_xxxxxxx_backup_2024-11-24_12-43  meta-volume     ok           ok      false   true   raid-1    1          20971264  4K     80G       64000
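The stop/stop/start order in step 3 can be sketched as a dry-run script. The systemctl function below is a mock that only echoes the commands so the ordering can be traced on any machine; it is an illustration, not Dell tooling. On a director the real commands would be run with sudo, and per the note in step 3, "restart" must not be used.

```shell
#!/bin/sh
# Dry-run sketch of the recovery order from step 3: stop the service,
# stop the timer, then START the timer (never "restart").
systemctl() { echo "systemctl $*"; }   # mock: echoes instead of acting

recover_metadata_backup() {
    systemctl stop daily_metadata_backup.service
    systemctl stop daily_metadata_backup.timer
    systemctl start daily_metadata_backup.timer
}

recover_metadata_backup
```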

 

Affected Products

metro node

Products

metro node mn-114, metro node mn-215

Article Number: 000264665
Article Type: Solution
Last Modified: 22 Apr 2025
Version: 4