PowerFlex Upgrade from 3.x to 4.x Fails Because the Database Backup Fails Due to an Issue with Elasticsearch
Summary: This article explains how to solve the migration failure from PFxM 3.x to PFMP 4.x due to a 3.x PFxM backup failure caused by ElasticSearch data.
Symptoms
The PowerFlex upgrade from 3.x to 4.x fails. During the import procedure, a database backup of the old PowerFlex 3.8.3 is triggered, and that backup fails.
2025-05-21 15:10:06,299 INFO [backup-1,tid=252] (BackupApplianceCallable.java:216): Executing the script: /opt/Dell/scripts/backup-elasticsearch.py
2025-05-21 15:10:16,779 DEBUG [backup-1,tid=252] (ExecuteSystemCommands.java:103): Last login: Wed May 21 15:10:01 CEST 2025 on cron
2025-05-21 15:10:16,780 ERROR [backup-1,tid=252] (ExecuteSystemCommands.java:127): Non-zero return code running OS command /usr/bin/sudo /opt/Dell/scripts/backup-elasticsearch.py: 3
2025-05-21 15:10:16,781 ERROR [backup-1,tid=252] (ExecuteSystemCommands.java:128): Console output running OS command: Last login: Wed May 21 15:10:01 CEST 2025 on cron
2025-05-21 15:10:16,781 ERROR [backup-1,tid=252] (BackupApplianceCallable.java:225): Error executing the script /opt/Dell/scripts/backup-elasticsearch.py: rc=3
2025-05-21 15:10:16,781 ERROR [backup-1,tid=252] (BackupApplianceCallable.java:246): Unable to backup the database. com.dell.asm.i18n2.exception.AsmCheckedException: The appliance cannot be backed because of an unknown exception.
2025-05-21 15:10:16,785 INFO [backup-1,tid=252] (BackupApplianceCallable.java:73): Executing the script /opt/Dell/scripts/backup-clean.sh
/opt/Dell/ASM/logs/asmManger.log
Line 16272: 2025-05-21 11:32:50,308 DEBUG [backup-1,tid=245] (ExecuteSystemCommands.java:103): rm: cannot remove ‘/var/es-backup/indices’: Directory not empty
Line 16276: 2025-05-21 11:32:50,315 ERROR [backup-1,tid=245] (ExecuteSystemCommands.java:128): Console output running OS command: rm: cannot remove ‘/var/es-backup/indices’: Directory not emptyCleanup of /var/es-backup failed with RC=1Last login: Wed May 21 11:32:35 CEST 2025
Line 18203: 2025-05-21 12:02:54,161 DEBUG [backup-1,tid=245] (ExecuteSystemCommands.java:103): rm: cannot remove ‘/var/es-backup/indices’: Directory not empty
Line 18207: 2025-05-21 12:02:54,165 ERROR [backup-1,tid=245] (ExecuteSystemCommands.java:128): Console output running OS command: rm: cannot remove ‘/var/es-backup/indices’: Directory not emptyCleanup of /var/es-backup failed with RC=1Last login: Wed May 21 12:02:37 CEST 2025
Cause
The size of the Elasticsearch data causes the PFxM backup to fail.
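To check whether a given failure matches this cause, the log can be searched for the two signatures quoted in the Symptoms section. The following is a minimal sketch, not an official tool: the function name find_es_backup_errors is hypothetical, and the log path is passed in as an argument.

```shell
# Hypothetical helper (name is illustrative, not part of PFxM).
# Prints log lines that contain either failure signature from the Symptoms
# section: a return code reported for backup-elasticsearch.py, or a failed
# cleanup of /var/es-backup.
find_es_backup_errors() {
    grep -E 'backup-elasticsearch\.py: rc=[0-9]+|Cleanup of /var/es-backup failed with RC=[0-9]+' "$1"
}

# Example call, using the log path as written in this article:
#   find_es_backup_errors /opt/Dell/ASM/logs/asmManger.log
```

If the function prints nothing, the log does not contain these signatures and the failure may have a different cause.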
Resolution
Commands referenced in this KB are run on the 3.x PowerFlex Manager CLI using the delladmin account.
A rollback option is not strictly needed, because the 3.x PFxM is decommissioned after the migration is complete. If you want one, take a snapshot of the PFxM 3.x VM before applying the steps in this KB:
- Uncheck snapshot virtual machine memory
- Check Quiesce guest filesystem.
1.) Run the following command to list all indices in the Elasticsearch database (the maximum size should be 5 GB):
curl localhost:9200/_cat/indices?v
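Adding up the store.size column by hand is tedious when there are many indices. The sketch below sums the sizes instead. It assumes that the 5 GB figure refers to the combined size of all indices, which this article does not state explicitly, and the function name total_index_size_gb is hypothetical. It relies on the h and bytes parameters of the _cat/indices API to get index names and sizes in plain bytes.

```shell
# Hypothetical helper (name is illustrative).
# Reads "index size_in_bytes" pairs on stdin and prints the combined size
# in GB (1 GB = 1024^3 bytes) with two decimals.
total_index_size_gb() {
    awk '{ sum += $2 } END { printf "%.2f\n", sum / (1024 * 1024 * 1024) }'
}

# Example call on the PFxM appliance:
#   curl -s 'localhost:9200/_cat/indices?h=index,store.size&bytes=b' | total_index_size_gb
```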
If there are many snmp-traps indices, you can delete the older ones if you no longer need them.
Example (this deletes the index for a specific date):
[delladmin@pfxm ~]$ curl localhost:9200/_cat/indices?v | grep snmp-traps-2024.01.11
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 203k 100 203k 0 0 232k 0 --:--:-- --:--:-- --:--:-- 232k
green open snmp-traps-2024.01.11 cXY2UDexQnapOdS-Jgj4ig 1 0 3068 0 1mb 1mb
[delladmin@pfxm ~]$ curl -X DELETE localhost:9200/snmp-traps-2024.01.11
{"acknowledged":true}
[delladmin@pfxm ~]$
[delladmin@pfxm ~]$ curl localhost:9200/_cat/indices?v | grep snmp-traps-2024.01.11
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 202k 100 202k 0 0 305k 0 --:--:-- --:--:-- --:--:-- 306k
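Deleting one date at a time, as in the example above, does not scale to many old indices. The sketch below selects every snmp-traps index older than a cutoff date so the list can be reviewed before anything is deleted. The function name snmp_traps_older_than and the cutoff value are hypothetical; the only assumption about the data is the snmp-traps-YYYY.MM.DD naming seen in the example.

```shell
# Hypothetical helper (name is illustrative).
# Reads index names on stdin and prints the snmp-traps-YYYY.MM.DD indices
# whose date is earlier than the cutoff given as $1 (also YYYY.MM.DD).
# Dates in this format sort correctly as plain strings.
snmp_traps_older_than() {
    awk -v cutoff="$1" '
        $1 ~ /^snmp-traps-[0-9][0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9]$/ {
            date = substr($1, length("snmp-traps-") + 1)
            # Appending "" forces a string comparison.
            if ((date "") < (cutoff "")) print $1
        }'
}

# Example: review the list first, then delete each index it names.
#   curl -s 'localhost:9200/_cat/indices?h=index' | snmp_traps_older_than 2024.06.01
#   curl -s 'localhost:9200/_cat/indices?h=index' | snmp_traps_older_than 2024.06.01 |
#       while read -r idx; do curl -X DELETE "localhost:9200/$idx"; done
```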
Alternatively, delete all snmp-traps, syslog, and scaleio indices with the commands in step 2 to shrink the size of the PFxM backup.
2.) Run the following commands to delete all syslog, scaleio, and snmp-traps indices:
curl -X DELETE http://localhost:9200/syslog*
curl -X DELETE http://localhost:9200/scaleio*
curl -X DELETE http://localhost:9200/snmp-traps*
The following command is also an option, but it deletes ALL data in Elasticsearch.
Warning: This command removes all historical alerts, performance metrics from resources (including switches, if applicable), and syslog data.
curl -X DELETE http://localhost:9200/_all
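After either variant of step 2, it can be useful to confirm that no syslog, scaleio, or snmp-traps indices remain. The following is a small sketch for that check; the function name count_remaining_indices is hypothetical.

```shell
# Hypothetical helper (name is illustrative).
# Reads index names on stdin and prints how many of them still start with
# syslog, scaleio, or snmp-traps. grep -c exits non-zero when the count is 0,
# so "|| true" keeps the function's exit status clean.
count_remaining_indices() {
    grep -c -E '^(syslog|scaleio|snmp-traps)' || true
}

# Example call; a result of 0 means the deletes removed everything targeted:
#   curl -s 'localhost:9200/_cat/indices?h=index' | count_remaining_indices
```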
3.) Run the following commands to remove elasticsearch log files.
sudo systemctl stop elasticsearch
(Note: Do not delete vxfm-es-cluster.log)
cd /var/log/elasticsearch
rm -rf *log.gz*
rm -rf *gc.log*
rm -rf *slowlog.log*
sudo systemctl start elasticsearch
sudo systemctl status elasticsearch
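Because rm -rf with wildcards gives no chance to review what is removed, a dry run beforehand may be worthwhile. The sketch below lists the files that a set of patterns would match and always leaves out vxfm-es-cluster.log, in line with the note in step 3. The function name list_log_matches is hypothetical, and -maxdepth assumes GNU or BSD find.

```shell
# Hypothetical helper (name is illustrative).
# Usage: list_log_matches DIR PATTERN...
# Prints regular files directly inside DIR that match any PATTERN,
# never including vxfm-es-cluster.log. Nothing is deleted.
list_log_matches() {
    local dir="$1"
    shift
    local pattern
    for pattern in "$@"; do
        find "$dir" -maxdepth 1 -type f -name "$pattern" ! -name 'vxfm-es-cluster.log'
    done | sort -u
}

# Example: pass the same patterns that the rm commands in step 3 use.
#   list_log_matches /var/log/elasticsearch '*log.gz*' '*gc.log*'
```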
4.) Restart the rsyslog service and check the status with the commands below.
sudo systemctl restart rsyslog
sudo systemctl status rsyslog
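Elasticsearch can take a short while to answer requests after the service starts, so a follow-up check may help before the backup is attempted again. The following is a generic retry sketch under that assumption; the function name wait_for and the attempt count are hypothetical.

```shell
# Hypothetical helper (name is illustrative).
# Usage: wait_for ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND up to ATTEMPTS times, one second apart, and returns 0 as soon
# as it succeeds, or 1 if every attempt fails.
wait_for() {
    local attempts="$1"
    shift
    local i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done
    return 1
}

# Example: wait up to 60 seconds for Elasticsearch to respond on port 9200.
#   wait_for 60 curl -sf localhost:9200 && echo "Elasticsearch is responding"
```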