NetWorker NMC upgrade to 19.10.0.5 on Linux fails - migration to new Postgres version 12
Summary: When the NetWorker NMC package is upgraded, part of the upgrade procedure is the move to a new Postgres version if needed. The upgrade fails with:
- A space issue: could not write file. No space left on the device
- A cleanup issue: rm: cannot remove '/nsr/nmc/nmcdb': Device or resource busy ...
Symptoms
Postgres does not start after the NMC package is installed. Errors are reported in /opt/lgtonmc/logs/install.log.
The install log shows one of the following:
error while copying relation "public.gst_action_saveset": could not write file "/nsr/nmc/nmcdb12/pgdata/base/16401/17097.2": No space left on device
Failure, exiting
Or:
./delete_old_cluster.sh
10/10/24 08:30:06.697501 gstdbinit-D0 pg_upgrade Succeeded
Upgrade to Postgres 12 Successful
rm: cannot remove '/nsr/nmc/nmcdb': Device or resource busy
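As a quick triage aid, the two signatures quoted above can be searched for in the install log. The following is a minimal sketch, not part of the NetWorker tooling: the function name classify_install_log and its output labels are hypothetical, and only the two error strings shown in this article are matched.

```shell
# Hypothetical helper: report which of the two failure signatures
# from this article appears in an NMC install log.
# Usage: classify_install_log /opt/lgtonmc/logs/install.log
classify_install_log() {
    nmc_log=$1
    if grep -q 'No space left on device' "$nmc_log"; then
        # pg_upgrade could not write into the new database location
        echo "space"
    elif grep -q 'rm: cannot remove .*: Device or resource busy' "$nmc_log"; then
        # the migration succeeded but the old directory could not be removed
        echo "cleanup"
    else
        echo "none"
    fi
}
```

If both signatures are present, the sketch reports only the space issue, since that failure occurs earlier in the upgrade.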
Cause
Moving to a newer Postgres version is part of the NMC upgrade process.
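pg_upgrade runs and copies the database files into a new location (/nsr/nmc/nmcdb12).
The file system that holds /nsr/nmc must therefore have enough free space for a second copy of the database, roughly the current size of /nsr/nmc/nmcdb.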
When the Postgres update is successful, the old path /nsr/nmc/nmcdb is deleted by the ./delete_old_cluster.sh script.
The rm command can fail in several scenarios:
- The path is in use; something is actively using /nsr/nmc/nmcdb during the upgrade process.
- The default permission settings of /nsr/nmc/nmcdb have been modified.
- The nmcdb resides on a remote mount point, instead of a local file system device. NetWorker has specific limitations regarding /nsr mount points on NFS storage. These limitations are detailed in the NetWorker Installation Guide, available through: https://www.dell.com/support/home/product-support/product/networker/docs.
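The permission and mount-point scenarios can be checked before the upgrade. The sketch below assumes GNU stat and GNU df (as found on the Linux platforms named in this article); the function names and the list of file system types treated as remote are illustrative assumptions, not a NetWorker definition.

```shell
# Hypothetical helpers for checking /nsr/nmc/nmcdb before an upgrade.

# is_remote_fstype TYPE: succeed if TYPE names a network file system.
# The list of types is an assumption and may need extending.
is_remote_fstype() {
    case "$1" in
        nfs|nfs4|cifs|smb3) return 0 ;;
        *) return 1 ;;
    esac
}

# fstype_of PATH: file system type of the mount that holds PATH (GNU df).
fstype_of() {
    df -PT "$1" | awk 'NR==2 {print $2}'
}

# owner_mode_of PATH: print "owner:group mode" for PATH (GNU stat).
owner_mode_of() {
    stat -c '%U:%G %a' "$1"
}

# Run the checks only where the NMC database directory exists.
if [ -d /nsr/nmc/nmcdb ]; then
    echo "owner and mode: $(owner_mode_of /nsr/nmc/nmcdb)"
    if is_remote_fstype "$(fstype_of /nsr/nmc/nmcdb)"; then
        echo "nmcdb is on a remote file system"
    fi
fi
```

With the default permissions shown in the example listing in this article, owner_mode_of is expected to print nsrnmc:nsrnmc 700 for /nsr/nmc/nmcdb. For the in-use scenario, a command such as lsof +D /nsr/nmc/nmcdb or fuser -v /nsr/nmc/nmcdb can list processes that hold the path open, provided those tools are installed.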
Resolution
Before upgrading to a new NetWorker NMC version, run the following checks:
- Check the space used by nmcdb; run:
du -sk /nsr/nmc/nmcdb
- Check the free space on the file system that holds /nsr/nmc/; run:
df -h /nsr/nmc
After a successful migration, the upgrade process deletes the old database with the command "rm -rf /nsr/nmc/nmcdb".
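The two space checks can be combined into one comparison. This is a minimal sketch under the assumption that free space larger than the current database size is enough for the migration; the article does not state an exact margin, so treat the threshold as a starting point. The function names are hypothetical.

```shell
# Hypothetical pre-upgrade space check for the NMC database migration.

# db_size_kb DIR: size of DIR in KB, as reported by du -sk.
db_size_kb() {
    du -sk "$1" | awk '{print $1}'
}

# avail_kb DIR: free KB on the file system that holds DIR.
avail_kb() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# enough_space USED_KB AVAIL_KB: succeed when the free space exceeds
# the database size (assumed threshold, no safety margin added).
enough_space() {
    [ "$2" -gt "$1" ]
}

# Run the comparison only where the NMC database directory exists.
if [ -d /nsr/nmc/nmcdb ]; then
    if enough_space "$(db_size_kb /nsr/nmc/nmcdb)" "$(avail_kb /nsr/nmc)"; then
        echo "free space in /nsr/nmc exceeds the size of nmcdb"
    else
        echo "not enough free space in /nsr/nmc for a copy of nmcdb"
    fi
fi
```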
Additional Information
NetWorker 19.10.0.x supports NFS versions 4.0, 4.1 and 4.2 for NFS share configuration only. NetWorker 19.10.0.x NFS share installation is qualified on Linux platforms (SLES 15 SP1, RHEL 9.0, Ubuntu 18.04 LTS, Oracle Linux (UEK kernel) 8.8).
Example of default permissions of nmcdb file system:
[root@nsr ~]# ls -ltr /nsr/nmc/
total 0
drwx------. 3 nsrnmc nsrnmc 40 Oct 16 12:36 nmcdb
drwx------. 2 root root 6 Oct 28 14:00 nmcdb_stage
[root@nsr ~]# ls -ltr /nsr/nmc/nmcdb
total 8
-rw-------. 1 nsrnmc nsrnmc 65 Jun 7 14:30 gstd_db.conf
drwx------. 20 nsrnmc nsrnmc 4096 Oct 28 11:35 pgdata
[root@nsr ~]# ls -ltr /nsr/nmc/nmcdb/pgdata/
total 72
-rw-------. 1 nsrnmc nsrnmc 3 Oct 16 12:36 PG_VERSION
drwx------. 2 nsrnmc nsrnmc 6 Oct 16 12:36 pg_twophase
drwx------. 2 nsrnmc nsrnmc 6 Oct 16 12:36 pg_tblspc
drwx------. 2 nsrnmc nsrnmc 6 Oct 16 12:36 pg_snapshots
drwx------. 2 nsrnmc nsrnmc 6 Oct 16 12:36 pg_serial
drwx------. 2 nsrnmc nsrnmc 6 Oct 16 12:36 pg_replslot
drwx------. 2 nsrnmc nsrnmc 6 Oct 16 12:36 pg_dynshmem
drwx------. 2 nsrnmc nsrnmc 6 Oct 16 12:36 pg_commit_ts
-rw-------. 1 nsrnmc nsrnmc 88 Oct 16 12:36 postgresql.auto.conf
-rw-------. 1 nsrnmc nsrnmc 1636 Oct 16 12:36 pg_ident.conf
drwx------. 2 nsrnmc nsrnmc 18 Oct 16 12:36 pg_subtrans
-rw-r--r--. 1 nsrnmc nsrnmc 26821 Oct 16 12:36 postgresql.conf
-rw-------. 1 nsrnmc nsrnmc 1679 Oct 16 12:36 server.key
-rw-------. 1 nsrnmc nsrnmc 981 Oct 16 12:36 server.crt
drwx------. 2 nsrnmc nsrnmc 18 Oct 16 12:36 pg_xact
drwx------. 4 nsrnmc nsrnmc 36 Oct 16 12:36 pg_multixact
drwx------. 6 nsrnmc nsrnmc 58 Oct 16 12:36 base
-rw-r--r--. 1 nsrnmc nsrnmc 4245 Oct 16 12:36 pg_hba.conf
drwx------. 2 nsrnmc nsrnmc 188 Oct 22 00:00 pg_log
drwx------. 2 nsrnmc nsrnmc 18 Oct 28 11:35 pg_notify
-rw-------. 1 nsrnmc nsrnmc 70 Oct 28 11:35 postmaster.opts
-rw-------. 1 nsrnmc nsrnmc 33 Oct 28 11:35 current_logfiles
-rw-------. 1 nsrnmc nsrnmc 82 Oct 28 11:35 postmaster.pid
drwx------. 2 nsrnmc nsrnmc 6 Oct 28 11:35 pg_stat
drwx------. 2 nsrnmc nsrnmc 4096 Oct 28 11:35 global
drwx------. 4 nsrnmc nsrnmc 68 Oct 28 14:05 pg_logical
drwx------. 3 nsrnmc nsrnmc 92 Oct 28 14:05 pg_wal
drwx------. 2 nsrnmc nsrnmc 126 Oct 28 14:44 pg_stat_tmp
Additional KBs:
NetWorker: NMC Service and Accessibility Issues (General Troubleshooting Guide)
NetWorker: How to Recover the NMC database?