NetWorker: orphan save sets (SSID) on CloudBoost MagFS after Magfs SDK returned: CONNECTION_DISCONNECTED

Summary: Orphan files can remain in the CloudBoost file system, leading to overuse of cloud storage space.

Symptoms

NetWorker backups are configured to write to a CloudBoost device. The NetWorker storage node performing the write operation encounters a CONNECTION_DISCONNECTED error.
The error appears in the save process on the storage node, and the storage node's nsrmmd daemon reports the following:

MM/DD/YYYY HH:mm:SS nsrmmd SYSTEM critical Unable to write to a file: CONNECTION_DISCONNECTED
MM/DD/YYYY HH:mm:SS nsrmmd SYSTEM error Cannot write to base/NW_DEVICE_NAME/##/##/LONG_SSID - errno No error.
MM/DD/YYYY HH:mm:SS nsrmmd SYSTEM critical Unable to write buffer to disk for ssid=SSID: Failed opening file / directory base/NW_DEVICE_NAME/##/##/LONG_SSID: Magfs SDK returned: CONNECTION_DISCONNECTED.
MM/DD/YYYY HH:mm:SS nsrmmd NSR warning Unable to close or sync file for ssid=SSID: Failed to get fd for ssid=SSID to sync/close file
MM/DD/YYYY HH:mm:SS nsrmmd SYSTEM critical Unable to remove SSID file 'base/NW_DEVICE_NAME/##/##/LONG_SSID' on device 'rd=storagenode:base/NW_DEVICE_NAME': Unable to stat save set file 'base/NW_DEVICE_NAME/##/##/LONG_SSID': Unable to retrieve the file statistics: CONNECTION_DISCONNECTED
MM/DD/YYYY HH:mm:SS nsrmmd NSR error MM/DD/YY HH:mm:SS nsrmmd #5: save set \\NAS_FILER\CIFS_SHARE$\~snapshot\backup.0 for client nw_client_NAS was aborted and removed from volume VOLUME_NAME
MM/DD/YYYY HH:mm:SS nsrsnmd SYSTEM notice nw_cbcl_disconnect: Mount handle is NULL.

Similar errors are found in the daemon.raw files of the NetWorker server and the storage node.

All backups writing to the same CloudBoost appliance are aborted simultaneously and then remain as zombie processes until they reach the timeout.
The evidence is shown below: the CloudBoost MagFS listing shows multiple save set IDs (SSIDs) with the exact same last-modified time:

/mnt/magfs/base/NW_CB_DEVICE//active$ ls -la
total 3586415541
drwx--x--- 1 root root            0 Jul 11 15:54 .
d--------- 1 root root            0 Jul  6 15:56 ..
-rwx------ 1 root root 204710061008 Jul  7 23:04 13481ae1-00000006-b420c59c-5d20c59c-014d1600-644823be
-rwx------ 1 root root 428284665149 Jul  7 23:04 14d46d72-00000006-0220c577-5d20c577-00ff1600-644823be
-rwx------ 1 root root 247950379831 Jul  7 23:04 173699d8-00000006-ff20c578-5d20c578-01021600-644823be
-rwx------ 1 root root 839859093330 Jul  7 23:04 298cc682-00000006-7920c5e9-5d20c5e9-01881600-644823be
-rwx------ 1 root root  91465187328 Jul  7 23:04 2bc5effa-00000006-ce219f97-5d219f97-03331600-644823be
-rwx------ 1 root root 132714594304 Jul  7 23:04 58a680e5-00000006-0320c577-5d20c577-00fe1600-644823be
-rwx------ 1 root root 140331190669 Jul  7 23:04 5b44744f-00000006-fe20c578-5d20c578-01031600-644823be
-rwx------ 1 root root            0 Jul  9 15:21 60207da3-00000006-5b24b0f6-5d24b0f6-07a61600-644823be
-rwx------ 1 root root            0 Jul 10 05:21 7b8b6b2c-00000006-8d2575c6-5d2575c6-09741600-644823be
-rwx------ 1 root root 437006037198 Jul  7 23:04 89b8a19e-00000006-0020c578-5d20c578-01011600-644823be
-rwx------ 1 root root            0 Jul 10 01:58 8d658a35-00000006-96254645-5d254645-096b1600-644823be
-rwx------ 1 root root  41388343296 Jul  7 23:04 9514b7dc-00000006-cd219f97-5d219f97-03341600-644823be
-rwx------ 1 root root 167433480664 Jul  7 23:04 9aaa5eda-00000006-d4217c62-5d217c62-032d1600-644823be
-rwx------ 1 root root 114686069939 Jul  7 23:04 c212db4c-00000006-0520c4af-5d20c4af-00fc1600-644823be
-rwx------ 1 root root 826660414175 Jul  9 13:30 d37d7328-00000006-0420c577-5d20c577-00fd1600-644823be
-rwx------ 1 root root            0 Jul  9 15:43 f46f6d04-00000006-2624b62e-5d24b62e-07db1600-644823be
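
The pattern in the listing above (many files sharing one timestamp) can be spotted automatically. The following is a minimal sketch that parses `ls -la` output in the format shown and groups regular files by their timestamp. The function names `group_by_mtime` and `simultaneous` are hypothetical helpers introduced for illustration.

```python
from collections import defaultdict


def group_by_mtime(ls_output):
    """Map 'Mon D HH:MM' timestamps to a list of (file name, size) tuples."""
    groups = defaultdict(list)
    for line in ls_output.splitlines():
        fields = line.split(None, 8)
        # Keep regular files only: 9 fields and permissions starting with "-".
        # This skips the "total" line and the "." and ".." directory entries.
        if len(fields) != 9 or not fields[0].startswith("-"):
            continue
        stamp = " ".join(fields[5:8])
        groups[stamp].append((fields[8], int(fields[4])))
    return dict(groups)


def simultaneous(groups, min_files=2):
    """Keep only timestamps shared by at least min_files files."""
    return {s: files for s, files in groups.items() if len(files) >= min_files}
```

A timestamp shared by many large files, as with `Jul  7 23:04` above, suggests that the writes stopped together. The sketch only reports the grouping and does not decide whether a file is an orphan.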

On the NetWorker server, the SSID has been deleted from the media database, as shown below:
[root@nsr]# mminfo -avot -q ssid=9f3884ab-00000006-f424e0b2-5d24e0b2-080d1600-644823be
6095:mminfo: no matches found for the query

However, the files corresponding to the failed backup are still present in the MagFS folder on the CloudBoost appliance:

maginatics@CB_appliance:/mnt/magfs$ ls -lia base/NW_CB_DEVICE//64/47/9f3884ab-00000006-f424e0b2-5d24e0b2-080d1600-644823be
71065 -rwx------ 1 root root 378010589327 Jul 11 11:03 base/NW_CB_DEVICE//64/47/9f3884ab-00000006-f424e0b2-5d24e0b2-080d1600-644823be
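
The two checks above (file present on MagFS, SSID absent from the media database) can be combined to list orphan candidates. The following is a minimal sketch, assuming the long SSIDs have already been collected from the CloudBoost appliance and that the script runs on a host where `mminfo` is on the PATH. The helper names `in_media_db` and `find_orphans` are hypothetical, and the "no matches found" check relies only on the message shown in the sample output above.

```python
import subprocess


def in_media_db(ssid):
    """Return False only when mminfo reports no match for this SSID.

    Assumption: mminfo prints 'no matches found' for an unknown SSID,
    as in the sample output above. Any other result, including an
    unrelated mminfo failure, is treated as "present" so that no file
    is flagged as an orphan by mistake.
    """
    proc = subprocess.run(
        ["mminfo", "-avot", "-q", f"ssid={ssid}"],
        capture_output=True,
        text=True,
    )
    return "no matches found" not in (proc.stdout + proc.stderr)


def find_orphans(ssids_on_disk, lookup=in_media_db):
    """Return the SSIDs that exist on MagFS but not in the media database."""
    return [ssid for ssid in ssids_on_disk if not lookup(ssid)]
```

The `lookup` parameter allows the media database check to be replaced, for example with a lookup against a previously exported list of SSIDs. The result is a list of candidates to review and is not a deletion list.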

Cause

When the CONNECTION_DISCONNECTED error occurs, the affected save set (SSID) files are not deleted from the CloudBoost appliance, and the corresponding chunks or objects at the cloud provider are not deleted either.

Resolution

These orphan files are removed during the next NetWorker server (nsrd) startup. To restart the NetWorker server process, run the command for your platform:

  • Linux: systemctl restart networker
  • Windows (PowerShell): net stop nsrd ; net start nsrd

If the CONNECTION_DISCONNECTED error is observed frequently, refer to the following KB: NetWorker: backups to CloudBoost are aborted by CONNECTION_DISCONNECTED error

Affected Products

CloudBoost, NetWorker

Products

NetWorker Series

Article Properties
Article Number: 000043711
Article Type: Solution
Last Modified: 08 Jan 2026
Version: 4