Avamar: After upgrading to IDPA 2.7.7, backups fail when Cloud Tier is enabled or configured
Summary: Upgrading an IDPA appliance to version 2.7.7 or Griffin (Avamar and Data Domain) causes random backup failures because the Data Domain system rejects connections as a result of connections held by the Object Existence Check (OEC) tool. ...
Symptoms
The log of the failed backup shows the following error messages:
avtar Warning <18125>: Calling DDR_OPEN returned result code:5040 message:calling system(), returns nonzero
avtar Error <10542>: Data Domain server "ddmgmt.lab.com" open failed DDR result code: 5040, desc: calling system(), returns nonzero
avtar Error <10509>: Problem logging into the DDR server:'', only GSAN communication was enabled.
avtar FATAL <17964>: Backup is incomplete because file "/ddr_files.xml" is missing
avtar Info <10642>: DDR errors caused the backup to not be posted, errors=0, fatals=0
avtar Info <12530>: Backup was not committed to the DDR.
avtar FATAL <8941>: Fatal server connection problem, aborting initialization. Verify correct server address and login credentials.
avtar Info <19155>: - Establishing a connection via token to the Data Domain system with certificate authentication (Connection mode: A:2 E:2).
avtar Warning <18133>: Calling DDR_MOPEN returned result code:(5040) calling system(), returns nonzero message:DDRInstance::Connect: Unable to connect to DDR: ddmgmt.lab.com
[41581] [139628191549184] Wed Oct 9 18:17:26 2024
ddp_connect_with_config() failed, Hostname: ddmgmt.lab.com, Err: 5040-RPC procedure=SYSTEM_INFO failed, Can't connect to NFS server retval=4
[41581] [139628191549184] Wed Oct 9 18:17:26 2024
ddp_connect_with_config_internal() failed, Hostname: ddmgmt.lab.com, Err: 5040-RPC procedure=SYSTEM_INFO failed, Can't connect to NFS server retval=4
[41581] [139628237514496] Wed Oct 9 18:16:23 2024
ddp_access() failed, Path avamar-1234567890/STAGING/10f19ca3331644f885c61dae1eb936cb7624eb03/BACKUP-30C108396751178970C7E117A05FE89E5C34A8D3, mode 0 Err: 5004-nfs lookup failed (nfs: No such file or directory)
avtar FATAL <5889>: Fatal signal 11 in pid 41611
[SessionMgr] FATAL ERROR: <0001> uapp::handlefatal: Fatal signal 11
avtar Warning <18133>: Calling DDR_WRITE returned result code:(5040) calling system(), returns nonzero message:DDRIO_Write::WriteToDDR: ddp_write failed
[18529] [139991135528704] Thu Nov 21 09:07:15 2024
ddp_write() failed Offset 0, BytesToWrite 3805, BytesWritten 0 Err: 5040-DDBoost OST_QUERY_SECURE RPC failure 4
[18529] [139991135528704] Thu Nov 21 09:04:40 2024
ddp_stat() failed, Path avamar-1234567890//STAGING/93f26264b84f4e30018f8f9755144866b48fec42/BACKUP-3262F4E4E3FA660B5975057EC08CD98140049755/DBF03EC0AAA6783DADBE469DCDD94913E4EC2BDA, Err: 5004-nfs lookup failed (nfs: No such file or directory)
[18529] [139991153891072] Thu Nov 21 09:04:40 2024
ddp_access() failed, Path avamar-1634225547/STAGING/93f26264b84f4e30018f8f9755144866b48fec42/BACKUP-3262F4E4E3FA660B5975057EC08CD98140049755, mode 0 Err: 5004-nfs lookup failed (nfs: No such file or directory)
avtar Info <10690>: - Processed file on Data Domain: "VMConfiguration/avamar vm configuration.xml" (3,805 bytes)
avtar Error <16709>: DDRInstance::Invoke - ddrmgr write failure result code: 5040
avtar FATAL <0000>: <10565>Failed to write data to stream, stream index: 7, DDR stream handle: 1003, DDR result code: 5040 desc: calling system(), returns nonzero.
avtar FATAL <40009>: DDR encountered errors.
avtar Info <9772>: Starting graceful (staged) termination, DDR_ERROR event received (fatal severity) (wrap-up stage)
avtar Info <0000>: Entering the 'final' phase of termination, DDR_ERROR need to exit)
avtar FATAL <5155>: Backup aborted due to earlier errors. No backup created on the server.
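When a backup log is long, a quick tally of the DDR result codes can confirm that the failures match the pattern above (mostly 5040, with some 5004 lookups). The following is a minimal sketch, not a supported tool: the function name `ddr_error_codes` is hypothetical, and it assumes a `grep` that supports `-o` and `-E` (GNU and BSD grep do).

```shell
#!/bin/sh
# Hypothetical helper: tally DDR result codes found in avtar log text on stdin.
# It recognizes the three spellings seen in the log excerpt above
# ("result code:5040", "result code: 5040", "result code:(5040)")
# as well as the "Err: 5004-..." form used by the ddp_* library lines.
ddr_error_codes() {
    grep -oE '(result code: ?\(?|Err: )[0-9]+' |  # keep only the code fragments
        grep -oE '[0-9]+$' |                      # strip everything but the number
        sort |
        uniq -c |
        awk '{ print $2, $1 }'                    # print "<code> <occurrences>"
}
```

Usage, assuming the log was saved to a file hypothetically named `backup.log`: `ddr_error_codes < backup.log`. Each output line holds a result code followed by the number of times it appears.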
The Data Domain system reports multiple rejected connections:
Recent Alerts and Log Messages
------------------------------
Nov 20 22:03:46 ddmgmt ddfs[22835]: WARNING: MSG-RPC-00002: Rejected 63 connection attempts to ddr in the last 1384 minutes, already has 36 connection to port 264.
Nov 20 22:34:21 ddmgmt ddfs[22835]: WARNING: MSG-RPC-00002: Rejected 53 connection attempts to ddr in the last 30 minutes, already has 44 connection to port 264.
Nov 20 23:08:11 ddmgmt ddfs[22835]: WARNING: MSG-RPC-00002: Rejected 115 connection attempts to ddr in the last 33 minutes, already has 41 connection to port 264.
Nov 20 23:33:50 ddmgmt ddfs[22835]: WARNING: MSG-RPC-00002: Rejected 8 connection attempts to ddr in the last 25 minutes, already has 43 connection to port 264.
Nov 20 23:50:00 ddmgmt ddfs[22835]: WARNING: MSG-RPC-00002: Rejected 8 connection attempts to ddr in the last 16 minutes, already has 42 connection to port 264.
Nov 21 02:04:07 ddmgmt ddfs[22835]: WARNING: MSG-RPC-00002: Rejected 33 connection attempts to ddr in the last 134 minutes, already has 42 connection to port 264.
Nov 21 02:36:45 ddmgmt ddfs[22835]: WARNING: MSG-RPC-00002: Rejected 108 connection attempts to ddr in the last 32 minutes, already has 50 connection to port 264.
Nov 21 03:06:47 ddmgmt ddfs[22835]: WARNING: MSG-RPC-00002: Rejected 47 connection attempts to ddr in the last 30 minutes, already has 53 connection to port 264.
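To judge how severe the rejections are over a longer period, the warnings above can be summarized. The sketch below is an illustration only: the function name `summarize_rejections` is hypothetical, and it assumes the MSG-RPC-00002 wording shown in the alerts above ("Rejected N connection attempts ... already has M connection").

```shell
#!/bin/sh
# Hypothetical helper: summarize MSG-RPC-00002 warnings read from stdin.
# Prints three numbers: <warning count> <total rejected attempts> <peak existing connections>
summarize_rejections() {
    awk '
        /MSG-RPC-00002/ {
            for (i = 1; i <= NF; i++) {
                # The number after "Rejected" is the rejected-attempt count.
                if ($i == "Rejected") rejected += $(i + 1)
                # The number after "has" is the count of connections already open.
                if ($i == "has" && $(i + 1) + 0 > peak) peak = $(i + 1) + 0
            }
            warnings++
        }
        END { printf "%d %d %d\n", warnings, rejected, peak }
    '
}
```

Usage, assuming the alerts were saved to a file hypothetically named `ddfs_alerts.txt`: `summarize_rejections < ddfs_alerts.txt`. A peak that keeps climbing between runs is consistent with connections that are never released.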
Multiple sessions in the CLOSE_WAIT state reference the process ID of ddfs (Data Domain File System):
!!!! ddmgmt YOUR DATA IS IN DANGER !!!! # while true; do echo -n "CLOSE_WAIT connections ===>"; netstat -tanp | grep CLOSE_WAIT | grep ddfs | wc -l; sleep 60; done
CLOSE_WAIT connections ===>265
CLOSE_WAIT connections ===>314
CLOSE_WAIT connections ===>360
CLOSE_WAIT connections ===>411
CLOSE_WAIT connections ===>459
CLOSE_WAIT connections ===>484
CLOSE_WAIT connections ===>503
CLOSE_WAIT connections ===>503
...
!!!! ddmgmt YOUR DATA IS IN DANGER !!!! #
Cause
The online Object Existence Check (OEC) opens connections, but they are not closed and remain open.
The Data Domain engineering team is still investigating this symptom.
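Because the cause is connections that are opened and never closed, it can help to see which remote peers the lingering connections belong to. The following is a minimal sketch under stated assumptions: the function name `close_wait_by_peer` is hypothetical, and it assumes the usual `netstat -tanp` column layout (foreign address in column 5, state in column 6, PID/program name in column 7).

```shell
#!/bin/sh
# Hypothetical helper: group ddfs CLOSE_WAIT sockets by remote peer.
# Reads `netstat -tanp` style text on stdin and prints "<count> <peer address>",
# highest count first.
close_wait_by_peer() {
    awk '
        $6 == "CLOSE_WAIT" && $7 ~ /ddfs/ {
            peer = $5
            sub(/:[0-9]+$/, "", peer)   # drop the trailing :port
            count[peer]++
        }
        END { for (p in count) printf "%d %s\n", count[p], p }
    ' | sort -k1,1nr -k2,2
}
```

Usage: `netstat -tanp | close_wait_by_peer`. Reading from standard input also allows the same function to be run against saved netstat output.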
Resolution
A temporary workaround is to disable CM_OEC_ENABLED. Objects are written to the cloud as part of data movement. The OEC tool periodically lists those objects in the cloud to confirm that everything it expects is still present. This check is an important part of the Data Invulnerability Architecture (DIA). If an object is missing, an alert is raised. This check does not run while OEC is disabled.
To have this task performed, contact Dell Data Domain Technical Support.