NMDA: DB2 backups fail randomly every night with Error 3

Summary: NMDA DB2 backups fail randomly with Error 3. The problem was resolved after creating a new device, distributing backups across two storage nodes, and setting DB2 retry and timeout parameters.

Symptoms

NMDA DB2 backup fails with Error 3
DB2 backup fails with error 'lgto_auth for `nsrmmd' failed: busy'
No networking or firewall issues were found.

Thousands of the following messages appear in /nsr/logs/daemon.raw on the storage node:

"5004-nfs lookup failed (nfs: No such file or directory)"
"invalid save stream"
"Cannot stat active file"
"unable to collect deduplication statistics"
"was aborted and removed from volume"
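
To gauge how often each of these messages occurs, a small script can tally them. The following is a minimal sketch, not a NetWorker tool: it assumes the raw log has first been rendered to plain text (for example with nsr_render_log), and the function names and the file name daemon.log are hypothetical.

```python
from collections import Counter

# Message fragments quoted in the Symptoms section above.
PATTERNS = [
    "nfs lookup failed",
    "invalid save stream",
    "Cannot stat active file",
    "unable to collect deduplication statistics",
    "was aborted and removed from volume",
]


def count_messages(lines):
    """Count how many lines contain each known message fragment."""
    counts = Counter({pattern: 0 for pattern in PATTERNS})
    for line in lines:
        for pattern in PATTERNS:
            if pattern in line:
                counts[pattern] += 1
    return counts


def count_in_file(path):
    """Tally the fragments in a rendered (plain text) log file."""
    # errors="replace" keeps the scan going if the log has odd bytes.
    with open(path, errors="replace") as handle:
        return count_messages(handle)
```

For example, `count_in_file("daemon.log").most_common()` lists the fragments from most to least frequent, which helps show whether one message dominates.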

Errors in nmda-messages.log and libnsrdb2.log with debug=9:

153929 2/9/2021 10:34:50 PM  4 7 987 1 18153790 0 (client) (pid18153790) NSR severe The backup session could not start: busy. 
93412 2/9/2021 10:34:50 PM  3 5 0 1 18153790 0 (client) (pid18153790) NSR error Could not perform the action 2. The status was changed to 3. 
153929 1612842069 4 7 987 1 19136950 0 (client) (pid19136950) NSR severe 39 The backup session could not start: %s. 1 49 8 0 4 busy
93412 1612842069 3 5 0 1 19136950 0 (client)  (pid19136950) NSR error 62 Could not perform the action %d. The status was changed to %d. 2 1 1 2 1 1 3
(pid = 18809144) (02/09/21 21:40:00.338942) nsrdb2sv_log_program_args: /usr/bin/nsrdasv -LL -T db2 -s (NW server) -g (group)  -a *policy action jobid=2297950 -a *policy name=(policy)  -a *policy workflow name=(workflow)  -a *policy action name=(action)  -y Tue Feb 23 23:59:59 GMT-0600 2021 -w Tue Feb 23 23:59:59 GMT-0600 2021 -m (client) -a *policy action jobid restart=Yes -b (pool) -t 1612810625 -o ....

(pid = 18809144) (02/09/21 21:40:00.624767) Backing up the (DB) database.
(pid = 18809144) (02/09/21 21:40:00.624939) set_db2_version: Exiting set_db2_version(): Return code: 10050000
(pid = 18809144) (02/09/21 21:49:08.731480) DbBackup: Exiting with error:
Unable to backup DB2MDME database due to backup request failure, SQLCODE : -2025, SQL2025N  An I/O error occurred.  Error code: "3". Media on which this error occurred: "VENDOR".
 .
(pid = 18809144) (02/09/21 21:49:08.731631) libdb2sv_main: ERROR: DbBackup() failed.
(pid = 18809144) (02/09/21 21:49:08.731685) Unable to backup DB2MDME database due to backup request failure, SQLCODE : -2025, SQL2025N  An I/O error occurred.  Error code: "3". Media on which this error occurred: "VENDOR".

The critical error is the nsrmmd busy error below:

02/09/21 21:32:46 (pid 18153790): 02/09/21 21:32:46.797073 lgto_auth for `nsrd' succeeded
02/09/21 21:32:46 (pid 18153790): 02/09/21 21:32:46.855631 lgto_parms for `nsrmmd' succeeded
02/09/21 21:32:46 (pid 18153790): 02/09/21 21:32:46.855705 got `store index entries' value of `Yes'
02/09/21 21:32:46 (pid 18153790): 02/09/21 21:32:46.855803 Saving in pool 'IDC-DB2'.
02/09/21 21:32:46 (pid 18153790): 02/09/21 21:32:46.855822 server enabled for immediate mode
02/09/21 21:32:46 (pid 18153790): 02/09/21 21:32:46.882267 lgto_auth for `nsrmmd' failed: busy
02/09/21 21:32:46 (pid 18153790): 02/09/21 21:32:46.882349 Unable to acquire the user credentials for direct save nsrmmd authentication: busy.
02/09/21 21:32:46 (pid 18153790): 02/09/21 21:32:46.882439 The error TYPE is 0, SEVERITY is 0, NUMBER is -13, errnum is -13, errstr is 'busy'.
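
When the debug log is large, the failed authentication lines can be extracted programmatically. The sketch below assumes the line layout shown in the excerpt above. The regular expression and the function name `failed_auths` are illustrative and not part of NMDA.

```python
import re

# Matches lines such as:
# 02/09/21 21:32:46 (pid 18153790): 02/09/21 21:32:46.882267 lgto_auth for `nsrmmd' failed: busy
AUTH_RE = re.compile(
    r"\(pid (?P<pid>\d+)\): "
    r"(?P<stamp>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"lgto_auth for `(?P<daemon>[^']+)' "
    r"(?P<result>succeeded|failed)(?:: (?P<reason>.*))?"
)


def failed_auths(lines):
    """Return (timestamp, pid, daemon, reason) for each failed lgto_auth line."""
    failures = []
    for line in lines:
        match = AUTH_RE.search(line)
        if match and match.group("result") == "failed":
            failures.append(
                (
                    match.group("stamp"),
                    int(match.group("pid")),
                    match.group("daemon"),
                    match.group("reason"),
                )
            )
    return failures
```

Grouping the returned timestamps by hour can show whether the busy errors cluster around particular backup start times.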

Cause

Configuration and resource availability issues on the storage node: nsrmmd reported busy when backup sessions tried to start.

Resolution

The problem was resolved after making the changes below. No single root cause was identified, but creating a new device and setting the parameters below helped most:

1. Added one new device to the storage node.
2. Distributed backups evenly across the storage nodes (target session).
3. Changed backup start times.
4. Added these parameters to the NMDA DB2 Application Information:

NSR_MAX_START_RETRIES=50
NSR_FXBUSY_RETRIES=10
NSR_MMDB_RETRY_TIME=10

5. In the backup action's properties, increased Inactivity timeout to 300 and set Retries to 2 and Retry delay to 10.
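
Steps 2 and 3 amount to spreading load across storage nodes and over time. The following is a planning sketch only, with hypothetical client names, storage node names, and a hypothetical 15-minute gap. It does not change any NetWorker configuration; it only shows one way to draft an even, staggered schedule before applying it manually.

```python
from datetime import datetime, timedelta
from itertools import cycle


def spread_backups(clients, storage_nodes, first_start, gap_minutes):
    """Assign clients to storage nodes round-robin and stagger start times.

    first_start is a 24-hour "HH:MM" string; each subsequent client
    starts gap_minutes after the previous one.
    """
    start = datetime.strptime(first_start, "%H:%M")
    plan = []
    # cycle() repeats the storage node list so assignments alternate evenly.
    for index, (client, node) in enumerate(zip(clients, cycle(storage_nodes))):
        when = start + timedelta(minutes=index * gap_minutes)
        plan.append((client, node, when.strftime("%H:%M")))
    return plan


# Hypothetical example: three DB2 clients across two storage nodes.
plan = spread_backups(["db2a", "db2b", "db2c"], ["sn1", "sn2"], "21:30", 15)
```

Round-robin assignment keeps the session count per storage node within one of each other, and the fixed gap avoids all sessions requesting nsrmmd at the same moment.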

Affected Products

NetWorker, NetWorker Module for Databases and Applications

Products

NetWorker Family

Article Number: 000183668

Article Type: Solution

Last Modified: 28 Rajab 1447 AH

Version: 6
