NetWorker: NMDA - Oracle Troubleshooting Guide
Summary: This article describes how to begin identifying NetWorker Module for Databases and Applications (NMDA) problems.
Instructions
The goal of the document is to help the reader narrow down any NetWorker Module for Databases and Applications (NMDA) Oracle-related issues.
Gather the following information:
Step 1: Basic environment gathering:
From the NetWorker backup server:
- Hostname.
- OS type, and version.
- NetWorker version, and build number.
- Is this a new client configuration, or an existing environment that was working but suddenly started to fail? Any recent changes?
From the Oracle server:
- Hostname
- OS type, and version.
- NetWorker version, and build number.
- NMDA version, and build number
- Oracle version
- Name of the Oracle instance being backed up (ORACLE_SID), options:
  - echo $ORACLE_SID
  - cat /etc/oratab
- Path of ORACLE_HOME used for the backup, options:
  - echo $ORACLE_HOME
  - grep ORACLE_HOME ~oracle/.bash_profile ~oracle/.bashrc
- Authentication method used for the backup. For example: OS Auth, or Oracle DB Auth, options:
  - OS Authentication ("/ as sysdba"): rman target /
  - Oracle Database Authentication (username/password): rman target sys@PROD password=<pwd>
- Is this a standalone, or clustered environment? If clustered, gather the various node names that are part of the cluster.
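The Oracle server items above can be collected in one pass. The following is a minimal sketch, assuming a Linux or UNIX host and a session run as the Oracle software owner; the function name collect_oracle_env is hypothetical and is not part of NetWorker or NMDA. It reports only the values that standard OS commands and the environment expose, so the NetWorker, NMDA, and Oracle versions must still be gathered separately.

```shell
#!/bin/sh
# Sketch only: print basic Oracle server environment facts.
# collect_oracle_env is a hypothetical helper name, not a NetWorker or NMDA tool.
# $1 (optional) = path to the oratab file, defaulting to /etc/oratab.
collect_oracle_env() {
    oratab=${1:-/etc/oratab}
    echo "Hostname: $(uname -n)"
    echo "OS: $(uname -sr)"
    echo "ORACLE_SID: ${ORACLE_SID:-<not set>}"
    echo "ORACLE_HOME: ${ORACLE_HOME:-<not set>}"
    if [ -r "$oratab" ]; then
        echo "oratab entries:"
        # Skip comment lines and blank lines; the remaining lines are SID:ORACLE_HOME:flag
        grep -v -e '^#' -e '^[[:space:]]*$' "$oratab" || true
    fi
}

collect_oracle_env
```

If ORACLE_SID or ORACLE_HOME prints as "<not set>", the current shell does not carry the environment used for the backup, and the profile files listed above are the next place to look.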
Before investigating any issue, consult the NetWorker compatibility matrix for any support conflicts: E-Lab Interoperability Navigator 2.0-HOME
NMDA documentation is available through: Support for NetWorker Module for Databases and Applications | Manuals & Documents
Optionally, collect an NSRGET support bundle from the NetWorker server and Oracle client: NetWorker: How to Use the NSRGet NetWorker Data Collection Tool
Step 2: Review RMAN output
For backup errors:
- Identify the issue by reviewing the RMAN output.
- Go to NetWorker Management Console (NMC) -> Monitoring -> Expand the Policy -> Expand the Workflow -> Double-click the backup action. Double-click the save sets in the "Failed" section.
  NOTE: If the above messages do not show the RMAN error stack, you must gather the RMAN output using the NSR_RMAN_ARGUMENTS variable.
- Review the RMAN output for any error messages, and compare with any known issues in the Knowledge Base.
- Verify that regular file system backups succeed for the client whose Oracle backup is failing. If regular file system backups for this host are failing, those errors must be resolved first, before any further RMAN troubleshooting.
- Crosscheck the timestamps of the Oracle backup failures with the OS system log files and the daemon.raw files on the Oracle server, NetWorker backup server, and Storage Node (if used). Investigate any error messages.
  - Linux: /nsr/logs/daemon.raw
  - Windows (Default): C:\Program Files\EMC NetWorker\nsr\logs\daemon.raw
  - NetWorker: How to use nsr_render_log to render .raw log files
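The timestamp crosscheck can be sketched as a small shell helper. This is an illustration under stated assumptions: the helper name crosscheck_logs is hypothetical, the log files passed to it must already be plain text (a daemon.raw file has to be rendered first, for example with nsr_render_log as described in the article referenced above), and the timestamp fragment must be typed in whatever format the logs actually use.

```shell
#!/bin/sh
# Sketch only: show lines from several text logs that contain the same timestamp fragment.
# crosscheck_logs is a hypothetical helper name, not a NetWorker command.
# $1 = fixed-string timestamp fragment, remaining arguments = readable text log files.
crosscheck_logs() {
    stamp=$1
    shift
    for f in "$@"; do
        # Skip files that are missing or unreadable instead of failing
        [ -r "$f" ] || continue
        echo "== $f"
        # -F treats the fragment as a literal string, so slashes and dots need no escaping
        grep -F -- "$stamp" "$f" || true
    done
}

# Example invocation (paths and timestamp format are assumptions):
#   nsr_render_log /nsr/logs/daemon.raw > /tmp/daemon.log
#   crosscheck_logs "01/28/2016 14:1" /tmp/daemon.log /var/log/messages
```

A fragment that stops partway through the minutes matches a window of time rather than a single second, which helps when the clocks on the hosts differ slightly.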
For restore issues:
- Obtain the complete RMAN output of the restore attempt. This includes the RMAN script used, the execution of that script, and the resulting error stack from RMAN. See the Additional Information section below for an example of what the complete RMAN output looks like.
- Review the RMAN output for any error messages, and compare with any known issues in the Knowledge Base.
- Verify that a simple regular file system restore succeeds for the client whose Oracle restore is failing. If regular file system restores for this host are failing, those errors must be resolved first, before any further RMAN troubleshooting.
- For a directed restore, where the backup was performed on Host_A and the restore is running on Host_B, ensure that you can perform a directed restore of a single flat file between these two hosts, including using the same media pools that would be used for Oracle.
For additional NMDA log files, and for enabling NMDA debug to further help an investigation, see: How to enable debug for NMDA
Additional Information
------------------------------------------------------------------------------------------------------------------------
Recovery Manager: Release 12.1.0.2.0 - Production on Thu Jan 28 14:18:53 2016
Copyright (c) 1982, 2014, Oracle and/or its affiliates. All rights reserved.
RMAN> connect target *
2> run {
3> allocate channel t1 type sbt_tape parms 'ENV=(NSR_SERVER=macwin1,NSR_CLIENT=pro-ora1)';
4> backup tablespace users;
5> release channel t1;
6> }
7>
connected to target database: ORCL (DBID=1429445936)
using target database control file instead of recovery catalog
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of allocate command on t1 channel at 01/28/2016 14:18:53
ORA-19554: error allocating device, device type: SBT_TAPE, device name:
ORA-27211: Failed to load Media Management Library
Additional information: 2
Recovery Manager complete.
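
When the RMAN output has been saved to a file, the error stack can be pulled out for comparison with known issues. The following is a minimal sketch; the function name rman_error_stack is hypothetical, and it assumes the error lines start with an RMAN- or ORA- code as in the output above.

```shell
#!/bin/sh
# Sketch only: print the RMAN-/ORA- error lines from a saved RMAN output file.
# rman_error_stack is a hypothetical helper name, not an Oracle or NetWorker tool.
# $1 = text file containing the complete RMAN output.
rman_error_stack() {
    # Keep lines that begin with an RMAN- or ORA- code, then drop the
    # RMAN-00571 and RMAN-00569 banner lines that only frame the stack.
    grep -E '^(RMAN|ORA)-[0-9]+:' "$1" | grep -v -E '^RMAN-005(69|71):' || true
}

# Example invocation (the file name is an assumption):
#   rman_error_stack /tmp/rman_output.log
```

For the output shown above, this leaves the RMAN-03009, ORA-19554, and ORA-27211 lines, which are the ones to search for in the Knowledge Base.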