Troubleshooting Tape Library Access Problems in NetWorker
Summary: This article is intended to help Support and NetWorker Administrators determine why a detected robot cannot accept commands.
Symptoms
- Unable to access detected tape library installation on NetWorker Storage Node or Server
- Unable to back up data due to unusable backup hardware
- Errors accessing the robot:
  - 0x29
  - Device busy
  - The requested resource is busy
  - Str=<There is an input or output error.>
  - No such device
  - No such file or directory
  - Inappropriate ioctl for device
Cause
If the library was working previously, and suddenly is not, consider the last known change as the likely cause:
- Unhandled change in library address following reboot, rediscovery, and renaming of device
- Possible damage due to power surge, outage, or other environmental event
- Failure events or reconfiguration of transport hardware
- Installation, change, or deletion of software or drivers pertaining to transport or robotics
If the library has never worked, confirm that the hardware is supported in the NetWorker Hardware Compatibility Guide (Requires Dell Support Account Sign-In). Remember that it is possible for a library to be partially functional; discovery alone does not guarantee usability or supportability.
Resolution
To troubleshoot library‑access failures, review recent changes. Then use basic and third‑party comparative tests to confirm whether any host or process can trigger a response from the robot.
Sometimes it is desirable to test specific functions, based on the available evidence. If Host A can query the robot but Host B cannot, the robot is responsive. Host A’s driver may be locking the robot. If Host B still receives errors after all hosts are unzoned, Host B may have a driver, configuration, or software issue.
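The two-host comparison above can be written down as a small decision helper. This is only a sketch of the reasoning in this section: the function name `interpret_host_test` and its `ok`/`fail` arguments are hypothetical and not part of NetWorker.

```shell
#!/bin/sh
# Hypothetical helper (not a NetWorker tool): interpret a two-host comparative test.
# Arguments: result of querying the robot from Host A, then from Host B ("ok" or "fail").
interpret_host_test() {
    case "${1:-}:${2:-}" in
        ok:ok)
            echo "Robot responds to both hosts." ;;
        ok:fail)
            echo "Robot is responsive. Host A may be locking it, or Host B may have a driver, configuration, or software issue." ;;
        fail:ok)
            echo "Robot is responsive. Host B may be locking it, or Host A may have a driver, configuration, or software issue." ;;
        fail:fail)
            echo "No host gets a response. Consider a lock by another process, transport issues, or a component-level malfunction." ;;
        *)
            echo "usage: interpret_host_test ok|fail ok|fail" >&2
            return 2 ;;
    esac
}
```

For example, `interpret_host_test ok fail` prints the second message. Unzoning the other hosts and repeating the test on Host B is what separates the two explanations in that case.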
If the host accessed the robot before the issue, review the items that are most likely to have changed. Investigate failures or known configuration changes after the event.
After the library is detected, use the following commands to test basic SCSI operations over the storage transport, not Ethernet or the web UI. Always ensure that operating system patches are up to date, especially those concerning storage.
nsrget -o:d on affected server and nodes.
-o:d on any host with tapes where the tapes are busy writing. You can check this from the NetWorker Management Console (NMC) under Monitoring -> Devices.
The following article provides information about getting and using NSRGET: NetWorker: How to Use the NSRGet NetWorker Data Collection Tool
Library Access: Operating System:
- Windows: There is no native way to query a tape library in Windows; mtx is a freeware utility which may be tested if wanted. It uses the changer device handle, rather than the SCSI address, when issuing commands (which may have testing implications).
- Linux: Like Windows, Linux has no native command to query a tape library, but an mtx port is available. It requires the device driver handle (again, different from how NetWorker accesses the library).
loaderinfo -f /dev/sg#
mtx -f /dev/sg# inquiry
- Solaris: Solaris includes the sgen driver for native tape library support, but neither an mtx port nor other native library commands exist for it. See the section on NetWorker commands to test library access instead (below).
- AIX: AIX does not have any native tape library support (lus is used instead), and no mtx port exists for it. See the section on NetWorker commands to test library access instead (below).
- HP-UX: mc is the native HP-UX command for medium changer manipulation:
mc -p $(ioscan -FnkC autoch | grep /dev/rac) -r MIDS -q
- NetWorker: These commands function at a relatively atomic level. Although they are written, compiled, and tested by NetWorker support, they require neither a running NetWorker instance nor any of NetWorker's configuration. In general, they are considered to be reliable, low-level, software-independent test utilities. To increase debug output for most utilities, set the following environment variables:
SJI_DEBUG=9
LUS_DEBUG=9 (lusdebug ffff on AIX)
CDI_DEBUG=9
SCSI_DEBUG=9
JBDEBUG=9
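One way to apply the debug variables listed above to a single command, without leaving them set for the rest of the session, is a small wrapper. This is a minimal sketch: `with_nw_debug` is a hypothetical name, and it does not cover the AIX `lusdebug ffff` case.

```shell
#!/bin/sh
# Hypothetical wrapper (not a NetWorker tool): run one command with the
# debug variables from this article set only for that command.
with_nw_debug() {
    # env sets the variables for the child process only and passes
    # the command's exit status back to the caller.
    env SJI_DEBUG=9 LUS_DEBUG=9 CDI_DEBUG=9 SCSI_DEBUG=9 JBDEBUG=9 "$@"
}
```

A typical call would be `with_nw_debug sjirjc <changer address> > sjirjc_debug.log 2>&1`, which keeps the extra output in a file that can be attached to a support request.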
In the commands below, '<changer address>' varies by operating system:
Windows: Initiator.Target.LUN (as revealed by inquire command) or \\.\changer# driver handle
Linux: Initiator.Target.LUN (as revealed by inquire command) or /dev/sg# driver handle
Solaris: /dev/scsi/changer/c#t#d# driver handle
AIX: Initiator.Target.LUN (as revealed by inquire command)
HP-UX: Initiator.Target.LUN (as revealed by inquire command) or /dev/rac/c#t#d# driver handle
sjirjc <changer address>
Requests data from the robot, such as the number of drives and the features supported.
sjisn <changer address>
Requests drive element and serial number information from the robot.
sjirdtag <changer address>
Requests tape cartridge-to-element location data.
cdi_inq -f <changer driver handle> -v
Requests vital product data (requires a driver handle).
ielem -a <changer address>
Attempts to reinitialize elements - may be disruptive.
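The read-only queries above can be run in sequence to see how far the robot responds. The following is a sketch under stated assumptions: `probe_changer` is a hypothetical helper, it assumes each utility returns a non-zero exit status on failure (check this against the utilities' actual behavior), and it leaves out `ielem` on purpose because that command may be disruptive.

```shell
#!/bin/sh
# Hypothetical helper (not a NetWorker tool): run the read-only robot queries
# in order and report which ones succeed. Output of each query is kept in a
# log file under $TMPDIR (or /tmp).
probe_changer() {
    addr="${1:-}"
    if [ -z "$addr" ]; then
        echo "usage: probe_changer <changer address>" >&2
        return 2
    fi
    log_dir="${TMPDIR:-/tmp}"
    failures=0
    for tool in sjirjc sjisn sjirdtag; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: not found in PATH"
            failures=$((failures + 1))
        # Assumption: a non-zero exit status means the query failed.
        elif "$tool" "$addr" >"$log_dir/probe_$tool.log" 2>&1; then
            echo "$tool: OK"
        else
            echo "$tool: FAILED (see $log_dir/probe_$tool.log)"
            failures=$((failures + 1))
        fi
    done
    # Succeed only if every query succeeded.
    [ "$failures" -eq 0 ]
}
```

Call it with the address form that applies to the host, as listed above. A robot that answers `sjirjc` but fails the later queries is responding at least partially, which is useful evidence when comparing hosts.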
Library Access: Resetting the Library:
nsrjb -HEvvvvv
Issues a reset command to a problematic library, and forces an element reinitialization.
nsrjb -IIvvvvv
Forces an update and refresh to the NetWorker nsr jukebox object based on the barcodes reported by the library and the corresponding values in the media database.
nsrjb -HH
Forces the jukebox to unload all volumes and attempt a soft reset.
ielem -a is a rough equivalent of nsrjb -E that does not require a functional nsr jukebox in NetWorker.
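Because the reset and reinitialization commands in this section can interrupt running operations, it may be worth putting an explicit confirmation in front of them. This is a sketch only: `confirm_then_run` is a hypothetical guard, not something shipped with NetWorker.

```shell
#!/bin/sh
# Hypothetical guard (not a NetWorker tool): show a potentially disruptive
# command and run it only after an explicit "yes" on standard input.
confirm_then_run() {
    printf 'About to run: %s\nType "yes" to continue: ' "$*"
    read -r answer || answer=""
    if [ "$answer" = "yes" ]; then
        "$@"
    else
        echo "Skipped."
        return 1
    fi
}
```

For example, `confirm_then_run nsrjb -HEvvvvv` prints the command and waits for confirmation before the reset is issued.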
Transport - Configuration
- For SAN: Ensure both the robot and the intended NetWorker robot control host are logged into the switch properly, and review Zoning for the robot to ensure that end-to-end connection is possible.
- Robots are not intended to be accessed or controlled by more than one host; unless there is a need (for example, a partitioned robot), ensure only the intended NetWorker robot controller host is zoned to see the robot.
- It is possible to test SAS expanders to ensure that robotic connection is established; pure point-to-point technology like SCSI requires testing connection from the relevant host.
Transport - Hardware
- If problems are detected at either the host or transport hardware level, consider testing the switch or expander, or replacing cables with 'known good' examples to rule out cabling issues.
- Review the firmware of the transport hardware, and the firmware of the robot itself for currency.
- For SCSI, ensure that terminators are correctly placed and seated snugly, cable length limits are observed and proper voltages are being used.
Host transport - Configuration
- Ensure that the concerned host has up-to-date drivers and firmware for its transport hardware; use EMCReports (bundled with nsrget -o:e) to check.
- Ensure that any required Host Bus Adapter (HBA) driver configuration is done appropriately for the operating system.
Host software - Resource locking
- For any host that is zoned to see the robot (ideally - only the designated NetWorker host), check for any software that could be attempting to access the robot, such as other backup software, monitoring software, or standalone utilities which may attempt to access the robot.
- For Solaris 10, the robot is not accessible when the nsrlcpd NetWorker process is attached; thus it may appear to be inaccessible (or even undetectable) until the library in NetWorker is disabled (forcing nsrlcpd to detach and die).
- If any non-NetWorker process is suspected of locking or accessing the robot or any drive, see Troubleshooting Overwritten Labels and SCSI Resets in NetWorker for more information about troubleshooting and identification.
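On Linux, a quick first check for local processes holding the changer's driver handle is `lsof` or `fuser`. This is a sketch that assumes one of those two utilities is installed. `device_holders` is a hypothetical name, it only sees processes on the local host (not other hosts zoned to the robot), and it usually needs root privileges to see processes owned by other users.

```shell
#!/bin/sh
# Hypothetical helper (Linux only): list local processes that have the given
# device node open, using lsof if present, otherwise fuser.
device_holders() {
    dev="${1:-}"
    if [ -z "$dev" ] || [ ! -e "$dev" ]; then
        echo "device_holders: no such device node: $dev" >&2
        return 2
    fi
    if command -v lsof >/dev/null 2>&1; then
        lsof "$dev"          # non-zero exit status when nothing has it open
    elif command -v fuser >/dev/null 2>&1; then
        fuser -v "$dev"      # non-zero exit status when nothing has it open
    else
        echo "device_holders: neither lsof nor fuser is installed" >&2
        return 3
    fi
}
```

For example, `device_holders /dev/sg#` (with the real handle number filled in) lists any process that currently has the handle open. An empty result does not rule out a lock held by another host.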
If the operating system detects the library but the library does not respond to commands, it is functional to some degree. It may be locked by another process or host, affected by transport issues, or experiencing a component‑level malfunction.
If no process or host can be determined to be accessing the robot besides the NetWorker Storage Node intended to control it, refer to Troubleshooting Tape Library Hardware Problems in NetWorker to determine if there is an issue with the robot itself.
Additional Information
Robotics issues that are shown to be outside NetWorker's scope as an application (that is, the robot cannot be accessed using standard operating system methods) are not within the scope of NetWorker support.
NetWorker: Troubleshooting Tape Library Problems in NetWorker
Support can provide guidance using the criteria above, but we do not have OS, HBA, or robotics vendor resources. This limitation can lead to prolonged, unsuccessful troubleshooting.