NetWorker: Troubleshooting Tape Library Drive Ordering Problems
Summary: This article describes a well-known problem with tape libraries in a SAN environment, in which the operating system changes device names and applications that depend on those names fail.
Symptoms
In a Plug 'N Play operating system, devices are assigned SCSI target addresses in the order of discovery.
Because SAN discovery order varies and connectivity loss triggers Plug‑and‑Play remapping, target numbers change and cannot remain fixed.
Plug‑and‑Play renames devices based on enumeration order, so any intentional or accidental connection interruption can cause devices to be reassigned new names.
A 'drive ordering' problem describes a condition where NetWorker's configured driver name for a device does not match the name the operating system currently assigns to it. This is most commonly a result of the driver name changing in the operating system after initial NetWorker library configuration. This is typically a Plug 'N Play operating system issue, affecting Windows and Linux.
There are many errors and conditions associated with this problem, including, but not limited to:
- Error: 'nsrd: media info: failed unloading drive `{driver handle}' to slot {slot number}, error '69''
- Error: '{hostname} the destination component full'
- Error: '{driver handle} read open error, no such device or address'
- Error: 'opening: I/O error'
- Error: 'nsrd: Jukebox '{jukebox}' failed: expected volume '{volid}' got {volid}'
- Error: 'nsrd: Jukebox '{jukebox}' failed: expected volume '(volume_name)' got 'NULL''
- Error: 'read open error, device not ready'
- Error: 'nsrjb: Jukebox error, All allocated drives are not usable, unrecoverable operation errors'
- Error: 'nsrd: media warning: {driver handle} reading: read open error: No media in drive.'
- Error: 'inventory: Bar code label `{barcode}' does not match media db bar code label, updating media db'
- Error: 'Illegal request, medium not present'
- Error: 'nsrd: media info: failed unloading drive `{driver handle}' to slot {slot number}'
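When many of these messages appear together, it can help to count them by type. The following is a minimal sketch, not a NetWorker tool: the signature names and the function `count_symptoms` are hypothetical, and the patterns are only the fixed portions of the messages listed above, with placeholders such as {jukebox} left out.

```python
import re
from collections import Counter

# Fixed text taken from the error list above. The dictionary keys are
# hypothetical labels chosen for this sketch.
SIGNATURES = {
    "unload_failed": r"failed unloading drive",
    "destination_full": r"the destination component full",
    "no_such_device": r"read open error, no such device or address",
    "io_error": r"opening: I/O error",
    "unexpected_volume": r"failed: expected volume .* got ",
    "device_not_ready": r"read open error, device not ready",
    "drives_unusable": r"All allocated drives are not usable",
    "no_media": r"read open error: No media in drive",
    "barcode_mismatch": r"does not match media db bar code label",
    "medium_not_present": r"Illegal request, medium not present",
}


def count_symptoms(lines):
    """Count how many log lines match each drive-ordering symptom."""
    compiled = {
        name: re.compile(pattern, re.IGNORECASE)
        for name, pattern in SIGNATURES.items()
    }
    counts = Counter()
    for line in lines:
        for name, regex in compiled.items():
            if regex.search(line):
                counts[name] += 1
    return counts
```

The function accepts any iterable of strings, so an open log file can be passed in directly. A match only shows that a symptom is present; none of these messages is unique to drive ordering problems.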
Cause
NetWorker creates the library object during initial setup, linking tape drives to the OS‑generated device handles they have at that moment. It is a static association which reflects the relationship at the time of configuration. For example, a library may have two devices:
Physical drive 1 = \\.\Tape0 (or perhaps /dev/nst0 in Linux)
Physical drive 2 = \\.\Tape1 (or /dev/nst1)
In Plug-and-Play systems like Windows or Linux, any device disappearance, including one caused by a reboot or a connectivity change, can make the OS rename the devices. On a SAN in particular, where discovery order is not guaranteed, the devices may be named differently after the next reboot. For example, in contrast to the above:
Physical drive 1 = \\.\Tape1 or /dev/nst1
Physical drive 2 = \\.\Tape0 or /dev/nst0
Commands sent to the old device names may still succeed, as long as some device currently holds each name. However, the library's driver-handle associations no longer match the physical drives after the OS renames the devices, so NetWorker loses track of which drive is which. For example, NetWorker may load a tape cartridge into one drive and then issue commands through an outdated device name that now refers to a different drive. When an unexpected volume (or none at all) is found there, a wide range of errors can result. There are many possible causes of drive ordering conditions:
- Manual misconfiguration of library using jbconfig or jbedit commands
- Reboot of host, storage adapter, storage connectivity hardware, or tape devices
- Temporary loss of connectivity to a device
- Disabling and reenabling the device in the operating system
- Operating system updates
- Device or storage adapter driver updates
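The mismatch described in this section can be illustrated with a minimal sketch. It assumes each drive has a stable identifier, such as a serial number, that can be paired with its device name both at configuration time and now. The identifiers `SERIAL-A` and `SERIAL-B` and the function `find_reordered_drives` are hypothetical, and how the two mappings are collected is outside the scope of the sketch.

```python
def find_reordered_drives(configured, current):
    """Return drives whose OS device name differs from the configured one.

    Both arguments map a stable drive identifier (assumed here to be a
    serial number) to the OS device name associated with that drive.
    """
    mismatches = {}
    for drive_id, configured_name in configured.items():
        current_name = current.get(drive_id)  # None if the drive is absent
        if current_name != configured_name:
            mismatches[drive_id] = (configured_name, current_name)
    return mismatches


# Hypothetical data mirroring the two-drive example in this section.
configured = {"SERIAL-A": "/dev/nst0", "SERIAL-B": "/dev/nst1"}
current = {"SERIAL-A": "/dev/nst1", "SERIAL-B": "/dev/nst0"}

for drive_id, (old, new) in find_reordered_drives(configured, current).items():
    print(f"{drive_id}: configured as {old}, now seen as {new}")
```

With the swapped names shown, both drives are reported. Note that both device names still exist, which is why commands may appear to work while reaching the wrong drive.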
Resolution
Persistent Naming:
This is considered best practice. Support may recommend it as a proactive measure even if you are not currently experiencing issues. Use the information from the following articles:
- Implementing Tape Device Name application resilience for Windows
- Implementing Tape Device Name application resilience for Linux
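As background for the Linux case, many distributions ship udev rules that create persistent symbolic links for tape devices under /dev/tape/by-id. The sketch below assumes that directory exists on the host and simply shows which device node each link currently resolves to. It is an illustration only; the articles above may describe a different mechanism, and the function name `list_persistent_tape_names` is hypothetical.

```python
import os


def list_persistent_tape_names(by_id_dir="/dev/tape/by-id"):
    """Map each persistent symlink name to the device node it resolves to."""
    mapping = {}
    if not os.path.isdir(by_id_dir):
        return mapping  # directory absent: no udev tape links on this host
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        if os.path.islink(link):
            mapping[name] = os.path.realpath(link)
    return mapping


if __name__ == "__main__":
    for name, target in list_persistent_tape_names().items():
        print(f"{name} -> {target}")
```

The link name stays the same across reboots while its target (for example /dev/nst0 or /dev/nst1) may change, which is the property that makes a persistent name safe to configure.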
Additional Information
Manual reconfiguration
If you cannot immediately enable application resilience and reconfigure your library, there are several manual alternatives which may be considered:
- NMC reconfiguration: You may update NetWorker's configuration by using the Reconfigure option of the Library instance to remove the device definitions for all affected devices, then deleting the leftover tape device instances from the Devices container, before rescanning and reconfiguring with the corrected, new names.
- jbconfig command: These commands are still part of the NetWorker suite but are no longer used, and require more advanced knowledge of both NetWorker as well as tape library and storage transport technologies.
  - To begin from scratch, use jbconfig for manual library creation control: How to configure a NetWorker tape library manually using jbconfig command
- Enforced renaming: It may be possible to disable or delete devices and then re-add or re-enable them in the order corresponding to their current configuration in NetWorker. For example, in a simple Windows scenario for the above, one could disable both devices and then re-enable the instance that is configured as Tape0 in NetWorker first, to force the operating system to name that device Tape0 once more. The Linux approach is similar, but uses the /proc/scsi/scsi file to delete and rescan devices directly.
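For the Linux case, the ordering logic can be sketched as follows. The sketch only builds the command strings and does not run them. It assumes the legacy /proc/scsi/scsi interface is available on the kernel in use, that the commands would be run as root, and that the kernel hands out device names in the order drives are added back, which is not guaranteed on every system. The host, channel, id, and lun values and the function `plan_scsi_reorder` are hypothetical.

```python
def plan_scsi_reorder(drives_in_configured_order):
    """Build (not run) legacy /proc/scsi/scsi commands for a reorder.

    Each entry is a (host, channel, id, lun) tuple, listed in the order
    NetWorker expects the drives to be named (first name first).
    """
    commands = []
    # Step 1: remove every affected drive so that its name is released.
    for host, channel, scsi_id, lun in drives_in_configured_order:
        commands.append(
            f'echo "scsi remove-single-device {host} {channel} {scsi_id} {lun}"'
            " > /proc/scsi/scsi"
        )
    # Step 2: add the drives back one at a time, in the configured order.
    for host, channel, scsi_id, lun in drives_in_configured_order:
        commands.append(
            f'echo "scsi add-single-device {host} {channel} {scsi_id} {lun}"'
            " > /proc/scsi/scsi"
        )
    return commands


# Hypothetical addresses: the drive configured first in NetWorker is listed first.
for command in plan_scsi_reorder([(1, 0, 3, 0), (1, 0, 2, 0)]):
    print(command)
```

Printing the plan rather than executing it allows the order to be reviewed before anything is removed. The drives should be idle and unloaded before the commands are run by hand.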