1 Rookie

100 Posts

April 6th, 2011 09:00

any idea ?

240 Posts

April 12th, 2011 10:00

Hello Can!

Please review this information in full before making any changes. The following is taken from EMC KB article esg116949, "Understanding and troubleshooting how media is unloaded from a Tape Library":

Symptoms

  • Problems or errors unloading volumes within NetWorker
  • sjimm errors when moving volume from drive to other element
  • Error: 'MOVE MEDIUM key:5 status:CHECK CONDITION No Additional Sense, Medium Not Present'
  • Error: 'SJIMM: There is an input or output error. 1640:sjimm: Code:0x29, Str= '
  • Volume is not returned to slot following backup
  • Volume remains seated in drive after unload attempt
  • Volume remains ejected in drive after unload attempt
  • Scanner command interrupted during operation
  • Scanner error: '8770:scanner: fn <#> rn <#>read error No media in drive.'
  • Volume being marked as in NetWorker interface
  • Error: 'Expected volume ` for slot ` ', found volume ` '

Causes

There are many causes for many different types of unload problems, but the best way to begin investigating them is to understand how NetWorker treats volumes and deals with unloading them. The following discusses factors in NetWorker's unload of volumes:

1) Autoeject Feature (for both Library and NetWorker NSR Jukebox object). NetWorker's setting is commonly misinterpreted, but it must agree with the library's own setting or problems will result. Note that this has nothing to do with return to slot after backup (see Idle Device Timeout). To illustrate the states of the library itself and NetWorker's Autoeject values:

i. A Library with the Autoeject vendor feature enabled requires only a SCSI move command to move a volume from drive element to another element; the explicit eject command is not required.

ii. A Library with the Autoeject vendor feature disabled requires an explicit eject command prior to the move command; otherwise the media is never ejected from the drive so that the robotic hand can retrieve it, making unload impossible.

iii. When Autoeject is enabled in Jukebox Features, NetWorker does not send an eject command to the drive first, as it is expected the ejection is implicit in the move command (corresponding to point i).

iv. When Autoeject is disabled in Jukebox Features, NetWorker first sends an eject command to the drive before requesting a robotic move, expecting the library requires this in the course of a drive unmount.

2) NSR Jukebox > Jukebox Timers: Eject Sleep / Unload Sleep. These settings control how long NetWorker pauses before each stage of an unmount request: first the ejection of the volume from the drive, and then the removal of the volume from the drive to be placed into another element.

3) NSR Jukebox > Jukebox Timers: Idle Device Timeout. This value reflects how long NetWorker will leave a volume in an inactive device before returning it to its slot. A value of 0 means the volume is allowed to remain in the drive until NetWorker requests to load that volume elsewhere, or to load a different volume into the drive.

4) NSR Jukebox > Configuration: Verify Label on Unload. This value will force NetWorker to rewind and verify a volume's label when asked to unmount it and return it to slot. In practice, this may exacerbate unload issues and lead to a more frequent marking of tapes as ; further, it will add overhead to backup operations since when the tape is remounted for next backup, the drive will need to space forward again to End of Data (EOD) for writing the next session. It is generally not recommended.

5) MGD_SHARE_DEV_LOGICAL_SWITCH_FLAG: Avoids unnecessary robotics operations. If a volume is requested by NetWorker for a logical device on a Storage Node when it is already mounted on a different Storage Node's logical device instance of the same physical drive, NetWorker attempts to logically unmount / remount the volume between the two logical instances, rather than physically unloading the volume only to reload it in the same physical drive for a different host's use.
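To make points 1 to 3 concrete, here is a minimal Python sketch of the unmount sequence as described above. It is only a model for reasoning about the settings, not NetWorker code: the function names, the step tuples, and the choice to skip the Eject Sleep pause when no eject command is sent are all assumptions made for illustration.

```python
from typing import List, Tuple

# A step is (action, seconds); seconds is only meaningful for "sleep".
Step = Tuple[str, float]


def plan_unmount(nw_autoeject: bool, eject_sleep: float, unload_sleep: float) -> List[Step]:
    """Model the order of operations for one unmount request (hypothetical helper)."""
    steps: List[Step] = []
    if not nw_autoeject:
        # Autoeject disabled in Jukebox Features: NetWorker sends an explicit
        # eject to the drive before asking for the robotic move (point iv).
        steps.append(("sleep", eject_sleep))
        steps.append(("eject", 0.0))
    # Autoeject enabled: ejection is expected to be implicit in the move (point iii).
    steps.append(("sleep", unload_sleep))
    steps.append(("move", 0.0))
    return steps


def idle_unload_due(idle_time: float, idle_device_timeout: float) -> bool:
    """Return True when an idle volume would be returned to its slot.

    Both arguments are in whatever unit the Idle Device Timeout setting uses.
    A timeout of 0 means the volume stays in the drive indefinitely.
    """
    if idle_device_timeout == 0:
        return False
    return idle_time >= idle_device_timeout


if __name__ == "__main__":
    print(plan_unmount(nw_autoeject=False, eject_sleep=30, unload_sleep=30))
    print(idle_unload_due(idle_time=100, idle_device_timeout=0))
```

Reading the plan for each Autoeject value next to the library's own setting shows quickly whether the eject step is sent once, twice, or not at all.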

Resolutions

When investigating these sorts of issues, be aware of the information above and qualify the problem:

Trending failures

  • Does the problem happen reliably on a drive or drives?
  • Does the problem happen reliably with a volume or volumes?
  • Does the problem happen reliably with a library or libraries?
  • Does the problem happen reliably for a Storage Node or Nodes?
  • Does the problem happen reliably under a certain circumstance, such as unmounting during a span, or under heavy load?
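One way to answer the trending questions above is to tally failures per drive, volume, library, and Storage Node. The sketch below assumes you have already collected the failures into simple records by hand or from your own log review; the record layout and field names are hypothetical and are not a NetWorker log format.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Hypothetical field names for a manually collected failure record.
FIELDS = ("drive", "volume", "library", "storage_node")


def trend_failures(failures: Iterable[Dict[str, str]]) -> Dict[str, List[Tuple[str, int]]]:
    """Count how often each drive, volume, library, and Storage Node appears."""
    counters = {field: Counter() for field in FIELDS}
    for record in failures:
        for field in FIELDS:
            value = record.get(field)
            if value:  # skip fields that were not recorded
                counters[field][value] += 1
    # most_common() sorts each tally from most to least frequent.
    return {field: counters[field].most_common() for field in FIELDS}


if __name__ == "__main__":
    sample = [
        {"drive": "drive1", "volume": "A00001", "library": "lib1", "storage_node": "sn1"},
        {"drive": "drive1", "volume": "A00002", "library": "lib1", "storage_node": "sn2"},
        {"drive": "drive1", "volume": "A00003", "library": "lib1", "storage_node": "sn1"},
    ]
    for field, tally in trend_failures(sample).items():
        print(field, tally)
```

If one drive or one volume dominates the tally, that points at hardware or media; an even spread points more toward configuration or timing.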

Testing Autoeject

  1. While NetWorker is idle or halted, use the sjimm command to move a volume to a drive, and then out again.
  2. If the return move works without error, autoeject is enabled at the library level.
  3. If you get a failure code, either there is a physical ejection issue, external contention for the device, or autoeject is not enabled at the library level.
  4. Try again after running: mt -f (devicename) offline; if this succeeds after the manual ejection command, it is confirmed that autoeject is not enabled on the library.
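The four steps above amount to a small decision table, sketched here in Python. The function and its wording are illustrative only; in particular, the last branch (the move still fails after the manual offline) is not covered by the article, so the conclusion there is an inference from step 3.

```python
from typing import Optional


def interpret_autoeject_test(move_ok: bool, move_ok_after_offline: Optional[bool] = None) -> str:
    """Interpret the manual move test (hypothetical helper).

    move_ok: did the return move from the drive succeed without a manual eject?
    move_ok_after_offline: did it succeed after 'mt -f (devicename) offline'?
                           None means that retry has not been run yet.
    """
    if move_ok:
        return "library autoeject is enabled"
    if move_ok_after_offline is None:
        return ("inconclusive: physical ejection issue, external contention, "
                "or library autoeject not enabled; retry after a manual offline")
    if move_ok_after_offline:
        return "library autoeject is not enabled"
    # Inference, not stated in the article: the manual eject did not help,
    # so the remaining candidates from step 3 are the likelier causes.
    return "check for a physical ejection issue or external contention"


if __name__ == "__main__":
    print(interpret_autoeject_test(False, True))
```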

Autoeject Conflicts

  • If NetWorker autoeject is enabled but the library's is not, the tape will not be ejected from the device during an unmount, and unmounts will reliably fail.
  • If NetWorker autoeject is disabled but the library's autoeject is enabled, both NetWorker and the library will attempt to eject the volume from the device simultaneously, which may lead to random I/O or other errors registered against the NetWorker device instance (ultimately leading to it being disabled).
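The two conflict cases, plus the two matching cases, can be written out as a lookup. This is just a restatement of the bullets above in Python; the function name and the returned strings are made up for the example.

```python
def autoeject_outcome(library_autoeject: bool, nw_autoeject: bool) -> str:
    """Describe the expected result for each combination of the two settings."""
    if library_autoeject == nw_autoeject:
        # Both enabled or both disabled: the settings agree.
        return "settings agree"
    if nw_autoeject:
        # NetWorker sends no eject, and the library does not eject on move.
        return "unmounts fail: tape is never ejected from the drive"
    # NetWorker sends an eject while the library also ejects on move.
    return "possible random I/O errors: both sides attempt the eject"


if __name__ == "__main__":
    for lib in (True, False):
        for nw in (True, False):
            print(f"library={lib} networker={nw}: {autoeject_outcome(lib, nw)}")
```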

Unmount Timing and Load Problems

  • For a VTL, Eject and Unload Sleep values of 10 seconds are appropriate.
  • For a PTL, Eject and Unload Sleep values of 30 seconds are a good minimum.
  • Neither will lead to excessive overhead, and setting too low may cause timing issues.
  • Idle Device Timeout will lead to inactive tapes being dismounted; when running scanner against a volume, this value should be set to 0 (indefinite) as NetWorker does not track scanner activity (and thus considers the device 'Idle' and will unmount it during the scan).
  • MGD_SHARED_DEV_VOL_SWAP: Setting this variable to no prevents NetWorker from attempting to logically unmount a device instance and remount on another when the physical drive and volume are the same (the feature that forgoes the physical unmount/remount of the same volume to the same device). Disabling it has been shown to alleviate problems associated with this feature.
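As a quick sanity check on the timer guidance above, something like the following could be used to review a jukebox's values. It is a sketch under assumptions: the function and its parameters are hypothetical, the 10 and 30 second figures come from the bullets above, and treating the VTL figure as a lower bound is my reading rather than something the article states.

```python
from typing import List

# Figures from the guidance above; using them as lower bounds is an assumption.
MIN_SLEEP_SECONDS = {"VTL": 10, "PTL": 30}


def check_jukebox_timers(library_type: str, eject_sleep: int, unload_sleep: int,
                         idle_device_timeout: int, running_scanner: bool = False) -> List[str]:
    """Return a list of warnings for timer values that look too aggressive."""
    warnings: List[str] = []
    minimum = MIN_SLEEP_SECONDS[library_type]
    if eject_sleep < minimum:
        warnings.append(f"Eject Sleep {eject_sleep}s is below {minimum}s for a {library_type}")
    if unload_sleep < minimum:
        warnings.append(f"Unload Sleep {unload_sleep}s is below {minimum}s for a {library_type}")
    if running_scanner and idle_device_timeout != 0:
        # scanner activity is not tracked, so the drive looks idle and may be unmounted.
        warnings.append("Idle Device Timeout should be 0 while scanner is running")
    return warnings


if __name__ == "__main__":
    for line in check_jukebox_timers("PTL", 10, 30, 10, running_scanner=True):
        print(line)
```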

Please let us know whether this information helps. If not, I will see what else I can find. It would be helpful to know what the library is and what type of tapes you are using. It may not seem so, but this can make a difference.

Thank you,

Mark
