2 Intern


181 Posts

April 11th, 2006 01:00

I think it is the media, because I tried again with another cartridge and it is working.

4 Operator


14.4K Posts

April 11th, 2006 01:00

Both could be correct (bad media and a SCSI reset, even though the above looks just like a typical SCSI reset issue). It could also be a dirty drive, but this is less likely.

4 Operator


14.4K Posts

April 11th, 2006 02:00

Still, it could be a SCSI reset, and perhaps the drive was cleaned automatically in the meantime. Unless some important and valid data is on the tape, I would suggest relabeling the tape used during the error and testing it in another drive (run some backup against that tape).

142 Posts

May 21st, 2014 03:00

Hi,

I have an ongoing issue where tapes are being marked full prematurely. This is recurring and not restricted to any set of save sets or pools. The daemon logs generally show it as "tape marked full" and nothing else; most of the time I have not found any reason in the logs. I am using NetWorker (nsr) server 8.0.2.4 on Windows 2008 R2, and the library is attached via SCSI to the nsr server. The drives are clean, and no errors or notifications were received in the tape library GUI console. I had some of the tapes checked by the vendor, who reported that there is no issue with them; all the tested tapes were found in good health.

What else remains to be checked? Both the tape library and the tapes have been checked by the respective vendors and nothing wrong was found. What could be causing this premature marking of tapes as full?

The last time it happened, I saw a couple of lines in the logs but could not find anything conclusive about them online.
tape.jpg

In some cases I recycle the tapes again; sometimes the clones then finish, and other times the tapes become prematurely full again.

Regards

tech88kur

2.4K Posts

May 21st, 2014 06:00

There could be a lot of issues but where to start ...

  - Especially look at the cabling and termination

  - Make sure that you have the latest drivers (SCSI, tape drive) installed

  - Verify that the correct media type has been selected

Try to isolate the problem:

  - Try to find out whether this issue only occurs when the server is really busy.

  - Switch the drive's CDI characteristics (default: SCSI commands) to 'not used'.

  - Remove the drive from the jukebox configuration and check that the drive by itself runs fine.

       Either use NetWorker routines like tapeexer.exe,

       or, if you are in doubt, use another backup SW to test.

  - If the issue can be repeated, add verbosity to the command or let it run in debug mode.

142 Posts

May 26th, 2014 13:00

Thanks, bingo, for writing.

There could be a lot of issues but where to start ...

  - Especially look at the cabling and termination

  - Make sure that you have the latest drivers (SCSI, tape drive) installed

  - Verify that the correct media type has been selected

  • The cabling was all right.
  • The drivers and firmware are the latest ones.
  • The correct media type was selected.

Try to isolate the problem:

  - Try to find out whether this issue only occurs when the server is really busy.

  - Switch the drive's CDI characteristics (default: SCSI commands) to 'not used'.

  - Remove the drive from the jukebox configuration and check that the drive by itself runs fine.

       Either use NetWorker routines like tapeexer.exe

       If you are in doubt use another backup SW to test.

  - If the issue can be repeated, add verbosity to the command or let it run in debug mode.

  • The issue happens at any time; we found no association with how busy the server is.
  • I did not check the 2nd, 3rd and 4th ones. Instead, by trial and error, we tried deleting the recyclable volumes before re-using them, and that has really helped. We found that only the recyclable tapes were getting prematurely full; if we deleted their catalog entries and re-labeled them, they worked absolutely fine. In the last seven days we deleted and re-labeled 7 LTO5 tapes, and all of them accepted data up to their maximum size. We also used one recyclable tape as it was, and it got prematurely full.

This is unexpected but I have taken that as a workaround.

Regards

tech88kur

2.4K Posts

May 27th, 2014 03:00

Thanks for sharing the information.

This sounds absolutely strange. We also use LTO5s and never experienced such problems. Unfortunately we never used NW 8.0.2.


The weird thing is that it sounds like NW would somehow read the data block before overwriting it. But this is technically impossible. The only situation where NW reads the tape during a write is when it positions to the logical end-of-tape (LEOL) to append new data.

I do not deny what you have experienced - I just cannot explain how this could happen.

Also, I doubt that you use SCSI, as I think there is no LTO5 drive available with SCSI at all - I think yours has a SAS interface (which might need other drivers). But of course I do not know all brands.

The other issue could be that you have not set the correct block size and/or that something limits the block size. This is especially true for drives with SCSI/SAS interfaces. Use scanner and verify that as follows:

C:\>scanner -m \\.\Tape0
8909:scanner: using '\\.\Tape0' as the device name
93507:scanner: Cannot set \\.\Tape0 block size to 262144 bytes. Maximum size configurable is 65536 bytes.
Please modify your SCSI configuration to allow transfers of at least 262144 bytes(search for the MaximumSGList registry parameter)
93504:scanner: Cannot set \\.\Tape0 block size: 262144 bytes is outside the range 32768-65536
8936:scanner: scanning LTO Ultrium-3 tape Test.001 on \\.\Tape0
32350:scanner: adding LTO Ultrium-3 tape Test.001 to pool Default
93504:scanner: Cannot set \\.\Tape0 block size: 262144 bytes is outside the range 32768-65536
8770:scanner: fn 2 rn 0 read error The handle is invalid.

93504:scanner: Cannot set \\.\Tape0 block size: 262144 bytes is outside the range 32768-65536
8761:scanner: done with LTO Ultrium-3 tape Test.001

84255:scanner: tape_rewind rewind failed: The handle is invalid.
(6)

C:\>

This could occur if you upgraded your hardware and only installed new tape drives along with their drivers.

On the other hand, this should not be a general problem - with smaller blocks you just cannot achieve the maximum capacity.

However, in case of a hardware upgrade I would never use the old tapes for new backups - even if LTO5 drives support writing to LTO4 tapes.
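If you capture the scanner output to a file, a small script can pick out the block size limitation for you. The following is only a minimal sketch in Python: the sample lines are copied from the output above, while the function names are made up for illustration. The `sg_list_for` helper relies on the commonly cited rule of thumb that the maximum transfer size is (MaximumSGList - 1) x 4 KB; treat that as an assumption and check your HBA vendor's documentation before touching the registry.

```python
import re

# Sample lines copied from the scanner output quoted in this post.
SCANNER_OUTPUT = r"""
93507:scanner: Cannot set \\.\Tape0 block size to 262144 bytes. Maximum size configurable is 65536 bytes.
93504:scanner: Cannot set \\.\Tape0 block size: 262144 bytes is outside the range 32768-65536
8936:scanner: scanning LTO Ultrium-3 tape Test.001 on \\.\Tape0
"""

# Matches only the "Maximum size configurable" message.
LIMIT_RE = re.compile(
    r"Cannot set (?P<device>\S+) block size to (?P<wanted>\d+) bytes\. "
    r"Maximum size configurable is (?P<limit>\d+) bytes"
)


def find_block_size_limits(text):
    """Return (device, wanted_bytes, limit_bytes) for each limit message."""
    return [
        (m.group("device"), int(m.group("wanted")), int(m.group("limit")))
        for m in LIMIT_RE.finditer(text)
    ]


def sg_list_for(block_bytes, page_bytes=4096):
    """Assumption: max transfer = (MaximumSGList - 1) * page size."""
    return block_bytes // page_bytes + 1


limits = find_block_size_limits(SCANNER_OUTPUT)
for device, wanted, limit in limits:
    print(f"{device}: wants {wanted} bytes, limited to {limit} bytes")
    print(f"  MaximumSGList needed under the assumed rule: {sg_list_for(wanted)}")
```

Under that assumed rule, the 65536-byte limit in the output corresponds to a value of 17, and 262144 bytes would need 65.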

142 Posts

May 28th, 2014 08:00

bingo,

Just when I thought deleting the catalog entries was a workaround, I got a tape which became prematurely full even in this case.

I noticed that the clone session was in a long waiting state before I got the messages below. The logs are not very conclusive.

tmp2.jpg

1) I disabled CDI and changed it to none

2) I tried scanner -m; please find the output below:

tmp3.png

I am not good at using scanner. I still have to check whether the options can be changed to get a result similar to what you showed.

I am also unfamiliar with NetWorker routines like tapeexer.exe; I have yet to read about them. Meanwhile, I thought of posting this here in case it gives you any ideas. Please remember that 1) the issue is only with recyclable tapes and 2) the drives and firmware are properly updated.

Edit: I also noticed that when a tape gets prematurely full, the corresponding save set shows as aborted and does not complete even if we insert a new volume for the pool. Other save sets do clone. Also, this happens for save sets which are larger than 50 GB.

Regards

tech88kur

2.4K Posts

May 28th, 2014 09:00

The information I showed you has been created with a current NW Version (8.1.1.4). To make it appear in 8.0.2.4 you could try to add verbosity (scanner -mv ...    or     scanner -mvv ...). AFAIR the details have been added around NW 7.5.x.

The other behavior you see is correct:

If you abort a save set, it has to be restarted completely. If you back up to tape, a new SSID will be assigned, so the new backup can use the same media.

However, if this happens for a clone, NW will use another tape, as the same SSID is already used on the previous tape (although it is incomplete). One of NW's core rules is that there can only be 1 entry for a save set on any media. And the clone will be restarted from the beginning of the SS.

Before you recycle a tape, go ahead and delete the volume from the media db (nsrmm -d volume_name). I remember another thread this week where the user stated that this helped. Although I still cannot explain why, you may want to try that.

4 Operator


14.4K Posts

May 28th, 2014 12:00

Did you check error counts on switch ports?

142 Posts

May 28th, 2014 12:00

Yes, I am deleting the tape before re-using it (though I am not using nsrmm but the GUI delete option, where I delete both the media db and file index entries, if any). It seemed to work well, but not any more. The biggest problem here is keeping track of the aborted save sets, or of the save sets which were not cloned. That has to be done either by pulling mminfo output into a CSV file or by using the GUI to check for aborted save sets on tape (it is not feasible to do this daily). Together with the excess number of tapes we are using, this is making things very difficult.

My problem here is that I do not know what is at fault: the tapes, the library, or NW. I have had the tapes analyzed (no error was reported in multiple reads and writes on the tapes, and the vendor came back telling us to check our backup software), there is no error in the library notification bar/history, and the NW logs are not conclusive.

Regards

tech88kur

2.4K Posts

May 28th, 2014 17:00

You had better get used to running mminfo from the command line, which gives you many more possibilities to use a more precise query and to get a better report. Due to the huge number of options it seems complicated, but once you have learned the general usage, mminfo from the CLI is really helpful:

  - You know exactly what you do (see below)

  - If you run a manipulation in the next step you can easily verify the result (rerun the command as it is still stored in the buffer).

  - You do not need to switch windows.

  - You can use the command for scripting, if necessary.

For example, you could check for aborted/incomplete save sets or the number of copies/validcopies.
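As a rough illustration of the scripting idea, here is a minimal Python sketch that lists save sets with fewer copies than expected from an mminfo report saved as CSV. Everything about the input is an assumption: the column names (`ssid`, `name`, `copies`), the sample rows and the function name are made up, and how you produce the CSV on your server is up to you.

```python
import csv
import io

# Hypothetical export: the column layout and the rows are invented for
# illustration and do not come from a real media database.
SAMPLE_REPORT = """ssid,name,copies
1001,/data,2
1002,/home,1
1003,index:client1,2
"""


def save_sets_missing_copies(csv_text, wanted_copies=2):
    """Return the rows whose 'copies' value is below wanted_copies."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if int(row["copies"]) < wanted_copies]


missing = save_sets_missing_copies(SAMPLE_REPORT)
for row in missing:
    print(f"ssid {row['ssid']} ({row['name']}) has only {row['copies']} copy")
```

Run daily against a fresh report, something like this could replace the manual check in the GUI, provided the assumed columns match what you actually export.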

-------------

Looking more closely through your scanner output, I am really confused. Look at the SSIDs - it says (more or less):

  scanning ssid 3514822901

     No media in drive

  scanning ssid 3565154534

So why should NW switch save sets in the middle of a process ? - This almost looks to me as if the tape has been unloaded during the process and it has been replaced by another one.

Is this possible? - at least the unload is possible at any time, for example by a SCSI bus reset.

However, this is also possible due to NetWorker! - It will occur if ...

  - The jukebox' 'Idle device timeout' is not set to 0 (the default value)

  - AND the tape has been mounted before scanning. You must load but not mount the tape.

    You can achieve that from either the jukebox GUI ...

exaple.jpg

   or from the command line: "nsrjb -ln ..."

-------------------

For the next steps:

  - run scanner only if no other jobs will use the jukebox. Use a standalone drive, if possible.

  - use "scanner -n -mvv \\.\Tape0". This will not update the info in the media index and it will provide more info and hopefully tell you a bit more when the problem re-occurs.

  - If it does, verify to which volumes the SSIDs belong using 'mminfo -q "ssid=<ssid>" -r "ssid,volume"'. If they are different, the tape must have been swapped.

  - If so, on the next try, you can hopefully look inside the jukebox (most likely on large ones) and verify that this is really the case.
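The SSID-to-volume comparison from the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the report text is a made-up two-column "ssid volume" listing, the volume names are invented (only the two SSIDs come from your scanner output), and it assumes each save set lives on a single volume, which is not true for save sets that span tapes or have clones.

```python
# Hypothetical two-column report; the volume names A00001/A00002 are invented.
REPORT = """\
 ssid       volume
3514822901  A00001
3565154534  A00002
"""


def volumes_by_ssid(report_text):
    """Map each SSID to the set of volumes listed for it."""
    mapping = {}
    for line in report_text.splitlines():
        fields = line.split()
        # Skip the header and blank lines: a data row starts with a numeric SSID.
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        mapping.setdefault(fields[0], set()).add(fields[1])
    return mapping


def tape_was_swapped(mapping, ssids):
    """True if the given SSIDs resolve to more than one volume."""
    volumes = set()
    for ssid in ssids:
        volumes |= mapping.get(ssid, set())
    return len(volumes) > 1


mapping = volumes_by_ssid(REPORT)
swapped = tape_was_swapped(mapping, ["3514822901", "3565154534"])
print("different volumes, tape was probably swapped" if swapped else "same volume")
```

If both SSIDs resolve to the same volume, the swap theory is ruled out and the SCSI reset becomes the more likely suspect.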
