Both could be correct (bad media or a SCSI reset, although the above looks just like a typical SCSI reset issue). It could also be a dirty drive, but this is less likely.
Still, it could be a SCSI reset, and perhaps the drive was cleaned automatically in the meantime. Unless some important and valid data is on the tape, I would suggest relabeling the tape used during the error and testing it in another drive (run some backup against that tape).
Cabling and termination were checked and are fine.
The drivers and firmware are the latest versions.
The correct media type was selected.
The issue happens at any time; I found no association with how busy the server is.
I did not try the second, third and fourth isolation steps (CDI setting, standalone drive test, debug mode). Instead, by trial and error, we tried deleting the recyclable volumes before re-using them, and that has really helped. We found that only the recyclable tapes were becoming full prematurely; if we deleted their catalog entries and re-labeled them, they worked absolutely fine. In the last seven days we deleted and re-labeled 7 LTO5 tapes, and all of them accepted data up to their maximum size. We also used one recyclable tape as it was, and it became full prematurely.
razvan2
2 Intern
•
181 Posts
0
April 11th, 2006 01:00
ble1
4 Operator
•
14.4K Posts
1
April 11th, 2006 01:00
ble1
4 Operator
•
14.4K Posts
1
April 11th, 2006 02:00
tech88kur
142 Posts
0
May 21st, 2014 03:00
Hi,
I have an ongoing issue where tapes are being marked full prematurely. This is recurring and not restricted to any set of save sets or pools. The daemon logs generally show it as "tape marked full" and nothing else; most of the time I have not found any reason in the logs. I am using nsr server 8.0.2.4 on Windows 2008 R2, and the library is attached via SCSI to the nsr server. The drives are clean, and no errors or notifications were received in the tape library GUI console. I had some of the tapes checked by the vendor, who reported that there is no issue with them; all the tested tapes were found in good health.
What else remains to be checked? Both the tape library and the tapes have been checked by the respective vendors and nothing wrong was found. What could cause tapes to be marked full prematurely?
The last time it happened, I saw a couple of lines in the logs but could not find anything conclusive online.

In some cases I recycle the tapes again; sometimes the clones then finish, and other times the tapes become full prematurely again.
Regards
tech88kur
bingo.1
2.4K Posts
2
May 21st, 2014 06:00
There could be a lot of issues but where to start ...
- Especially look at the cabling and termination
- Make sure that you have the latest drivers (SCSI, tape drive) installed
- Verify that the correct media type has been selected
Try to isolate the problem:
- Try to find out whether this issue only occurs when the server is really busy.
- Switch the drive's CDI characteristics (default: SCSI commands) to 'not used'.
- Remove the drive from the jukebox configuration and check that the drive by itself runs fine.
Either use NetWorker routines like tapeexer.exe
If you are in doubt use another backup SW to test.
- If the issue can be repeated, add verbosity to the command or let it run in debug mode.
tech88kur
142 Posts
0
May 26th, 2014 13:00
thanks bingo for writing
This is unexpected but I have taken that as a workaround.
Regards
tech88kur
bingo.1
2.4K Posts
1
May 27th, 2014 03:00
Thanks for sharing the information.
This sounds absolutely strange. We also use LTO5s and never experienced such problems. Unfortunately we never used NW 8.0.2.
The weird thing is that it sounds as if NW somehow reads the data block before overwriting it. But this is technically impossible. The only situation where NW reads the tape during a write is when it positions to the logical end-of-tape (LEOL) to append new data.
I do not deny what you have experienced - I just cannot explain how this could happen.
Also, I doubt that you use SCSI, as I think there is no LTO5 drive available with SCSI at all - I think yours has a SAS interface (which might need other drivers). But of course I do not know all brands.
The other issue could be that you have not set the correct block size and/or that something limits the block size. This is especially true for drives with SCSI/SAS interfaces. Use scanner and verify that as follows:
C:\>scanner -m \\.\Tape0
8909:scanner: using '\\.\Tape0' as the device name
93507:scanner: Cannot set \\.\Tape0 block size to 262144 bytes. Maximum size configurable is 65536 bytes.
Please modify your SCSI configuration to allow transfers of at least 262144 bytes(search for the MaximumSGList registry parameter)
93504:scanner: Cannot set \\.\Tape0 block size: 262144 bytes is outside the range 32768-65536
8936:scanner: scanning LTO Ultrium-3 tape Test.001 on \\.\Tape0
32350:scanner: adding LTO Ultrium-3 tape Test.001 to pool Default
93504:scanner: Cannot set \\.\Tape0 block size: 262144 bytes is outside the range 32768-65536
8770:scanner: fn 2 rn 0 read error The handle is invalid.
93504:scanner: Cannot set \\.\Tape0 block size: 262144 bytes is outside the range 32768-65536
8761:scanner: done with LTO Ultrium-3 tape Test.001
84255:scanner: tape_rewind rewind failed: The handle is invalid. (6)
C:\>
This can occur if you upgraded your hardware and only installed new tape drives along with their drivers.
On the other hand, this should not be a general problem - with smaller blocks you just cannot achieve the maximum capacity.
However, in case of a hardware upgrade I would never use the old tapes for new backups - even if LTO5 drives support writing to LTO4 tapes.
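As a rough aid for reading such scanner output, here is a minimal Python sketch that pulls the block-size limit out of the 93507 message quoted above and estimates the MaximumSGList value the message asks you to look for. It assumes the commonly documented relationship "maximum transfer = (MaximumSGList - 1) * 4 KB page size"; please verify that against your HBA/driver documentation before touching the registry. The function names are hypothetical and not part of NetWorker.

```python
import re

# Pattern for the scanner limit message quoted above (message 93507).
LIMIT_RE = re.compile(
    r"Cannot set (?P<device>\S+) block size to (?P<wanted>\d+) bytes\. "
    r"Maximum size configurable is (?P<limit>\d+) bytes"
)

PAGE_SIZE = 4096  # assumption: 4 KB memory pages (x86/x64 Windows)


def find_block_size_limits(scanner_output):
    """Return (device, wanted_bytes, limit_bytes) for every limit message."""
    return [
        (m.group("device"), int(m.group("wanted")), int(m.group("limit")))
        for m in LIMIT_RE.finditer(scanner_output)
    ]


def min_sg_list(block_size):
    """Smallest MaximumSGList that allows block_size bytes per transfer,
    assuming max transfer = (MaximumSGList - 1) * PAGE_SIZE."""
    pages = -(-block_size // PAGE_SIZE)  # ceiling division
    return pages + 1


# One line copied from the scanner output above.
sample = r"""93507:scanner: Cannot set \\.\Tape0 block size to 262144 bytes. Maximum size configurable is 65536 bytes."""

for device, wanted, limit in find_block_size_limits(sample):
    print(device, "wants", wanted, "bytes, limited to", limit, "bytes")
    print("MaximumSGList would need to be at least", min_sg_list(wanted))
```

Under that assumption, the 65536-byte limit in the output corresponds to a value of 17, and the 262144-byte block size would need at least 65.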
tech88kur
142 Posts
0
May 28th, 2014 08:00
bingo,
Just when I thought deleting the catalog entries was a workaround, I got a tape that became full prematurely even in this case.
I noticed that the clone session was in a long waiting state before I got the below. The logs are not very conclusive.
1) I disabled CDI and changed it to none
2) I tried scanner -m; please find the output below
I am not good at using scanner. I still have to check whether the options can be changed to get a result similar to what you showed.
I am also unfamiliar with NetWorker routines like tapeexer.exe. I have yet to read about them. Meanwhile I thought of posting this here in case you can get something out of it. Please remember that 1) the issue is only with recyclable tapes and 2) the drives and firmware are properly updated.
Edit: I also noticed that when a tape becomes full prematurely, the corresponding save set shows as aborted and does not complete even if we insert a new volume for the pool. Other save sets do clone. Also, this happens for save sets larger than 50 GB.
Regards
tech88kur
bingo.1
2.4K Posts
1
May 28th, 2014 09:00
The information I showed you was created with a current NW version (8.1.1.4). To make it appear in 8.0.2.4 you could try to add verbosity (scanner -mv ... or scanner -mvv ...). AFAIR the details were added around NW 7.5.x.
The other behavior you see is correct:
If you abort a save set, it has to be restarted completely. If you back up to tape, a new SSID will be assigned, so the new backup can use the same media.
However, if this happens for a clone, NW will use another tape, as the same SSID is already used on the previous tape (although it is incomplete). One of NW's core rules is that there can only be 1 entry for a save set on any media. And the clone will be restarted from the beginning of the SS.
Before you recycle a tape, go ahead and delete the volume from the media db (nsrmm -d volume_name). I remember another thread this week where the user stated that this helped. Although I still cannot explain why, you may want to try that.
ble1
4 Operator
•
14.4K Posts
1
May 28th, 2014 12:00
Did you check error counts on switch ports?
tech88kur
142 Posts
0
May 28th, 2014 12:00
Yes, I am deleting the tape before re-using it (though I am not using nsrmm but the GUI delete option, where I delete both the media db and file index entries, if any). It seemed to work well, but not any more. The biggest problem here is keeping track of aborted save sets and of save sets that were not cloned. That has to be done either by pulling mminfo output into a CSV file or by using the GUI to check for aborted save sets on tape, which is not feasible to do daily. Together with the excess number of tapes we are using, this is making things very difficult.
My problem is that I do not know what is at fault: tapes, library or NW. I had the tapes analyzed (no error was reported in multiple reads and writes, and the vendor came back saying to check our backup software), there is no error in the library notification bar/history, and the NW logs are not conclusive.
Regards
tech88kur
bingo.1
2.4K Posts
0
May 28th, 2014 17:00
You had better get used to running mminfo from the command line, which gives you many more possibilities to write a precise query and get a better report. Because of the huge number of options it seems complicated, but once you have learned the general usage, mminfo from the CLI is really helpful:
- You know exactly what you do (see below)
- If you run a manipulation in the next step you can easily verify the result (rerun the command as it is still stored in the buffer).
- You do not need to switch windows.
- You can use the command for scripting, if necessary.
For example you could check for aborted/incomplete save set or the number of copies/validcopies.
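To make the daily check less painful, something like the following Python sketch could post-process an mminfo report. It is only a sketch under assumptions: the report is assumed to have been exported as comma-separated text with the columns ssid, volume, name, copies and sumflags (check the mminfo -r and -x options for the exact syntax in your version), an "a" in the flags is assumed to mark an aborted save set, and two copies (original plus one clone) are assumed to be the target. The sample rows are invented for illustration.

```python
import csv
import io


def needs_attention(report_text, expected_copies=2):
    """Return (ssid, volume, reason) for save sets that are aborted
    or have fewer copies than expected."""
    flagged = []
    for row in csv.DictReader(io.StringIO(report_text)):
        aborted = "a" in row["sumflags"]  # assumption: 'a' means aborted
        under_cloned = int(row["copies"]) < expected_copies
        if aborted:
            flagged.append((row["ssid"], row["volume"], "aborted"))
        elif under_cloned:
            flagged.append((row["ssid"], row["volume"], "missing clone"))
    return flagged


# Hypothetical report contents, not real mminfo output.
sample_report = """ssid,volume,name,copies,sumflags
1001,VOL001,fileserver_D,2,cb
1002,VOL002,fileserver_C,1,cb
1003,VOL003,fileserver_E,2,ca
"""

for ssid, volume, reason in needs_attention(sample_report):
    print(ssid, volume, reason)
```

Run from a scheduled task, a script like this would reduce the daily check to reading a short list instead of browsing the GUI.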
-------------
Looking more closely at your scanner output, I am really confused. Look at the SSIDs - it says (more or less):
scanning ssid 3514822901
No media in drive
scanning ssid 3565154534
So why should NW switch save sets in the middle of a process? This almost looks to me as if the tape was unloaded during the process and replaced by another one.
Is this possible? - at least the unload is possible at any time, for example by a SCSI bus reset.
However, this is also possible due to NetWorker! - It will occur if ...
- The jukebox' 'Idle device timeout' is not set to 0 (the default value)
- AND the tape has been mounted before scanning. You must load but not mount the tape.
You can achieve that from either the jukebox GUI ...
or from the command line: "nsrjb -ln ..."
-------------------
For the next steps:
- run scanner only if no other jobs will use the jukebox. Use a standalone drive, if possible.
- use "scanner -n -mvv \\.\Tape0". This will not update the info in the media index and it will provide more info and hopefully tell you a bit more when the problem re-occurs.
- If it does, verify to which volumes the SSIDs belong using 'mminfo -q "ssid=..." -r "ssid,volume"' (once for each SSID). If they are different, the tape must have been swapped.
- If so, on the next try, you can hopefully look inside the jukebox (most likely on large ones) and verify that this is really the case.
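The volume comparison in the steps above can be sketched in Python as follows. This assumes the mminfo report has been saved as text with one "ssid volume" pair per line and volume names without spaces; the volume names in the sample are hypothetical. Because a save set can span several tapes, the check looks for a volume that all SSIDs have in common instead of simply counting volumes.

```python
def volumes_for_ssids(mminfo_lines):
    """Map each SSID to the set of volumes it is reported on."""
    mapping = {}
    for line in mminfo_lines:
        parts = line.split()
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip header and blank lines
        ssid, volume = parts
        mapping.setdefault(ssid, set()).add(volume)
    return mapping


def share_a_volume(mapping, ssids_seen):
    """True if at least one volume holds all of the given SSIDs."""
    volume_sets = [mapping.get(ssid, set()) for ssid in ssids_seen]
    if not volume_sets:
        return True
    return bool(set.intersection(*volume_sets))


# SSIDs from the scanner output discussed above; volume names are made up.
sample_lines = [
    " ssid        volume",
    "3514822901   VOL001",
    "3565154534   VOL002",
]

mapping = volumes_for_ssids(sample_lines)
if not share_a_volume(mapping, ["3514822901", "3565154534"]):
    print("SSIDs are on different volumes - the tape was probably swapped")
```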