186 Posts
0
82 GB backup cancelled after 162 minutes
Hi Guys,
I am having some problems with backup speeds on NetWorker 7.5.1 and was wondering if anyone has any ideas about how to investigate the cause.
I have just taken a single client out of a group and run it independently to examine the speed. This has been going on for some time now and I am sick of it.
The client I chose belongs to a guy who wanted to make a change, and we agreed to do a backup first, so here is how it goes.
Physical box, 4 processors
1 Gb NIC
C, D & E drives:
C: 48 GB of data
D: 35 GB of data
E: 18 GB of data
VSS boot: 51 MB
VSS Services: 37 GB
VSS Fileset: 555 MB
VSS disk: 2,808 KB
I cancelled the backup after 162 minutes, at which point it was at 88%.
The system is located in the same data centre as the tape library, the storage node and the NetWorker server, all on a 1 Gb copper backbone.
The storage node is connected to the library via 4 fabric connections and offloads from copper to fabric at the storage node.
Could this be a shoe-shining problem?
The drives it is writing to are SDLT 600s and should be able to take 100 MB/sec.
The copper NICs in the storage node are switch-assisted load balanced, so 2 Gb feeding an 8 Gb fabric pipe.
The client I used in this example was hitting 3% utilisation at the NIC, and the processors were doing bugger all.
The storage node was at 0.8% utilisation at the NIC, and the processor was doing bugger all.
Even on the best day, with all drives loaded on the tape library, I see them hit a max of 30 MB per sec per drive.
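For context, a quick back-of-envelope calculation of the effective rate implied by the figures above (assuming the 82 GB from the thread title and 88% done at the 162-minute cancellation, with 1 GB = 1024 MB):

```python
# Back-of-envelope: effective throughput of the cancelled backup.
# Assumes 82 GB total (from the thread title) and 88% completion
# at the 162-minute mark.
done_mb = 0.88 * 82 * 1024   # data actually written, in MB
elapsed_s = 162 * 60         # run time in seconds
rate = done_mb / elapsed_s
print(f"effective rate ~{rate:.1f} MB/s")  # ~7.6 MB/s
```

That is a long way below both the 100 MB/sec drive spec and even the 30 MB/sec best-case figure quoted above.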
DavidHampson
1.1K Posts
0
July 29th, 2009 02:00
Modern drives are less prone to shoe-shining: they should be able to step their speed down when data is arriving slowly, but it is not entirely foolproof!
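To illustrate why shoe-shining hurts so much, here is a deliberately simplified model (the buffer size, streaming speed and backhitch penalty below are illustrative guesses, not SDLT 600 specifications): when data arrives slower than the drive can stream, the drive waits for its buffer to fill, writes it, then pays a reposition penalty before the next burst.

```python
# Hypothetical, simplified shoe-shining model. All numbers are
# illustrative, not measured drive specs.

def effective_throughput(incoming_mb_s, stream_mb_s, reposition_s, buffer_mb):
    """Rough effective write rate (MB/s) of a streaming tape drive."""
    if incoming_mb_s >= stream_mb_s:
        return stream_mb_s  # enough data: the drive streams continuously
    # Otherwise each cycle waits for the buffer to fill at the slow
    # incoming rate, then pays a fixed reposition (backhitch) penalty.
    fill_s = buffer_mb / incoming_mb_s
    return buffer_mb / (fill_s + reposition_s)

# Fed at 10 MB/s, a drive that wants 36 MB/s with a 64 MB buffer and
# a 3 s backhitch effectively writes:
print(f"{effective_throughput(10, 36, 3, 64):.1f} MB/s")  # ~6.8 MB/s
```

The point of the sketch: once the feed rate drops below streaming speed, the effective rate falls *below* even the feed rate, because reposition time is pure loss.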
benzino1
244 Posts
0
July 29th, 2009 03:00
dugans1
186 Posts
0
July 29th, 2009 03:00
I will attempt the FTP test as you suggest and see how that goes. I think it is the network, but I have to build credible evidence first.
I've never seen bigasm but will look it up. I don't think it is server related, because the client is an HP G5 (a very new box) and thus powerful, and my storage nodes and server are HP DL380 G4s.
I thought it might have been the NIC in the storage node offloading the IP packets to the fabric connection within the server itself, but after reading your post I think I will go back a step and re-examine the network. Maybe a packet sniffer might be the way to go.
The thing is, the speeds are consistent across two different data centres in two different geographical locations within the same state.
And consistently bad.
If anyone has any other ideas I would love to hear them, and I'm sure this is not David's last post.
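As an alternative to the FTP test, a raw-TCP probe takes the file-transfer protocol out of the equation entirely. The sketch below (host and port are placeholders; run the sink on one box and the sender on the other) streams zeros to a discarding listener and reports MB/s:

```python
# Minimal raw-TCP throughput probe: a discarding sink plus a timed
# sender. Host/port are placeholders for your own machines.
import socket
import threading
import time

def sink(port, ready):
    """Listen on `port`, accept one connection, discard everything."""
    srv = socket.socket()
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", port))
    srv.listen(1)
    ready.set()  # signal that the listener is up
    conn, _ = srv.accept()
    while conn.recv(65536):  # read until the sender closes
        pass
    conn.close()
    srv.close()

def measure(host, port, total_mb=64):
    """Stream `total_mb` of zeros to host:port; return MB/s."""
    payload = b"\0" * 65536
    sent = 0
    s = socket.create_connection((host, port))
    start = time.perf_counter()
    while sent < total_mb * 1024 * 1024:
        s.sendall(payload)
        sent += len(payload)
    s.shutdown(socket.SHUT_WR)
    s.close()
    elapsed = time.perf_counter() - start
    return sent / 1024 / 1024 / elapsed

if __name__ == "__main__":
    # Loopback self-test; replace with the real storage-node address.
    ready = threading.Event()
    threading.Thread(target=sink, args=(5001, ready), daemon=True).start()
    ready.wait()
    print(f"{measure('127.0.0.1', 5001):.1f} MB/s over loopback")
```

If this probe shows a healthy rate between client and storage node while the backup stays slow, the network is largely exonerated and attention shifts back to the backup path itself.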
dugans1
186 Posts
0
July 29th, 2009 03:00
They are HP E-Series ESL 286 tape libraries.
dgreggs1
86 Posts
0
July 29th, 2009 06:00
dugans1
186 Posts
0
July 29th, 2009 14:00
Yes, I have 6 drives in each library, and this one job was allocated a single instance, so the job was only using one drive.
Target sessions was set to 1 and max sessions was set to 1.
dugans1
186 Posts
0
August 6th, 2009 23:00
Maybe someone has some ideas or suggestions on this too.
OK, I did two tests.
First, I allocated a LUN to my storage node in the same data centre as the library and put two medium-sized files on it, 5 GB each. I then backed up the LUN and got around 91 MB/sec to a single drive.
Next, I deleted the 5 GB files, copied on 5 GB worth of files of up to 100 KB each, and backed them up through the same route. The throughput here was about 10-12 MB/sec. Crappy.
So this is interesting. I have decided to push all my big-file backups through the fabric route and leave all my smaller-file backups on Ethernet.
Now, would I be better off snapshotting my file servers versus traditional backups to try to speed these up? Maybe through VSS, and implement NMM on these boxes.
What are the rest of you doing in relation to file servers? I would love to hear from you.
Thanks, lads and ladies.
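The large-file vs small-file result above fits a simple per-file-overhead model: every file pays a fixed cost (open/close, index entry, metadata) on top of the data transfer. The 0.01 s overhead below is an assumed illustrative figure, not a measured NetWorker cost, but it reproduces the observed gap:

```python
# Why many small files kill throughput: fixed per-file overhead.
# The 0.01 s per-file cost is an assumed, illustrative number.

def backup_throughput_mb_s(file_size_mb, raw_rate_mb_s, per_file_overhead_s):
    """Effective rate when each file pays a fixed setup cost."""
    transfer_s = file_size_mb / raw_rate_mb_s
    return file_size_mb / (transfer_s + per_file_overhead_s)

# 5 GB files at a 91 MB/s raw rate: overhead is negligible.
big = backup_throughput_mb_s(5 * 1024, 91.0, 0.01)
# 100 KB files at the same raw rate: overhead dominates.
small = backup_throughput_mb_s(0.1, 91.0, 0.01)
print(f"large files ~{big:.0f} MB/s, small files ~{small:.0f} MB/s")
# large files ~91 MB/s, small files ~9 MB/s
```

With overhead that dominates per file, the raw pipe speed barely matters, which is why routing small-file backups over fabric instead of Ethernet buys little; snapshot- or image-style backups that avoid per-file handling are the usual answer.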
Hormigo
131 Posts
0
August 7th, 2009 11:00
Do you have a firewall between the NW server, SN or client?
If yes, have you checked NSR_KEEP_ALIVE?
Regards,
Hormigo
ble1
2 Intern
14.3K Posts
0
September 3rd, 2009 07:00
I assume you are referring to NSR_KEEPALIVE_WAIT= ...