kelleg
4 Operator
•
4.5K Posts
0
May 1st, 2009 07:00
1. Are you using the same test with each LUN - same IO size?
2. Are you using Analyzer to capture the data for the test? Is the Archive Interval set to 60 seconds to capture the best data?
3. Have you looked at the disk IOPS for each LUN?
4. Is either LUN using the vault drives?
5. How long does the test run?
glen
kelleg
4 Operator
•
4.5K Posts
0
May 1st, 2009 08:00
Try disabling/enabling the write cache on the new LUN. Also, what is the size of the new LUN and when did you bind it - maybe it's still background zeroing.
glen
alokjain1
44 Posts
0
May 1st, 2009 08:00
What are the numbers at the host level? It may be a good idea to capture the stats at the host level using iostat (Unix) or perfmon (Windows):
- write IOPS per LUN? Hopefully no reads are going on for backups on these LUNs
- service time/response time?
- write block size - compare between the two hosts
- Are they using different SPs for active IO?
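To make the host-level capture concrete, here is a minimal sketch of pulling write IOPS and service time per device out of `iostat -x` output. The column layout varies by platform and sysstat version, so the sample header below (with `w/s` and `await` columns) is an assumption - adjust the parsing to match your host's actual header line.

```python
# Sketch: extract write IOPS and average wait per device from `iostat -x`
# style output. SAMPLE is a hypothetical capture; real column names differ
# between platforms and sysstat versions.

SAMPLE = """\
Device:  rrqm/s wrqm/s   r/s    w/s   rkB/s   wkB/s await  svctm  %util
sdb        0.00  12.40  0.00 580.20    0.00 37132.8  4.10   1.20  69.6
sdc        0.00  10.10  0.00 440.60    0.00 28198.4  9.80   2.10  92.5
"""

def parse_iostat(text):
    lines = text.strip().splitlines()
    header = lines[0].split()
    stats = {}
    for line in lines[1:]:
        fields = line.split()
        # Map column names to values, skipping the "Device:" label column.
        row = dict(zip(header[1:], fields[1:]))
        stats[fields[0]] = {
            "write_iops": float(row["w/s"]),
            "await_ms": float(row["await"]),
        }
    return stats

stats = parse_iostat(SAMPLE)
for dev, s in sorted(stats.items()):
    print(f'{dev}: {s["write_iops"]:.0f} w/s, {s["await_ms"]:.1f} ms await')
```

Capturing a few samples per LUN like this on both hosts makes the write IOPS and service-time comparison in the list above straightforward.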
ironcheflouie
76 Posts
0
May 1st, 2009 08:00
2. Yes, using NaviAnalyzer; it may be at 2 min if I'm not mistaken.
3. Not per disk. But at the LUN level, the old LUN goes up to ~500-600 total IOPS and the new LUN maxes out around ~400+.
4. Nope, no vault here. Both RGs are split between DAEs.
5. 1 hr+
RyanP2
261 Posts
0
May 1st, 2009 10:00
If you're reviewing the analyzer files, take a look at the number of full stripe writes. If the number is high on the 3+1 and low on the 5+1, it means the data is not as sequential as you may think.
Full stripe writes allow for large write IOs on the backend in which only a minimal number of parity updates are needed. If in the 5+1 the data is sequential to a point, but not sequential enough to fill the entire stripe (larger stripe here than the 3+1), then there may be many parity calculations stealing performance from you.
Also, check the drive IOPS like Glen said earlier.
-Ryan
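The full-stripe vs partial-stripe cost above can be sketched with a back-of-envelope model. This assumes a 64 KB stripe element (the CLARiiON default) and a simple read-modify-write for partial stripes; real arrays also do reconstruct writes and cache coalescing, so treat the numbers as illustrative only.

```python
# Rough model of RAID-5 backend disk operations per host write.
# Assumes 64 KB stripe elements; purely illustrative, not a measurement.

def backend_ops(write_kb, data_disks, element_kb=64):
    stripe_kb = data_disks * element_kb
    full, rem = divmod(write_kb, stripe_kb)
    # Full-stripe write: write every data element plus 1 parity write.
    ops = full * (data_disks + 1)
    if rem:
        # Partial-stripe (read-modify-write): each touched element costs a
        # data read+write, plus one parity read+write for the stripe.
        touched = -(-rem // element_kb)   # ceiling division
        ops += 2 * touched + 2
    return ops

for label, data_disks in (("3+1", 3), ("5+1", 5)):
    print(label, "256 KB write ->", backend_ops(256, data_disks), "backend ops")
```

With a 256 KB host write, the 3+1 (192 KB stripe) gets one full-stripe write plus a small remainder, while the same write never fills the 5+1's 320 KB stripe and pays the read-modify-write penalty on every element - exactly the effect described above.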
kelleg
4 Operator
•
4.5K Posts
0
May 1st, 2009 12:00
When you create a new LUN and the disks have been previously used for other LUNs, the new LUN needs to be "zeroed" (filled with zeros to clear all data). This takes place in the background - it is part of the LUN initialization. See the section called "Fastbind" in the Best Practices guide, around page 30.
glen
SKT2
2 Intern
•
1.3K Posts
0
May 1st, 2009 12:00
Take a look at the disk IOPS - that's the real key. If the IO is not exceeding about 180 IOPS, then the response times and queue length at the LUN level should be OK.
ironcheflouie
76 Posts
0
May 1st, 2009 14:00
I can try the disable/enable write cache on the new lun.
The new lun size is 1.3 TB, the old one was 700 GB.
ironcheflouie
76 Posts
0
May 1st, 2009 14:00
No reads, all writes. The old LUN was pushing ~600 IOPS and the new LUN ~450 IOPS, same test.
If I remember correctly, service and response times were fine.
Same SP.
I used IOMeter once before, way back when - maybe a good test.
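The frontend numbers reported in this thread can be bounced against the ~180 IOPS per-drive rule of thumb mentioned earlier. The sketch below brackets the per-disk load between a best case (cache coalesces everything into full-stripe writes) and a worst case (every write pays the RAID-5 small-write penalty of 4 backend ops); the frontend IOPS are the poster's figures, everything else is an assumption.

```python
# Bracket per-disk backend IOPS for a RAID-5 d+1 group, given frontend
# write IOPS. Best case: full-stripe writes (d data + 1 parity op per d
# data elements written). Worst case: 4-op read-modify-write per host IO.
# Illustrative model only.

def per_disk_iops(front_iops, data_disks, full_stripe):
    disks = data_disks + 1                    # RAID-5 d+1 group size
    if full_stripe:
        backend = front_iops * (data_disks + 1) / data_disks
    else:
        backend = front_iops * 4              # RAID-5 small-write penalty
    return backend / disks

for label, d, front in (("3+1 old", 3, 600), ("5+1 new", 5, 450)):
    best = per_disk_iops(front, d, True)
    worst = per_disk_iops(front, d, False)
    print(f"{label}: {best:.0f}-{worst:.0f} IOPS/disk (vs ~180 guideline)")
```

If the new 5+1 LUN is landing near the worst-case end of its range while the old 3+1 coalesces well, the per-disk numbers alone could explain the gap - which is why checking the actual disk IOPS in Analyzer matters.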
ironcheflouie
76 Posts
0
May 1st, 2009 15:00
OK, you got me interested in the number of stripe writes. When I look at my new LUN (assuming I am looking at the correct spot), I see 2543067 stripe crossings (by looking at the properties of the LUN and selecting statistics). The old LUN also has a high number. Does this relate to disk alignment? That's another topic, but VMware states you do not need to align the disk since the VI client will do that for you when you create the datastore (I bet most people are jumping in their seats now).
FYI, all LUNs were presented new.
The guys are basically dumping SQL databases to this LUN, so it should be sequential.
Thanks for all the responses thus far from everyone.
Cheers
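The alignment question above comes down to simple arithmetic, sketched below. It assumes 64 KB stripe elements and the legacy 63-sector (31.5 KB) MBR partition offset; as the poster notes, newer VMware datastores are aligned at creation, so this just illustrates why a misaligned partition inflates crossing counts.

```python
# Sketch: effect of partition offset on element (disk) crossings.
# Assumes 64 KB stripe elements; counts how many sequential 64 KB IOs
# straddle an element boundary for a given starting offset in KB.

ELEMENT_KB = 64

def crossing_pct(start_kb, io_kb, count):
    crossings = 0
    for i in range(count):
        first = (start_kb + i * io_kb) // ELEMENT_KB          # first element touched
        last = (start_kb + (i + 1) * io_kb - 1) // ELEMENT_KB  # last element touched
        if last != first:
            crossings += 1
    return 100.0 * crossings / count

print("aligned:   ", crossing_pct(0, 64, 1000), "% crossings")
print("63-sector: ", crossing_pct(31.5, 64, 1000), "% crossings")
```

An aligned partition keeps element-sized IOs on one disk, while the 63-sector offset makes every one of them hit two disks - doubling the backend work for the same host IO.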
RyanP2
261 Posts
0
May 4th, 2009 07:00
First off, you can open NAR files in Navisphere: go to Tools, Analyzer, Archive, Open. One thing I would do is take a look at Navisphere help. Under Help there is a section called "Analyzing storage-system performance using Analyzer". This has all the performance information on what you are looking at and how to use Analyzer. This is effectively the Navisphere admin guide; they don't produce that document anymore, they just put it into Navisphere help.
As for the stripe crossings, the number you see in the properties screen is historical, so it really doesn't paint the whole picture. Open a NAR and check the disk crossings percentage on a LUN to see the actual data. Disk Crossing % is an "advanced" option, so you need advanced stats enabled: click Tools, Analyzer, Data Logging and check the advanced box. In the Data Logging screen I would also uncheck the box that says "initially check all tree objects".
Per the Navisphere help: Disk Crossing (%) is the percentage of requests that require I/O to at least two disks, compared to the total number of server requests.
-Ryan