Unsolved
4 Posts
0
89511
May 17th, 2010 11:00
Slow Performance with vSphere and MD3000i
Hi,
we got a new MD3000i last week and I had real fun playing with it. I like it when hardware is simple to set up, and with this tutorial it was a piece of cake getting our two ESXi servers up and running against the storage.
But after some initial testing of failover and other stuff, I started benchmarking the whole system, and I found out I only get about 30MByte/s read-only. So I added Jumbo Frames, tried connecting the servers directly via crossover, and enabled and disabled round robin, but the best I could get with HDTune (verified via smCLI and esxtop) was about 50MB/s. That is fine for most servers, but we want to add a virtualized fileserver, and then 50MB/s is not enough.
Out of curiosity I then ran 2 instances of HDTune and the throughput I received nearly doubled. With 4 instances I had the full 110MB/s and the switch was running at 98% load.
Now comes the part where I need help:
How do I get 100MB/s just for one instance of the benchmark (or a filecopy etc.)?
Would be great if someone here could help me out ;)
Thanks in advance!
Chris
david4hand
5 Posts
0
May 17th, 2010 18:00
JOHNADCO
2 Intern
•
847 Posts
0
May 18th, 2010 08:00
I can do some testing as well. We just installed our first vSphere host yesterday.
cpt86
4 Posts
0
May 18th, 2010 09:00
The funny thing is that in the same VM, running against one LUN, I get the performance I want just by starting a second or third benchmark instance.
If it were separate VMs against different LUNs or different controllers I would have a clue, but every monitoring tool tells me I get around 100-110MB/s from one VM over one switch port to the storage, just under some strange conditions. I want these 100MB/s with just one benchmark instance ;)
Edit: IOPS and latency are fine though...
david4hand
5 Posts
0
May 18th, 2010 10:00
I would think that 7x 15K RPM SAS drives in RAID 5 would be able to saturate a 1Gb/sec link. Especially since the ReadyNAS with regular 7200RPM SATA drives comes close.
My main issue with throughput is the speed at which I can back up my data from the SAN. Maybe I need to use something other than GhettoVCB, but at 30-50MB/s it takes 50% longer to back up from the SAN to our NFS server than it did from local storage.
Using HD Tune Pro I get the following IOPS (Random Access test).
Transfer size   IOPS   Access time   Throughput
512 bytes       346    2.9 ms        0.169 MB/s
4 KB            338    3.0 ms        1.321 MB/s
64 KB           253    3.9 ms        15.859 MB/s
1 MB            91     10 ms         91.627 MB/s
Random          130    7.7 ms        66.002 MB/s
cpt86 - how does this compare to what you get?
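As a sanity check on the figures above, throughput should be roughly IOPS times transfer size, and IOPS times access time gives the average number of requests in flight (Little's law). The sketch below recomputes both from the posted numbers. It assumes HD Tune reports MB as 1024 x 1024 bytes, and the function names are made up for illustration.

```python
MIB = 1024 * 1024  # assumption: HD Tune's "MB" is 2**20 bytes

def throughput_mib_s(iops, transfer_bytes):
    """Throughput implied by an IOPS figure at a fixed transfer size."""
    return iops * transfer_bytes / MIB

def implied_queue_depth(iops, latency_ms):
    """Little's law: average requests in flight = rate * time per request."""
    return iops * latency_ms / 1000.0

# (transfer size in bytes, IOPS, access time in ms) as posted above
rows = [
    (512, 346, 2.9),
    (4 * 1024, 338, 3.0),
    (64 * 1024, 253, 3.9),
    (MIB, 91, 10.0),
]

for size, iops, latency_ms in rows:
    print(
        f"{size:>8} B: {throughput_mib_s(iops, size):7.3f} MB/s, "
        f"about {implied_queue_depth(iops, latency_ms):.2f} request(s) in flight"
    )
```

If the in-flight figure stays close to 1 for every row, the test is issuing one request at a time, so each result is bounded by per-request latency rather than by link bandwidth.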
david4hand
5 Posts
0
May 18th, 2010 10:00
My performance seems on par with others. I'm going to try disabling Jumbo Frames again. It seems many people find JF actually hinder performance (but reduce % CPU a bit).
UPDATE: I disabled JF and my performance increased by about 10-15%. CPU usage only went up 2%.
UPDATE 2: With JF disabled I tried enabling Round Robin (which killed performance before). Performance went up another 10% on IOmeter, but dropped on HD Tune Pro.
Kong Yang
180 Posts
0
May 18th, 2010 18:00
Your performance can vary depending on your workload profile, as you have observed with multiple benchmark instances and different benchmarks. Your IOPS and throughput seem low, though. A RAID 5 volume should be able to sustain 150 IOPS per spindle with smaller I/O sizes. Are you using TCP offload? Offload can negatively impact performance.
Next, since you are performing backups, your I/O profile should be large, sequential reads from your MD3000i and sequential writes to your NFS server, so Jumbo Frames should help you. The one caveat with JF is that it must be enabled end-to-end; otherwise it can negatively impact performance. With JF enabled, does the command "vmkping -s 9000" from your ESX server return correctly?
As for RR storage path policy, your trade-off is between path thrashing and having enough IOPs to maximize the bandwidth utilization on your multiple paths. Please refer to:
http://www.delltechcenter.com/page/A+%E2%80%9CMultivendor+Post%E2%80%9D+on+using+iSCSI+with+VMware+vSphere
In particular, Question #3.
david4hand
5 Posts
0
May 20th, 2010 09:00
I could never vmkping with a size of 9000, it maxed out at 8992 I think. JF were set on the NIC, vmknic, vswitch, MD3000i and PC 5424 switches. From what I've experienced though, the headache of JF doesn't justify the performance gain (if any).
Thanks.
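One detail on the vmkping test: the -s value is the ICMP payload size, so the IPv4 and ICMP headers have to fit inside the MTU as well. The sketch below works out the largest payload that fits in one unfragmented frame. The helper names and the "<storage-ip>" placeholder are hypothetical, and the -d (don't fragment) option is assumed to be available in your vmkping build.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header

def max_icmp_payload(mtu):
    """Largest ping payload that fits in a single frame of the given MTU."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES

def vmkping_command(mtu, target):
    """Build a don't-fragment ping sized to exactly fill the MTU."""
    return f"vmkping -d -s {max_icmp_payload(mtu)} {target}"

# "<storage-ip>" is a placeholder for the iSCSI target address
print(vmkping_command(9000, "<storage-ip>"))
print(vmkping_command(1500, "<storage-ip>"))
```

With a 9000-byte MTU that comes to 8972 bytes of payload, so a ping with -s 9000 can only succeed if fragmentation is allowed, and it says little about whether Jumbo Frames work end-to-end.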
JOHNADCO
2 Intern
•
847 Posts
0
May 20th, 2010 10:00
I must not have downloaded the same version of HDTune. The version I have gave better results, but I think that's because we run 14 drives in our disk groups.
I did seem to be able to get better aggregate throughput, but with only one HDTune running I hit about 70MB per second. I am probably not testing what I need to test here. I am going to load IOmeter eventually, but this upgrade to vSphere from 3.5 / 2.5VC is killing me.
david4hand
5 Posts
0
May 20th, 2010 11:00
JOHNADCO
2 Intern
•
847 Posts
0
May 20th, 2010 16:00
cpt86
4 Posts
0
June 2nd, 2010 02:00
You should use an RDM device and IOmeter for the benchmark, with the following settings:
Disk Targets:
Maximum Disk Size: 1000
# of Outstanding I/Os: 16
Access Specification:
100% write
2MB Transfer Request Size
100% Sequential
With these settings I got full speed.
Thanks for your help though!
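The result fits a simple picture: one benchmark instance keeps roughly one request in flight and is held back by round-trip latency, while several outstanding I/Os keep the gigabit link busy. Below is a rough sketch of that picture, using the roughly 50MB/s single-instance and 110MB/s link figures reported earlier in this thread. The linear scaling and the function name are illustrative assumptions, not a measured model.

```python
def estimated_throughput_mb_s(streams, per_stream_mb_s=50.0, link_limit_mb_s=110.0):
    """Toy model: throughput scales with requests in flight until the link is full."""
    if streams < 1:
        raise ValueError("streams must be at least 1")
    return min(streams * per_stream_mb_s, link_limit_mb_s)

# 1, 2 and 4 match the HDTune instance counts tried earlier; 16 matches the IOmeter setting
for n in (1, 2, 4, 16):
    print(f"{n:>2} in flight: ~{estimated_throughput_mb_s(n):.0f} MB/s")
```

In this model anything beyond three requests in flight is capped by the link, which is in line with both the 4-instance HDTune run and the IOmeter run with 16 outstanding I/Os reaching full speed.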
Jeff Sullivan
184 Posts
0
June 2nd, 2010 07:00
koen.vdvelde
14 Posts
0
June 4th, 2010 04:00
Can you please let us know what you mean by "full speed"?
tia,
Koen.
cpt86
4 Posts
0
July 9th, 2010 03:00
About 110MByte/s on the fast array, about 80MBytes/s on the slow array.
brandstockil
2 Posts
0
July 10th, 2010 07:00
Isn't that 30MB/s down to a 32-bit Windows limitation?