19 Posts
0
1221
September 11th, 2010 05:00
Poor MS SQL performance on CX4-120
Hi,
Let me preface this with the following: I don't have a lot of numbers (stats) to throw at you but the following is a good description. Please request further info and I'll do my best. Oh, and I'm not a DBA.
I'm running ESX4 Update 1 on a CX4-120 running FLARE 28. I have a Windows 2008 R2 VM on a RAID 5 LUN (4+1 RAID group). Our customer is running SQL Server 2008 R2 on it. The database is housed on the D drive, which was created with Windows Disk Management. My understanding is that Windows 2008 aligns its disks correctly.
The customer reported inferior performance when using CrystalDiskMark, especially in the 4K read and write tests with a 500 MB test size. He gave me the figures, and I tested and confirmed them. The CX4 would only do about 1 MB/sec on read and 4.5 MB/sec on write. On his Netgear ReadyNAS Pro 3200, he gets 14 MB/sec read and 12 MB/sec write. I'm using 15k rpm FC drives in my RAID group. The Netgear uses, from what I've been able to get out of the customer and by looking at the Netgear site, 12 SATA disks in an XRAID2 setup. From what I can tell, XRAID2 is a hot-expandable, proprietary RAID 5/6 format from Netgear.
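For small-block tests like this, the MB/sec figures are easier to compare as IOPS. A minimal sketch of the conversion follows. It assumes 1 MB = 1,000,000 bytes and a 4K block of 4096 bytes (the benchmark's exact unit convention is an assumption here), and the helper name `mb_per_sec_to_iops` is made up for illustration.

```python
# Convert a small-block throughput figure into approximate IOPS.
# Assumptions (not stated in the post): 1 MB = 1,000,000 bytes, 4K = 4096 bytes.

def mb_per_sec_to_iops(mb_per_sec: float, block_bytes: int = 4096) -> float:
    """Return the I/O operations per second implied by a throughput figure."""
    return mb_per_sec * 1_000_000 / block_bytes


# 4K figures quoted in the post (500 MB test size).
results_mb_per_sec = {
    "CX4-120 read": 1.0,
    "CX4-120 write": 4.5,
    "ReadyNAS read": 14.0,
    "ReadyNAS write": 12.0,
}

for name, mbps in results_mb_per_sec.items():
    print(f"{name}: {mbps} MB/s is roughly {mb_per_sec_to_iops(mbps):.0f} IOPS")
```

Under these assumptions, the CX4-120's 1 MB/sec read figure works out to roughly 244 IOPS at 4K.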
I found this strange, so I moved the VM to a RAID 10 LUN on the CX4-120 and retested. I got the same results. Hmmm. I would have thought RAID 10 on 15k rpm FC disks would blow the SATA out of the water. On sequential and 512K block tests, the CX4-120 did blow the Netgear out of the water.
So, I thought the benchmark was at fault. I tried IOmeter with some tests for 16K and 64K, 100% random, 66% read, which I saw recommended on a SQL Server benchmark site. The CX4-120 killed the Netgear in those tests as well, especially on IOPS.
So, I thought that spindle count must be it. I created a metaLUN across 2 SATA RAID groups, with each RG in a 5+1 RAID 5. I used SATA disks for this metaLUN as I do not have spare FC disks. Striped across all the disks, the 4K test got up to 9 MB/sec on read and write. Better.
I suggested the DBA (customer) load his databases and run some tests. Please note that the setup of the VM for data files and transaction logs does not come into play here as I am using the same VM on our SAN and his NAS. The DBA was disappointed with the performance.
He has done a test with SQLIO and the results are poor compared to the same test done on the VM on the Netgear Pro. I can post the results if need be.
It seems moving to the 12-disk metaLUN still did not give the performance the customer got on the Netgear ReadyNAS.
Specs of the Netgear are: 16 GB RAM and 2 Quad core Xeon CPU.
The CX4: Each SP has 600 MB of cache and Core 2 Duo CPU. (Why can't EMC add more memory?? Cannot wait for FASTCache)
Could the Netgear be caching heavily, and could that be affecting the database results? Or is it spindle count that is winning for him? The Netgear is only running 3 VMs whereas the CX4-120 has 50 VMs running. I know, I know, that is a big difference, but I'm concerned that the CX4-120 is not performing to the best of its abilities. Some ESX hosts connect via iSCSI and others via FC, but that does not affect the results much.
As a side test, I have a home-made NAS of 8 SATA disks in a RAID 5 off an Areca card. The server has 16 GB RAM with 2 x quad-core Xeons. A copy of his VM gives good results in the 4K test. I have yet to run SQLIO on that VM.
Where do I go from here? The customer is about to pull the plug on moving to our architecture, as they think it won't scale if they grow.
Should I ring EMC support for this?
BTW, no, I don't have Navisphere Analyzer. We have not purchased it yet. I've checked the LUNs: they are not trespassed when I run the tests, and the load is spread over both SPs.
Any help / suggestions will be appreciated.
Thanks,
David


dynamox
11 Legend
•
20.4K Posts
•
87.4K Points
1
September 11th, 2010 10:00
David,
I would ask your local EMC account manager to put you in touch with a CLARiiON SPEED guru. They can download the encrypted Navisphere Analyzer files (naz) from your CX4, see what's going on, and make recommendations.
kelleg
6 Operator
•
4.5K Posts
1
September 20th, 2010 14:00
David,
Did you ever get this resolved? Did you check that system cache was enabled for the array and for the LUNs?
15K FC disks should be able to provide at least 12 MB/s per disk, and in a 4+1 that should yield 4 * 12 MB/s = 48 MB/s. If the IO size is 4KB, then you should really be looking at IOs per second - for 15K disks you should be able to get at least 180 IOPS per disk, so a 4+1 gives 4 * 180 = 720 IOPS.
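The rule-of-thumb arithmetic above can be sketched as follows. It uses only the per-disk figures quoted in this reply (12 MB/s and 180 IOPS per 15K disk) and counts the 4 data disks of a 4+1 group. It ignores RAID write penalty and cache effects, and it assumes 1 MB = 1,000,000 bytes; the function names are made up for illustration.

```python
# Rule-of-thumb estimate for a 4+1 RAID 5 group, using the per-disk figures
# quoted in the reply. Assumptions: 4 data disks are counted, RAID write
# penalty and cache effects are ignored, and 1 MB = 1,000,000 bytes.

def raid_group_estimate(data_disks: int, mb_per_disk: float, iops_per_disk: float):
    """Return (aggregate MB/s, aggregate IOPS) for the RAID group."""
    return data_disks * mb_per_disk, data_disks * iops_per_disk


def iops_to_mb_per_sec(iops: float, block_bytes: int = 4096) -> float:
    """Return the throughput implied by an IOPS figure at a given block size."""
    return iops * block_bytes / 1_000_000


group_mbps, group_iops = raid_group_estimate(4, 12, 180)
print(f"Large-block estimate: {group_mbps} MB/s")
print(f"Small-block estimate: {group_iops} IOPS")
print(f"At 4KB that is about {iops_to_mb_per_sec(group_iops):.2f} MB/s")
```

The last line shows why MB/s is a misleading yardstick at 4KB: even the full rule-of-thumb IOPS figure for the group corresponds to only a few MB/s from the disks themselves.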
glen
davow
19 Posts
0
September 25th, 2010 00:00
Hi,
Thanks for the follow-up. I've opened a ticket with EMC and they are looking at my SP Collects before requesting Analyzer files.
Cheers.