December 19th, 2010 11:00

Optimal page size for vSphere ESXi 4.1/VMFS3 use

Unfortunately there is so much online discussion of the block size used within VMDK files that searching for the underlying block size ESXi uses on its VMFS version 3 file system, or the block size it uses when reading and writing to the physical disks, doesn't turn up anything useful.  I'm trying to determine the optimal page size for a Clariion CX4-480 serving nothing but ESXi 4.1 hosts, but to do that I need to know the VMFS3 block size and/or the size of the blocks it reads and writes with.  I figured searching for the optimal RAID stripe size for VMware might help, but that was a dead end too.  Has anyone figured this out, or experimented with it enough to know?

I'll post back what my tests reveal if I end up walking through all the settings.

474 Posts

December 19th, 2010 11:00

ESX/ESXi reads and writes to the physical disk with the same size IOs the guest issues.  So if you are running Windows guests with the default 4 KB NTFS allocation unit, the IOs will be as small as 4 KB.  Similarly, Windows guests with 64 KB NTFS allocation units and SQL databases will generate larger IOs.  The block size in VMFS really only relates to file space allocation: with a 1 MB block size, a VMDK file of 512 KB will take up 1 MB of space on the datastore, but the IO size is independent of that.
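To put the allocation point in concrete terms (a toy sketch in plain Python, nothing VMware-specific, just using the 1 MB block size and 512 KB VMDK from the example above):

```python
# Toy illustration only: VMFS block size governs how much datastore space a
# file occupies, not the IO size the guest issues to the array.

def vmfs_space_used(file_size_bytes, block_size_bytes):
    """Round a file's size up to the next multiple of the VMFS block size."""
    full_blocks, remainder = divmod(file_size_bytes, block_size_bytes)
    blocks = full_blocks + (1 if remainder else 0)
    return blocks * block_size_bytes

KB = 1024
MB = 1024 * KB

print(vmfs_space_used(512 * KB, 1 * MB))   # 1048576: a 512 KB VMDK occupies a full 1 MB block
# The guest's IO size passes through unchanged: a 4 KB NTFS write still
# arrives at the storage as roughly a 4 KB IO, whatever the VMFS block size.
```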

So the answer is: it depends on the guest OS and application.  If you have a very homogeneous environment, where every VM runs the same application, the same filesystem, and the same underlying OS, then setting the page size to match could help.  However, if the guest VMs are a mix of OSs, applications, and filesystems, you are best off leaving the page size at 8 KB.

Hope that helps!

43 Posts

December 19th, 2010 12:00

It's a CX4-480 with four 10 Gb ports active.  On the ESXi side we have PowerPath installed.

The servers are all web servers, so lots of small reads as files are served, small MySQL database queries, and so on.  No 'enterprise applications' or anything big like that, so our workload is probably biased more towards reads than many SAN users who run Oracle, Exchange, etc.

I'm averaging 11,200 random IOPS as reported by the disk benchmark tool.

474 Posts

December 19th, 2010 12:00

Which model of CX4 are you using, and how many 10 Gb iSCSI ports do you have?

What is the intended workload?  Is this a backup-type application (i.e., large sequential IO) or database/transactional (i.e., small-block random IO)?  You note that increasing read cache helped performance, but generally EMC doesn't recommend that.  The reason I ask is that you are measuring in MB/sec, which usually applies to large-block sequential applications.  Most benchmarks for transactional IO use IOPS and response-time measurements instead.
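Just to spell out why (a back-of-the-envelope sketch, not anything measured on your array), bandwidth is simply IOPS times IO size:

```python
# Back-of-the-envelope: bandwidth = IOPS x IO size, which is why MB/sec numbers
# usually point to large-block sequential work, while transactional benchmarks
# report IOPS and response time instead.

def mb_per_sec(iops, io_size_kb):
    return iops * io_size_kb / 1024.0

print(mb_per_sec(11200, 4))    # ~43.8 MB/sec  -- the 11,200 IOPS figure above at 4 KB IOs
print(mb_per_sec(11200, 64))   # ~700 MB/sec   -- the same IOPS rate at 64 KB IOs
```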

If you know what IO sizes the application prefers, you can format the filesystem using similar allocation sizes, which will help reduce IOPS required for a specific workload and increase bandwidth, assuming that's what you need.  

43 Posts

December 19th, 2010 12:00

Ah, that's great news then.  It's probably 99% CentOS 5 guests (and will be CentOS 6 whenever that comes out), so nearly all standard ext3/ext4, which defaults to 4 KB blocks at the partition sizes we use.  I've got enough benchmarks at the current setting, so I'll try dropping to a 4 KB page size to compare.
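If you want to confirm what block size a guest filesystem is actually using before changing the page size, a quick sketch like this (standard-library Python, run inside the guest) will report it; a default ext3/ext4 install should show 4096:

```python
# Quick check inside a Linux guest: report the block size of the filesystem
# backing a given path (expect 4096 on a default ext3/ext4 CentOS install).
import os

def fs_block_size(path="/"):
    st = os.statvfs(path)
    return st.f_frsize or st.f_bsize   # fragment size; fall back to preferred block size

if __name__ == "__main__":
    print(fs_block_size("/"))
```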

Still playing with cache allocation too; my first adjustment, dropping to about 2 GB of mirrored write cache with the rest allocated to read cache on each SP, already produced much better performance for our typical workload.

The best average performance I've been able to achieve so far with extended benchmarking in a CentOS 5 32-bit guest has been 295 MB/sec writes and 567 MB/sec reads, using 10 Gb iSCSI to the CX4 with the guest on a thick VMDK on a thick LUN built on a 15-drive RAID pool.  The guest is using an aligned partition and the paravirtual SCSI adapter.  Those are pretty nice numbers, but I still want to experiment with everything, because finding the ideal combination will mean less wasted performance once we have the cluster fully populated with production guests.
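On the alignment point, the arithmetic is easy to sanity-check; this sketch assumes 512-byte sectors and a 64 KB stripe element size, which is my assumption rather than a value from this thread:

```python
# Sketch: a partition start is aligned for a given stripe element size if its
# byte offset is a multiple of that element size. Assumes 512-byte sectors and
# a 64 KB stripe element (an assumption, not a value quoted in this thread).
SECTOR_BYTES = 512

def is_aligned(start_sector, element_kb=64):
    return (start_sector * SECTOR_BYTES) % (element_kb * 1024) == 0

print(is_aligned(63))    # False -- the old default MSDOS start sector, misaligned
print(is_aligned(128))   # True  -- starts on a 64 KB boundary
```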

474 Posts

December 19th, 2010 17:00

11,200 IOPS is pretty good.  Sounds like it's mostly small-block random reads.  If you can look at Analyzer, you can see cache hit ratios, IO sizes, etc. at the array level, and that may help you tune.  One thing to check is prefetch usage on your LUNs: if you find low or zero prefetch usage, disabling prefetch for the LUN could reduce backend IO and possibly improve overall performance.  But if prefetch is helping, leave it on.

Sounds like you are using Thin Pools with 15 drives in RAID5? What type of disks?

43 Posts

December 19th, 2010 19:00

Thanks for the info.  I haven't delved into Analyzer yet; that's next on my list.

Overall our CX4 is configured with 120 144 GB 15k RPM FC drives.  I created one RAID 5 pool of 40 drives and one of 15 drives, and the performance was pretty much the same at the guest level, so after that I just carved out more 15-drive pools, figuring most of them could each sit on a single DAE.  That seemed safer to me, given the performance didn't appear to be any different.
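For what it's worth, the raw capacity arithmetic on those pools looks like this (a rough sketch that ignores Clariion overhead such as vault drives, hot spares and pool metadata):

```python
# Rough usable capacity of a RAID 5 group: one drive's worth of space goes to
# parity. Ignores vault drives, hot spares, right-sizing and pool metadata.

def raid5_usable_gb(drive_count, drive_gb):
    return (drive_count - 1) * drive_gb

print(raid5_usable_gb(15, 144))   # 2016 GB per 15-drive pool
print(raid5_usable_gb(40, 144))   # 5616 GB for the 40-drive pool
```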

Since thin provisioning on the vSphere side yielded better performance overall, we're doing thick on the CX4 side and thin on the vSphere side for what will be production, or thick on both sides for guests where disk performance has to be the absolute highest.

190 Posts

December 20th, 2010 08:00

Out of all the tinkering I've done with our VMs, the combination of the paravirtualized SCSI adapter and PowerPath on ESXi gave the biggest bang.  I'm sure there are more layers of the onion to mess with, but using pretty much the defaults plus those two has been more than sufficient for our needs.  Then again, I'm the storage/SAN/network/NAS/ESX/server admin, so I don't have a lot of time to tweak everything.  This arrangement does reduce the number of arguments between the "network" team and the "server" team.

43 Posts

December 20th, 2010 09:00

lol, yeah that sounds like me.  Not sure if you're also doing iSCSI, but if you are, jumbo frames help too on the read side.  Writes were about the same either way for me, but on reads I got about 80 MB/sec more and about 800 more random IOPS for block reads/writes.  We're doing 10 Gb from Cisco UCS to Cisco 4900M switches to the CX4: no routing, a dedicated VLAN, and tagging on both ends.
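The framing math gives a rough sense of why reads benefit (a simplified sketch counting only basic TCP/IPv4 and Ethernet overhead; in practice the bigger win is usually moving far fewer frames, and doing less per-frame processing, per MB transferred):

```python
# Simplified per-frame efficiency for iSCSI over TCP/IPv4 on Ethernet.
# Ignores iSCSI PDU headers, VLAN tags and TCP options; just a rough comparison.

def payload_efficiency(mtu):
    tcp_payload = mtu - 20 - 20        # strip IPv4 and TCP headers from the MTU
    wire_bytes = mtu + 14 + 4          # add Ethernet header and FCS on the wire
    return tcp_payload / float(wire_bytes)

print(round(payload_efficiency(1500), 3))   # ~0.962 with standard frames
print(round(payload_efficiency(9000), 3))   # ~0.994 with jumbo frames
```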

43 Posts

December 20th, 2010 11:00

Got it.  Yeah, we didn't have any FC and were going to have to upgrade our core to 10 Gb anyway, so I figured we'd try out the iSCSI interfaces first to see how well they play with VMware before investing in FC just for this.  So far it's looking pretty good, but we'll see what happens when it goes live and hundreds of machines start beating up on it instead of just me and the benchmark programs.

190 Posts

December 20th, 2010 11:00

I have too much invested in Fibre Channel - and it just works.  If I had to start from scratch I might consider going another route, but the environment is pretty static and the FC gear just keeps trucking along.  I only use the iSCSI ports for MirrorView/A.  I have a friend who keeps telling me that FCoE is the future - but I don't live in the future!  I live in the here and now, and until EMC stops putting FC ports on their Clariions, I don't see much changing.

Dan
