
September 15th, 2016 17:00

Identify as SSD within Linux guest running on VMware?

This question is about optimizing a system. With virtualization, there are several layers of abstraction between the OS and the hardware.

We run a Compellent SC8000 with three tiers of storage: write-intensive SSD, read-intensive SSD, and two speeds of spinning HDDs. I have a Linux guest (well, several actually) virtualized on top of VMware, and I am curious about a few things, such as the I/O scheduler, enabling TRIM, and whether I should identify the drive in the Linux guest as an SSD. This particular guest is set to tier 1 with progression to all tiers. As I understand it, this means that all writes happen on the write-intensive SSDs, but reads may end up coming from any of the drives... Of course, the guest doesn't know any of this.

I wanted to get some opinions from the folks on this forum. Should I use an I/O scheduler that is optimized for SSDs? Should I enable TRIM or run fstrim on my storage volumes? I believe I can manually identify a drive as an SSD in Linux; should I do this?
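For reference, this is roughly how I've been inspecting the relevant settings inside one of the guests (/dev/sdb is just a placeholder device name here):

# Current I/O scheduler; the active one is shown in brackets
cat /sys/block/sdb/queue/scheduler

# Whether the kernel treats the device as rotational (1 = spinning disk, 0 = SSD)
cat /sys/block/sdb/queue/rotational

# Whether the device advertises discard/TRIM support (0 means it does not)
cat /sys/block/sdb/queue/discard_granularity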

This particular host sees mainly reads, probably an 80/20 read/write split. It is mostly random reads of a fairly small request size; the average request size is under 100 KB as reported by iostat.

While many things got easier with virtualization, these fine-grained tuning items get a little more complicated and less straightforward.

I appreciate all the input!

-Eduard Tieseler

5 Practitioner • 274.2K Posts

September 15th, 2016 17:00

Hello,  

As you noted, there are quite a few layers between the VM's OS and the disk.

Since you are running VMware (what version, by the way?), at least the first disk is going to be a VMDK, so running TRIM or fstrim in the guest isn't going to result in a change on the back-end storage: VMFS doesn't pass UNMAP through from the guests. You have to run UNMAP from the ESXi host CLI, and that only reclaims space for files that VMware itself has deleted.
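If it helps, on ESXi 5.5 and later the reclaim is typically run from the host shell with something along these lines (the datastore label is just a placeholder; confirm the exact syntax for your build):

# Reclaim dead space on a VMFS datastore from the ESXi host CLI
esxcli storage vmfs unmap --volume-label=MyDatastore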

Re: I/O scheduler. I'm not sure which scheduler would be optimized for SSDs. Which one were you considering?

Re: identifying the volume as an SSD. Marking a RAIDed volume as an SSD wouldn't have any impact on performance. Plus, not all of your data will stay on SSDs; it will be migrated to the spinning disks over time.

Most importantly, you don't need to worry about it. CML (Compellent) is going to write all new data to RAID 10 on the fastest available tier, so there is very little overhead, and over time read data is moved to a striped configuration for maximum read performance.

In Linux, read performance can be improved by increasing the readahead value.

/sbin/blockdev --getra <device> will show you the current value, typically 256, which is in 512-byte sectors.

/sbin/blockdev --setra <# of sectors> <device> will set it. Note: this doesn't survive a reboot, so add it to a startup script.

E.g.,

/sbin/blockdev --setra 8192 /dev/sdb1 

 I would start with values like 4096/8192/etc.  
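If it helps, a minimal way to make that persist across reboots, assuming a distro that still honours /etc/rc.local (the device and value below are just examples):

# /etc/rc.local (or an equivalent startup script): re-apply readahead on every boot
/sbin/blockdev --setra 8192 /dev/sdb1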

 Regards,

Don 

  

2 Posts

September 16th, 2016 12:00

Thanks Don,

I thought we were on 5.5 U3, but that is only our VDI hosts. These hosts are ESXi 5.5 U1. We will be moving to 6.0 within the next several weeks.

The system is using the CFQ scheduler. I have read that an I/O scheduler like noop is better for flash, since it doesn't need to waste time reordering I/O when head movement is no longer a concern. But again, reads may still come from a spinning disk. I wish I could tell which tier a read was coming from (tier 1, 2, or 3); perhaps 95% is coming from SSD, in which case the scheduler might still make a difference.
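In case anyone follows along later, this is roughly how I plan to test it (the device name is just an example, and the change below does not persist across reboots):

# Show the available schedulers; the active one is in brackets
cat /sys/block/sdb/queue/scheduler

# Switch this device to noop for testing
echo noop > /sys/block/sdb/queue/scheduler

# To make it the default for all devices, add elevator=noop to the kernel boot parameters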

I will review and make adjustments to the READAHEAD value.

I appreciate your input so far.

5 Practitioner • 274.2K Posts

September 16th, 2016 12:00

Hello, 

Re: scheduler. Storage SANs go way beyond the physical drives installed. Writes are acknowledged before they hit the drives, as long as there is cache to hold them. Drives are organized into RAID sets, which greatly increases performance because multiple drives are used at the same time, and I/Os are cached and reordered to make the back-end writes more sequential (a common term for this is "scatter-gather"; individual drives can only do this on a small scale). There's so much going on between the storage controller and the disks that trying to tune for a tier would not yield any improvement. You also wouldn't really want 95% of the reads to come from SSD; that would be an improper use of that resource. You want it available to swallow up new writes.

For many years I've suggested noop for heavy-I/O servers. Improvements in Linux have made the difference between CFQ and noop very small, but it's definitely worth trying.

There are tools in Enterprise Manager to monitor the performance of the SAN. Support can help as well if you believe it's not working correctly.

Numbers like MB/sec are important, but they're not what makes storage fast. Very few applications will ever achieve maximum MB/sec, and this is even more true in virtualized environments like VMware, where many servers are submitting lots of small, random I/Os. The ability to process those quickly is the heart of a SAN.

If you monitor your datastore throughput, I think you'll find this very true: you won't see hundreds of MB/sec. Latency is what users can "feel". Usually a storage device will run out of IOPS capacity long before it maxes out MB/sec. For very high MB/sec you need 100% sequential I/O with large block sizes (64 KB and up), so with a benchmark it's quite easy to demonstrate high MB/sec rates; it doesn't take a lot of hardware to demo that. Start adding mixed reads/writes and randomness with smaller blocks, and watch the MB/sec drop fast. If you have enough cache you can stave that off for a while.
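To put rough, purely illustrative numbers on that relationship: throughput is roughly IOPS times I/O size. For example, 10,000 IOPS at an 8 KB random request size is only about 80 MB/sec, while the same 10,000 IOPS at 64 KB sequential would be about 640 MB/sec. That's why a small-block random workload will saturate the array's IOPS long before the MB/sec figure looks impressive.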

Regards,

Don
