
Re: Ask the Expert: Performance Calculations on Clariion/VNX

Welcome everybody! We don't pretend to know it all, but we like to think that we understand how things work. Feel free to comment if you think something needs to be added to our story and ask questions if you wonder how to do the math.

To start the discussion, I would like to open with: "performance, what exactly is it?"

Performance is how users and admins perceive the responsiveness of applications. One person might think everything is working just fine while another thinks it's very bad. Can we say that I/O response times of 30 ms are bad? No, we can't. "It depends" is a popular answer EMC often uses, and they are right: it depends!

Storage performance in the end always comes down to the physical storage medium, whether it is a rotating disk or an EFD (SSD). The more heavily your storage array is used, the smaller the relative performance improvement you get from the cache. If you have a single LUN of 1 GB, the cache is used for that LUN alone and performance is great. But when you have 200 TB all sliced up into LUNs and you're using them all, that same amount of cache has to serve all of that data. The cache evicts FIFO (first in, first out), so the oldest data is dropped first, and with that many cached LUNs the evicted data might not even be that old at all, which you will notice when you need to access that "old" data again!

I always base my calculations on the physical storage devices alone. The cache improves the outcome, which is always nice, but this way I'm being cautious and staying on the safe side.

The magic word in performance discussions is "IOps". 1 IOps is 1 "input or output operation per second". A rule of thumb EMC uses is that a single disk rotating at 15k RPM can handle 180 IOps, a 10k disk can handle about 140 IOps, a 7200 RPM disk can do 80 and a power-efficient 5400 RPM drive can only do 40 IOps. An EFD (flash drive or SSD) can handle around 2500 IOps. As said, this is a rule of thumb based on small random (4 kB) blocks. When the block size increases, the number of IOps goes down but the throughput in MBps goes up; when the block size decreases, the number of IOps goes up and the MBps goes down.
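
If you like to keep those numbers handy for calculations, here is a minimal sketch in Python; the dictionary name is just an example and the values are simply the rule-of-thumb figures quoted above:

```python
# Rule-of-thumb IOps per drive for small random (4 kB) blocks, as quoted above.
# These are planning figures only; real drives and workloads will vary.
RULE_OF_THUMB_IOPS = {
    "15k RPM": 180,
    "10k RPM": 140,
    "7200 RPM": 80,
    "5400 RPM": 40,
    "EFD (SSD)": 2500,
}
```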

When you combine multiple disks in a raid group (RG) or a pool, you simply multiply these numbers by the number of disks.
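
As a quick sketch of that multiplication (the function name and the ten-disk example are mine):

```python
def raw_group_iops(per_disk_iops: int, disk_count: int) -> int:
    """Raw IOps a raid group or pool can deliver, ignoring cache and RAID overhead."""
    return per_disk_iops * disk_count

# Example: a raid group of ten 15k RPM disks, using the 180 IOps rule of thumb.
print(raw_group_iops(180, 10))   # -> 1800 IOps for small random blocks
```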

But the issue isn't what your array can deliver at its best, but what your applications need! And most of the time the administrators don't know what they need. "I need 1 TB and it needs to be fast!" That sounds familiar, doesn't it? But how fast is "fast"? Asking them how many IOps they need is almost always answered with a puzzled look, as if you suddenly started speaking a foreign language. On a Windows host you can do some measurement using the tool "perfmon". You need to find the number of read and write operations per second, and you should check the I/O size as well. If the I/O size is around 4 kB you can use the rule of thumb provided by EMC; if the block size is larger, you need to use lower numbers. The ratio between read and write IOps is important too, since 1 write I/O will trigger more I/Os on the storage back end, depending on the raid level used for that particular LUN. This handicap for write I/Os is called the "Write Penalty" (WP).
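
To summarize such a measurement, a small sketch like this can help; the numbers in the example are made up, and you would fill in the reads/sec, writes/sec and average transfer size you measured yourself:

```python
def host_io_profile(reads_per_sec: float, writes_per_sec: float, avg_io_bytes: float) -> dict:
    """Summarize a measured host workload: total IOps, read/write mix and I/O size in kB."""
    total = reads_per_sec + writes_per_sec
    read_pct = 100.0 * reads_per_sec / total if total else 0.0
    return {
        "total_iops": total,
        "read_pct": round(read_pct, 1),
        "write_pct": round(100.0 - read_pct, 1),
        "io_size_kB": avg_io_bytes / 1024.0,
    }

# Example: 900 reads/s, 100 writes/s, 4 kB average I/O size.
print(host_io_profile(900, 100, 4096))
```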

The Write Penalty is the mechanism that makes write I/Os more expensive than reads. The random write WPs for the following raid levels are:

  • RAID10 = 2
  • RAID5 = 4
  • RAID6 = 6

There is no such thing as a read penalty: a single read I/O will always be just 1 actual I/O.

Explanation:

  • For RAID10 each write I/O is written to 2 disks, which explains the WP of 2 for that particular RAID level;
  • For RAID5 the mechanism works as follows: 1 random block needs to be written, but this has an impact on the parity as well. So the old data block is read, as is the parity block; the new data is compared to the old data and the change in parity is calculated; then the new data and the new parity are written to disk. That is 2 read and 2 write I/Os on the back end, so 4 in total for this single host write I/O;
  • For RAID6 this works about the same as RAID5, only here 2 parity blocks need to be read and written, so the WP will be 6.
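
Put into a formula, the back-end load for random I/O is simply reads + writes × WP. A minimal sketch of that calculation, ignoring cache:

```python
# Random-write penalties per RAID level, as listed above.
WRITE_PENALTY = {"RAID10": 2, "RAID5": 4, "RAID6": 6}

def backend_iops(read_iops: float, write_iops: float, raid_level: str) -> float:
    """Back-end disk IOps generated by a random host workload (cache ignored)."""
    return read_iops + write_iops * WRITE_PENALTY[raid_level]

# Example: 900 random reads/s and 100 random writes/s on a RAID5 LUN.
print(backend_iops(900, 100, "RAID5"))   # 900 + 100 * 4 -> 1300 back-end IOps
```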

For sequential writes the WP is usually lower than the numbers above, since a whole stripe is written and the parity only changes once. For a RAID5 (4+1) RG the WP for sequential write I/O is 1.25 (25% overhead, because of the 1 parity disk on top of 4 data disks). In an 8+1 RAID5 RG the WP is only 1.125 (12.5% overhead). In RAID10 the WP is still 2, since there is no parity, just true mirroring.
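
For parity RAID you can work that out as (data disks + parity disks) / data disks; a small sketch under that assumption:

```python
def sequential_write_penalty(data_disks: int, parity_disks: int) -> float:
    """Full-stripe (sequential) write penalty for a parity RAID group."""
    return (data_disks + parity_disks) / data_disks

print(sequential_write_penalty(4, 1))   # RAID5 4+1 -> 1.25
print(sequential_write_penalty(8, 1))   # RAID5 8+1 -> 1.125
print(sequential_write_penalty(8, 2))   # RAID6 8+2 -> 1.25
```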

If a host has 90% reads and 10% writes, only 10% of the I/Os will suffer from the WP, but if the host has only 10% reads and 90% writes, 90% of the I/Os will suffer from the WP. Choose your RAID level carefully, since choosing the wrong one can be disastrous for your performance!
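
To tie it all together, here is a hedged sizing sketch that combines the rule-of-thumb disk IOps and the write penalty to estimate how many disks a random workload needs; the function name and the example workload are mine, and cache is deliberately ignored to stay on the safe side:

```python
import math

WRITE_PENALTY = {"RAID10": 2, "RAID5": 4, "RAID6": 6}            # random-write penalties from above
DISK_IOPS = {"15k": 180, "10k": 140, "7.2k": 80, "EFD": 2500}    # rule-of-thumb per-disk IOps

def disks_needed(read_iops: float, write_iops: float, raid_level: str, drive_type: str) -> int:
    """Estimate how many disks a random workload needs, ignoring cache."""
    backend = read_iops + write_iops * WRITE_PENALTY[raid_level]
    return math.ceil(backend / DISK_IOPS[drive_type])

# Example workload: 3500 reads/s and 1500 writes/s (70/30 mix) on 15k RPM disks.
print(disks_needed(3500, 1500, "RAID10", "15k"))  # 6500 back-end IOps  -> 37 disks
print(disks_needed(3500, 1500, "RAID5",  "15k"))  # 9500 back-end IOps  -> 53 disks
print(disks_needed(3500, 1500, "RAID6",  "15k"))  # 12500 back-end IOps -> 70 disks
```

The same workload can need almost twice as many disks on RAID6 as on RAID10, which is exactly the point about choosing your RAID level carefully.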