Unsolved
11 Posts
1
2378
October 23rd, 2014 04:00
VMAX 40K FA Limit
Hi all,
I'm seeing strange behaviour on a VMAX 40K. We have a 4Gb SAN and some ESX hosts; each ESX host has 4 HBAs, and each HBA is zoned to 2 FA ports.
So each LUN presented by the VMAX is viewed in VMware with 8 paths.
On the vSphere cluster we have a VM that we use for I/O stress tests. We are testing a LUN fully pinned to EFD, so we expect very high throughput.
If we use VMware NMP (Native Multipath) we achieve 1200MB/sec (at this stage the I/O pattern the test generates is not important).
If we use EMC PowerPath VE we achieve 1200MB/sec.
If we use the VMware fixed path policy we achieve no more than 150MB/sec.
It seems that we have a 150MB/sec limit per FA port: even in the 1200MB/sec result (with either NMP or PP VE), we are using 8 FA ports, so 8 x 150 = 1200.
With a 4Gb/sec SAN, we expect a single-path throughput of about 400MB/sec; 150MB/sec is too low!
We have a SAN upgrade to 8Gb/sec scheduled, but if we are FA limited, the upgrade will not bring any improvement.
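The arithmetic behind this reasoning can be written out as a small sketch. It uses only figures quoted in this thread (150MB/sec per path, 8 paths, and the rule of thumb of roughly 100MB/sec per Gb of Fibre Channel link speed); the function names are made up for illustration.

```python
# Rule of thumb used in this thread: ~100 MB/s of payload per Gb of FC link speed
# (4Gb -> ~400 MB/s, 8Gb -> ~800 MB/s). This is an approximation, not a measured value.
MB_PER_SEC_PER_GBIT = 100


def link_limit_mb_s(link_gbit):
    """Approximate line-rate throughput of one FC link in MB/s."""
    return link_gbit * MB_PER_SEC_PER_GBIT


def aggregate_mb_s(per_path_mb_s, paths):
    """Total throughput if every path delivers the same rate."""
    return per_path_mb_s * paths


observed_per_path = 150  # MB/s seen with the fixed path policy
paths = 8                # 4 HBAs x 2 FA ports each

total = aggregate_mb_s(observed_per_path, paths)
limit = link_limit_mb_s(4)
utilisation = observed_per_path / limit

print(f"aggregate over {paths} paths: {total} MB/s")
print(f"expected single 4Gb path:   {limit} MB/s")
print(f"single path utilisation:    {utilisation:.1%}")
```

The multipath result matches 8 paths at 150MB/sec each, which is why the single path looks like the limiting factor rather than the link speed.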


Quincy561
1.3K Posts
0
October 23rd, 2014 06:00
The ESX server NMP setting needs to be changed for VMAX. You are probably hitting the IOPS limit on the FA CPU, not the throughput limit of the channel.
See VMware KB article 2072070
http://kb.vmware.com/selfservice/microsites/microsite.do
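For readers who want to see what such a change typically looks like, here is a hedged sketch that only generates the command lines for review; it does not run anything. It assumes the setting in question is the round robin IOPS limit (the original poster later mentions IOPS=1) and uses the `esxcli storage nmp psp roundrobin deviceconfig set` form. The device identifiers are hypothetical placeholders, and the exact procedure should be taken from the KB article itself.

```python
def round_robin_iops_commands(device_ids, iops=1):
    """Build one esxcli command per device to set the round robin IOPS limit.

    device_ids: iterable of device identifiers (hypothetical placeholders here).
    iops: number of IOs sent down a path before switching to the next one.
    """
    template = (
        "esxcli storage nmp psp roundrobin deviceconfig set "
        "--type=iops --iops={iops} --device={device}"
    )
    return [template.format(iops=iops, device=dev) for dev in device_ids]


# Placeholder identifiers, not real devices from this thread.
example_devices = ["naa.EXAMPLE0001", "naa.EXAMPLE0002"]

for cmd in round_robin_iops_commands(example_devices):
    print(cmd)
```

Printing the commands first makes it easy to check the device list before applying anything on a host.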
Quincy561
1.3K Posts
0
October 23rd, 2014 06:00
Also, please let us know what your IO size is. You may be correct that going to 8Gb won't give any more performance if your IO sizes are small (< 64K).
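To illustrate why the IO size matters here, the sketch below converts a throughput figure into the IO rate it implies. The 150MB/sec value is the one reported at the top of the thread; the IO sizes are chosen for illustration, and MB is taken as 10^6 bytes while IO sizes are in KiB, which is an assumption about the units.

```python
def iops_for(throughput_mb_s, io_size_kib):
    """IO operations per second implied by a throughput and an IO size."""
    bytes_per_sec = throughput_mb_s * 1_000_000
    return bytes_per_sec / (io_size_kib * 1024)


for size_kib in (4, 64, 512):
    rate = iops_for(150, size_kib)
    print(f"{size_kib:>3} KiB IOs at 150 MB/s -> about {rate:,.0f} IOPS")
```

The same throughput means tens of thousands of IOs per second at 4K but only a few hundred at 512K, so a CPU-bound IOPS limit shows up mainly with small IOs.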
Quincy561
1.3K Posts
0
October 23rd, 2014 07:00
Ok, so large IO sizes. You may find you get better throughput with IO sizes smaller than 512K blocks, such as 256K or 128K.
The size of the data you are testing will fit entirely in cache, so you are only testing reads and writes to cache, no backend IO.
The change for the NMP should help the throughput in that case.
th3_p03t
11 Posts
0
October 23rd, 2014 07:00
We are using a 512KB block size. We perform sequential reads for 30 seconds on a 16GB test file, then 30 seconds of random reads, 30 seconds of sequential writes and 30 seconds of random writes, always on the same test file.
The results are quite stable and repeatable.
th3_p03t
11 Posts
0
October 23rd, 2014 07:00
Yes, we have just applied KB 2072070.
th3_p03t
11 Posts
0
October 23rd, 2014 08:00
OK, 800MB/sec on an 8Gb link. So on a 4Gb link we should reach 400MB/sec on a single FA port. You are saying that with a single host, on a single device, we can't reach 400MB/sec? It's nonsense.
Quincy561
1.3K Posts
1
October 23rd, 2014 08:00
Different architecture in the VNX than VMAX. We have dedicated CPUs for pairs of ports on the 40K. Also we do one IO at a time for each device. We need more active ports and devices to get full performance. 150MB/sec is about the maximum throughput for a single device on a single FA CPU. I hope your LUN is a striped meta device, although the meta won't help as much for the sequential IO, as it is striped at 960K.
You achieved better performance using the other path management options, so I'm sure the round robin setting is what is slowing the IO for that test.
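A rough way to picture the "one IO at a time per device" point is the sketch below. It is a simplified model, not a description of the array internals: it assumes strictly one outstanding IO per device on the FA CPU and that devices scale linearly, neither of which is confirmed in this thread. The 512K IO size and 150MB/sec figure come from the posts above.

```python
import math


def implied_service_time_ms(io_size_kib, throughput_mb_s):
    """Time per IO if IOs complete strictly one after another."""
    io_bytes = io_size_kib * 1024
    bytes_per_sec = throughput_mb_s * 1_000_000
    return io_bytes / bytes_per_sec * 1000


def devices_needed(target_mb_s, per_device_mb_s):
    """Devices required to reach a target, assuming linear scaling (an assumption)."""
    return math.ceil(target_mb_s / per_device_mb_s)


print(f"512 KiB IOs at 150 MB/s: {implied_service_time_ms(512, 150):.2f} ms per IO")
print(f"devices for 400 MB/s: {devices_needed(400, 150)}")
print(f"devices for 800 MB/s: {devices_needed(800, 150)}")
```

Under these assumptions a single serialized device cannot go faster unless each IO completes sooner, which is why more active devices and ports are suggested.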
Quincy561
1.3K Posts
0
October 23rd, 2014 08:00
BTW, how do you have the active FA CPUs distributed across directors? You should not have more than one port per FA CPU active.
th3_p03t
11 Posts
0
October 23rd, 2014 08:00
The fact that the test file fits entirely in cache is perfect: we are not interested in testing backend IO, we need to know WHY we achieve only 150MB/sec when using a single FA port.
The same test on different storage (far less expensive and powerful than the VMAX) gave twice the throughput (with fixed path).
Quincy561
1.3K Posts
0
October 23rd, 2014 08:00
BTW, since all the data is in cache, where the data lives on the backend is not relevant.
And you may get better throughput for such a test from a lot of FC drives than from EFDs. For sequential large block reads and writes an FC drive is about the same as an EFD per spindle, and chances are you have a lot more FC spindles than EFD spindles.
And again, you may want to try different IO sizes, you may get better throughput with 64K, 128K or 256K.
How many active FA ports total? And I hope none are sharing the same FA CPU.
Quincy561
1.3K Posts
0
October 23rd, 2014 08:00
A single FA port on a 40K can get very close to the line speed, even with 8gb ports, so close to 800MB/sec. However you need more than one Symmetrix device and thread active to get this.
th3_p03t
11 Posts
0
October 23rd, 2014 08:00
I can confirm that we use IOPS=1 on VMware ESX.
Please confirm that 150MB/sec is the upper limit of a single VMAX FA port; this is a very important fact.
Quincy561
1.3K Posts
2
October 23rd, 2014 09:00
Not nonsense. I'm just trying to help by telling you how to get the maximum performance from your VMAX. You can get 400MB/sec from a single LUN, no problem, but it should be a meta volume.
If you want an array to run a single device on a single port, maybe VMAX isn't for you. It is designed to run 10s of thousands of devices on many ports at the same time.
Maybe you should watch my EMC World "Best Practices for Performance" presentation on EMCWorld.com. Go to virtual EMC World, technical sessions, put in any email address (a@b.com) , then browse sessions, then sort by most liked, mine is at the
th3_p03t
11 Posts
0
October 24th, 2014 01:00
Hello Quincy,
I don't want to start a debate, I only want to know how the VMAX works and understand whether we have an issue or this behaviour is by design.
We have done tests with block sizes of 4K, 8K, 64K, 128K and 512K.
We have never gone beyond 150MB/sec per FA port, never.
We use VMware and we have to use RDMs, a lot of RDMs. We have the limit of 1024 paths in vSphere, so the more we can get from a single path the happier we are.
In some cases we have to use fixed path (ESX 5.5 has a bug in supporting multipath with MSCS) and we can't have a huge SQL cluster with its TLOG LUN pumping only 150MB/sec.
We need to find a solution.
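The path budget trade-off mentioned above can be sketched as follows. The 1024-path limit and the 8 paths per LUN are the figures given in this thread; the other path counts are illustrative only.

```python
def max_luns(path_limit, paths_per_lun):
    """How many LUNs fit within the path limit at a given number of paths per LUN."""
    return path_limit // paths_per_lun


PATH_LIMIT = 1024  # vSphere path limit quoted in this thread

for paths_per_lun in (8, 4, 2, 1):
    print(f"{paths_per_lun} paths per LUN -> up to {max_luns(PATH_LIMIT, paths_per_lun)} LUNs")
```

With 8 paths per LUN the budget allows 128 LUNs, so fewer paths per LUN means more RDMs but makes the throughput of each individual path matter more.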
Quincy561
1.3K Posts
1
October 24th, 2014 06:00
OK, there are performance experts in the field; if I were you, I'd start with your local account team. I am also willing to help, but I need more information. STP data would be nice, but a 30 second test won't be long enough to populate a whole interval in STP.
I'd also start with a one port test, and see if you can get your 400MB/sec. I'm not sure what limit you are hitting.
Also if you could send me a private message with your serial #, I could take a look at the bin file, just to see if there is something wrong with the layout.