13 Posts

April 30th, 2022 06:00

EqualLogic and Dell EMC ME4024 Coexistence in VMware

Hi,

We have just bought a new Dell ME4024 SAN. The networking best practices state that you should set up two SAN fabrics across multiple switches, with ports A0, A2, B0 and B2 on one subnet and ports A1, A3, B1 and B3 on another. Then, on the ESXi host, you create two vSwitches with one VMkernel port each, and give each VMkernel port an address on one of the subnets used in the SAN fabric.

The issue we have is that we currently have EqualLogics in production. They do not function like this; they use network port binding with VMkernel ports on the same subnet. When I tried to set up the Dell EMC SAN, I could never get the LUN to show in VMware. I spoke to our Dell partner, who said that it was because of the network port binding. We added the two VMkernel ports for the new Dell EMC ME4024 in the network port binding area and, hey presto, the LUN presented itself. The problem is that this is not a supported config, as network port binding is not supposed to work over different subnets and can cause issues.

I need to find a way for these two SANs to coexist, as we need to get the virtual servers off the older EqualLogics and onto the new EMC SAN. If there is a Dell/EMC engineer here, please could they advise me on whether there is a supported way to configure the new ME4024 SAN in this environment? I read the link below, which suggested that you could actually configure all of the ports on the new SAN on one subnet and use network port binding:

Dell PowerVault ME4 Series Storage System Deployment Guide | Dell UK

So, is it an option to put all of the ports on the SAN on one subnet and then use network port binding?

thanks

Paul

 

4 Operator

1.7K Posts

May 1st, 2022 09:00

Yes, you need to place all ports on the ME4 within the same subnet.

Regards,
Joerg

13 Posts

May 3rd, 2022 07:00

Well, I have now set up the SAN with one vSwitch, two VMkernel ports and network port binding, and put everything on one subnet. I have also gone through the N4000 series switches and turned off unicast storm control, and made sure jumbo frames are enabled throughout the network, vSwitches etc. I can see the storage fine. Anyway, I decided to use my old IOMeter test to see how the ME4024 coped compared to the old EqualLogics. They are not like for like. The EQLs have 24x 10k SAS drives (900GB each) with dual 1GB ports, whereas the ME4024 has 4x 10GB ports on each controller with 24x 10k SAS drives (2.4TB each). Running an IOMeter test where the file is 4GB, the transfer request size is 8KB, the percent random/sequential split is 40%/60% and the percent read/write split is 65%/35% produces some not so good results.

On the new SAN the total throughput is around 10MBps at roughly 1,200 IOPS. On the EqualLogics it was about 25MBps at 3,000 IOPS.
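As a sanity check on these figures, throughput and IOPS should be linked by the transfer request size. The following is a minimal sketch, assuming the 8KB request size from the IOMeter settings and binary units (1 MB = 1024 KB); the function name is purely illustrative.

```python
def throughput_mb_per_s(iops: float, request_size_kb: float = 8.0) -> float:
    """Convert an IOPS figure to MB/s for a fixed transfer request size."""
    return iops * request_size_kb / 1024.0


# IOPS values as reported in the post above.
for label, iops in [("ME4024", 1200), ("EqualLogic", 3000)]:
    mbps = throughput_mb_per_s(iops)
    print(f"{label}: {iops} IOPS at 8KB is about {mbps:.1f} MB/s")
```

Both results land close to the reported 10MBps and 25MBps, which suggests the measurements are internally consistent and that this workload is limited by IOPS rather than by link bandwidth.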

On the new SAN I created an ADAPT RAID volume on Pool A using all 24 drives and all ports. It could well be something to do with the settings I am using in IOMeter, but it does feel like the storage is underperforming.

thanks,


Paul

 

4 Operator

1.7K Posts

May 3rd, 2022 15:00

If you only create one pool you will only utilize one controller, because each pool is assigned to one controller module.
So create 2 pools with 12 disks each and choose RAID 6. Our ME4s are always SSD only.

If it's a vSphere ESXi environment, can you run esxtop, "s2", "d" during the IOMeter run?

Regards,
Joerg

13 Posts

May 4th, 2022 03:00

Hi,

I appreciate the help with this. I set it up as one large volume as I assumed that in our environment (no SSDs) the bottleneck would be the number of spindles rather than the 40GB of bandwidth from the controllers. Anyway, I rebuilt the RAID with two RAID 6 volumes (12 disks each), one in each pool. I then created two datastores, created a VMDK in each and added them to my test VM. The test VM's C drive is on the EqualLogic, with the E drive on LUN0 (Pool A) and the F drive on LUN1 (Pool B). I then ran the IOMeter tests again on all drives separately with the same workload. The EQL results were much, much better, somewhere in the region of 3,500 IOPS and 50MBps. The ME4 returned 6.5MBps and 1,500 IOPS. The EQL is in RAID 50 with 24 disks, whereas obviously the ME4 is on RAID 6 with 12 disks. I would expect somewhere around half the performance in that scenario, I suppose, but 6.5MBps is really bad. We have a new server coming today, so I am going to set it up as per the best practice guide, rebuild everything and then retest. esxtop did not show anything apart from the writes/sec and commands/sec all being a lot lower on the ME4. There were no major issues elsewhere pointing to problems, so the problem appears to be related to SAN performance.
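A rough way to sanity-check the "around half" expectation is the common RAID write-penalty rule of thumb. The sketch below is not a vendor formula. It assumes a write penalty of 4 for RAID 5/50 and 6 for RAID 6, an illustrative 140 IOPS per 10k SAS drive, and the 65%/35% read/write mix from the IOMeter test, and it ignores controller cache entirely.

```python
def estimated_host_iops(disks: int, iops_per_disk: float,
                        read_fraction: float, write_penalty: int) -> float:
    """Rule-of-thumb host IOPS for a RAID set, ignoring controller cache."""
    raw_iops = disks * iops_per_disk
    write_fraction = 1.0 - read_fraction
    # Each host read costs one back-end I/O; each host write costs
    # `write_penalty` back-end I/Os.
    return raw_iops / (read_fraction + write_fraction * write_penalty)


ASSUMED_IOPS_PER_10K_DISK = 140  # assumption for illustration, not measured

eql = estimated_host_iops(24, ASSUMED_IOPS_PER_10K_DISK, 0.65, write_penalty=4)  # RAID 50
me4 = estimated_host_iops(12, ASSUMED_IOPS_PER_10K_DISK, 0.65, write_penalty=6)  # RAID 6

print(f"RAID 50, 24 disks: ~{eql:.0f} IOPS")
print(f"RAID 6, 12 disks:  ~{me4:.0f} IOPS")
print(f"ratio RAID 6 / RAID 50: {me4 / eql:.2f}")
```

Under these assumptions the 12-disk RAID 6 set comes out somewhat below half of the 24-disk RAID 50 set, and the ratio does not depend on the per-disk figure chosen. Arrays with write cache can behave quite differently, so this is a yardstick rather than a prediction.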

thanks

Paul

 

13 Posts

May 4th, 2022 04:00

Thanks again for the help. The jumbo frames are all OK (I have checked through all of the connections, switches etc.). I tried the ping command you sent and it comes back fine, with latency around 0.16ms on all ports from the SAN using both VMkernel ports. Delayed ACK is disabled and Round Robin is set to iops=3. Sequential 100% reads are very fast on the ME4 compared to the EQL: I am getting about 30,000 IOPS and 950MBps on the ME4, against about 25,000 IOPS and 750MBps on the EQL. It's just these random 4KB writes. We will probably run 20 VMs on this SAN (a mixture of servers, usually quite low load, but we do have a couple of bigger SQL servers). It's kind of hard to see where to go from here. We may have to send it back. We cannot go SAS HBA as we have 5 hosts. It will be interesting to see what the new server produces.

Paul

 

4 Operator

1.7K Posts

May 4th, 2022 04:00

Right... I also assumed that the spinning disks would be the bottleneck. Of course, when running only one workload there is only one path and one controller involved. But now half the number of drives gives the same IOPS as the previous test? There must be something wrong.

The reason I ask for esxtop is that in the rightmost columns you should see the LATENCY values. If you are searching for a performance problem you should set the MTU on your ESXi VMKs back to 1500; otherwise double-check from the command line whether

vmkping -d -s 8972 
 

works. If you have multiple VMKs, use -I vmk# to specify the sender interface so that you can test all of your paths.
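For reference, the 8972 in that command is the largest ICMP payload that fits in a single 9000-byte jumbo frame once the headers are subtracted, which is why it is paired with -d (do not fragment). A small sketch of the arithmetic, assuming a plain IPv4 header without options; the function and constant names are illustrative:

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


print(max_ping_payload(9000))  # jumbo frames: 8972
print(max_ping_payload(1500))  # standard MTU: 1472
```

If a ping with the jumbo-sized payload fails while one with the 1472-byte payload succeeds, some hop on that path is probably not passing jumbo frames.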

When trying IOMeter I use a bigger file to overcome the cache within the storage. We are a Dell shop and have EQL, MD3, CMPL and PowerStore in house. But I only have ME4s at a few customer sites, and all of them are SSD only and use the SAS interface. I am deploying an ME5 next week and can paste some numbers... but again SSD only.

A question about your iSCSI settings on the ESXi host: have you specified them on the swISCSI adapter or at the connection/LUN level? I ask because of the DelayedACK setting, which should be disabled of course, and the same goes for ROUND_ROBIN and all the other stuff.

Regards,
Joerg

13 Posts

May 4th, 2022 08:00

OK, so I have built a new ESXi host and set up the SAN and host as per the best practices guide, using two SAN fabrics, no network port binding etc. After building a VM and running the IOMeter tests again, the result is the same. After doing all of this testing, I am fairly confident that the performance is down to the ME4. It does not perform as well as the EqualLogic we have (PS4210). Admittedly, the RAID level is different (50 vs 6) and the number of drives is different (24 vs 12). However, as I have configured the ME4 in different ways, including a 1x24 ADAPT RAID set, and received the same results as with the 12-disk RAID 6 set, I have to presume that the RAID controller in the SAN or the SAN software itself cannot perform as well as the EQL under certain workloads. I have proved as best I can that it is not a network issue, as I have run 100% read tests and the ME4 is very fast. It just seems that it cannot handle random writes as well. The EqualLogics were always good in my opinion. I think it's a real shame they are not making them anymore. My results show that the ME4 performs around 50-70% worse than the EQL under a 4KB random read/random write workload.
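For anyone wanting to check the size of that gap, here is a minimal sketch of the calculation using the IOPS figures posted earlier in this thread. The run labels and the function name are illustrative only.

```python
def shortfall_percent(baseline_iops: float, measured_iops: float) -> float:
    """How far the measured IOPS fall below the baseline, as a percentage."""
    return (baseline_iops - measured_iops) / baseline_iops * 100.0


# (label, EQL IOPS, ME4 IOPS) as reported in the earlier posts.
runs = [
    ("ME4 single ADAPT pool, 24 disks", 3000, 1200),
    ("ME4 RAID 6, 12 disks per pool", 3500, 1500),
]

for label, eql_iops, me4_iops in runs:
    gap = shortfall_percent(eql_iops, me4_iops)
    print(f"{label}: ME4 is {gap:.0f}% below the EQL")
```

Both runs come out at roughly 57-60% below the EqualLogic, which sits inside the 50-70% range quoted above.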
