November 25th, 2016 02:00

ScaleIO Performance Question

Hello,

I am testing ScaleIO and have installed a 3-node cluster on Ubuntu 14.04. The MDMs are installed on RAID-1 with 10k SAS disks. Each node has one 1.92 TB Samsung SSD for storage. The network connection is currently 1 Gb. Performance mode and RAM Cache are activated. When I test with IOmeter I get a maximum of 11,000 IOPS. Maybe I am misunderstanding something, but the performance seems bad to me.

I hope someone can help me.

Regards

Stefan

306 Posts

November 25th, 2016 05:00

Hi Stefan,

We recommend disabling RAM Cache for SSD pools, as it can actually hurt performance. Can you please disable it on your storage pool and rerun the tests?
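If you prefer the CLI over the GUI, the change should look something like this - I am writing this from memory of the scli syntax, so please verify the exact options with scli --help, and substitute your own protection domain and storage pool names:

scli --set_rmcache_usage --protection_domain_name <pd_name> --storage_pool_name <pool_name> --dont_use_rmcache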

Also, did you follow the Fine-Tuning Guide and change the default IO scheduler for the SSD disks?

(echo noop > /sys/block/<device>/queue/scheduler)
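For example, assuming your SSD shows up as /dev/sdg (check with lsblk):

cat /sys/block/sdg/queue/scheduler     # the active scheduler is shown in brackets
echo noop > /sys/block/sdg/queue/scheduler

Note that this does not survive a reboot - add it to /etc/rc.local or set elevator=noop on the kernel command line to make it persistent.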

Please also make sure that all power saving settings are disabled in your servers' BIOS - you can check the other discussion at the following link (that one was for Windows, but it is quite interesting):

https://community.emc.com/thread/234331?tstart=0

I would suggest you change one thing at a time (disable RAM cache, change the IO scheduler, check/change BIOS settings) and run the tests after each change, so you know which one had the biggest impact.

Can you let us know what IOmeter parameters you are using for testing?

Also, is it possible to use a 10Gb NIC instead of 1Gb? It may not be your disks or configuration throttling performance here; 1Gb is not much bandwidth.
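As a rough sanity check: 1 Gbit/s gives about 117 MB/s of usable TCP throughput, and 117 MB/s / 4 KB is roughly 29,000 IOPS. So with 4K IOs the wire alone caps you near 29,000 read IOPS, and fewer for writes, since every write is sent twice for mirroring.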

Thank you,

Pawel

14 Posts

November 26th, 2016 04:00

Hi Stefan,

Although I don't have that much experience with ScaleIO yet, I might be able to share some experience from my Microsoft tuning process, which is currently ongoing, as Pawel also pointed out (and Davide is a great help there!).

The same rules will probably apply to Linux or any other system; only the configuration is different.

We really need to know what your testing parameters are, since every type of IO is quite different and requires a different configuration to be optimal.

My guess is that a 1Gbit network connection could only work out well for random IO at QD=1; as far as I've seen, other types of IO really need more.

In any case, I can tell you we also started with only a 1Gbit network and it was not enough, especially for sequential IO but also for random IO at QD=32 or more (which is generally the most IO you would see in a real system, I think).

Also, the rebuild/rebalance process is too slow with only 1Gbit; it can take ages.

You can easily use more NIC ports with ScaleIO's load balancing, which works great for me! Make sure you give every storage NIC a different subnet so ScaleIO can use the load balancing correctly.
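For example, on Ubuntu 14.04 that is just two static interfaces in /etc/network/interfaces, each in its own subnet (the addresses below are only illustrative):

auto eth1
iface eth1 inet static
    address 192.168.12.10
    netmask 255.255.255.0

auto eth2
iface eth2 inet static
    address 192.168.13.10
    netmask 255.255.255.0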

If you let us know your testing parameters, I'm sure we can help you further.

Kind regards, Paul

68 Posts

November 26th, 2016 22:00

Hi Stefan,

Could you send us a description of the hardware you are using? More specifically, it is important for us to know:

- the model of the SAS controller

- the SSD model

- the network card model

- the switch model used for interconnections

I read that "Performance mode" is enabled in your setup: Paul, who replied just before me, noticed that with 1 Gbps network adapters, setting the SDS performance profile back to the default profile helps to get better performance. You can find the report of his benchmark in the thread linked by Pawel a few posts above.
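From memory (please double-check the option names with scli --help, as this is an assumption on my part), switching the profile back looks something like:

scli --set_performance_parameters --all_sds --all_sdc --profile default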

Another piece of information that would help us understand what is limiting your configuration is the type of workload with which you are getting 11,000 IOPS (IO size and queue depth).

Could you post benchmark results for the workloads listed below?

- Sequential Reads 4KB QD32

- Sequential Writes 4KB QD32

- Sequential Reads 4KB QD1

- Sequential Writes 4KB QD1

- Random Reads 4KB QD32

- Random Writes 4KB QD32

- Random Reads 4KB QD1

- Random Writes 4KB QD1

Are you using jumbo frames on your infrastructure?
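(If you want to try jumbo frames, the quick non-persistent way on Linux is:

ip link set eth1 mtu 9000

and you can verify the path end-to-end with "ping -M do -s 8972 <peer-ip>", which fails if any hop drops 9000-byte frames. The interface name is just an example, and the switch ports must allow jumbo frames as well.)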

Please send us some specific information about your hardware and setup so we can try to help you!

Kind regards,

Davide

10 Posts

November 28th, 2016 01:00

Hi Pawel, Paul, Davide,

thanks a lot for your response.

My idea is to use our old ESX servers for ScaleIO.

The models are:

- Fujitsu RX200 S6,

- 96 GB RAM,

- LSI MegaRAID SAS controller.

Each node currently has one Samsung SM863 SSD with 1.92 TB.

The servers have eight gigabit ports, but only one is in use for the test.

Jumbo frames are not activated. If we use ScaleIO in production, we will buy 10 Gb cards with two interfaces, and a second 1.92 TB SSD for each host.

At the moment my infrastructure is hosted on a NetApp 2240. We have about 100 VMs.

I already know that a single 1 Gb connection is not enough, but until it is decided whether or not we buy ScaleIO, my means of testing are limited.

After I changed the default IO scheduler ("echo noop > /sys/block/sdg/queue/scheduler") I get a maximum of 972 IOPS / 3.86 MB/s, no matter which test I run. Sorry for my badly written English. I will reboot the nodes and check the BIOS energy saving settings.

Regards

Stefan

68 Posts

November 28th, 2016 21:00

Hello Stefan,

For your testing infrastructure, did you install Ubuntu 14.04 on three VMware VMs, or is it installed directly on the physical hardware?

The hardware you have is certainly good enough for ScaleIO; I'm working with a similar setup: three Cisco servers on which I installed Ubuntu, with three SM863 SSDs in each node. I used HBAs instead of RAID controllers - you can read the reason, and the best way to configure your LSI RAID controller for ScaleIO, in the thread that Pawel linked. Here is the post with the LSI RAID controller observations and settings: https://community.emc.com/message/959684#959684

Please let me know whether your Ubuntu is installed directly on the physical hardware or on VMs. On VMs there are a lot of things you have to take into account to get good performance.

On which server is the SDC from which you run the benchmarks installed?

Could you try to run iperf from the SDC to every SDS and post the results here? Test TCP with only 1 concurrent connection. We have to start by excluding a network slowdown.
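(With classic iperf2 that means running "iperf -s" on each SDS and, from the SDC:

iperf -c <sds-ip> -t 10

which uses a single TCP connection by default.)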

Kind regards,

Davide

10 Posts

November 29th, 2016 00:00

Hi Davide,

I am testing on physical servers with Ubuntu 14.04. I had problems installing the RAID-1 for the Ubuntu system on the LSI controller, and have now installed the system on Ubuntu software RAID. I do not know whether that matters for performance. I started the IO benchmarks on a Windows Server VM placed on the SSD storage pool.

The network connection from SDC to SDS looks good for a 1 Gb network:

[  4] local 192.168.12.10 port 5001 connected with 192.168.12.72 port 56600
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.09 GBytes   940 Mbits/sec
[  5] local 192.168.12.10 port 5001 connected with 192.168.12.67 port 12727
[  5]  0.0-10.0 sec  1.08 GBytes   929 Mbits/sec
[  4] local 192.168.12.10 port 5001 connected with 192.168.12.78 port 37410
[  4]  0.0-10.0 sec  1.07 GBytes   914 Mbits/sec
[  5] local 192.168.12.10 port 5001 connected with 192.168.12.71 port 17287
[  5]  0.0-10.0 sec  1.09 GBytes   932 Mbits/sec


[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec   928 MBytes   777 Mbits/sec
[  5] local 192.168.12.11 port 5001 connected with 192.168.12.78 port 40519
[  5]  0.0-10.0 sec   907 MBytes   761 Mbits/sec
[  4] local 192.168.12.11 port 5001 connected with 192.168.12.67 port 16087
[  4]  0.0-10.0 sec   840 MBytes   704 Mbits/sec
[  5] local 192.168.12.11 port 5001 connected with 192.168.12.72 port 55018
[  5]  0.0-10.0 sec   750 MBytes   629 Mbits/sec
[  4] local 192.168.12.11 port 5001 connected with 192.168.12.71 port 61650
[  4]  0.0-10.0 sec   956 MBytes   802 Mbits/sec


[  4] local 192.168.12.12 port 5001 connected with 192.168.12.67 port 53770
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.01 GBytes   871 Mbits/sec
[  5] local 192.168.12.12 port 5001 connected with 192.168.12.72 port 53285
[  5]  0.0-10.0 sec  1.09 GBytes   938 Mbits/sec
[  4] local 192.168.12.12 port 5001 connected with 192.168.12.78 port 53083
[  4]  0.0-10.0 sec  1.09 GBytes   936 Mbits/sec
[  5] local 192.168.12.12 port 5001 connected with 192.168.12.71 port 53312
[  5]  0.0-10.0 sec  1.09 GBytes   937 Mbits/sec


I started a copy job of my test VM from ScaleIO storage to my NetApp filer (NFS storage) and got 1,540 IOPS at 90 MB/sec. In this case the performance is good for a 1 Gb network.

Now I will read your link about the LSI RAID controller.


Davide, thanks for your help.

Kind regards
Stefan

68 Posts

November 29th, 2016 20:00

Hi Stefan,

I noticed that the second iperf run you posted does not seem as stable as the others. This isn't the main problem, of course, but it is worth investigating.

Could you monitor the network load on the ethernet adapters of all three SDS nodes and, above all, on the SDC during a benchmark? You can use ifstat on Linux and Performance Monitor on Windows.
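(On Linux, for example:

ifstat -i eth0 1

prints per-second in/out KB/s for the given interface - replace eth0 with your storage NIC.)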

I did some calculations. In your first post you were talking about a throughput of 11,000 IOPS. If the block size is 4K, you have 4K * 11,000 = 44,000 KB/s, which is 42.97 MB/s.

ScaleIO handles writes differently from a NetApp: your SDC writes a copy of every block to two different SDSs, so the network adapter on the SDC carries 85.94 MB/s of real throughput. You aren't using jumbo frames, so there is roughly another 2.30 MB/s of TCP/IP overhead.

85.94 MB/s + 2.30 MB/s = 88.24 MB/s (705 Mbit/s)

One of your nodes performed at more or less this speed in the test. If one node underperforms, it can cause a general performance issue.

In the other thread, Paul improved performance a lot by raising the TX and RX buffers on the network adapters across the whole infrastructure.

Another suggestion: did you disable delayed ACK and the Nagle algorithm on the Windows machine (TcpAckFrequency and TcpNoDelay) as suggested in the ScaleIO Performance Fine-Tuning Guide?

Here is the relevant part (TcpAckFrequency and TcpNoDelay are two different registry keys):

[Screenshot from the Fine-Tuning Guide showing the TcpAckFrequency and TcpNoDelay registry settings]
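In case the image doesn't load: as far as I remember the guide (please double-check against your copy, as I'm quoting from memory), both are DWORD values set to 1 under HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\<interface GUID>:

TcpAckFrequency = 1
TcpNoDelay = 1

A reboot is needed for the change to take effect.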

This can make a lot of difference.

Here is a summary of the things you can try:

- investigate the network of the second node (its iperf results are not as stable)

- raise the network TX and RX buffers on all the network cards of the whole infrastructure (see the example commands after this list)

- measure the network throughput during a real storage benchmark on the Windows SDC and on all SDS nodes

- disable delayed ACK and the Nagle algorithm on the SDC by adding the registry values shown in the screenshot above
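For the buffers, ethtool is the usual tool on Linux (the interface name and the 4096 values are only examples - ethtool -g shows your hardware maximums):

ethtool -g eth0                    # show current and maximum ring sizes
ethtool -G eth0 rx 4096 tx 4096    # raise the RX and TX rings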

Please let me know if you get better results following these suggestions.

Kind regards,

Davide

10 Posts

November 30th, 2016 07:00

Hi Davide,

great, many thanks again.

My SDS nodes are Ubuntu servers, so I cannot disable delayed ACK and the Nagle algorithm there.

The network cards in the SDS servers show no TX or RX errors. I still had a 10 Gb card from another server; after installing it I got 3 Gbit/s with iperf.

For the other two SDSs I have ordered 10 Gb cards. I have also installed a second network connection on each SDS server.

The latest IOmeter tests:

29,000 IOPS and 115 MB/sec with 4K 100% reads. That is full speed for 1 Gb: 115 MB/s = 920 Mbit/s.

This performance is much better. Now I will wait for the other two 10 Gb cards and see how the performance changes.

Apparently 10 Gb is crucial. I will report back soon.

I have read your "upgrade path" from Ubuntu 14.04 to 16.04. Would you recommend updating Ubuntu and the ScaleIO version before production use?

You have a similar infrastructure to ours.

Kind regards

Stefan

68 Posts

November 30th, 2016 09:00

Hi Stefan,

3 Gbit/s with iperf on a 10 Gbit/s adapter is a bit slow; I'm getting 9.90 Gbit/s (with jumbo frames) and 9.30 Gbit/s (with jumbo frames disabled). I had a similar issue that was solved by disabling all processor C-states in the BIOS of the server. It also helped to disable all power saving settings in the BIOS, which I achieved by setting the "High Performance" profile wherever possible.
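(If you cannot get into the BIOS right away, a similar effect can be achieved from the Linux side - a hedged suggestion, since the exact behavior depends on your kernel and CPU: add intel_idle.max_cstate=0 processor.max_cstate=1 to GRUB_CMDLINE_LINUX in /etc/default/grub, then run update-grub and reboot. You can check the current limit with cat /sys/module/intel_idle/parameters/max_cstate.)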

I suggest using the latest ScaleIO version in your production environment. I keep an eye on the ScaleIO knowledge base every day, and I read that a lot of minor issues were solved in the latest releases.

I saw that you are running the benchmark from a Windows server, and since that means the SDC component actually runs on Windows, I suggested disabling the Nagle algorithm and delayed ACKs on that specific server. On Linux, a lot of parameters are tuned automatically by ScaleIO when the daemons start.

Let me know if you get better network results after tuning your BIOS parameters. I was getting only 2 Gbit/s before disabling processor C-states in the BIOS; that was the tuning parameter that gave us the biggest performance improvement of all.

Kind regards,

Davide

10 Posts

December 5th, 2016 04:00

Hello Davide,

Sorry it took a little time. When I tested from SDS node 1 to SDS node 2 I got 9.3 Gbit/s too, but when I test to my SDC I get only 3-6 Gbit/s. My infrastructure has four ESX hosts, two NetApp filers with 10 Gb, and a few other iSCSI stores. A lot of tasks run there and we have about 100 VMs, so I am not sure whether I can get more throughput with iperf. I have disabled all energy savings on my SDS hosts. I'm still waiting for the third 10 Gb card. At the moment I have destroyed my cluster :-(

When switching the MDM to cluster mode I get: "MDM role is not a Slave but a Tie-Breaker".

Kind regards

Stefan

EDIT: the cluster problem is resolved.
