December 8th, 2014 13:00

VNXe3150 thin/thick provisioning and performance issues

Hello

This is my first SAN project and also my first post here, so please bear with me.

First, let me give you some information about our hardware:

We have 2 HP DL380p G8 E5-2690v2 (2 x 10 cores) servers, both with 128GB RAM, 2 x 10Gb Ethernet, 4 x 1Gb Ethernet and an 8GB SD card to run VMware ESXi 5.5 from. (p/n: 709943-xx1)

2 x HP 2920-24G switches both with 2 x 2 port 10Gb modules

2 x HP 2530-48G switches

And an EMC VNXe3150 with 2 SPs and 2 x 10Gb. The disks in the EMC are 7 x 600GB 10K SAS and 12 x 900GB 10K SAS.

We created 2 storage pools:

* Performance_RAID5 (10+1 x 900GB, 1 spare) Total space: 7.998TB

* Performance_RAID10 (6+6 x 600GB, 1 spare) Total space: 2.994TB

Some more information about the SAN:

We will be using 2 VLANs on the HP 2920-24G switches (they are stacked):

70 iSCSI1

80 iSCSI2

One switch is used for VLAN70 (192.168.70.0) and the other is used for VLAN80 (192.168.80.0).

We created an iSCSI server on each SP, both with 2 IP addresses in different subnets.

iSCSISPA:

eth10: 192.168.70.1

eth11: 192.168.80.1

iSCSISPB:

eth10: 192.168.70.2

eth11: 192.168.80.2

No VLAN tagging is used on the EMC.

On the ESXi servers we created one vSwitch with 2 VMKernel ports:

ESXi host 1:

VMKernel_iSCSISPA: 192.168.70.3, VLAN70 (vmnic0 active, vmnic1 unused)

VMKernel_iSCSISPB: 192.168.80.3, VLAN80 (vmnic1 active, vmnic0 unused)

ESXi host 2:

VMKernel_iSCSISPA: 192.168.70.4, VLAN70 (vmnic0 active, vmnic1 unused)

VMKernel_iSCSISPB: 192.168.80.4, VLAN80 (vmnic1 active, vmnic0 unused)
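As a sanity check on the addressing above, here is a minimal Python sketch that verifies every address sits in the subnet of its VLAN and lists which target portals each VMkernel port shares a subnet with. The /24 netmask is an assumption (the mask is not stated anywhere above), and the helper names `misplaced` and `same_subnet_targets` are made up for this illustration.

```python
import ipaddress

# Assumption: each iSCSI VLAN is a /24; the netmask is not given in the post.
SUBNETS = {
    70: ipaddress.ip_network("192.168.70.0/24"),
    80: ipaddress.ip_network("192.168.80.0/24"),
}

# Target portals on the VNXe, as listed above: name -> (IP, VLAN).
TARGETS = {
    "iSCSISPA eth10": ("192.168.70.1", 70),
    "iSCSISPA eth11": ("192.168.80.1", 80),
    "iSCSISPB eth10": ("192.168.70.2", 70),
    "iSCSISPB eth11": ("192.168.80.2", 80),
}

# VMkernel ports on the ESXi hosts, as listed above: name -> (IP, VLAN).
VMK_PORTS = {
    "host1 VMKernel_iSCSISPA": ("192.168.70.3", 70),
    "host1 VMKernel_iSCSISPB": ("192.168.80.3", 80),
    "host2 VMKernel_iSCSISPA": ("192.168.70.4", 70),
    "host2 VMKernel_iSCSISPB": ("192.168.80.4", 80),
}


def misplaced(endpoints):
    """Names of endpoints whose IP is outside the subnet of their VLAN."""
    return [
        name
        for name, (ip, vlan) in endpoints.items()
        if ipaddress.ip_address(ip) not in SUBNETS[vlan]
    ]


def same_subnet_targets(vmk_name):
    """Target portals whose IP is in the same subnet as the given VMkernel port."""
    _, vlan = VMK_PORTS[vmk_name]
    net = SUBNETS[vlan]
    return sorted(
        name
        for name, (ip, _) in TARGETS.items()
        if ipaddress.ip_address(ip) in net
    )


print("misplaced:", misplaced({**TARGETS, **VMK_PORTS}))
for vmk in VMK_PORTS:
    print(vmk, "->", same_subnet_targets(vmk))
```

With these addresses, each VMkernel port shares a subnet with one portal on SPA and one on SPB, regardless of the SPA/SPB suffix in the port group name.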

We added the iSCSI software HBA in VMware. Both VMKernel port groups are bound to it.

We created the following VMware datastores:

(Because of the 2TB limitation and because we wanted to make sure that the load is balanced across the 2 SPs)

iSCSISPA_RAID5_01 (Total space: 1.952TB)

iSCSISPA_RAID5_02 (Total space: 1.952TB)

iSCSISPA_RAID10 (Total space: 1.462TB)

iSCSISPB_RAID5_01 (Total space: 1.952TB)

iSCSISPB_RAID5_02 (Total space: 1.952TB)

iSCSISPB_RAID10 (Total space: 1.462TB)

I used Thin provisioning for all the LUNs.

The datastores were automatically created on our ESXi host, but because they were formatted with VMFS3 we deleted and re-created them using VMFS5. We also created one RAID5 datastore for each SP by extending the first LUN with the second. So:

iSCSISPA_RAID5 (iSCSISPA_RAID5_01+iSCSISPA_RAID5_02) (Total space: 3.9TB)

iSCSISPB_RAID5 (iSCSISPB_RAID5_01+iSCSISPB_RAID5_02) (Total space: 3.9TB)

iSCSISPA_RAID10 (Total space: 1.46TB)

iSCSISPB_RAID10 (Total space: 1.46TB)


We changed the multipathing policy to Round Robin.


Now that you have an idea about our setup, I can start talking about our experiences.

I wanted to test the performance so I created a new 5GB LUN on SPA_RAID5 and one on SPB_RAID10.

I created a new virtual machine and added both LUNs using Raw Device Mapping. I then ran both HD_Speed and IOMETER, which gave me the following results:

SPA_RAID5:

* HD_Speed: average throughput 271.7MB/s

* IOMETER (Max throughput):

Transfer request size set to 64K, read/write distribution set to 100% read, random/sequential distribution set to 100% sequential

Total IOPS: 3058

Total MB/s: 200

Average IO Response: 0.3 ms

Max IO Response: 239 ms

* IOMETER (Max IOPS):

Transfer request size set to 512 bytes, read/write distribution set to 100% read, random/sequential distribution set to 100% sequential

Total IOPS: 6441

Total MB/s: 3.3

Average IO Response: 0.15 ms

Max IO Response: 13 ms


SPB_RAID10:

* HD_Speed: average throughput 295.9MB/s

* IOMETER (Max throughput):

Transfer request size set to 64K, read/write distribution set to 100% read, random/sequential distribution set to 100% sequential

Total IOPS: 3188

Total MB/s: 208

Average IO Response: 0.3 ms

Max IO Response: 238 ms

* IOMETER (Max IOPS):

Transfer request size set to 512 bytes, read/write distribution set to 100% read, random/sequential distribution set to 100% sequential

Total IOPS: 6623

Total MB/s: 3.39

Average IO Response: 0.14 ms

Max IO Response: 23 ms


I have no idea if I did these tests correctly, but the first thing I noticed is that there isn't much difference between the RAID5 and RAID10 pools. I was also expecting a lot more throughput on 10Gb.

I'm also not sure if these results are good or not. I read somewhere that there could be a performance impact because I used Thin provisioning, so I started changing my LUNs from Thin to Thick provisioning. It worked for all the LUNs except two, for which I received the following error: "The system does not have enough storage to fulfill this request." I also received this error a couple of times when I created a new VM.
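One way to read the IOMETER numbers above is to check them against each other. The sketch below recomputes MB/s from IOPS and request size, compares it with the raw 10Gb line rate, and applies Little's law (I/Os in flight = IOPS x average response time). It assumes 64K means 65536 bytes and that the reported MB/s are decimal megabytes; neither is stated in the results. If the in-flight figure comes out close to 1, that would suggest the tests ran with about one outstanding I/O and were limited by per-request latency rather than by the disks or the link, but the IOMETER queue setting is not given above, so treat that as a hypothesis.

```python
# Raw 10 Gb/s line rate in decimal MB/s, ignoring iSCSI/TCP overhead.
LINK_MB_PER_S = 10e9 / 8 / 1e6

# (label, request size in bytes, reported IOPS, reported MB/s, avg response in ms)
# Figures copied from the results above; 64K is assumed to be 65536 bytes.
RESULTS = [
    ("SPA_RAID5 max throughput", 64 * 1024, 3058, 200.0, 0.30),
    ("SPA_RAID5 max IOPS", 512, 6441, 3.3, 0.15),
    ("SPB_RAID10 max throughput", 64 * 1024, 3188, 208.0, 0.30),
    ("SPB_RAID10 max IOPS", 512, 6623, 3.39, 0.14),
]


def throughput_mb_s(iops, request_bytes):
    """Throughput in decimal MB/s implied by an IOPS figure and request size."""
    return iops * request_bytes / 1e6


def ios_in_flight(iops, avg_response_ms):
    """Little's law: average number of I/Os in flight."""
    return iops * avg_response_ms / 1000.0


for label, size, iops, reported_mb_s, resp_ms in RESULTS:
    mb_s = throughput_mb_s(iops, size)
    print(
        f"{label}: computed {mb_s:.2f} MB/s (reported {reported_mb_s}), "
        f"{100 * mb_s / LINK_MB_PER_S:.1f}% of line rate, "
        f"~{ios_in_flight(iops, resp_ms):.2f} I/Os in flight"
    )
```

If the hypothesis holds, repeating the test with more outstanding I/Os or several workers would be the next thing to try before comparing the two pools.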

Any feedback would be appreciated.

Thx

Filip

December 9th, 2014 12:00

We had a session with EMC today to have a look at the thin/thick provisioning issues. Apparently, this is by design.

I ended up re-creating all the LUNs, which wasn't really necessary, but I wanted to know how it works.

When you create a LUN, storage is claimed from the pool and made available for the SP on which you created the LUN. There is, however, some extra space claimed in the back end.

So in our case:

For storage pool Performance_RAID5 with a total capacity of 7.998TB

Total: 7.998TB

7.998 / 2 = 3.999, and 3.999 / 2 = 1.999TB (rounded down). (We divide by two to balance the load on the SPs.)

- 1.999TB (iSCSISPA_RAID5_01)

- 1.999TB (iSCSISPB_RAID5_01)

7.998 - 1.999 - 1.999 should be 4.000TB. However, SPA and SPB claimed some extra space from our pool when the first LUNs were created. There was only 3.880TB left to provision. So 3.880 / 2 = 1.940TB

- 1.940TB (iSCSISPA_RAID5_02)

3.880 - 1.940 should be 1.940TB, but again SPA claimed some extra space from the pool. Now there was only 1.827TB available for SPB.

- 1.827TB (iSCSISPB_RAID5_02)

We lost 120GB by creating iSCSISPA_RAID5_01 and iSCSISPB_RAID5_01. So each SP claimed an extra 60GB of storage from the pool.

Creating the second LUN on SPA meant that there was only 1.827TB left for SPB. We lost another 113GB.

In the end, we lose roughly 225GB of storage on our RAID5 pool.

Or is it 7.998 - 1.999 - 1.999 - 1.940 - 1.827 = 0.233TB, i.e. 233GB?
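The arithmetic above can be laid out as a short Python sketch. It works in whole GB (treating 1TB as 1000GB, as the figures above do) to avoid floating-point noise, and it assumes the last LUN used up all remaining space, so any overhead claimed by that last LUN is not visible here. The variable names are made up for this illustration.

```python
# All sizes in GB, taken from the figures above (1TB treated as 1000GB).
POOL_GB = 7998

LUNS_GB = {
    "iSCSISPA_RAID5_01": 1999,
    "iSCSISPB_RAID5_01": 1999,
    "iSCSISPA_RAID5_02": 1940,
    "iSCSISPB_RAID5_02": 1827,
}

# Free pool space the system reported at two points in the process.
FREE_AFTER_FIRST_PAIR_GB = 3880  # after both *_01 LUNs were created
FREE_AFTER_THIRD_LUN_GB = 1827   # after iSCSISPA_RAID5_02 was created

# Space that disappeared beyond the LUN sizes themselves at each step.
overhead_first_pair = (
    POOL_GB
    - LUNS_GB["iSCSISPA_RAID5_01"]
    - LUNS_GB["iSCSISPB_RAID5_01"]
    - FREE_AFTER_FIRST_PAIR_GB
)
overhead_third_lun = (
    FREE_AFTER_FIRST_PAIR_GB
    - LUNS_GB["iSCSISPA_RAID5_02"]
    - FREE_AFTER_THIRD_LUN_GB
)

# Total pool capacity not accounted for by the four LUNs.
total_overhead = POOL_GB - sum(LUNS_GB.values())

print("overhead after first two LUNs:", overhead_first_pair, "GB")
print("overhead after third LUN:", overhead_third_lun, "GB")
print("total overhead:", total_overhead, "GB",
      f"({100 * total_overhead / POOL_GB:.1f}% of the pool)")
```

Both ways of counting agree: the per-step losses add up to the same total as the pool size minus the four LUN sizes.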

I really hope somebody can further explain. Or correct me if I'm wrong.

Grtz

Filip
