Bruno Sousa
39 Posts
0
December 6th, 2013 08:00
Hi Don,
Thanks for the answer; I have read the document as well.
My current idea is to equip the ESXi hosts with a 10GbE card that connects to a 10GbE switch (Dell 8164). The 10GbE switch would have 10GbE EQL arrays (PS6110E) and 1GbE EQL arrays (PS6100E) connected, with 2 pools: one pool with the 1GbE members and one pool with the 10GbE members.
Is this a valid approach, or do you see possible issues with it?
Thanks,
Bruno
Bruno Sousa
39 Posts
0
December 6th, 2013 09:00
Hi Don,
Thanks for the feedback, however a key point here is the migration path.
We will not stop using the 1GbE arrays within the next few years, since they are fairly new and fully functional; therefore I need a solution that allows the ESXi hosts to access both the 1GbE and 10GbE arrays without issues.
In short, we need to maintain access to the 1GbE and 10GbE arrays from ESXi hosts equipped only with 10GbE network cards, where the 1GbE and 10GbE arrays and the ESXi hosts are all connected to the 10GbE switches.
Having said that, my idea is based on the following:
Network
- Install 10GbE network cards in the ESXi hosts
- Connect the 1GbE and 10GbE arrays to the 10GbE switches
- Connect the 10GbE ESXi hosts to the 10GbE switches
EQL configuration
- Add the 10GbE arrays to the existing group -- EQL_Prod
- Create a new pool, say EQL_10GbE, where all 10GbE members are placed
Based on this, my EQL group EQL_Prod would look like:
- Pool EQL_10GbE with 2 members (the new pool): 2 x PS6110E
- Pool EQL_1GbE with 3 members (the existing pool): 1 x PS6100E, 2 x PS6100XV
From the VMware side I would have:
- 1 dual-port 10GbE network card, with port 01 connected to switch #01 and port 02 connected to switch #02
- 2 vSwitches, each with 1 physical uplink associated
- EQL HIT installed for MPIO
- iSCSI connections to the group IP of EQL_Prod
- Access to multiple volumes, on both the 1GbE and the 10GbE arrays
With this layout I get:
- Different pools based on the EQL arrays' network speed (1GbE and 10GbE)
- A single type of switch and of network card on the ESXi hosts -- 10GbE for both
- Access to volumes residing on the 1GbE-based arrays and on the 10GbE-based arrays
Is this a valid configuration, or am I getting myself into problems? ;)
Thank you,
Bruno
Bruno Sousa
39 Posts
0
December 6th, 2013 10:00
Don, thanks for the clarification. Indeed the iSCSI NICs aren't teamed, so all the problems mentioned above would arise.
Having said that, the best way to keep both 1GbE and 10GbE access would be to configure the ESXi hosts with 1GbE and 10GbE NICs, each NIC connected to the switch matching its speed (the 1GbE NIC to the 1GbE switch, the 10GbE NIC to the 10GbE switch).
Regarding the ISL between the 1GbE and 10GbE switches, a minimum of 4 x 1GbE in a LAG, or preferably a 2 x 10GbE LAG, would be acceptable. I assume that EQL replication between a 10GbE array and a 1GbE array would use only 1 x 1GbE. Can you confirm whether this is okay?
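To put the 1 x 1GbE replication assumption into numbers, here is a rough sketch. The 70% payload efficiency is my assumption (protocol overhead), not a Dell figure, and since a LAG typically hashes per flow, a single replication stream would indeed see only one link's worth of bandwidth:

```python
# Back-of-the-envelope replication timing. 70% efficiency is an assumed
# payload fraction after TCP/iSCSI overhead, not a measured EQL number.

def replication_hours(volume_gb, links, link_gbps=1.0, efficiency=0.70):
    """Rough time to copy volume_gb over `links` parallel links."""
    payload_gbps = links * link_gbps * efficiency   # usable Gbit/s
    seconds = volume_gb * 8 / payload_gbps          # GB -> Gbit
    return seconds / 3600.0

# Replicating a 500 GB volume:
single_link = replication_hours(500, links=1)   # one 1GbE stream, ~1.6 h
full_lag    = replication_hours(500, links=4)   # if it could spread over the LAG, ~0.4 h
```

So even if replication is pinned to a single 1GbE link, overnight replication windows for volumes of this size look workable.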
Bruno
Bruno Sousa
39 Posts
0
December 6th, 2013 10:00
Hi Don,
I have been thinking about a solution based on 1GbE NICs and 10GbE NICs in the ESXi hosts connected to different groups, which I guess is what you are recommending.
In this case I could have, for example, a group EQL_Prod_1GbE with IPs on 192.168.100.0/24 and another group EQL_Prod_10GbE with IPs on 192.168.110.0/24.
Regarding a connection between the 1GbE and 10GbE networks, I don't see the need for any link between them -- do you agree?
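The subnet split can be sanity-checked with a small script; the addresses below follow the hypothetical plan above, and the point is simply that with no gateway between SAN subnets, cross-speed sessions cannot form:

```python
# Hypothetical addressing plan for the two groups.
import ipaddress

SAN_1G  = ipaddress.ip_network("192.168.100.0/24")   # EQL_Prod_1GbE
SAN_10G = ipaddress.ip_network("192.168.110.0/24")   # EQL_Prod_10GbE

def reachable(initiator_ip, target_ip):
    """With no routing between SAN subnets, an initiator only reaches
    targets on its own subnet."""
    a = ipaddress.ip_address(initiator_ip)
    b = ipaddress.ip_address(target_ip)
    for net in (SAN_1G, SAN_10G):
        if a in net and b in net:
            return True
    return False

# A 1GbE initiator reaches the 1GbE group but never the 10GbE group.
assert reachable("192.168.100.11", "192.168.100.50")
assert not reachable("192.168.100.11", "192.168.110.50")
```

That isolation is exactly why no inter-group link is needed for normal I/O; only replication traffic would require a path between the two.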
Thanks
Bruno
Bruno Sousa
39 Posts
0
December 6th, 2013 10:00
Indeed I would need an ISL to replicate volumes hosted in the 10GbE group to the 1GbE group.
The disadvantage of this setup is that I have 4 switches to manage, plus several vSwitches/port groups to manage within VMware.
Regarding 10GbE access to the 1GbE arrays, I would like to explore this situation a bit deeper.
In our environment I don't expect VMs to be able to saturate 4 x 1GbE, since they will mainly be test/development machines; any production VMs would be placed on the 10GbE-based datastores.
Since the 1GbE arrays have 4 x 1GbE ports, as long as the demand from the VMs hosted on the 1GbE-based datastores stays below 80% of 4 x 1GbE, the environment should not suffer packet loss, retransmissions or other network issues.
Is this a fair line of thought, or am I missing something here?
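As a rough sanity check on the 80% figure above (illustrative only; the ceiling is from this discussion, not a Dell recommendation, and real iSCSI payload will be somewhat lower after protocol overhead):

```python
# Aggregate payload budget when staying below a utilisation ceiling.

def usable_mb_per_s(ports, port_gbps=1.0, ceiling=0.80):
    """MB/s available across `ports` links while staying under `ceiling`."""
    bits_per_s = ports * port_gbps * ceiling * 1_000_000_000
    return bits_per_s / 8 / 1_000_000   # bits/s -> decimal MB/s

budget = usable_mb_per_s(4)   # 4 x 1GbE at 80% -> 400.0 MB/s
```

Around 400 MB/s of aggregate headroom for test/dev datastores, which supports the argument that these VMs are unlikely to push the arrays into congestion.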
Bruno
Bruno Sousa
39 Posts
0
December 6th, 2013 11:00
Don,
The EQL arrays, switches and ESXi NICs would all belong to the same subnet, based on their speed.
In my case I would have the following:
- Group EQL_Prod_1GbE, with array/switch/ESXi NIC IPs on 192.168.100.0/24 and VMkernel ports bound to the SW iSCSI adapter
- Group EQL_Prod_10GbE, with array/switch/ESXi NIC IPs on 192.168.110.0/24 and VMkernel ports bound to the SW iSCSI adapter
The ISL would probably be 4 x 1GbE, and it is okay for replication tasks to run at 1GbE speed.
Thanks for the help provided !
Bruno
Bruno Sousa
39 Posts
0
December 9th, 2013 03:00
As a final question: instead of having 4 switches (2 x 1GbE and 2 x 10GbE with an ISL), would it be possible/advisable to connect all ESXi iSCSI initiators (1GbE and 10GbE) only to the 10GbE switches, as long as initiators and arrays run on different subnets?
With this I would avoid maintaining 2 different types of switches, lowering management costs and so on.
What would be your take on this?
Thanks,
Bruno
Bruno Sousa
39 Posts
0
December 9th, 2013 06:00
Hi Don,
Thanks for your valuable feedback, but in this case the ESXi servers already have the 10Gb cards.
The point here is: why keep using 4-year-old 1GbE switches rather than the newer 10GbE switches, and make use of their bigger packet buffers?
In our case the 1Gb arrays would be used for test/dev, and the migration of current production would be done by means of VMware Storage vMotion.
- All ESXi hosts with 2 network cards:
  - Network card A is a quad-port 1Gb
  - Network card B is a dual-port 10Gb
- ESXi hosts and EQL arrays (PS6100E and PS6110E, for example) connect to the 8164 switches
- Each type of EQL array runs on its own subnet, as do the iSCSI initiators:
  - The 1Gb arrays on 192.168.100.0/24, together with the initiators on the quad-port 1Gb cards
  - The 10Gb arrays on 192.168.110.0/24, together with the initiators on the dual-port 10Gb cards
This setup, as I see it, meets the following requirements:
- 10Gb initiators do not connect to the 1Gb-based arrays, avoiding possible oversubscription of the ports and the consequent packet loss, retransmissions, etc.
- The 1Gb and 10Gb environments are split by the use of different subnets (if needed, we can use VLANs on the switches/arrays/hosts)
- Replication from the 10GbE to the 1GbE array/pool is possible, with all traffic handled within the switch, without needing LAGs between 1Gb and 10Gb switches
- Jumbo frames will be configured on all ports, since all ports carry iSCSI traffic
- Flow control and other settings can be tuned per port, or per group of ports, matching the requirements of the array behind each port, whether 1Gb or 10Gb
Bruno
Bruno Sousa
39 Posts
0
December 9th, 2013 07:00
Hello,
I understand your point, and usually I also prefer to keep things split; in this case, however, despite the added setup complexity, I think the advantages outweigh the disadvantages.
To be fair, the 6224s are not recent switches, and having hosts with both 1GbE and 10GbE initiators would avoid the issue of 10Gb initiators connecting to 1Gb targets and vice versa.
Since I would need to maintain the 1Gb environment alongside the 10GbE one, it would be good if at least the switching environment were kept to a single type of switch.
So, bottom line: from a technical side the mentioned configuration would be possible, and I fully agree that IP subnets should be combined with VLANs.
From Dell engineering, what do you normally see in the field? Do customers keep 1Gb and 10Gb switches separated while using both 1Gb and 10Gb initiators?
Bruno
Bruno Sousa
39 Posts
0
December 9th, 2013 08:00
Hi Don,
I share your point on risk mitigation, and indeed my approach would be to start off with both the 1Gb and 10Gb switches, meaning the current production environment would not be affected; once everything is running from the 10GbE switches I would then start to move away from the 1GbE switches.
To be fair, the 6224 switches are working fine, and I just wonder what the best way would be to support 1GbE and 10GbE from the same hosts: either split across different switches with an ISL, or manage everything on the same switch and work out the needed subnet/VLAN configuration.
In an ideal world this kind of environment would be tested by Dell and covered in a recommendation paper, with a proper conclusion on whether it works and is recommended or not.
Thanks for all the feedback provided. I hope that in a few weeks I can provide some real-life feedback about the path chosen and lessons learned :)
Bruno
Bruno Sousa
39 Posts
0
December 9th, 2013 08:00
Indeed, there are way too many possibilities based on the different types of switches, arrays, firmware, use or lack of DCB, and so on.
My point is that the documentation currently available gives the impression, at least to me, that once a customer has 10Gb arrays/switches/hosts, they would stop using 1GbE arrays, or they must keep supporting their current 1GbE-based switches.
It would be interesting to have a solution where the hosts/switches are 10GbE and the arrays may be both 1GbE and 10GbE.
Let's see how the end result looks, and I will provide some feedback.
Thanks !
Bruno
Bruno Sousa
39 Posts
0
December 9th, 2013 12:00
Hello again (the never ending story) ,
I just went back to my design whiteboard and would like to share some ideas and see where my mistakes are.
Current environment:
- 8 servers with 4 x 1GbE NICs running on the 192.168.100.0/24 network
- 1 EQL group with 3 members, all 1GbE based
- 2 Dell 6224 1GbE switches used for the SAN fabric
Future environment (all of the above, plus):
- 8 servers with 1 dual-port 10GbE NIC, on a network still to be defined
- 10GbE members (PS6110E) added to the EQL group
- 2 new Dell 8164 10GbE switches used for the SAN fabric
I have been thinking about using only the 10GbE switches and just the 10GbE NICs on the ESXi hosts, while connecting to both the 1GbE and 10GbE arrays, and here is how I see things working.
10GbE initiator talks to 10GbE array
In this situation I really don't see any issues, apart from the (remote) possibility that an ESXi host generates so much traffic that it saturates its 2 x 10GbE interfaces. In that case generic network rules apply: packet drops, retransmissions and so on.
10GbE initiator talks to 1GbE array
Let's say ESXi host A initiates an iSCSI connection to EQL member_A on port 01.
While this session stays below, say, 80% of the 1GbE bandwidth, no problems in terms of packet loss, TCP pauses or retransmissions should occur.
On the same host (ESXi host A) a new iSCSI connection is created to the same member_A, port 01. At this point the total bandwidth used rises above 90%, and here is what I hope would happen:
- iSCSI sessions may drop due to high port usage and the consequent packet drops
- flow control on the switch and arrays "slows down" the rate of data sent from the ESXi host, and with this:
- the EQL array detects that port 01 on member_A is overloaded, and the EQL network load-balancing mechanism kicks in
- the EQL HIT and the EQL load balancers "arrange" for a new iSCSI session to member_A on port 02, since port 01 is still servicing a very heavy session (the initial one still using 80% of the 1GbE bandwidth)
- while flow control kicks in, it should affect only one of the 10GbE ports on the ESXi host, so traffic on the remaining port should not be affected; sessions on the throttled 10GbE port are affected until usage drops back to normal levels (by then MPIO and the EQL load balancers have assigned sessions to other 1GbE ports on the array)
Bottom line: I expect (though my expectations may be wrong, due to lack of knowledge) that the EQL network load balancer distributes iSCSI sessions across all the array's network ports, even when all sessions originate from the same 10GbE initiator. As soon as all ports on the EQL reach, say, 90% usage, flow control on the array and switch kicks in and "slows down" the traffic.
Another key point, at least for my specific environment: I don't expect the 1GbE sessions to use more than 60-70% of the 1GbE bandwidth. However, if all the 1GbE sessions from all the ESXi 10GbE NICs were assigned to the same port on one of my 1GbE arrays, I'm sure I would overload that array port.
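The expectation about the load balancer can be illustrated with a toy model. To be clear, this is not the actual EQL firmware algorithm, just the "sessions land on the least-loaded port" behaviour I am hoping for:

```python
# Toy model: each new iSCSI session is placed on the least-loaded eth port
# of the member. This is a deliberate simplification of the real EQL
# connection load balancer, used only to show the expected spread.
from collections import Counter

def place_sessions(n_sessions, ports=("eth0", "eth1", "eth2", "eth3")):
    load = Counter({p: 0 for p in ports})
    for _ in range(n_sessions):
        port = min(ports, key=lambda p: load[p])  # least-loaded port wins
        load[port] += 1
    return load

# 8 sessions from the same 10GbE initiator end up 2 per port,
# instead of stacking on eth0 and overloading it.
spread = place_sessions(8)
```

If the real balancer behaves even approximately like this, no single 1GbE array port should end up carrying all the sessions from the 10GbE initiators.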
Looking forward for feedback...
Bruno
Bruno Sousa
39 Posts
0
December 9th, 2013 12:00
Hi Don,
Thanks for the quick and straightforward answer.
Bottom line: the best approach would be to keep the 1GbE initiators connecting to the 1GbE arrays via the 10GbE switches (if this doesn't work I can always fall back to the current, working 1GbE switches), while the 10GbE initiators connect to the 10GbE arrays.
The "trick" would be to separate the different-speed initiators by means of VLANs and subnets. This method would let me use a single type of switch while keeping the ESXi hosts directly connected to their arrays, without routing or gateways.
Do you have an idea how the HIT MPIO would work with two types of initiators at different speeds? Would it still balance sessions correctly, in this case the 1GbE sessions across the 1GbE initiators and the 10GbE sessions across the 10GbE initiators?
Thanks,
Bruno
Bruno Sousa
39 Posts
0
December 9th, 2013 12:00
That's what I thought, but it is always good to have confirmation from the experts.
I guess that using different subnets, VLANs and 2 EQL groups with iSCSI port binding, "converged" onto the same 10GbE switches, would work out.
This is a solution I will need to keep for some years, so I will try to perform as many tests as possible before going live. I still need to work out how vMotion/Storage vMotion can be configured to use the correct network interfaces.
Bruno