

December 11th, 2011 01:00

Help regarding Subnets and performance

Hi Guys, first time posting here so be gentle.

 

I am having some issues with the performance of my backups and, I think, to some extent with the iSCSI speed from the VM hosts.

 

I'll give you the setup as it is and try to cover all the configuration details for you.

 

First off, the environment is based around the vSphere 5 Essentials Plus kit with ESXi 5.0 hosts.  All devices (SAN, servers, switches) have been loaded with the latest firmware.

Servers = 2x Dell R710, each with an additional Intel quad-port NIC for iSCSI. There is also a Dell 2850 server running as a VM host, connected to the iSCSI network with an Intel quad-port NIC.

Backup server = Dell 2950 running Windows Server 2008 R2 Std, 16GB RAM, 8x 3.0GHz CPUs and a dual-port Intel NIC for iSCSI. We're using Symantec Backup Exec 2010 R3 with the VMware agent, and this server also runs the vCenter Server software. It backs up to a DAS MD1000 with 12x 500GB 7,200RPM SAS drives over a SAS cable (can't remember if it's SAS 5 or 6). The DAS can read and write at 200MB/s, tested with the ATTO benchmark.

Switches = 2x Dell PowerConnect 6224.  I have configured them as per the Dell best practices: a separate VLAN for iSCSI traffic, jumbo frames enabled all the way through from host to SAN, and flow control enabled too.  These switches are dedicated to iSCSI traffic only.
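For reference, the jumbo frame and flow control settings were applied roughly as below on the 6224s (a sketch from memory rather than the exact config; the port range is just an example and the syntax can vary slightly between firmware revisions):

   console# configure
   console(config)# flowcontrol
   console(config)# interface range ethernet 1/g1-1/g24
   console(config-if)# mtu 9216
   console(config-if)# exit
   console(config)# exit
   console# copy running-config startup-config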

SAN = Dell EqualLogic PS4000XV, fully loaded with 450GB 10,000RPM SAS disks.  The SAN has two controllers; each controller has two 1Gb links and a 100Mb management port.  The SAN is configured as RAID 50.

 

I have cabled the iSCSI network for redundancy, so it's connected as follows.

VM Host 1 = 4 iSCSI connections: ports 1 & 2 go to switch 1, ports 1 & 2; ports 3 & 4 go to switch 2, ports 3 & 4.

VM Host 2 = 4 iSCSI connections: ports 1 & 2 go to switch 2, ports 1 & 2; ports 3 & 4 go to switch 1, ports 3 & 4.

VM Host 3 = 4 iSCSI connections: ports 1 & 2 go to switch 1, ports 5 & 6; ports 3 & 4 go to switch 2, ports 5 & 6.

Backup server = 2 iSCSI connections: port 1 goes to switch 1, port 23; port 2 goes to switch 2, port 23.

SAN = the two controllers are active/passive, so eth0 (active & passive) goes to ports 20 & 21 on switch 1, and eth1 (active & passive) goes to ports 20 & 21 on switch 2.

 

I have set up the VM hosts to use Round Robin all the way and have reduced the queue depth to 3 per datastore as recommended by Dell; this is to help spread the load over all the links sooner, instead of it only kicking in for large files.
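For anyone wanting to copy this, the changes were made per volume from the ESXi shell, roughly as below (a rough sketch only; the naa ID is a placeholder for each EqualLogic volume, and here the 3 is set as the Round Robin IOPS limit, i.e. how many I/Os go down one path before switching to the next):

   # show devices and their current path selection policy
   esxcli storage nmp device list
   # set Round Robin as the path selection policy for a volume
   esxcli storage nmp device set --device=naa.6090a0xxxxxxxxxx --psp=VMW_PSP_RR
   # switch paths every 3 I/Os instead of the default 1000
   esxcli storage nmp psp roundrobin deviceconfig set --device=naa.6090a0xxxxxxxxxx --type=iops --iops=3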

I have disabled Delayed Ack on the VM hosts and the 2008 backup server.
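On the Windows side this is just the per-interface TcpAckFrequency registry value, set to 1 and followed by a reboot (the GUID below is a placeholder for the iSCSI NIC's interface; on ESXi it is normally done through the software iSCSI adapter's advanced settings in the vSphere Client):

   reg add "HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{your-iscsi-nic-guid}" /v TcpAckFrequency /t REG_DWORD /d 1 /f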

If I benchmark an iSCSI-connected test volume formatted as NTFS from the backup server, I get 240MB/s read and about 220MB/s write, so it is clearly maxing out the iSCSI link speed when checked against a network monitor on the NICs handling iSCSI. MPIO is installed on the backup server.

 

Now, my problem is this.  In the web interface for the 6224 switches I can see that port 1 is carrying over 4x the data throughput of the other three ports coming from the VM hosts.  I have configured iSCSI on each VM host so that each port has its own VMkernel port with a single NIC linked to it, and each VMkernel port is then bound to the software iSCSI adapter (vmhba).
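For the binding itself, the VMkernel-to-adapter mapping can be checked and added from the ESXi shell, roughly as below (vmhba33 and the vmk numbers are placeholders; each vmk should have exactly one active uplink in its port group):

   # list the VMkernel ports currently bound to the software iSCSI adapter
   esxcli iscsi networkportal list --adapter=vmhba33
   # bind another VMkernel port (one per physical iSCSI NIC)
   esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk2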

When I perform a backup I only get a throughput of 2,000-2,500MB/min (roughly 33-42MB/s) when backing up over SAN transport.  I can see both NICs are being used, but they only ever do about 20MB/s each.  I get around 47MB/s when using NBD over a single NIC.

If I copy a 30GB file from my laptop (which has an SSD) to the DAS on the backup server over the 1Gb network, the copy maxes out at over 105MB/s.

If I upload the same file to one of the datastores through VMware, I can't copy it any faster than 40-47MB/s. Since the slowest factor in this copy to the SAN should be my network link, I should be maxing that out at around 100MB/s.

The vCenter Server is running on the same server as the backup software, but it's worth noting that vCenter was once running as a VM on the hosts and the backup speed was still slow then.  I have also copied the same 30GB file directly to the datastore while connected straight to the ESXi host rather than via the vCenter Server, and the speed was the same. So this eliminates the backup / vCenter server as the issue.

The final part of this is the subnetting.  I have used the IP address range 40.0.0.1 - 200 for these devices.

The SAN has 40.0.0.10 for the discovery portal, and the SAN NICs are 40.0.0.2 & 3.

A VM host example is one IP for each iSCSI NIC port, so 40.0.0.11, 21, 31 and 41, with a subnet mask of 255.255.0.0.

Should I be using a different subnet for each NIC port, and if so, could someone give an example of a way that would be better?

 

I am very interested in the subnetting, as there seem to be people with different views, and I want to know if I have done something wrong here to the point that it's causing these issues.

 

In the link below there is a Dell forum post that says something about different subnets causing data to flow only down the first NIC it finds on that subnet, or something like that. I'm sorry to say I don't understand the relevance of it, though; see the link below.

http://en.community.dell.com/support-forums/storage/f/1216/t/19266499.aspx

 

 

If anyone can help here this would be great.

 

Kind Regards

 

SyLvEsTeR20007

 

 

 


December 11th, 2011 10:00

I'll share with you a common theme at this year's Dell Storage Forum.  Many presentations that focused on optimization and tuning all suggested the very same thing: keep your SAN storage network flat (no subnetting) and simple, make sure your switches are stacked (instead of LAGged), and configure them to be rock solid.


December 11th, 2011 11:00

With LAGs you introduce several potential issues with each LAG you set up.  They work okay when everything is absolutely perfect, but it is tough to understand what "perfect" really is, especially when dealing with iSCSI traffic.  I can't speak to the actual 10G links, as I didn't do that.  But like many LAGs, you can be okay on the bandwidth of the trunk itself, while how it runs is another matter.  A stack basically just extends the circuitry of the board, and doesn't have to follow the rules of Ethernet.

Take a look at my post here, where I detail step by step how I approached it:  itforme.wordpress.com/.../reworking-my-powerconnect-6200-switches-for-my-iscsi-san


December 12th, 2011 04:00

Why are you using a non-private 'range' for iSCSI instead of a whole (private) subnet?

I'd recommend fully isolating the iSCSI traffic (maybe leave one port on VLAN 1 or so on each switch so you can manage them from the LAN, and put the rest in the iSCSI VLAN) and using a private IP range for iSCSI.

Private Class A range: 10.x.x.x

Private Class B range: 172.16.x.x - 172.31.x.x

Private Class C range: 192.168.x.x

Like sketchy00 mentioned, there should be no routing to your iSCSI network, so there's no need to use a public IP scheme.
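As a rough example only (the 192.168.10.x range and the exact IPs are just placeholders), a flat single-subnet layout for a setup like yours could look like this:

   iSCSI subnet:           192.168.10.0/24 (own VLAN, no gateway, not routed)
   EqualLogic group IP:    192.168.10.10
   EqualLogic eth0 / eth1: 192.168.10.11 - 192.168.10.12
   ESXi host 1 vmk ports:  192.168.10.21 - 192.168.10.24
   ESXi host 2 vmk ports:  192.168.10.31 - 192.168.10.34
   ESXi host 3 vmk ports:  192.168.10.41 - 192.168.10.44
   Backup server NICs:     192.168.10.51 - 192.168.10.52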

December 11th, 2011 10:00

Hi buddy, first off, that's useful stuff to know, as I have been wondering whether subnetting was part of my problems.  Glad to know I can forget about that.  The next question would be how much difference stacking makes over a LAG.  I have the two switches linked with a 10Gb RJ45 uplink module (slot 2 module), and the two 10G links are LAGged together.  Is this not the best way to do it? Will it cause performance issues? Would I be better off using a single link on XG3, not connecting XG4, and not LAGging the two together?

Is there any reason for the heavy throughput on link 1 of the 4 iSCSI links?

 

Thanks again,

 

 

December 11th, 2011 13:00

Wow, that's a fantastic write-up, and there are a few things I have done differently. For starters, I have spanning tree disabled across the board, and I didn't separate all of those things into their own VLANs; I only separated the iSCSI traffic into its own VLAN.

I'll go back over your notes tomorrow when I'm not so tired and look at implementing them. Do you have any other write-ups regarding backups or VMs? I would be interested to read about your experiences.

Many thanks for your link.

TTFN

December 12th, 2011 05:00

Hi,

Thanks for the response, Dev Mgr.

I am sorry to say that I'm still learning, and as a result I can't fully follow the reasons for and against the different IP ranges, private and non-private.  I figured that since the iSCSI network is running on its own switches, it didn't matter whether it was a private or non-private address range.  With this in mind, is there still a reason not to use the 40.0.0.x addressing I'm using?

I have left a single port on VLAN 1 for management so I can manage each switch via the HTTP interface or telnet; the IPs are 10.0.0.201 and 202, as there are two switches for iSCSI.

I failed to mention that I have the two switches connected via a 10G uplink module; the module has two RJ45 ports, XG3 and XG4, and I LAGged those two ports.  I'm not sure if this is a mistake based on what sketchy00 has been kind enough to share with me.

Once again, I am learning and will take all advice with the greatest of thanks.

Syl


December 13th, 2011 02:00

it is tough to understand what "perfect" really is, especially when dealing with iSCSI traffic.

December 13th, 2011 06:00

iippers, truer words don't seem to have ever been spoken.

I thought I would update the thread. I have now changed our iSCSI network addressing to a Class C address range.  I enabled spanning tree with RSTP and enabled PortFast on each iSCSI port (all ports, basically).  In fact, to be clear, I have used sketchy00's setup as a guideline, so I wiped my configs and started over.  I also got my hands on an engineering release of the new firmware for the Dell 6224 switches.  I am told it's the release firmware, but it's not available via Dell's website yet as they haven't finished the internal help files, so it still contains the documentation for the previous firmware.
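In case it helps anyone else, the spanning tree part was along these lines on the 6224s (again a rough sketch; the port range is just an example and should cover the iSCSI-facing ports, not the uplink/LAG ports):

   console(config)# spanning-tree mode rstp
   console(config)# interface range ethernet 1/g1-1/g22
   console(config-if)# spanning-tree portfast
   console(config-if)# exit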

For those interested, the firmware rev is 3.3.2.3, and it brings a number of enhancements.

I can't confirm whether these changes have given me any benefit, because when I turned the switches back on after having them powered off, the 10G RJ45 uplink module in switch 2 decided it had had enough of life, so it's dead.  Dell are sending me a replacement, which I should have in the next day or two, so for now the switches are connected with a single 1Gb link.  I can report that my read speeds on VM disks have dropped from 220MB/s to 180MB/s, and write speeds are down to 50MB/s from 170MB/s.  My backups are still unaffected, in as much as I still only get about 40MB/s, so nothing new there.

Since this thread was originally about subnetting, I have decided to mark sketchy00's answer as the answer to my question, but I would still welcome feedback if anyone can offer more tips for improving my VM iSCSI network.

Syl

