Unsolved

July 16th, 2013 13:00

NIC bonding through 2 unstacked 6248 switches

Hello,

I'm trying to connect a server (on Linux, using two NICs) through 2 unstacked PowerConnect 6248 switches, for redundancy. I'd like to use balance-alb mode on the Linux bonding driver, but for some reason I get odd behavior: lots of dropped packets, about half of all traffic.

Does anyone have an idea what might be causing the problem?

802 Posts

July 16th, 2013 13:00

Bonding is considered the same as teaming the NICs together, or Link Aggregation on the switch side. When teaming, the individual physical ports need to end at the same device. In your case this is not happening. If the 6248 switches were stacked, you could have one cable going to each physical switch. This would give you physical redundancy if a switch were to go down for some reason.

This would be the reason for the dropped packets.

July 17th, 2013 03:00

Well, according to the Linux docs on the matter, the balance-alb bonding/teaming mode does not require anything special in terms of switch config. So I'm not using LAG / LACP on the switches.

As far as I understand, the Linux algorithm chooses a NIC MAC address for every outgoing connection and also replies to ARP queries in a similar way. The goal is to spread peers equally over all NICs (every peer is assigned one NIC, but not all peers share the same NIC), which gives you both redundancy and load sharing. And I like that :)
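To make that concrete, here is a toy model of the receive side as I understand it: each peer gets told a different slave MAC for the same server IP, handed out in round-robin order. This is only a sketch of the idea, not the kernel's code; the function name, MAC addresses and peer IPs are all made up for illustration.

```python
from itertools import cycle


def assign_peers(peers, slave_macs):
    """Toy model: give each peer the MAC of the next slave, round-robin."""
    rotation = cycle(slave_macs)
    return {peer: next(rotation) for peer in peers}


# Hypothetical values, for illustration only.
slaves = ["00:11:22:33:44:01", "00:11:22:33:44:02"]
peers = ["10.0.0.11", "10.0.0.12", "10.0.0.13", "10.0.0.14"]

table = assign_peers(peers, slaves)
for peer, mac in table.items():
    print(f"{peer} is told the server is at {mac}")
```

The point is that the same server IP ends up advertised with more than one MAC address, so anything on the switches that watches ARP traffic could plausibly react to it.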

Anyway, I'm not really used to Dell switches. If I wanted to get deeper insight into what exactly is happening, can you give me some suggestions on where to look? Are there any network diagnostic tools I can use with the switches? Or logging, packet capture, etc.? ARP monitoring? Maybe the Linux ARP 'tricks' are somehow messing up the switches?

802 Posts

July 17th, 2013 10:00

Wireshark would be a good tool to capture the packets for analysis.

I do see where this document says that no special switch configuration is needed for alb bonding:

www.kernel.org/.../bonding.txt

However, I do not see where it says you can span/split the bond members across 2 different physical switches. The essence of bonding/teaming is connecting 2 physical devices with multiple members of the bond/team. It is my opinion that this is the reason for the dropped packets. If you moved the member cables of the bond to a single switch, would you still see the same behavior?

July 19th, 2013 02:00

Right, ok, yesterday I found some settings on the switches that control dynamic ARP inspection, which seems to be some sort of rate limiter for ARP traffic. I disabled it (thinking that Linux does GARP and stuff) and that solved most of the problems. For instance, at the moment both switches see the correct MAC address at their ports, and if I unplug one of the cables, the ARP tables on the switches change and traffic is not interrupted.
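For comparing against what the switches show, the per-slave MAC addresses can be read on the Linux side from /proc/net/bonding/<bond name>. Below is a minimal sketch of a parser for that text. The sample is a shortened, made-up example of the usual "Key: value" layout (interface names and MACs are hypothetical), and the exact fields may differ between driver versions.

```python
def parse_bonding_status(text):
    """Split bonding status text into bond-level fields and a list of slaves."""
    bond, slaves, current = {}, [], None
    for line in text.splitlines():
        if ":" not in line:
            continue
        # Split on the first colon only, so MAC addresses stay intact.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = {"name": value}
            slaves.append(current)
        elif current is None:
            bond[key] = value
        else:
            current[key] = value
    return bond, slaves


# Hypothetical sample; on a real host, read /proc/net/bonding/bond0 instead.
SAMPLE = """\
Bonding Mode: adaptive load balancing
Currently Active Slave: eth0
MII Status: up

Slave Interface: eth0
MII Status: up
Permanent HW addr: 00:11:22:33:44:01

Slave Interface: eth1
MII Status: up
Permanent HW addr: 00:11:22:33:44:02
"""

bond, slaves = parse_bonding_status(SAMPLE)
for slave in slaves:
    print(slave["name"], slave["MII Status"], slave["Permanent HW addr"])
```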

Unfortunately, I still have a problem: traffic between some host pairs is impossible. When I ping from one server to another, tcpdump shows that the requests are received, but the replies disappear somewhere, even though the ARP tables on the switches seem to be correct.
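One rough way to pin down where the replies go missing is to capture on both hosts and list the echo requests that never got a matching reply in each capture. The sketch below assumes tcpdump's usual one-line text output for IPv4 ICMP (as printed with -n); the regular expression, function name and sample lines are my own and may need adjusting for other output formats.

```python
import re

# Matches lines such as:
# "... IP 10.0.0.11 > 10.0.0.5: ICMP echo request, id 4660, seq 1, length 64"
ECHO = re.compile(
    r"IP (?P<src>\S+) > (?P<dst>[^:\s]+): ICMP echo (?P<kind>request|reply), "
    r"id (?P<id>\d+), seq (?P<seq>\d+)"
)


def unanswered_requests(lines):
    """Return (src, dst, id, seq) for echo requests with no reply in the capture."""
    pending = {}
    for line in lines:
        m = ECHO.search(line)
        if not m:
            continue
        if m["kind"] == "request":
            pending[(m["src"], m["dst"], m["id"], m["seq"])] = line
        else:
            # A reply travels in the opposite direction of its request.
            pending.pop((m["dst"], m["src"], m["id"], m["seq"]), None)
    return sorted(pending)


# Hypothetical capture, for illustration only.
SAMPLE = """\
12:00:01.000001 IP 10.0.0.11 > 10.0.0.5: ICMP echo request, id 4660, seq 1, length 64
12:00:01.000105 IP 10.0.0.5 > 10.0.0.11: ICMP echo reply, id 4660, seq 1, length 64
12:00:02.000001 IP 10.0.0.11 > 10.0.0.5: ICMP echo request, id 4660, seq 2, length 64
"""

missing = unanswered_requests(SAMPLE.splitlines())
print(missing)
```

If a request shows up as unanswered in the capture on the replying host, the reply never left that host; if it is answered there but unanswered in the capture on the pinging host, the reply was lost somewhere in between.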

Any thoughts? Ideas about other arp-related settings on the switches?

802 Posts

July 19th, 2013 10:00

How are you physically connecting the cables and hardware at this point?  How exactly did you disable the garp?  The "no garp timer" command just resets the parameters to default based on this information I pulled from the CLI User Guide.

garp timer

Use the garp timer command in Interface Configuration mode to adjust the GARP application join, leave, and leaveall GARP timer values. To reset the timer to default values, use the no form of this command.

Syntax

garp timer {join | leave | leaveall} timer_value

no garp timer

I'm not seeing many options in manipulating the arp feature other than changing the timers.  I will keep looking at what options we may have on configuring it differently.

July 20th, 2013 02:00

Nooo, perhaps I wasn't as clear as I could have been. I didn't disable the GARP timer. What I was doing was browsing the web interface of the switch ... looking for inspiration or something :) And I found this 'dynamic ARP inspection' section under 'switching'. In it there's a filter that apparently applies limits to ARP traffic by default. The help says you can turn it off by setting the port to 'trusted' state. This is what I did.

5 Practitioner

274.2K Posts

July 22nd, 2013 11:00

Just trying to gather some more information.

What firmware are the switches at?

Are we only seeing the lost replies when using this specific bond mode? If we do a single connection without bonding do the replies make it back properly?

To confirm, you are seeing the replies leave the server? Or a reply from the server is never seen?

Thanks
