Unsolved
January 11th, 2016 07:00
stack-mac persistent
Hello all,
Our company recently acquired four Dell N4064F switches. We have two in one COLO facility and two in another; at each site the two switches are stacked and act as one logical switch. Testing has shown that the cross-stack LACP port-channels connecting the two stacks, and the ones connected to our corporate headquarters network, temporarily go down when the stack master fails (powered down, etc). When one of the switches in the stack goes down, traffic should be able to keep passing through the remaining active link in the LAG, but there is a long pause before that link starts passing traffic again.
When the master goes down, it appears that spanning tree sees a topology change and begins to go through its calculations. Traffic begins passing again once the port-channel reaches the forwarding state, but that takes about 20-30 seconds.
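As a rough sanity check on the 20-30 seconds: if the stacks are running classic 802.1D spanning tree with default timers (an assumption, since the mode and timer values are not stated here), a port that comes back up spends one forward delay in listening and one in learning before it forwards. A minimal Python sketch of that arithmetic; the function name is made up for illustration:

```python
# Classic 802.1D default forward delay in seconds.
# Assumed value; not read from the switches discussed in this thread.
FORWARD_DELAY = 15


def classic_stp_forwarding_delay(forward_delay=FORWARD_DELAY):
    """Seconds for a port to move listening -> learning -> forwarding."""
    listening = forward_delay  # port listens for BPDUs, forwards nothing
    learning = forward_delay   # port learns MAC addresses, still forwards nothing
    return listening + learning


print(classic_stp_forwarding_delay())   # default timers
print(classic_stp_forwarding_delay(4))  # forward delay tuned down to the 802.1D minimum
```

With the defaults this comes to 30 seconds, which is in line with the pause described above if the port-channel is treated as a freshly connected link.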
What I don't understand is why the port-channels connecting to other switches go down at all. One of the member links goes down, but not both, so I would expect the port-channel itself to stay up. I came across some forum posts where people had the same issue with Cisco 3750s and solved it with the command 'stack-mac persistent timer 0'. Does anyone know if there is something comparable on the Dell N4000 series switches? I've opened a case with Dell support and am still waiting to hear back. Below is one of the threads where I found this:
supportforums.cisco.com/.../stack-mac-persistent-timer-lacp-port-channels-c3750


MTSIA5988
January 11th, 2016 11:00
Hi Daniel--I appreciate the response.
Yes, I've looked into the NSF feature on these switches. It appears to be enabled by default, and I have confirmed that it is on for our stacks.
I could be wrong, but it looks like the LACP port-channel has to reset or renegotiate once the stack MAC address changes (master fails/reboots). Once this happens, traffic does not pass through the LAG until STP has gone through its calculations. What I think I need is something that stops the LACP LAGs connected to other switches from restarting and causing STP to kick in.
MTSIA5988
January 12th, 2016 06:00
Hi Daniel,
I tried changing both sides of the LAG to short LACP timeouts, but this does not appear to have solved the problem. When the master dies, the LACP port-channel interface still goes down and then comes back up, causing an STP reconvergence.
The stack firmware is the newest at 6.2.7.2. I also tested with the LAG set to static: when the master dies the port-channel interface still has a short 1-2 second blip, which still triggers STP calculations. When I disabled spanning-tree completely on both stacks the behavior was much better. The 1-2 second blip of the port-channel interface going down is the only downtime you see. It comes right back up and everything is in a forwarding state instead of going through listening, learning, etc.
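For anyone who wants to put a number on the blip during a failover test, here is a minimal Python sketch that takes timestamped probe results (for example from a ping loop across the LAG) and reports the longest stretch without a reply. The function name and the sample data are hypothetical and not taken from the tests described above:

```python
def longest_outage(samples):
    """Longest gap in seconds between a good probe and the next good probe
    when at least one probe was lost in between.

    samples: iterable of (timestamp_seconds, reply_received), in time order.
    """
    longest = 0.0
    last_ok = None    # timestamp of the most recent successful probe
    saw_loss = False  # True if a probe failed since last_ok
    for ts, ok in samples:
        if ok:
            if saw_loss and last_ok is not None:
                longest = max(longest, ts - last_ok)
            last_ok = ts
            saw_loss = False
        else:
            saw_loss = True
    return longest


# Made-up example: probes every 0.5 s, three of them lost.
example = [(0.0, True), (0.5, True), (1.0, False), (1.5, False),
           (2.0, False), (2.5, True), (3.0, True)]
print(longest_outage(example))
```

The result is only as precise as the probe interval, so a shorter interval is needed to tell a 1-2 second port-channel flap apart from a longer STP pause.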
MTSIA5988
January 12th, 2016 07:00
I have tested this with different spanning-tree versions as well. With RSTP or Rapid-PVST the outage on the LAG connecting the two stacks gets shorter. Ultimately, though, I want to know whether there is a way on these switches to keep passing traffic on the remaining port-channel member when the stack master/management plane dies, without the port-channel going down for even the 1-2 seconds. The non-stop forwarding (NSF) feature does not appear to do this.