Unsolved
spanning tree protocol and isolated switch
Hi,
We have a network with two 6248s, two M6220s and four M8024s, a few PowerEdge blade/rack servers and a few EqualLogic iSCSI arrays.
There are a few redundant paths in the network topology, but that shouldn't be an issue, since the switches run the Rapid Spanning Tree Protocol (RSTP) by default.
But yesterday we hit a big issue.
After reloading one of the M8024s, it got completely isolated from the network!
The M8024 in question (switch-c1) is connected to another M8024 (switch-b1) and to a 6248 (switch2).
It appears that the spanning tree protocol instance on the latter two switches is blocking the ports connected to the switch in question, see the diagnostic output and the logs below.
And being isolated from the network, the switch in question is running its own spanning tree protocol instance.
Before the reload, the switch was apparently working just fine within the network.
We would expect only one of the two switches connected to it to block its link in order to prevent a loop in the network, not both of them at the same time!
The switch has now been isolated from the network for more than a day, and a new STP log entry is created every two seconds on each of the two switches connected to it.
It looks like the issue is not going to resolve automatically.
The 6248s are running firmware version 3.3.5.5.
The M6220s and M8024s are running firmware version 5.0.1.3.
What's wrong with our setup?
Thanks.
P.S.: The MAC addresses and the IP addresses in this message are masked.
switch2#show spanning-tree blockedports
Spanning tree Enabled (BPDU flooding : Disabled) mode rstp
CST Regional Root: 80:00:00:19:B9:11:11:11
Regional Root Path Cost: 0
###### MST 0 Vlan Mapped: 1, 10, 31-46, 150
ROOT ID
Address 00:19:B9:11:11:11
This Switch is the Root.
Hello Time 2 Sec Max Age 20 sec Forward Delay 15 sec
Interfaces
Name State Prio.Nbr Cost Sts Role PortFast RestrictedPort
------ -------- --------- ---------- ---- ----- -------- -------
1/xg4 Enabled 128.52 2000 DSC Desg No No
switch2# show logging
<189> MAR 08 18:19:18 111.111.111.111-1 TRAPMGR[152561184]: traputil.c(611) 54331 %% 1/0/52 is transitioned from the Forwarding state to the Blocking state in instance 0
<189> MAR 08 18:19:20 111.111.111.111-1 TRAPMGR[152561184]: traputil.c(611) 54332 %% 1/0/52 is transitioned from the Forwarding state to the Blocking state in instance 0
<189> MAR 08 18:19:22 111.111.111.111-1 TRAPMGR[152561184]: traputil.c(611) 54333 %% 1/0/52 is transitioned from the Forwarding state to the Blocking state in instance 0
switch-b1#show spanning-tree blockedports
Spanning tree Enabled (BPDU flooding : Disabled) mode rstp
CST Regional Root: 80:00:A4:BA:DB:22:22:22
Regional Root Path Cost: 0
###### MST 0 Vlan Mapped: 1, 10, 31-45, 150
ROOT ID
Priority 32768
Address 0019.B911.1111
Path Cost 3000
Root Port Te1/1/4
Hello Time 2 Sec Max Age 20 sec Forward Delay 15 sec
Bridge ID
Priority 32768
Address A4BA.DB22.2222
Hello Time 2 Sec Max Age 20 sec Forward Delay 15 sec
Interfaces
Name State Prio.Nbr Cost Sts Role RestrictedPort
------ -------- --------- --------- ---- ----- --------------
Te1/1/3 Enabled 128.19 2000 DSC Desg No
switch-b1#show logging
<189> MAR 08 18:08:11 222.222.222.222-1 TRAPMGR[199127520]: traputil.c(637) 57162 %% Te1/1/3 is transitioned from the Forwarding state to the Blocking state in instance 0
<189> MAR 08 18:08:09 222.222.222.222-1 TRAPMGR[199127520]: traputil.c(637) 57161 %% Te1/1/3 is transitioned from the Forwarding state to the Blocking state in instance 0
<189> MAR 08 18:08:07 222.222.222.222-1 TRAPMGR[199127520]: traputil.c(637) 57160 %% Te1/1/3 is transitioned from the Forwarding state to the Blocking state in instance 0
switch-c1#show spanning-tree blockedports
Spanning tree Enabled (BPDU flooding : Disabled) mode rstp
CST Regional Root: 80:00:5C:26:0A:33:33:33
Regional Root Path Cost: 0
###### MST 0 Vlan Mapped: 1, 10, 31-45, 150
ROOT ID
Priority 32768
Address 5C26.0A33.3333
This Switch is the Root.
Hello Time 2 Sec Max Age 20 sec Forward Delay 15 sec
Interfaces
Name State Prio.Nbr Cost Sts Role RestrictedPort
------ -------- --------- --------- ---- ----- --------------
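The "every two seconds" observation can be checked directly against the log excerpts. Below is a minimal Python sketch, not part of the original post, that extracts the timestamps from the switch2 log lines quoted above and computes the gaps between them. The function name `log_intervals` is hypothetical, and the sketch assumes all entries fall on the same day.

```python
# Log lines copied from the switch2 excerpt (addresses are masked in the thread).
LOG_LINES = [
    "<189> MAR 08 18:19:18 111.111.111.111-1 TRAPMGR[152561184]: traputil.c(611) 54331 %% 1/0/52 is transitioned from the Forwarding state to the Blocking state in instance 0",
    "<189> MAR 08 18:19:20 111.111.111.111-1 TRAPMGR[152561184]: traputil.c(611) 54332 %% 1/0/52 is transitioned from the Forwarding state to the Blocking state in instance 0",
    "<189> MAR 08 18:19:22 111.111.111.111-1 TRAPMGR[152561184]: traputil.c(611) 54333 %% 1/0/52 is transitioned from the Forwarding state to the Blocking state in instance 0",
]


def log_intervals(lines):
    """Return the gaps, in seconds, between consecutive log timestamps.

    Assumption: every line has the time of day as its fourth
    whitespace-separated field and all entries are from the same day.
    """
    seconds = []
    for line in lines:
        fields = line.split()
        hours, minutes, secs = (int(part) for part in fields[3].split(":"))
        seconds.append(hours * 3600 + minutes * 60 + secs)
    # Sort so that newest-first logs (as on switch-b1) are handled too.
    seconds.sort()
    return [later - earlier for earlier, later in zip(seconds, seconds[1:])]


print(log_intervals(LOG_LINES))
```

The interval coincides with the 2-second Hello Time reported in the spanning tree output, which would be consistent with the port state being re-evaluated on each BPDU. That link is an inference, not something the logs themselves state.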
DELL-Willy M
March 8th, 2013 11:00
I would suggest setting the spanning tree priority on the current root switch to 4096. This would keep that switch as the root. We can then run the clear spanning-tree detected-protocols command and see whether the network returns to normal after the reconvergence. Alternatively, you could manually pull one of the redundant cables connecting switch-c1, which would also trigger a reconvergence.
console(config)#spanning-tree priority 4096
Syntax
clear spanning-tree detected-protocols [{gigabitethernet unit/slot/port | port-channel port-channel-number | tengigabitethernet unit/slot/port}]
Example
The following example restarts the protocol migration process (forces the renegotiation with neighboring switches) on 1/0/1.
console#clear spanning-tree detected-protocols gigabitethernet 1/0/1
pzero
March 9th, 2013 02:00
Hi Willy,
About changing the priority for the root switch, is that really required?
Even if all switches have the same priority (the default one), wouldn't the different MAC addresses result in one of them always being the root switch anyway?
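As a rough illustration of the tie-break being asked about here: spanning tree elects the bridge with the numerically lowest bridge ID, which is the priority followed by the MAC address. The Python sketch below is an editorial illustration using the masked addresses from the outputs in this thread. The helper names `bridge_id` and `elect_root` are hypothetical.

```python
def bridge_id(priority, mac):
    """Build a comparable bridge ID: priority first, then the MAC as an integer."""
    digits = mac.replace(":", "").replace(".", "")
    return (priority, int(digits, 16))


def elect_root(bridges):
    """Return the name of the bridge with the lowest bridge ID."""
    return min(bridges, key=bridges.get)


# Priorities and (masked) MAC addresses as shown in the thread's outputs.
bridges = {
    "switch2": bridge_id(32768, "00:19:B9:11:11:11"),
    "switch-b1": bridge_id(32768, "A4BA.DB22.2222"),
    "switch-c1": bridge_id(32768, "5C26.0A33.3333"),
}

# With equal priorities, the lowest MAC address wins.
print(elect_root(bridges))

# Lowering one switch's priority overrides the MAC tie-break.
pinned = dict(bridges)
pinned["switch-b1"] = bridge_id(4096, "A4BA.DB22.2222")
print(elect_root(pinned))
```

So an explicit priority is not needed for an election to succeed. It only makes the outcome independent of which MAC addresses happen to be present.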
Before we try to clear the detected protocols or pull one of the cables, we'd like to understand how the isolated switch got stuck in such a situation, so that we can avoid a similar issue in the future. After all, it was apparently working fine before the reload.
Do you have any ideas on what could have caused the issue and how to prevent it from happening again?
Do you think there is something wrong with our setup? Or could it be a bug in the firmware of one of the switches?
Thanks.
DELL-Willy M
March 11th, 2013 13:00
I'm still researching the possible reasons for the isolation of the switch.
One question I have is how the trunk links are configured on the different switch connections. Something to consider is that, with the 6224 specifically, you should use a general mode switchport instead of a trunk switchport to allow management traffic onto the switch over the PVID. If you use trunk mode, you will not have the default VLAN on those ports; the ports will only allow tagged traffic.
If these connections are not set properly, then the multicast BPDUs are blocked, which can cause issues like the one you are seeing.
DELL-Willy M
March 11th, 2013 13:00
Would it be possible to get a topology and the show tech output for all your switches?
We can communicate thru email.
If loop guard is set up incorrectly you could also see a situation like this.
DELL-Willy M
March 12th, 2013 13:00
From the configurations that you provided, the issue may lie with the trunk/general port settings. Since you are running all tagged VLANs and no untagged VLANs, spanning tree should still be going over VLAN 1.
This would make sense of why the switch dropped off: it is not receiving any BPDUs. The weird thing is that you are getting BPDUs on B1, which, as you have stated, is working fine.
The options that are available to tweak the configurations are:
1) On the two M8024s, switch to trunk mode (this will bring VLAN 1 back into the allowed state, allowing BPDU traffic on VLAN 1)
2) Stick with general switchport mode and change the PVID on all four ports to VLAN 1
pzero
March 13th, 2013 08:00
Hi Willy,
We don't understand why switch-c1 would be dropping the BPDUs with the current configuration. Could you elaborate a bit on that?
Regarding the suggested options:
1) Doesn't general mode allow VLAN 1 by default (as does trunk mode)?
The show interface detail outputs below seem to confirm that VLAN 1 is already allowed on all the interconnection ports.
2) The outputs also appear to confirm that all ports already have PVID set to 1.
We still cannot figure out why the sent/received BPDU counters keep increasing for all interconnection ports except those on the isolated switch, for which only the sent counter keeps increasing while the received counter stays at 0.
What could be causing that? Anything else we can try to figure out the isolation cause?
Thanks.
switch-c1#show interfaces detail Te1/1/3
Port Name Duplex Speed Neg MTU Admin Link
State State
--------- --------------------------- ------ ------- ---- ----- ----- -----
Te1/1/3 Full 10000 Off 9216 Up Up
Port Description
-------- ----------------------------------------------------------------------
Te1/1/3
Flow Control:Enabled
Port: Te1/1/3
VLAN Membership mode:General Mode
Operating parameters:
PVID: 1
Ingress Filtering: Enabled
Acceptable Frame Type: Admit All
Default Priority: 0
GVRP status:Disabled
Protected:Disabled
Port Te1/1/3 is member in:
VLAN Name Egress rule Type
---- --------------------------------- ----------- --------
1 default Untagged Default
10 VLAN0010 Tagged Static
31 VLAN0031 Tagged Static
32 VLAN0032 Tagged Static
33 VLAN0033 Tagged Static
34 VLAN0034 Tagged Static
35 VLAN0035 Tagged Static
36 VLAN0036 Tagged Static
37 VLAN0037 Tagged Static
38 VLAN0038 Tagged Static
39 VLAN0039 Tagged Static
40 VLAN0040 Tagged Static
41 VLAN0041 Tagged Static
42 VLAN0042 Tagged Static
43 VLAN0043 Tagged Static
44 VLAN0044 Tagged Static
45 VLAN0045 Tagged Static
150 VLAN0150 Tagged Static
Static configuration:
PVID: 1
Ingress Filtering: Enabled
Acceptable Frame Type: Admit All
Port Te1/1/3 is statically configured to:
VLAN Name Egress rule
---- --------------------------------- -----------
10 VLAN0010 Tagged
31 VLAN0031 Tagged
32 VLAN0032 Tagged
33 VLAN0033 Tagged
34 VLAN0034 Tagged
35 VLAN0035 Tagged
36 VLAN0036 Tagged
37 VLAN0037 Tagged
38 VLAN0038 Tagged
39 VLAN0039 Tagged
40 VLAN0040 Tagged
41 VLAN0041 Tagged
42 VLAN0042 Tagged
43 VLAN0043 Tagged
44 VLAN0044 Tagged
45 VLAN0045 Tagged
150 VLAN0150 Tagged
Forbidden VLANS:
VLAN Name
---- ---------------------------------
Port Te1/1/3 Enabled
State: Forwarding Role: Designated
Port id: 128.19 Port Cost: 2000
Port Fast: No (Configured: no ) Root Protection: No
Designated bridge Priority: 32768 Address: 5C26.0A33.3333
Designated port id: 128.19 Designated path cost: 0
CST Regional Root: 80:00:5C:26:0A:33:33:33 CST Port Cost: 0
BPDU: sent 260144, received 0
switch-c1#show interfaces detail Te1/1/4
Port Name Duplex Speed Neg MTU Admin Link
State State
--------- --------------------------- ------ ------- ---- ----- ----- -----
Te1/1/4 Full 10000 Off 9216 Up Up
Port Description
-------- ----------------------------------------------------------------------
Te1/1/4
Flow Control:Enabled
Port: Te1/1/4
VLAN Membership mode:General Mode
Operating parameters:
PVID: 1
Ingress Filtering: Enabled
Acceptable Frame Type: Admit All
Default Priority: 0
GVRP status:Disabled
Protected:Disabled
Port Te1/1/4 is member in:
VLAN Name Egress rule Type
---- --------------------------------- ----------- --------
1 default Untagged Default
10 VLAN0010 Tagged Static
31 VLAN0031 Tagged Static
32 VLAN0032 Tagged Static
33 VLAN0033 Tagged Static
34 VLAN0034 Tagged Static
35 VLAN0035 Tagged Static
36 VLAN0036 Tagged Static
37 VLAN0037 Tagged Static
38 VLAN0038 Tagged Static
39 VLAN0039 Tagged Static
40 VLAN0040 Tagged Static
41 VLAN0041 Tagged Static
42 VLAN0042 Tagged Static
43 VLAN0043 Tagged Static
44 VLAN0044 Tagged Static
45 VLAN0045 Tagged Static
150 VLAN0150 Tagged Static
Static configuration:
PVID: 1
Ingress Filtering: Enabled
Acceptable Frame Type: Admit All
Port Te1/1/4 is statically configured to:
VLAN Name Egress rule
---- --------------------------------- -----------
10 VLAN0010 Tagged
31 VLAN0031 Tagged
32 VLAN0032 Tagged
33 VLAN0033 Tagged
34 VLAN0034 Tagged
35 VLAN0035 Tagged
36 VLAN0036 Tagged
37 VLAN0037 Tagged
38 VLAN0038 Tagged
39 VLAN0039 Tagged
40 VLAN0040 Tagged
41 VLAN0041 Tagged
42 VLAN0042 Tagged
43 VLAN0043 Tagged
44 VLAN0044 Tagged
45 VLAN0045 Tagged
150 VLAN0150 Tagged
Forbidden VLANS:
VLAN Name
---- ---------------------------------
Port Te1/1/4 Enabled
State: Forwarding Role: Designated
Port id: 128.20 Port Cost: 2000
Port Fast: No (Configured: no ) Root Protection: No
Designated bridge Priority: 32768 Address: 5C26.0A33.3333
Designated port id: 128.20 Designated path cost: 0
CST Regional Root: 80:00:5C:26:0A:33:33:33 CST Port Cost: 0
BPDU: sent 260151, received 0
switch-b1#show interfaces detail Te1/1/3
Port Name Duplex Speed Neg MTU Admin Link
State State
--------- --------------------------- ------ ------- ---- ----- ----- -----
Te1/1/3 Full 10000 Off 9216 Up Up
Port Description
-------- ----------------------------------------------------------------------
Te1/1/3
Flow Control:Enabled
Port: Te1/1/3
VLAN Membership mode:General Mode
Operating parameters:
PVID: 1
Ingress Filtering: Enabled
Acceptable Frame Type: Admit All
Default Priority: 0
GVRP status:Disabled
Protected:Disabled
Port Te1/1/3 is member in:
VLAN Name Egress rule Type
---- --------------------------------- ----------- --------
1 default Untagged Default
10 VLAN0010 Tagged Static
31 VLAN0031 Tagged Static
32 VLAN0032 Tagged Static
33 VLAN0033 Tagged Static
34 VLAN0034 Tagged Static
35 VLAN0035 Tagged Static
36 VLAN0036 Tagged Static
37 VLAN0037 Tagged Static
38 VLAN0038 Tagged Static
39 VLAN0039 Tagged Static
40 VLAN0040 Tagged Static
41 VLAN0041 Tagged Static
42 VLAN0042 Tagged Static
43 VLAN0043 Tagged Static
44 VLAN0044 Tagged Static
45 VLAN0045 Tagged Static
150 VLAN0150 Tagged Static
Static configuration:
PVID: 1
Ingress Filtering: Enabled
Acceptable Frame Type: Admit All
Port Te1/1/3 is statically configured to:
VLAN Name Egress rule
---- --------------------------------- -----------
10 VLAN0010 Tagged
31 VLAN0031 Tagged
32 VLAN0032 Tagged
33 VLAN0033 Tagged
34 VLAN0034 Tagged
35 VLAN0035 Tagged
36 VLAN0036 Tagged
37 VLAN0037 Tagged
38 VLAN0038 Tagged
39 VLAN0039 Tagged
40 VLAN0040 Tagged
41 VLAN0041 Tagged
42 VLAN0042 Tagged
43 VLAN0043 Tagged
44 VLAN0044 Tagged
45 VLAN0045 Tagged
150 VLAN0150 Tagged
Forbidden VLANS:
VLAN Name
---- ---------------------------------
Port Te1/1/3 Enabled
State: Discarding Role: Designated
Port id: 128.19 Port Cost: 2000
Port Fast: No (Configured: no ) Root Protection: No
Designated bridge Priority: 32768 Address: A4BA.DB22.2222
Designated port id: 128.19 Designated path cost: 3000
CST Regional Root: 80:00:A4:BA:DB:22:22:22 CST Port Cost: 0
BPDU: sent 260423, received 260416
switch2#show interfaces detail ethernet 1/xg4
Port Type Duplex Speed Neg Admin Link
State State
----- ------------------------------ ------ ------- ---- ----- -----
1/xg4 10G - Level Full 10000 Auto Up Up
Port Description
---- --------------------------------------------------------------------------
1/xg4
Flow Control:Enabled
Port: 1/xg4
VLAN Membership mode:General Mode
Operating parameters:
PVID: 1
Ingress Filtering: Enabled
Acceptable Frame Type: Admit All
Default Priority: 0
GVRP status:Disabled
Protected:Disabled
Port 1/xg4 is member in:
VLAN Name Egress rule Type
---- --------------------------------- ----------- --------
1 Default Untagged Default
10 Tagged Static
31 Tagged Static
32 Tagged Static
33 Tagged Static
34 Tagged Static
35 Tagged Static
36 Tagged Static
37 Tagged Static
38 Tagged Static
39 Tagged Static
40 Tagged Static
41 Tagged Static
42 Tagged Static
43 Tagged Static
44 Tagged Static
45 Tagged Static
46 Tagged Static
150 Tagged Static
Static configuration:
PVID: 1
Ingress Filtering: Enabled
Acceptable Frame Type: Admit All
Port 1/xg4 is statically configured to:
VLAN Name Egress rule
---- --------------------------------- -----------
10 Tagged
31 Tagged
32 Tagged
33 Tagged
34 Tagged
35 Tagged
36 Tagged
37 Tagged
38 Tagged
39 Tagged
40 Tagged
41 Tagged
42 Tagged
43 Tagged
44 Tagged
45 Tagged
46 Tagged
150 Tagged
Forbidden VLANS:
VLAN Name
---- ---------------------------------
Port 1/xg4 Enabled
State: Discarding Role: Designated
Port id: 128.52 Port Cost: 2000
Port Fast: No (Configured: no ) Root Protection: No
Designated bridge Priority: 32768 Address: 80:00:00:19:B9:11:11:11
Designated port id: 128.52 Designated path cost: 0
CST Regional Root: 80:00:00:19:B9:11:11:11 CST Port Cost: 0
BPDU: sent 2155962, received 260267
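The asymmetry in the counters can be made explicit with a small script. This is a minimal Python sketch added for illustration, not a Dell tool: it takes the "BPDU: sent ..., received ..." lines from the four outputs above and lists the ports that transmit BPDUs but have never received one. The name `silent_ports` is hypothetical.

```python
import re

# Counter lines copied from the "show interfaces detail" outputs above.
COUNTERS = {
    ("switch-c1", "Te1/1/3"): "BPDU: sent 260144, received 0",
    ("switch-c1", "Te1/1/4"): "BPDU: sent 260151, received 0",
    ("switch-b1", "Te1/1/3"): "BPDU: sent 260423, received 260416",
    ("switch2", "1/xg4"): "BPDU: sent 2155962, received 260267",
}

BPDU_RE = re.compile(r"BPDU: sent (\d+), received (\d+)")


def silent_ports(counters):
    """Return the ports that send BPDUs but have received none."""
    result = []
    for port, line in counters.items():
        match = BPDU_RE.search(line)
        if match is None:
            continue  # Not a counter line; skip it.
        sent, received = int(match.group(1)), int(match.group(2))
        if sent > 0 and received == 0:
            result.append(port)
    return result


for switch, port in silent_ports(COUNTERS):
    print(f"{switch} {port}: sending BPDUs but receiving none")
```

Only the two switch-c1 ports are flagged, while both neighbours do receive BPDUs on the same links. That suggests the loss is one-directional (towards switch-c1, or on its ingress path), although the counters alone cannot say where the frames are dropped.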
DELL-Willy M
March 13th, 2013 12:00
As far as elaborating on the cause: we are seeing some unusual behavior from the setup that you have.
1) Specifically on the 6200 models, you have to use general mode and specify the PVID as 1 in order to allow management traffic across the connection. On all the other switches, trunk mode allows all VLANs by default, including the default VLAN 1. You would have to explicitly remove a VLAN if you do not want it on the trunk.
This is what is confusing, because you have been receiving BPDUs on one switch but not the other.
2) I do see the outputs have PVID as 1. The problem has to be how the trunk/general connections are configured between the switches. This is how the BPDUs are sent and received.
If you look at this excerpt from the running config on your 6248, I do not see VLAN 1 listed as added:
interface ethernet 1/xg4
mtu 9216
switchport mode general
switchport general allowed vlan add 10,31-46,150 tagged
We can go to the port (and to all other ports connecting to other switches) and add the line:
switchport general allowed vlan add 1 untagged
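To check which VLANs an explicit "allowed vlan add" line covers, the list can be expanded programmatically. This is a minimal Python sketch added for illustration, with a hypothetical helper name `expand_vlans`. It only interprets the VLAN list string from the excerpt above and says nothing about VLANs the switch adds by default.

```python
def expand_vlans(spec):
    """Expand a VLAN list such as "10,31-46,150" into a set of VLAN IDs."""
    vlans = set()
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            vlans.update(range(int(low), int(high) + 1))
        else:
            vlans.add(int(part))
    return vlans


# VLAN list taken from the 6248 running-config excerpt for 1/xg4.
allowed = expand_vlans("10,31-46,150")

print(1 in allowed)      # Is VLAN 1 explicitly listed?
print(sorted(allowed))
```

Note that the "show interfaces detail" output for 1/xg4 earlier in the thread lists VLAN 1 as an untagged member of type Default, so absence from the explicit list does not necessarily mean the VLAN is absent from the port.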
pzero
March 14th, 2013 10:00
Hi Willy,
As you suggested, we added the following line to the configuration for switch2 port 1/xg4, switch-b1 port Te1/1/3 and switch-c1 ports Te1/1/3 and Te1/1/4:
switchport general allowed vlan add 1 untagged
Unfortunately that didn't help: switch-c1 is still isolated and its ports Te1/1/3 and Te1/1/4 still show 0 BPDUs received.
Also, when executing show running-config on the switches, the listed configuration didn't change at all: the above line is not shown, probably because VLAN 1 is allowed by default, so no explicit configuration line is needed for it.
We really cannot understand why switch-c1 is not receiving any BPDUs from switch-b1 and switch2.
Could it be a firmware bug?
Is there anything else we can do to debug the issue?
Thanks.
DELL-Willy M
March 14th, 2013 16:00
I think at this point it would do us a great service to get a topology of the switches. We have a pretty good understanding, but given the circumstances surrounding this case it would be good to be on top of everything.
switch-c1 M8024 - has these ports connecting to what exactly?
Te1/1/3
Te1/1/4
switch-b1 M8024 - has these ports connecting to what exactly?
Te1/1/3
switch2 6248 - has these ports connecting to what exactly?
1/xg4
I'm still researching what options we have towards a resolution. Sorry to back up a little; I just want to make sure we have a clear understanding.
pzero
March 15th, 2013 10:00
Hi Willy,
Those are the interconnection ports. Here is the relevant information from my email from the other day:
"The isolated switch is switch-c1, its port Te1/1/3 is connected to port Te1/1/3 on switch-b1 and its port Te1/1/4 is connected to port xg4 on switch2."
Please let us know as soon as you find out something. We really need to find what's causing this isolation issue and how to prevent it from happening again.
Thanks.
DELL-Willy M
March 15th, 2013 16:00
I'm still in discussion with engineers. I will reply back when I have further information.
pzero
March 19th, 2013 03:00
Hi Willy,
Did the engineers find out anything about this issue?
Thanks.
pzero
March 21st, 2013 05:00
Hi Willy,
Is there any additional information we can provide you to help your engineers debug this issue?
Thanks.
DELL-Willy M
March 21st, 2013 11:00
Sorry for the delayed response.
What we are still seeing is that untagged VLAN 1 is still not allowed through the port.
Port Te1/1/3 is statically configured to:
VLAN Name Egress rule
---- --------------------------------- -----------
10 VLAN0010 Tagged
We can change the ports on the 8024 to trunk mode. This will default to allowing VLAN 1 untagged.
(Make sure NOT to add a statement explicitly removing VLAN 1)
pzero
March 23rd, 2013 05:00
Hi Willy,
With the current general mode configuration, VLAN 1 traffic is already untagged on port Te1/1/3 on both switch-b1 and switch-c1, see below:
Port Te1/1/3 is member in:
VLAN Name Egress rule Type
---- --------------------------------- ----------- --------
1 default Untagged Default
If we change it to trunk mode, VLAN 1 traffic would be tagged instead.
We're getting a bit confused about the different suggestions to use general/trunk modes in this thread, and at this point we're afraid that may not be the cause of our issue.
Maybe the issue lies somewhere else or could this actually be a bug in the firmware?
Thanks.