November 8th, 2012 03:00

Understanding PC6248 log with regards to LAG.

Interesting start to today, when:

a) I get an email saying the veritas b2d over iSCSI failed
b) the LAG to our vmware and storage environment is down and won't come up (6248 to 6248)

It got even weirder when, even after removing the LAG cables and using a single cable on a different port on each switch, that cable was still being blocked/discarding.

Turns out that for some reason, the Thecus N8800Pro (running over 10GbE, doing our b2d to iSCSI volumes) was doing something to get the ports blocked. I think it's probably totally my fault: it used to use its own vlan/lag between switches, then 10GbE to it, but yesterday I reconfigured it to use MPIO and run over our normal vlan/lag too. I didn't change any of the multicast/broadcast storm settings, though, so I think it ended up getting our normal vlan/lag blocked.

During my investigations I was obviously looking at the logs, and now that it's all working again I'm trying to confirm that it is working as it should be. What I can't make sense of is the STP/LAG lines.

Lag 1, using ports 1,2 is one end of the lag on the .9 switch, but the below shows 1/g2 and 1/g1 forwarding, but then 1/g2 goes into blocking before lag 1 comes up. So do we have a proper lag over both or not?

2012-11-08 10:24:48 Local7.Notice 172.16.100.9  NOV 08 10:24:47 172.16.100.9-1 TRAPMGR[152599600]: traputil.c(611) 533 %% 1/0/1 is transitioned from the Learning state to the Forwarding state in instance 0
2012-11-08 10:24:48 Local7.Notice 172.16.100.9  NOV 08 10:24:47 172.16.100.9-1 TRAPMGR[125548272]: traputil.c(611) 534 %% Spanning Tree Topology Change: 0, Unit: 1
2012-11-08 10:24:49 Local7.Notice 172.16.100.9  NOV 08 10:24:49 172.16.100.9-1 TRAPMGR[152599600]: traputil.c(611) 535 %% Link Up: 1/0/2
2012-11-08 10:24:49 Local7.Notice 172.16.100.9  NOV 08 10:24:49 172.16.100.9-1 TRAPMGR[152599600]: traputil.c(611) 536 %% 1/0/2 is transitioned from the Forwarding state to the Blocking state in instance 0
2012-11-08 10:24:52 Local7.Notice 172.16.100.9  NOV 08 10:24:52 172.16.100.9-1 TRAPMGR[152599600]: traputil.c(611) 537 %% 1/0/2 is transitioned from the Forwarding state to the Blocking state in instance 0
2012-11-08 10:24:53 Local7.Notice 172.16.100.9  NOV 08 10:24:52 172.16.100.9-1 TRAPMGR[152599600]: traputil.c(611) 542 %% Instance 0 has elected a new STP root: 8000:0021:9bbb:02f0
2012-11-08 10:24:53 Local7.Notice 172.16.100.9  NOV 08 10:24:52 172.16.100.9-1 TRAPMGR[152599600]: traputil.c(611) 543 %% 0/3/1 is transitioned from the Learning state to the Forwarding state in instance 0

The other end of the lag is on .10, and is lag 3 over ports 47,48.

2012-11-08 10:24:47 Local7.Notice 172.16.100.10  NOV 08 10:24:47 172.16.100.10-3 TRAPMGR[152557488]: traputil.c(611) 2900 %% Link Up: 3/0/47
2012-11-08 10:24:47 Local7.Notice 172.16.100.10  NOV 08 10:24:47 172.16.100.10-3 TRAPMGR[152557488]: traputil.c(611) 2901 %% 3/0/47 is transitioned from the Forwarding state to the Blocking state in instance 0
2012-11-08 10:24:47 Local7.Notice 172.16.100.10  NOV 08 10:24:47 172.16.100.10-3 TRAPMGR[152557488]: traputil.c(611) 2902 %% 3/0/47 is transitioned from the Learning state to the Forwarding state in instance 0
2012-11-08 10:24:47 Local7.Notice 172.16.100.10  NOV 08 10:24:47 172.16.100.10-3 TRAPMGR[125538992]: traputil.c(611) 2903 %% Spanning Tree Topology Change: 0, Unit: 1
2012-11-08 10:24:49 Local7.Notice 172.16.100.10  NOV 08 10:24:49 172.16.100.10-3 TRAPMGR[152557488]: traputil.c(611) 2904 %% Link Up: 3/0/48
2012-11-08 10:24:49 Local7.Notice 172.16.100.10  NOV 08 10:24:49 172.16.100.10-3 TRAPMGR[152557488]: traputil.c(611) 2905 %% 3/0/48 is transitioned from the Forwarding state to the Blocking state in instance 0
2012-11-08 10:24:52 Local7.Notice 172.16.100.10  NOV 08 10:24:52 172.16.100.10-3 TRAPMGR[152557488]: traputil.c(611) 2906 %% 3/0/48 is transitioned from the Learning state to the Forwarding state in instance 0
2012-11-08 10:24:52 Local7.Notice 172.16.100.10  NOV 08 10:24:52 172.16.100.10-3 TRAPMGR[152557488]: traputil.c(611) 2907 %% 3/0/48 is transitioned from the Forwarding state to the Blocking state in instance 0
2012-11-08 10:24:52 Local7.Notice 172.16.100.10  NOV 08 10:24:52 172.16.100.10-3 TRAPMGR[152557488]: traputil.c(611) 2908 %% 3/0/47 is transitioned from the Forwarding state to the Blocking state in instance 0
2012-11-08 10:24:52 Local7.Notice 172.16.100.10  NOV 08 10:24:52 172.16.100.10-3 TRAPMGR[152557488]: traputil.c(611) 2909 %% 0/3/3 is transitioned from the Forwarding state to the Blocking state in instance 0
2012-11-08 10:24:52 Local7.Notice 172.16.100.10  NOV 08 10:24:52 172.16.100.10-3 TRAPMGR[152557488]: traputil.c(611) 2910 %% 0/3/3 is transitioned from the Learning state to the Forwarding state in instance 0

Reading this makes me think that both of the individual links should show as blocking before the lag comes up, in which case I appear to be missing 1/g1 going blocking on the .9 switch.

Both ports showing blocking makes sense, in the same way as STP/MTU/etc being applied to the lag, not the ports it sits on, in which case I'm just missing the 1/g1 line showing it as blocking.
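
To check this sort of thing without reading the log by eye, a short script can pull out the last logged STP state for each interface. This is only a rough sketch: the regular expression is based on the message wording in the lines above, the function name `last_stp_states` is made up for illustration, and the sample lines are the .9 entries with the syslog prefix trimmed. Reading 0/3/1 as the lag interface is an assumption, not something the log states.

```python
import re

# Matches the STP transition wording seen in the TRAPMGR lines above.
TRANSITION = re.compile(
    r"%% (?P<port>\S+) is transitioned from the (?P<old>\w+) state "
    r"to the (?P<new>\w+) state in instance (?P<inst>\d+)"
)


def last_stp_states(lines):
    """Return {interface: last logged STP state}, following log order."""
    states = {}
    for line in lines:
        m = TRANSITION.search(line)
        if m:
            # Later entries overwrite earlier ones, so the final value
            # is the last state the switch logged for that interface.
            states[m.group("port")] = m.group("new")
    return states


# Entries from the .9 switch, syslog prefix trimmed for brevity.
SAMPLE_9 = [
    "%% 1/0/1 is transitioned from the Learning state to the Forwarding state in instance 0",
    "%% Link Up: 1/0/2",
    "%% 1/0/2 is transitioned from the Forwarding state to the Blocking state in instance 0",
    "%% 1/0/2 is transitioned from the Forwarding state to the Blocking state in instance 0",
    "%% 0/3/1 is transitioned from the Learning state to the Forwarding state in instance 0",
]

if __name__ == "__main__":
    for port, state in last_stp_states(SAMPLE_9).items():
        print(port, state)
```

On the .9 lines this reports 1/0/1 as Forwarding and 1/0/2 as Blocking, which is the mismatch described above. It only reflects what was logged, not the live state of the port.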

Am I understanding this correctly?

Thanks
Chris

5 Practitioner

 • 

274.2K Posts

November 8th, 2012 06:00

This all looks normal to me, especially if Port Fast is enabled. Port Fast immediately brings an interface configured as an access or trunk port to the forwarding state. I would ensure the switch firmware is up to date, monitor, and if you would like we can take a glance at the running config on the switches.

Here is the latest firmware for the switch.

www.dell.com/.../powerconnect-6248

Thanks

November 8th, 2012 06:00

Port fast isn't enabled on the switch to switch LAG (the ports/lag mentioned above) and we're at the latest firmware.

So it is normal for the underlying ports of a lag to show blocking, when the lag comes up?

5 Practitioner

 • 

274.2K Posts

November 8th, 2012 07:00

Yes this is normal operation of spanning tree protocol. The port comes up first in the blocking state, then listening, learning, and ideally then forwarding.

November 8th, 2012 07:00

But with the ports used to create a lag, they end up in blocking, not forwarding?

November 8th, 2012 08:00

That's what I figured, but I've just checked the logs on the .9 unit itself, in case the syslog server missed some entries, and can confirm that the last entry for one of the lag's underlying ports was blocking, before the lag came up.

show run = http://www.4mb.co.uk/dell/showrun9.txt

show interfaces status = http://www.4mb.co.uk/dell/showifstatus9.txt

show spanning-tree = http://www.4mb.co.uk/dell/showstp9.txt

The main stack (3 x 6248, .10), which hosts all the desktops (and one server), is linked to the .9 unit via the 2 lags, one for vlan1 and one for vlan50 (iscsi), although the vlans aren't set up 100% yet as they're "access", so one of the lags needs STP disabled at one end to get it up.

show run = http://www.4mb.co.uk/dell/showrun10.txt

show interfaces status = http://www.4mb.co.uk/dell/showifstatus10.txt

show spanning-tree = http://www.4mb.co.uk/dell/showstp10.txt

5 Practitioner

 • 

274.2K Posts

November 8th, 2012 08:00

If the ports end up staying in a blocking state, then this is not normal. I would suggest grabbing the running config from the switches and the port status for us to look at.

These commands should get us a good start

show run

show interfaces status

show spanning-tree

Thanks

5 Practitioner

 • 

274.2K Posts

November 8th, 2012 10:00

I am looking at the config and the LAGs show to be up, the ports in the LAG show to be up and forwarding. Connection should be good and operational. Once the LAG is established like it is now, have there been any further connectivity issues or ports going up and down?

The one thing I notice is something you have already brought up: the use of access mode on these LAGs. This will not allow more VLANs to be added across that connection and can hinder network growth. General or Trunk mode would be better suited to connecting two switches.

November 9th, 2012 02:00

The lags show as up and everything is working without issue. My concern/question was the logging of the underlying ports for the lag, and getting 100% confirmation that even though the log shows the member ports of the lag as blocking, they were in fact being used to pass traffic on the lag.

Access mode will change to general as soon as I can plan some maintenance time to reconfigure everything.

Thanks

Chris.

5 Practitioner

 • 

274.2K Posts

November 9th, 2012 06:00

During the process of the LAGs coming up, it is normal to have ports go through brief periods of blocking, learning and forwarding. Once everything is up, online, and forwarding, it should stay that way as long as there is no change in the network.

If the LAGs have been up for a while and logging shows ports in the LAG continuing to transition to blocking, then we may need to look deeper into why they are transitioning into blocking after the LAG has been established and is operational.
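
One way to keep an eye on this is to scan the syslog output for transitions to blocking that happen after the point where the LAG is considered established. The following is a minimal sketch, assuming the syslog line layout shown earlier in this thread; the function name `blocking_after` and the cutoff times are made up for illustration, and nothing here queries the switch itself.

```python
import re
from datetime import datetime

# Leading syslog-server timestamp, then any transition into Blocking.
BLOCKING = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) .*"
    r"%% (?P<port>\S+) is transitioned from the \w+ state to the Blocking state"
)


def blocking_after(lines, established_at):
    """Return (timestamp, interface) for each Blocking transition logged after established_at."""
    hits = []
    for line in lines:
        m = BLOCKING.match(line)
        if not m:
            continue
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
        if ts > established_at:
            hits.append((ts, m.group("port")))
    return hits


# Three lines copied from the .10 log earlier in the thread.
SAMPLE_10 = [
    "2012-11-08 10:24:49 Local7.Notice 172.16.100.10  NOV 08 10:24:49 172.16.100.10-3 "
    "TRAPMGR[152557488]: traputil.c(611) 2905 %% 3/0/48 is transitioned from the "
    "Forwarding state to the Blocking state in instance 0",
    "2012-11-08 10:24:52 Local7.Notice 172.16.100.10  NOV 08 10:24:52 172.16.100.10-3 "
    "TRAPMGR[152557488]: traputil.c(611) 2906 %% 3/0/48 is transitioned from the "
    "Learning state to the Forwarding state in instance 0",
    "2012-11-08 10:24:52 Local7.Notice 172.16.100.10  NOV 08 10:24:52 172.16.100.10-3 "
    "TRAPMGR[152557488]: traputil.c(611) 2907 %% 3/0/48 is transitioned from the "
    "Forwarding state to the Blocking state in instance 0",
]

if __name__ == "__main__":
    # Hypothetical cutoff: treat the LAG as established from 10:25:00 onward.
    cutoff = datetime(2012, 11, 8, 10, 25, 0)
    for ts, port in blocking_after(SAMPLE_10, cutoff):
        print(ts, port)
```

With a cutoff after the LAG came up, an empty result would suggest the blocking entries were all part of the initial convergence; any hits after that point would be the ones worth looking into.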
