Please post the config file from the recent firmware version that does not work for you.
Here is all the information I collected during my tests this morning:
With the working firmware (1.0.0.47):
Working Configuration:
interface ethernet g17
channel-group 2 mode auto
exit
interface ethernet g18
channel-group 2 mode auto
exit
interface ethernet g21
channel-group 3 mode auto
exit
interface ethernet g22
channel-group 3 mode auto
exit
interface vlan 1
ip address 172.16.10.249 255.255.255.0
exit
ip default-gateway 172.16.10.1
logging buffered debugging
aaa authentication enable default line
aaa authentication login default line
ip ssh server
ip https server
clock source sntp
sntp unicast client enable
sntp server 172.16.10.180
LACP information on the switch:
console# sh lacp port-channel 2
Port-Channel ch2
  Port Type Gigabit Ethernet
  Attached Lag id:
  Actor
    System Priority:1
    MAC Address: 00:13:72:4a:bf:d2
    Admin Key: 26
    Oper Key: 26
  Partner
    System Priority:65535
    MAC Address: 00:30:48:70:43:48
    Oper Key: 17
# one of the ports that is part of port-channel 2:
console# sh lacp ethernet g17
g17 LACP parameters:
  Actor
    system priority: 1
    system mac addr: 00:13:72:4a:bf:d2
    port Admin key: 26
    port Oper key: 26
    port Oper number: 17
    port Admin priority: 1
    port Oper priority: 1
    port Admin timeout: LONG
    port Oper timeout: LONG
    LACP Activity: ACTIVE
    Aggregation: AGGREGATABLE
    synchronization: TRUE
    collecting: TRUE
    distributing: TRUE
    expired: FALSE
  Partner
    system priority: 65535
    system mac addr: 00:30:48:70:43:48
    port Admin key: 0
    port Oper key: 17
    port Oper number: 1
    port Admin priority: 0
    port Oper priority: 255
    port Oper timeout: LONG
    LACP Activity: ACTIVE
    Aggregation: AGGREGATABLE
    synchronization: TRUE
    collecting: TRUE
    distributing: TRUE
    expired: FALSE
g17 LACP statistics:
  LACP Pdus sent: 5609
  LACP Pdus received: 5586
g17 LACP Protocol State:
  LACP State Machines:
    Receive FSM: Current State
    Mux FSM: Collecting Distributing State
    Periodic Tx FSM: Slow Periodic State
  Control Variables:
    BEGIN: FALSE
    LACP_Enabled: TRUE
    Ready_N: FALSE
    Selected: SELECTED
    Port_moved: FALSE
    NNT: FALSE
    Port_enabled: TRUE
  Timer counters:
    periodic tx timer: 6
    current while timer: 65
    wait while timer: 0
With the same config and the working firmware, the Linux server gives the following information:
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 200
Down Delay (ms): 200

802.3ad info
LACP rate: slow
Active Aggregator Info:
  Aggregator ID: 1
  Number of ports: 2
  Actor Key: 17
  Partner Key: 26
  Partner Mac Address: 00:13:72:4a:bf:d2

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:30:48:70:43:48
Aggregator ID: 1

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:30:48:70:43:49
Aggregator ID: 1
Now the same information with firmware 2.0.1.03 (non-working LACP):
Configuration
interface range ethernet g(17-18)
channel-group 2 mode auto
exit
interface range ethernet g(21-22)
channel-group 3 mode auto
exit
interface range ethernet g(17-18,21-22)
lacp timeout short
exit
interface vlan 1
ip address 172.16.10.249 255.255.255.0
exit
ip default-gateway 172.16.10.1
logging buffered debugging
aaa authentication enable default line
aaa authentication login default line
ip ssh server
ip https server
clock source sntp
sntp unicast client enable
sntp server 172.16.10.180
Note: when converting the config to the version supported by the new firmware, the switch changed my port-channels to ON mode instead of AUTO. I manually changed the config to turn LACP back on. The conversion also added lacp timeout short for the 4 ports involved. I tried the LONG timeout as well, but it doesn't change anything.
Switch LACP information
console# sh lacp port-channel 2
Port-Channel ch2
  Port Type Gigabit Ethernet
  Attached Lag id:
  Actor
    System Priority:1
    MAC Address: 00:13:72:4a:bf:d2
    Admin Key: 26
    Oper Key: 26
  Partner
    System Priority:0
    MAC Address: 00:00:00:00:00:00
    Oper Key: 0

console# sh lacp ethernet g17
g17 LACP parameters:
  Actor
    system priority: 1
    system mac addr: 00:13:72:4a:bf:d2
    port Admin key: 26
    port Oper key: 26
    port Oper number: 17
    port Admin priority: 1
    port Oper priority: 1
    port Admin timeout: SHORT
    port Oper timeout: SHORT
    LACP Activity: ACTIVE
    Aggregation: INDIVIDUAL
    synchronization: FALSE
    collecting: FALSE
    distributing: FALSE
    expired: FALSE
  Partner
    system priority: 65535
    system mac addr: 00:30:48:70:43:48
    port Admin key: 0
    port Oper key: 17
    port Oper number: 1
    port Admin priority: 0
    port Oper priority: 255
    port Oper timeout: LONG
    LACP Activity: ACTIVE
    Aggregation: AGGREGATABLE
    synchronization: TRUE
    collecting: FALSE
    distributing: FALSE
    expired: FALSE
g17 LACP statistics:
  LACP Pdus sent: 2
  LACP Pdus received: 56
g17 LACP Protocol State:
  LACP State Machines:
    Receive FSM: Current State
    Mux FSM: Detached State
    Periodic Tx FSM: Slow Periodic State
  Control Variables:
    BEGIN: FALSE
    LACP_Enabled: TRUE
    Ready_N: FALSE
    Selected: UNSELECTED
    Port_moved: FALSE
    NNT: FALSE
    Port_enabled: TRUE
  Timer counters:
    periodic tx timer: 4
    current while timer: 2
    wait while timer: 0
and on the Linux side:

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 200
Down Delay (ms): 200

802.3ad info
LACP rate: slow
Active Aggregator Info:
  Aggregator ID: 2
  Number of ports: 2
  Actor Key: 17
  Partner Key: 26
  Partner Mac Address: 00:13:72:4a:bf:d2

Slave Interface: eth0
MII Status: up
Link Failure Count: 5
Permanent HW addr: 00:30:48:70:43:48
Aggregator ID: 2

Slave Interface: eth1
MII Status: up
Link Failure Count: 5
Permanent HW addr: 00:30:48:70:43:49
Aggregator ID: 2
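For anyone who wants to compare the Linux-side status between firmware versions without reading it by eye, here is a rough Python sketch that parses bonding status text of the kind shown above. The function names (`parse_bonding`, `summarize`) are made up for illustration, and `SAMPLE` is a trimmed copy of the failing-case output from this post. On a live host the text would normally come from a file such as /proc/net/bonding/bond0; the bond name is an assumption.

```python
# Trimmed copy of the failing-case output posted above (firmware 2.0.1.03).
SAMPLE = """\
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up

802.3ad info
LACP rate: slow
Active Aggregator Info:
  Aggregator ID: 2
  Number of ports: 2
  Actor Key: 17
  Partner Key: 26
  Partner Mac Address: 00:13:72:4a:bf:d2

Slave Interface: eth0
MII Status: up
Link Failure Count: 5
Permanent HW addr: 00:30:48:70:43:48
Aggregator ID: 2

Slave Interface: eth1
MII Status: up
Link Failure Count: 5
Permanent HW addr: 00:30:48:70:43:49
Aggregator ID: 2
"""


def parse_bonding(text):
    """Parse 'Key: value' lines; per-slave keys go under their slave entry."""
    info = {"slaves": []}
    section = info  # keys are bond-wide until the first "Slave Interface" line
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue  # blank lines and the bare "802.3ad info" heading
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            section = {"name": value}
            info["slaves"].append(section)
        else:
            section[key] = value
    return info


def summarize(info):
    """Pull out the fields that matter when comparing two firmware versions."""
    active = info.get("Aggregator ID")
    return {
        "partner_mac": info.get("Partner Mac Address"),
        "slaves_in_active_aggregator": [
            s["name"] for s in info["slaves"] if s.get("Aggregator ID") == active
        ],
        "link_failures": {
            s["name"]: int(s.get("Link Failure Count", 0)) for s in info["slaves"]
        },
    }


if __name__ == "__main__":
    print(summarize(parse_bonding(SAMPLE)))
```

In the failing case the partner MAC is still the switch's address and both slaves sit in the active aggregator; the only Linux-side difference is the non-zero link failure count.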
So it seems the Linux server sees the switch, but not the reverse.
Any idea?
The switch has the following characteristics (taken after a reload to 1.0.0.47 and a fresh configuration):
*********************************************************************
*** Running SW Ver. 1.0.0.47 Date 02-Nov-2005 Time 09:53:18 ***
*********************************************************************

HW version is 00.00.02
Base Mac address is: 00:13:72:4a:bf:d2
Dram size is : 64M bytes
Dram first block size is : 40960K bytes
Dram first PTR is : 0x1800000
Flash size is: 16M
Loading running configuration.
Number of configuration items loaded: 0

Loading startup configuration.
Number of configuration items loaded: 0
Device configuration:
Prestera based system
Slot 1 - powerConnect 5324 HW Rev. 0.0
Tapi Version: v1.2.10
Core Version: v1.2.10
I set this up (not with Linux, with another switch) and I was unable to reproduce the problem. There is one odd thing I notice in your data for the failing case on the 5324:
Aggregation: INDIVIDUAL
The MIB says: "Returns a value of TRUE if more than 1 port is configured in the channel; otherwise, returns a value of FALSE."
The config file obviously has 2 ports in this port-channel. So I suspect the automatic configuration file conversion when going from 1.0.0.47 to 2.0.1.3.
Can you load 2.0.1.3, delete the config file, reboot, manually type in only the LACP config lines, and see if it works with Linux?
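To see at a glance what changed on the switch's own (actor) side of g17 between the two firmware versions, a small diff like the one below may help. It is only a sketch: the two dictionaries are transcribed by hand from the `sh lacp ethernet g17` outputs posted above, and `diff_fields` is a made-up helper name.

```python
# Actor-side fields for g17, transcribed from the two dumps posted above.
WORKING_1_0_0_47 = {
    "port Admin timeout": "LONG",
    "port Oper timeout": "LONG",
    "LACP Activity": "ACTIVE",
    "Aggregation": "AGGREGATABLE",
    "synchronization": "TRUE",
    "collecting": "TRUE",
    "distributing": "TRUE",
    "expired": "FALSE",
}

FAILING_2_0_1_03 = {
    "port Admin timeout": "SHORT",
    "port Oper timeout": "SHORT",
    "LACP Activity": "ACTIVE",
    "Aggregation": "INDIVIDUAL",
    "synchronization": "FALSE",
    "collecting": "FALSE",
    "distributing": "FALSE",
    "expired": "FALSE",
}


def diff_fields(before, after):
    """Return {field: (before, after)} for fields whose values differ."""
    return {
        key: (before[key], after[key])
        for key in before
        if key in after and before[key] != after[key]
    }


if __name__ == "__main__":
    for field, (old, new) in diff_fields(WORKING_1_0_0_47, FAILING_2_0_1_03).items():
        print(f"{field}: {old} -> {new}")
```

Apart from the timeout that the conversion added, the fields that differ are Aggregation, synchronization, collecting and distributing, which fits the Mux FSM sitting in the Detached state in the failing dump.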
I did the test against a newer-model Dell switch (5448).
If you can capture and compare the LACP PDUs between the 1.0.0.47 and 2.0.1.3 firmware, that would be interesting data.
Since the config file conversion is giving odd results, I think deleting the config with 2.0.1.3 and then re-entering a minimal config will give us some good data.
Do you have a newer Dell switch you can try? 54xx or 35xx?
I set this up (not with Linux, with another switch) and I was unable to reproduce the problem.
Is it with another Dell switch or another brand/model?
I suspect the switch doesn't conform exactly to the standard. I don't have proof of that, of course; I'd have to capture the 802.3ad PDUs to see it, which I'll certainly do if it's possible to capture such packets from a non-active bond on Linux (or from its individual members).
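If such a capture turns out to be possible (LACP frames use the Slow Protocols EtherType 0x8809, so a capture filter on that EtherType on each member interface should isolate them), the actor information in each PDU can be decoded with a few lines of Python. This is a sketch based on the LACPDU layout in IEEE 802.3ad, not a tested tool: the function names are invented, and `SYNTHETIC_PDU` is not a real capture but a payload built from the values the switch reports for g17 in the failing case.

```python
import struct

# Actor Information TLV: type, length, system priority, system MAC, key,
# port priority, port number, state byte, 3 reserved bytes (20 bytes total).
ACTOR_TLV = struct.Struct(">BBH6sHHHB3s")


def decode_state(state):
    """Decode the LACP actor/partner state byte into named flags."""
    return {
        "activity": "ACTIVE" if state & 0x01 else "PASSIVE",
        "timeout": "SHORT" if state & 0x02 else "LONG",
        "aggregation": "AGGREGATABLE" if state & 0x04 else "INDIVIDUAL",
        "synchronization": bool(state & 0x08),
        "collecting": bool(state & 0x10),
        "distributing": bool(state & 0x20),
        "defaulted": bool(state & 0x40),
        "expired": bool(state & 0x80),
    }


def parse_actor(payload):
    """Parse the actor TLV from an LACPDU payload (bytes after the Ethernet header)."""
    if len(payload) < 2 + ACTOR_TLV.size:
        raise ValueError("payload too short for an LACPDU")
    if payload[0] != 0x01:  # Slow Protocols subtype 1 = LACP
        raise ValueError("not an LACPDU")
    (tlv_type, length, sys_prio, system, key,
     port_prio, port, state, _reserved) = ACTOR_TLV.unpack_from(payload, 2)
    if tlv_type != 0x01 or length != ACTOR_TLV.size:
        raise ValueError("unexpected actor TLV header")
    return {
        "system_priority": sys_prio,
        "system": ":".join(f"{b:02x}" for b in system),
        "key": key,
        "port_priority": port_prio,
        "port": port,
        "state": decode_state(state),
    }


# Synthetic payload (assumption, not a capture): subtype 1, version 1, then an
# actor TLV using the g17 values from the failing dump and state byte 0x03.
SYNTHETIC_PDU = bytes([0x01, 0x01]) + ACTOR_TLV.pack(
    0x01, 20, 1, bytes.fromhex("0013724abfd2"), 26, 1, 17, 0x03, b"\x00" * 3
)

if __name__ == "__main__":
    print(parse_actor(SYNTHETIC_PDU))
    print(decode_state(0x3D))
```

State byte 0x3D corresponds to what the switch shows on 1.0.0.47 (active, long timeout, aggregatable, in sync, collecting, distributing), while 0x03 corresponds to the failing dump (active, short timeout, individual). Comparing the real bytes from both firmware versions would show whether the switch is actually sending what it displays.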
There is one odd thing I notice in your data for the failing case on the 5324:
Aggregation: INDIVIDUAL
The MIB says: "Returns a value of TRUE if more than 1 port is configured in the channel; otherwise, returns a value of FALSE."
The config file obviously has 2 ports in this port-channel. So I suspect the automatic configuration file conversion when going from 1.0.0.47 to 2.0.1.3.
Can you load 2.0.1.3, delete the config file, reboot, manually type in only the LACP config lines, and see if it works with Linux?
Yes, I'll try this tomorrow morning (the switch is used as a core switch for a small network so I can't interrupt traffic whenever I want).
The thing is, after conversion the port-channel is in mode ON (no LACP), so I have to remove the individual Ethernet ports from the port-channel and form the port-channel again. OK, that's not quite the same thing as a full erase.
I don't have support anymore on this switch, but is there a way I can submit a bug report to Dell, since I don't think the engineers who designed the firmware browse this forum?
For an in-depth look at the default LAG hashing algorithm used by most switches (this is written for a blade server switch, but the concepts are simple enough to apply to a standalone switch; the switch in this doc and the 5324 have the same hashing options):
I upgraded a PowerConnect 5324 switch to version 2.0.1.4 (boot version 1.0.2.02) and three Linux hosts running LACP stopped working. Restarting the switch didn't help. Restarting the Linux hosts didn't help.
YUK WORKAROUND (as suggested above):
- delete the startup configuration
- restart the switch
- restore the configuration
I manually cut-n-pasted small sections of the configuration through the console.
bh1633
909 Posts
February 9th, 2009 13:00
Did you update the bootcode when you updated the firmware?
bh1633
909 Posts
February 9th, 2009 14:00
Please post the config file from the recent firmware version that does not work for you.
Masterzen
7 Posts
February 9th, 2009 14:00
Yes, otherwise you get a non-bootable switch and you have to connect with a serial cable to clean the mess :-)
Masterzen
7 Posts
February 11th, 2009 00:00
Hi,
Here is all the information I collected during my tests this morning:
With the working firmware (1.0.0.47):
Working Configuration:
LACP information on the switch:
With the same config and the working firmware, the Linux server gives the following information:
Now the same information with firmware 2.0.1.03 (non-working LACP):
Configuration
Note: when converting the config to the version supported by the new firmware, the switch changed my port-channels to ON mode instead of AUTO. I manually changed the config to turn LACP back on. The conversion also added lacp timeout short for the 4 ports involved. I tried the LONG timeout as well, but it doesn't change anything.
Switch LACP information
and on the Linux side:
So it seems the Linux server sees the switch, but not the reverse.
Any idea?
The switch has the following characteristics (taken after a reload to 1.0.0.47 and a fresh configuration):
Thanks for any help!
bh1633
909 Posts
February 11th, 2009 09:00
Odd.
I set this up (not with Linux, with another switch) and I was unable to reproduce the problem. There is one odd thing I notice in your data for the failing case on the 5324:
Aggregation: INDIVIDUAL
The MIB says: "Returns a value of TRUE if more than 1 port is configured in the channel; otherwise, returns a value of FALSE."
The config file obviously has 2 ports in this port-channel. So I suspect the automatic configuration file conversion when going from 1.0.0.47 to 2.0.1.3.
Can you load 2.0.1.3, delete the config file, reboot, manually type in only the LACP config lines, and see if it works with Linux?
bh1633
909 Posts
February 11th, 2009 10:00
I did the test against a newer-model Dell switch (5448).
If you can capture and compare the LACP PDUs between the 1.0.0.47 and 2.0.1.3 firmware, that would be interesting data.
Since the config file conversion is giving odd results, I think deleting the config with 2.0.1.3 and then re-entering a minimal config will give us some good data.
Do you have a newer Dell switch you can try? 54xx or 35xx?
Masterzen
7 Posts
February 11th, 2009 10:00
Is it with another Dell switch or another brand/model?
I suspect the switch doesn't conform exactly to the standard. I don't have proof of that, of course; I'd have to capture the 802.3ad PDUs to see it, which I'll certainly do if it's possible to capture such packets from a non-active bond on Linux (or from its individual members).
Yes, I'll try this tomorrow morning (the switch is used as a core switch for a small network so I can't interrupt traffic whenever I want).
The thing is, after conversion the port-channel is in mode ON (no LACP), so I have to remove the individual Ethernet ports from the port-channel and form the port-channel again. OK, that's not quite the same thing as a full erase.
I don't have support anymore on this switch, but is there a way I can submit a bug report to Dell, since I don't think the engineers who designed the firmware browse this forum?
Thanks for your help.
Masterzen
7 Posts
February 12th, 2009 08:00
Hi,
I removed the configuration before upgrading as you suggested and indeed it fixed the issue!
Thank you very much for your help.
Now I have to convince the switch to load balance the traffic :-)
bh1633
909 Posts
February 12th, 2009 08:00
Look at the "Configuring Load Balancing" section of this doc:
http://support.dell.com/support/edocs/network/pc5324/en/UG_Ad/UGAdd.pdf
For an in-depth look at the default LAG hashing algorithm used by most switches (this is written for a blade server switch, but the concepts are simple enough to apply to a standalone switch; the switch in this doc and the 5324 have the same hashing options):
http://support.dell.com/support/edocs/network/LAG1855/LAGConsiderationv0.5.pdf
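As a rough illustration of why a LAG often does not balance traffic the way people expect, here is a simplified Python sketch of a MAC-based (layer 2) hash of the kind reported by the Linux side above ("Transmit Hash Policy: layer2"). It is an assumption-laden simplification: it XORs only the last byte of the source and destination MAC and takes the result modulo the number of links. The exact formula differs between Linux kernel versions, and the switch's own algorithm is the one described in the Dell documents linked above. The function names and the second destination MAC are hypothetical.

```python
def last_mac_byte(mac):
    """Return the last byte of a colon-separated MAC address as an integer."""
    return int(mac.split(":")[-1], 16)


def layer2_link(src_mac, dst_mac, n_links):
    """Pick a link index from a simplified source-XOR-destination MAC hash."""
    return (last_mac_byte(src_mac) ^ last_mac_byte(dst_mac)) % n_links


if __name__ == "__main__":
    server = "00:30:48:70:43:48"      # eth0 address from the bonding output
    switch = "00:13:72:4a:bf:d2"      # switch base MAC from the thread
    other = "00:13:72:4a:bf:d3"       # hypothetical second peer
    print(layer2_link(server, switch, 2))
    print(layer2_link(server, other, 2))
```

With a hash like this, every frame between the same pair of MAC addresses lands on the same link, so a single server talking mostly to one gateway or one peer will use only one member of the LAG regardless of how many ports it contains.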
gb1001
1 Message
May 5th, 2013 04:00
I upgraded a PowerConnect 5324 switch to version 2.0.1.4 (boot version 1.0.2.02) and three Linux hosts running LACP stopped working. Restarting the switch didn't help. Restarting the Linux hosts didn't help.
YUK WORKAROUND (as suggested above):
- delete the startup configuration
- restart the switch
- restore the configuration
I manually cut-n-pasted small sections of the configuration through the console.