Hey guys, hoping to get some insight from the community on this. We run a 2-member N4064 stack in one of our datacenters, and yesterday the master unit locked up completely. We accessed the member via the console port and there were a few characters visible, but it was unresponsive. Thankfully all of the traffic failed over to the second member seamlessly. I contacted the datacenter and had them power-cycle the first unit, and it immediately started displaying the boot sequence on the console.
We had another stacking issue back in February, which prompted us to upgrade to 126.96.36.199. That issue was much worse because traffic did not fail over to the responsive member properly. Things had been fine since we completed that maintenance, until yesterday.
I am aware of the newer branch of the firmware, and I am wondering whether the stacking fixes in that branch would address the issue we had yesterday. Appreciate any insight the group may have on this. I am trying to add the relevant logs, but they keep getting marked as spam. Will work on that.
INFO Backup manager removed.
INFO Backup unit gone
INFO No Potential unit to configure as Standby when unit 1 left
NOTE Unit 1 is removed from the stack
NOTE Stack-port link down: Index: 230 Unit: 2 Tag: Fo2/0/1
NOTE Unit 2 identified a link failure in the stack. Fo2/0/1
I have been having stacking issues with Dell's N4000 series for literally years (6x N4000 in a single stack). Quite literally, the release notes for nearly every firmware upgrade say they address this issue.
Do not upgrade to 6.5.x unless you are trying to stack an N4000 and another model of switch. I have had over 15 tickets with Pro Support on the issues that come with the 6.5.x firmware.
Dell Support is the one who told me this: "Do not upgrade to 6.5.x unless you are trying to stack an N4000 and another model of switch."