1 Rookie

 • 

2 Posts

24

March 8th, 2024 12:30

Detection of Management Unit Failure in a Stack + Non-standard Scenarios

Hello,

I have a couple of questions regarding two N1124T-ON switches in a stack, connected via SFP. Despite searching the manual, I couldn't find a definitive answer.

  1. How does a stack member determine that the management unit is truly dead? Specifically, I'm concerned about scenarios where the SFP stacking connections are lost, through physical disconnection or damage, while both switches remain on the network. In that situation, if the slave switch assumes the role of master, including the stack IP address, there will be two devices with the same address on the network. STP would intervene, but that doesn't directly answer the question.

    Additionally, if the main switch is overloaded to the point of port cycling, hello messages may still flow, telling the failover switches that the main switch is operational even though it is in a faulty state. I understand there are safeguards in place, but there seem to be none against a switch that is faulty yet still responding.

  2. I understand that switch configurations are synchronized from the master. Suppose the switches disconnect, the secondary becomes the primary, config changes are made on it, and then the original master reconnects. Won't the config changes made during failover be lost?

  3. In general, do you have experience with other, less probable, "catastrophic" scenarios?

Thank you.

Moderator

 • 

3.4K Posts

April 8th, 2024 17:32

Hello,

 

These are two different design decisions, and you will have to pick one or the other, because the choice affects how you set up the devices that connect to the switches.

--If you have a stack, you can have port-channels connecting to the end devices where one link is plugged into the master and the other into the stack member. That ensures redundancy at both the link level and the switch level. The stack is treated as one logical entity with more ports.

--If you have 2 separate switches (connected to each other with 1Gb Ethernet), you cannot have a port-channel to an end device where one link is plugged into one switch and the other link into the other switch. So you lose the redundancy and high-availability benefit, and the ability to manage both as one. A port-channel has to terminate on a single switch.

Moderator

 • 

3.4K Posts

March 8th, 2024 21:45

Hello,

 

The scenarios described above are covered by the different redundancy features implemented in the Dell N-series software. There are several mechanisms for detecting that a component is down or has failed; heartbeat/hello messages are one common mechanism, as you mentioned. If the user has ensured that all components are redundant, we are not aware of any remaining failure scenarios. Here are some of the redundancy features:

  1. Switch redundancy - stacking feature supports a standby or backup unit that assumes the stack primary role if the stack primary fails. The backup unit continues to use the MAC addresses of the original management unit in the stack to minimize disruption.
  2. Forwarding plane redundancy - The Nonstop Forwarding (NSF) feature allows the forwarding plane of stack units to continue to forward packets while the control and management planes restart as a result of a power failure, hardware failure, or software fault on the management unit in the stack.
  3. Redundant Layer 2 network - Multichassis Link Aggregation (MLAG) allows redundant links between two switches to be bundled together in a Link Aggregation Group (LAG) so that all links are active and forwarding traffic (rather than one of them being blocked by spanning tree). MLAG ensures reliable connectivity by preventing downtime due to link failures or misconfigurations.
  4. Link redundancy - by distributing a LAG's member ports across multiple units, the stack can quickly switch traffic from a port on a failed unit to a port on a surviving unit.
  5. Stacking cable redundancy - Stack units should always be connected with a ring topology (or other redundant topology), so that the loss of a single stack link does not divide the stack into multiple stacks.
  6. Layer 3 redundancy - Virtual Router Redundancy Protocol (VRRP) provides hosts with redundant routers.
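
To illustrate item 5, here is a minimal Python sketch of why a ring of stacking links survives the loss of one link while a simple chain does not. It is only a connectivity model: the three unit numbers, the link lists, and the helper names `stack_partitions` and `without` are assumptions made up for the example, not anything taken from the N-series software.

```python
from collections import deque

def stack_partitions(units, links):
    """Return the groups of units that can still reach each other."""
    neighbours = {u: set() for u in units}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, groups = set(), []
    for start in units:
        if start in seen:
            continue
        # Breadth-first search from each unit not yet assigned to a group.
        group, queue = [], deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            group.append(u)
            for v in sorted(neighbours[u]):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        groups.append(sorted(group))
    return groups

def without(links, failed):
    """Copy of the link list with one failed stacking link removed."""
    return [link for link in links if link != failed]

units = [1, 2, 3]                  # hypothetical unit numbers
ring = [(1, 2), (2, 3), (3, 1)]    # every unit has two stacking links
chain = [(1, 2), (2, 3)]           # no return link from unit 3 to unit 1

ring_after = stack_partitions(units, without(ring, (1, 2)))
chain_after = stack_partitions(units, without(chain, (1, 2)))
print(ring_after)    # [[1, 2, 3]]   still one stack
print(chain_after)   # [[1], [2, 3]] split into two stacks
```

With the ring, losing the link between units 1 and 2 still leaves a path through unit 3, so the stack stays whole. With the chain, the same failure leaves unit 1 on its own.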

 

 

1 Rookie

 • 

2 Posts

April 8th, 2024 14:59

Thanks for the response, I'd like to clarify something further.

For simplicity, let's say someone physically disconnects the SFPs that connect the switches into a stack. At that moment the stack member no longer sees the master, so it becomes the master itself. Nothing changes for the original master. There are now two "identical devices" on the network, and the heartbeat has no way to pass between them. Could a solution be to also connect the switches with 1 Gb Ethernet so they can continue to communicate?
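
The scenario can be sketched with a small, generic Python model of timeout-based failure detection. This is not the actual N-series stacking protocol: the `Unit` class, the `HELLO_TIMEOUT` value, and the `active_masters` helper are assumptions for illustration only. It shows that, from the member's point of view, a cut stacking link and a dead master look the same, because in both cases the hellos simply stop.

```python
from dataclasses import dataclass

HELLO_TIMEOUT = 3.0  # seconds; hypothetical value, not a documented default

@dataclass
class Unit:
    name: str
    role: str                          # "master" or "member"
    last_hello_from_master: float = 0.0

    def on_hello(self, now):
        """Record the time a hello from the master was received."""
        self.last_hello_from_master = now

    def check(self, now):
        """Promote a member to master once hellos have been silent too long."""
        if self.role == "member" and now - self.last_hello_from_master > HELLO_TIMEOUT:
            self.role = "master"
        return self.role

def active_masters(link_cut_at=None, master_dies_at=None, end=10):
    """Count the units acting as master after `end` seconds."""
    member = Unit("unit2", "member")
    master_alive = True
    for t in range(end + 1):
        master_alive = master_dies_at is None or t < master_dies_at
        link_up = link_cut_at is None or t < link_cut_at
        if master_alive and link_up:
            member.on_hello(float(t))  # hello arrives only over a working link
        member.check(float(t))
    return int(master_alive) + int(member.role == "master")

print(active_masters())                   # healthy stack
print(active_masters(master_dies_at=5))   # master really fails
print(active_masters(link_cut_at=5))      # stacking link cut, master still running
```

In this model the first two cases end with one active master, while the cut-link case ends with two, which is the duplicate-device situation described above.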
