Unsolved


March 31st, 2021 01:00

Dell OS10 - BGP Idle Bug?

Hello,

I'm currently running Dell OS10-Enterprise (10.5.2.3) on some Dell S5248F-ON switches and I'm experiencing a weird issue where BGP sessions get stuck in an IDLE state for no apparent reason and do not actively attempt to re-establish when the session goes down. Has anyone else experienced similar problems?

When I initially configure the BGP peer, the session establishes without any problems. However, when the neighbouring BGP peer's interface flaps, the node is rebooted, or we perform a manual shutdown/no shutdown on the BGP peer, the session goes down as expected but then stays in an "IDLE" state on my switch indefinitely and never attempts to re-establish.

A packet capture shows the neighbouring peer completing the TCP handshake and sending its initial OPEN message, but the OS10-Enterprise switch responds with an RST packet, which is expected if our BGP state is stuck in IDLE.
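For anyone wondering why an RST is "expected" here: RFC 4271 says that in the Idle state the BGP FSM refuses all incoming BGP connections for that peer. A minimal sketch of that one rule is below. The names `BgpState` and `refuses_inbound_connections` are made up for illustration and have nothing to do with OS10's actual code.

```python
from enum import Enum


class BgpState(Enum):
    """The six BGP FSM states named in RFC 4271."""
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"
    OPEN_SENT = "OpenSent"
    OPEN_CONFIRM = "OpenConfirm"
    ESTABLISHED = "Established"


def refuses_inbound_connections(state: BgpState) -> bool:
    """Hypothetical helper: True if the FSM rejects the peer's inbound TCP.

    Only the Idle rule from RFC 4271 is modelled here; collision handling
    in the later states is deliberately left out.
    """
    return state is BgpState.IDLE
```

So as long as the neighbour is sitting in Idle on the switch, every inbound attempt from the peer gets rejected, which matches the capture.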

It's definitely not a Layer 1-3 issue, since we have routes to the BGP peer and ARP/ping works without any problems. Our control plane ACLs are also fine and permit the traffic.

Performing a shutdown/no shutdown or a "clear ip bgp x.x.x.x" on the neighbour on the OS10 switch still does not resolve the problem and the neighbour remains in an IDLE state.

Only deleting and re-configuring the BGP neighbour resolves the problem and brings the BGP peer back online. However, if the BGP peer goes down again, it remains in IDLE.

 


March 31st, 2021 13:00

Kernelroute,

Could you confirm whether you have upgraded the firmware recently and, if so, whether this was working on the older firmware? If so, please Private Message me the service tag.

Let me know.

Thanks.

 

 


April 5th, 2021 12:00

I don't know why, but I was having the same problem in spite of having all drivers updated.

April 6th, 2021 01:00

It is due to Dell's implementation of BGP: while in the Idle state, the switch does not listen for inbound BGP connections from the peer while it is itself actively trying to establish a TCP connection to that peer.

This becomes a problem when the neighbouring peer is not accepting TCP/179 inbound and is silently dropping any inbound connections (iptables DROP rule). The switch's outbound attempt then never gets an answer, while the peer's own inbound attempts are reset, so the session never comes up.

Just make sure the remote peer is accepting TCP/179 inbound and you should be able to bounce your sessions and see them reconnect. 
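If it helps, here is a rough way to check how the remote peer treats inbound TCP/179 from a host that is allowed to reach it. This is only a sketch: `probe_bgp_port` is a name I made up, and the "filtered" label simply means the connect attempt timed out, which is what a silent DROP rule usually looks like from the outside.

```python
import socket


def probe_bgp_port(host: str, port: int = 179, timeout: float = 3.0) -> str:
    """Classify how `host` answers a plain TCP connect on `port`.

    Returns "open" if the handshake completes, "refused" if the peer
    answers with an RST, and "filtered" if nothing comes back before
    the timeout (e.g. a firewall silently dropping the SYN).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "filtered"
    except OSError as exc:
        # Anything else (no route, name resolution failure, ...).
        return f"error: {exc}"
```

Calling `probe_bgp_port("x.x.x.x")` with the peer's address and getting "filtered" would point at the silent-drop situation described above, whereas "open" or "refused" means the peer is at least answering on TCP/179. Note the probe has to come from a source address the peer's firewall rules would treat the same way as the switch.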
