BGP’s hold and keepalive timers, detecting dead neighbors and BFD
When the BGP session between two routers is established, the two routers exchange prefixes and then start sending traffic for those prefixes to the neighboring router.
But all good things must come to an end, including BGP sessions. When that happens, the router removes all prefixes received over the terminated session, and reroutes traffic over other paths, if those are available. It’s always possible to manually bring down a BGP session:
router bgp 65065
 neighbor 192.0.2.31 shutdown
!
And then bring it back up again:
router bgp 65065
 no neighbor 192.0.2.31 shutdown
!
The shutdown command can also be applied to a peer group rather than a neighbor address, bringing down all sessions to neighbors that are members of the peer group.
When the administrator of a neighboring router shuts down the BGP session, the neighboring router will normally send a NOTIFICATION message with the “cease” error code. A NOTIFICATION is always the last BGP message a router sends on a session: the TCP session that BGP runs over is then terminated, and both routers remove the prefixes previously learned over the BGP session and reroute over alternate paths.
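To make the NOTIFICATION message concrete, here is a minimal Python sketch that builds the bytes of a “cease” NOTIFICATION as laid out in RFC 4271. The function name build_notification is made up for illustration, and the use of subcode 2 (“Administrative Shutdown”, from RFC 4486) is an assumption about what a router would send after a manual shutdown; the article itself only mentions the cease error code.

```python
import struct

BGP_MARKER = b"\xff" * 16   # RFC 4271: 16 octets, all ones
MSG_NOTIFICATION = 3        # RFC 4271 message type for NOTIFICATION
ERR_CEASE = 6               # RFC 4271 error code "Cease"
SUB_ADMIN_SHUTDOWN = 2      # RFC 4486 subcode (assumed here for a manual shutdown)


def build_notification(error_code, subcode, data=b""):
    """Return the raw bytes of a BGP NOTIFICATION message (hypothetical helper)."""
    # 19-octet common header + 1 octet error code + 1 octet subcode + optional data
    length = 19 + 2 + len(data)
    header = BGP_MARKER + struct.pack("!HB", length, MSG_NOTIFICATION)
    return header + struct.pack("!BB", error_code, subcode) + data


msg = build_notification(ERR_CEASE, SUB_ADMIN_SHUTDOWN)
print(len(msg), msg.hex())
```

Without any diagnostic data the message is 21 bytes long: the 19-byte header that every BGP message carries, plus the error code and subcode.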
Obviously, it would be bad if a router kept forwarding packets to a BGP neighbor when that neighbor is no longer available, either because there’s something wrong with the neighboring router or because the connection between the routers has gone down.
The first line of defense against dead neighbors is the interface status. The various Ethernet standards as well as most non-Ethernet interfaces can detect whether there’s a signal on the wire. When that signal disappears, the interface is marked as “down” and the router will then assume all the external BGP (eBGP) sessions that run over that interface are down, too. This works really well when two BGP routers are connected over a point-to-point link without any switches between them. With a switch between routers A and B, however, router B doesn’t notice when router A goes down: the switch is still up, so B still sees a link active signal on its own interface.
Tracking the interface status can be problematic when the interface has a tendency to “flap” and go down but come back up almost immediately. In those cases, it can be better to turn off the interface tracking mechanism:
router bgp 65065
 no bgp fast-external-fallover
!
The main mechanism BGP uses to make sure neighbors are still alive is the combination of the hold time and KEEPALIVE messages. The hold time specifies how long a router will wait for incoming BGP messages before it assumes the neighbor is dead. When a router has no UPDATE messages to send, it periodically sends KEEPALIVE messages, which, well, keep the BGP session alive.
The default value for the hold time suggested in the BGP specification (RFC 4271) is 90 seconds, and keepalives should be sent at intervals of one third the hold time (30 seconds). However, Cisco uses defaults of 180 and 60 seconds. So when two Cisco routers have established a BGP session and exchanged prefixes, 60 seconds later they’ll each send a KEEPALIVE message. Upon reception of the keepalive by the other router, that router’s hold time for the session will have counted down from 180 to 120, but it now gets reinitialized to 180. This continues every 60 seconds. However, should router A lose power, then router B won’t see any keepalives. So after 180 seconds, router B decides that router A is dead, sends a NOTIFICATION message and tears down the session.
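The countdown-and-reset behavior described above can be sketched in a few lines of Python. This is a simplified model, not how any router implements it: the function name hold_timer_expiry is hypothetical, time is counted in whole seconds since the session was established, and only KEEPALIVE arrivals are modeled (in reality any BGP message restarts the hold timer).

```python
def hold_timer_expiry(hold_time, arrivals, horizon):
    """Return the second at which the hold timer runs out, or None if the
    session survives until `horizon` (simplified, hypothetical model)."""
    deadline = hold_time                # timer starts at session establishment
    for t in sorted(arrivals):
        if t >= deadline:
            break                       # too late: the timer has already expired
        deadline = t + hold_time        # each arrival restarts the hold timer
    return deadline if deadline <= horizon else None


# Cisco defaults: router A sends keepalives every 60 s, then loses power after t=240.
print(hold_timer_expiry(180, [60, 120, 180, 240], horizon=600))

# Healthy neighbor: keepalives keep arriving, so the timer never expires.
print(hold_timer_expiry(180, range(60, 601, 60), horizon=600))
```

In the first call the last keepalive arrives at 240 seconds, so router B declares router A dead at 240 + 180 = 420 seconds. The second call returns None because the timer is reset every 60 seconds.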
Of course 180 seconds (three minutes) is a rather long time to keep sending traffic to a dead router. Fortunately, it’s possible to change the BGP timers, both the defaults and on a per-neighbor basis:
router bgp 65065
 timers bgp 3 15
 neighbor 192.0.2.31 timers 10 32
!
The timers bgp 3 15 command makes the router send keepalives every three seconds and use a hold timer of 15 seconds by default. For the session to neighbor 192.0.2.31 a keepalive interval of ten seconds is used, and a hold time of 32 seconds.
In the OPEN message, BGP routers exchange the hold time they want to use. According to the BGP RFC, a router may reject the neighbor’s hold time and refuse to establish the session, but in practice that never happens. When the session establishes, the lower of the two hold times announced by the two routers is used by both. The keepalive interval is capped at a third of the hold time. So with timers bgp 3 15 in effect, and assuming the neighbor didn’t ask for a hold time under 15 seconds, both we and the neighbor will be using a hold time of 15 seconds. We’ll be using the configured keepalive interval of three seconds as that’s lower than 15 / 3 = 5, but the neighbor will probably use five seconds.
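The negotiation rules in the previous paragraph can be written down as a small Python sketch. The function name negotiate_timers is hypothetical, and the integer division reflects the assumption that the keepalive cap of one third of the hold time is rounded down to whole seconds.

```python
def negotiate_timers(configured_keepalive, configured_hold, peer_hold):
    """Return (hold_time, keepalive_interval) one router ends up using.

    Hypothetical helper: the lower hold time wins, and the keepalive
    interval is capped at one third of it (rounded down, an assumption).
    """
    hold = min(configured_hold, peer_hold)
    if hold == 0:
        return 0, 0                     # hold time zero: no keepalives at all
    keepalive = min(configured_keepalive, hold // 3)
    return hold, keepalive


# Our side: timers bgp 3 15, neighbor announces the Cisco default hold time of 180.
print(negotiate_timers(3, 15, peer_hold=180))

# The neighbor's view: configured 60/180, but we announced a hold time of 15.
print(negotiate_timers(60, 180, peer_hold=15))
```

The first call yields a hold time of 15 and a keepalive interval of 3 seconds. The second yields 15 and 5, matching the five seconds the neighbor is expected to use.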
By using timers 10 32 rather than timers 10 30, the other side will use 32 / 3 = 10 (rounded down) for the keepalive interval, so we should expect keepalives after 10, 20 and 30 seconds, and after 32 seconds we should have seen all three. With timers 10 30, on the other hand, we could be tearing down the BGP session after 30 seconds just as the neighboring router is sending that third KEEPALIVE message, hence the additional two seconds.
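A quick way to check this margin is to list the moments at which a keepalive can arrive strictly before the hold timer runs out. The Python sketch below does that under idealized assumptions (no network delay, keepalives sent at exact multiples of the interval); the function name keepalive_chances is made up for illustration.

```python
def keepalive_chances(hold_time, keepalive_interval):
    """List the seconds at which a keepalive arrives strictly before the
    hold timer expires, counted from the last timer reset (idealized)."""
    return list(range(keepalive_interval, hold_time, keepalive_interval))


print(keepalive_chances(32, 10))   # hold time 32: three chances
print(keepalive_chances(30, 10))   # hold time 30: the third one is too late
```

With a hold time of 32 seconds the list is 10, 20 and 30, so three keepalives must all be lost before the session goes down. With a hold time of 30 seconds only 10 and 20 are safely inside the window.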
If the hold time is zero, then no keepalives are sent and the session will never time out, no matter how long the neighbor stays silent. Values of one or two are illegal, so the minimum working value for the hold time is three seconds. However, very short hold times (under about ten seconds) may lead to disconnected BGP sessions if BGP packets are buffered longer than expected and/or one of the routers is too busy to run the BGP process often enough to send timely keepalives.
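These rules are easy to turn into a sanity check for a configuration script. The following Python sketch is only an illustration: the function name classify_hold_time and the returned labels are invented here, and the ten-second threshold is the rough rule of thumb from the paragraph above, not a protocol limit. The upper bound of 65535 follows from the hold time being a two-octet field in the OPEN message.

```python
def classify_hold_time(seconds):
    """Classify a proposed BGP hold time (hypothetical helper and labels)."""
    if not 0 <= seconds <= 65535:
        raise ValueError("hold time must fit in the two-octet OPEN field")
    if seconds == 0:
        return "disabled"       # no keepalives, session never times out
    if seconds < 3:
        raise ValueError("hold times of 1 and 2 seconds are illegal")
    if seconds < 10:
        return "risky"          # legal, but may cause spurious session resets
    return "ok"


for value in (0, 3, 15, 180):
    print(value, classify_hold_time(value))
```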
Should you require faster rerouting after failures, you may want to use the BFD protocol along with BGP. Bidirectional Forwarding Detection (RFC 5880) is a protocol that detects whether neighboring routers are operational similar to how the BGP hold time / keepalive mechanism works. However, it’s designed to do this much faster than BGP, automatically adapting to slower systems. BFD first needs to be enabled on an interface:
interface FastEthernet2/0
 bfd interval 50 min_rx 50 multiplier 5
!
In this example, BFD packets are sent every 50 milliseconds (the minimum interval allowed) and expected every 50 milliseconds. When five packets are missed (so after 250 ms) the neighbor is considered dead. BFD then must be applied to each individual BGP session as desired:
router bgp 65065
 neighbor 192.0.2.31 fall-over bfd
!
BFD must be enabled on both routers in order to be used for the BGP session between them.
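To see how the BFD numbers above combine, and how BFD adapts to a slower neighbor, here is a small Python sketch of the detection time calculation for BFD's asynchronous mode as described in RFC 5880: the multiplier announced by the remote system times the greater of our own minimum receive interval and the remote system's desired transmit interval. The function and parameter names are hypothetical.

```python
def bfd_detection_time_ms(local_min_rx_ms, remote_min_tx_ms, remote_multiplier):
    """Detection time in milliseconds as seen by the local router
    (sketch of the RFC 5880 asynchronous mode calculation)."""
    # The remote side transmits no faster than we are willing to receive,
    # and no faster than it wants to send.
    agreed_interval = max(local_min_rx_ms, remote_min_tx_ms)
    return remote_multiplier * agreed_interval


# Both routers configured with: bfd interval 50 min_rx 50 multiplier 5
print(bfd_detection_time_ms(50, 50, 5))

# A slower neighbor that only wants to send every 200 ms
print(bfd_detection_time_ms(50, 200, 5))
```

With the configuration shown earlier on both routers, the result is the 250 ms mentioned above. If the neighbor asks for a 200 ms transmit interval instead, detection takes 1000 ms, which is still far faster than even an aggressive BGP hold time.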