Is BGP multi-homing enough for WAN network performance?
BGP routing decisions are largely based on AS-path length (the number of AS hops) and manually configured static preferences. BGP has no capability to discover any other performance characteristics. As a result, metrics such as packet loss, latency, throughput, link capacity and congestion, historical reliability, and other business characteristics are not addressed by the protocol. Because BGP cannot measure any of these, the routers relying on it cannot make dynamic, performance-optimized routing decisions.
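The selection logic described above can be sketched in a few lines. This is a deliberately simplified illustration, not the full BGP decision process: it compares only local preference and AS-path length, and the `Route` class, provider names, AS numbers, and measurements are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Route:
    provider: str        # hypothetical upstream name
    local_pref: int      # manually configured static preference
    as_path: List[int]   # AS numbers from the documentation range
    latency_ms: float    # measured out of band; BGP never sees this
    loss_pct: float      # measured out of band; BGP never sees this


def bgp_best_path(routes: List[Route]) -> Route:
    """Simplified selection: highest local_pref, then shortest AS path.

    Latency and loss are intentionally absent from the sort key,
    because BGP has no way to learn them.
    """
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))


routes = [
    # Shorter AS path, but slow and lossy.
    Route("transit_a", 100, [64496, 64497], latency_ms=180.0, loss_pct=4.0),
    # Longer AS path, but fast and clean.
    Route("transit_b", 100, [64498, 64499, 64500], latency_ms=35.0, loss_pct=0.0),
]

best = bgp_best_path(routes)
print(best.provider, best.latency_ms, best.loss_pct)
```

With equal local preference, the two-hop path through `transit_a` wins even though its assumed latency and loss are far worse, because neither value takes part in the comparison.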
Settlement-free peering and best-effort traffic delivery are vital for the efficiency and relatively low cost of operating and connecting to the Internet. Best-effort delivery, however, has a flaw: congestion. Congestion is caused by port oversubscription at some transit providers, DDoS attacks, daily traffic peaks, and even congested public Internet exchanges. Other problems stem from BGP’s inherent trust between peering partners. This implied trust means that all route updates are considered valid and are treated as such. However, due to convergence delay, misconfiguration, external protocol interaction, and many other reasons, not all updates are valid. In the worst cases, invalid updates can lead to routing loops or blackouts. Blackouts happen during an outage in a transit provider’s network while the upstream provider still announces the routes to its customers, so their traffic is sent into a black hole. If the blackout is total, network engineers will notice it and shut down the BGP session. A partial routing blackout is hard to diagnose and troubleshoot because of routing asymmetry on the Internet.
Since BGP is focused on reachability and its own stability, traffic is rerouted only in response to hard failures. Hard failures are total losses of reachability, as opposed to degradation. This means that even though service may be so degraded that it is unusable for an end user, BGP will continue to treat the degraded route as valid unless it is invalidated by a total loss of reachability. As a dynamic routing protocol, BGP is unfortunately reactive only in cases of total failure.
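The gap between a hard failure and degradation can be made concrete with a small sketch. The `PathState` class, both functions, and the loss and latency thresholds below are assumptions made for illustration; they do not come from BGP or from any particular product.

```python
from dataclasses import dataclass


@dataclass
class PathState:
    reachable: bool    # the only property BGP effectively reacts to
    loss_pct: float    # hypothetical measured packet loss
    latency_ms: float  # hypothetical measured round-trip time


def bgp_keeps_route(state: PathState) -> bool:
    """BGP's view: the route stays valid while the destination is reachable."""
    return state.reachable


def usable_for_users(state: PathState,
                     max_loss_pct: float = 2.0,
                     max_latency_ms: float = 150.0) -> bool:
    """A performance-aware view, using assumed example thresholds."""
    return (state.reachable
            and state.loss_pct <= max_loss_pct
            and state.latency_ms <= max_latency_ms)


degraded = PathState(reachable=True, loss_pct=15.0, latency_ms=400.0)
failed = PathState(reachable=False, loss_pct=100.0, latency_ms=float("inf"))

for name, state in (("degraded", degraded), ("failed", failed)):
    print(name, bgp_keeps_route(state), usable_for_users(state))
```

For the degraded path the two views disagree: the route is still valid as far as BGP is concerned, yet it fails the assumed usability thresholds. Only the fully failed path is rejected by both.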
Multi-homing avoids downtime by providing redundancy, but it does not address the performance and congestion problems that occur in the “middle mile” linking backbone networks. Therefore, simple BGP multi-homing is not enough.