In the BGP path selection algorithm, the MED comes into play when there are multiple paths to a destination prefix that have the same local preference and the same AS path length. And, unless we configure always-compare-med, MEDs for different paths are only compared for paths towards the same neighboring AS. In other words, the purpose of the MED is to select the best path when there are multiple connections between two autonomous systems. Today, we’re going to focus on optimizing traffic flow between two networks that interconnect using multiple routers connected to one internet exchange, but the same applies to interconnecting using multiple internet exchanges in the same region.
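The comparison rule above can be sketched in a few lines of Python. This is a simplified model, not a router implementation: the `Path` class and function names are made up for illustration, the origin step and everything after the MED step are left out, and a missing MED is assumed to count as 0.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Path:
    name: str
    local_pref: int
    as_path: Tuple[int, ...]      # neighboring AS first
    med: Optional[int] = None     # None: no MED received (assumed to mean 0)


def neighbor_as(path):
    return path.as_path[0] if path.as_path else None


def med_step(paths, always_compare_med=False):
    """Drop every path that loses a MED comparison it is eligible for."""
    def med(path):
        return path.med if path.med is not None else 0

    survivors = []
    for p in paths:
        beaten = any(
            (always_compare_med or neighbor_as(q) == neighbor_as(p))
            and med(q) < med(p)
            for q in paths
        )
        if not beaten:
            survivors.append(p)
    return survivors


def candidates_after_med(paths, always_compare_med=False):
    """Highest local preference, then shortest AS path, then the MED step."""
    best_lp = max(p.local_pref for p in paths)
    paths = [p for p in paths if p.local_pref == best_lp]
    shortest = min(len(p.as_path) for p in paths)
    paths = [p for p in paths if len(p.as_path) == shortest]
    return med_step(paths, always_compare_med)
```

With two paths through the same neighboring AS and one through a different AS, only the two paths that share a neighboring AS are compared with each other, unless `always_compare_med` is set.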
When you connect two routers to the same internet exchange (or, more generally, use two routers in the same location to connect to external networks), it’s best practice to set up BGP sessions to/from each router. So if networks A and B interconnect over an exchange and both networks have two routers connected to the IX, this will result in four BGP sessions, as shown in figure 1.
Figure 1: BGP sessions between two networks with two routers connected through an internet exchange.
If you then let the BGP path selection algorithm choose the best path without adjusting any of the BGP attributes, all the attributes it compares will be the same for each of these paths.
So for incoming traffic, the path selection algorithm will end up relying on the penultimate tiebreaker rule: select the route(s) advertised by the BGP speaker with the lowest BGP identifier. For outgoing traffic, the decision will come down to the last tiebreaker: select the route received from the lowest peer address.
On Cisco routers, the BGP identifier defaults to the highest IPv4 address configured on a loopback interface, or the highest IPv4 address configured on a physical interface if no loopback interfaces are configured.
Warning: Should the router run IPv6 only, you’ll need to configure a BGP identifier explicitly using the bgp router-id command.
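To make the default selection concrete, here is a small Python sketch of the rule as described above. The function name and the way addresses are passed in are illustrative assumptions, and details such as interface state are ignored.

```python
import ipaddress


def default_bgp_identifier(loopback_addrs, physical_addrs):
    """Highest loopback IPv4 address, else highest physical IPv4 address.

    Returns None when no IPv4 address exists at all, which is the
    IPv6-only case where the identifier has to be configured by hand.
    """
    for group in (loopback_addrs, physical_addrs):
        parsed = [ipaddress.ip_address(a) for a in group]
        v4 = [a for a in parsed if a.version == 4]
        if v4:
            # Address objects compare numerically, not as strings.
            return str(max(v4))
    return None
```

Note that the comparison is numeric: 198.51.100.20 is higher than 198.51.100.7, even though it sorts lower as a string.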
So without further action, most of both incoming and outgoing traffic will flow through the same router. This may actually be desirable when capacity isn’t an issue, as this makes the traffic flow predictable and debugging problems easier. Having all traffic flow through one router can also be useful if that router is faster, or has a higher capacity link to the IX or to the internal network. However, in that case you’ll probably want to explicitly choose which router handles the traffic rather than depend on the BGP tie-breaking rules.
For incoming traffic, you can do this by advertising your prefixes with different MEDs through both routers. For instance, this configuration sets the MED to 10:
router bgp 65549
 neighbor 10.0.0.1 remote-as 65550
 neighbor 10.0.0.1 route-map setmed10 out
 neighbor 10.0.0.2 remote-as 65550
 neighbor 10.0.0.2 route-map setmed10 out
!
route-map setmed10 permit 10
 set metric 10
!
With this on the primary router and an equivalent configuration that sets the MED to 20 on the backup/secondary router, all incoming traffic will flow towards the primary router, at least if neighboring networks honor your MEDs. It’s generally considered polite to accept the preference of neighboring networks unless you have a good reason to overrule this preference.
Note: Some networks simply overwrite the MED values they receive, so sending MEDs to influence traffic flow won’t work towards those networks.
Also, remember that the MED only survives one inter-AS hop. So sending different MEDs towards a neighboring AS will influence how you receive traffic from that AS, but networks that are two or more AS hops away won’t see the MEDs; if you want to influence routing decisions in those networks, as well as in networks that ignore advertised MEDs, you’ll have to perform AS path prepending or announce more specifics.
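The one-hop lifetime of the MED can be illustrated with a toy model. The `Route` class and `advertise_to_ebgp` function below are hypothetical names for this sketch, and the prefix is a documentation address; the only point is that a received MED is not passed on to the next AS.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: Tuple[int, ...] = ()
    med: Optional[int] = None  # None: no MED attribute present


def advertise_to_ebgp(route, local_as, outbound_med=None):
    """Return the route as an eBGP neighbor would receive it.

    The MED that came in with the route is dropped; only a MED that the
    advertising AS sets itself (outbound_med) is included.
    """
    return replace(route, as_path=(local_as,) + route.as_path, med=outbound_med)


ours = Route("192.0.2.0/24")
at_neighbor = advertise_to_ebgp(ours, local_as=65549, outbound_med=10)
two_hops_away = advertise_to_ebgp(at_neighbor, local_as=65550)
```

Here `at_neighbor` carries the MED of 10 that AS 65549 set, while `two_hops_away` has no MED at all, so the only thing left to influence that network with is the AS path or the prefix length.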
If we want to influence outgoing traffic, we’ll have to adjust incoming MEDs. This could be done by applying the same setmed10 route map from the example above in the in direction on a BGP session. However, that makes you one of those impolite networks that don’t honor their neighbors’ MEDs. So usually, it’s better to adjust the MEDs your neighbor sends you rather than simply overwrite them:
router bgp 65549
 neighbor 10.0.0.1 remote-as 65550
 neighbor 10.0.0.1 route-map addmed10 in
 neighbor 10.0.0.2 remote-as 65550
 neighbor 10.0.0.2 route-map addmed10 in
!
route-map addmed10 permit 10
 set metric +10
!
With this configuration in effect and set metric +20 on the backup router, if your neighbor sends you a MED of 0 or no MED, the MED will be 10 for prefixes received by the primary router and 20 for those prefixes on the backup router. So other routers in the network will send traffic to the destinations in question through the primary router. Even if the traffic ends up at the backup router, the backup router will send it to the primary router rather than deliver it to the neighboring AS itself, as the MED step in the BGP path selection algorithm comes in just before the “prefer eBGP over iBGP” step.
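The ordering argument in the last sentence can be checked with a tiny Python sketch. It assumes all earlier steps (local preference, AS path length, same neighboring AS) are tied, and the dictionary keys are made up for illustration.

```python
def best_path(routes):
    """Lower MED wins first; eBGP beats iBGP only when the MEDs are equal."""
    return min(routes, key=lambda r: (r["med"], r["learned_via"] != "ebgp"))


# What the backup router sees for one prefix: its own eBGP route (+20)
# and the primary router's route (+10) learned over iBGP.
on_backup = [
    {"name": "own eBGP session", "med": 20, "learned_via": "ebgp"},
    {"name": "iBGP from primary", "med": 10, "learned_via": "ibgp"},
]
chosen = best_path(on_backup)
```

Because the MED is part of the sort key before the eBGP/iBGP flag, `chosen` is the iBGP route from the primary router. If both MEDs were equal, the backup router's own eBGP route would win instead.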
If the neighboring network has a similar configuration, they will be sending a MED of 10 from their primary router and 20 from their secondary router. This will result in the following MED values:
Local router | Local MED | Remote router | Remote MED | Resulting MED |
R1 | +10 | R1 | 10 | 20 |
R1 | +10 | R2 | 20 | 30 |
R2 | +20 | R1 | 10 | 30 |
R2 | +20 | R2 | 20 | 40 |
So under normal circumstances, the traffic will flow from R1 to R1, with either R1 to R2 or R2 to R1 as a backup when there is an outage, and only from R2 to R2 as a last resort.
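The resulting MEDs are simply the sum of the MED the remote router sends and the value the local router adds. The following Python sketch recomputes the combinations and sorts them by preference; the dictionary and function names are illustrative only.

```python
from itertools import product

LOCAL_ADD = {"R1": 10, "R2": 20}    # what each local router adds on input
REMOTE_MED = {"R1": 10, "R2": 20}   # what each remote router advertises


def resulting_meds(local_add, remote_med):
    """All local/remote router pairs with their resulting MED, best first."""
    rows = [
        (local, remote, local_add[local] + remote_med[remote])
        for local, remote in product(local_add, remote_med)
    ]
    return sorted(rows, key=lambda row: row[2])


for local, remote, med in resulting_meds(LOCAL_ADD, REMOTE_MED):
    print(f"local {local} -> remote {remote}: MED {med}")
```

The lowest sum belongs to the R1/R1 pair, the two mixed pairs tie in the middle, and R2/R2 comes last.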
If you don’t really care that much which router handles the traffic, you may choose to optimize traffic flow slightly by prioritizing shorter, more local paths. For instance, suppose the IX is present in locations A and B, and you’re connected in location A. Some of your peers will have routers in both locations A and B. All else being equal, it’s preferable to exchange traffic with a neighboring AS through their router in location A. The path will be somewhat shorter, there is less layer 2 equipment in the middle that could fail, and if your router and the remote router are connected to the same Ethernet switch, the only bottleneck between the routers (other than their IX connections) will be the backplane of the IX switch. Switch backplanes are rarely a bottleneck, but the backhaul link between the IX switches in two locations could be. So in this case, you’ll have to determine the location of each neighboring AS’s router and apply different MEDs based on that, as in this configuration:
router bgp 65549
 neighbor ix-location-a peer-group
 neighbor ix-location-a route-map addmed10 in
 neighbor ix-location-a route-map setmed10 out
 neighbor ix-location-b peer-group
 neighbor ix-location-b route-map addmed20 in
 neighbor ix-location-b route-map setmed20 out
 neighbor 10.0.0.1 remote-as 65550
 neighbor 10.0.0.1 peer-group ix-location-a
 neighbor 10.0.0.2 remote-as 65550
 neighbor 10.0.0.2 peer-group ix-location-b
!
! addmed20 and setmed20 mirror the addmed10 and setmed10 route maps shown earlier
route-map addmed20 permit 10
 set metric +20
!
route-map setmed20 permit 10
 set metric 20
!
Note that this configuration is equally useful if you only have a single router connected to the internet exchange.
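With more than a handful of peers, the per-neighbor lines are tedious to write by hand. As a rough sketch, a few lines of Python could generate them from a table of neighbors; the table itself (which neighbor address sits in which IX location) is hypothetical data you would have to collect yourself, and the function name is made up.

```python
def peer_group_lines(neighbors):
    """Build the per-neighbor lines for the location-based peer groups.

    neighbors maps a neighbor address to (remote AS, IX location letter).
    """
    lines = []
    for address, (remote_as, location) in sorted(neighbors.items()):
        lines.append(f" neighbor {address} remote-as {remote_as}")
        lines.append(f" neighbor {address} peer-group ix-location-{location}")
    return lines


example = {
    "10.0.0.1": (65550, "a"),
    "10.0.0.2": (65550, "b"),
}
print("\n".join(peer_group_lines(example)))
```

The output only covers the neighbor statements; the peer group and route map definitions still have to be configured separately.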
If you have two routers and optimizing traffic flow to keep it local isn’t relevant, you can use a similar configuration to better balance traffic between the two routers as follows. On one router, we apply the same MED to all neighbors to have a consistent baseline:
route-map addmed15 permit 10
 set metric +15
!
route-map setmed15 permit 10
 set metric 15
!
Then, on the other router, distribute the neighbors over peer groups med10 and med20, which use route maps that set or add 10 and 20 in the same way as the earlier examples, until traffic is reasonably balanced:
router bgp 65549
 neighbor med10 peer-group
 neighbor med10 route-map addmed10 in
 neighbor med10 route-map setmed10 out
 neighbor med20 peer-group
 neighbor med20 route-map addmed20 in
 neighbor med20 route-map setmed20 out
 neighbor 10.0.0.1 remote-as 65550
 neighbor 10.0.0.1 peer-group med10
 neighbor 10.0.0.2 remote-as 65550
 neighbor 10.0.0.2 peer-group med20
!
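Deciding which neighbor goes into which peer group is a matter of trial and error, but a first guess can be computed. The Python sketch below uses a simple greedy approach, which is an assumption of this example rather than anything BGP prescribes: neighbors in med10 beat the baseline of 15, so their traffic uses this router, while neighbors in med20 lose to the baseline and use the other router. The traffic figures are invented.

```python
def assign_peer_groups(traffic_mbps):
    """Greedy split: biggest neighbors first, each to the lighter side.

    Returns (neighbor -> peer group, peer group -> total Mbps).
    """
    load = {"med10": 0, "med20": 0}
    groups = {}
    by_volume = sorted(traffic_mbps.items(), key=lambda kv: kv[1], reverse=True)
    for neighbor, mbps in by_volume:
        group = min(load, key=load.get)  # on a tie, med10 is picked
        groups[neighbor] = group
        load[group] += mbps
    return groups, load


# Hypothetical per-neighbor traffic measurements in Mbps.
measured = {"10.0.0.1": 400, "10.0.0.2": 300, "10.0.0.3": 200, "10.0.0.4": 100}
groups, load = assign_peer_groups(measured)
```

For these numbers both sides end up with 500 Mbps. In practice traffic levels shift over time, so the split needs to be revisited now and then.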