The Mess we call BGP

Ever wonder why BGP seems to be such a complicated protocol to administer? It seems pretty straightforward to set up. Some commands, and you have a BGP session. Easy huh? BGP is one of those things where the more BGP feeds you bring in, the more complex traffic management becomes. Why? Take a look at the following graphic.

What you are looking at is a small visualization of some of the AS connections to Hurricane Electric (AS6939) in North America. This is not all of them, just what I could fit on the screen for this article. Some of these are “transit ASes”, which means they sit between Hurricane Electric and one or more other networks. This is important to understand because, if they sit between you and customers or resources on Hurricane Electric, they can influence how your traffic gets there. The same goes for Hurricane Electric itself. They are a transit AS between companies and resources, so their BGP policies can influence your traffic. This is just one AS. There are thousands and thousands of others.

Now imagine you have 4 upstream providers, each with its own peerings and upstream peers. Each one of them can manipulate routes to the same destination in different ways. Your routers will pick the best path, but that path may have congestion or a host of other influences on your traffic.

For myself, as a network engineer, being able to diagnose and troubleshoot path issues is an art, just as it is a science.

A tool to find out if BGP is lying to you

APNIC has a blog article on detecting “BGP lies”.
Do you ever wonder whether you can really trust other networks, such as your provider(s) and peers? More precisely, wouldn’t you like to be able to tell if the traffic you send always flows through the paths received in the Border Gateway Protocol (BGP)? Could it be that, for some prefixes, the forwarding path might differ?
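
The core comparison behind that question can be sketched in a few lines. This is only an illustration of the idea, not the tool from the APNIC article: it assumes you have already mapped the hops of a traceroute to AS numbers by some other means, and the function names and AS numbers (taken from the documentation range) are hypothetical.

```python
def collapse(path):
    """Drop consecutive duplicate ASNs, such as those added by AS-path prepending."""
    out = []
    for asn in path:
        if not out or out[-1] != asn:
            out.append(asn)
    return out


def paths_agree(bgp_as_path, forwarding_as_path):
    """True when the AS path learned via BGP matches the ASes the traffic actually crossed."""
    return collapse(bgp_as_path) == collapse(forwarding_as_path)


# Hypothetical data: the AS path BGP advertised, and the ASes seen on the forwarding path.
advertised = [64496, 64497, 64497, 64499]
observed_same = [64496, 64497, 64499]
observed_detour = [64496, 64498, 64499]
```

Here `paths_agree(advertised, observed_same)` holds despite the prepend, while `observed_detour` would flag a prefix whose forwarding path differs from what BGP claimed.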

BGP, a single /24 and two diverse non-connected exit points

I am starting to see the following scenario more and more now that IPv4 space is hard, but not impossible, to get.

With ARIN it is still possible to get an IPv4 allotment. Many smaller ISPs qualify for a /24 and can get one if they wait long enough on the ARIN waiting list. A /24 of IPv4 space is the smallest block that 99% of the Internet allows to be advertised on the capital-I Internet. Filter rules are in place that drop smaller prefixes because that is the agreed-upon norm.

So what happens if you are an ISP with a shiny new /24 but you have two networks that are not connected? Let’s look at our scenario.

The above networks have no connectivity between them on the internal side. They could be halfway across the world or next door. If they were halfway across the world it would make sense to try to get another /24. Maybe they are on either side of a big mountain, or one is down in a valley and there is no way to get a decent link between the two networks.

So what is a way you can use this /24 and still be able to assign IP addresses to both sides of the network? One way is to use a tunnel between your two edge routers.

Without the tunnel, traffic could come into Network1, but if the destination IP is assigned on Network2 it will come back as unreachable. BGP is all about networks finding the shortest path to other networks. You don’t have much control over how networks find your public IP space if you have two providers advertising the same information. Some of the Internet will come in via Network2 and some will come in via Network1.

By running a tunnel between the two you can now subnet that /24 into two equal /25s and assign one /25 to Network1 and one /25 to Network2, or divide it however you want. You can make the tunnel a GRE, EOIP, or other tunnel type. If I am using Mikrotik I prefer to use EOIP. If it’s another vendor I tend to use GRE.
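
To make the split concrete, here is a minimal sketch using Python's standard ipaddress module. The prefix 192.0.2.0/24 is a documentation range used as a stand-in for the ISP's real block, and the function name is made up for illustration.

```python
import ipaddress

# Assumption: 192.0.2.0/24 (a documentation prefix) stands in for the ISP's real /24.
block = ipaddress.ip_network("192.0.2.0/24")

# Split the /24 into two equal /25s, one for each side of the tunnel.
network1_half, network2_half = block.subnets(prefixlen_diff=1)


def site_for(address):
    """Return which site an address inside the /24 belongs to."""
    ip = ipaddress.ip_address(address)
    if ip in network1_half:
        return "Network1"
    if ip in network2_half:
        return "Network2"
    raise ValueError(f"{address} is outside {block}")
```

With this stand-in block, Network1 ends up with 192.0.2.0/25 and Network2 with 192.0.2.128/25, so an address such as 192.0.2.200 belongs on the Network2 side no matter which upstream delivered the packet.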

Once the tunnel is established you can use static routing, OSPF, or your favorite IGP (Interior Gateway Protocol) to “tell” one side about the routes on the other side. Let’s look at a fictional use.

In the above example our fictional ISP has an IPv4 block of They have two networks separated by a tall mountain range in the center. It’s too cost prohibitive to run fiber or a wireless backhaul between the two networks so they have two different upstream providers. The ISP is advertising this /24 via BGP to Upstream1 from the Network 1 router. Network 2 router is also advertising the same /24 via BGP to Upstream 2.

We now create a Tunnel between the Mikrotiks. As mentioned before this can be EOIP, GRE, etc. We won’t go into the details of the tunnel but let’s assume the ISP is using Mikrotik. We create an EOIP tunnel (tons of tutorials out there) between Network 1 router and Network 2 router. Once this is established we will use as our “Glue” on our tunnel interfaces at each side. Network 1 router gets Network 2 router gets

To keep it simple we have a static route statement on the Network 1 Mikrotik router that looks like this:

/ip route add dst-address= gateway=

This statement routes any traffic that comes in for via ISP 1 to network1 across the tunnel to the Network 2 router. The Network 2 router then sends it to the destination inside that side of the network.

Conversely, we have a similar statement in the Network 2 Mikrotik router

/ip route add dst-address= gateway=

This statement routes any traffic that comes in for via ISP 2 to network2 across the tunnel to the Network 1 router. The Network 1 router then sends it to the destination inside that side of the network.

It’s as simple as that. You can apply this to any other vendor such as Cisco, Juniper, pfSense, etc. You also do not have to split the network into even /25s like I did. You can choose to keep most of the IPs on one side and route a /29 or something similar to the other side.
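
For the uneven split, a rough sketch of what stays on the first side after carving out a /29 might look like this. Again, 192.0.2.0/24 and the choice of the last /29 are assumptions for illustration only.

```python
import ipaddress

# Assumption: documentation prefix standing in for the ISP's real /24.
block = ipaddress.ip_network("192.0.2.0/24")

# Hypothetical choice: route the last /29 of the block across the tunnel to the other side.
far_side = ipaddress.ip_network("192.0.2.248/29")

# Everything in the /24 except that /29, expressed as the fewest possible prefixes.
near_side = sorted(block.address_exclude(far_side))

# Number of addresses left on the near side.
near_side_addresses = sum(net.num_addresses for net in near_side)
```

In this example only the /29 needs a static route across the tunnel. The near-side router keeps the remaining 248 addresses, which the sketch expresses as five prefixes ranging from a /25 down to a /29.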

The major drawback of this scenario is that you will take a speed hit: traffic that comes in on one side and has to cross the tunnel must go back out to the public Internet and over to the other ISP.


FD-IX: Local-pref and default routes

I just finished up an article over on the FD-IX blog about local-prefs, default routes, and Internet exchanges.

Not everyone on the Internet needs full feeds from their provider. In this case, how does learning routes from an Internet Exchange such as FD-IX benefit you if all you are doing is default routes?

So let’s take a scenario. You are a local hosting company. You don’t provide Internet to customers, you just do hosting of websites and data. You have a couple of providers you are buying Internet from, mainly for redundancy. One of these is primary and the other is a backup. You are doing BGP just because. All you are receiving from these providers is a default route and that is it. Why would you want to receive all these routes from an IX?
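
One piece of the answer is longest-prefix matching: a more specific route learned from the exchange is preferred over a default route for the destinations it covers. The sketch below is a toy lookup, not a router implementation. The prefixes come from documentation ranges and the next-hop labels are made up.

```python
import ipaddress

# Hypothetical routing table: a default route from a transit provider
# plus one more specific prefix learned from a peer at the exchange.
routes = [
    (ipaddress.ip_network("0.0.0.0/0"), "primary-transit"),
    (ipaddress.ip_network("198.51.100.0/24"), "ix-peer"),
]


def lookup(destination):
    """Return the next hop of the longest matching prefix for a destination."""
    ip = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if ip in net]
    # The default route matches everything, so there is always at least one match.
    return max(matches, key=lambda match: match[0].prefixlen)[1]
```

Traffic to 198.51.100.7 would leave via the exchange peer, while anything else, such as 203.0.113.9, still follows the default route to the transit provider.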

What is routing? MANRS
The Internet has over 68,000 publicly visible networks, which means it’s impractical to know about the existence of every other network or how they’re connected. Networks can also appear and disappear, whilst connections are constantly coming and going due to various faults and reconfigurations. This makes it too complex to take manual decisions about how to route packets across the Internet.

Hurricane Electric Route Filtering Algorithm

The following is from Hurricane Electric. It outlines the criteria HE.NET uses for filtering routes from peers and customers.

This is the route filtering algorithm for customers and peers that have explicit filtering:

1. Attempt to find an as-set to use for this network.
1.1 Inspect the aut-num for this ASN to see if we can extract from their IRR policy for what they would announce to Hurricane by finding export or mp-export to AS6939, ANY, or AS-ANY.
1.2 Also see if they set what looks like a valid IRR as-set name in peeringdb.

2. Collect the received routes for all BGP sessions with this ASN. This details both accepted and filtered routes.

3. For each route, perform the following rejection tests:
3.1 Reject default routes and ::/0.
3.2 Reject paths using BGP AS_SET notation (i.e. {1} or {1 2}, etc). See draft-ietf-idr-deprecate-as-set-confed-set.
3.3 Reject prefix lengths less than minimum and greater than maximum. For IPv4 this is 8 and 24. For IPv6 this is 16 and 48.
3.4 Reject bogons (RFC1918, documentation prefix, etc).
3.5 Reject exchange prefixes for all exchanges Hurricane Electric is connected to.
3.6 Reject routes that have RPKI status INVALID_ASN or INVALID_LENGTH based on the origin AS and prefix.

4. For each route, perform the following acceptance tests:
4.1 If the origin is the neighbor AS, accept routes that have RPKI status VALID based on the origin AS and prefix.
4.2 If the prefix is an announced downstream route that is a subnet of an accepted originated prefix that was accepted due to either RPKI or an RIR handle match, accept the prefix.
4.3 If RIR handles match for the prefix and the peer AS, accept the prefix.
4.4 If this prefix exactly matches a prefix allowed by the IRR policy of this peer, accept the prefix.
4.5 If the first AS in the path matches the peer and path is two hops long and the origin AS is in the expanded as-set for the peer AS and either the RPKI status is VALID or there is an RIR handle match for the origin AS and the prefix, accept the prefix.

5. Reject all prefixes not explicitly accepted
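
A few of the rejection tests in step 3 can be sketched with Python's ipaddress module. This is only an illustration of rules 3.1, 3.3, and 3.4 under stated assumptions, not Hurricane Electric's code: the bogon list is a short sample, and the function and constant names are hypothetical.

```python
import ipaddress

# Assumption: a short sample bogon list, not Hurricane Electric's actual list.
BOGONS = [ipaddress.ip_network(p) for p in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",      # RFC 1918
    "192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24",  # documentation prefixes
)]

# Minimum and maximum prefix lengths per IP version, as given in rule 3.3.
LENGTH_LIMITS = {4: (8, 24), 6: (16, 48)}


def rejection_reason(prefix):
    """Return why a prefix would be rejected, or None if it passes these three tests."""
    net = ipaddress.ip_network(prefix)
    if net.prefixlen == 0:
        return "default route"                      # rule 3.1
    shortest, longest = LENGTH_LIMITS[net.version]
    if not shortest <= net.prefixlen <= longest:
        return "prefix length"                      # rule 3.3
    if any(net.version == b.version and net.subnet_of(b) for b in BOGONS):
        return "bogon"                              # rule 3.4
    return None
```

A prefix that passes these tests is not accepted yet. It still has to match one of the acceptance tests in step 4, otherwise step 5 rejects it.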

Don’t try this at home kids. Automated BGP Optimization
Conclusion? Do not try to optimize the routes with automated software – BGP is a distance-vector routing protocol that has proved, throughout the years, its ability to handle the traffic. Software, wanting to “optimize” the system involving thousands of members would never be smart enough to compute all the possible outcomes of such manipulation.