BGP Monitoring RFC 7854

https://tools.ietf.org/html/rfc7854

   This document defines the BGP Monitoring Protocol (BMP), which can be
   used to monitor BGP sessions.  BMP is intended to provide a
   convenient interface for obtaining route views.  Prior to the
   introduction of BMP, screen scraping was the most commonly used
   approach to obtaining such views.  The design goals are to keep BMP
   simple, useful, easily implemented, and minimally service affecting.
   BMP is not suitable for use as a routing protocol.

How I learned to love BGP communities and so can you

This content is for Patreon subscribers of the j2 blog. Please consider becoming a Patreon subscriber for as little as $1 a month. This helps to provide higher quality content, more podcasts, and other goodies on this blog.
To view this content, you must be a member of Justin Wilson's Patreon at "Access to patro..." or higher tier
Already a Patreon member? Refresh to access this post.

BGP Confederations

In network routing, BGP confederation is a method to use Border Gateway Protocol (BGP) to subdivide a single autonomous system (AS) into multiple internal sub-AS’s, yet still advertise as a single AS to external peers. This is done to reduce the number of entries in the iBGP routing table.  If you are familiar with breaking OSPF domains up into areas, BGP confederations are not that much different, at least from a conceptual view.

And, much like OSPF areas, confederations were born when routers had less CPU and less ram than they do in today’s modern networks. MPLS has superseded the need for confederations in many cases. I have seen organizations, who have different policies and different admins break up their larger networks into confederations.  This allows each group to go their own directions with routing policies and such.

if you want to read the RFC:https://tools.ietf.org/html/rfc5065

The problem with peering from a logistics standpoint

Many ISPs run into this problem as part of their growing pains.  This scenario usually starts happening with their third or 4th peer.

Scenario.  ISP grows beyond the single connection they have.  This can be 10 meg, 100 meg, gig or whatever.  They start out looking for redundancy. The ISP brings in a second provider, usually at around the same bandwidth level.  This way the network has two pretty equal paths to go out.

A unique problem usually develops as the network grows to the point of peaking the capacity of both of these connections.  The ISP has to make a decision. Do they increase the capacity to just one provider? Most don’t have the budget to increase capacities to both providers. Now, if you increase one you are favoring one provider over another until the budget allows you to increase capacity on both. You are essentially in a state where you have to favor one provider in order to keep up capacity.  If you fail over to the smaller pipe things could be just as bad as being down.

This is where many ISPs learn the hard way that BGP is not load balancing. But what about padding, communities, local-pref, and all that jazz? We will get to that.  In the meantime, our ISP may have the opportunity to get to an Internet Exchange (IX) and offload things like streaming traffic.  Traffic returns to a little more balance because you essentially have a 3rd provider with the IX connection. But, they growing pains don’t stop there.

As ISP’s, especially WISPs, have more and more resources to deal with cutting down latency they start seeking out better-peered networks.  The next growing pain that becomes apparent is the networks with lots of high-end peers tend to charge more money.  For the ISP to buy bandwidth, they usually have to do it in smaller quantities from these types of providers. Buying this way introduces the probably of a mismatched pipe size again with a twist. The twist is the more, and better peers a network has the more traffic is going to want to travel to that peer. So, the more expensive peer, which you are probably buying less of, now wants to handle more of your traffic.

So, the network geeks will bring up things like padding, communities, local-pref, and all the tricks BGP has.  But, at the end of the day, BGP is not load balancing.  You can *influence* traffic, but BGP does not allow you to say “I want 100 megs of traffic here, and 500 megs here.”  Keep in mind BGP deals with traffic to and from IP blocks, not the traffic itself.

So, how does the ISP solve this? Knowing about your upstream peers is the first thing.  BGP looking glasses, peer reports such as those from Hurricane Electric, and general news help keep you on top of things.  Things such as new peering points, acquisitions, and new data centers can influence an ISPs traffic.  If your equipment supports things such as NetFlow, sflow, and other tools you can begin to build a picture of your traffic and what ASNs it is going to. This is your first major step. Get tools to know what ASNs the traffic is going to   You can then take this data, and look at how your own peers are connected with these ASNs.  You will start to see things like provider A is poorly peered with ASN 2906.

Once you know who your peers are and have a good feel on their peering then you can influence your traffic.  If you know you don’t want to send traffic destined for ASN 2906 in or out provider A you can then start to implement AS padding and all the tricks we mentioned before.  But, you need the greater picture before you can do that.

One last note. Peering is dynamic.  You have to keep on top of the ecosystem as a whole.

Transit, peer, upstream. What do they all mean?

As a service provider, you have a mountain of terms to deal with. As you dive into the realm of BGP, you will hear many terms in regards to peers.  Knowing their names AND your definition of them will serve you well.  I emphasized the and in the last sentence because many people have different definitions of what these terms means. This can be due to how long they have been dealing with networks, what they do with them, and other such things.  For example, many content providers use the term transit differently than an ISP.  So, let’s get on to it.

Transit or upstream
This is what you will hear most often.  A transit peer is someone who you go “through” in order to reach the internet.  You transit their network to reach other networks.  Many folks use the term “upstream provider” when talking about someone they buy their internet from.

Downstream
Someone who is “downstream” is someone  you are providing Internet to.  They are “transiting” your network to reach the Internet.  This is typically someone you are selling Internet to.

Peer
This is the term which probably needs the most clarification when communicating with others about how your BGP is setup.  A peer is most often used as a generic term, much like Soda (or pop depending on where you are from). For example someone could say:
“I have a peer setup with my upstream provider who is Cogent.” This is perfectly acceptable when used with the addition of “my upstream provider”.  Peers are often referred to as “neighbors” or “BGP neighbors”.

Local or Private Peer
So what is a local peer? A local peer is a network you are “peering” with and you are only exchanging routes which are their own or their downstream networks.  A local peer usually happens most often at an Internet Exchange (IX) but can happen in common points where networks meet. The most important thing that defines a local peer is you are not using them to reach IP space which is not being advertised form their ASN.   Your peering relationship is just between the two of you. This gets a little muddy when you are peering on an IX, but thats being picky.

I have trained myself to qualify what I mean by a peer when talking about them. I will often say a “transit peer” or a “local peer”. This helps to add a little bit of clarity to what you mean.

Why is this all important? For one, it helps with keeping everyone on the same page when talking about peering.  I had a case a few weeks ago where a Content provider and I wasted configuration time because our definition of transit was different.  Secondly, you want to be able to classify your peers so you can apply different filter rules to them. For example, with a downstream peer you only want to accept the IP space they have shown you which is their own.  That way you are not sending your own transit traffic over their network. This would be bad.  However, if you are accepting full routes from your transit provider, you want your filters to accept much more IP than a downstream provider. So if you have a team being able to be on the same page about peers will help when it comes to writing filters, and how your routers “treat” the peer in terms of access lists, route filters, etc.

BGP communities and traffic engineering

BGP communities can be powerful, but an almost mystical thing.  If you aren’t familiar with communities start here at Wikipedia.  For the purpose of part one of this article, we will talk about communities and how they can be utilized for traffic coming into your network. Part two of this article will talk about applying what you have classified to your peers.

So let’s jump into it.  Let’s start with XYZ ISP. They have the following BGP peers:

-Peer one is Typhoon Electric.  XYZ ISP buys an internet connection from Typhoon.
-Peer two is Basement3. XYZ ISP also buy an internet connection from Basement3
-Peer three is Mauler Automotive. XYZ ISP sells internet to Mauler Automotive.
-Peer four is HopOffACloud web hosting.  XYZ ISP and HopOffACloud are in the data center and have determined they exchange enough traffic amongst their ASN’s to justify a dedicated connection between them.
-Peer five is the local Internet exchange (IX) in the data center.

So now that we know who our peers are, we need to assign some communities and classify who goes in what community.  The Thing to keep in mind here, is communities are something you come up with. There are common numbers people use for communities, but there is no rule on what you have to number your communities as. So before we proceed we will need to also know what our own ASN is.  For XYZ we will say they were assigned AS64512. For those of you who are familiar with BGP, you will see this is a private ASN.  I just used this to lessen any confusion.  If you are following along at home replace 65412 with your own ASN.

So we will create four communities .

64512:100 = transit
64512:200 = peers
64512:300 = customers
64512:400 = my routes

Where did we create these? For now on paper.

So let’s break down each of these and how they apply to XYZ network. If you need some help with the terminology see this previous post.
64512:100 – Transit
Transit will apply to Typhoon Electric and Basement3.  These are companies you are buying internet transit from.

64512:200 – Peers
Peers apply to HopOffACloud and the IX. These are folks you are just exchanging your own and your customer’s routes with.

64512:300 – Customers
This applies to Mauler Automotive.  This is a customer buying Internet from you. They transit your network to get to the Internet.

64512:200 – Local
This applies to your own prefixes.  These are routes within your own network or this particular ASN.

Our next step is to take the incoming traffic and classify into one of these communities. Once we have it classified we can do stuff with it.

If we wanted to classify the Typhoon Electric traffic we would do the following in Mikrotik land:

/routing filter
add action=passthrough chain=TYPHOON-IN prefix=0.0.0.0/0 prefix-length=0-32 set-bgp-communities=64512:100 comment="Tag incoming prefixes with :100"

This would go at the top of your filter chain for the Typhoon Electric peer.  This simply applies 64512:100 to the prefixes learned from Typhoon.

In Cisco Land our configuration would look like this:

route-map Typhoon-in permit 20  
match ip address 102  
set community 64512:100

The above Cisco configuration creates a route map, matches a pre-existing access list named 102, and applies community 64512:100 to prefixes learned.

For Juniper you can add the following command to an incoming peer in policy-options:

set community Typhoon-in members 64512:100

Similar to the others you are applying this community to a policy.

So what have we done so far, we have taken the received prefixes from Typhoon Electric and applied community 64512:100 to it.  This simply puts a classifier on all traffic from that peer. We could modify the above example to classify traffic from our other peers based upon what community we want them tagged as.

In our next segment we will learn what we can do with these communities.

Are you ready for 768k day?

As the global routing table increases, routers use more and more memory to hold these tables in memory. Most routers use what is called TCAM memory to hold routing tables. TCAM memory is much faster than normal RAM, which makes it ideal for accessing large routing tables on a router.  However, TCAM is generally viewed as a more expensive type of memory.

According to the CIDR report, the global routing table as of April 15 2019 was 772,711 routes. But Justin, you are warning me about 768,000 routes.  This is more than that already.  The short answer is many providers attempt to do some sort of aggregation with prefixes, which shrinks this number.

So why is this number important? in 2014 a similar situation arose.  This was called “512k day”.  Many vendors released patches and advisories recommending folks to raise their limit to 768,000 routes.  Why not just raise this to a million you say? Remember that TCAM memory is expensive so it’s not like normal ram.  As more and more folks run IPV6, it takes away TCAM memory and allocates it to the IPV6 routing table. In the “old days” we just had to worry about one IPV4 table taking up the TCAM memory. Now we have an ever-growing IPV6 table, which takes up memory as well. 768,000 was recommended by many vendors as a decent tradeoff of memory utilization.

Many experts do not expect this 768k day to be as service impacting as 512k day was.  Firmware updates, newer hardware, and increased awareness are some factors more operators are aware of.  However, there is a bunch of older hardware out there. One of the biggest concerns is the TCAM memory in the Cisco 6500 and 7600 routing platforms. These platforms simply do not have more memory to allocate.

If you own a 6500/7600 platform and are taking in full routes there are a few things you can do to help mitigate this. Obviously upgrading hardware is a choice.  Not everyone can do that.  One of the methods of dealing with this is to receive a default route from your upstream providers in addition to the full routes.  If you do this you can filter out /24 routes and decrease the routing table your router has to keep track of.  Anything that is a /24, which won’t be in the routing table at this point, will be sent to the default route.  You won’t have as much control over your routes to these destinations, but at least your router won’t be puking on itself.

 

As more and more smaller ISPs buy /24 allocations on the secondary market, we will see this problem increase.  IPV4 is not going away. Smaller ISPs are buying blocks to service their growing customer base and can’t afford to buy large allocations all at once.  So now we are seeing ISPs end up with four or five /24’s that can not be aggregated down like they could in the past.