A tool to find out if BGP is lying to you

APNIC has a blog article on detecting “BGP lies”.

https://blog.apnic.net/2021/05/24/a-tool-to-detect-bgp-lies/
Do you ever wonder whether you can really trust other networks, such as your provider(s) and peers? More precisely, wouldn’t you like to be able to tell if the traffic you send always flows through the paths received in the Border Gateway Protocol (BGP)? Could it be that, for some prefixes, the forwarding path might differ?

Common ISP outage causes

Over the years I have been able to narrow down the most common reasons a service provider goes down or has an outage. This is by no means an exhaustive list. Let’s jump in.

Layer 1 outages
Physical layer outages are the easiest to check and are where you should always start. If you have had any kind of formal training you have run across the OSI model. Fiber cuts, equipment failure, and power are all physical layer issues. I have seen too many engineers spend time looking at configs when they should first see if the port is up or the device is on.

DNS related
DNS is what makes the transition from the human world to the machine world (cue Matrix movie music). Without DNS we would not be able to translate www.j2sw.com into an IP address the web servers and routers understand. DNS resolution problems are what you are checking when you ping a hostname and look for output like:

PING j2sw.com (199.168.131.29): 56 data bytes
64 bytes from 199.168.131.29: icmp_seq=0 ttl=52 time=33.243 ms
64 bytes from 199.168.131.29: icmp_seq=1 ttl=52 time=32.445 ms
--- j2sw.com ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 32.445/32.844/33.243/0.399 ms
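
A quick way to separate a DNS problem from a reachability problem is to resolve the name on its own before testing anything else. Here is a minimal Python sketch using the standard library resolver. The function name resolve_name and its lookup parameter are made up for illustration; lookup only exists so the function can be tried without network access.

```python
import socket


def resolve_name(hostname, lookup=socket.getaddrinfo):
    """Return the sorted unique IP addresses for hostname, or [] on failure.

    `lookup` defaults to the system resolver. It is a parameter only so the
    function can be exercised with a stand-in resolver (an assumption made
    for this sketch, not something the original post describes).
    """
    try:
        results = lookup(hostname, None)
    except socket.gaierror:
        # The name did not resolve: this points at DNS, not at the path.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({item[4][0] for item in results})
```

Calling resolve_name("j2sw.com") on a machine with working DNS should return one or more addresses. An empty list means the name never resolved, so there is no point in looking at routing yet.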

Software bugs
Software bugs are typically reproducible. Figuring out how to reproduce them is the challenge. Sometimes a memory leak only shows up on a certain day. Sometimes five different criteria have to be met for the bug to happen.

Version mismatches
When two or more routers talk to each other they talk best when they are on the same software version. A later version may fix an earlier bug. Code may change enough between versions that certain calls and processes speak slightly differently. This can cause incompatibilities between software versions.

Human mistakes
“Fat fingering” is what we typically call this. A 3 was typed instead of a 2. This is why good version control and backups with diffs are a good thing. Physical mistakes count too, such as cables getting bumped because they were not secured properly.
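
To show why comparing backups helps, here is a small sketch that uses Python's difflib to compare two saved copies of a config. The config lines are invented for illustration and are not from any particular vendor.

```python
import difflib


def config_diff(old_text, new_text):
    """Return only the removed ("- ") and added ("+ ") lines between two backups."""
    lines = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff also emits unchanged lines ("  ") and hint lines ("? "); drop those.
    return [line for line in lines if line.startswith(("- ", "+ "))]


# Hypothetical before/after backups: a 3 was typed instead of a 2.
before = "interface ether1\nip address 192.0.2.2/24\n"
after = "interface ether1\nip address 192.0.2.3/24\n"

for line in config_diff(before, after):
    print(line)
```

The output is the removed line followed by the added line, which differ only in the mistyped digit. That is much faster to spot than reading the whole config.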

What can we do to mitigate these issues?
1. Have good documentation. Know what is plugged in where, what it looks like, and as much detail as possible. You want your documentation to stand on its own. A person should be able to pick it up and follow it without calling someone.
2. Proactive monitoring. Knowing about problems before customers call is a huge deal. Also, being able to identify trends over time is a good way to troubleshoot issues. Monitoring systems also allow you to narrow down the problem right away.
3. When it comes to networking, know the OSI model, start from the bottom, and work your way up.
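
The bottom-up approach in item 3 can be sketched as an ordered list of checks that stops at the first failure. The check names and results below are placeholders, not real probes. In practice each one would wrap something like a port-status query or a ping.

```python
def first_failing_layer(checks):
    """Run (layer_name, check_function) pairs in order, lowest layer first.

    Returns the name of the first layer whose check returns False,
    or None if every check passes.
    """
    for layer_name, check in checks:
        if not check():
            return layer_name
    return None


# Placeholder checks ordered from the physical layer upward.
example_checks = [
    ("Layer 1: link up and device powered", lambda: True),
    ("Layer 2: MAC address learned on the port", lambda: True),
    ("Layer 3: gateway answers ping", lambda: False),
    ("Layer 7: DNS name resolves", lambda: True),
]

print(first_failing_layer(example_checks))
```

Because the loop stops at the first failure, nobody spends time on DNS or configs while a lower layer is still down.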

Books can be, and are, written about troubleshooting. These are just a few of the common things I have seen.

New Speed Test server for Patreons

This content is for Patreon subscribers of the j2 blog. Please consider becoming a Patreon subscriber for as little as $1 a month. This helps to provide higher quality content, more podcasts, and other goodies on this blog.

The problem with speedtests

Imagine this scenario. Outside your house, the most awesome superhighway has been built. It has a speed limit of 120 miles per hour. You calculate that at those speeds you can get to and from work 20 minutes earlier. Life is good. Monday morning comes, you hop in your 600 horsepower Nissan GT-R, put on some new leather driving gloves, and crank up some good driving music. You pull onto the dedicated on-ramp from your house and are quickly cruising at 120 miles an hour. You make it into work before most anyone else. Life is good.

Near the end of the week, you notice more and more of your neighbors and co-workers using this new highway. Things are still fast, but you can’t get up to speed like you could earlier in the week. As you ponder why, you notice you are coming up on the off-ramp to your work. Traffic is backed up. Everyone is trying to get to the same place. As you are waiting in the line to get off the superhighway, you notice folks passing you by, going on down the road at high rates of speed. You surmise your off-ramp must be congested because it is getting used more now.

Speedtest servers work the same way. A speedtest server is a destination on the information super-highway. Man, there is an oldie term. To understand how these servers work we need a quick understanding of how the Internet works. The Internet is basically a bunch of virtual cities connected together. Your local ISP delivers a signal to you via wireless, fiber, or some other medium. When your traffic leaves your house it travels to the ISP’s equipment, is aggregated with your neighbors’ traffic, and is sent over faster lines to larger cities. It’s just like a road system. You may get access via a gravel road, which turns into a 2 lane blacktop, which then may turn into a 4 lane highway, and finally a super-highway. The roads you take depend on where you are going. Your ISP may not have much control over how the traffic flows once it leaves their network.

Bottlenecks can happen anywhere. They can come from fiber optic cuts, oversold capacity, routing issues, or plain old unexpected usage. Why are these important? All of these can affect your results and can be totally out of the control of both you and your ISP. They can also be totally your ISP’s fault.

They can also be your fault, just like your car can be. An underpowered router can struggle to keep up with your connection. Much like a moped on the above super-highway can’t keep up with a 600 horsepower car, your router might not be able to keep up either. Other things can cause issues, such as computer viruses and low-performing components.

Just about any network can host a speedtest.net node or a node for one of the other speedtest sites. These networks have to meet minimum requirements, but there is no indicator of how utilized these servers are. A network could put one up, and it could be 100 percent utilized when you run a test. This doesn’t mean your ISP is slow, just that the off-ramp to that server is slow.

The final thing we want to talk about is the utilization of your internet pipe from your ISP. This is something most don’t take into consideration. Let’s go back to our on-ramp analogy. Your ISP is selling you a connection to the information super-highway. Say they are selling you a 10 meg download connection. If you have a device in your house streaming an HD Netflix stream, which is typically 5 megs or so, that means you only have 5 megs available for a capacity test while that HD stream is happening. Speedtest only tests your currently available capacity. Many folks think a speedtest somehow stops all the traffic on your network, runs the test, and starts the traffic again. It doesn’t work that way. Your available capacity is only tested at that point in time. The same is true for any point between you and the speedtest server. Remember our earlier analogy about slowing down when you got to work because there were so many people trying to get there. They exceeded the capacity of that destination. However, that does not mean your connection is necessarily slow, because people were zooming past you on their way to less congested destinations.
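
The arithmetic above is simple enough to sketch in a few lines of Python. The function names are made up for illustration, and the model is deliberately crude: it treats every stream as a fixed rate and ignores protocol overhead.

```python
def available_capacity_mbps(plan_mbps, active_streams_mbps):
    """Capacity left for a speed test while other traffic is running.

    Never returns less than zero: once the link is full, nothing is left.
    """
    in_use = sum(active_streams_mbps)
    return max(plan_mbps - in_use, 0)


def end_to_end_capacity_mbps(available_per_hop):
    """A test can go no faster than the most congested hop on the path."""
    return min(available_per_hop)


# The example from the text: a 10 meg plan with one 5 meg HD stream running.
home_link = available_capacity_mbps(10, [5])
print(home_link)  # 5

# Hypothetical path: your home link, a fast backbone, a busy speedtest server.
print(end_to_end_capacity_mbps([home_link, 1000, 2]))  # 2
```

The second number is the off-ramp problem: the result reflects the busiest point on the way to that one server, not the speed of your connection.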

This is why results from a single server should be taken with a grain of salt. They are a useful tool, but not an absolute. The speedtest server is just a destination. That destination can have bottlenecks that others don’t. Even after this long article, there are many other factors which can affect Internet speed. Things we didn’t touch on, like peering, the technology used, and speed limits, can also affect your internet speed to destinations.

J2 briefing: Zayo acquisition, more wifi6, NTIA changes

Hello, welcome to Monday, May 13th, 2019. My name is Justin and this is the ISP news you need to know.

Zayo to be Acquired
https://investors.zayo.com/news-and-events/press-releases/press-release-details/2019/Zayo-Announces-Definitive-Agreement-to-be-Acquired-by-Digital-Colony-and-EQT/default.aspx
Zayo is to be acquired by Digital Colony Partners and EQT Infrastructure fund for $14.3 billion in cash. This continues the trend of telecom companies being bought by equity funds. At least, that is how it appears.

CNET article on WIFI6
https://www.cnet.com/news/wi-fi-6-and-what-it-means-for-you/#ftag=CAD590a51e
A few weeks ago I did an article on WIFI6. CNET has a broader overview for those of you looking to learn more.

5G & Spectrum news
https://sanfrancisco.cbslocal.com/2019/05/10/at-competitors-tech-industry/
AT&T continues their wannabe 5G rollout.

https://www.theregister.co.uk/2019/05/09/ntia_boss_quits/
Two key positions are now vacant at the NTIA. Could this delay CBRS even more?

Indiana ISP meeting is this Thursday, May 16th, in Indianapolis
https://www.eventbrite.com/e/2019-indiana-isp-meeting-tickets-60311319781