Posts Tagged ‘bgp’

AusNOG conference

Tuesday, September 1st, 2009

I was lucky enough to get a free pass to the Australian Network Operators Group conference from one of our upstream providers, so that’s what I’m up to at the start of this week. It is interesting to compare it to my experiences at the several LinuxConfAU conferences I’ve been to. On the whole I can say it is more Enterprisey, far less smelly, and a generally smaller but more focussed conference. Obviously network topics dominate the conference (although there are a number of presentations that border on other areas).

Somewhat confusingly for a sysadmin, they named this conference AusNOG03. They have decided to not use a year-based numbering system nor one that starts at 0 (which would please most of us), and as a kicker have locked themselves into a two-digit Y2K-style bug. Well, it’s only 3 years old, we’ll let that point slide.

Unhealthy snacks ahoy

Typically tasty and unhealthy snacks could be found upon entry – some delightful mini-croissants with ham and cheese. Coffee and tea staples were omnipresent. Apparently there was a large imbibing session last night and most delegates attended.

Conference room

It is being held at the Four Seasons Hotel in Sydney. I have to give them points for style, and functionality. Not only do we have actual stable desks for writing and computing, but there is a power board for every three seats.

Legacy writing equipment, water glass and mints

An array of useful items was at every seat. They clearly recognise that network operators lack social etiquette and have strewn mints far and wide: on the tables and in the conference bags.

To briefly summarise what I have taken in so far: the Internet is not yet blowing up; network operators and BGP are doing a good job of making the Internet as a whole better (it is going from a long, stringy network to a fat, wide one); and open-source content delivery networks are on the horizon and may become a reality some time soon.

Testing your connectivity

Thursday, May 21st, 2009

Recently I blogged about our new IPv4 address allocation. While we don’t need to start using it for a while as we have been conserving IP addresses quite well, and gave ourselves plenty of time before we actually need to use the new allocation, it is a good idea to check that it is accessible to the Internet at large.

Our new allocation is from the block 110.0.0.0/8 which was only allocated to the Asia-Pacific regional registry APNIC last November. Prior to it being allocated to APNIC, it would have been in a state affectionately known as “bogon” to network administrators. Bogons are network ranges that aren’t in use, and therefore can be safely ignored by all live networks on the Internet. There have been cases where spammers or other parties looking to conduct illegal activity on the Internet have attempted to use unallocated network ranges for various reasons, so most knowledgeable network administrators will block all bogon networks. There are several projects such as the Team Cymru Bogon Reference which put together lists of current bogons to aid network administrators in this task.
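
As a rough illustration of what a bogon filter decides, the sketch below checks whether an address falls inside any prefix from a list file. The function names and the list file are made up for this example; a real deployment would take its prefixes from a maintained source and apply them in router filters rather than in a shell script.

```shell
#!/bin/bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed (exit status 0) if address $1 falls inside CIDR prefix $2.
ip_in_cidr() {
    local ip net len mask
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    len=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    (( (ip & mask) == (net & mask) ))
}

# Succeed if address $1 matches any prefix in list file $2
# (hypothetical format: one CIDR prefix per line).
is_bogon() {
    local prefix
    while read -r prefix; do
        [ -n "$prefix" ] && ip_in_cidr "$1" "$prefix" && return 0
    done < "$2"
    return 1
}
```

For example, `ip_in_cidr 110.173.128.1 110.0.0.0/8` succeeds, which is exactly why a stale list that still contains 110.0.0.0/8 would drop traffic from a range like ours.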

The problem comes at the time of removing these bogons from the list. There are currently over 30000 active ASs (Autonomous Systems) on the IPv4 Internet, and effectively each of these must update their own bogon list (if they are not peering with an automatic service such as the one Team Cymru provides). Not all network administrators are up to date on the IANA allocations so this process can take months. We are lucky enough to have an allocation from a brand new APNIC range – others are not so lucky and often will have been allocated a range that was previously used by spammers.

Faced with this situation, we’ve decided to try to find out exactly how reachable our new allocation is. I consulted the members of the NANOG mailing list, and pondered their suggestions. I’ve documented below the success of the various methods they suggested and a method which we thought up and decided to try as well.

Do Nothing

One member of the mailing list said:

IMHO, if a network doesn't either update filters based on IANA
notifications or follow Cymru BOGON, then they don't deserve to receive
traffic from your network ;) 

The BOFH within me likes this response very much, but sadly I don’t think that response would be accepted by the boss… I’d also like to take more of an active role in determining our connectivity.

RIPE Debogon Prefix Reachability

http://www.ris.ripe.net/cgi-bin/debogon.cgi

The RIPE regional registry has this page which is effectively just a rudimentary looking glass allowing you to ping or traceroute to your new IP address space. Unfortunately I found dubious results when pinging from some of the routers listed, on all of our address ranges. I suspect not all routers are available or the script behind the page needs updating. If you only have a single address range it would be hard to figure out if results are correct.

RIPE also performs their own testing of de-bogoned address space and graphs the output of their reachability tests. This only really helps you if your allocation has come from RIPE though.

Looking Glasses

Similar to the above method, this involves advertising a small segment of IP space for testing and then using as many public looking glasses on the web as you can find to test connectivity. It is quite thorough, although very time consuming.

Notify network operator groups

One suggestion from the list was simply to post a message on NOG mailing lists and ask the participants to check their filters and optionally attempt to ping the address space in question. This requires participation from the network administrators on the lists, but my main reservation with this method is that if they are knowledgeable enough to subscribe to NANOG or a similar mailing list, they probably take care of their BGP filters anyway so this method probably won’t reveal too many misconfigurations.

Active testing from BGP data

I’m always interested in using BGP data for new and exciting things, so this was a good challenge. From our border routers we can assemble a list of endpoint ASs (as we’re not that interested in strictly transit ASs unless we spot a problem we can pin down to one of them) and pick at least one subnet advertised by each AS. Then we attempt to ping or in some other way communicate with one IP address on each of those subnets. We do this from a working IP on our existing IP allocation range.

We then take the results of that testing and perform the tests again from an IP address on our new allocation range. In theory if the range has been correctly debogoned we will see identical results (even if not all IPs are reachable). If there are discrepancies we can determine which ASs may have bogon-related issues and attempt to contact them.

Reusing most of the BGP dump manipulation script I wrote for my BGP Data Visualisation article, I was able to pull out a list of unique endpoint AS numbers and a subnet for each from our live BGP data within a few seconds. The wc utility tells me that there are 31056 unique AS numbers, which sounds about right based on the recent AS reports. Here is the Perl script I used to generate the list of ASs and a subnet for each. You simply pipe the output of the “show ip bgp” command from your router into it and it will print one AS and one target subnet per line:

#!/usr/bin/perl

my %aslist;

while (<STDIN>) {
    # Skip the first 5 lines of header data
    if ($. < 6) { next; }
    chomp;

    # Skip any blank lines
    if ( m/^$/ ) { next; }

    # Skip lines without an AS path or subnet
    if ( m/ 0 (i|e|\?)$/ ) { next; }
    if ( m/^... / ) { next; }
    unless (m/ 0 / ) { next; }

    # Skip the last summary line
    if ( m/^Total number of prefixes.*$/ ) { next; }

    # Grab the AS path and subnet
    s/^...([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}(\/[0-9]{1,2})?).* 0 (.*) (i|e|\?)$/$1 $3/;
    s/(\{|\})//g;
    s/,/ /g;

    # Turn the AS path string into an array
    my @path = split(' ', $_);

    # Quagga omits the prefix length when it equals the classful default,
    # so restore it from the first octet (class A /8, B /16, C /24)
    if ( $path[0] !~ m{/} ) {
        my ($first) = $path[0] =~ m/^([0-9]{1,3})\./;
        $path[0] .= ( $first < 128 ) ? '/8' : ( $first < 192 ) ? '/16' : '/24';
    }

    # Add last AS to our global list of ASs
    $aslist{$path[-1]}=$path[0];
}

while (($key, $value) = each(%aslist)){
    print "$key $value\n";
}

Since this was very much just a proof of concept I didn’t have much motivation to ensure absolute correctness or make the process as efficient as possible. Ideally I’d have the entire thing within the one script/program which intelligently pings multiple hosts in separate threads with some sort of limiting involved. Instead, I hacked up a couple of quick shell scripts; the first takes the list of ASs and subnets and passes them to the second script which is forked off for each pair. Forking off indiscriminately would lead to the process scheduler having a fit and the machine becoming unresponsive pretty quick so there is a quick check to make sure there aren’t more than 250 concurrent pings running before forking off another instance.

#!/bin/bash
cat AS | while read as subnet; do
        while [ `ps h -C ping | wc -l` -gt 250 ]; do
                sleep 60
        done
        /data/pingloop $as $subnet &
done

“AS” is the file with AS/subnet pairs.

#!/bin/bash
as=$1
subnet=$2
for ip in `/usr/bin/ipcalc $subnet 255.255.255.255 | grep Hostroute | awk '{print $2}'`; do
        ping -c 5 -i 0.2 -w 5 $ip >/dev/null 2>/dev/null
        if [ $? -eq 0 ]; then
                ping -I X.X.X.X -c 5 -i 0.2 -w 5 $ip >/dev/null 2>/dev/null
                if [ $? -eq 0 ]; then
                        echo "$as $ip reachable" >> output
                else
                        echo "$as $ip bogoned" >> output
                fi
                exit 0
        fi
done

In the above “pingloop” script, we hamfistedly generate a sequence of IPs on the target subnet and try each in turn until one answers from our existing range. We then immediately ping that same address from our new allocation (X.X.X.X stands in for a source IP on the new range) to see whether it is reachable from both.
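
If polling `ps` feels too crude, one alternative is to let `xargs` do the limiting. This is only a sketch and not what was actually run: it assumes an `xargs` that supports `-P` (GNU and BSD versions do), and `run_limited` is a name invented here.

```shell
#!/bin/bash
# Run a worker command once per "AS subnet" pair read from stdin,
# with at most $1 workers in flight at a time.
# -n 2 hands each worker two arguments (AS and subnet); -P caps parallelism.
run_limited() {
    local max_jobs=$1
    shift
    xargs -n 2 -P "$max_jobs" "$@"
}

# Hypothetical invocation, mirroring the scripts above:
#   run_limited 250 /data/pingloop < AS
```

Unlike the 60-second polling loop, this starts a new worker as soon as a slot frees up, so the run should spend less time idle.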

The results came back in about 10 hours, which isn’t bad for some fairly non-aggressive ICMP reachability testing of effectively the entire IPv4 Internet. Out of 25446 ASs we were able to reach initially, 1716 couldn’t be reached from our new address space which works out to be around 6.7%. Not terrible, but not great either. From here, we’ll look at the ASs that couldn’t be reached and see if there are any patterns that suggest common upstreams need to update their filters.
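
For anyone repeating the exercise, the tallying can be sketched with a little awk over the `output` file written by the pingloop script (lines of the form "AS IP reachable" or "AS IP bogoned"). The function names below are invented for this example.

```shell
#!/bin/bash
# Summarise a results file whose lines look like "<AS> <IP> reachable"
# or "<AS> <IP> bogoned" (the format written by the pingloop script).
summarise() {
    awk '
        $3 == "reachable" { ok++ }
        $3 == "bogoned"   { bad++ }
        END {
            total = ok + bad
            if (total == 0) { print "no results"; exit }
            printf "%d of %d ASs unreachable from the new range (%.1f%%)\n", bad, total, 100 * bad / total
        }
    ' "$1"
}

# List only the AS numbers that failed from the new range, for follow-up.
bogoned_ases() {
    awk '$3 == "bogoned" { print $1 }' "$1" | sort -n | uniq
}
```

With 1716 bogoned entries out of 25446, the same arithmetic gives the 6.7% quoted above.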

One disadvantage of this method is raising the ire of network administrators. The amount of ICMP traffic the scripts generate is pretty minimal, but some networks have overly sensitive network monitoring that will trigger if you perform a sequential ICMP “scan” of their network. Of course, it wasn’t performed with malicious intent, so really they have no cause to complain.

On DNS and GeoIP

While network-based bogon lists are the prime concern, you should also consider DNS resolver ACLs and GeoIP data. Many DNS administrators will maintain bogon lists in their configurations and these are probably updated even less frequently than BGP bogon lists. If you run into issues with nameservers on your new IP allocation range, you will know that someone out there hasn’t updated their BIND configuration. Similarly, a lot of web services utilise GeoIP to determine the location of a remote IP. By virtue of the allocation to APNIC, our new range is displayed as being in Australia, but it does not show a city or geographical coordinates. Sending an email to GeoIP with your details can rectify this problem.

New IPv4 allocation for Anchor

Wednesday, May 6th, 2009

Nobody is under any illusion that IPv6 will be close to 100% usage globally any time soon, so despite many entities having firm IPv6 plans or infrastructure already in place, demand for IPv4 is still strong. With that in mind, we’ve just acquired a new allocation from APNIC which will hopefully see us through until IPv6 is dominant on the Internet.

110.173.128.0/19

This allocation is from the 110/8 class A that was allocated to APNIC in November 2008, and represents a tripling of Anchor’s current IPv4 space. We’ll be following our current strict allocation policies to ensure it is the last additional IPv4 allocation we will need, and continuing with our current IPv6 plans as all responsible entities on the Internet should be doing.
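
To put a number on that, a /19 leaves 13 host bits, so it covers 2^13 = 8192 addresses. The small sketch below (function names made up for illustration) does the arithmetic in bash and prints the first and last address of a block.

```shell
#!/bin/bash
# Number of addresses covered by a prefix of the given length.
cidr_size() {
    echo $(( 1 << (32 - $1) ))
}

# Print the first and last address of a CIDR block such as 110.173.128.0/19.
cidr_range() {
    local a b c d len net mask first last
    IFS=./ read -r a b c d len <<< "$1"
    net=$(( (a << 24) | (b << 16) | (c << 8) | d ))
    mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    first=$(( net & mask ))
    last=$(( first | (~mask & 0xFFFFFFFF) ))
    printf '%d.%d.%d.%d - %d.%d.%d.%d\n' \
        $(( first >> 24 )) $(( (first >> 16) & 255 )) $(( (first >> 8) & 255 )) $(( first & 255 )) \
        $(( last >> 24 ))  $(( (last >> 16) & 255 ))  $(( (last >> 8) & 255 ))  $(( last & 255 ))
}
```

Here `cidr_size 19` prints 8192 and `cidr_range 110.173.128.0/19` prints `110.173.128.0 - 110.173.159.255`.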

BGP Data Visualisation

Thursday, March 19th, 2009

If you are among the upper echelon of network administrators who happen to have BGP administration within their scope of duties you probably have access to a lot of interesting, albeit quite verbose, information about the Internet at large. Generally, any network with a BGP configuration accepting a full feed from their upstream will have data on just about every entity connected to the Internet. How much you decide to use that information is up to you.

BGP information from your upstream generally has the following pieces of data within it:

  • a network prefix and length
  • the next hop for the prefix
  • path of AS numbers through which the advertisement has passed (subject to some manipulation)
  • information on the originating routing protocol
  • community settings
  • various other BGP settings

After making a decision based on all of these factors, the border router will insert a route into the routing table to reach the advertised network and after this point essentially the data is unused (aside from optionally being passed on to other border routers). There are so many more possibilities for this data however – you may use it for diagnosis of network issues or you may want to use it to visualise your BGP router’s view of the Internet (which is far more interesting).

Here we will use the Quagga routing suite to provide the BGP data. You are free to use Cisco or other proprietary equipment, but I find that a server running Quagga allows a lot more flexibility, especially when it comes to getting data out of the BGP system.

BGP table version is 0, local router ID is 202.4.236.8
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
* i3.0.0.0          202.4.236.9                    90      0 4826 703 2914 9304 80 i
*>                  114.31.193.74                  90      0 4826 703 2914 9304 80 i
*                   203.134.70.37                  10      0 9443 2914 9304 80 i
*> 4.0.0.0          114.31.193.74                  90      0 4826 3356 i
* i                 202.4.236.9                    90      0 4826 3356 i
*                   203.134.70.37                  10      0 9443 11867 7018 3356 i
*> 4.0.0.0/9        114.31.193.74                  90      0 4826 3356 i
* i                 202.4.236.9                    90      0 4826 3356 i
*                   203.134.70.37                  10      0 9443 11867 7018 3356 i

...

Total number of prefixes 275034

The above is a small snippet of BGP data, straight from the proverbial horse’s mouth (the BGP router). Immediately we can see that there is a lot of information for us to use: roughly 275,000 unique network prefixes with associated paths through various entities identified by their AS numbers. With this path information we can build a visualisation of the entire Internet. It must be stressed though that this “view” of the Internet is only what our network sees, and could be vastly different if generated from a different network. Due to the decentralised nature of the Internet, there is not one categorically “authoritative” view of it (even if you took the BGP data from a very well-connected network), but that doesn’t mean that our view is not useful!

Making the Data Usable

I started out with a copy of the full BGP feed from one of our border routers. Using Quagga’s BGPD you can output the entire feed (post-filtering of course) into a file by using the command-line `vtysh` tool:

# vtysh -c 'show ip bgp' > /data/bgp.txt

You will end up with the entire feed in Quagga output format in the file `/data/bgp.txt`, unfortunately not in a well-formatted data structure but in a format we can work with (the format shown in the excerpt above).

From here, we need to pass the file through a little bit of manipulation so that our graphing backend of choice can use it. I hacked up a very quick Perl script which takes the output from “show ip bgp” and attempts to break it down into unique paths between ASs. It strips out unnecessary headers and other text, then goes through each AS path and adds direct links between ASs to a hash table (so we can automatically remove doubled-up entries). It spits out the list of paths in a fairly Graphviz-centric format but can be easily adjusted to fit the requirements of most other graphing engines.

#!/usr/bin/perl

#use strict;
my %aslist;
my %asnodes;
my $numpaths = 0;

while (<STDIN>) {
	# Skip the first 5 lines of header data
	if ($. < 6) { next; }
	chomp;

	# Skip any blank lines
	if ( m/^$/ ) { next; }

	# Skip lines without an AS path
	if ( m/ 0 (i|e|\?)$/ ) { next; }
	unless (m/ 0 / ) { next; }

	# Skip the last summary line
	if ( m/^Total number of prefixes.*$/ ) { next; }

	# Grab just the AS path bit
	s/^.* 0 (.*) (i|e|\?)$/$1/;
	s/(\{|\})//g;
	s/,/ /g;

	# Turn the AS path string into an array
	my @path = split(' ', $_);
	$numpaths++;

	# Grab the path between each pair of nodes in the array
	$current = pop(@path);
	while ( $next = pop(@path) ) {
		# Don't include AS path prepends
		if ( $current == $next ) { next; }

		# Add both ASs to our global list of ASs
		$asnodes{$_}=1 foreach "$current";
		$asnodes{$_}=1 foreach "$next";

		# Add the path between ASs to the global hash, so we have no duplicates.
		if ( scalar($current) < scalar($next) ) { $aslist{$_}=1 foreach "$current:$next"; }
		else { $aslist{$_}=1 foreach "$next:$current"; }
		$current = $next;
	}
}

while (($key, $value) = each(%aslist)){
	$key =~ s/:/ /;
	print "$key\n";
}
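
The script prints one "AS1 AS2" pair per line. One way to turn those pairs into a complete Graphviz DOT file is sketched below; the function name and the graph name `bgp` are arbitrary choices for this example, not part of the original workflow.

```shell
#!/bin/bash
# Wrap "AS1 AS2" edge lines (one pair per line on stdin) in an
# undirected Graphviz graph. The graph name "bgp" is arbitrary.
edges_to_dot() {
    awk '
        BEGIN   { print "graph bgp {" }
        NF == 2 { printf "    \"AS%s\" -- \"AS%s\";\n", $1, $2 }
        END     { print "}" }
    '
}
```

Piping the Perl script's output through `edges_to_dot` into a file gives something a layout tool such as `neato` can read, although it is probably wise to try it on a small slice of the feed first.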

Graphing Engines

This blog post was originally going to be a full-fledged wiki article, but while I originally thought it was a nifty idea I could knock over in a day or so, it turns out that graphing problems can be really, really hard. Who would have thought? So I spent a couple of days on this but didn’t end up getting the pretty yet functional graphs that I had hoped to get. I also stupidly neglected to take screenshots, although with the graphing engines already churning away on my little workstation thanks to the complexity of the data, taking them would only have added insult to injury.

But all that is by the by. If this blog posting has piqued your interest in BGP data graphing at all, you’ll hopefully find my summary of a few of the better graphing engines below useful in some way. None of them suited my requirements perfectly, but at the very least it is a starting point for your own work.

  • Graphviz
    • very popular and flexible graphing library.
    • with the number of nodes and paths in this graph, it consumed too much memory and processing time to be effective
  • Large Graph Layout (LGL)
    • very good at handling large graphs, not picky about directed/undirected and has a very simple input format
    • uses a separate Java frontend for 2D visualisation after building its meta-data files, and produces VRML output for 3D visualisation (you must provide your own VRML frontend)
    • I found the 2D visualisation to be satisfactory but not very useful for this type of data. I haven’t had much success with VRML viewers on a 3D graph of this size.
  • Walrus
    • written entirely in Java; handles parsing as well as visualisation
    • requires the Java3D library
    • only accepts directed graphs and has a fairly strict input syntax
  • Nodes3D
    • takes relatively simple Lua files as input
    • quite flexible, and uses standard OpenGL libraries to perform the graphing
    • sadly has a hard-coded limit of a maximum of 2000 nodes, and doesn’t handle more nodes efficiently (with respect to memory allocation) if you alter the limit and recompile
  • aiSee
    • A commercial program that seems to be quite well-rounded and professional-looking
    • Sadly only produces 2D visualisations, with 3D “imitation” with a fish-eye lens effect.
    • It was able to handle my large graph well (not blowing out memory usage) but the resulting visualisation in force-directed mode was not sufficient
  • Tulip
    • Relies heavily on Qt 4. If you are compiling from source, grab a cup of coffee while it completes.
    • Has many visualisation possibilities, and can deal with up to one million elements
    • The 3D visualisations aren’t really suitable for this type of data.
  • Lanet-Vi
    • Calculations and visualisation rendering are taken care of for you
    • Has probably the most lenient restrictions on input format
    • An easy option if you don’t want to spend days/weeks/months researching graphing, but would like something quickly
    • source code for local calculation is also available

Other Resources

If you are interested in graphs, the following will probably be interesting to you:
