INEX’s Shiny New Route Servers

A copy of an article I wrote on INEX’s own blog, kept here for longevity – originally published here on April 10 2019.


In this article, we talk about the new route servers that we deployed across all three peering platforms at INEX during February 2019 and, in particular, their RPKI support.

Most route server instances at internet exchanges (IXPs) perform prefix filtering based on route/route6 objects published in internet routing registry databases (IRRDBs). INEX members would be used to creating these through RIPE’s database. However, there are many other registries, and the data quality of some of these IRRDB objects is often poor, with problems relating to missing, stale and incorrectly duplicated information.

A typical IRRDB entry would resemble the following:

route:          192.0.2.0/24
descr:          Example IPv4 route object
origin:         AS65500
created:        2004-12-06T11:43:57Z
last-modified:  2016-11-16T22:19:51Z
source:         SOME-IRRDB

RPKI

RPKI is a public key infrastructure framework designed to secure the internet’s routing infrastructure in a way that replaces IRRs with a database where trust is assigned by the resource holder. The equivalent of a route object in RPKI is called a ROA (Route Origin Authorisation). It is a cryptographically secure triplet of information which represents a route, the AS that originates it and the maximum prefix length that may be advertised. An example of an IPv4 and an IPv6 ROA would be:

( Origin AS,  Prefix,         Max Length )
( AS65500,    2001:db8::/32,  /48        )
( AS65501,    192.0.2.0/24,   /24        )

ROAs are typically created through your own RIR (so, RIPE for most INEX members). These RIRs are called trust anchors in RPKI. RIPE have created an extremely easy-to-use wizard for creating ROAs through the LIR Portal.

To implement RPKI in a router, the router needs to build and maintain a table of verified ROAs from the five RIRs/trust anchors. The easiest way of doing this is to use a local cache server which pulls and validates the ROAs from the trust anchors and uses a new protocol called RPKI-RTR to feed that information to routers. There are currently three validators: RIPE’s RPKI Validator v3; Routinator 3000 from NLnetLabs; and Cloudflare’s GoRTR. INEX uses the first two.
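To make this concrete, the following is roughly what an RPKI-RTR session to a local validator looks like in a Bird v2 configuration (INEX’s route servers run Bird – see below). This is a sketch only: the validator hostname, port and timers are placeholders rather than INEX’s actual settings.

# ROA tables populated over RPKI-RTR from a local validating cache.
# "rpki-validator.example.com" and port 3323 are placeholder values.
roa4 table r4;
roa6 table r6;

protocol rpki {
    roa4 { table r4; };
    roa6 { table r6; };
    remote "rpki-validator.example.com" port 3323;
    retry keep 90;
    refresh keep 900;
    expire keep 172800;
}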

RPKI validation of a route against the table of ROAs yields one of three possible results:

  • VALID: a ROA exists for the route and both the prefix length is within the allowed range and the origin ASN matches.
  • INVALID: a ROA exists for the route but either (or both) the prefix length is outside the allowed range and/or the origin ASN is different.
  • UNKNOWN: no ROA exists for the route.

UNKNOWN is a common response, as the RPKI database has only a fraction of the prefix coverage that the IRR databases do. We are now in a multi-year transition from IRR to RPKI route validation while ROAs are created.

Bird v2

As well as adding RPKI support, we have also upgraded all route servers to Bird v2.

This is a significant rewrite of Bird: v1 maintained separate code bases and daemons for IPv4 and IPv6, while Bird v2 merges these and also introduces support for new SAFIs such as L3VPNs / MPLS.

Overall, the configuration changes required were minimal and INEX continues to run separate Bird v2 daemons for IPv4 and IPv6. Route servers are CPU intensive, and separate daemons allow for maximum stability, keep the configuration clean and fit into the existing deployment processes we have built up with IXP Manager.

Route Server Filtering Flow

Our work on the new route servers will be released to the community as part of IXP Manager v5 shortly. The new filtering flow is enumerated below. One of the key new features is that if a route fails any step, we tag it with internal large communities recording the specific reason, which is then shown to our members through the IXP Manager looking glass (more on that later).

  1. Filter small prefixes (>/24 for IPv4, >/48 for IPv6).
  2. Filter martian / bogon ranges.
  3. Sanity check to ensure the AS path has at least one ASN and no more than 64.
  4. Sanity check to ensure the peer ASN is the same as first ASN in the prefix’s AS path.
  5. Prevent next-hop hijacking (where a member advertises a route but sets the next hop to another member’s router rather than their own). We do allow members with multiple routers on the same AS to specify their other router(s).
  6. Filter known transit networks.
  7. Ensure that the origin AS is in the set of ASNs expanded from the member’s AS-SET. See below for some additional detail on this.
  8. RPKI validation. If it is RPKI VALID, accept the route. If it is RPKI INVALID then filter it.
  9. If the route is RPKI UNKNOWN, revert to standard IRRDB filtering.

Regarding step 7 above, an AS-SET is another type of IRRDB database entry where a network which also acts as a transit provider for other networks can enumerate the AS numbers of those downstream networks. This is something RPKI does not yet support but it is being worked on – see AS-Cones.
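To make steps 8 and 9 (along with a couple of the earlier sanity checks) more concrete, below is a much-simplified sketch of how they can be expressed in the Bird v2 filter language. This is an illustration only, not INEX’s production filter: r4 is assumed to be a roa4 table fed by an rpki protocol, peer_asn is a placeholder for the peer’s ASN, and the IRRDB fallback is only indicated in a comment.

# Simplified IPv4 import filter sketch – not INEX's production configuration.
define peer_asn = 64496;    # placeholder for the peer's ASN

filter rs_import_v4
{
    # step 1: reject anything more specific than a /24
    if (net.len > 24) then reject;

    # step 3: the AS path must contain between 1 and 64 ASNs
    if (bgp_path.len < 1 || bgp_path.len > 64) then reject;

    # step 4: the first ASN in the path must be the peer's ASN
    if (bgp_path.first != peer_asn) then reject;

    # steps 8 and 9: RPKI validation against the ROA table
    if (roa_check(r4, net, bgp_path.last) = ROA_VALID) then accept;
    if (roa_check(r4, net, bgp_path.last) = ROA_INVALID) then reject;

    # ROA_UNKNOWN: fall back to IRRDB prefix and origin ASN checks here
    # (not shown); this sketch simply rejects anything left over.
    reject;
}

In the configuration IXP Manager actually generates, a route failing one of these checks is also tagged with the internal large communities described above, so that the specific reason can be shown in the looking glass.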

Lastly, we have enhanced the BGP large community support to allow our members to request AS path prepends on announcements to specific members over the route servers. For these and other supported communities, see the INEX route server page here.
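As an illustration of how a member might use this, the snippet below shows a hypothetical export filter on a member’s own router that adds such a large community before announcing a prefix to the route servers. The scheme shown (route-server ASN : 101 : target ASN, meaning “prepend my AS once on announcements to that member”) follows the IXP Manager convention, but the ASNs and prefix are placeholders – the authoritative community values are on the INEX route server page.

# Hypothetical member-side export filter towards the route servers.
# AS 65500 stands in for the route server ASN and AS 64496 for the
# target member's ASN; 192.0.2.0/24 is a documentation prefix.
filter export_to_route_servers
{
    if (net ~ [ 192.0.2.0/24 ]) then {
        # ask the route servers to prepend our AS once when
        # announcing this prefix to AS 64496
        bgp_large_community.add((65500, 101, 64496));
    }
    accept;
}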

Bird’s Eye and the Looking Glass

As well as IXP Manager, INEX has also written and open-sourced a secure microservice for querying Bird daemons called Bird’s Eye. IXP Manager uses this to provide a web-based looking glass into our route collectors and servers. We have recently released v1.2.1 of Bird’s Eye which adds support for Bird v2.

We have greatly enhanced IXP Manager’s looking glass to support both Bird v2 and the large communities we use to tag the reasons routes are filtered. You can explore any of INEX’s route servers to see this yourself – for example, this is route server #1 for IPv4 on INEX LAN1. When members log into IXP Manager they will also find a new Filtered Prefixes tool which will summarise any filtered routes across all 12 of INEX’s route server instances.

More Information

We have spoken about this at a number of conferences recently.

Bird / Quagga with MD5 Support for IPv4/6 on FreeBSD & Linux

Over at INEX we run a route server cluster which alleviates the burden of setting up bilateral peering sessions for the more than 80% of members that use it. The current hardware is now about six years old and we have a forklift upgrade in the works.

BGP allows for MD5 authentication between peers (using the TCP MD5 signature option; see RFC 2385) and – while recently obsoleted by RFC 5925 – it is still widely used on shared LAN media such as IXPs, primarily to prevent packet spoofing and session hijacking via recycled IP addresses.

Our current route server implementation runs on FreeBSD, which does not support TCP MD5 in its stock kernel (you are required to compile a custom kernel – see below for details). Additionally, specifying the session MD5 is not done in the BGP daemon configuration but separately in the IPsec configuration. Lastly, our current FreeBSD version has no support for TCP MD5 over IPv6. These have all led to unnecessarily complex configurations and a degree of confusion.

Because of this, we decided to test up-to-date Linux and FreeBSD versions for native IPv4 and IPv6 TCP MD5 support with Bird and Quagga (our route server daemons of choice).

In each case, BGP sessions were tested for:

  • no MD5 on each end (expected to work);
  • same MD5 on each end (expected to work);
  • different MD5 on each end (expected not to work); and
  • MD5 on one end with no MD5 on the other end (expected not to work).

For Linux, the platform chosen was Ubuntu 12.04 LTS with the stock 3.2.0-40-generic kernel.

  • Sessions were tested for Quagga to Quagga and Quagga to Bird;
  • Sessions were tested over both IPv4 and IPv6;
  • The presence of valid MD5 signatures was confirmed using tcpdump -M xxx;
  • Stock Quagga and Bird from the 12.04 apt repositories were used.

The results – everything worked and worked as expected:

  • BGP sessions only established when expected (no MD5 configured, same MD5 configured);
  • This held for both IPv4 and IPv6.

Summary: Linux will support TCP MD5 natively for IPv4 and IPv6 when using Quagga or Bird.
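For reference, this is roughly what one of the Linux test sessions looked like in Bird configuration (Bird 1.x, as shipped in the 12.04 repositories). The addresses, AS numbers and secret are placeholders; on Linux, the password option alone is enough to enable the TCP MD5 signature option.

# Sketch of a Bird 1.x BGP session using TCP MD5 on Linux.
# Addresses, ASNs and the shared secret are placeholder values.
protocol bgp peer_example {
    local as 65500;
    neighbor 192.0.2.2 as 64496;
    password "supersecret1";   # enables the RFC 2385 TCP MD5 signature option
    import all;
    export all;
}

The equivalent in Quagga is a per-neighbor statement of the form neighbor 192.0.2.2 password supersecret1.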

For FreeBSD, we used the latest production release of 9.1. TCP MD5 support is not compiled in by default so a custom kernel must be built with the additional options of:

options   TCP_SIGNATURE
options   IPSEC
device    crypto
device    cryptodev

In addition to this, the MD5 shared secrets need to be added to the IPsec security association database via the setkey utility or, preferably, via the /etc/ipsec.conf file which, for example, would contain entries for IPv4 and IPv6 addresses such as:

add 192.0.2.1 192.0.2.2 tcp 0x1000 -A tcp-md5 "supersecret1";
add 2001:db8::1 2001:db8::2 tcp 0x1000 -A tcp-md5 "supersecret2";

where the addresses ending in .1/:1 are local and .2/:2 are the BGP neighbor addresses. This file can be processed by setting ipsec_enable="YES" in /etc/rc.conf and executing /etc/rc.d/ipsec reload.

  • Sessions were tested for Quagga/Linux to Quagga/FreeBSD and from Quagga/Linux to Bird/FreeBSD;
  • Sessions were tested over both IPv4 and IPv6;
  • The presence of valid MD5 signatures was confirmed using tcpdump -M xxx;
  • Stock Quagga from the 12.04 apt repositories and stock Quagga and Bird from FreeBSD ports were used.

The results – almost everything worked and worked as expected:

  • BGP sessions only established when expected (no MD5 configured, same MD5 configured);
  • This held for both IPv4 and IPv6;
  • One odd but expected behavior – you only need to set the MD5 secret via setkey / ipsec.conf; setting it (or not) in the Quagga or Bird configuration has no effect so long as it is set via setkey (though it is useful for documentation purposes). However, trying to set it in Quagga without having rebuilt the kernel will result in an error.

Summary: FreeBSD will support TCP MD5 via a custom kernel and setkey / ipsec.conf for IPv4 and IPv6. Note that there is additional complexity when changing or removing MD5 passwords, as these need to be amended / deleted via setkey, which can put an extra burden on automatic route server configuration generators.