Build Scalable Layer 2 Overlays: EVPN/VXLAN with FRRouting and Open vSwitch on Linux

The Breaking Point: Stretching Layer 2 Across Data Centers

Infrastructure teams often hit a brick wall when scaling virtualization clusters beyond 10 or 15 nodes. You have servers in different racks—or even separate data centers—and you need them to share a flat network. The knee-jerk reaction is to “stretch” the VLAN. This works for a small laboratory, but it’s a ticking time bomb for production.

As the cluster grows, the network becomes fragile. A single broadcast storm can paralyze your entire infrastructure in seconds. Managing Spanning Tree Protocol (STP) priorities across 20 switches is a manual nightmare. Worse, STP usually blocks half your links to prevent loops. You end up paying for four 100Gbps uplinks but only using 200Gbps of capacity while the rest sits idle. The network should support your applications, not dictate how you deploy them.

Why Traditional VLANs Fail at Scale

Modern demands have outgrown the 802.1Q standard. Traditional Layer 2 networking relies on “flood-and-learn.” If a switch doesn’t know where a destination MAC address lives, it floods that frame out every port in the VLAN except the one it arrived on. In a data center with thousands of virtual machines, this overhead is staggering.

The 4,096 VLAN ID limit (4,094 usable, since two values are reserved) is another hard ceiling. If you’re building a multi-tenant cloud or a large-scale lab, those 4k IDs disappear faster than you’d expect. We are trying to solve 21st-century scaling problems with a 12-bit identifier standardized in the 1990s. While STP was a breakthrough decades ago, it now acts as a bottleneck. It prevents us from using modern Equal-Cost Multi-Path (ECMP) routing to load-balance traffic across all available fiber.

From Manual Tunnels to BGP EVPN: Choosing Your Path

Encapsulation is the escape hatch from these limits. VXLAN (Virtual Extensible LAN) solves the ID shortage by providing over 16 million unique VNIs. However, the real challenge is managing the tunnels.
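The gap between the two ID spaces follows directly from the field widths: 12 bits for an 802.1Q VLAN ID versus 24 bits for a VXLAN VNI. A minimal sketch of that arithmetic (the helper name id_space is just for illustration):

```shell
# Size of an ID space for a given field width: 2^bits
id_space() {
  echo $((1 << $1))
}

echo "802.1Q VLAN IDs (12-bit): $(id_space 12)"
echo "VXLAN VNIs (24-bit):      $(id_space 24)"
```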

Static VXLAN: You manually map every remote VTEP (VXLAN Tunnel End Point). This is fine for two servers. For 50 servers, every host needs 49 tunnel definitions, and each new host means touching all the others.
Multicast VXLAN: The network uses multicast groups to simulate flooding. Many engineers avoid this in the core because it requires multicast routing on every hop of the underlay, which is notoriously difficult to troubleshoot and rarely extends cleanly across routed subnets.
BGP EVPN: This is the approach most modern data center fabrics use. Instead of flooding frames to discover MAC addresses, we use the Border Gateway Protocol (BGP) to distribute reachability information. It transforms a Layer 2 problem into a structured routing problem. Once a VTEP learns a local MAC address, it advertises it to its peers, so remote hosts know where a device lives before they send it a single frame.
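To make the static option concrete, here is a rough sketch that prints the ovs-vsctl commands a single host would need for a static full mesh. The function name, the vxN port names, and the peer addresses are placeholders for illustration; only the ovs-vsctl syntax mirrors what is used later in this article.

```shell
# Print the ovs-vsctl commands one host needs for a static full mesh.
# Usage: static_mesh_cmds BRIDGE VNI PEER_IP...
static_mesh_cmds() {
  bridge=$1; vni=$2; shift 2
  i=0
  for peer in "$@"; do
    # One VXLAN port per remote VTEP, each with a fixed remote_ip
    echo "ovs-vsctl add-port $bridge vx$i -- set interface vx$i type=vxlan options:remote_ip=$peer options:key=$vni"
    i=$((i + 1))
  done
}

# Three hypothetical peers; a 50-host fabric would need 49 of these per host
static_mesh_cmds br-int 100 10.0.0.2 10.0.0.3 10.0.0.4
```

Across a 50-host fabric that is 50 × 49 = 2,450 tunnel definitions to keep in sync by hand, which is the bookkeeping EVPN takes over.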

The Power Couple: FRRouting (FRR) and Open vSwitch (OVS)

Linux is no longer just a server OS; it is the engine behind the world’s largest software-defined networks. Combining FRRouting as the control plane with Open vSwitch as the data plane creates a vendor-neutral stack that rivals expensive proprietary hardware.

FRR provides the “intelligence” by learning MAC address locations via BGP. OVS handles the heavy lifting. It encapsulates traffic into VXLAN tunnels and moves packets at 10G, 40G, or 100G line rates depending on your hardware.

1. Preparing the Linux Environment

Most modern distributions like Ubuntu or Debian include these tools in their standard repositories. You will also need to ensure the kernel is ready for VXLAN and dummy interfaces, which we use for loopback addresses.

# Install Open vSwitch and FRRouting
sudo apt update
sudo apt install openvswitch-switch frr frr-pythontools -y

# Enable the BGP daemon
sudo sed -i 's/bgpd=no/bgpd=yes/' /etc/frr/daemons
sudo systemctl restart frr
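The sed command does nothing, and reports nothing, if the line in /etc/frr/daemons does not read exactly bgpd=no. A small check like the following can confirm the daemon is really enabled before you move on. This is a sketch: the helper name frr_daemon_enabled is made up for this example, and it assumes the usual name=yes/no layout of the daemons file.

```shell
# Return success if the named daemon is enabled in an FRR daemons file.
# Usage: frr_daemon_enabled DAEMON [FILE]
frr_daemon_enabled() {
  grep -q "^$1=yes" "${2:-/etc/frr/daemons}" 2>/dev/null
}

if frr_daemon_enabled bgpd; then
  echo "bgpd is enabled"
else
  echo "bgpd is NOT enabled; check /etc/frr/daemons" >&2
fi
```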

2. Configuring the Data Plane with Open vSwitch

We need a bridge to host our virtual machines and a VXLAN port to link with the fabric. The loopback IP (1.1.1.1 in this example) serves as our unique VTEP identifier.

# Create an OVS bridge
sudo ovs-vsctl add-br br-int

# Create a VXLAN port
# remote_ip=flow leaves the tunnel destination to be set per flow (by whatever
# programs the bridge's flow table) instead of fixing one peer on the port
sudo ovs-vsctl add-port br-int vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=flow options:key=100

# Bring the bridge up
sudo ip link set br-int up
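The commands above never actually assign the VTEP address. One way to do it, sketched here under assumptions, is a dummy interface that carries 1.1.1.1/32, with the tunnel source pinned to it through the OVS local_ip option. The interface name vtep0 and the helper names are arbitrary choices for this example. The function prints the commands by default so you can review them; set APPLY=1 to run them.

```shell
# Print a command, or run it with sudo when APPLY=1
run() {
  if [ "${APPLY:-0}" = "1" ]; then sudo "$@"; else echo "sudo $*"; fi
}

# Give this host a VTEP address on a dummy interface.
# Usage: setup_vtep IFNAME VTEP_IP
setup_vtep() {
  ifname=$1; vtep_ip=$2
  run modprobe dummy
  run ip link add "$ifname" type dummy
  run ip addr add "$vtep_ip/32" dev "$ifname"
  run ip link set "$ifname" up
  # Pin the VXLAN tunnel source to the VTEP address
  run ovs-vsctl set interface vxlan0 options:local_ip="$vtep_ip"
}

setup_vtep vtep0 1.1.1.1
```

The underlay still has to route to this address from every peer, or the tunnels will have nowhere to land.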

3. Configuring the Control Plane with FRRouting

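Now we instruct FRR to share MAC addresses via BGP EVPN. The advertise-all-vni statement tells FRR to advertise every VNI that its zebra daemon finds on the host, along with the MAC addresses learned locally. FRR discovers VNIs and local MACs through the Linux kernel, so with an OVS data plane you should confirm that they are actually visible to zebra; otherwise nothing will be advertised.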

# Enter the FRR shell
sudo vtysh

# Configuration steps
conf t
router bgp 65001
 bgp router-id 1.1.1.1
 neighbor 10.0.0.2 remote-as 65001
 !
 address-family l2vpn evpn
  neighbor 10.0.0.2 activate
  advertise-all-vni
 exit-address-family
end
write memory
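Every host in the fabric needs the same stanza with its own router ID and neighbor, so it can help to render it from parameters instead of retyping it. The following is a minimal sketch that assumes iBGP (the neighbor shares the local AS, as in the example above); render_evpn_conf is a hypothetical helper name.

```shell
# Render the BGP EVPN stanza for one host (iBGP: neighbor uses the local AS).
# Usage: render_evpn_conf ASN ROUTER_ID NEIGHBOR_IP
render_evpn_conf() {
  cat <<EOF
router bgp $1
 bgp router-id $2
 neighbor $3 remote-as $1
 !
 address-family l2vpn evpn
  neighbor $3 activate
  advertise-all-vni
 exit-address-family
EOF
}

render_evpn_conf 65001 1.1.1.1 10.0.0.2
```

Review the output, then paste it into a vtysh configuration session on the matching host.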

4. Verifying the Overlay

Once the BGP session is established, your host learns remote MAC addresses automatically. Unlike traditional switches, which learn a MAC address only when traffic from it arrives on a port, FRR populates the BGP table as soon as peers advertise their routes.

# Check EVPN BGP routes (run from the Linux shell; inside vtysh, type only the quoted command)
sudo vtysh -c "show bgp l2vpn evpn route"

# Verify learned MAC addresses on the bridge
sudo ovs-appctl fdb/show br-int
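For a quick sanity check, you can count the MAC/IP (type-2) routes in the EVPN table. The sketch below assumes that each route line shows its key beginning with "[2]:" right after the BGP status characters, which is how FRR prints type-2 routes; treat the pattern as an assumption and adjust it if your version formats the table differently. The helper name is made up for this example.

```shell
# Count EVPN type-2 (MAC/IP) route lines read from stdin.
# The anchor skips header and legend lines that merely mention "[2]:".
count_type2_routes() {
  grep -c '^[[:space:]*>=i]*\[2\]:' || true
}

# Usage (on a host running FRR):
#   sudo vtysh -c "show bgp l2vpn evpn route" | count_type2_routes
```

A count of zero after the session is up usually means no MAC addresses are being advertised yet.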

Final Thoughts

Swapping rigid VLANs for an EVPN/VXLAN architecture changes how you think about the network. You stop fighting loop detection and start optimizing for routing efficiency. By using BGP—the same protocol that powers the global internet—you gain stability and visibility that legacy environments can’t match.

This Linux-based approach offers immense freedom. You aren’t trapped by a specific vendor’s CLI or licensing fees. Whether you’re running on a white-box switch, a high-spec server, or a VM for testing, you are building on the foundation of a modern, software-defined data center.