Kill the ksoftirqd Bottleneck: A Hands-on Guide to NIC Offloading and Interrupt Coalescing

Networking tutorial - IT technology blog

The 2:14 AM Crisis: Why Your Server is Choking

At 2:14 AM, my phone would not stop buzzing. PagerDuty alerts were firing every three seconds. By the time I opened my laptop, the dashboard showed our success rate had plummeted from 99.9% to a dismal 42%. On the surface, total CPU usage looked fine, maybe 30%, but our usual 1.2 Gbps of throughput had crashed to a measly 200 Mbps.

I logged into the terminal and ran top. While most cores were idling, Core 0 was screaming. The si (software interrupt) column was pinned at 100%. The ksoftirqd/0 process was devouring every available cycle.

This is where most high-traffic servers hit a brick wall. Your application might be lean, but your OS and Network Interface Card (NIC) are drowning in packets. When every single 1500-byte packet triggers a CPU interrupt, the kernel spends more time context switching than running your code. In a production environment, mastering NIC offloading is the difference between scaling efficiently and throwing money at hardware that won’t solve the problem.

The Bottleneck: CPU vs. Packets

Every incoming packet signals the CPU: “I have data!” The CPU pauses its current task, handles the interrupt, and processes the header. This works fine at 1,000 packets per second. But at 800,000 packets per second, the CPU enters an “interrupt storm.” It simply can’t keep up. NIC Offloading and Interrupt Coalescing are the essential fixes. These features shift the heavy lifting from software to the hardware on your network card.
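Some quick back-of-the-envelope math shows why. The snippet below assumes a saturated 1 Gbps link, one interrupt per packet, and standard Ethernet framing overhead:

```shell
# Interrupt-rate math for a saturated 1 Gbps link, one interrupt per packet.
LINK_BPS=$((1000 * 1000 * 1000))
FULL_WIRE_BYTES=1538   # 1500 MTU + 14 Ethernet header + 4 FCS + 20 preamble/IFG
MIN_WIRE_BYTES=84      # 64-byte minimum frame + 20 preamble/IFG
PPS_FULL=$((LINK_BPS / (FULL_WIRE_BYTES * 8)))
PPS_MIN=$((LINK_BPS / (MIN_WIRE_BYTES * 8)))
echo "full-size frames:    $PPS_FULL pps"   # roughly 81,000 interrupts/sec
echo "minimum-size frames: $PPS_MIN pps"    # roughly 1.49 million interrupts/sec
```

Even the best case is tens of thousands of interrupts per second, and small-packet workloads push it into the millions. That is exactly the storm described above.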

Setting Up Your Toolkit

You’ll need ethtool to tune these settings. It is the standard utility for talking to network drivers. Most distros include it, but you might need to install it on a fresh build.

# On Debian/Ubuntu
sudo apt update && sudo apt install ethtool -y

# On RHEL/CentOS/AlmaLinux
sudo dnf install ethtool -y

Always check your current hardware capabilities first. Assuming your primary interface is eth0, run this command:

# List all offloading features and their status
ethtool -k eth0

Check the status of tcp-segmentation-offload and generic-receive-offload. If they are off, you’ve found your first win.
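To zero in on just those flags, pipe the output through grep. The here-doc below is stand-in sample output so the snippet is self-contained; on a live host, feed the same grep with real ethtool output as shown in the comment.

```shell
# Filter the three headline offload flags. The here-doc is illustrative sample
# output; on a real host, use:  ethtool -k eth0 | grep -E '<same pattern>'
FLAGS=$(grep -E '^(tcp-segmentation|generic-segmentation|generic-receive)-offload:' <<'EOF'
tcp-segmentation-offload: on
generic-segmentation-offload: on
generic-receive-offload: off
EOF
)
echo "$FLAGS"
```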

Configuring NIC Offloading for Throughput

Offloading moves per-packet work off the hot path: hardware features like TSO run in the NIC’s own silicon, while their software counterparts (GSO and GRO) batch the work inside the kernel. Either way, the effect is dramatic for TCP traffic, which powers almost every web server.

1. TCP Segmentation Offload (TSO) & Generic Segmentation Offload (GSO)

Normally, the kernel breaks large data chunks into 1500-byte packets before sending. With TSO, the kernel hands the NIC a buffer of up to 64 KB and the hardware handles the slicing, sparing the CPU from building thousands of individual headers and checksums. GSO is the software fallback: it performs the same late segmentation in the kernel for NICs that lack TSO support.

# Enable TSO and GSO
sudo ethtool -K eth0 tso on gso on

2. Generic Receive Offload (GRO)

On the receiving side, GRO is your best friend. The driver merges a stream of small incoming packets into one large “super-packet” before it travels up the network stack, so the kernel processes one header instead of dozens. This drastically cuts the per-packet work the CPU must do under heavy load.

# Enable GRO
sudo ethtool -K eth0 gro on

Interrupt Coalescing: Silencing the Storm

Offloading makes packets bigger; coalescing makes interrupts fewer. Instead of shouting at the CPU for every packet, we tell the NIC to wait a few microseconds or until it has a batch of packets ready.

Check your current settings with:

ethtool -c eth0

The key parameter is rx-usecs. If it is set to 0, the NIC interrupts for every single packet. That gives you the lowest possible latency, which is great for gaming, but it kills throughput on a busy web server. I usually start with a 30-microsecond delay.

# Set RX interrupt delay to 30 microseconds
sudo ethtool -C eth0 rx-usecs 30

Many modern cards support Adaptive Interrupt Coalescing. This is a smart feature that dynamically adjusts the delay. It keeps latency low during quiet times and increases the batch size during a traffic spike.

# Enable adaptive coalescing
sudo ethtool -C eth0 adaptive-rx on adaptive-tx on

Verify and Monitor the Results

Applying changes is only the beginning. You must verify the impact. After my 2 AM fix, I watched the per-core CPU usage like a hawk. Use mpstat to see if the software interrupt load has dropped:

# Monitor CPU stats every 1 second
mpstat -P ALL 1

The %soft column should drop significantly on the core that was previously saturated. You can also watch the raw interrupt counts in real-time:

# Watch interrupt counts for eth0
watch -n 1 "grep eth0 /proc/interrupts"
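The counters in /proc/interrupts are per-CPU and per-queue; a little awk turns them into per-queue totals. The here-doc below is made-up sample data so the pipeline runs anywhere; on a live host, feed the same awk with grep eth0 /proc/interrupts instead.

```shell
# Total the per-CPU counts for each eth0 queue. Sample data via here-doc;
# live version:  grep eth0 /proc/interrupts | awk '{ ... }'
# Note: the number of trailing text columns varies by driver and kernel,
# so adjust "NF - 2" if your /proc/interrupts has more descriptor columns.
SUMS=$(awk '{ total = 0; for (i = 2; i <= NF - 2; i++) total += $i; print $NF, total }' <<'EOF'
 24:   1200   3400      0      0   PCI-MSI-edge   eth0-rx-0
 25:    500    700      0      0   PCI-MSI-edge   eth0-tx-0
EOF
)
echo "$SUMS"
```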

If the count increases slowly while throughput stays high, you’ve won. Remember that these settings often reset after a reboot or a driver reload. Make them permanent with your network configuration layer (Netplan and systemd-networkd can persist most offload flags) or a simple systemd unit for anything ethtool-specific, like the coalescing timers. Tuning the stack frees your CPU to do what it does best: serve your users.
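For the systemd route, a oneshot unit that re-runs the ethtool commands at boot is enough. The unit name (nic-tuning.service), the eth0 interface, and the values below are placeholders; adapt them to your host:

```shell
# Hypothetical oneshot unit that re-applies the tuning at boot.
# Unit name, interface, and values are placeholders; adjust for your host.
sudo tee /etc/systemd/system/nic-tuning.service >/dev/null <<'EOF'
[Unit]
Description=Apply NIC offload and interrupt coalescing settings
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/sbin/ethtool -K eth0 tso on gso on gro on
ExecStart=/usr/sbin/ethtool -C eth0 rx-usecs 30

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now nic-tuning.service
```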