Configuring Real-time Kernel (PREEMPT_RT) on Linux for Ultra-Low Latency Workloads

Linux tutorial - IT technology blog

When the Standard Kernel Gets in Your Way

A few years back, I was helping set up a low-latency audio processing pipeline on a Linux server. The application kept dropping frames — not because the CPU was overloaded, but because the kernel scheduler occasionally delayed tasks by several milliseconds. That’s when I first ran into PREEMPT_RT.

The standard Linux kernel is optimized for throughput, not determinism. Kernel tasks can hold locks for extended periods. Interrupt handlers can delay your application in ways you simply cannot predict. For most workloads — web servers, databases, containerized apps — that’s completely fine. But for real-time applications like:

  • Industrial control systems (PLC replacements, motion control)
  • Low-latency audio/video production
  • Financial trading systems requiring microsecond precision
  • Robotics and embedded Linux platforms
  • Telecommunications (5G RAN, DPDK-based packet processing)

…the stock kernel simply does not cut it. Worst-case scheduling latency on an untuned system can spike to 5–20 milliseconds. In real-time terms, that’s an eternity.

What PREEMPT_RT Actually Changes

PREEMPT_RT fundamentally reworks the kernel’s locking model. Most spinlocks become sleeping locks built on priority-inheriting rt_mutexes, so the code holding them can be preempted. Interrupt handlers run as preemptible kernel threads rather than in hard interrupt context. Nearly every kernel code path can be interrupted mid-flight. Net effect: your high-priority tasks can preempt even kernel-internal code, dropping worst-case latency from milliseconds to tens of microseconds on properly tuned hardware.
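
Once an RT kernel is booted, the threaded interrupt handlers show up as ordinary kernel threads. The sketch below filters `ps` output for them. It assumes the usual `irq/<number>-<name>` thread naming and that procps `ps` is installed; the helper name `irq_threads` is made up for illustration.

```shell
# Keep the header line plus every thread whose command name starts with "irq/".
# Expects `ps -eo pid,cls,rtprio,comm` style input on stdin (comm is column 4).
irq_threads() {
  awk 'NR == 1 || $4 ~ /^irq\//'
}

# CLS "FF" means SCHED_FIFO; RTPRIO is the thread's real-time priority.
ps -eo pid,cls,rtprio,comm 2>/dev/null | irq_threads
```

Threaded handlers normally run as SCHED_FIFO at priority 50 unless someone changed them, which is worth knowing when you pick a priority for your own task: above 50 it can preempt interrupt handling, below 50 it cannot.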

Installing the PREEMPT_RT Kernel

Option 1: Pre-built Packages (Ubuntu/Debian — Recommended to Start)

Ubuntu ships full PREEMPT_RT kernels through Ubuntu Pro. If you are not ready to attach a Pro token, the low-latency kernel in the regular archive needs no subscription and still gives you meaningful improvements:

# Check what RT kernels are available
apt search linux-image-rt

# Low-latency kernel (PREEMPT, not full RT — good starting point)
sudo apt install linux-image-lowlatency linux-headers-lowlatency

# Full PREEMPT_RT on Ubuntu Pro (free for up to 5 machines)
sudo pro attach <your-token>
sudo pro enable realtime-kernel

For Debian systems, the RT kernel is in the standard repositories:

sudo apt install linux-image-rt-amd64 linux-headers-rt-amd64

Option 2: Build from Source

When you need a specific kernel version, custom driver support, or full control over the build config, compiling from source is the right call. Budget 30–60 minutes on a modern desktop, or closer to 2 hours on a 2-core VPS. One rule that has saved me more than once: always test on a staging machine first. An RT kernel build is exactly the kind of change that surprises you in production if you skip that step.

# Install build dependencies
sudo apt install build-essential libncurses-dev bison flex \
  libssl-dev libelf-dev bc dwarves

# Download kernel source and matching RT patch
KERNEL_VERSION=6.6.21
RT_PATCH=patch-6.6.21-rt26

wget https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-${KERNEL_VERSION}.tar.xz
wget https://cdn.kernel.org/pub/linux/kernel/projects/rt/6.6/${RT_PATCH}.patch.xz

tar xf linux-${KERNEL_VERSION}.tar.xz
cd linux-${KERNEL_VERSION}
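xzcat ../${RT_PATCH}.patch.xz | patch -p1

# Start from your current running kernel config
cp /boot/config-$(uname -r) .config
make olddefconfig

# Enable full PREEMPT_RT (the option depends on EXPERT)
scripts/config --enable EXPERT
scripts/config --enable PREEMPT_RT
scripts/config --disable PREEMPT_VOLUNTARY
scripts/config --disable PREEMPT
# Re-resolve dependencies after editing .config
make olddefconfig

# Build as .deb packages (easier to install/remove)
# The RT patch already appends -rtNN to the version, so no LOCALVERSION is needed
make -j$(nproc) deb-pkg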

# Install
sudo dpkg -i ../linux-image-*.deb ../linux-headers-*.deb

# Reboot and select the RT kernel in GRUB
sudo reboot

After rebooting, confirm you’re on the RT kernel:

uname -a
# Expected output contains: PREEMPT_RT
# Example: Linux myhost 6.6.21-rt26 #1 SMP PREEMPT_RT Sat Apr 5 ...
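
The same check can be scripted so it fails loudly when the wrong kernel boots. This is a minimal sketch: `is_rt_version` is a hypothetical helper, and the `/sys/kernel/realtime` fallback assumes your RT-patched kernel exposes that file.

```shell
# Succeeds if the given `uname -v` string reports a PREEMPT_RT build.
is_rt_version() {
  case "$1" in
    *PREEMPT_RT*) return 0 ;;
    *) return 1 ;;
  esac
}

# PREEMPT_RT shows up in the version string (uname -v), not the release (uname -r).
if is_rt_version "$(uname -v)" || [ "$(cat /sys/kernel/realtime 2>/dev/null)" = "1" ]; then
  echo "PREEMPT_RT kernel is running"
else
  echo "Not an RT kernel: check the default GRUB entry"
fi
```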

Configuring the System for Real-Time Performance

Installing the RT kernel is the easy part. The real work is persuading the rest of the OS to leave your RT task alone.

CPU Isolation

Reserve specific CPU cores for your RT workload. Isolated cores are removed from the scheduler’s load balancing, so ordinary tasks only land there if you pin them explicitly (per-CPU kernel threads still run on them):

sudo nano /etc/default/grub

# Add to GRUB_CMDLINE_LINUX_DEFAULT — here we isolate cores 2 and 3:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash isolcpus=2,3 nohz_full=2,3 rcu_nocbs=2,3"

sudo update-grub
sudo reboot

isolcpus takes the cores out of general scheduling, nohz_full stops the periodic timer tick on those cores while only one task is runnable there (reduces interrupt-driven jitter), and rcu_nocbs offloads RCU callbacks away from them. These three parameters work together, so use all three.
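
After the reboot it is worth confirming that the kernel actually received the parameters. A small sketch, assuming the three parameter names used above; `cmdline_param` is a hypothetical helper that pulls one `key=value` pair out of a command-line string.

```shell
# Print the value of a key=value kernel parameter found in a command-line string.
# Runs in a subshell so `set -f` and the variables do not leak into the caller.
cmdline_param() (
  key=$1
  cmdline=$2
  set -f                      # no glob expansion while splitting on spaces
  for word in $cmdline; do
    case "$word" in
      "$key"=*) printf '%s\n' "${word#*=}"; exit 0 ;;
    esac
  done
  exit 1
)

cmdline=$(cat /proc/cmdline 2>/dev/null)
for p in isolcpus nohz_full rcu_nocbs; do
  printf '%-10s %s\n' "$p" "$(cmdline_param "$p" "$cmdline" || echo 'not set')"
done

# The kernel also reports the effective CPU sets it ended up with.
cat /sys/devices/system/cpu/isolated /sys/devices/system/cpu/nohz_full 2>/dev/null || true
```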

IRQ Affinity

Move hardware interrupts away from your isolated cores. The simplest approach is configuring irqbalance to avoid them:

sudo apt install irqbalance
sudo nano /etc/default/irqbalance

# Hex bitmask, written without a 0x prefix: cores 2 and 3 = binary 1100 = hex c
IRQBALANCE_BANNED_CPUS="c"

sudo systemctl restart irqbalance

For specific IRQs that irqbalance doesn’t handle (like your NIC’s MSI interrupts), pin them manually:

# View all active IRQs
cat /proc/interrupts

# Pin IRQ 30 to CPU 0 only (smp_affinity is a CPU bitmask)
echo 1 | sudo tee /proc/irq/30/smp_affinity
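
The bitmask arithmetic is easy to get wrong by hand, so here is a small sketch that computes it. `cpus_to_mask` is a hypothetical helper; it prints bare hex digits, the form `/proc/irq/<n>/smp_affinity` displays, and it only covers CPU numbers that fit in one shell integer (below 63).

```shell
# Convert a list of CPU numbers into a hex CPU bitmask: bit N set means CPU N.
cpus_to_mask() (
  mask=0
  for cpu in "$@"; do
    mask=$(( mask | (1 << cpu) ))
  done
  printf '%x\n' "$mask"
)

cpus_to_mask 2 3   # cores 2 and 3: binary 1100
cpus_to_mask 0     # core 0 only:   binary 0001
```

Larger machines write the mask as comma-separated 32-bit groups, which this sketch does not attempt.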

Resource Limits for RT Processes

RT processes need permission to lock memory and set elevated scheduling priorities. Add to /etc/security/limits.conf:

sudo nano /etc/security/limits.conf

# Replace 'rtuser' with your actual username, or use @audio for audio group
rtuser    -    rtprio     99
rtuser    -    memlock    unlimited
rtuser    -    nice       -20
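
The limits are applied at login, so they only show up in a new session. The sketch below checks them from that session. `limit_ok` is a hypothetical helper, and the thresholds (priority 80, 65536 KiB of locked memory) are placeholder assumptions to replace with your application’s needs; `ulimit -r` is a bash option and may be missing in other shells.

```shell
# Succeeds when a limit reported by `ulimit` meets the required minimum.
limit_ok() {
  [ "$1" = "unlimited" ] && return 0
  [ "$1" -ge "$2" ] 2>/dev/null
}

# The inner subshell keeps an unsupported ulimit option from aborting the script.
rtprio=$( (ulimit -r) 2>/dev/null || echo 0 )    # max real-time priority
memlock=$( (ulimit -l) 2>/dev/null || echo 0 )   # max locked memory, in KiB

if limit_ok "$rtprio" 80; then echo "rtprio OK ($rtprio)"; else echo "rtprio too low ($rtprio)"; fi
if limit_ok "$memlock" 65536; then echo "memlock OK ($memlock)"; else echo "memlock too low ($memlock)"; fi
```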

Kernel Tuning Parameters

sudo nano /etc/sysctl.d/99-realtime.conf

# Run the kernel watchdog on housekeeping cores 0 and 1 only
# (this sysctl takes a CPU list such as 0-1, not a hex mask)
kernel.watchdog_cpumask = 0-1

# Allow RT tasks to consume 100% CPU if needed
# WARNING: set this back to 950000 if your RT tasks ever spin unexpectedly
kernel.sched_rt_runtime_us = -1

# Minimize swap pressure
vm.swappiness = 0
vm.dirty_ratio = 10
vm.dirty_background_ratio = 5

# Apply the settings without rebooting
sudo sysctl -p /etc/sysctl.d/99-realtime.conf
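
To see what `kernel.sched_rt_runtime_us` means in practice: RT tasks may run for that many microseconds out of every `kernel.sched_rt_period_us`, and the defaults of 950000 and 1000000 leave 5% of each second for everything else. A small sketch of the arithmetic, with `rt_budget_percent` as a hypothetical helper:

```shell
# Percentage of each period that RT tasks may consume; -1 disables the throttle.
rt_budget_percent() (
  runtime=$1
  period=$2
  if [ "$runtime" -lt 0 ]; then
    echo 100
  else
    echo $(( runtime * 100 / period ))
  fi
)

# Fall back to the kernel defaults when /proc is not readable.
runtime=$(cat /proc/sys/kernel/sched_rt_runtime_us 2>/dev/null || echo 950000)
period=$(cat /proc/sys/kernel/sched_rt_period_us 2>/dev/null || echo 1000000)
echo "RT tasks may use $(rt_budget_percent "$runtime" "$period")% of each ${period} us period"
```

With the throttle disabled, a SCHED_FIFO task stuck in a busy loop can starve every lower-priority task on its core, which is why the warning above matters.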

CPU Frequency Governor

Frequency scaling introduces latency variance. Lock your RT cores to maximum frequency:

sudo apt install cpufrequtils

# Set performance governor on isolated cores
sudo cpufreq-set -g performance -c 2
sudo cpufreq-set -g performance -c 3

# Verify
cat /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
# performance
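
If cpufrequtils is not available, the same governor can be written through sysfs directly. This is a sketch rather than a drop-in replacement: `set_governor` and the `CPU_SYSFS` override are hypothetical names, and the override exists only so the function can be dry-run against a scratch directory.

```shell
# Write a cpufreq governor for each listed CPU via sysfs.
# CPU_SYSFS can point at a scratch directory for a dry run (assumption, not a kernel feature).
set_governor() (
  governor=$1
  shift
  root=${CPU_SYSFS:-/sys/devices/system/cpu}
  for cpu in "$@"; do
    file="$root/cpu$cpu/cpufreq/scaling_governor"
    if [ -w "$file" ]; then
      printf '%s\n' "$governor" > "$file" && echo "cpu$cpu: $governor"
    else
      echo "cpu$cpu: cannot write $file" >&2
    fi
  done
)

# Usage, from a root shell or a script started with sudo:
#   set_governor performance 2 3
```

Either way the setting is runtime state, so it has to be reapplied after every boot.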

Launching Your Application with RT Scheduling

Pin your application to the isolated cores and give it SCHED_FIFO scheduling:

# SCHED_FIFO priority 80, pinned to core 2
sudo chrt -f 80 taskset -c 2 ./your-rt-application

# Check what scheduling policy a running process is using
chrt -p <PID>

Verification and Monitoring

Never assume the tuning worked — measure it. The standard RT latency benchmark is cyclictest, which measures the delta between when a timer should fire and when it actually does.

Running cyclictest

sudo apt install rt-tests

# Run on isolated core 2, RT priority 80, for 60 seconds
sudo cyclictest \
  --mlockall \
  --priority=80 \
  --interval=200 \
  --distance=0 \
  --affinity=2 \
  --duration=60s \
  --histogram=400 \
  --histfile=latency.hist

Three numbers matter:

  • Min: baseline hardware latency — typically 1–10 µs on modern x86
  • Avg: normal operating latency under steady load
  • Max: worst-case jitter — this is the number your application actually has to survive

On a well-tuned PREEMPT_RT system, Max should be under 100 µs. Serious deployments — industrial control, professional audio — routinely hit 20–50 µs on good server hardware. Run the same test on a stock kernel under load and 5–20 ms spikes are completely normal.
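
Max is a single sample, so it helps to see how the rest of the distribution looks. The sketch below reads a percentile out of the histogram file. It assumes the single-thread layout of one `latency_us count` pair per line, with summary lines starting with `#`; `hist_percentile` is a hypothetical helper, and samples beyond the histogram limit (400 µs in the run above) are reported as overflows rather than counted here.

```shell
# Print the smallest latency (us) at or below which PCT percent of samples fall.
# Usage: hist_percentile PCT FILE
hist_percentile() {
  awk -v pct="$1" '
    /^#/ || NF < 2 { next }                 # skip summary and blank lines
    { lat[n] = $1 + 0; cnt[n] = $2 + 0; total += $2; n++ }
    END {
      if (total == 0) exit 1                # no samples recorded
      need = total * pct / 100
      for (i = 0; i < n; i++) {
        seen += cnt[i]
        if (seen >= need) { print lat[i]; exit 0 }
      }
    }' "$2"
}

if [ -f latency.hist ]; then
  echo "p99.9 latency: $(hist_percentile 99.9 latency.hist) us"
fi
```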

Test Under Realistic Load

Idle latency numbers are mostly useless. You need to know what happens when the system is actually under pressure:

# Terminal 1: Measure latency
sudo cyclictest --mlockall --priority=80 --interval=200 --duration=120s --affinity=2

# Terminal 2: CPU and memory stress
sudo apt install stress-ng
stress-ng --cpu 2 --vm 1 --vm-bytes 1G --io 2

# Terminal 3: Network load
sudo apt install iperf3
iperf3 -s &
iperf3 -c localhost -t 120 -P 4

Run all three simultaneously and watch Max. If it stays within your application’s deadline with all stressors active, your tuning is solid enough for production validation.
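
For repeatable runs, the pass/fail decision can be scripted instead of read off the screen. A minimal sketch, assuming cyclictest’s usual summary line format (`... Min: 2 Act: 4 Avg: 3 Max: 27`); the helper names and the 100 µs deadline are placeholders.

```shell
# Extract the largest Max value (us) from cyclictest summary lines on stdin.
cyclictest_max() {
  sed -n 's/.*Max:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | sort -n | tail -n 1
}

# Usage: within_deadline MAX_US DEADLINE_US
within_deadline() {
  [ "$1" -le "$2" ]
}

# Sample line for illustration; in practice pipe in `sudo cyclictest --quiet ...`.
sample='T: 0 ( 1234) P:80 I:200 C: 600000 Min:      2 Act:    4 Avg:    3 Max:      27'
max=$(printf '%s\n' "$sample" | cyclictest_max)
if within_deadline "$max" 100; then
  echo "PASS: max ${max} us"
else
  echo "FAIL: max ${max} us"
fi
```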

Tracing Latency Spikes with ftrace

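When cyclictest shows occasional outliers you can’t explain, ftrace pinpoints the kernel code path responsible. The preemptirqsoff tracer exists only on kernels built with CONFIG_IRQSOFF_TRACER and CONFIG_PREEMPT_TRACER, so check that it is listed in /sys/kernel/debug/tracing/available_tracers first: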

# Enable the preempt/IRQ-off latency tracer
echo preemptirqsoff | sudo tee /sys/kernel/debug/tracing/current_tracer
echo 1 | sudo tee /sys/kernel/debug/tracing/tracing_on

# Run your workload for a few seconds, then grab the trace
sudo cat /sys/kernel/debug/tracing/trace | head -80

# Disable tracing
echo 0 | sudo tee /sys/kernel/debug/tracing/tracing_on

RT Health Check Script

Reboots don’t restore every RT setting. IRQ affinity is the worst offender: values written to /proc/irq/*/smp_affinity are runtime state and are lost on every boot, not just after a kernel update. This script catches the problem before your RT workload does:

#!/bin/bash
# rt-check.sh: verify RT setup after reboot

echo "=== Kernel ==="
# PREEMPT_RT appears in the version string (uname -v), not the release (uname -r)
uname -v | grep -q PREEMPT_RT && echo "OK: PREEMPT_RT active" || echo "WARN: Standard kernel"

echo ""
echo "=== CPU Governor ==="
for gov in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
  [ -r "$gov" ] || continue
  cpu_dir=$(dirname "$(dirname "$gov")")
  echo "  $(basename "$cpu_dir"): $(cat "$gov")"
done

echo ""
echo "=== Isolated CPUs ==="
# The file exists but is empty when no cores are isolated
isolated=$(cat /sys/devices/system/cpu/isolated 2>/dev/null)
echo "  ${isolated:-none}"

echo ""
echo "=== Swappiness ==="
echo "  $(sysctl vm.swappiness)"

# Make the script executable and run it
chmod +x rt-check.sh
./rt-check.sh

Run it after every kernel update or reboot, before putting your RT workload back into service. It has caught silent misconfigurations more than once — especially after unattended upgrades.
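
An IRQ affinity check fits the same script. The sketch below warns about any IRQ whose affinity list still includes an isolated core. Both helper names are hypothetical, the optional second argument exists only so the function can be tried against a scratch directory, and it assumes plain `0-3,6` style CPU lists.

```shell
# Expand a CPU list such as "0-1,4" into one CPU number per line.
expand_cpulist() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS=- read -r first last; do
    [ -n "$first" ] || continue
    i=$first
    while [ "$i" -le "${last:-$first}" ]; do
      echo "$i"
      i=$((i + 1))
    done
  done
}

# Warn about IRQs whose affinity overlaps the isolated CPUs.
# $1 = isolated CPU list, $2 = IRQ directory (defaults to /proc/irq)
irqs_on_isolated() {
  isolated=$(expand_cpulist "$1")
  for f in "${2:-/proc/irq}"/*/smp_affinity_list; do
    [ -r "$f" ] || continue
    for cpu in $(expand_cpulist "$(cat "$f")"); do
      if printf '%s\n' "$isolated" | grep -qx "$cpu"; then
        irq=${f%/smp_affinity_list}
        echo "WARN: IRQ ${irq##*/} may run on isolated CPU $cpu"
        break
      fi
    done
  done
}

irqs_on_isolated "$(cat /sys/devices/system/cpu/isolated 2>/dev/null)"
```

Some interrupts cannot be moved at all, so treat the warnings as a list to review, not as failures.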
