Mastering bpftrace: Solving Linux Performance Mysteries with One-Liners

Linux tutorial - IT technology blog

Beyond the Dashboard: Why Traditional Tools Fail

I’ve spent a decade in the trenches of Linux server management, and the hardest bugs are always the invisible ones. You might see a CPU spike in top or a jump in disk latency via iostat, but those tools only show the symptoms, not the cause. For years, strace was our go-to for deep dives. However, using strace on a production database is risky: it relies on ptrace, which stops the traced process on every syscall. I’ve seen it add 50-100% overhead to high-frequency syscalls, effectively strangling the very service you’re trying to save.

Enter bpftrace. It uses eBPF (extended Berkeley Packet Filter) to run small, verifier-checked programs inside the kernel. This is a different model, not a minor improvement: the kernel runs your probe code in place instead of pausing the application on every event. On a busy Ubuntu 22.04 web server handling 10,000 requests per second, I’ve used bpftrace to pinpoint latency issues without the application even noticing the tracer was running. With a well-chosen probe the impact on throughput is very small, though the cost grows with how often the probe fires.

Quick Start: From Zero to Tracing in 5 Minutes

Setting up bpftrace is remarkably simple on modern distros. It acts as a high-level wrapper, compiling a clean, domain-specific language into efficient BPF bytecode on the fly.

Installation

On Ubuntu/Debian:

sudo apt update
sudo apt install bpftrace

On Fedora/RHEL:

sudo dnf install bpftrace

Your First Practical One-Liner

Need to know which files are being accessed across the entire system right now? Run this:

sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%s (%d) opened %s\n", comm, pid, str(args->filename)); }'

Let’s break that down. The tracepoint is our hook: it fires on entry to every openat() syscall. The comm and pid builtins give us the process name and ID, while str(args->filename) reads the path string from the pointer the process passed in. I recently used this to catch a rogue log-rotation script that was accidentally trying to open 500,000 non-existent files every hour.
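Once you have a suspect, a filter (bpftrace calls it a predicate) cuts the noise down to one process. Here is a minimal sketch, assuming the process is named logrotate (a placeholder, substitute your own) and that a 10-second capture window is enough:

```shell
# The program text lives in a variable so it can be reviewed before running as root.
# "logrotate" is a placeholder process name, not something taken from a real system.
prog='
tracepoint:syscalls:sys_enter_openat /comm == "logrotate"/ {
    printf("%s (%d) opened %s\n", comm, pid, str(args->filename));
}
interval:s:10 { exit(); }'

# The predicate between the slashes drops events from every other process.
# The interval probe fires after 10 seconds and ends the trace.
sudo bpftrace -e "$prog"
```

The predicate is evaluated in the kernel, so events from other processes are dropped before anything is formatted or printed.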

The Engine: How Probes Actually Work

Effective tracing requires understanding your sensors. Think of probes as surgical instruments you can place anywhere in the kernel or user-space.

The Probe Hierarchy

  1. Tracepoints: These are stable, pre-defined markers in the kernel. Always use these first; they rarely break during kernel upgrades.
  2. Kprobes (Kernel Probes): Use these to hook into virtually any internal kernel function. They are powerful but fragile, as function names can change between kernel versions.
  3. Uprobes (User Probes): These target functions inside your own binaries, like a specific method in a MySQL or Node.js process.
  4. Software/Hardware Events: These track low-level metrics like CPU cycles, instructions, or page faults.
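Each probe type in the list above has its own prefix, and bpftrace -l accepts a wildcard pattern for any of them. A quick sketch of how you might survey what your machine exposes (the patterns are illustrative, and /bin/bash is only an example target binary):

```shell
# Show the first few matches for each probe family.
for pattern in 'tracepoint:syscalls:*' 'kprobe:tcp_*' 'uprobe:/bin/bash:*' 'software:*' 'hardware:*'; do
    echo "== $pattern"
    sudo bpftrace -l "$pattern" | head -n 5
done
```

The results differ between kernels and builds, which is exactly why it pays to look before you write a probe.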

Aggregating Data with Maps

Printing every event to the screen gets expensive on a busy system. Instead, use ‘maps’ to summarize data inside the kernel. This one-liner counts read() syscalls by process name:

sudo bpftrace -e 'tracepoint:syscalls:sys_enter_read { @[comm] = count(); }'

Press Ctrl+C after a few seconds and bpftrace prints the map on exit, sorted by count, so the biggest talkers land at the bottom of a clean table. It’s the fastest way to find ‘noisy neighbors’ in a crowded container environment.
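To widen the count from read() to every syscall, one option is the raw_syscalls tracepoint. A minimal sketch, assuming a 5-second window and a top-10 cutoff (both arbitrary choices on my part):

```shell
# Count every syscall by process name, stop after 5 seconds, show the 10 busiest.
prog='
tracepoint:raw_syscalls:sys_enter { @calls[comm] = count(); }
interval:s:5 { exit(); }
END { print(@calls, 10); clear(@calls); }'

sudo bpftrace -e "$prog"
```

The clear() in the END block empties the map after the top 10 are printed, so bpftrace does not dump the full table a second time on exit.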

Real-World Battle Drills

Here are three one-liners I rely on when a production system starts behaving badly.

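1. Visualizing Read Latency at the VFS Layer

Disk lag is often the hidden culprit behind slow database queries. Instead of a single average number, this creates a distribution histogram of how long each vfs_read() call takes, in nanoseconds:

sudo bpftrace -e 'kprobe:vfs_read { @start[tid] = nsecs; } kretprobe:vfs_read /@start[tid]/ { @latency = hist(nsecs - @start[tid]); delete(@start[tid]); }'

hist() groups values into power-of-two buckets, so 10 ms (10,000,000 ns) falls in the [8M, 16M) row. Keep in mind that vfs_read() covers every read, including page-cache hits and reads on sockets or pipes that block while waiting for data. A cluster of events in that row or above is a strong hint that your SSD or SAN is saturated, but confirm it at the block layer before blaming the storage.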
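To measure the storage device itself rather than every read, one option is to time requests between the block layer's issue and completion tracepoints. This is a rough sketch, assuming the block:block_rq_issue and block:block_rq_complete tracepoints expose dev and sector fields on your kernel (check with bpftrace -lv) and using an arbitrary 10-second window:

```shell
# Time each block request from issue to completion, keyed by device and sector.
prog='
tracepoint:block:block_rq_issue {
    @start[args->dev, args->sector] = nsecs;
}
tracepoint:block:block_rq_complete /@start[args->dev, args->sector]/ {
    @usecs = hist((nsecs - @start[args->dev, args->sector]) / 1000);
    delete(@start[args->dev, args->sector]);
}
interval:s:10 { exit(); }
END { clear(@start); }'

sudo bpftrace -e "$prog"
```

The histogram here is in microseconds, so the 10 ms mark sits in the [8K, 16K) row. Requests that were still in flight when the trace ended are discarded by the clear() in END rather than printed.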

2. Catching Short-Lived Processes

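Processes that start and exit within a second rarely show up in top, yet a script that spawns thousands of them can quietly eat CPU. This captures every new execution with its full arguments:

sudo bpftrace -e 'tracepoint:syscalls:sys_enter_execve { printf("%-10u %-15s ", pid, comm); join(args->argv); }'

The join() function does the heavy lifting here. It prints the argv array as one space-separated line, so you can see exactly what that obscure cron job is doing. Note that comm is read on entry to execve(), so it shows the name of the calling process (often the shell), not the program that is about to run.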

3. Debugging TCP Retransmissions

Network latency is notoriously difficult to isolate. This hook fires every time the kernel has to re-send a packet:

sudo bpftrace -e 'kprobe:tcp_retransmit_skb { printf("Retransmit: %s (PID %d)\n", comm, pid); }'

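If this command starts scrolling rapidly, stop looking at your code and start checking your cables, switches, or cloud provider’s status page. Treat the process name with caution, though: retransmits are usually triggered from timer or interrupt context, so comm and pid identify whatever happened to be on the CPU at that moment, not necessarily the owner of the socket.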
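A summary by port is often more useful than a scrolling list of process names. Here is a sketch that uses the tcp:tcp_retransmit_skb tracepoint instead of the kprobe, assuming your kernel provides it with a dport field (verify with bpftrace -lv) and using an arbitrary 10-second window:

```shell
# Count retransmitted segments by destination port for 10 seconds.
prog='
tracepoint:tcp:tcp_retransmit_skb {
    @retransmits[args->dport] = count();
}
interval:s:10 { exit(); }'

sudo bpftrace -e "$prog"
```

If one port dominates the table, for example your database port, you know which traffic path to investigate first.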

Production Safeguards

Bpftrace is efficient, but it isn’t foolproof. A few simple rules will keep you out of trouble.

Limit Your Output

Never use printf on events that happen millions of times per second, such as packet processing on a 10Gbps link. You’ll overwhelm your terminal and waste CPU cycles formatting text. Use histograms or counters instead; let the kernel do the math and only report the summary.
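As an illustration of summarizing instead of printing, this sketch reports one number per second for received packets. It assumes the net:netif_receive_skb tracepoint is available on your kernel, and the 10-second cap is an arbitrary choice:

```shell
# Count received packets in the kernel and print a single total each second.
prog='
tracepoint:net:netif_receive_skb { @packets = count(); }
interval:s:1 { print(@packets); clear(@packets); }
interval:s:10 { exit(); }'

sudo bpftrace -e "$prog"
```

However many packets arrive, the terminal sees about one line per second, because the counting happens in the kernel and only the total crosses into user space.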

Verify Before You Hook

Before assuming a probe exists, verify it for your specific kernel version:

sudo bpftrace -l '*openat*'

This saves time and avoids attach failures when a probe that exists on one distribution’s kernel is missing or renamed on another.
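Listing also tells you which arguments a probe offers. A small sketch, using the openat tracepoint from earlier as the example:

```shell
# -l lists matching probes; adding -v also prints each probe's argument names and types.
probe='tracepoint:syscalls:sys_enter_openat'
sudo bpftrace -lv "$probe"
```

The field names in that output are what you can reference through args in your program, such as the filename field used earlier.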

The Clean Exit

Always ensure your bpftrace process terminates properly. While BPF programs are designed to be safe and self-cleaning, running dozens of forgotten tracers across a cluster can lead to a ‘death by a thousand cuts’ overhead. Start simple, use maps for heavy lifting, and you’ll soon find bpftrace is the most indispensable tool in your kit.
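One way to make a clean exit automatic is to build the time limit into the program and then check for stragglers. This is a minimal sketch: the 10-second limit is arbitrary, and the last command assumes bpftool is installed, which is not the case on every system:

```shell
# Cap the trace from inside the program: exit() fires after 10 seconds.
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { @opens[comm] = count(); } interval:s:10 { exit(); }'

# Count bpftrace processes that are still alive on this host.
leftover=$(pgrep -c -x bpftrace)
echo "bpftrace processes still running: ${leftover:-0}"

# List BPF programs currently loaded in the kernel (requires bpftool).
sudo bpftool prog show
```

A non-zero count after you believe every trace has finished is your cue to track down the forgotten tracer.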
