Linux Ftrace: Diagnosing Kernel Latency Without the Strace Overhead

Linux tutorial - IT technology blog
Linux tutorial - IT technology blog

Chasing Ghosts: When Standard Tools Lie

Everything looks fine on paper. top shows 5% CPU usage, iostat reports negligible disk wait, yet your API response times are spiking into the hundreds of milliseconds. When standard user-space tools fail to show a bottleneck, the issue is likely buried in the kernel. This is where Ftrace becomes your most powerful diagnostic tool.

Traditional tracers like strace are great for debugging logic, but they are heavy. Running strace on a high-traffic production process can slow it down by 10x to 50x. Ftrace, the “Function Tracer,” is built directly into the Linux kernel (since version 2.6.27). It operates with minimal overhead—often less than 1-2%—allowing you to observe system calls and scheduling delays in real-time without crashing your performance.

The Architecture: Why Ftrace is Different

Ftrace is a tracing framework, not a standalone binary. It allows you to track how functions are called, measure execution times, and see exactly how processes interact with hardware. While perf is excellent for high-level profiling, Ftrace excels at finding the “why” behind specific latency spikes. It can show you if a process is being blocked by a specific network driver or if the scheduler is making inefficient context switches.

After managing a fleet of 50+ high-traffic nodes, I’ve learned that kernel-level tracing is a precision job. You need to filter your traces carefully. Without proper filters, a busy server can generate 500MB of trace data in under 10 seconds, making it nearly impossible to find the relevant events.

Initialization: Locating the Tracefs Mount

Modern distros like Ubuntu 22.04 or AlmaLinux 9 have Ftrace enabled by default. It interfaces with the user through a virtual filesystem. Since kernel 4.1, this is usually tracefs, though it’s often still mounted within debugfs for compatibility.

Check for the tracing directory with this command:

ls /sys/kernel/tracing

If that path doesn’t exist, try the legacy location:

ls /sys/kernel/debug/tracing

If the directory is empty, you likely need to mount the debug filesystem manually:

mount -t debugfs nodev /sys/kernel/debug

Execution: Targeting Your Investigation

Ftrace uses virtual files to control its logic. You configure it by echoing strings into these files. To start, identify which tracers your specific kernel supports.

cat /sys/kernel/tracing/available_tracers
# Expected: function_graph function blk mmiotrace nop

1. Visualizing the Call Stack

The function_graph tracer is the gold standard for debugging. It provides a C-style visual hierarchy of every kernel function call, including the entry and exit times. This makes it easy to see which sub-function is actually dragging down the parent.

# Enable the graph tracer
echo function_graph > /sys/kernel/tracing/current_tracer

2. Surgical Filtering

To avoid a data deluge, narrow your scope. If you suspect disk latency, focus on the Virtual File System (VFS) layer. You can use wildcards to catch all related functions.

# Limit tracing to VFS-related functions
echo 'vfs_*' > /sys/kernel/tracing/set_ftrace_filter

If you have identified a specific misbehaving process, filter by its PID to ignore background noise from other services:

# Replace 1234 with your actual process ID
echo 1234 > /sys/kernel/tracing/set_ftrace_pid

3. Capturing the Data

The tracer can be toggled instantly. It’s best practice to keep it off while configuring, then toggle it on for a short 5-10 second burst to capture the anomaly.

# Start the capture
echo 1 > /sys/kernel/tracing/tracing_on

# Wait for the lag to occur, then immediately disable
echo 0 > /sys/kernel/tracing/tracing_on

Analysis: Spotting the Smoking Gun

Captured data lives in the trace file. Use less -S to view it; the -S flag is vital because it prevents long lines from wrapping, keeping the visual hierarchy intact.

less -S /sys/kernel/tracing/trace

The output provides a clear timeline of execution:

# CPU  DURATION                  FUNCTION CALLS
# |     |   |                     |   |   |   |
 0)               |  vfs_write() {
 0)               |    rw_verify_area() {
 0)   0.124 us    |      security_file_permission();
 0)   0.512 us    |    }
 0)   1.230 us    |  }

Decoding Latency Symbols

Ftrace flags slow functions automatically using specific markers in the duration column:

  • +: Over 10 microseconds (us)
  • !: Over 100 microseconds
  • #: Over 1 millisecond (ms)
  • *: Over 10 milliseconds
  • @: Over 100 milliseconds

In a high-performance environment, seeing a ! or # in a routine file-write operation is a red flag. I once debugged a server where a third-party security module was adding a 2ms delay (#) to every single packet processing step; Ftrace made it obvious in one screen.

Post-Debug Cleanup

Never leave a tracer active. Even with tracing_on set to 0, filters and tracer selections stay in memory. Reset everything to ensure the next admin has a clean slate.

# Reset to default state
echo nop > /sys/kernel/tracing/current_tracer
echo > /sys/kernel/tracing/set_ftrace_filter
echo > /sys/kernel/tracing/set_ftrace_pid

Pro Tip: The Trace-cmd Wrapper

If the echo commands feel too manual, use the trace-cmd utility. It simplifies the recording and reporting process into a workflow similar to tcpdump.

# Installation
sudo apt install trace-cmd

# Record specific function calls for a command
sudo trace-cmd record -p function_graph -g vfs_write ls /home

# Generate the report
trace-cmd report

Ftrace is an expansive framework, but these fundamentals will solve 90% of your “unexplained” latency issues. Practice on a staging VM to get used to the filtering syntax, and you’ll be ready the next time your production metrics go sideways.

Share: