When the Disk Fills Up at 2 AM
You get a midnight alert: your web application is throwing 500 errors. SSH in, and the first thing you see is No space left on device. The database won’t write. Nginx can’t create temp files. Everything is broken — because someone (maybe a runaway log process, maybe you) filled the disk.
I’ve been there — 3 AM, fingers flying, cursing a process I didn’t even know was running. Three years and a dozen VPS instances later, I now set up disk monitoring on every new server before anything else runs. But when the alert is already firing, here’s what actually works.
Understanding Why Disks Fill Up
Before jumping to rm -rf everything, it helps to understand what actually eats disk space on a Linux server. The usual suspects:
- Log files — Application logs, system logs, and journal files that grow unbounded
- Docker — Unused images, stopped containers, and dangling volumes accumulate silently
- Package cache — apt/yum caches old packages after updates
- Temporary files — Some apps write to /tmp or /var/tmp and never clean up
- Database dumps and backups — Old backup files that nobody deleted
- Core dumps — A crashing process can dump gigabytes of memory to disk
Here’s a gotcha that burns people constantly: df says the disk is full, but du can’t find the files taking space. This usually means a deleted file is still held open by a running process — the inode stays alive, the space stays consumed. It doesn’t free until that process closes the file handle. Classic trap.
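You can reproduce the trap safely with a throwaway file (nothing here touches real logs; this relies on Linux's /proc filesystem):

```shell
# Reproduce the deleted-but-still-open trap with a temp file
tmpfile=$(mktemp)
dd if=/dev/zero of="$tmpfile" bs=1M count=10 2>/dev/null
tail -f "$tmpfile" >/dev/null &   # this process now holds the file open
holder=$!
sleep 1                           # give tail a moment to open the file
rm "$tmpfile"                     # du no longer sees it, but the space is still used
found=$(ls -l "/proc/$holder/fd" 2>/dev/null | grep -c deleted)
echo "deleted-but-open handles: $found"
kill "$holder"                    # closing the handle is what frees the space
```

Until that `kill`, df counts the 10MB while du finds nothing, which is exactly the 2 AM head-scratcher.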
Core Commands You Need to Know
Check Overall Disk Usage
# Human-readable disk space overview
df -h
# Focus on a specific filesystem
df -h /var
Find What’s Eating Space
# Top-level disk usage in /var, sorted by size
du -sh /var/* 2>/dev/null | sort -rh | head -20
# Drill down further
du -sh /var/log/* 2>/dev/null | sort -rh | head -10
The sort -rh flag sorts human-readable sizes correctly — 1.2G sorts above 900M. Drop the -h and you’ll get garbage ordering, chasing the wrong directories while your disk stays full.
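Two sample sizes are enough to see the difference:

```shell
# Human-numeric sort understands unit suffixes; plain sort compares text
printf '900M\n1.2G\n' | sort -rh   # 1.2G first, as expected
printf '900M\n1.2G\n' | sort -r    # 900M first: '9' sorts after '1' as text
```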
Find Large Files Directly
# Find files larger than 100MB anywhere on the system
find / -xdev -type f -size +100M 2>/dev/null
# More readable version with sizes
find / -xdev -type f -size +100M -exec du -sh {} \; 2>/dev/null | sort -rh
The -xdev flag keeps the search within the current filesystem and doesn’t cross mount points — useful when you want to stay on / and not accidentally traverse NFS mounts.
Hands-On: Recovering Space Quickly
1. Clean Up Journal Logs
On systemd-based distros (Ubuntu, Debian, CentOS 7+, AlmaLinux), journald can quietly reach 2–5GB on a busy server. You probably won’t notice until the disk is gone.
# Check how much journal is using
journalctl --disk-usage
# Keep only the last 200MB of logs
journalctl --vacuum-size=200M
# Or keep only logs from the last 2 weeks
journalctl --vacuum-time=2weeks
Cap it permanently in /etc/systemd/journald.conf:
SystemMaxUse=500M
SystemKeepFree=1G
Then restart journald: systemctl restart systemd-journald
2. Truncate (Not Delete) Active Log Files
If a process has a log file open, deleting it won’t free space — the inode stays alive. Truncating it does work:
# Safely truncate a log file that's actively being written
> /var/log/nginx/access.log
# Equivalent using truncate command
truncate -s 0 /var/log/nginx/error.log
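If you want to convince yourself truncation is safe before doing it to a production log, here's a sketch using a temp file and a held-open file descriptor standing in for a daemon:

```shell
# Sanity-check truncation against an open handle (temp file, not a real log)
logfile=$(mktemp)
echo "old entries" > "$logfile"
exec 3>> "$logfile"             # fd 3 holds the file open, like a daemon would
: > "$logfile"                  # truncate in place; the open handle stays valid
echo "new entry" >&3            # the holder keeps writing without errors
exec 3>&-
content=$(cat "$logfile")
echo "$content"                 # old entries gone, new entry survived
rm "$logfile"
```

Note that `: > file` is handy when plain `> file` is rejected (e.g. under some shells' noclobber settings), and `sudo > file` never works anyway because the redirect happens in your own shell before sudo runs.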
For the deleted-but-still-open file problem, find the process holding the handle:
# Find processes with deleted files still open
lsof | grep '(deleted)'
Restarting that process releases the handle and actually frees the space.
3. Clean Docker Artifacts
On servers running Docker, this is often the biggest win:
# See what Docker is using
docker system df
# Remove stopped containers, unused networks, dangling images, build cache
docker system prune
# Also remove unused volumes (careful — double-check first!)
docker system prune --volumes
# Remove only dangling images
docker image prune
# Remove all images not used by any container
docker image prune -a
I’ve recovered 20–40GB in a single docker system prune -a on build servers that hadn’t been cleaned in months. Just make sure no containers are actively using images you’re about to remove.
4. Clean Package Manager Cache
# Debian/Ubuntu
apt clean # Removes cached .deb packages
apt autoremove # Removes orphaned packages
# RHEL/CentOS/AlmaLinux
yum clean all
dnf clean all
On a recently updated Ubuntu server, apt clean alone can recover 500MB–1GB. Not dramatic, but free space is free space — and it takes two seconds.
5. Find and Handle Core Dumps
# Check for core dumps
ls -lh /var/lib/systemd/coredump/
# Clean them out
coredumpctl list
rm /var/lib/systemd/coredump/*
# Or disable core dumps entirely in /etc/security/limits.conf
# * hard core 0
6. Check Inode Usage
Sometimes df -h shows plenty of space but writes still fail. The culprit is often inodes, not bytes:
df -i
# Shows inode usage % per filesystem
# If Use% is at 100%, you've run out of inodes
Millions of tiny files can exhaust the inode table long before disk bytes run out. Mailer queues, PHP session files, and application caches are the usual offenders. Find which directory is responsible:
# Find directories with the most files
for i in /var/*; do echo "$(find "$i" -type f 2>/dev/null | wc -l) $i"; done | sort -rn | head -10
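For repeated use, the same loop works as a small function. count_files is my name for it, not a standard tool:

```shell
# Count regular files under each immediate subdirectory, biggest first
# Usage: count_files /var   (prints "<file count> <dir>/" per subdirectory)
count_files() {
  for d in "$1"/*/; do
    printf '%s %s\n' "$(find "$d" -type f 2>/dev/null | wc -l)" "$d"
  done | sort -rn
}

count_files /var 2>/dev/null | head -10
```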
Setting Up Monitoring So You Catch This Early
The real lesson from every disk-full incident isn’t “how do I fix it” — it’s “why didn’t I know sooner.” A simple cron job alerting at 80% costs you 10 minutes to set up and saves you from exactly this situation:
#!/bin/bash
# /usr/local/bin/disk-check.sh
THRESHOLD=80
USAGE=$(df -P / | awk 'NR==2 {print $5}' | tr -d '%')
if [ "$USAGE" -ge "$THRESHOLD" ]; then
echo "ALERT: Disk usage on $(hostname) is at ${USAGE}%" | \
mail -s "Disk Warning: $(hostname)" [email protected]
fi
# Add to crontab to check every hour
crontab -e
# Add: 0 * * * * /usr/local/bin/disk-check.sh
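The awk parsing at the heart of that script can be exercised on canned df output, so you can verify the threshold logic without actually filling a disk. check_threshold is a made-up helper name for this sketch:

```shell
# check_threshold reads `df -P`-style output on stdin and prints
# ALERT or OK plus the usage number for the first data row
check_threshold() {
  awk -v t="$1" 'NR==2 { gsub("%","",$5); print (($5+0 >= t) ? "ALERT " : "OK ") $5 }'
}

# Canned two-line df output lets you test both sides of the threshold
sample='Filesystem 1024-blocks Used Available Capacity Mounted-on
/dev/sda1 10000 9100 900 91% /'
printf '%s\n' "$sample" | check_threshold 80   # prints: ALERT 91
printf '%s\n' "$sample" | check_threshold 95   # prints: OK 91
```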
For multiple servers, Netdata, Prometheus + Grafana, or Zabbix agents handle disk monitoring at scale. Single VPS? The cron script above is enough. Ship it.
A Checklist for When You’re Under Fire
When the alarm is going off and you need space now, run through this in order:
1. Run df -h to confirm which filesystem is full
2. Run du -sh /var/* | sort -rh | head -20 to find the biggest directories
3. Check for deleted-but-open files: lsof | grep deleted
4. Truncate or rotate the largest active log file
5. Run journalctl --vacuum-size=200M
6. If Docker is present: docker system prune
7. Run apt clean or dnf clean all
8. Check inode usage: df -i
Steps 2–5 alone usually recover enough space to stabilize the system. Then you can breathe, investigate the root cause, and fix it properly.
Preventing the Next Incident
Every disk-full crisis I’ve seen followed the same pattern: something grew unchecked for weeks, nobody noticed, then everything broke at once. One-time cleanups don’t fix that. Permanent changes do:
- Configure logrotate for all application logs — check /etc/logrotate.d/ for existing configs and add any missing ones
- Set SystemMaxUse in journald config
- Run docker system prune weekly via cron on build/CI servers
- Add disk usage to your monitoring stack — whatever you use, make sure it pages you at 80%
- Review backup retention policies — old backups sitting on the same disk as your application is a disaster waiting to happen
Disk space management isn’t glamorous. But 30 minutes of setup today prevents a 2 AM firefight six months from now. Trust me on that one.

