The 2 AM Disk Space Emergency
I recently dealt with a PagerDuty alert that every DBA dreads: a production database hitting 95% disk utilization. The math didn’t add up. We hadn’t imported any new datasets, and our core table row counts were flat. Yet, our 500GB EBS volume was losing several gigabytes of free space every hour.
The culprit was “Table Bloat.” I’ve managed MySQL and MongoDB before, but PostgreSQL handles data updates differently because of its Multi-Version Concurrency Control (MVCC) architecture. The design is great for concurrency, but it creates a specific challenge: dead tuples that hog space and refuse to leave.
How MVCC Actually Creates Bloat
To fix the growth, you have to understand how PostgreSQL handles writes. Most databases overwrite data in place. PostgreSQL doesn’t. To allow multiple users to access data simultaneously without heavy locking, it creates versions of rows.
When you execute an UPDATE, PostgreSQL leaves the original row on the disk and marks it as “expired” (a dead tuple). It then inserts a completely new version of that row. A DELETE works similarly; it doesn’t erase the bits immediately, it just flags the row as invisible to future transactions.
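You can watch this happen on a scratch database. The snippet below is a minimal sketch; bloat_demo is a throwaway table I made up for illustration. The hidden ctid column shows each row version’s physical address, and it changes after the UPDATE because PostgreSQL wrote a brand-new version and left the old one behind.
-- Throwaway table purely for illustration
CREATE TABLE bloat_demo (id int PRIMARY KEY, payload text);
INSERT INTO bloat_demo VALUES (1, 'original');
-- ctid is the row version's physical address inside the table file
SELECT ctid, id, payload FROM bloat_demo;  -- e.g. (0,1)
UPDATE bloat_demo SET payload = 'changed' WHERE id = 1;
-- The address moves because a new version was written;
-- the old tuple at (0,1) is now dead and waits for VACUUM
SELECT ctid, id, payload FROM bloat_demo;  -- e.g. (0,2)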
These dead tuples are like ghosts. They stay on your disk until a process called VACUUM clears them. If your write volume is high—say, 5,000 updates per second—and your vacuum settings are too conservative, these ghosts pile up until they consume more space than the actual data.
Identifying the Bottleneck
Don’t start changing configs blindly. You need to see which tables are failing to clean up. PostgreSQL tracks this in the pg_stat_user_tables view.
I use this query to find the worst offenders in a cluster:
SELECT
relname AS table_name,
n_live_tup AS live_rows,
n_dead_tup AS dead_rows,
last_vacuum,
last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
If you see a table where dead_rows is more than 20% of live_rows, your autovacuum isn’t keeping up. If last_autovacuum is blank or days old, you have a configuration problem.
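If you prefer to have that ratio computed for you, here is a sketch of the same query with the percentage built in (the 20% cutoff is just the rule of thumb above, not an official threshold):
SELECT
  relname AS table_name,
  n_live_tup AS live_rows,
  n_dead_tup AS dead_rows,
  round(100.0 * n_dead_tup / nullif(n_live_tup, 0), 1) AS dead_pct,
  last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 0.2 * n_live_tup
ORDER BY n_dead_tup DESC;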
The Toolbox: VACUUM, VACUUM FULL, and ANALYZE
You have three main tools to manage this, but picking the wrong one during peak hours can take your application offline.
1. Standard VACUUM
This is your daily driver. It finds dead tuples and marks their space as reusable for future rows in the same table. It does not return the space to the operating system. You can run this while your app is live because it doesn’t block reads or writes.
VACUUM (VERBOSE, ANALYZE) orders;
2. VACUUM FULL
This is the nuclear option. It rewrites the entire table into a new file on disk, leaving the bloat behind. This is the only way to physically shrink the database size on your EBS volume. However, it requires an ACCESS EXCLUSIVE lock. Your application will not be able to read or write to the table until it finishes. On a 100GB table, this could take an hour of downtime.
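Before and after any rewrite, measure the table so the downtime actually buys something back. A rough sketch, reusing the orders table from above:
-- Current footprint of the table, its indexes, and TOAST data
SELECT pg_size_pretty(pg_total_relation_size('orders'));
-- Only during a maintenance window: takes an ACCESS EXCLUSIVE lock
VACUUM FULL orders;
-- Check again to confirm the space went back to the operating system
SELECT pg_size_pretty(pg_total_relation_size('orders'));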
3. ANALYZE
While VACUUM manages space, ANALYZE updates the statistics used by the Query Planner. If your stats are stale, PostgreSQL might ignore a fast index and perform a slow sequential scan. Always run ANALYZE after a bulk data load.
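A typical sequence after a bulk load might look like this (a sketch; the CSV path is purely illustrative):
-- Bulk load leaves the planner with stale statistics
COPY orders FROM '/tmp/orders.csv' WITH (FORMAT csv, HEADER true);
-- Refresh statistics so the planner trusts its row estimates again
ANALYZE VERBOSE orders;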
Tuning Autovacuum for High Traffic
The default PostgreSQL settings are often too passive for modern hardware. For example, the default autovacuum_vacuum_scale_factor is 0.2. This means a table with 100 million rows won’t trigger a cleanup until it hits 20 million dead tuples. That is far too much bloat for a production environment.
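The trigger condition is simple arithmetic: dead tuples must exceed autovacuum_vacuum_threshold (50 by default) plus the scale factor times the table’s estimated row count. This sketch shows roughly when each of your tables would fire under the defaults, using n_live_tup as a stand-in for that estimate:
-- Approximate dead-tuple count at which autovacuum kicks in, default settings
SELECT
  relname,
  n_live_tup,
  50 + 0.2 * n_live_tup AS default_vacuum_trigger,
  n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;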
Update your postgresql.conf with these more aggressive values:
# Trigger vacuum when 5% of the table changes
autovacuum_vacuum_scale_factor = 0.05
# Trigger analyze when 2% of the table changes
autovacuum_analyze_scale_factor = 0.02
# Increase the workers if you have many active tables
autovacuum_max_workers = 5
# Give the workers more 'budget' to work faster
autovacuum_vacuum_cost_limit = 1000
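If you would rather not edit the file by hand, the same values can be applied with ALTER SYSTEM, which writes them to postgresql.auto.conf. The scale factors and cost limit take effect after a reload; autovacuum_max_workers only changes after a full server restart.
ALTER SYSTEM SET autovacuum_vacuum_scale_factor = 0.05;
ALTER SYSTEM SET autovacuum_analyze_scale_factor = 0.02;
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 1000;
-- Requires a restart, not just a reload
ALTER SYSTEM SET autovacuum_max_workers = 5;
SELECT pg_reload_conf();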
For specific high-churn tables, like a sessions table, you can set even tighter limits without affecting the whole database:
ALTER TABLE sessions SET (
autovacuum_vacuum_scale_factor = 0.01,
autovacuum_vacuum_cost_limit = 500
);
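You can confirm the per-table overrides stuck by checking reloptions in pg_class:
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'sessions';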
The “Zombie Transaction” Trap
Sometimes Autovacuum runs but fails to reclaim space. This usually happens because of a long-running transaction. PostgreSQL cannot remove a dead tuple if an active transaction started before that row was deleted. A single forgotten BEGIN in a Python script or a hanging BI report can block the cleanup of millions of rows.
Find these blockers with this query:
SELECT pid, now() - xact_start AS duration, query, state
FROM pg_stat_activity
WHERE state != 'idle'
AND (now() - xact_start) > interval '10 minutes';
If a query has been running for hours, kill it. Once the transaction ends, Autovacuum will finally be able to do its job.
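pg_cancel_backend stops the running query but keeps the session; pg_terminate_backend drops the whole connection. A sketch, with 12345 standing in for the pid you found above:
-- Gentle option first: cancel the current query, keep the session alive
SELECT pg_cancel_backend(12345);
-- If the transaction is still holding the cleanup horizon back, drop the connection
SELECT pg_terminate_backend(12345);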
Final Checklist for Production
Database health isn’t a one-time fix. It’s a habit. To keep your PostgreSQL instance lean and fast, follow these rules:
- Monitor n_dead_tup and set alerts when dead tuples exceed 15% of the table size.
- Lower the scale_factor as your data grows. 20% of 1GB is fine; 20% of 1TB is a disaster.
- Avoid VACUUM FULL unless you have a scheduled maintenance window.
- Kill idle transactions that have been open for more than 30 minutes (see the query after this list).
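For that last rule, a sketch of the check I would wire into monitoring:
-- Sessions holding a transaction open with no running query for 30+ minutes
SELECT pid, usename, now() - xact_start AS open_for, left(query, 60) AS last_query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND (now() - xact_start) > interval '30 minutes';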
Understanding MVCC won’t just save your disk space; it will make your queries faster and your application more reliable.

