PostgreSQL Performance Analysis with pg_stat_statements: Find and Fix Slow Queries

The Symptom Nobody Can Explain

Your PostgreSQL database is slow. Response times spike at 2 AM, certain API endpoints time out under load, and your application logs are full of timeout errors — but nobody knows which query is actually causing it.

Guessing which query to optimize wastes days. You need data, not hunches. And yet most teams reach for the wrong tools first.

pg_stat_statements is a built-in PostgreSQL extension that tracks execution statistics for every query running through your database — total calls, total time, mean time, rows returned, cache hit rates. Think of it as a flight recorder for your queries. It’s been shipping with PostgreSQL since version 9.2, and it’s almost certainly already installed on your server.

Approach Comparison: How Teams Usually Debug Slow Databases

Before getting into setup, it’s worth understanding the alternatives — because each has real trade-offs that bite you at different times.

Approach 1: Application-Level Logging

Wrap your database calls in timing code and log anything over a threshold. Most teams start here.

Pros: Easy to add, gives you business context (which user triggered the query), no database changes needed
Cons: Blind to queries from background jobs, cron tasks, migrations, and third-party tools. You only see what your application instrumented. It also adds overhead to your app code.

Approach 2: PostgreSQL Slow Query Log

Set log_min_duration_statement in postgresql.conf to log any query exceeding a time threshold.

Pros: Catches everything, no extension needed, easy to enable
Cons: Logs raw query text with actual parameter values — a privacy and security risk in production. Log volume explodes under load. Gives you individual query instances, not aggregated statistics. It can’t answer “which query runs 50,000 times per day and averages 200ms?”

Approach 3: pg_stat_statements (Recommended)

An official PostgreSQL extension that aggregates statistics per normalized query pattern across all connections, users, and tools.

Pros: Low overhead (~1–5% CPU), catches all queries regardless of source, normalizes parameters (treats WHERE id = 1 and WHERE id = 2 as the same query pattern), gives you aggregated data you can actually prioritize from
Cons: Requires superuser to enable, needs a PostgreSQL restart on first install, doesn’t capture query parameters by default

For ongoing production work, pg_stat_statements wins. The slow query log is handy for one-off debugging sessions; pg_stat_statements gives you the statistical picture to prioritize where to spend your time.

Setting Up pg_stat_statements

Step 1: Enable the Extension

Add it to shared_preload_libraries in your postgresql.conf. A restart is required — that’s the one unavoidable step.

# Find your postgresql.conf
psql -U postgres -c "SHOW config_file;"

# Edit postgresql.conf and add/update this line:
shared_preload_libraries = 'pg_stat_statements'

# Optional but recommended settings:
pg_stat_statements.max = 10000       # Max number of tracked statements
pg_stat_statements.track = all       # Track all statements including nested
pg_stat_statements.track_utility = on  # Track COPY, VACUUM, etc.

# Restart PostgreSQL
sudo systemctl restart postgresql

# On managed services (RDS, Cloud SQL), modify the parameter group
# and apply — usually requires a reboot

Step 2: Create the Extension in Your Database

-- Run as superuser in each database you want to monitor
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Verify it's working
SELECT count(*) FROM pg_stat_statements;

Finding Your Worst Queries

Let the extension collect data for a few hours — ideally 24–48 hours under real traffic. Then run these queries.

Top Queries by Total Time (Your Biggest Bottlenecks)

SELECT
  round(total_exec_time::numeric, 2) AS total_ms,
  calls,
  round(mean_exec_time::numeric, 2) AS mean_ms,
  round(stddev_exec_time::numeric, 2) AS stddev_ms,
  round((100 * total_exec_time /
    sum(total_exec_time) OVER ())::numeric, 2) AS percentage,
  left(query, 120) AS query_preview
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 15;

Total time is what actually matters for database load. A query averaging 5ms that runs 100,000 times per day burns 500 seconds of database time. A 500ms query running once a week barely registers. Optimize by total cost, not peak latency.

Top Queries by Mean Execution Time (Your Slowest Individual Queries)

SELECT
  calls,
  round(mean_exec_time::numeric, 2) AS mean_ms,
  round(max_exec_time::numeric, 2) AS max_ms,
  round(stddev_exec_time::numeric, 2) AS stddev_ms,
  left(query, 120) AS query_preview
FROM pg_stat_statements
WHERE calls > 100  -- Ignore one-off queries
ORDER BY mean_exec_time DESC
LIMIT 15;

Cache Miss Ratio (Buffer Hit Rate)

SELECT
  calls,
  round(mean_exec_time::numeric, 2) AS mean_ms,
  shared_blks_hit,
  shared_blks_read,
  round(100.0 * shared_blks_hit /
    NULLIF(shared_blks_hit + shared_blks_read, 0), 2) AS hit_rate_pct,
  left(query, 100) AS query_preview
FROM pg_stat_statements
WHERE shared_blks_hit + shared_blks_read > 0
ORDER BY shared_blks_read DESC
LIMIT 10;

A hit_rate_pct below 90% on a frequently-called query means it’s hammering disk. Either it needs an index, or your shared_buffers is undersized for your working set. On a server with 32GB RAM, shared_buffers is often still set to the default 128MB — check it.

From Stats to Action: A Practical Optimization Workflow

Step 1: Grab the Full Query Text

The query column in pg_stat_statements is normalized and truncated. To get the full text of a problem query:

SELECT query
FROM pg_stat_statements
WHERE queryid = 1234567890;  -- Use the queryid from your earlier results

Step 2: Run EXPLAIN ANALYZE

Substitute real example parameters and run the query through EXPLAIN ANALYZE. The BUFFERS option is critical — without it you can’t see I/O behavior.

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.id, u.email, o.total
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2025-01-01'
  AND o.status = 'pending';

Three things to look for: Seq Scan on large tables (usually needs an index), a large gap between actual rows and estimated rows (stale statistics — run ANALYZE), and Hash Join nodes spilling to disk.

Step 3: Create Targeted Indexes

-- Composite index matching your WHERE + JOIN conditions
CREATE INDEX CONCURRENTLY idx_orders_user_status
  ON orders (user_id, status)
  WHERE status = 'pending';  -- Partial index if you only query one status

-- After creating, verify the planner uses it
EXPLAIN (ANALYZE, BUFFERS)
SELECT ... -- your query

Always use CONCURRENTLY in production. Without it, the index build takes an ACCESS EXCLUSIVE lock — every read and write blocks until it finishes.

Maintenance: Reset and Track Over Time

After an optimization, reset the stats so you can measure the improvement on a clean baseline:

-- Reset stats for all queries (superuser only)
SELECT pg_stat_statements_reset();

-- Or reset stats for a specific query (PostgreSQL 12+)
SELECT pg_stat_statements_reset(0, 0, queryid)
FROM pg_stat_statements
WHERE query LIKE '%your_table%'
LIMIT 1;

A simple weekly cron job that exports the top 20 queries to a CSV gives you a trend line over time. When you need to convert that CSV to JSON for import into a monitoring dashboard, toolcraft.app/en/tools/data/csv-to-json handles it in the browser — no upload to a third-party server, which matters when the CSV contains query patterns from a production system.

Recommended Production Configuration

Here’s the postgresql.conf setup worth running on any production instance:

# postgresql.conf
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 10000
pg_stat_statements.track = all
pg_stat_statements.track_utility = on
pg_stat_statements.save = on  # Persist stats across restarts

# Slow query log as a complement (not replacement)
log_min_duration_statement = 1000  # Log queries over 1 second
log_line_prefix = '%t [%p]: [%l-1] db=%d,user=%u,app=%a,client=%h '

pg_stat_statements.save = on is easy to overlook. It persists statistics across restarts — which matters when you’re chasing a pattern that built up over several days and then the server gets rebooted.

For dashboards, both pgBadger (log-based) and pganalyze (SaaS) build on top of this same data. AWS RDS Performance Insights does too. None of them are required to get started — the raw SQL queries above cover the 80% case.

Put It to Work

Enable the extension, collect 24–48 hours of real traffic, then run the total-time query. Consistently, 80% of database load comes from 3–5 queries. Fix those first. Everything else is noise until you’ve dealt with the top of the list.

The stats don’t lie. If a query sits at the top of your total_exec_time ranking, that’s where your effort pays off most. Stop guessing, start measuring.