Stop Guessing, Start Measuring: Stress Testing MySQL and PostgreSQL with sysbench

Database tutorial - IT technology blog

The Database Scaling Blind Spot

In my experience, production meltdowns rarely stem from a simple logic error. Usually, it’s a database that buckled under pressure. A query that flies in 5ms during development can easily balloon to 2,000ms when 200 concurrent users hit the same endpoint. Many teams treat database capacity as a guessing game. They scale up to a larger instance because the server “feels” slow, rather than relying on hard data.

The real culprit is a lack of a baseline. Without a stress test, you can’t tell if your bottleneck is disk I/O, CPU contention, or a misconfigured buffer pool. Benchmarking changes the game. It lets you push your database to the breaking point in a controlled environment so you can pinpoint exactly where the ceiling sits.

Choosing Your Benchmarking Strategy

Before you fire off any commands, keep in mind that not all benchmarks serve the same purpose. I usually categorize them into three distinct approaches:

1. Synthetic Benchmarks (sysbench)

This method uses standardized scripts to simulate OLTP (Online Transactional Processing) workloads. It is the gold standard for comparing hardware—like testing if a move from AWS gp2 to gp3 volumes is worth the cost—or for fine-tuning configuration parameters in a clean environment.

2. Application-Level Stress Testing (K6/JMeter)

These tools test your entire stack, from the API layer down to the ORM. While this mirrors the actual user experience, isolating database performance is difficult. Network overhead and application logic often muddy the results.

3. Traffic Replay

This involves capturing real production queries and replaying them against a staging environment. It is the most accurate method available, but the setup complexity is high and often requires specialized tooling.

I reach for sysbench in 90% of scenarios. Whether I’m validating a new RDS instance or a custom PostgreSQL config, it provides the raw, scriptable metrics needed to make an objective decision.

Pros and Cons of sysbench

  • Pros: You can set it up in under two minutes. It supports MySQL, PostgreSQL, and even Oracle. Most importantly, it reports latency percentiles (the 95th by default, configurable with --percentile) that averages often hide.
  • Cons: The default scripts are generic. It won’t simulate your specific 12-table joins or that one weird recursive CTE your app relies on.

The Ideal Test Environment

Never run a stress test against a production database. You will starve your users of resources. Instead, spin up an isolated instance that mirrors your production hardware exactly. If you use AWS, match the instance class (e.g., db.m5.large) and the EBS volume settings (e.g., 3,000 provisioned IOPS).

Data management is also key. When I need to transform CSV datasets into JSON for configuration mappings during these setups, I use toolcraft.app/en/tools/data/csv-to-json. It processes everything in the browser, so no sensitive configuration data ever leaves your machine.

Implementation Guide: Stress Testing MySQL

Install sysbench on a dedicated test runner machine. Do not run it on the database server itself, as the tool can consume several CPU cores just generating the load.

# Ubuntu/Debian
sudo apt-get update
sudo apt-get install -y sysbench

# Confirm the install worked
sysbench --version

Step 1: Prepare the Data

First, generate a dataset large enough to challenge your hardware. I recommend creating at least 10 tables with 1 million rows each. This ensures we are testing actual disk I/O rather than just hitting the RAM cache.

sysbench oltp_read_write --db-driver=mysql \
  --mysql-host=your-db-host \
  --mysql-user=admin \
  --mysql-password=yourpassword \
  --mysql-db=test_db \
  --tables=10 \
  --table-size=1000000 \
  prepare
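Once you are done with a dataset (or want to regenerate it from scratch), sysbench also ships a cleanup phase that drops the sbtest tables it created. Using the same placeholder credentials as above:

```shell
# Drop the sbtest tables created by the prepare phase
sysbench oltp_read_write --db-driver=mysql \
  --mysql-host=your-db-host \
  --mysql-user=admin \
  --mysql-password=yourpassword \
  --mysql-db=test_db \
  --tables=10 \
  cleanup
```

Pass the same --tables value you used for prepare so every generated table gets dropped.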

Step 2: Run the Benchmark

This command executes a read/write test for 5 minutes using 16 threads. It simulates 16 concurrent users slamming the database with transactions as fast as possible. Note that --tables and --table-size must match the prepare phase; if you omit them, sysbench falls back to its defaults (a single table of 10,000 rows) and barely touches the data you just generated.

sysbench oltp_read_write --db-driver=mysql \
  --mysql-host=your-db-host \
  --mysql-user=admin \
  --mysql-password=yourpassword \
  --mysql-db=test_db \
  --tables=10 \
  --table-size=1000000 \
  --threads=16 \
  --time=300 \
  --report-interval=10 \
  run
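While the MySQL test runs, it helps to watch server-side concurrency from a second terminal. A minimal monitoring sketch (host and credentials are the same placeholders as above) that samples the Threads_running status variable every few seconds:

```shell
# Poll MySQL thread activity every 5 seconds during the benchmark
while true; do
  mysql -h your-db-host -u admin -pyourpassword \
    -e "SHOW GLOBAL STATUS LIKE 'Threads_running';"
  sleep 5
done
```

If Threads_running consistently exceeds your CPU core count, the server is queuing work rather than executing it.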

Implementation Guide: Stress Testing PostgreSQL

PostgreSQL testing follows a similar flow, though the connection parameters differ. If you are building from source, ensure libpq-dev is installed to enable the Postgres driver.

# Prepare phase for PostgreSQL
sysbench oltp_read_write --db-driver=pgsql \
  --pgsql-host=your-db-host \
  --pgsql-user=postgres \
  --pgsql-password=yourpassword \
  --pgsql-db=test_db \
  --pgsql-port=5432 \
  --tables=10 \
  --table-size=1000000 \
  prepare

While the test runs, I suggest opening a second terminal to monitor pg_stat_activity. If many active sessions show non-null wait_event values, you've likely hit lock contention or an I/O bottleneck.
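A quick way to check that from the shell (connection details are placeholders):

```shell
# Group active sessions by wait event to spot contention hot spots
psql -h your-db-host -U postgres -d test_db -c \
  "SELECT wait_event_type, wait_event, count(*)
     FROM pg_stat_activity
    WHERE state = 'active'
    GROUP BY 1, 2
    ORDER BY 3 DESC;"
```

Rows dominated by a Lock wait_event_type point to contention; IO types point to storage.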

# Run phase for PostgreSQL with 32 concurrent threads
sysbench oltp_read_write --db-driver=pgsql \
  --pgsql-host=your-db-host \
  --pgsql-user=postgres \
  --pgsql-password=yourpassword \
  --pgsql-db=test_db \
  --pgsql-port=5432 \
  --tables=10 \
  --table-size=1000000 \
  --threads=32 \
  --time=300 \
  --report-interval=10 \
  run

Interpreting the Results

When the run finishes, ignore the fluff and focus on these three metrics:

  1. Transactions Per Second (TPS): This measures your total throughput. If you double your threads but the TPS stays flat, you’ve hit a hard hardware limit.
  2. 95th Percentile Latency: This is the most critical number. It means 95% of your queries finished within this time. If your average latency is 20ms but your 95th percentile is 500ms, your users will experience frequent, frustrating stutters.
  3. Errors and Reconnects: A non-zero number here suggests your database is dropping connections or failing to handle row-level locks under load.

For example, if your 95th percentile latency jumps from 30ms to 400ms when moving from 16 to 32 threads, your current config cannot safely handle that level of concurrency.
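To find that concurrency knee systematically, I sweep the thread count and grep the summary lines. A rough sketch, reusing the placeholder credentials from earlier (sysbench's exact summary wording can vary slightly between versions):

```shell
# Sweep concurrency levels and extract throughput and tail latency
for t in 1 2 4 8 16 32 64; do
  echo "=== threads: $t ==="
  sysbench oltp_read_write --db-driver=mysql \
    --mysql-host=your-db-host --mysql-user=admin \
    --mysql-password=yourpassword --mysql-db=test_db \
    --tables=10 --table-size=1000000 \
    --threads=$t --time=60 run \
    | grep -E "transactions:|95th percentile:"
done
```

Plot TPS against threads: the point where throughput flattens while the 95th percentile keeps climbing is your safe concurrency ceiling.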

Pro-Tips for Accurate Benchmarks

I’ve learned a few hard lessons over the years. First, always “warm up” the engine. The first run is usually slow because the buffer pool is empty. Run a 60-second test to prime the cache, then run your real 10-minute test. Second, keep a close eye on your “burst balance” if using cloud storage like AWS gp2. You might see amazing performance for the first 5 minutes, only to see it crater once your IOPS credits vanish.
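On recent sysbench builds (1.0+), that warm-up can be folded into the run itself with --warmup-time, which executes the workload for the given number of seconds before statistics collection begins. A sketch with the same placeholder connection details:

```shell
# Warm the buffer pool for 60 seconds, then measure for 10 minutes
sysbench oltp_read_write --db-driver=mysql \
  --mysql-host=your-db-host --mysql-user=admin \
  --mysql-password=yourpassword --mysql-db=test_db \
  --tables=10 --table-size=1000000 \
  --threads=16 --warmup-time=60 --time=600 \
  run
```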

Finally, watch your data size. If your dataset fits entirely in the innodb_buffer_pool_size (MySQL) or shared_buffers (PostgreSQL), you are only testing RAM speed. To truly test your disks, make sure the dataset is at least 2x larger than the available system memory.
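A back-of-envelope check helps here. The default sbtest schema works out to roughly 200 bytes per row before indexes (an estimate, not an exact figure), so you can sanity-check the dataset against system memory like this:

```shell
# Rough dataset-vs-RAM sanity check (Linux; ~200 bytes/row is an estimate)
TABLES=10
ROWS=1000000
BYTES_PER_ROW=200
DATASET_MB=$(( TABLES * ROWS * BYTES_PER_ROW / 1024 / 1024 ))
RAM_MB=$(awk '/MemTotal/ {printf "%d", $2/1024}' /proc/meminfo)
echo "dataset: ~${DATASET_MB} MB, RAM: ${RAM_MB} MB"
if [ "$DATASET_MB" -lt $(( RAM_MB * 2 )) ]; then
  echo "dataset may fit in cache; consider raising --tables or --table-size"
fi
```

With the values above, 10 tables of 1 million rows lands around 1.9 GB, so on a 16 GB host you are almost entirely cache-bound.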

Benchmarking isn’t a “one and done” task. I run these tests before every major architecture shift or cloud migration. It provides the mathematical proof that our infrastructure can actually handle the traffic we’re projecting.
