The 2 AM Migration Nightmare
The Slack notifications started at 1:45 AM. Our legacy MySQL 5.7 instance—a beast we had been meaning to decommission for months—finally buckled. CPU usage stayed pinned at 99%, and the connection pool was a graveyard of deadlocked threads. We had planned to migrate to PostgreSQL for its superior concurrency and indexing, but the timeline just shifted from “next month” to “right now.”
Forget mysqldump. Trying to run manual search-and-replace on a 400GB database while sleep-deprived is a fast track to data corruption. You need a tool that understands the semantic gaps between these two engines. That is where pgLoader saved our production environment. It isn’t just a data pipe; it’s a transformation engine that automates schema conversion and handles complex type mapping on the fly.
Quick Start (5 min)
If your schema is clean, you can get data moving with a single command. On Ubuntu or Debian, installation is straightforward:
sudo apt-get install pgloader
Once installed, you can attempt a direct migration by passing the source and target strings. It looks like this:
pgloader mysql://user:password@localhost/source_db \
postgresql:///target_db
This command discovers the schema, builds the tables in Postgres, and streams the data. It works fine for small, simple side projects. However, production databases usually involve complex relationships and the kind of “relaxed” MySQL constraints that make this one-liner fail. To handle real-world messiness, we need to use load files.
Deep Dive: The Load Command File
Precision matters when you’re moving millions of records. When our first attempt crashed on a table with 50 million rows due to an encoding mismatch, I switched to a .load file. This format gives you granular control over casting rules and pre-migration cleanup.
Create a file named migrate.load:
LOAD DATABASE
     FROM mysql://db_user:db_pass@source_host/old_db
     INTO postgresql://pg_user:pg_pass@target_host/new_db

WITH include drop, create tables, create indexes, reset sequences,
     workers = 8, concurrency = 1

CAST type tinyint to boolean drop typemod using tinyint-to-boolean,
     type datetime to timestamptz,
     type double to double precision

BEFORE LOAD DO
$$ drop schema if exists public cascade; $$,
$$ create schema public; $$

AFTER LOAD DO
$$ ALTER TABLE users ADD CONSTRAINT unique_email UNIQUE(email); $$;
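With migrate.load saved, kicking off the migration is a single command. Recent pgLoader releases also support a dry run that parses the file and checks both connections without moving any data, which is worth doing before you commit to the real thing (flags can vary by version, so check pgloader --help):
pgloader --dry-run migrate.load   # parse the file and test both connections
pgloader --verbose migrate.load   # run the migration with progress output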
Pay close attention to the CAST section. MySQL is famously “chill” about data types, but PostgreSQL is strict. For instance, MySQL often uses tinyint(1) for booleans. Without explicit casting, your application will crash the moment it tries to insert a true value into a column Postgres thinks is a small integer.
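After the load finishes, it’s worth spot-checking that the cast actually landed. A quick query like the following (assuming a users table with an is_active flag; substitute your own table and column names) confirms the column arrived as boolean rather than smallint:
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'users' AND column_name = 'is_active';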
During our migration, we also had to deal with legacy auxiliary data in messy CSV files. To prepare these for import, I used toolcraft.app/en/tools/data/csv-to-json. It runs entirely in the browser, meaning our sensitive production snippets never left the local machine. I rely on ToolCraft for these quick formatting tasks because it’s private, fast, and doesn’t require a login.
Handling the “Zero Date” Nightmare
One specific error was the bane of my existence at 3 AM: MySQL’s 0000-00-00 00:00:00 timestamp. PostgreSQL rightfully rejects this as an invalid date. If your column is marked NOT NULL, the entire migration will halt. You can’t just ignore it.
The cleanest fix is to handle the transformation directly in your pgLoader casting rules:
CAST type datetime when default "0000-00-00 00:00:00" to timestamptz drop default drop not null using zero-dates-to-null
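Alternatively, you can scrub the zero dates at the source so pgLoader never sees them. A rough sketch against a hypothetical orders table with a created_at column (the column must be nullable, and your MySQL sql_mode has to tolerate referencing zero dates in the WHERE clause):
-- Null out zero dates on the MySQL side before migrating (hypothetical names)
UPDATE orders
SET created_at = NULL
WHERE created_at = '0000-00-00 00:00:00';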
If you’re dealing with massive tables, use the INCLUDING ONLY TABLE NAMES MATCHING clause to migrate in chunks. This prevents a failure in a non-critical logging table from rolling back the migration of your core users or transactions tables.
INCLUDING ONLY TABLE NAMES MATCHING 'users', 'orders', 'products'
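In the load file itself, this clause sits after the CAST rules and before the BEFORE LOAD block, at least in the pgLoader versions I have used; double-check the command reference for yours. A trimmed sketch of that part of migrate.load:
CAST type tinyint to boolean drop typemod using tinyint-to-boolean,
     type datetime to timestamptz

INCLUDING ONLY TABLE NAMES MATCHING 'users', 'orders', 'products'

BEFORE LOAD DO
$$ drop schema if exists public cascade; $$,
$$ create schema public; $$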
While debugging these transforms, I used https://toolcraft.app/en/tools/developer/json-formatter to verify our API payloads. I needed to ensure the new Postgres structure hadn’t altered the frontend’s expected JSON format. Since it’s client-side, I didn’t have to worry about API keys or customer data hitting a third-party server.
Practical Tips for the Finish Line
After three hours of trial and error, I built a checklist that saved our morning:
- Run ANALYZE: MySQL and Postgres handle statistics differently. Immediately after the load, run ANALYZE; in Postgres. This refreshes the statistics the query planner relies on, so your joins don’t take forever.
- Verify Sequences: Ensure your auto-incrementing IDs are synced. Run SELECT last_value FROM your_table_id_seq; to make sure the next insert doesn’t trigger a primary key violation; if a sequence is behind, resync it as shown in the snippet after this list.
- Standardize Encoding: If your MySQL data is stuck in latin1, convert it to utf8mb4 before you start. pgLoader can transcode, but it’s much safer to start with a clean UTF-8 source.
- Modernize with UUIDs: If you’re updating your schema during the move, consider switching to UUIDs for primary keys. I used https://toolcraft.app/en/tools/developer/uuid-generator to quickly generate test IDs for our staging environment.
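If a sequence does come up short, setval brings it back in line with the data. A minimal sketch for a hypothetical users table with an integer id primary key (swap in your own table and sequence names):
-- Point the sequence at the current max id so the next insert gets max + 1
SELECT setval('users_id_seq', (SELECT COALESCE(MAX(id), 1) FROM users));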
By 5 AM, the data was synced and the app was live on Postgres. Our API latency dropped by 30%, and the query times for our heaviest joins fell from 800ms to a crisp 45ms. Using pgLoader allowed us to automate the tedious parts while we focused on the edge cases. If you’re facing a similar move, invest the time in a proper .load file—it’s the difference between a successful cutover and a very long night of manual SQL fixes.

