Six Months of PostgreSQL Search in Production
Six months ago, I hit a wall. A client’s application had scaled to the point where basic LIKE '%keyword%' queries were taking 5 to 8 seconds to return results—essentially crawling. The dev team’s gut reaction was predictable: “Let’s just spin up an Elasticsearch cluster.”
Elasticsearch is powerful, but it’s also a beast to manage. It brings a suitcase full of overhead: extra servers, complex data synchronization (CDC) logic, and a significantly higher cloud bill. Instead of jumping into that fire, I decided to see how far I could push PostgreSQL’s native Full-Text Search (FTS).
Fast forward half a year. We are running this setup in production with 1.2 million rows of content, and search queries still clock in under 100ms. For most SaaS products, Elasticsearch is overkill. Here is exactly how I built a production-grade search system inside the database we already owned.
The 5-Minute Setup
To get a functional search engine running, you only need two concepts: tsvector (the searchable document) and tsquery (the search intent). Here is a robust implementation for a standard blog posts table.
-- 1. Create a table with an automated search column
-- (Requires PostgreSQL 12+ for generated columns)
CREATE TABLE posts (
  id SERIAL PRIMARY KEY,
  title TEXT,
  body TEXT,
  search_vector tsvector GENERATED ALWAYS AS (
    to_tsvector('english', coalesce(title, '') || ' ' || coalesce(body, ''))
  ) STORED
);
-- 2. Insert some realistic data
INSERT INTO posts (title, body) VALUES
('Optimizing PostgreSQL', 'Using GIN indexes can cut query times from seconds to milliseconds.'),
('Modern Web Apps', 'Fast apps require efficient data fetching and minimal latency.'),
('Database Design', 'How to structure your tables for maximum scale without extra infra.');
-- 3. Create a GIN index for high-speed lookups
CREATE INDEX idx_posts_search ON posts USING GIN (search_vector);
Now you can query this using the @@ operator. This query finds posts containing ‘database’ or ‘optimizing’ instantly:
SELECT title
FROM posts
WHERE search_vector @@ to_tsquery('english', 'database | optimizing');
How it Works Under the Hood
Think of standard string matching as a literalist; it looks for exact characters. PostgreSQL’s FTS is smarter—it looks for lexemes. When you run to_tsvector('english', 'running'), Postgres reduces it to the root word ‘run’. This process, called stemming, ensures that a search for ‘runs’ or ‘running’ matches your document perfectly.
The tsvector type stores these normalized tokens along with their positions. Meanwhile, tsquery represents what the user is looking for, supporting boolean logic like & (AND), | (OR), and ! (NOT).
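You can watch both halves of this in psql. A small sketch (the exact lexeme positions depend on your dictionary configuration, but the stemming behavior is standard for the 'english' config):

```sql
-- Stemming: 'running' is reduced to the lexeme 'run'
SELECT to_tsvector('english', 'running');

-- Boolean logic: match documents containing 'index' AND NOT 'elastic'
-- ('are' is dropped as a stopword; 'indexes' stems to 'index')
SELECT to_tsvector('english', 'GIN indexes are fast')
       @@ to_tsquery('english', 'index & !elastic');
```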
The Magic of GIN Indexes
B-Tree indexes—the default in most databases—are useless for full-text search because they can only match a string from its beginning (think LIKE 'prefix%'). A GIN (Generalized Inverted Index) works like the index at the back of a 500-page textbook: it maps every unique word to a list of row IDs where it appears. When a user searches for ‘PostgreSQL’, the index hands the database the exact row locations immediately, bypassing a slow full table scan.
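You can verify the index is actually being used by inspecting the plan. Note that on a tiny three-row table the planner may still prefer a sequential scan, so this check matters most at realistic data volumes:

```sql
EXPLAIN ANALYZE
SELECT title
FROM posts
WHERE search_vector @@ to_tsquery('english', 'postgresql');
-- At scale, the plan should show a Bitmap Index Scan on
-- idx_posts_search rather than a Seq Scan over the whole table.
```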
Advanced Features: Scoring and Logic
Users don’t just want results; they want the *best* results first. Postgres scores matches with ts_rank, and you can assign weights to different columns—making a match in the title significantly more valuable than one in the body.
-- Query with ranking and title-weighting
SELECT title, ts_rank(search_vector, query) as rank
FROM posts, to_tsquery('english', 'database') query
WHERE search_vector @@ query
ORDER BY rank DESC;
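To add the title-weighting mentioned above, combine setweight() calls with the concatenation operator. This sketch computes the weighted vector on the fly for clarity; in production you would bake the setweight() calls into the generated column definition instead:

```sql
-- 'A' is the highest weight, 'D' the lowest
SELECT title,
       ts_rank(
         setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
         setweight(to_tsvector('english', coalesce(body, '')),  'D'),
         query
       ) AS rank
FROM posts, to_tsquery('english', 'database') AS query
WHERE search_vector @@ query
ORDER BY rank DESC;
```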
Handling Legacy Data and Imports
If you’re on Postgres 11 or older, you can’t use generated columns. For those setups, I reach for a trigger to keep the tsvector column in sync. It ensures the index stays updated without my application code ever knowing it exists.
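A minimal sketch of that trigger approach using the built-in tsvector_update_trigger helper, assuming the same posts schema but with search_vector added as a plain column rather than a generated one:

```sql
-- Pre-12 setup: a plain column kept in sync by a trigger
ALTER TABLE posts ADD COLUMN search_vector tsvector;

-- tsvector_update_trigger(column, config, text columns...)
-- treats NULLs in title/body as empty strings
CREATE TRIGGER posts_search_update
BEFORE INSERT OR UPDATE ON posts
FOR EACH ROW EXECUTE PROCEDURE
  tsvector_update_trigger(search_vector, 'pg_catalog.english', title, body);

-- Backfill existing rows (the no-op update fires the trigger per row)
UPDATE posts SET title = title;
```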
When I was migrating 450MB of legacy CSV data for this project, I needed a quick way to prep seeding scripts. I used toolcraft.app/en/tools/data/csv-to-json because it runs entirely in the browser. No data ever left my machine, which kept the security team happy while I prepped the initial search vectors.
Hard-Won Production Tips
After monitoring our system for 180 days, here are my three essential takeaways for keeping performance predictable:
- Weighting is Mandatory: A match in a 5-word title should always outrank a match in a 1,000-word body. Use setweight() to prioritize your ‘A’ (Title) and ‘B’ (Summary) fields.
- Websearch Query: Don’t force users to learn & and | syntax. Use websearch_to_tsquery. It handles “Google-style” inputs like quotes for exact phrases and minus signs for exclusions out of the box.
- Maintain Your GIN: GIN indexes can bloat if your table sees 50k+ writes per day. Keep autovacuum aggressive. I schedule a weekly VACUUM ANALYZE on search-heavy tables to keep query planners accurate.
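As an illustration of the websearch point, here is a sketch against the same posts table (websearch_to_tsquery requires PostgreSQL 11+):

```sql
-- Quoted phrase becomes a phrase match; the minus sign excludes a term
SELECT title
FROM posts
WHERE search_vector @@
      websearch_to_tsquery('english', '"query times" -latency');
```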
When Should You Actually Move to Elasticsearch?
PostgreSQL is a beast—until it hits a ceiling. If you need advanced fuzzy matching beyond basic trigrams, or if you are searching billions of rows with sub-millisecond latency requirements, Elasticsearch is worth the headache. But for the vast majority of SaaS products, blogs, and internal tools, Postgres is a performance powerhouse that lives right inside your existing stack.

