Choosing the Right Home for Your Vectors
Walk into any developer meetup today, and you’ll hear the same debate: which vector database should we use for RAG? Whether you’re building a recommendation engine or a document bot, the first hurdle is always where to store your embeddings.
Names like Pinecone, Milvus, or Qdrant lead the conversation. They’re powerful, specialized tools, but they come with an “architectural tax”—a new query language to learn, another system to monitor, and the constant friction of keeping your relational data in sync with your vector store.
I’m a big fan of the “boring technology” philosophy. If Postgres can do it, Postgres should do it. This is where pgvector shines. It’s an open-source extension that turns your standard Postgres instance into a vector engine. Instead of shipping data across the network to a third-party service, you keep your metadata and embeddings in one place. This simplifies your stack. You can join similarity search results with actual user tables or product catalogs in a single SQL query.
Approach Comparison: Specialized vs. pgvector
Let’s be clear: specialized vector databases are built for Approximate Nearest Neighbor (ANN) search at a massive scale. If you’re indexing 500 million vectors and need sub-5ms latency under heavy load, they are worth the complexity. However, for 95% of production apps—those handling fewer than 10 million vectors—pgvector is more than enough.
When you go the specialized route, you often create “data silos.” You store a vector in Pinecone and a corresponding ID in Postgres. To show a user a product name, you first query Pinecone for the ID, then hit Postgres for the details. This two-step dance adds latency. With pgvector, one query handles everything. You also get the peace of mind that comes with ACID compliance, point-in-time recovery, and decades of battle-tested backup tools.
The Trade-offs: Pros and Cons
Before you commit to pgvector, you need to understand the practicalities of running it. I’ve deployed it in several LLM-powered projects, and here is how the balance sheet looks.
The Pros
- Native Joins: Your vectors live in the same rows as your relational data. There’s no syncing logic to break.
- SQL Familiarity: If you can write a SELECT statement, you’re 80% there. Finding similar vectors looks just like an ORDER BY clause.
- Ecosystem Ready: Prisma, Drizzle, and SQLAlchemy already support it. Your existing observability stack won’t need an overhaul.
- Power Filtering: Filtering by a user_id or a timestamp before a vector search is significantly faster in Postgres than in most specialized stores.
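To make the filtering point concrete, here is a minimal pure-Python sketch of the filter-then-rank pattern. No database is involved; the user_id values and two-dimensional vectors are made up for illustration. In Postgres, the WHERE clause would prune rows via a B-tree index before pgvector computes any distances.

```python
import math

# Toy corpus: (user_id, embedding) pairs.
rows = [
    (1, [0.1, 0.9]),
    (1, [0.8, 0.2]),
    (2, [0.5, 0.5]),
    (2, [0.9, 0.1]),
]

def l2(a, b):
    """Euclidean distance, the same metric as pgvector's <-> operator."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

query = [0.7, 0.3]

# Filter first (cheap), then rank only the survivors (the expensive part).
candidates = [(uid, v) for uid, v in rows if uid == 1]
ranked = sorted(candidates, key=lambda r: l2(r[1], query))
print(ranked[0])  # the nearest vector belonging to user 1
```

The same shape in SQL is simply `WHERE user_id = 1 ORDER BY embedding <-> $1 LIMIT 5`: the cheap predicate shrinks the candidate set before any distance math runs.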
The Cons
- CPU Demands: Distance calculations are math-heavy. On a busy DB, vector searches can spike CPU usage to 80% or higher if not indexed properly.
- RAM Requirements: High-performance indexes like HNSW need to stay in memory. For 1 million vectors at 1536 dimensions, the float32 data alone is roughly 6GB, so budget several gigabytes of RAM just for the index.
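A back-of-envelope sizing check helps here. This sketch assumes float32 storage (4 bytes per dimension) and pgvector’s default HNSW m of 16; the neighbor-link term is an approximation, not an exact pgvector internal.

```python
# Rough memory estimate for an HNSW index over float32 vectors.
n_vectors = 1_000_000
dims = 1536
m = 16  # pgvector's default HNSW "m" parameter

vector_bytes = n_vectors * dims * 4          # full vectors live in the index
link_bytes = n_vectors * 2 * m * 4           # approximate neighbor links
total_gb = (vector_bytes + link_bytes) / 1024**3
print(f"~{total_gb:.1f} GB")                 # prints ~5.8 GB, before page overhead
```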
- Build Times: Unlike a B-tree index, a vector index on 5 million rows can take 30 minutes or more to build.
Getting Started
Getting this running doesn’t require a complex setup. If you’re using Docker, use the official image to ensure all C extensions are compiled correctly.
# Spin up a pgvector container (pgvector/pgvector is the current official image)
docker run --name pgvector-demo -e POSTGRES_PASSWORD=mysecretpassword -p 5432:5432 -d pgvector/pgvector:pg16
Managed services like AWS RDS, Google Cloud SQL, and Azure Database for PostgreSQL all support pgvector now; enabling it is usually a single CREATE EXTENSION away. For local macOS development, brew install pgvector gets you moving in seconds.
A Note on Data Preparation
Your raw data often starts in messy CSVs or Excel sheets. Before generating embeddings, I often need to convert these into clean JSON structures. I use toolcraft.app/en/tools/data/csv-to-json for this. It runs entirely in your browser, which is vital when you’re handling sensitive customer data that shouldn’t leave your machine.
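If you’d rather script the conversion, the same CSV-to-JSON step is a few lines of standard-library Python and equally stays on your machine. The column names and rows here are invented for illustration.

```python
import csv
import io
import json

# A small in-memory CSV standing in for a real export file.
raw = "title,body\nIntro,Postgres can store vectors\nIndexing,HNSW is fast"

# DictReader turns each row into a {column: value} mapping.
rows = list(csv.DictReader(io.StringIO(raw)))
payload = json.dumps(rows, indent=2)
print(payload)  # clean JSON, ready for an embedding pipeline
```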
Implementation Guide
Let’s look at a practical implementation for a documentation search bot.
1. Enable the Extension
Run this once on your database to unlock vector support.
CREATE EXTENSION IF NOT EXISTS vector;
2. Define Your Schema
You must specify the dimensions. OpenAI’s text-embedding-3-small outputs 1536 dimensions. If you use the large model, that jumps to 3072.
CREATE TABLE documents (
id SERIAL PRIMARY KEY,
content TEXT,
metadata JSONB,
embedding vector(1536)
);
3. Inserting Data
Vectors are essentially arrays of floats. In standard SQL, you pass them as a string.
INSERT INTO documents (content, embedding)
VALUES ('Postgres is an excellent vector store.', '[0.012, -0.023, 0.45, ...]');
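If you’re building these statements from application code, a small helper can produce that literal and guard against dimension mismatches (for example, sending a 3072-dim embedding to a vector(1536) column). This is a sketch; in production the pgvector-python package, which registers driver adapters, is the more robust path.

```python
# Format a Python list of floats as a pgvector text literal.
embedding = [0.012, -0.023, 0.45]

def to_vector_literal(values, expected_dims=None):
    """Render a float list as '[v1,v2,...]', optionally enforcing a dimension."""
    if expected_dims is not None and len(values) != expected_dims:
        raise ValueError(f"expected {expected_dims} dims, got {len(values)}")
    return "[" + ",".join(str(v) for v in values) + "]"

lit = to_vector_literal(embedding)
print(lit)  # [0.012,-0.023,0.45]

# Passed as an ordinary parameter to a parameterized query:
sql = "INSERT INTO documents (content, embedding) VALUES (%s, %s)"
params = ("Postgres is an excellent vector store.", lit)
```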
4. The Similarity Search
This is the core feature. pgvector uses specific operators to calculate distance:
- <->: Euclidean distance (L2)
- <=>: Cosine distance (the industry standard for text)
-- Find the 5 most relevant docs
SELECT content, 1 - (embedding <=> '[0.015, -0.021, 0.48, ...]') AS similarity
FROM documents
ORDER BY embedding <=> '[0.015, -0.021, 0.48, ...]'
LIMIT 5;
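As a sanity check on the operator semantics: <=> returns cosine distance, which is 1 minus cosine similarity. That is why the query computes 1 - (embedding <=> query) to report similarity. A pure-Python mirror of the math:

```python
import math

def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity), mirroring pgvector's <=>."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return 1 - dot / norm

assert cosine_distance([1.0, 0.0], [1.0, 0.0]) == 0.0   # identical direction
assert cosine_distance([1.0, 0.0], [0.0, 1.0]) == 1.0   # orthogonal
assert cosine_distance([1.0, 0.0], [-1.0, 0.0]) == 2.0  # opposite direction
```

Distance ranges from 0 (identical direction) to 2 (opposite), so similarity = 1 - distance lands in the familiar -1 to 1 range.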
5. Performance Tuning
Without an index, Postgres performs a “sequential scan.” This is fine for 5,000 rows, but it will crawl once you hit 100,000. You have two main options:
IVFFlat
This clusters vectors into lists. It’s memory-efficient but requires a “training” set of data to build the clusters accurately. Use this if your RAM budget is tight.
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
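Choosing lists is the main tuning knob. The pgvector README suggests starting from rows / 1000 up to about a million rows, and sqrt(rows) beyond that; here is that heuristic as a quick helper:

```python
import math

def suggest_lists(row_count):
    """Starting-point heuristic for IVFFlat's "lists" parameter
    (rows/1000 up to 1M rows, sqrt(rows) above that)."""
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return round(math.sqrt(row_count))

print(suggest_lists(100_000))    # 100, matching the index above
print(suggest_lists(5_000_000))  # 2236
```

At query time, raising the ivfflat.probes setting (default 1) trades speed for recall by scanning more lists per search.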
HNSW (Hierarchical Navigable Small Worlds)
HNSW is the gold standard. It builds a graph structure that allows for incredibly fast lookups. In my testing on a standard RDS instance, HNSW kept query times under 20ms for a million-row table. It uses more RAM, but the speed is worth it.
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
Final Thoughts
Choosing pgvector is usually the smartest move for teams already comfortable with the Postgres ecosystem. It keeps your architecture lean and your deployment pipeline simple. While it won’t replace a specialized database for every edge case, its performance is more than enough for the vast majority of AI apps being built today.
My advice? Start with pgvector. The wall you think you’ll hit is much further away than you realize. Keep it simple, and focus on your product instead of managing another database.

