When Postgres Is Enough (Longer Than You Think)
Architecture Patterns — Part 4 of 30
The Migration That Didn't Need to Happen
It was 2 AM on a Tuesday when Marcus called me. He was the lead architect at a mid-sized e-commerce company, and they had just finished a six-month migration from PostgreSQL to MongoDB. Their reasoning had seemed solid at the time: product catalogs have flexible schemas, NoSQL scales horizontally, everyone says document databases are better for this use case.
The migration cost roughly $400,000 in engineering time. And now, at 2 AM, they were dealing with their third data consistency incident in as many weeks.
"We can't do joins," he told me. "Every time we need to show a product with its category, its inventory count, and its active promotions, we're making four separate queries from the application layer. The app code is a nightmare."
I asked him: what was wrong with Postgres before the migration?
Long pause.
"It was... slow on a few queries. We had some GIN indexes that weren't tuned right."
Six months. $400,000. A 2 AM phone call. For some untuned GIN indexes.
This is the story I tell every architect who comes to me saying they need to "graduate" from PostgreSQL.
Why Postgres Gets Underestimated
PostgreSQL has an image problem. Developers learn it as "the SQL database," associate it with rigid schemas, and assume it tops out somewhere around "serious startup scale." Then they read a blog post from 2016 about horizontal scaling and start eyeing DynamoDB or MongoDB.
The reality in 2025-2026 is radically different.
Andy Pavlo, the database researcher at Carnegie Mellon who tracks the industry as closely as anyone alive, concluded in his 2025 database retrospective that "most of the database energy and activity is going into PostgreSQL companies, offerings, projects, and derivative systems." In 2025 alone:
- Databricks paid $1 billion to acquire Neon, a serverless PostgreSQL company
- Snowflake paid $250 million to acquire CrunchyData, another PostgreSQL DBaaS
- Microsoft launched HorizonDB, its new PostgreSQL cloud service
- Every major cloud vendor — AWS, Google, Azure, IBM, Oracle — now has a serious dedicated PostgreSQL offering
The hottest data companies in the world are betting billions on Postgres. What do they know that your team doesn't?
The Decision Framework: When to Stay vs. When to Leave
Before we get into capabilities, let me give you the framework I use when evaluating whether Postgres is the right call. I call it the ACIDS test (yes, I know, but the acronym works):
A — Access patterns: Are your queries relational? Do you need joins across entities? Postgres wins.
C — Consistency requirements: Do you need ACID transactions? Do you need to avoid dual-write problems? Postgres wins.
I — Infrastructure complexity budget: How much operational overhead can your team absorb? Every database you add is another runbook, another oncall rotation, another failure mode.
D — Data shape: Is your data truly schemaless, or just "I haven't designed the schema yet"? Most "schemaless" workloads are actually just underdesigned schemas.
S — Scale ceiling: What is your actual scale? Not your imagined scale in 18 months — your current scale, right now. PostgreSQL handles millions of rows with sub-millisecond queries when indexed properly. Billions of rows with table partitioning. Hundreds of thousands of writes per second with connection pooling via PgBouncer and careful tuning.
If you fail more than one of these, that's when you start seriously evaluating alternatives. Most teams fail zero or one.
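For teams that like their checklists executable, the ACIDS test can be sketched as a tiny Python helper. Everything here is illustrative: the class and field names are invented for this sketch, and only the "fail more than one" rule comes from the framework above.

```python
from dataclasses import dataclass

@dataclass
class AcidsAssessment:
    """Hypothetical inputs for the ACIDS test; all names are illustrative."""
    relational_queries: bool     # A: queries need joins across entities
    needs_transactions: bool     # C: ACID consistency, no dual-write tolerance
    low_ops_budget: bool         # I: little room for extra operational load
    schema_is_designable: bool   # D: "schemaless" is really an undesigned schema
    within_scale_ceiling: bool   # S: current scale fits a tuned Postgres

def failed_checks(a: AcidsAssessment) -> int:
    """Each False is a check where Postgres does not clearly win."""
    return [
        a.relational_queries,
        a.needs_transactions,
        a.low_ops_budget,
        a.schema_is_designable,
        a.within_scale_ceiling,
    ].count(False)

def should_evaluate_alternatives(a: AcidsAssessment) -> bool:
    """The framework's rule: more than one failed check."""
    return failed_checks(a) > 1
```

Run it against your actual system, not your imagined one in 18 months.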
Five Things Postgres Does That You Probably Think You Need Another Tool For
1. Document Storage (Your MongoDB Use Case)
The moment your product manager says "flexible schema," engineers reach for MongoDB. Stop.
PostgreSQL's JSONB column type gives you indexed, queryable, binary-encoded JSON storage with full GIN index support. You can store arbitrary JSON documents alongside relational data, query nested fields, and still run joins against other tables.
-- Create a table with hybrid relational + document storage
CREATE TABLE products (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name TEXT NOT NULL,
    category_id INT REFERENCES categories(id),
    price NUMERIC(10,2) NOT NULL,
    attributes JSONB,                      -- flexible, product-specific fields
    created_at TIMESTAMPTZ DEFAULT now()
);
-- GIN index for fast JSONB queries
CREATE INDEX idx_products_attributes ON products USING GIN (attributes);
-- Query by nested JSON field — fully indexed
SELECT name, price, attributes->>'color'
FROM products
WHERE attributes @> '{"color": "red", "size": "M"}';
-- Combine relational + document query in one shot
SELECT p.name, p.price, c.name AS category, p.attributes->>'brand'
FROM products p
JOIN categories c ON c.id = p.category_id
WHERE p.attributes->>'brand' = 'Nike'
AND p.price < 100;
You get your flexible schema. You keep your joins. You don't need MongoDB.
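To make the `@>` operator concrete: containment asks whether the stored document includes every key/value pair in the query document. A simplified Python sketch of the semantics (real JSONB containment also recurses into nested objects and arrays):

```python
def jsonb_contains(doc: dict, query: dict) -> bool:
    """Approximate JSONB @> for flat objects: every key/value pair in
    `query` must be present in `doc`. The real operator also handles
    nested objects and array containment."""
    return all(doc.get(k) == v for k, v in query.items())

row = {"color": "red", "size": "M", "brand": "Nike"}
assert jsonb_contains(row, {"color": "red", "size": "M"})  # matched by @>
assert not jsonb_contains(row, {"color": "blue"})          # filtered out
```

This is also why `@>` pairs so well with a GIN index: the query shape maps directly onto an indexed containment check.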
2. Full-Text Search (Your Elasticsearch Use Case)
Engineering teams add Elasticsearch to their stack the moment someone asks for a search box. I've seen this add 40+ hours of DevOps work, an extra $300/month in infrastructure, and a whole new class of sync bugs to maintain.
Postgres has had full-text search built in for years:
-- Add a generated tsvector column for auto-indexed search
-- (assumes products also has a TEXT description column)
ALTER TABLE products
    ADD COLUMN search_vector tsvector
    GENERATED ALWAYS AS (
        to_tsvector('english', coalesce(name, '') || ' ' || coalesce(description, ''))
    ) STORED;
-- GIN index for the search vector
CREATE INDEX idx_products_search ON products USING GIN (search_vector);
-- Full-text search query with ranking
SELECT
    name,
    price,
    ts_rank(search_vector, query) AS relevance
FROM products, to_tsquery('english', 'running & shoes') query
WHERE search_vector @@ query
ORDER BY relevance DESC
LIMIT 20;
For typo tolerance, add pg_trgm:
CREATE EXTENSION pg_trgm;
CREATE INDEX idx_products_name_trgm ON products USING GIN (name gin_trgm_ops);
-- Fuzzy match: finds "Nikey" matching "Nike"
SELECT name FROM products
WHERE name % 'Nikey'
ORDER BY similarity(name, 'Nikey') DESC;
Is this Elasticsearch-tier? For most applications — yes. You lose advanced tokenization for some languages and real-time distributed search across massive indices. But for a product search on a typical SaaS application? Postgres is more than enough.
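If you're curious what `%` and `similarity()` are actually scoring, pg_trgm compares sets of three-character substrings. A rough Python approximation of the idea (the real extension's padding and normalization rules have a few more details):

```python
def trigrams(s: str) -> set:
    """Approximate pg_trgm trigram extraction: lowercase, pad each word
    with two leading spaces and one trailing space, take 3-grams."""
    grams = set()
    for word in s.lower().split():
        padded = "  " + word + " "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams

def similarity(a: str, b: str) -> float:
    """Shared trigrams over total distinct trigrams, as pg_trgm scores it."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

# "Nikey" shares most of its trigrams with "Nike", none with "Adidas"
assert similarity("Nikey", "Nike") > similarity("Nikey", "Adidas")
```

That set-overlap score is what the `%` operator thresholds against, which is why typos still match.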
3. Vector Search for AI Features (Your Pinecone/Weaviate Use Case)
This is the big one for 2025. Every team building AI features is reaching for a dedicated vector database. For most of them, pgvector makes that unnecessary.
-- Enable pgvector extension
CREATE EXTENSION vector;
-- Add an embedding column to your existing table
ALTER TABLE articles ADD COLUMN embedding vector(1536);
-- Create an HNSW index for fast approximate nearest-neighbor search
CREATE INDEX idx_articles_embedding
    ON articles USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);
-- Semantic similarity search — find articles similar to a given embedding
SELECT
    id,
    title,
    1 - (embedding <=> $1::vector) AS similarity
FROM articles
WHERE 1 - (embedding <=> $1::vector) > 0.7
ORDER BY similarity DESC
LIMIT 10;
-- Hybrid search: combine vector similarity with full-text + relational filters
-- (assumes articles also has a search_vector column, built as in the FTS section)
SELECT
    a.id,
    a.title,
    a.author,
    ts_rank(a.search_vector, query) AS text_score,
    1 - (a.embedding <=> $1::vector) AS vector_score
FROM articles a, to_tsquery('english', $2) query
WHERE a.search_vector @@ query
  AND a.published_at > now() - interval '30 days'
ORDER BY (text_score + vector_score) DESC
LIMIT 10;
Per Simple Talk's December 2025 guide, pgvector now supports cosine distance, inner product, and Euclidean distance, and its HNSW indexes scale to millions of embeddings with sub-100ms query times. The reason Databricks paid $1 billion for Neon is precisely this: agents need a database that can handle vector workloads, transactions, and metadata queries together — and Postgres does all three.
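One clarification worth making explicit: the `<=>` operator returns cosine distance, which is 1 minus cosine similarity. A pure-Python sketch of what it computes:

```python
import math

def cosine_distance(a, b):
    """What pgvector's <=> operator computes: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

assert abs(cosine_distance([1, 0], [1, 0])) < 1e-9        # same direction: distance 0
assert abs(cosine_distance([1, 0], [0, 1]) - 1.0) < 1e-9  # orthogonal: distance 1
```

That is why the queries above compute `1 - (embedding <=> $1::vector)` to get a similarity score that grows toward 1 for closer matches.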
4. Time-Series Data (Your InfluxDB Use Case)
Application metrics, audit logs, event streams — teams add InfluxDB or TimescaleDB the moment data has a timestamp. But TimescaleDB is itself a Postgres extension, and plain Postgres handles time-series workloads well with table partitioning.
-- Partitioned table for time-series events
CREATE TABLE events (
    id BIGSERIAL,
    user_id UUID NOT NULL,
    event_type TEXT NOT NULL,
    payload JSONB,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
) PARTITION BY RANGE (created_at);
-- Create monthly partitions (automate this with a cron job)
CREATE TABLE events_2026_01
    PARTITION OF events
    FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE events_2026_02
    PARTITION OF events
    FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
-- Index on the partition column
CREATE INDEX ON events (created_at, user_id);
-- Fast time-windowed queries using partition pruning
SELECT
    date_trunc('hour', created_at) AS hour,
    event_type,
    count(*) AS count
FROM events
WHERE created_at >= now() - interval '7 days'
GROUP BY 1, 2
ORDER BY 1 DESC;
Postgres's query planner understands partition pruning — it only scans the relevant monthly table instead of the entire dataset. For most metrics workloads, this is enough. Add pg_partman for automatic partition management.
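The "automate this with a cron job" step above is a few lines of scripting. A sketch that generates the DDL for one monthly partition (pg_partman does this properly in production, with pre-creation and retention windows; the function name here is made up):

```python
from datetime import date

def monthly_partition_ddl(table: str, year: int, month: int) -> str:
    """Generate the CREATE TABLE statement for one monthly range partition."""
    start = date(year, month, 1)
    # The upper bound is exclusive, so it is simply the first day of the next month
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    name = f"{table}_{start:%Y_%m}"
    return (
        f"CREATE TABLE IF NOT EXISTS {name}\n"
        f"    PARTITION OF {table}\n"
        f"    FOR VALUES FROM ('{start}') TO ('{end}');"
    )

print(monthly_partition_ddl("events", 2026, 3))
```

A nightly job that creates the next month's partition ahead of time means inserts never hit a missing-partition error.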
5. Pub/Sub and Background Jobs (Your Redis Use Case)
Teams add Redis for two reasons: caching and queues. Caching is legitimate. Queues, often, are not.
Postgres LISTEN/NOTIFY gives you lightweight pub/sub, and SKIP LOCKED gives you a rock-solid job queue:
-- Job queue table
CREATE TABLE job_queue (
    id BIGSERIAL PRIMARY KEY,
    queue_name TEXT NOT NULL DEFAULT 'default',
    payload JSONB NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending',
    attempts INT NOT NULL DEFAULT 0,
    created_at TIMESTAMPTZ DEFAULT now(),
    scheduled_at TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX idx_jobs_pending
    ON job_queue (queue_name, scheduled_at)
    WHERE status = 'pending';
import psycopg2
from contextlib import contextmanager

@contextmanager
def dequeue_job(conn, queue_name='default'):
    """Atomically claim a job — safe for concurrent workers."""
    with conn.cursor() as cur:
        cur.execute("""
            UPDATE job_queue
            SET status = 'processing', attempts = attempts + 1
            WHERE id = (
                SELECT id FROM job_queue
                WHERE queue_name = %s
                  AND status = 'pending'
                  AND scheduled_at <= now()
                ORDER BY scheduled_at ASC
                FOR UPDATE SKIP LOCKED
                LIMIT 1
            )
            RETURNING id, payload
        """, (queue_name,))
        row = cur.fetchone()
        if row is None:
            conn.rollback()  # nothing to claim; end the transaction cleanly
            yield None
            return
        job_id, payload = row
    try:
        yield payload
        with conn.cursor() as cur:
            cur.execute("UPDATE job_queue SET status = 'done' WHERE id = %s", (job_id,))
        conn.commit()  # claim and completion commit together; a crash rolls back to 'pending'
    except Exception:
        # If the failure came from the database, the transaction is aborted:
        # roll back first (which also undoes the claim), then record the
        # failure in a fresh transaction, re-applying the attempts increment.
        conn.rollback()
        with conn.cursor() as cur:
            cur.execute(
                "UPDATE job_queue SET status = 'failed', attempts = attempts + 1 "
                "WHERE id = %s", (job_id,))
        conn.commit()
        raise
FOR UPDATE SKIP LOCKED ensures two workers never grab the same job. This pattern, used by libraries like good_job (Rails) and pg-boss (Node.js), is battle-tested in production at serious scale. No Redis. No RabbitMQ. One less failure domain.
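The attempts and scheduled_at columns also give you retries for free: instead of leaving a job 'failed', a worker can bump scheduled_at and flip the row back to 'pending'. A sketch of the usual exponential-backoff delay (the base and cap values are assumptions, not from this article):

```python
def retry_delay_seconds(attempts: int, base: float = 5.0, cap: float = 3600.0) -> float:
    """Exponential backoff: 5s, 10s, 20s, ..., capped at one hour.
    On failure, a worker would set scheduled_at = now() + this delay
    and flip the job back to 'pending' rather than 'failed'."""
    return min(cap, base * (2 ** (attempts - 1)))
```

The partial index on (queue_name, scheduled_at) WHERE status = 'pending' makes the rescheduled job cheap to pick up again.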
Where Postgres Actually Isn't Enough
I want to be honest here, because the goal isn't "use Postgres for everything" — it's "use Postgres until you have a real reason not to."
Genuine reasons to leave Postgres:
Multi-region active-active writes. If you have users in Tokyo, Frankfurt, and New York who all write data and need sub-100ms local latency with strong consistency, Postgres's single-primary model becomes a real constraint. This is CockroachDB or Spanner territory.
Write throughput beyond ~100k TPS sustained. A well-tuned Postgres instance with connection pooling can handle enormous write loads, but if you're consistently above ~100k transactions per second with no batching, you're approaching genuine limits.
Massive fan-out read scalability. If you need to serve read queries to millions of concurrent connections (not just requests — connections), Postgres's process-per-connection model needs PgBouncer at minimum, and beyond a certain point you're fighting the architecture.
Truly schemaless, unpredictable data. If your data is so heterogeneous that even JSONB feels constraining — like storing arbitrary sensor data from thousands of different hardware vendors — a document database might genuinely fit better.
The key word in all four cases: genuine. Not "we might need this someday." Not "our CTO read about it at a conference." A real, current, demonstrable constraint that you have hit in production.
The Real Cost of Premature Migration
When Marcus called me at 2 AM, his company hadn't just spent $400,000 on the migration. They had also:
- Lost the ability to do joins, forcing complex denormalization and multi-query patterns in application code
- Introduced eventual consistency bugs that were now causing customer-facing data errors
- Added a new technology that half the engineering team didn't deeply understand
- Broken their existing reporting pipeline, which was built on SQL
Every database you add multiplies your operational surface area. You now have two systems to monitor, two backup strategies, two upgrade paths, two on-call runbooks. Distributed systems are fundamentally harder to reason about than single-system consistency.
The 2026 InfoWorld analysis of database migrations is blunt about the cost: "The biggest challenge around database migrations is getting people on board with the project." That's because migrations are rarely scoped correctly. What looks like a two-month project becomes six months. What looks like a performance fix becomes a consistency crisis.
Before you migrate, answer this: have you actually maxed out Postgres? Have you:
- ✅ Added proper indexes for your slow queries?
- ✅ Run EXPLAIN ANALYZE on your worst performers?
- ✅ Set up connection pooling with PgBouncer?
- ✅ Configured work_mem and shared_buffers for your server's RAM?
- ✅ Enabled table partitioning for large, time-bound tables?
- ✅ Tried read replicas for read-heavy workloads?
If the answer to any of these is "no," you have not maxed out Postgres. You have a tuning problem, not a database problem.
A Practical Tuning Baseline
Here's the configuration I start with for any production Postgres instance (adjust for your server's RAM):
# postgresql.conf baseline for a 16GB RAM server
# Memory
shared_buffers = 4GB # ~25% of total RAM
effective_cache_size = 12GB # ~75% of total RAM
work_mem = 64MB # per-sort operation (watch for high concurrency)
maintenance_work_mem = 1GB # for VACUUM, CREATE INDEX, etc.
# Write performance
wal_buffers = 64MB
checkpoint_completion_target = 0.9
max_wal_size = 4GB
# Connections (use PgBouncer in front of this)
max_connections = 200
# Query planning
random_page_cost = 1.1 # for SSDs (default 4.0 assumes spinning disk!)
effective_io_concurrency = 200 # for SSDs
# Preloaded libraries (required before CREATE EXTENSION pg_stat_statements; needs a restart)
shared_preload_libraries = 'pg_stat_statements'
Run this immediately after:
-- Enable useful extensions
CREATE EXTENSION IF NOT EXISTS pg_stat_statements; -- query tracking (needs shared_preload_libraries + restart)
CREATE EXTENSION IF NOT EXISTS pgcrypto;           -- gen_random_uuid() on PG 12 and older (built into PG 13+)
CREATE EXTENSION IF NOT EXISTS btree_gin;          -- composite GIN indexes
-- Find your slowest queries
SELECT
    round(total_exec_time::numeric, 2) AS total_ms,
    calls,
    round(mean_exec_time::numeric, 2) AS avg_ms,
    round((100 * total_exec_time / sum(total_exec_time) OVER ())::numeric, 2) AS pct,
    query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
This pg_stat_statements query alone has saved more teams from premature database migrations than any architectural advice I've ever given. It shows you exactly where time is being spent. In 90% of cases, there are 2-3 queries that account for 80% of your database load. Fix those queries. You don't need a new database.
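The memory ratios in the baseline config are mechanical enough to script. A sketch that derives them from total RAM, using the 25%/75% rules of thumb above (illustrative, not an official formula):

```python
def memory_baseline(total_ram_gb: int) -> dict:
    """Derive baseline memory settings from total RAM using the rules of
    thumb above: shared_buffers ~25%, effective_cache_size ~75%."""
    return {
        "shared_buffers": f"{total_ram_gb // 4}GB",
        "effective_cache_size": f"{total_ram_gb * 3 // 4}GB",
        "maintenance_work_mem": f"{max(1, total_ram_gb // 16)}GB",
    }
```

For a 16GB server this reproduces the baseline exactly; treat the output as a starting point, then measure with pg_stat_statements.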
The Right Mental Model
Think of your database stack as a spectrum of justified complexity:
Postgres alone
→ + PgBouncer (connection pooling)
→ + read replicas (read scale)
→ + table partitioning (large tables)
→ + pgvector (AI/ML features)
→ + pg_trgm / FTS (search)
→ + TimescaleDB extension (serious time-series)
→ + Redis (caching layer, not queue replacement)
→ Specialized DB (you've genuinely hit a wall)
Every step to the right is justified only when you've genuinely exhausted the step to the left. Most teams are operating at "Postgres alone" and jumping straight to "specialized DB" because someone read a Hacker News thread.
The billion-dollar acquisitions in 2025 — Databricks buying Neon, Snowflake buying CrunchyData — aren't happening because Postgres is a legacy technology. They're happening because Postgres is so good that the most sophisticated data companies in the world are building their AI-era infrastructure on top of it.
Marcus rebuilt his product catalog in Postgres six months after the MongoDB migration. It took three weeks. The data consistency bugs went away. The join latency is 4ms.
Architecture Checklist: Before You Leave Postgres
Before migrating to a different database, confirm you have done all of the following:
- Run EXPLAIN ANALYZE on your five slowest queries and addressed the findings
- Added indexes for every foreign key and every column used in WHERE clauses
- Configured shared_buffers to ~25% of server RAM
- Set random_page_cost = 1.1 if running on SSDs
- Deployed PgBouncer for connection pooling with pool_mode = transaction
- Added at least one read replica for read-heavy workloads
- Implemented table partitioning for any table with >50M rows
- Evaluated JSONB for any "flexible schema" requirements
- Evaluated tsvector / pg_trgm for any text search requirements
- Evaluated pgvector for any vector/embedding requirements
- Evaluated SKIP LOCKED for any job queue requirements
- Evaluated LISTEN/NOTIFY for any pub/sub requirements
- Documented the specific Postgres limitation you've actually hit in production (not theoretical)
- Calculated the full engineering cost of migration, including application code changes and operational overhead
If you can check every box and still have a genuine, production-measured limit, you have earned the right to evaluate alternatives.
Until then: Postgres is enough.
Ask The Guild
I want to hear from you this week. Drop your answer in the community channel:
What's the most surprising thing you've replaced with Postgres — or the most painful migration you did away from Postgres that you later regretted?
Tag your response with #architecture-patterns so we can compile the best stories for the guild. The most instructive real-world examples might make it into a future article in this series.