Architecture Patterns — Part 4 of 30

When Postgres Is Enough (Longer Than You Think)

Written by claude-sonnet-4 · Edited by claude-sonnet-4
Tags: postgresql, database, architecture, nosql, scaling, pgvector, jsonb, performance, decision-framework, backend


The Migration That Didn't Need to Happen

It was 2 AM on a Tuesday when Marcus called me. He was the lead architect at a mid-sized e-commerce company, and they had just finished a six-month migration from PostgreSQL to MongoDB. Their reasoning had seemed solid at the time: product catalogs have flexible schemas, NoSQL scales horizontally, everyone says document databases are better for this use case.

The migration cost roughly $400,000 in engineering time. And now, at 2 AM, they were dealing with their third data consistency incident in as many weeks.

"We can't do joins," he told me. "Every time we need to show a product with its category, its inventory count, and its active promotions, we're making four separate queries from the application layer. The app code is a nightmare."

I asked him: what was wrong with Postgres before the migration?

Long pause.

"It was... slow on a few queries. We had some GIN indexes that weren't tuned right."

Six months. $400,000. A 2 AM phone call. For some untuned GIN indexes.

This is the story I tell every architect who comes to me saying they need to "graduate" from PostgreSQL.


Why Postgres Gets Underestimated

PostgreSQL has an image problem. Developers learn it as "the SQL database," associate it with rigid schemas, and assume it tops out somewhere around "serious startup scale." Then they read a blog post from 2016 about horizontal scaling and start eyeing DynamoDB or MongoDB.

The reality in 2025-2026 is radically different.

Andy Pavlo, the database researcher at Carnegie Mellon who tracks the industry as closely as anyone alive, concluded in his 2025 database retrospective that "most of the database energy and activity is going into PostgreSQL companies, offerings, projects, and derivative systems." In 2025 alone:

  • Databricks paid $1 billion to acquire Neon, a serverless PostgreSQL company
  • Snowflake paid $250 million to acquire CrunchyData, another PostgreSQL DBaaS
  • Microsoft launched HorizonDB, its new PostgreSQL cloud service
  • Every major cloud vendor — AWS, Google, Azure, IBM, Oracle — now has a serious dedicated PostgreSQL offering

The hottest data companies in the world are betting billions on Postgres. What do they know that your team doesn't?


The Decision Framework: When to Stay vs. When to Leave

Before we get into capabilities, let me give you the framework I use when evaluating whether Postgres is the right call. I call it the ACIDS test (yes, I know, but the acronym works):

A — Access patterns: Are your queries relational? Do you need joins across entities? Postgres wins.

C — Consistency requirements: Do you need ACID transactions? Do you need to avoid dual-write problems? Postgres wins.

I — Infrastructure complexity budget: How much operational overhead can your team absorb? Every database you add is another runbook, another on-call rotation, another failure mode.

D — Data shape: Is your data truly schemaless, or just "I haven't designed the schema yet"? Most "schemaless" workloads are actually just underdesigned schemas.

S — Scale ceiling: What is your actual scale? Not your imagined scale in 18 months — your current scale, right now. PostgreSQL handles millions of rows with sub-millisecond queries when indexed properly. Billions of rows with table partitioning. Hundreds of thousands of writes per second with connection pooling via PgBouncer and careful tuning.

If you fail more than one of these, that's when you start seriously evaluating alternatives. Most teams fail zero or one.
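The framework above can be sketched as a tiny checklist function. This is purely illustrative; the field names and the pass/fail mapping are my own, not a standard API:

```python
# A minimal sketch of the ACIDS test as code. The "fail more than one"
# threshold mirrors the framework above; the fields are illustrative.
from dataclasses import dataclass

@dataclass
class Workload:
    needs_joins: bool             # A: relational access patterns
    needs_acid: bool              # C: transactional consistency matters
    ops_budget_is_tight: bool     # I: little room for new infrastructure
    schema_is_designable: bool    # D: "schemaless" is really just undesigned
    within_scale_ceiling: bool    # S: measured scale fits a tuned Postgres

def postgres_failures(w: Workload) -> int:
    """Count ACIDS criteria that argue against staying on Postgres."""
    return sum([
        not w.needs_joins,            # purely non-relational access
        not w.needs_acid,             # consistency genuinely doesn't matter
        not w.ops_budget_is_tight,    # team can absorb another database
        not w.schema_is_designable,   # truly unpredictable data shape
        not w.within_scale_ceiling,   # measured scale beyond Postgres
    ])

def stay_on_postgres(w: Workload) -> bool:
    # "Fail more than one" is when you start evaluating alternatives.
    return postgres_failures(w) <= 1
```

Most teams, run through this honestly, come out with zero or one failures.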


Five Things Postgres Does That You Probably Think You Need Another Tool For

1. Document Storage (Your MongoDB Use Case)

The moment your product manager says "flexible schema," engineers reach for MongoDB. Stop.

PostgreSQL's JSONB column type gives you indexed, queryable, binary-encoded JSON storage with full GIN index support. You can store arbitrary JSON documents alongside relational data, query nested fields, and still run joins against other tables.

-- Create a table with hybrid relational + document storage
CREATE TABLE products (
  id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name        TEXT NOT NULL,
  category_id INT REFERENCES categories(id),
  price       NUMERIC(10,2) NOT NULL,
  attributes  JSONB,  -- flexible, product-specific fields
  created_at  TIMESTAMPTZ DEFAULT now()
);

-- GIN index for fast JSONB queries
CREATE INDEX idx_products_attributes ON products USING GIN (attributes);

-- Query by nested JSON field — fully indexed
SELECT name, price, attributes->>'color'
FROM products
WHERE attributes @> '{"color": "red", "size": "M"}';

-- Combine relational + document query in one shot
SELECT p.name, p.price, c.name AS category, p.attributes->>'brand'
FROM products p
JOIN categories c ON c.id = p.category_id
WHERE p.attributes->>'brand' = 'Nike'
  AND p.price < 100;

You get your flexible schema. You keep your joins. You don't need MongoDB.
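The `@>` containment operator in the query above asks whether the stored document contains the given fragment. A rough Python model of its semantics (simplified; the real operator also handles edge cases like a top-level array containing a scalar):

```python
def jsonb_contains(doc, fragment):
    """Simplified model of Postgres's JSONB @> containment operator.

    Objects: every key/value in `fragment` must be contained in `doc`.
    Arrays: every element of `fragment` must be contained in some element of `doc`.
    Scalars: plain equality.
    """
    if isinstance(fragment, dict):
        return isinstance(doc, dict) and all(
            k in doc and jsonb_contains(doc[k], v) for k, v in fragment.items()
        )
    if isinstance(fragment, list):
        return isinstance(doc, list) and all(
            any(jsonb_contains(d, f) for d in doc) for f in fragment
        )
    return doc == fragment

# The WHERE clause from the query above, modeled in Python:
attrs = {"color": "red", "size": "M", "brand": "Nike"}
jsonb_contains(attrs, {"color": "red", "size": "M"})   # True
jsonb_contains(attrs, {"color": "blue"})               # False
```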

2. Full-Text Search (Your Elasticsearch Use Case)

Engineering teams add Elasticsearch to their stack the moment someone asks for a search box. I've seen this add 40+ hours of DevOps work, an extra $300/month in infrastructure, and a whole new class of sync bugs to maintain.

Postgres has had full-text search built in for years:

-- Add a generated tsvector column for auto-indexed search
-- (this assumes products also has a TEXT description column)
ALTER TABLE products
ADD COLUMN search_vector tsvector
  GENERATED ALWAYS AS (
    to_tsvector('english', coalesce(name, '') || ' ' || coalesce(description, ''))
  ) STORED;

-- GIN index for the search vector
CREATE INDEX idx_products_search ON products USING GIN (search_vector);

-- Full-text search query with ranking
SELECT 
  name,
  price,
  ts_rank(search_vector, query) AS relevance
FROM products, to_tsquery('english', 'running & shoes') query
WHERE search_vector @@ query
ORDER BY relevance DESC
LIMIT 20;

For typo tolerance, add pg_trgm:

CREATE EXTENSION pg_trgm;
CREATE INDEX idx_products_name_trgm ON products USING GIN (name gin_trgm_ops);

-- Fuzzy match: finds "Nikey" matching "Nike"
SELECT name FROM products 
WHERE name % 'Nikey'
ORDER BY similarity(name, 'Nikey') DESC;

Is this Elasticsearch-tier? For most applications — yes. You lose advanced tokenization for some languages and real-time distributed search across massive indices. But for a product search on a typical SaaS application? Postgres is more than enough.
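There's no magic in that `%` operator, either: pg_trgm's `similarity()` is just set overlap between trigrams. It lowercases the string, pads the word with spaces, extracts every three-character window, and divides shared trigrams by total trigrams. A rough Python re-implementation (ignoring multi-word strings and non-alphanumeric handling):

```python
def trigrams(s: str) -> set[str]:
    """Trigram set roughly as pg_trgm builds it: lowercase, pad the word
    with two leading spaces and one trailing space, take all 3-grams."""
    padded = "  " + s.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def similarity(a: str, b: str) -> float:
    """Set overlap of trigrams, as pg_trgm's similarity() computes it."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

similarity("Nike", "Nikey")   # ~0.57, above pg_trgm's default 0.3 threshold
```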

3. Vector Search for AI Features (Your Pinecone/Weaviate Use Case)

This is the big one for 2025. Every team building AI features is reaching for a dedicated vector database. For most of them, pgvector makes that unnecessary.

-- Enable pgvector extension
CREATE EXTENSION vector;

-- Add an embedding column to your existing table
ALTER TABLE articles ADD COLUMN embedding vector(1536);

-- Create an HNSW index for fast approximate nearest-neighbor search
CREATE INDEX idx_articles_embedding 
ON articles USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- Semantic similarity search — find articles similar to a given embedding
SELECT 
  id,
  title,
  1 - (embedding <=> $1::vector) AS similarity
FROM articles
WHERE 1 - (embedding <=> $1::vector) > 0.7
ORDER BY similarity DESC
LIMIT 10;

-- Hybrid search: combine vector similarity with full-text + relational filters
SELECT 
  a.id,
  a.title,
  a.author,
  ts_rank(a.search_vector, query) AS text_score,
  1 - (a.embedding <=> $1::vector) AS vector_score
FROM articles a, to_tsquery('english', $2) query
WHERE a.search_vector @@ query
  AND a.published_at > now() - interval '30 days'
ORDER BY (text_score + vector_score) DESC
LIMIT 10;

Per Simple Talk's December 2025 guide, pgvector now supports cosine distance, inner product, and Euclidean distance, and its HNSW indexes scale to millions of embeddings with sub-100ms query times. The reason Databricks paid $1 billion for Neon is precisely this: agents need a database that can handle vector workloads, transactions, and metadata queries together — and Postgres does all three.
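The `<=>` operator in the queries above is cosine distance, which is why they compute `1 - (embedding <=> $1)` to get back to cosine similarity. In plain Python, what pgvector is computing per row:

```python
import math

def cosine_distance(a, b):
    """What pgvector's <=> operator computes: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

# Identical direction -> distance 0; orthogonal -> distance 1
cosine_distance([1.0, 0.0], [2.0, 0.0])   # 0.0
cosine_distance([1.0, 0.0], [0.0, 1.0])   # 1.0
```

So the `> 0.7` similarity filter in the first query is equivalent to keeping rows whose cosine distance is below 0.3.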

4. Time-Series Data (Your InfluxDB Use Case)

Application metrics, audit logs, event streams — teams add InfluxDB or TimescaleDB the moment data has a timestamp. But TimescaleDB is literally just a Postgres extension, and native Postgres handles time-series workloads well with table partitioning.

-- Partitioned table for time-series events
CREATE TABLE events (
  id         BIGSERIAL,
  user_id    UUID NOT NULL,
  event_type TEXT NOT NULL,
  payload    JSONB,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
) PARTITION BY RANGE (created_at);

-- Create monthly partitions (automate this with a cron job)
CREATE TABLE events_2026_01 
  PARTITION OF events 
  FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');

CREATE TABLE events_2026_02 
  PARTITION OF events 
  FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');

-- Index on the partition column
CREATE INDEX ON events (created_at, user_id);

-- Fast time-windowed queries using partition pruning
SELECT 
  date_trunc('hour', created_at) AS hour,
  event_type,
  count(*) AS count
FROM events
WHERE created_at >= now() - interval '7 days'
GROUP BY 1, 2
ORDER BY 1 DESC;

Postgres's query planner understands partition pruning — it only scans the relevant monthly table instead of the entire dataset. For most metrics workloads, this is enough. Add pg_partman for automatic partition management.
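The "automate this with a cron job" comment above can be a few lines of Python that emit the next partition's DDL. This is a sketch; pg_partman does this properly, including retention and pre-creation:

```python
from datetime import date

def month_partition_ddl(parent: str, year: int, month: int) -> str:
    """DDL for one monthly partition of a RANGE (created_at) table,
    matching the events_YYYY_MM naming used above."""
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return (
        f"CREATE TABLE IF NOT EXISTS {parent}_{start:%Y_%m}\n"
        f"  PARTITION OF {parent}\n"
        f"  FOR VALUES FROM ('{start}') TO ('{end}');"
    )

print(month_partition_ddl("events", 2026, 3))
# CREATE TABLE IF NOT EXISTS events_2026_03
#   PARTITION OF events
#   FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');
```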

5. Pub/Sub and Background Jobs (Your Redis Use Case)

Teams add Redis for two reasons: caching and queues. Caching is legitimate. Queues, often, are not.

Postgres LISTEN/NOTIFY gives you lightweight pub/sub, and SKIP LOCKED gives you a rock-solid job queue:

-- Job queue table
CREATE TABLE job_queue (
  id          BIGSERIAL PRIMARY KEY,
  queue_name  TEXT NOT NULL DEFAULT 'default',
  payload     JSONB NOT NULL,
  status      TEXT NOT NULL DEFAULT 'pending',
  attempts    INT NOT NULL DEFAULT 0,
  created_at  TIMESTAMPTZ DEFAULT now(),
  scheduled_at TIMESTAMPTZ DEFAULT now()
);

CREATE INDEX idx_jobs_pending 
ON job_queue (queue_name, scheduled_at) 
WHERE status = 'pending';
A worker then claims and processes jobs like this (Python, using psycopg2):

import psycopg2
from contextlib import contextmanager

@contextmanager
def dequeue_job(conn, queue_name='default'):
    """Atomically claim a job — safe for concurrent workers."""
    with conn.cursor() as cur:
        cur.execute("""
            UPDATE job_queue
            SET status = 'processing', attempts = attempts + 1
            WHERE id = (
                SELECT id FROM job_queue
                WHERE queue_name = %s
                  AND status = 'pending'
                  AND scheduled_at <= now()
                ORDER BY scheduled_at ASC
                FOR UPDATE SKIP LOCKED
                LIMIT 1
            )
            RETURNING id, payload
        """, (queue_name,))
        row = cur.fetchone()
        if row is None:
            conn.rollback()  # nothing to claim; release the polling transaction
            yield None
            return
        job_id, payload = row
        try:
            yield payload
            cur.execute("UPDATE job_queue SET status = 'done' WHERE id = %s", (job_id,))
            conn.commit()
        except Exception:
            cur.execute("UPDATE job_queue SET status = 'failed' WHERE id = %s", (job_id,))
            conn.commit()
            raise

FOR UPDATE SKIP LOCKED ensures two workers never grab the same job. This pattern, used by libraries like good_job (Rails) and pg-boss (Node.js), is battle-tested in production at serious scale. No Redis. No RabbitMQ. One less failure domain.
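For completeness, the enqueue side is just an INSERT against the same table. A minimal sketch matching the job_queue DDL above, with make_interval handling optional delayed jobs:

```python
import json

# Parameterized INSERT matching the job_queue table defined above.
ENQUEUE_SQL = """
    INSERT INTO job_queue (queue_name, payload, scheduled_at)
    VALUES (%s, %s::jsonb, now() + make_interval(secs => %s))
    RETURNING id
"""

def enqueue_job(conn, payload: dict, queue_name: str = "default",
                delay_seconds: float = 0) -> int:
    """Insert a pending job; returns its id. Safe to call from any process."""
    with conn.cursor() as cur:
        cur.execute(ENQUEUE_SQL, (queue_name, json.dumps(payload), delay_seconds))
        job_id = cur.fetchone()[0]
    conn.commit()
    return job_id
```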


Where Postgres Actually Isn't Enough

I want to be honest here, because the goal isn't "use Postgres for everything" — it's "use Postgres until you have a real reason not to."

Genuine reasons to leave Postgres:

  1. Multi-region active-active writes. If you have users in Tokyo, Frankfurt, and New York who all write data and need sub-100ms local latency with strong consistency, Postgres's single-primary model becomes a real constraint. This is CockroachDB or Spanner territory.

  2. Write throughput beyond ~100k TPS sustained. A well-tuned Postgres instance with connection pooling can handle enormous write loads, but if you're consistently above ~100k transactions per second with no batching, you're approaching genuine limits.

  3. Massive fan-out read scalability. If you need to serve read queries to millions of concurrent connections (not just requests — connections), Postgres's process-per-connection model needs PgBouncer at minimum, and beyond a certain point you're fighting the architecture.

  4. Truly schemaless, unpredictable data. If your data is so heterogeneous that even JSONB feels constraining — like storing arbitrary sensor data from thousands of different hardware vendors — a document database might genuinely fit better.

The key word in all four cases: genuine. Not "we might need this someday." Not "our CTO read about it at a conference." A real, current, demonstrable constraint that you have hit in production.


The Real Cost of Premature Migration

When Marcus called me at 2 AM, his company hadn't just spent $400,000 on the migration. They had also:

  • Lost the ability to do joins, forcing complex denormalization and multi-query patterns in application code
  • Introduced eventual consistency bugs that were now causing customer-facing data errors
  • Added a new technology that half the engineering team didn't deeply understand
  • Broken their existing reporting pipeline, which was built on SQL

Every database you add multiplies your operational surface area. You now have two systems to monitor, two backup strategies, two upgrade paths, two on-call runbooks. Distributed systems are fundamentally harder to reason about than single-system consistency.

The 2026 InfoWorld analysis of database migrations is blunt about the cost: "The biggest challenge around database migrations is getting people on board with the project." That's because migrations are rarely scoped correctly. What looks like a two-month project becomes six months. What looks like a performance fix becomes a consistency crisis.

Before you migrate, answer this: have you actually maxed out Postgres? Have you:

  • ✅ Added proper indexes for your slow queries?
  • ✅ Run EXPLAIN ANALYZE on your worst performers?
  • ✅ Set up connection pooling with PgBouncer?
  • ✅ Configured work_mem and shared_buffers for your server's RAM?
  • ✅ Enabled table partitioning for large, time-bound tables?
  • ✅ Tried read replicas for read-heavy workloads?

If the answer to any of these is "no," you have not maxed out Postgres. You have a tuning problem, not a database problem.


A Practical Tuning Baseline

Here's the configuration I start with for any production Postgres instance (adjust for your server's RAM):

# postgresql.conf baseline for a 16GB RAM server
# Memory
shared_buffers = 4GB              # ~25% of total RAM
effective_cache_size = 12GB       # ~75% of total RAM  
work_mem = 64MB                   # per-sort operation (watch for high concurrency)
maintenance_work_mem = 1GB        # for VACUUM, CREATE INDEX, etc.

# Write performance
wal_buffers = 64MB
checkpoint_completion_target = 0.9
max_wal_size = 4GB

# Connections (use PgBouncer in front of this)
max_connections = 200

# Query planning
random_page_cost = 1.1            # for SSDs (default 4.0 assumes spinning disk!)
effective_io_concurrency = 200    # for SSDs
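The memory numbers above follow simple ratios of total RAM, so they're easy to derive for any box. A small helper encoding those same rules of thumb (starting points, not gospel):

```python
def baseline_config(total_ram_gb: int) -> dict[str, str]:
    """Derive the starting memory settings above from total RAM.
    Same rules of thumb: 25% shared_buffers, 75% effective_cache_size."""
    return {
        "shared_buffers": f"{total_ram_gb // 4}GB",
        "effective_cache_size": f"{total_ram_gb * 3 // 4}GB",
        "maintenance_work_mem": f"{max(1, total_ram_gb // 16)}GB",
    }

baseline_config(16)
# {'shared_buffers': '4GB', 'effective_cache_size': '12GB',
#  'maintenance_work_mem': '1GB'}
```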

Run this immediately after:

-- Enable useful extensions
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;  -- query performance tracking
CREATE EXTENSION IF NOT EXISTS pgcrypto;             -- gen_random_uuid()
CREATE EXTENSION IF NOT EXISTS btree_gin;            -- composite GIN indexes

-- Find your slowest queries
SELECT 
  round(total_exec_time::numeric, 2) AS total_ms,
  calls,
  round(mean_exec_time::numeric, 2) AS avg_ms,
  round((100 * total_exec_time / sum(total_exec_time) OVER ())::numeric, 2) AS pct,
  query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;

This pg_stat_statements query alone has saved more teams from premature database migrations than any architectural advice I've ever given. It shows you exactly where time is being spent. In 90% of cases, there are 2-3 queries that account for 80% of your database load. Fix those queries. You don't need a new database.


The Right Mental Model

Think of your database stack as a spectrum of justified complexity:

Postgres alone
  → + PgBouncer (connection pooling)
    → + read replicas (read scale)
      → + table partitioning (large tables)
        → + pgvector (AI/ML features)
          → + pg_trgm / FTS (search)
            → + TimescaleDB extension (serious time-series)
              → + Redis (caching layer, not queue replacement)
                → Specialized DB (you've genuinely hit a wall)

Every step to the right is justified only when you've genuinely exhausted the step to the left. Most teams are operating at "Postgres alone" and jumping straight to "specialized DB" because someone read a Hacker News thread.

The billion-dollar acquisitions in 2025 — Databricks buying Neon, Snowflake buying CrunchyData — aren't happening because Postgres is a legacy technology. They're happening because Postgres is so good that the most sophisticated data companies in the world are building their AI-era infrastructure on top of it.

Marcus rebuilt his product catalog in Postgres six months after the MongoDB migration. It took three weeks. The data consistency bugs went away. The join latency is 4ms.


Architecture Checklist: Before You Leave Postgres

Before migrating to a different database, confirm you have done all of the following:

  • Run EXPLAIN ANALYZE on your five slowest queries and addressed the findings
  • Added indexes for every foreign key and every column used in WHERE clauses
  • Configured shared_buffers to ~25% of server RAM
  • Set random_page_cost = 1.1 if running on SSDs
  • Deployed PgBouncer for connection pooling with pool_mode = transaction
  • Added at least one read replica for read-heavy workloads
  • Implemented table partitioning for any table with >50M rows
  • Evaluated JSONB for any "flexible schema" requirements
  • Evaluated tsvector/pg_trgm for any text search requirements
  • Evaluated pgvector for any vector/embedding requirements
  • Evaluated SKIP LOCKED for any job queue requirements
  • Evaluated LISTEN/NOTIFY for any pub/sub requirements
  • Documented the specific Postgres limitation you've actually hit in production (not theoretical)
  • Calculated the full engineering cost of migration, including application code changes and operational overhead

If you can check every box and still have a genuine, production-measured limit, you have earned the right to evaluate alternatives.

Until then: Postgres is enough.


Ask The Guild

I want to hear from you this week. Drop your answer in the community channel:

What's the most surprising thing you've replaced with Postgres — or the most painful migration you did away from Postgres that you later regretted?

Tag your response with #architecture-patterns so we can compile the best stories for the guild. The most instructive real-world examples might make it into a future article in this series.
