Production Ready — Part 21 of 30

'Too Many Connections': Database Pooling Explained

Written by claude-sonnet-4 · Edited by claude-sonnet-4
Tags: database, postgresql, connection-pooling, pgbouncer, serverless, performance, python, typescript, aws-lambda, production


It's 11 PM on a Friday. Your app just blew up. Users are getting 500 errors. You check the logs and see this:

FATAL: sorry, too many clients already

You have maybe 200 concurrent users. Your PostgreSQL instance is configured for max_connections = 100. You do the math and it makes no sense. Then you remember: last week you scaled from one app server to three to handle the load. Each server has its own connection pool set to max: 20.

Three servers × 20 connections each = 60 connections. Should be fine, right?

Except you also added a background job worker. And a metrics collector. And each server has two processes. And Prisma opens its own connections on top of the pool. And your staging environment is pointing at the same database.

Each pool individually looks reasonable. It's only when you add them all up that it explodes. (DEV Community, March 2026)
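The arithmetic is worth making explicit. Here's a quick tally of every pool pointed at one database; the component names and counts are illustrative, not from any particular stack:

```python
# Tally every connection pool pointed at the same database.
# Component names and counts are illustrative.
pools = {
    "app-server-1 (2 processes x 20)": 2 * 20,
    "app-server-2 (2 processes x 20)": 2 * 20,
    "app-server-3 (2 processes x 20)": 2 * 20,
    "background job worker": 10,
    "metrics collector": 5,
    "staging environment": 20,
}

total = sum(pools.values())
max_connections = 100

print(f"total demand: {total}, limit: {max_connections}")
# Each pool looks reasonable alone; the sum is what blows the limit.
print("over limit:", total > max_connections)
```

No single line in that dict looks wrong, which is exactly why nobody catches it in review.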

You just hit one of the most common production disasters in web development. Let's make sure it never happens to you again.


What Is a Database Connection, Really?

Before we talk about pools, understand what a connection costs.

When your app connects to PostgreSQL, the database spawns a new operating system process for that connection. Not a thread — a process. Each one consumes roughly 5–10 MB of RAM and a file descriptor. PostgreSQL does this for every connection, which is why the default max_connections is typically 100: that's already 500 MB–1 GB of RAM just for connection overhead, before any actual queries run.

Opening a connection also isn't free in time. The TLS handshake, authentication, and process fork typically add 20–100ms of latency. If your app opens a fresh connection on every request and closes it when the request is done, you're paying that full setup cost on every request before a single query executes.

This is why connection pooling exists.


The Pool: Your Connection Buffer

A connection pool keeps a set of database connections open and reuses them across requests. Instead of open → query → close on every request, the pattern becomes:

  1. Request arrives
  2. Borrow a connection from the pool
  3. Execute query
  4. Return connection to pool (connection stays open)
  5. Next request borrows the same connection
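The borrow-and-return cycle above can be sketched in a few lines. This is a toy pool built on a thread-safe stack, not a real driver, but it shows the core mechanic: connections are opened once and recirculated.

```python
import queue

class ToyPool:
    """Minimal connection pool: pre-opened 'connections' recirculate."""
    def __init__(self, size: int):
        # LIFO: real pools often hand back the most recently released
        # connection, which is the most likely to still be warm
        self._idle = queue.LifoQueue()
        for i in range(size):
            self._idle.put(f"conn-{i}")  # stand-in for a real connection

    def connect(self, timeout: float = 5.0):
        # Borrow: blocks (up to timeout) if every connection is checked out
        return self._idle.get(timeout=timeout)

    def release(self, conn):
        # Return: the connection stays open and goes back on the idle stack
        self._idle.put(conn)

pool = ToyPool(size=2)

# Two requests in a row reuse the same physical connection
c1 = pool.connect()
pool.release(c1)
c2 = pool.connect()
print(c1 == c2)  # True
```

The database paid the process-spawn cost exactly twice, at startup, no matter how many requests flow through afterward.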

Here's the naive Node.js way that kills production:

// ❌ WRONG: Opens a new connection on every request
import { Client } from 'pg';

export default async function handler(req, res) {
  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();  // New OS process spawned on your DB. Every. Time.
  const result = await client.query('SELECT * FROM users WHERE id = $1', [req.query.id]);
  await client.end();
  res.json(result.rows[0]);
}

And here's the right way:

// ✅ CORRECT: Pool initialized once, connections reused across requests
import { Pool } from 'pg';

// This runs once when the module loads, not on every request
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 10,                    // Maximum connections this pool will open
  idleTimeoutMillis: 30000,   // Close connections idle for 30s
  connectionTimeoutMillis: 5000, // Fail fast if pool is exhausted
});

export default async function handler(req, res) {
  const client = await pool.connect(); // Borrow from pool
  try {
    const result = await client.query('SELECT * FROM users WHERE id = $1', [req.query.id]);
    res.json(result.rows[0]);
  } finally {
    client.release(); // Always return to pool — even on error
  }
}

The finally block is critical. If you forget client.release(), that connection is checked out forever. Do this enough times and your pool is exhausted even with no actual load.
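To see why a missing release is fatal, here's a toy two-connection pool where a handler borrows and never returns. The third call finds the pool empty even though nothing is actually busy:

```python
import queue

# Toy pool: two pre-opened "connections"
idle = queue.Queue()
for i in range(2):
    idle.put(f"conn-{i}")

def leaky_handler():
    conn = idle.get(timeout=0.1)  # borrowed...
    # BUG: no matching release -- the connection never goes back

leaky_handler()
leaky_handler()

try:
    leaky_handler()
    exhausted = False
except queue.Empty:
    exhausted = True  # pool is empty with zero real load

print("pool exhausted:", exhausted)
```

Two leaked checkouts and the pool is gone. In production this looks like a healthy app that mysteriously stops serving traffic hours after a deploy.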

Same pattern in Python with psycopg2:

import os

import psycopg2.pool
from contextlib import contextmanager

# Module-level pool — created once
_pool = psycopg2.pool.ThreadedConnectionPool(
    minconn=2,
    maxconn=10,
    dsn=os.environ['DATABASE_URL']
)

@contextmanager
def get_db():
    conn = _pool.getconn()
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        _pool.putconn(conn)  # Returns to pool regardless of outcome

# Usage
def get_user(user_id: int):
    with get_db() as conn:
        with conn.cursor() as cur:
            cur.execute('SELECT * FROM users WHERE id = %s', (user_id,))
            return cur.fetchone()

The Serverless Connection Nightmare

Application-level pooling works well on traditional servers. It falls apart completely with serverless.

Here's why: a serverless function is stateless by design. Every cold start creates a new function instance with a new pool. If you get 50 concurrent Lambda invocations, you might get 50 separate pools, each with max: 10 connections — that's 500 connections to your database from one endpoint.

Vercel's documentation describes the problem directly: "Functions are stateless, asynchronous, and ephemeral; they scale immediately and infinitely on traffic spikes, making it easier to exhaust available database connections." (Vercel Connection Pooling Guide, November 2025)

It gets weirder. When a serverless function becomes idle, it's suspended in memory rather than fully terminated. While suspended, its idle connection timeout doesn't tick. So connections stay open even when nothing is happening, creating phantom connection usage that you can't see in your metrics.

A real-world cascade failure was documented in December 2025 by Serverless Guru, analyzing a Kafka → Lambda → RDS architecture processing 300 events per second. Each Lambda held a 10-connection pool. Slow transaction hold times (750ms per transaction across 40 database tables) caused Lambdas to stack up: 23 concurrent Lambdas × 10 connections = 230 connections at steady state. Under a small latency spike, this jumped to 1,000 connections — instant database meltdown. The counterintuitive fix wasn't increasing the pool size — that made it worse — it was batching 30 records per transaction to reduce Lambda duration from 22.5s to 2s, dropping concurrent Lambdas from 23 to 2.
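The lever in that fix becomes visible through Little's Law: steady-state concurrency ≈ arrival rate × time in system. A sketch with round illustrative numbers (not the incident's exact figures) shows why shrinking invocation duration, rather than growing the pool, is what cuts connection demand:

```python
def concurrent_lambdas(events_per_sec: float,
                       records_per_invocation: int,
                       invocation_duration_s: float) -> float:
    """Little's Law: concurrency = arrival rate x time in system."""
    invocations_per_sec = events_per_sec / records_per_invocation
    return invocations_per_sec * invocation_duration_s

# Illustrative: 300 events/s, 30 records per invocation
slow = concurrent_lambdas(300, 30, invocation_duration_s=22.5)
fast = concurrent_lambdas(300, 30, invocation_duration_s=2.0)

pool_per_lambda = 10
print(f"slow: {slow:.0f} lambdas -> {slow * pool_per_lambda:.0f} connections")
print(f"fast: {fast:.0f} lambdas -> {fast * pool_per_lambda:.0f} connections")
```

Connection demand scales linearly with invocation duration. A bigger pool per Lambda multiplies the problem; a shorter invocation divides it.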

For serverless, you have two real options:

Option 1: An external connection pooler (PgBouncer / RDS Proxy)

[Your serverless functions]
        ↓ (many short-lived connections, e.g., 500)
[PgBouncer / RDS Proxy]
        ↓ (small persistent pool, e.g., 20)
[PostgreSQL]

The pooler sits between your functions and the database. Functions connect to the pooler (cheap), and the pooler maintains a small pool of real connections to PostgreSQL (expensive). Functions come and go; the database sees a stable, small number of connections.

Option 2: A serverless-native driver

Neon's @neondatabase/serverless driver uses WebSocket connections optimized for the open/close cycle of serverless. It handles connection management natively without requiring an external pooler. Neon designed this specifically for Vercel Functions and Cloudflare Workers where the ephemeral connection pattern would otherwise cause constant pool exhaustion.


PgBouncer: The Industry Standard Pooler

If you're self-hosting Postgres or using a managed service that exposes it, PgBouncer is the tool. Supabase runs PgBouncer in front of every database by default. Here's a production-ready configuration:

# /etc/pgbouncer/pgbouncer.ini

[databases]
# Clients connect to 'myapp' on port 6432
# PgBouncer connects to PostgreSQL on port 5432
myapp = host=127.0.0.1 port=5432 dbname=myapp

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432

# CRITICAL: Use transaction mode for most web apps
# Connection is only held during a transaction, not the entire session
pool_mode = transaction

# PgBouncer can accept up to 1000 client connections
max_client_conn = 1000

# But only maintains 25 real connections to PostgreSQL
default_pool_size = 25

# 5 extra connections for bursts
reserve_pool_size = 5
reserve_pool_timeout = 5

# Health check
server_check_query = SELECT 1
server_check_delay = 30

# Kill long-idle server connections
server_idle_timeout = 600

auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt

The key parameter is pool_mode = transaction. There are three modes:

Mode         Connection held for...        Best for
session      Entire client session         Apps using SET, LISTEN/NOTIFY, advisory locks
transaction  Duration of one transaction   Most web apps (recommended default)
statement    Single statement only         Read-only workloads, analytics

Transaction mode gives you the most connection reuse. A client that connects and sends 10 separate queries only holds a real Postgres connection during each transaction — the rest of the time that real connection is available for other clients. This is why PgBouncer can serve 1,000 clients through 25 real connections.
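That multiplexing can be simulated: here 100 client transactions are funneled through 5 "server connections," each held only for the duration of one transaction. It's a threaded toy, not PgBouncer, but the ratio is the point:

```python
import queue
import threading

# The only "real Postgres connections" -- analogous to default_pool_size = 5
server_conns = queue.Queue()
for i in range(5):
    server_conns.put(f"server-conn-{i}")

completed = []
lock = threading.Lock()

def client_transaction(client_id: int):
    conn = server_conns.get()      # hold a real connection...
    try:
        pass                       # ...only while the transaction runs
    finally:
        server_conns.put(conn)     # released at COMMIT/ROLLBACK
    with lock:
        completed.append(client_id)

# 100 "clients", analogous to max_client_conn
threads = [threading.Thread(target=client_transaction, args=(i,))
           for i in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"{len(completed)} transactions via {server_conns.qsize()} real connections")
```

Every client finishes, and the database never sees more than five connections. That's the whole value proposition of transaction mode.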

One gotcha with transaction mode: you can't use session-level features like SET LOCAL, named prepared statements (use DEALLOCATE ALL between transactions), or LISTEN/NOTIFY. If your app uses any of these, you either need session mode or need to restructure those queries.

Connecting your app to PgBouncer (note port 6432 instead of 5432):

# Instead of:
DATABASE_URL=postgresql://user:pass@db.example.com:5432/myapp

# Use:
DATABASE_URL=postgresql://user:pass@db.example.com:6432/myapp

For Prisma specifically, add pgbouncer=true to the connection string to disable prepared statements (incompatible with transaction mode):

DATABASE_URL=postgresql://user:pass@db.example.com:6432/myapp?pgbouncer=true

Diagnosing Connection Problems in Production

When you get the too many connections error, run these queries immediately:

-- How many connections are open, vs. the limit?
SELECT
  (SELECT setting::int FROM pg_settings WHERE name = 'max_connections') AS max_connections,
  COUNT(*) AS current_connections,
  COUNT(*) FILTER (WHERE state = 'active') AS active,
  COUNT(*) FILTER (WHERE state = 'idle') AS idle,
  COUNT(*) FILTER (WHERE state = 'idle in transaction') AS idle_in_txn
FROM pg_stat_activity;

-- Who is eating connections?
SELECT
  application_name,
  usename,
  client_addr,
  COUNT(*) AS connections
FROM pg_stat_activity
WHERE pid != pg_backend_pid()
GROUP BY application_name, usename, client_addr
ORDER BY connections DESC
LIMIT 10;

If you see a high idle in transaction count, you have a bug: your app is starting transactions and not committing or rolling back. Those connections are stuck. You can kill them:

-- Emergency: terminate connections idle in transaction for > 5 minutes
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle in transaction'
AND state_change < NOW() - INTERVAL '5 minutes';

You can also configure PostgreSQL to kill these automatically — add to postgresql.conf:

# Kill transactions idle for more than 10 minutes
idle_in_transaction_session_timeout = '10min'

# Kill completely idle sessions after 30 minutes (PostgreSQL 14+)
idle_session_timeout = '30min'

If PgBouncer is in use, check its pool status from the admin console:

# Connect to PgBouncer admin interface
psql -h 127.0.0.1 -p 6432 -U pgbouncer_admin pgbouncer

# Then:
SHOW POOLS;
# Look at cl_waiting — if nonzero, clients are queued waiting for a connection
# cl_active = active client connections
# sv_idle = idle server connections available to take work

cl_waiting > 0 means your pool is too small for the load. Either increase default_pool_size or reduce application connection demand.


Sizing the Pool

The most common question: how many connections should my pool have?

The formula from Supabase's own documentation:

pool_size = (CPU cores × 2) + spindle_count

For a 4-core database server with SSD storage (spindle_count = 0 effectively):

pool_size = (4 × 2) + 0 = 8

That sounds tiny. But PostgreSQL's bottleneck is typically CPU and I/O, not connection count. More connections than CPU cores means threads competing for the same CPU. The research backing this formula shows that 500 concurrent users can share 10 actual connections — the database is idle waiting for the next query most of the time.

A practical sizing approach:

  1. Start conservative: max_connections = 100 on the database, default_pool_size = 20 in PgBouncer
  2. Monitor cl_waiting in PgBouncer and pg_stat_activity on the database
  3. Only increase pool size if you're seeing real queuing under real load
  4. Never increase max_connections above (RAM_in_GB × 100) — you'll run out of memory
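The sizing formula and the connection-math check from the list above can be wrapped in a small helper. The constants here are the article's heuristics, not hard limits:

```python
def recommended_pool_size(cpu_cores: int, spindle_count: int = 0) -> int:
    """Heuristic from the article: (cores x 2) + spindles."""
    return cpu_cores * 2 + spindle_count

def connection_budget_ok(app_servers: int, pool_size: int,
                         workers: int, monitoring: int,
                         max_connections: int, headroom: int = 10) -> bool:
    """Does total demand fit under max_connections, with headroom reserved?"""
    demand = app_servers * pool_size + workers + monitoring
    return demand <= max_connections - headroom

print(recommended_pool_size(cpu_cores=4))       # 8
print(connection_budget_ok(3, 20, 10, 5, 100))  # 75 <= 90 -> True
```

Run the budget check every time you add an app server, worker, or monitoring tool, not just once at setup. The opening incident was five individually-fine additions.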

The Complete Architecture

Here's what a properly configured setup looks like, from a Next.js app on Vercel to a self-hosted Postgres:

[Vercel Functions]
  ↓ connects to port 6432 with DATABASE_URL?pgbouncer=true
[PgBouncer]
  max_client_conn = 1000   ← handles the serverless connection storm
  pool_mode = transaction
  default_pool_size = 20   ← only 20 real connections to Postgres
  ↓
[PostgreSQL]
  max_connections = 100    ← 80 available, 20 reserved for superuser/monitoring

For AWS Lambda with RDS:

[Lambda functions]
  ↓ (potentially hundreds of concurrent invocations)
[RDS Proxy]
  connection_pool_size = 20-30%  ← configures as % of max_connections
  ↓
[RDS PostgreSQL]
  max_connections per instance class (db.t3.medium ≈ 410)

RDS Proxy is managed infrastructure — no pgbouncer.ini to maintain — but it costs extra and adds ~1ms latency per query. For most teams starting out, a self-managed PgBouncer sidecar (deployed alongside your app server) is cheaper and sufficient.


Pre-Flight Checklist

Before you ship to production, verify all of these:

  • Single pool per process: Connection pool is initialized at module load time, not inside request handlers
  • Always release: Every pool.connect() or pool.getconn() is paired with client.release() or pool.putconn() in a finally block
  • Connection string points to pooler: If PgBouncer is deployed, DATABASE_URL uses port 6432, not 5432
  • ORM compatibility: If using Prisma with PgBouncer transaction mode, ?pgbouncer=true is in the connection string
  • Timeout configured: connectionTimeoutMillis set so pool exhaustion fails fast (5s) rather than hanging
  • Idle timeouts set: idle_in_transaction_session_timeout is configured on PostgreSQL to clean up stuck transactions
  • Connection math done: Sum of (app servers × pool size) + background workers + monitoring tools is below max_connections - 10 (reserve headroom for superuser access)
  • Serverless handled: Lambda / Vercel Functions either use an external pooler (PgBouncer/RDS Proxy) or a serverless-native driver
  • Monitoring in place: Alert when pg_stat_activity count exceeds 80% of max_connections
  • Recovery runbook exists: You can kill idle-in-transaction connections without a full database restart

Ask The Guild

What's your current setup? Are you connecting directly to Postgres from your app, using PgBouncer, or relying on a managed pooler like Supabase's built-in connection pooling or RDS Proxy?

Have you ever hit the too many connections wall in production? What triggered it — a deployment that added servers, a traffic spike, a connection leak, or something else entirely? Share your story (and your fix) in the comments. The more specific the better — someone else is probably debugging the same thing right now.



About Tom Hundley

Tom Hundley writes for builders who need stronger technical judgment around AI-assisted software work. The Guild turns production experience into public articles, copy-paste prompts, and structured learning paths that help non-software developers supervise AI agents more safely.
