Architecture Patterns — Part 17 of 30

ISR, SWR, and Cache Strategies That Scale

Written by claude-sonnet-4 · Edited by claude-sonnet-4
Tags: ISR, SWR, caching, Next.js, CDN, stale-while-revalidate, cache-invalidation, performance, architecture, Vercel



The Night the Sale Broke Everything

It's 11 PM on Black Friday eve. An e-commerce team pushes a flash sale — 40% off sitewide, starting at midnight. Their marketing team has queued the emails, the social posts are scheduled, the CEO is watching the dashboard. Midnight hits. Traffic spikes 10x. And 60% of visitors are seeing the old prices.

The culprit? A revalidate = 3600 tucked inside a product page component that nobody thought to change. One integer. Thousands of lost sales.

I've watched this scenario play out — in different forms — more times than I can count across 25 years of building systems. Cache strategies aren't a performance curiosity. They're a business-critical architecture decision. And yet most teams treat them as afterthoughts, copying a revalidate value from a tutorial and calling it done.

This article is about making deliberate, defensible choices — ISR, SWR, and CDN cache invalidation — with frameworks you can actually reason about under pressure.


What These Terms Actually Mean

Before the frameworks, let's strip away the jargon.

ISR (Incremental Static Regeneration) is Next.js's answer to a classic tension: static pages are fast but stale; server-rendered pages are fresh but slow. ISR sits between them. Pages are served as pre-rendered static HTML. After a configured interval, the next request triggers a background regeneration — the user still gets the stale page instantly, and the rebuilt page takes over for subsequent requests. This is stale-while-revalidate applied at the page level.
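To make the serve-stale-then-regenerate timing concrete, here is a minimal in-memory model of it. This is a sketch and not how Next.js implements ISR: `createIsrModel`, its `render` callback, and the explicit `nowMs` argument are hypothetical names, and regeneration runs synchronously here where a real runtime would do it in the background.

```typescript
// Toy model of ISR timing. Not Next.js internals; all names are illustrative.
type Entry = { html: string; renderedAtMs: number };

function createIsrModel(
  revalidateSeconds: number,
  render: (path: string) => string,
) {
  const cache = new Map<string, Entry>();

  // Returns the HTML that a request arriving at nowMs would receive.
  return function get(path: string, nowMs: number): string {
    const entry = cache.get(path);
    if (entry === undefined) {
      // Nothing cached yet: render on demand and store the result.
      const html = render(path);
      cache.set(path, { html, renderedAtMs: nowMs });
      return html;
    }
    const ageSeconds = (nowMs - entry.renderedAtMs) / 1000;
    if (ageSeconds >= revalidateSeconds) {
      // Interval elapsed: regenerate for later visitors,
      // but this visitor still receives the stale page.
      cache.set(path, { html: render(path), renderedAtMs: nowMs });
      return entry.html;
    }
    return entry.html;
  };
}

let version = 0;
const getPage = createIsrModel(60, (path) => `${path} v${++version}`);

// Requests at 0s, 30s, 61s, and 62s against a 60-second interval.
const served = [0, 30000, 61000, 62000].map((t) => getPage("/product/123", t));
console.log(served);
```

The request at 61 seconds is the one that matters: it triggers the rebuild but still sees the old page, which is why a price change can remain visible for one more request after the interval elapses.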

SWR (stale-while-revalidate) refers to both an HTTP Cache-Control directive and a React data-fetching pattern (popularized by Vercel's swr library and TanStack Query). The directive looks like this:

Cache-Control: max-age=60, stale-while-revalidate=300

This means: serve fresh content for 60 seconds, then serve stale content for up to 300 more seconds while a background fetch runs. After 360 seconds total, the cache is "rotten" and the request blocks on origin. As DebugBear's documentation puts it: stale doesn't mean inedible — you'll serve it once more while sourcing something fresher.
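The three windows described above can be sketched as a small classifier. The function names `parseDirectives` and `classify` are illustrative assumptions, and the parser only reads the two directives discussed here; a real cache also honors `s-maxage`, `no-cache`, and others.

```typescript
type Freshness = "fresh" | "stale-revalidate" | "expired";

// Reads max-age and stale-while-revalidate from a Cache-Control value.
// Missing directives are treated as 0 seconds.
function parseDirectives(cacheControl: string): { maxAge: number; swr: number } {
  const read = (name: string): number => {
    const match = cacheControl.match(
      new RegExp(`(?:^|[,\\s])${name}=(\\d+)`, "i"),
    );
    return match ? Number(match[1]) : 0;
  };
  return { maxAge: read("max-age"), swr: read("stale-while-revalidate") };
}

// Decides how a cache may answer a request for a response of the given age.
function classify(cacheControl: string, ageSeconds: number): Freshness {
  const { maxAge, swr } = parseDirectives(cacheControl);
  if (ageSeconds < maxAge) return "fresh"; // serve directly
  if (ageSeconds < maxAge + swr) return "stale-revalidate"; // serve stale, refetch in background
  return "expired"; // block on origin
}

const header = "max-age=60, stale-while-revalidate=300";
console.log(classify(header, 30), classify(header, 200), classify(header, 400));
```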

CDN Cache Invalidation is the escape hatch: explicitly purging cached content before its TTL expires. It sounds simple but is one of the hardest problems in distributed systems. Phil Karlton's famous quip, that the two hard things in computer science are cache invalidation and naming things, was made for a reason.


The Decision Framework: Three Questions Before You Set a TTL

Here's what I ask every team before they write a single cache directive:

Question 1: What is the blast radius of a stale serve?

Draw two axes: how often does this data change vs. how bad is it if a user sees old data.

| Data Type | Change Frequency | Stale Cost | Strategy |
|---|---|---|---|
| Product price during a sale | High | Critical (revenue loss) | On-demand revalidation + short TTL |
| Blog post body | Low | Low (cosmetic) | ISR with long revalidate (1h+) |
| Product inventory level | Medium-High | Medium (oversell risk) | SWR with short window, stale-if-error fallback |
| Navigation menu | Very Low | Very Low | Build-time static, purge on CMS publish |
| User-specific cart | N/A | Critical | Never cache on CDN (Cache-Control: private, no-store) |

The mistake I see constantly: teams apply the same revalidate = 60 to everything. Your "About Us" page and your live auction price do not have the same staleness tolerance.
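One way to keep per-data-type decisions from collapsing into a single number is to write them down as a typed policy map. The sketch below is an illustration under assumptions: the `CachePolicy` shape, the `policies` entries, and every TTL value are hypothetical and should be replaced with numbers that match your own staleness tolerance.

```typescript
type CachePolicy = {
  scope: "public" | "private";
  maxAgeSeconds: number;
  staleWhileRevalidateSeconds?: number;
  staleIfErrorSeconds?: number;
};

type DataType = "salePrice" | "blogPost" | "inventory" | "cart";

// Hypothetical TTLs for illustration only.
const policies: Record<DataType, CachePolicy> = {
  salePrice: { scope: "public", maxAgeSeconds: 30 },
  blogPost: {
    scope: "public",
    maxAgeSeconds: 3600,
    staleWhileRevalidateSeconds: 86400,
  },
  inventory: {
    scope: "public",
    maxAgeSeconds: 15,
    staleWhileRevalidateSeconds: 60,
    staleIfErrorSeconds: 300,
  },
  cart: { scope: "private", maxAgeSeconds: 0 },
};

// Builds the Cache-Control header value for a policy.
function toCacheControl(policy: CachePolicy): string {
  // User-specific data must never be stored by a shared cache.
  if (policy.scope === "private") return "private, no-store";
  const parts = ["public", `max-age=${policy.maxAgeSeconds}`];
  if (policy.staleWhileRevalidateSeconds !== undefined) {
    parts.push(`stale-while-revalidate=${policy.staleWhileRevalidateSeconds}`);
  }
  if (policy.staleIfErrorSeconds !== undefined) {
    parts.push(`stale-if-error=${policy.staleIfErrorSeconds}`);
  }
  return parts.join(", ");
}

console.log(toCacheControl(policies.inventory));
console.log(toCacheControl(policies.cart));
```

Keeping the map in one file makes a stray long TTL on a price page something a reviewer can see at a glance.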

Question 2: Who triggers content changes — a schedule or an event?

This determines whether you use time-based or on-demand revalidation.

  • Schedule-driven (news articles, blog posts, product descriptions): time-based ISR is fine. Set your revalidate to something reasonable for the content cadence.
  • Event-driven (CMS publish, price change, inventory update): time-based ISR is a liability. You need on-demand revalidation triggered by a webhook.

Vercel's ISR documentation makes this explicit: on-demand revalidation lets you purge the cache for an ISR route at any time, without waiting for a time interval to elapse — ideal when content changes based on external events like a CMS publish.

Here's a production-ready on-demand revalidation handler for Next.js App Router:

// app/api/revalidate/route.ts
import { revalidatePath, revalidateTag } from 'next/cache';
import { NextRequest, NextResponse } from 'next/server';

export async function POST(request: NextRequest) {
  const secret = request.headers.get('x-revalidation-secret');

  if (secret !== process.env.REVALIDATION_SECRET) {
    return NextResponse.json({ error: 'Unauthorized' }, { status: 401 });
  }

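  // Reject malformed JSON with a 400 instead of an unhandled exception
  let body: { path?: string; tag?: string };
  try {
    body = await request.json();
  } catch {
    return NextResponse.json({ error: 'Invalid JSON body' }, { status: 400 });
  }
  const { path, tag } = body;

  try {
    if (tag) {
      // Granular: invalidate everything tagged with this key
      revalidateTag(tag);
    } else if (path) {
      // Coarse: invalidate a specific route
      revalidatePath(path);
    } else {
      return NextResponse.json({ error: 'path or tag required' }, { status: 400 });
    }

    return NextResponse.json({ revalidated: true, now: Date.now() });
  } catch (err) {
    // Log the cause server-side; the response stays generic
    console.error('Revalidation failed', err);
    return NextResponse.json({ error: 'Revalidation failed' }, { status: 500 });
  }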
}

Your CMS (Contentful, Sanity, Storyblok) can POST to this endpoint on every publish event. Price changes in your ERP trigger it. Inventory updates from your 3PL trigger it. You're in control, not the clock.
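Because this endpoint is reachable from the public internet, the secret check deserves a second look. The handler above compares the header to the secret with `!==`, which can return sooner or later depending on where the strings first differ. A possible hardening, sketched here with Node's built-in `crypto` module, is a constant-time comparison. The helper name `secretsMatch` is an assumption, and the snippet presumes the route runs on the Node.js runtime rather than an edge runtime.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Constant-time comparison of a provided secret against the expected one.
// Hashing both sides first yields equal-length buffers, which
// timingSafeEqual requires, and avoids revealing the secret's length.
function secretsMatch(
  provided: string | null,
  expected: string | undefined,
): boolean {
  // A missing header or an unset environment variable is always a rejection.
  if (!provided || !expected) return false;
  const a = createHash("sha256").update(provided).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b);
}

console.log(secretsMatch("s3cret", "s3cret")); // same value
console.log(secretsMatch("guess", "s3cret")); // different value
```

In the route handler, the check would become `if (!secretsMatch(secret, process.env.REVALIDATION_SECRET))` followed by the same 401 response.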

Question 3: Where in the stack does the cache live?

This is the one most builders miss entirely. There isn't a cache — there are at least four:

  1. Browser cache: controlled by Cache-Control headers, lives on the user's machine
  2. CDN edge cache: Cloudflare, Fastly, CloudFront — regional nodes between the user and your origin
  3. Next.js Data Cache: Next.js 15's built-in fetch cache, persisted to disk or an external store
  4. Application-layer cache: Redis, Memcached, in-memory LRU — sitting in front of your database

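When content looks stale, you need to know which cache is holding the old version. A hard refresh in the browser (Cmd+Shift+R, or Ctrl+Shift+R on Windows and Linux) bypasses the browser cache but not the CDN. If a redeploy fixes the staleness, the old version was in Next.js's Full Route Cache (the pre-rendered page output, which is rebuilt on each deploy). If it persists after a redeploy, look at the Data Cache (which survives deployments), the CDN, or an upstream application cache.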

# Quick triage: check which cache layer is responding
# Anchor on header names so "age" does not also match unrelated headers
curl -sI https://yoursite.com/product/123 | grep -i -E "^(x-cache|cf-cache-status|age|cache-control):"

# Look for:
# x-cache: HIT → CDN is serving cached content
# Age: 3847 → content is 3847 seconds old in that cache
# CF-Cache-Status: HIT → Cloudflare specifically
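The same triage can be scripted so it runs in a post-deploy check. The function below is a rough sketch: `describeCacheHeaders` is a hypothetical helper, it only looks at the headers named in the curl example above, and the wording of its findings is illustrative rather than any CDN's official terminology.

```typescript
type HeaderMap = Record<string, string>;

// Turns a few well-known response headers into human-readable findings.
function describeCacheHeaders(headers: HeaderMap): string[] {
  // Header names are case-insensitive, so normalize before lookup.
  const h: HeaderMap = {};
  for (const [name, value] of Object.entries(headers)) {
    h[name.toLowerCase()] = value.trim();
  }

  const findings: string[] = [];

  const cdnStatus = h["cf-cache-status"] ?? h["x-cache"];
  if (cdnStatus !== undefined) {
    findings.push(
      /hit/i.test(cdnStatus)
        ? "CDN served a cached copy"
        : `CDN did not serve from cache (${cdnStatus})`,
    );
  }

  const age = h["age"];
  if (age !== undefined && /^\d+$/.test(age)) {
    findings.push(`cached copy is ${age}s old`);
  }

  const cacheControl = h["cache-control"] ?? "";
  if (/\b(private|no-store)\b/i.test(cacheControl)) {
    findings.push("response is marked uncacheable for shared caches");
  }

  if (findings.length === 0) {
    findings.push("no cache headers found; check the origin");
  }
  return findings;
}

console.log(describeCacheHeaders({ "CF-Cache-Status": "HIT", Age: "3847" }));
```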

The Next.js 15 Trap: When revalidate Alone Isn't Enough

In late 2024 through early 2025, a widely reported issue on the Next.js GitHub repository revealed a painful behavior change in Next.js 15: setting export const revalidate = 60 in an App Router page was no longer sufficient to trigger ISR in all cases. Developers discovered that dynamic routes with empty generateStaticParams arrays were rendering as dynamic server-rendered pages instead of ISR candidates, completely bypassing the cache.

The fix was counter-intuitive:

// This alone is NOT enough in Next.js 15 for certain route patterns:
export const revalidate = 60;

// You ALSO need this for nested dynamic routes:
export const dynamic = 'force-static';

// Then ISR works as expected — pages are statically cached
// and regenerated in the background after 60 seconds

The lesson isn't "Next.js is broken." The lesson is: always verify cache behavior in a production-equivalent environment before launch. Development mode has different (often no) caching. Rendering Date.now() into the page from a Server Component (for example, as a data-ts attribute) is your cheapest validation tool: if the timestamp never changes on refresh, you're cached. If it does, you're not.


The Cloudflare OpenNext Problem: When the Platform Fights Your Cache

Another real-world failure pattern emerged in early 2026 when developers deploying Next.js on Cloudflare Workers via OpenNext discovered ISR caches expiring in roughly 5 minutes — regardless of revalidation intervals set to 1 hour or 24 hours. The culprit: a mismatch between OpenNext's ISR adapter, Cloudflare R2 for cache storage, and Cache Reserve TTL semantics.

This is the platform coupling problem. ISR isn't a universal feature — it requires a runtime that can execute background regeneration jobs. As Naturaily's 2026 ISR guide notes: ISR is explicitly not supported in static export mode (output: 'export'), and in multi-region deployments, cache consistency across regions is traffic-driven, not guaranteed.

Decision rule: if you're deploying outside Vercel's managed infrastructure, validate ISR behavior exhaustively before relying on it for business-critical freshness. Build a smoke test:

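#!/bin/bash
# isr-smoke-test.sh: run after deploy to verify ISR is functioning.
# Assumes the page renders its render time as a data-ts="..." attribute.
URL=$1
if [ -z "$URL" ]; then
  echo "Usage: $0 <url>" >&2
  exit 2
fi
echo "Testing ISR on: $URL"

# Hit the page twice with a delay; a cached page shows the same timestamp both times
TS1=$(curl -s "$URL" | grep -o 'data-ts="[^"]*"' | head -1)
sleep 5
TS2=$(curl -s "$URL" | grep -o 'data-ts="[^"]*"' | head -1)

if [ -z "$TS1" ] || [ -z "$TS2" ]; then
  # Two empty results would otherwise compare equal and pass by accident
  echo "✗ No data-ts attribute found; cannot verify caching" >&2
  exit 2
elif [ "$TS1" = "$TS2" ]; then
  echo "✓ ISR caching confirmed: timestamps match"
else
  echo "✗ ISR not caching: page regenerated between requests"
  echo "  First: $TS1"
  echo "  Second: $TS2"
  exit 1
fi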

Granular Invalidation: The Mature Strategy

Purging entire page caches on every CMS update is blunt-force. The pattern that scales is tag-based cache invalidation — tagging cached data with logical keys and purging only the tags that changed.

Next.js fetch caching supports this natively:

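// Tag your fetches at data-fetch time.
// API_BASE_URL is an assumed env var: server-side fetch needs an absolute URL.
const base = process.env.API_BASE_URL;

const productRes = await fetch(`${base}/api/products/${id}`, {
  next: {
    tags: [`product-${id}`, 'products'],
    revalidate: 3600,
  },
});
const product = await productRes.json();

const featuredRes = await fetch(`${base}/api/products/featured`, {
  next: {
    tags: ['products', 'featured'],
    revalidate: 3600,
  },
});
const relatedItems = await featuredRes.json();

// In your revalidation webhook: surgical precision
import { revalidateTag } from 'next/cache';
import { NextRequest, NextResponse } from 'next/server';

export async function POST(request: NextRequest) {
  // Authenticate the caller first, as in the handler shown earlier
  const { productId, priceChanged } = await request.json();

  // Only invalidate the specific product, not the entire catalog
  revalidateTag(`product-${productId}`);

  // If price changed, also invalidate any featured/listing pages.
  // A tag only has an effect if some fetch was tagged with it.
  if (priceChanged) {
    revalidateTag('featured');
    revalidateTag('search-results');
  }

  return NextResponse.json({ revalidated: true });
}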

As FocusReactive's analysis of headless CMS caching points out: when a price changes, you should purge caches only for that specific product detail page and related listing pages — not the entire catalog. Blunt invalidation under high traffic can cause thundering herd problems: every cache miss triggers a simultaneous origin fetch, overloading your backend at exactly the wrong moment.
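A common mitigation for the thundering herd is request coalescing, sometimes called single-flight: concurrent misses for the same key share one origin fetch. The sketch below shows the idea in isolation. `createSingleFlight` and `fetchProduct` are hypothetical names, the 10 ms delay stands in for a real origin call, and the map is per-process, so it does not coalesce across multiple server instances.

```typescript
// Coalesces concurrent loads of the same key into one origin call.
function createSingleFlight<T>(load: (key: string) => Promise<T>) {
  const inFlight = new Map<string, Promise<T>>();

  return (key: string): Promise<T> => {
    const pending = inFlight.get(key);
    if (pending) return pending; // join the load already in progress

    // Remove the entry once settled so later requests load again.
    const promise = load(key).finally(() => {
      inFlight.delete(key);
    });
    inFlight.set(key, promise);
    return promise;
  };
}

let originCalls = 0;
const fetchProduct = createSingleFlight(async (id: string) => {
  originCalls += 1;
  await new Promise((resolve) => setTimeout(resolve, 10)); // simulated origin latency
  return `product-${id}`;
});

// Three simultaneous cache misses for the same product.
async function demo(): Promise<{ results: string[]; calls: number }> {
  const results = await Promise.all([
    fetchProduct("123"),
    fetchProduct("123"),
    fetchProduct("123"),
  ]);
  return { results, calls: originCalls };
}

demo().then(({ results, calls }) => console.log(results, calls));
```

All three callers receive the same result while the origin is hit once. This protects the backend from one hot key; a broad purge still produces one origin fetch per distinct key.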


The Cache Hierarchy Decision Tree

Here's the framework I use when designing cache strategies for a new system:

Is the content user-specific?
  YES → Cache-Control: private, no-store. Done. CDN never sees it.
  NO  ↓

Can you tolerate any staleness?
  NO  → Dynamic render. No caching. Accept the cost.
  YES ↓

How does content change?
  SCHEDULED (time-based) → ISR with time-based revalidate
  EVENT-DRIVEN (CMS/webhook) → ISR + on-demand revalidation
  NEVER changes → Build-time static, cache forever (immutable assets)

What's your deployment target?
  VERCEL managed → ISR works reliably out of the box
  CLOUDFLARE/EDGE → Validate ISR carefully; consider SWR headers at CDN instead
  SELF-HOSTED → Add Redis or file-system cache adapter; verify revalidation runtime

What's the blast radius of stale data?
  HIGH (prices, inventory, auth) → Short TTL + tag-based on-demand purge
  LOW (editorial content, docs) → Longer TTL, SWR window acceptable
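The tree above can also be encoded as a function, which makes the ordering of the questions explicit and easy to unit test. This is a sketch under assumptions: `ContentProfile` and `chooseStrategy` are hypothetical names, the returned strings paraphrase the tree, and the deployment-target branch is left out because it affects how you validate ISR rather than which strategy you pick.

```typescript
type ChangeModel = "scheduled" | "event-driven" | "never";

type ContentProfile = {
  userSpecific: boolean;
  toleratesStaleness: boolean;
  changeModel: ChangeModel;
  highBlastRadius: boolean;
};

// Walks the decision tree top to bottom; the first matching rule wins.
function chooseStrategy(p: ContentProfile): string {
  if (p.userSpecific) return "private, no-store (never on the CDN)";
  if (!p.toleratesStaleness) return "dynamic render, no caching";
  if (p.changeModel === "never") return "build-time static, cache as immutable";

  const base =
    p.changeModel === "event-driven"
      ? "ISR + on-demand revalidation"
      : "ISR with time-based revalidate";
  const ttl = p.highBlastRadius
    ? "short TTL + tag-based purge"
    : "longer TTL, SWR window acceptable";
  return `${base}; ${ttl}`;
}

const salePrice: ContentProfile = {
  userSpecific: false,
  toleratesStaleness: true,
  changeModel: "event-driven",
  highBlastRadius: true,
};
console.log(chooseStrategy(salePrice));
```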

Checklist: Cache Strategy Audit for Your Current Project

Before you ship, run through this:

  • Classify every data type by change frequency and staleness cost. No single TTL fits all.
  • Separate user-specific from public content. Personal data must never hit a shared CDN cache.
  • Use on-demand revalidation for anything event-driven (CMS publishes, price changes, inventory).
  • Tag your fetches with logical keys (product-123, category-electronics) for surgical invalidation.
  • Verify ISR actually works in a production-like environment. Test with a timestamp. Don't assume.
  • If using Next.js 15+ with dynamic routes, check whether you need export const dynamic = 'force-static' alongside revalidate.
  • Triage cache layers explicitly: browser → CDN → Next.js Data Cache → app cache. Know which layer is stale.
  • Set stale-if-error on non-critical content so a broken origin serves cached content instead of errors.
  • Instrument your cache hit ratio. A CDN hit rate below 80% for public pages is a sign something's wrong.
  • Write a post-deploy smoke test that verifies caching behavior, not just functional correctness.
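For the hit-ratio item in the checklist, the arithmetic is simple enough to sketch. The snippet assumes you have already collected a sample of cache statuses, for example from CDN logs; `CacheStatus`, `hitRatio`, and `needsAttention` are hypothetical names, and counting STALE as a hit is a choice made here, not a standard.

```typescript
type CacheStatus = "HIT" | "MISS" | "STALE" | "BYPASS";

// Fraction of sampled responses that were answered from cache.
function hitRatio(statuses: CacheStatus[]): number {
  if (statuses.length === 0) return 0;
  // STALE counts as a hit here: the response still came from cache.
  const hits = statuses.filter((s) => s === "HIT" || s === "STALE").length;
  return hits / statuses.length;
}

// Flags a sample whose hit ratio falls below the threshold (80% by default).
function needsAttention(statuses: CacheStatus[], threshold = 0.8): boolean {
  return statuses.length > 0 && hitRatio(statuses) < threshold;
}

const sample: CacheStatus[] = ["HIT", "HIT", "MISS", "MISS", "BYPASS"];
console.log(hitRatio(sample), needsAttention(sample));
```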

Ask The Guild

This week's community prompt:

What's the most expensive cache bug you've hit in production — and what was the root cause? Was it a TTL misconfiguration, a platform surprise (like the Next.js 15 ISR behavior change), or a CDN layer you forgot existed?

Share your war story in the Guild forum. The collective pain of this community has saved more launch nights than any documentation ever will.


Tom Hundley is a software architect with 25 years of experience helping teams build systems that survive contact with real users. He writes the Architecture Patterns series for the AI Coding Guild.
