ISR, SWR, and Cache Strategies That Scale
Architecture Patterns — Part 17 of 30
The Night the Sale Broke Everything
It's 11 PM on Black Friday eve. An e-commerce team pushes a flash sale — 40% off sitewide, starting at midnight. Their marketing team has queued the emails, the social posts are scheduled, the CEO is watching the dashboard. Midnight hits. Traffic spikes 10x. And 60% of visitors are seeing the old prices.
The culprit? A `revalidate = 3600` tucked inside a product page component that nobody thought to change. One integer. Thousands of lost sales.
I've watched this scenario play out — in different forms — more times than I can count across 25 years of building systems. Cache strategies aren't a performance curiosity. They're a business-critical architecture decision. And yet most teams treat them as afterthoughts, copying a revalidate value from a tutorial and calling it done.
This article is about making deliberate, defensible choices — ISR, SWR, and CDN cache invalidation — with frameworks you can actually reason about under pressure.
What These Terms Actually Mean
Before the frameworks, let's strip away the jargon.
ISR (Incremental Static Regeneration) is Next.js's answer to a classic tension: static pages are fast but stale; server-rendered pages are fresh but slow. ISR sits between them. Pages are served as pre-rendered static HTML. After a configured interval, the next request triggers a background regeneration — the user still gets the stale page instantly, and the rebuilt page takes over for subsequent requests. This is stale-while-revalidate applied at the page level.
SWR (stale-while-revalidate) refers to both an HTTP Cache-Control directive and a React data-fetching pattern (popularized by Vercel's swr library and TanStack Query). The directive looks like this:
```
Cache-Control: max-age=60, stale-while-revalidate=300
```
This means: serve fresh content for 60 seconds, then serve stale content for up to 300 more seconds while a background fetch runs. After 360 seconds total, the cache is "rotten" and the request blocks on origin. As DebugBear's documentation puts it: stale doesn't mean inedible — you'll serve it once more while sourcing something fresher.
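To make those numbers concrete, here's a small sketch (my illustration, not any library's API) of the three states a cache moves through under that directive:

```typescript
// Classify how a cache should respond, given the age of the cached
// response and the two directive values from the header above.
type CacheState = 'fresh' | 'stale-revalidating' | 'expired';

function classify(ageSeconds: number, maxAge: number, swr: number): CacheState {
  if (ageSeconds < maxAge) return 'fresh';                    // serve from cache, no fetch
  if (ageSeconds < maxAge + swr) return 'stale-revalidating'; // serve stale, refetch in background
  return 'expired';                                           // block on origin
}
```

Here, `classify(120, 60, 300)` falls in the stale-while-revalidate window: the cache responds immediately with the stale copy and refreshes in the background.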
CDN Cache Invalidation is the escape hatch — explicitly purging cached content before its TTL expires. It sounds simple, but it's one of the hardest problems in distributed systems; Phil Karlton's famous quip that the two hard things in computer science are cache invalidation and naming things exists for a reason.
The Decision Framework: Three Questions Before You Set a TTL
Here's what I ask every team before they write a single cache directive:
Question 1: What is the blast radius of a stale serve?
Draw two axes: how often does this data change vs. how bad is it if a user sees old data.
| Data Type | Change Frequency | Stale Cost | Strategy |
|---|---|---|---|
| Product price during a sale | High | Critical (revenue loss) | On-demand revalidation + short TTL |
| Blog post body | Low | Low (cosmetic) | ISR with long revalidate (1h+) |
| Product inventory level | Medium-High | Medium (oversell risk) | SWR with short window, stale-if-error fallback |
| Navigation menu | Very Low | Very Low | Build-time static, purge on CMS publish |
| User-specific cart | N/A | Critical | Never cache on CDN — Cache-Control: private, no-store |
The mistake I see constantly: teams apply the same `revalidate = 60` to everything. Your "About Us" page and your live auction price do not have the same staleness tolerance.
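To make that concrete, here's a sketch of per-route `revalidate` values — file paths and numbers are illustrative, not recommendations:

```typescript
// app/about/page.tsx — editorial content, cosmetic staleness: revalidate daily
export const revalidate = 86400;

// A revenue-critical route like app/product/[id]/page.tsx would instead
// use something much shorter, paired with on-demand purging:
//   export const revalidate = 300;
```

The point is that each route's TTL is a deliberate answer to the blast-radius question, not a copied constant.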
Question 2: Who triggers content changes — a schedule or an event?
This determines whether you use time-based or on-demand revalidation.
- Schedule-driven (news articles, blog posts, product descriptions): time-based ISR is fine. Set your `revalidate` to something reasonable for the content cadence.
- Event-driven (CMS publish, price change, inventory update): time-based ISR is a liability. You need on-demand revalidation triggered by a webhook.
Vercel's ISR documentation makes this explicit: on-demand revalidation lets you purge the cache for an ISR route at any time, without waiting for a time interval to elapse — ideal when content changes based on external events like a CMS publish.
Here's a production-ready on-demand revalidation handler for Next.js App Router:
```typescript
// app/api/revalidate/route.ts
import { revalidatePath, revalidateTag } from 'next/cache';
import { NextRequest, NextResponse } from 'next/server';

export async function POST(request: NextRequest) {
  // Reject any caller that doesn't present the shared secret
  const secret = request.headers.get('x-revalidation-secret');
  if (secret !== process.env.REVALIDATION_SECRET) {
    return NextResponse.json({ error: 'Unauthorized' }, { status: 401 });
  }

  const body = await request.json();
  const { path, tag } = body;

  try {
    if (tag) {
      // Granular: invalidate everything tagged with this key
      revalidateTag(tag);
    } else if (path) {
      // Coarse: invalidate a specific route
      revalidatePath(path);
    } else {
      return NextResponse.json({ error: 'path or tag required' }, { status: 400 });
    }
    return NextResponse.json({ revalidated: true, now: Date.now() });
  } catch (err) {
    return NextResponse.json({ error: 'Revalidation failed' }, { status: 500 });
  }
}
```
Your CMS (Contentful, Sanity, Storyblok) can POST to this endpoint on every publish event. Price changes in your ERP trigger it. Inventory updates from your 3PL trigger it. You're in control, not the clock.
Question 3: Where in the stack does the cache live?
This is the one most builders miss entirely. There isn't a cache — there are at least four:
- Browser cache: controlled by `Cache-Control` headers, lives on the user's machine
- CDN edge cache: Cloudflare, Fastly, CloudFront — regional nodes between the user and your origin
- Next.js Data Cache: Next.js 15's built-in fetch cache, persisted to disk or an external store
- Application-layer cache: Redis, Memcached, in-memory LRU — sitting in front of your database
When content looks stale, you need to know which cache is holding the old version. A hard refresh in the browser (Cmd+Shift+R) bypasses the browser cache but not the CDN. If a redeploy fixes the staleness, it was Next.js's Full Route Cache. If it persists after a redeploy, it's the CDN or an upstream application cache.
```shell
# Quick triage: check which cache layer is responding
curl -sI https://yoursite.com/product/123 | grep -i -E "(x-cache|cf-cache|age|cache-control)"

# Look for:
#   x-cache: HIT          → CDN is serving cached content
#   Age: 3847             → content is 3847 seconds old in that cache
#   CF-Cache-Status: HIT  → Cloudflare specifically
```
The Next.js 15 Trap: When revalidate Alone Isn't Enough
In late 2024 through early 2025, a widely-reported issue on the Next.js GitHub repository revealed a painful behavior change in Next.js 15: setting export const revalidate = 60 in an App Router page was no longer sufficient to trigger ISR in all cases. Developers discovered that dynamic routes with empty generateStaticParams arrays were rendering as dynamic server-rendered pages instead of ISR candidates — completely bypassing the cache.
The fix was counter-intuitive:
```typescript
// This alone is NOT enough in Next.js 15 for certain route patterns:
export const revalidate = 60;

// You ALSO need this for nested dynamic routes:
export const dynamic = 'force-static';

// Then ISR works as expected — pages are statically cached
// and regenerated in the background after 60 seconds
```
The lesson isn't "Next.js is broken." The lesson is: always verify cache behavior in a production-equivalent environment before launch. Development mode has different (often no) caching. A `console.log(Date.now())` in a Server Component is your cheapest validation tool — if the timestamp never changes on refresh, you're cached. If it does, you're not.
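One way to wire that up is a dedicated probe page — hypothetical, my illustration — that stamps its render time into a `data-ts` attribute a post-deploy script can grep for:

```tsx
// app/cache-probe/page.tsx — hypothetical ISR probe page
export const revalidate = 60;

export default function CacheProbe() {
  const ts = Date.now(); // fixed per regeneration, not per request
  console.log('CacheProbe rendered at', ts); // shows in server logs only when the page regenerates
  return <time data-ts={String(ts)}>{new Date(ts).toISOString()}</time>;
}
```

If two requests a few seconds apart return the same `data-ts`, the page is being served from cache.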
The Cloudflare OpenNext Problem: When the Platform Fights Your Cache
Another real-world failure pattern emerged in early 2026 when developers deploying Next.js on Cloudflare Workers via OpenNext discovered ISR caches expiring in roughly 5 minutes — regardless of revalidation intervals set to 1 hour or 24 hours. The culprit: a mismatch between OpenNext's ISR adapter, Cloudflare R2 for cache storage, and Cache Reserve TTL semantics.
This is the platform coupling problem. ISR isn't a universal feature — it requires a runtime that can execute background regeneration jobs. As Naturaily's 2026 ISR guide notes: ISR is explicitly not supported in static export mode (output: 'export'), and in multi-region deployments, cache consistency across regions is traffic-driven, not guaranteed.
Decision rule: if you're deploying outside Vercel's managed infrastructure, validate ISR behavior exhaustively before relying on it for business-critical freshness. Build a smoke test:
```shell
#!/bin/bash
# isr-smoke-test.sh — Run after deploy to verify ISR is functioning
URL=$1
echo "Testing ISR on: $URL"

# Hit the page twice with a delay — second hit should show same timestamp
TS1=$(curl -s "$URL" | grep -o 'data-ts="[^"]*"' | head -1)
sleep 5
TS2=$(curl -s "$URL" | grep -o 'data-ts="[^"]*"' | head -1)

if [ "$TS1" = "$TS2" ]; then
  echo "✓ ISR caching confirmed — timestamps match"
else
  echo "✗ ISR not caching — page regenerated between requests"
  echo "  First:  $TS1"
  echo "  Second: $TS2"
fi
```
Granular Invalidation: The Mature Strategy
Purging entire page caches on every CMS update is blunt-force. The pattern that scales is tag-based cache invalidation — tagging cached data with logical keys and purging only the tags that changed.
Next.js fetch caching supports this natively:
```typescript
// Tag your fetches at data-fetch time
const product = await fetch(`/api/products/${id}`, {
  next: {
    tags: [`product-${id}`, 'products'],
    revalidate: 3600,
  },
});

const relatedItems = await fetch('/api/products/featured', {
  next: {
    tags: ['products', 'featured'],
    revalidate: 3600,
  },
});
```

```typescript
// In your revalidation webhook — surgical precision
import { revalidateTag } from 'next/cache';
import { NextRequest, NextResponse } from 'next/server';

export async function POST(request: NextRequest) {
  const { productId, priceChanged, inventoryChanged } = await request.json();

  // Only invalidate the specific product — not the entire catalog
  revalidateTag(`product-${productId}`);

  // If price changed, also invalidate any featured/listing pages
  if (priceChanged) {
    revalidateTag('featured');
    revalidateTag('search-results');
  }

  return NextResponse.json({ revalidated: true });
}
```
As FocusReactive's analysis of headless CMS caching points out: when a price changes, you should purge caches only for that specific product detail page and related listing pages — not the entire catalog. Blunt invalidation under high traffic can cause thundering herd problems: every cache miss triggers a simultaneous origin fetch, overloading your backend at exactly the wrong moment.
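One common mitigation for that thundering herd — framework-agnostic, names illustrative — is single-flight request coalescing: concurrent misses for the same key share one origin fetch instead of stampeding the backend.

```typescript
// Single-flight guard: the first miss for a key starts the origin fetch;
// concurrent misses for the same key await that same promise.
const inFlight = new Map<string, Promise<unknown>>();

async function fetchOnce<T>(key: string, origin: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key);
  if (existing) return existing as Promise<T>; // coalesce onto the in-flight fetch

  const p = origin().finally(() => inFlight.delete(key)); // clear slot on settle
  inFlight.set(key, p);
  return p;
}
```

After a purge, a burst of requests for `product-123` then costs one origin hit instead of one per request.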
The Cache Hierarchy Decision Tree
Here's the framework I use when designing cache strategies for a new system:
```
Is the content user-specific?
  YES → Cache-Control: private, no-store. Done. CDN never sees it.
  NO ↓

Can you tolerate any staleness?
  NO → Dynamic render. No caching. Accept the cost.
  YES ↓

How does content change?
  SCHEDULED (time-based)     → ISR with time-based revalidate
  EVENT-DRIVEN (CMS/webhook) → ISR + on-demand revalidation
  NEVER changes              → Build-time static, cache forever (immutable assets)

What's your deployment target?
  VERCEL managed  → ISR works reliably out of the box
  CLOUDFLARE/EDGE → Validate ISR carefully; consider SWR headers at CDN instead
  SELF-HOSTED     → Add Redis or file-system cache adapter; verify revalidation runtime

What's the blast radius of stale data?
  HIGH (prices, inventory, auth) → Short TTL + tag-based on-demand purge
  LOW (editorial content, docs)  → Longer TTL, SWR window acceptable
```
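The first three questions of the tree can be sketched as a function — field names and return strings are my illustration, not an API:

```typescript
interface ContentProfile {
  userSpecific: boolean;
  tolerateStaleness: boolean;
  changeMode: 'scheduled' | 'event-driven' | 'never';
}

// Walk the decision tree top to bottom; the first matching branch wins.
function chooseStrategy(p: ContentProfile): string {
  if (p.userSpecific) return 'private, no-store';
  if (!p.tolerateStaleness) return 'dynamic render, no caching';
  if (p.changeMode === 'never') return 'build-time static, immutable';
  if (p.changeMode === 'event-driven') return 'ISR + on-demand revalidation';
  return 'ISR with time-based revalidate';
}
```

Encoding the tree like this also makes a useful design-review artifact: every route in the system should map to exactly one branch.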
Checklist: Cache Strategy Audit for Your Current Project
Before you ship, run through this:
- Classify every data type by change frequency and staleness cost. No single TTL fits all.
- Separate user-specific from public content. Personal data must never hit a shared CDN cache.
- Use on-demand revalidation for anything event-driven (CMS publishes, price changes, inventory).
- Tag your fetches with logical keys (`product-123`, `category-electronics`) for surgical invalidation.
- Verify ISR actually works in a production-like environment. Test with a timestamp. Don't assume.
- If using Next.js 15+ with dynamic routes, check whether you need `export const dynamic = 'force-static'` alongside `revalidate`.
- Triage cache layers explicitly: browser → CDN → Next.js Data Cache → app cache. Know which layer is stale.
- Set `stale-if-error` on non-critical content so a broken origin serves cached content instead of errors.
- Instrument your cache hit ratio. A CDN hit rate below 80% for public pages is a sign something's wrong.
- Write a post-deploy smoke test that verifies caching behavior, not just functional correctness.
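For the `stale-if-error` item, the response header can combine all three directives — values here are illustrative: fresh for 5 minutes, served stale while revalidating for 10 more, and served stale for up to a day if the origin is erroring.

```
Cache-Control: max-age=300, stale-while-revalidate=600, stale-if-error=86400
```

`stale-if-error` is defined alongside `stale-while-revalidate` in RFC 5861; CDN support varies, so verify your provider honors it before relying on it.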
Ask The Guild
This week's community prompt:
What's the most expensive cache bug you've hit in production — and what was the root cause? Was it a TTL misconfiguration, a platform surprise (like the Next.js 15 ISR behavior change), or a CDN layer you forgot existed?
Share your war story in the Guild forum. The collective pain of this community has saved more launch nights than any documentation ever will.
Tom Hundley is a software architect with 25 years of experience helping teams build systems that survive contact with real users. He writes the Architecture Patterns series for the AI Coding Guild.