The Strangler Fig: Migrating Without Rewriting

Architecture Patterns -- Part 29 of 30

The Meeting Nobody Wanted

The year is 2019. A team at a mid-size e-commerce company is staring at a VB6 pricing engine -- 380,000 lines of code, a spaghetti nest of stored procedures, and a median response latency of 1.2 seconds. It powers every checkout on the site. It has not been touched in seven years. Nobody remembers why half of it works.

Someone in the room says the words every engineer dreads: "We should just rewrite it."

The CTO almost agrees. They sketch out a plan: three months to build the new system in .NET, one weekend to cut over, done. It sounds reasonable. It always sounds reasonable.

Fortunately, they did not do that.

Instead, they spent the next 12 months slowly growing a new system alongside the old one, routing traffic incrementally, reconciling outputs in parallel, until the old engine had been completely replaced -- without a single Saturday-night migration window, without a product freeze, and without explaining to the CEO why revenue was down 40% for three days. By day 360, the pricing engine ran at 38ms. The old system was decommissioned quietly.

This is the Strangler Fig pattern in practice. And it is the most important migration technique you will ever learn.

Where the Name Comes From

Martin Fowler named this pattern in 2004 after the strangler fig tree native to Australia. The strangler fig begins life as a seed dropped by a bird into the canopy of a host tree. It sends roots downward and vines upward, growing a lattice around the host. For years the host tree continues to live and function normally. Then, gradually, the fig takes over nutrients, structure, and space. The host dies and rots away, leaving the fig standing on its own -- hollow in the center where the host once was.

The metaphor is apt in one way that engineers often miss: during the transition, the strangler fig actually supports the host. It provides structural stability to the very thing it is replacing. That is what a good migration does. You are not just building a replacement -- you are providing continuity while you build it.

Fowler's original insight was simple: the safest way to replace a system is to intercept its inputs, build new behavior alongside the old, and gradually shift traffic until the old system can be deleted. No big bang. No feature freeze. No one-way doors.

Why Big-Bang Rewrites Almost Always Fail

Let me be direct about the data before we get into mechanics.

A February 2026 analysis of software rewrites over 25 years found the same failure pattern every time: big-bang approach, feature freeze on the old system, timelines that blow past estimates by 2-3x, and institutional knowledge quietly discarded (Potapov.dev). Netscape began rewriting Navigator from scratch in 1998. Version 5.0 never shipped. What finally arrived in late 2000 was a buggy, feature-incomplete version 6.0, by which time Internet Explorer had taken the market. Lou Montulli, one of the five original Navigator engineers, confirmed it was a primary reason he resigned.

Twitter's Fail Whale era (2008) is the famous counter-example. Their Ruby on Rails backend was crumbling. But they did not rewrite -- they replaced components one at a time over five years, migrating the full stack to Scala/JVM by 2013. The Java search server alone cut latencies by 3x. No feature freeze. No lost market share.

The statistics for smaller teams are no kinder. An analysis of 41 enterprise strangler projects from 2022-2025 found that 68% stalled within the first 90 days -- before a single piece of the monolith had actually been replaced (Software Modernization Services). Most of those failures came from three anti-patterns: starting at the UI layer, expanding scope without enforced boundaries, and refusing to touch the data model.

The lesson is not that rewrites are always wrong. It is that the conditions required for a successful big-bang rewrite are rare, and teams almost always underestimate how rare they are. The Strangler Fig pattern works because it converts an all-or-nothing gamble into a series of small, reversible steps.

The Four Phases

Phase	Action	Goal
1. Intercept	Place a routing layer in front of the old system	No change to behavior; gain control of traffic
2. Strangle	Build new implementation for one feature/route	New and old run in parallel
3. Replace	Route traffic to new implementation	New system proves itself in production
4. Remove	Decommission old code	Technical debt eliminated

The discipline is to complete all four phases for each feature before moving to the next. Teams that skip phase 4 -- that leave the old code running "just in case" -- end up maintaining two systems indefinitely. That is worse than not migrating at all.
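One way to hold yourself to that discipline is to model each feature's phase explicitly and refuse to start the next feature while another is still mid-flight. The following is a minimal sketch, not a prescribed implementation; FeatureMigration, advance, and canStartNext are hypothetical names, and "done" is an extra state I have added to mean "phase 4 completed".

```typescript
// Hypothetical model of the four phases; all names here are illustrative.
const PHASES = ["intercept", "strangle", "replace", "remove", "done"] as const;
type Phase = (typeof PHASES)[number];

interface FeatureMigration {
  name: string;
  phase: Phase;
}

// Move a feature to the next phase. Phases cannot be skipped.
function advance(feature: FeatureMigration): FeatureMigration {
  const index = PHASES.indexOf(feature.phase);
  if (index === PHASES.length - 1) {
    throw new Error(`${feature.name} is already done`);
  }
  return { ...feature, phase: PHASES[index + 1] };
}

// A new feature may start only when every earlier one has completed phase 4.
function canStartNext(features: FeatureMigration[]): boolean {
  return features.every((f) => f.phase === "done");
}
```

A feature parked in "replace" forever keeps canStartNext returning false, which is the point: the old code has to go before the next migration begins.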

The Routing Layer: Your Most Important Decision

The intercept point determines everything. You need a place where you can control which system handles each request, and switch that control without deploying new application code.

For Next.js migrations (Pages Router to App Router):

Next.js middleware gives you this intercept point. Both routers coexist in the same project by design, and middleware runs before a request is matched to a route, so you can control routing at the edge:

// middleware.ts
import { NextResponse } from 'next/server'
import type { NextRequest } from 'next/server'

// Routes already served by the new implementation
const MIGRATED_ROUTES = new Set([
  '/dashboard',
  '/settings/profile',
  '/api/v2/orders',
])

export function middleware(request: NextRequest) {
  const { pathname } = request.nextUrl

  // Migrated routes pass through untouched and resolve normally
  if (MIGRATED_ROUTES.has(pathname)) {
    return NextResponse.next()
  }

  // Everything else is rewritten to the legacy handlers, which this
  // example assumes are reachable under a /_legacy prefix
  const url = request.nextUrl.clone()
  url.pathname = `/_legacy${pathname}`
  return NextResponse.rewrite(url)
}

// Skip Next.js internals and static assets so they are never rewritten
export const config = {
  matcher: ['/((?!_next/static|_next/image|favicon.ico).*)'],
}

Money Forward's engineering team used exactly this pattern in their 2025 App Router migration, maintaining a route mapping that tracked which routes had been migrated and routing all others to the Pages Router (Money Forward Dev Blog).

For backend API migrations (REST to tRPC, Express to Hono, Node to Bun):

Next.js rewrites in next.config.js let you proxy unmigrated endpoints to the old service:

// next.config.js
module.exports = {
  async rewrites() {
    return {
      fallback: [
        {
          source: '/api/:path*',
          destination: 'https://legacy-api.internal/:path*',
        },
      ],
    }
  },
}

The fallback key is critical here. It means Next.js first tries to match the path against its own routes, and only falls back to the legacy system if no match is found. As you add new route handlers, they automatically take precedence.
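To make that precedence concrete, here is a deliberately simplified model of "own routes first, legacy as fallback". It is not how Next.js resolves routes internally; resolveRoute and the "new:" prefix are hypothetical, and the prefix match stands in for the /api/:path* pattern.

```typescript
// Simplified model only; this is NOT Next.js internals.
const LEGACY_ORIGIN = "https://legacy-api.internal";

function resolveRoute(path: string, ownRoutes: Set<string>): string {
  // A route implemented in the new app always wins.
  if (ownRoutes.has(path)) return `new:${path}`;
  // Otherwise mimic the fallback rewrite: /api/<rest> goes to legacy /<rest>.
  if (path.startsWith("/api/")) {
    return `${LEGACY_ORIGIN}/${path.slice("/api/".length)}`;
  }
  return "404";
}
```

In this model, migrating an endpoint is nothing more than adding its path to ownRoutes. The legacy destination stops being reached for that path without anyone editing the rewrite rule.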

For infrastructure migrations (Firebase to Supabase, monolith to microservices):

An Nginx reverse proxy gives you surgical control:

# nginx.conf
location /api/v2/users {
    # Migrated: send to new Supabase-backed service
    proxy_pass http://new-service:3001;
}

location /api/ {
    # Everything else: still on Firebase backend
    proxy_pass http://legacy-service:3000;
}

AWS API Gateway plays the same role for teams on AWS infrastructure. The key insight from the AWS re:Invent 2025 session on architecture patterns was the emphasis on API Gateway as a proxy with an anti-corruption layer -- letting the new system speak its own language while translating for the old one (DEV Community).
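An anti-corruption layer is easier to picture with a small example. The sketch below assumes a legacy user record with terse, database-style field names; those field names, and the functions fromLegacy and toLegacy, are invented for illustration and do not come from any system described here.

```typescript
// Hypothetical legacy payload: field names are invented for illustration.
interface LegacyUserRecord {
  USR_ID: number;
  FULL_NM: string;
  ACTV_FLG: "Y" | "N";
}

// The model the new service wants to work with.
interface User {
  id: string;
  displayName: string;
  active: boolean;
}

// Legacy -> new: the only place that knows the old field names.
function fromLegacy(record: LegacyUserRecord): User {
  return {
    id: String(record.USR_ID),
    displayName: record.FULL_NM.trim(),
    active: record.ACTV_FLG === "Y",
  };
}

// New -> legacy: rejects ids the old system could not have issued.
function toLegacy(user: User): LegacyUserRecord {
  if (!/^\d+$/.test(user.id)) {
    throw new Error(`Not a legacy id: ${user.id}`);
  }
  return {
    USR_ID: Number(user.id),
    FULL_NM: user.displayName,
    ACTV_FLG: user.active ? "Y" : "N",
  };
}
```

The design choice is containment: the new service never sees USR_ID or ACTV_FLG, so when the old system is finally removed, these two functions are the only translation code to delete.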

Practical Migrations for Vibe Coders

Next.js Pages Router to App Router: This is the canonical example of a framework designed for Strangler Fig migration. Create the /app directory alongside your existing /pages directory. Migrate one route at a time. The routers coexist with no configuration changes required (Next.js official migration guide). The main friction point is navigation hooks: useRouter from next/router works only under the Pages Router, while the App Router uses a different set of hooks from next/navigation. Build a shared wrapper that abstracts this during the transition period.
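A shared wrapper of that kind might look roughly like the sketch below. To keep it self-contained it does not import Next.js at all: PagesRouterLike, AppRouterLike, and SearchParamsLike are minimal structural stand-ins that I am assuming for the objects the real hooks return, and SharedNavigation is a hypothetical interface for your own components to depend on.

```typescript
// Minimal stand-ins for the real router objects. These shapes are assumptions
// covering only what this sketch needs; the real hooks expose more.
interface PagesRouterLike {
  query: Record<string, string | string[] | undefined>;
  push(url: string): unknown;
}
interface AppRouterLike {
  push(href: string): unknown;
}
interface SearchParamsLike {
  get(name: string): string | null;
}

// The only navigation surface shared components are allowed to use.
interface SharedNavigation {
  getParam(name: string): string | null;
  goTo(url: string): void;
}

function fromPagesRouter(router: PagesRouterLike): SharedNavigation {
  return {
    getParam(name) {
      const value = router.query[name];
      if (value === undefined) return null;
      // Repeated query keys arrive as arrays; take the first value.
      return Array.isArray(value) ? value[0] ?? null : value;
    },
    goTo(url) {
      router.push(url);
    },
  };
}

function fromAppRouter(router: AppRouterLike, searchParams: SearchParamsLike): SharedNavigation {
  return {
    getParam: (name) => searchParams.get(name),
    goTo: (url) => {
      router.push(url);
    },
  };
}
```

One caveat with this sketch: in the Pages Router, query also carries dynamic route segments, which the App Router exposes separately, so a real wrapper would need to account for that difference as well.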

REST endpoints to tRPC or Server Actions: Start with read-only endpoints. Add your tRPC router alongside the existing REST API. For each endpoint you migrate, add it to your routing layer's "migrated" list. Save data mutations for last -- they carry the highest risk and require the most output validation.

JavaScript to TypeScript (file by file): This is the Strangler Fig pattern applied at the module level. Set "allowJs": true and "strict": false in your tsconfig.json. Rename one file at a time to .ts. Tighten strictness settings last. Facebook's migration from PHP to Hack, announced in 2014, followed the same principle in a different language -- gradual typing, file-by-file, with automated tooling handling the mechanical transformations while engineers focused on the architectural judgment calls.

Firebase to Supabase: Move reads before you retire writes. Start with dual writes: write to both Firebase and Supabase, and keep reading from Firebase. Once you have validated data consistency over a meaningful window (weeks, not days), switch reads to Supabase. Then stop the Firebase writes. Wait. Then remove the Firebase SDK entirely.
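The dual-write sequence can be sketched without either SDK. Everything below is an assumption made for illustration: RecordStore is a hypothetical interface, memoryStore is a synchronous in-memory stand-in (the real Firebase and Supabase clients are asynchronous), and none of these names come from either library.

```typescript
// Hypothetical store interface; real clients would be async.
interface RecordStore {
  get(id: string): string | undefined;
  set(id: string, value: string): void;
  ids(): string[];
}

function memoryStore(): RecordStore {
  const rows = new Map<string, string>();
  return {
    get: (id) => rows.get(id),
    set: (id, value) => {
      rows.set(id, value);
    },
    ids: () => Array.from(rows.keys()),
  };
}

type ReadSource = "legacy" | "new";

function createDualWriteRepository(legacy: RecordStore, next: RecordStore) {
  let readFrom: ReadSource = "legacy";
  return {
    // Every write goes to both stores while the transition is in progress.
    write(id: string, value: string): void {
      legacy.set(id, value);
      next.set(id, value);
    },
    // Reads come from exactly one store; switching it is the cut-over.
    read(id: string): string | undefined {
      return (readFrom === "legacy" ? legacy : next).get(id);
    },
    switchReadsTo(source: ReadSource): void {
      readFrom = source;
    },
    // Ids whose values differ between stores, or that exist in only one.
    findMismatches(): string[] {
      const allIds = Array.from(new Set(legacy.ids().concat(next.ids())));
      return allIds.filter((id) => legacy.get(id) !== next.get(id));
    },
  };
}
```

In this sketch, any record written before dual writes began shows up in findMismatches until it is copied across, which is why the consistency check has to come back empty before switchReadsTo("new") is safe to call.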

The Migration Spreadsheet

You need a tracking artifact. A spreadsheet, a Notion table, a GitHub project -- the format does not matter. The columns do:

Route / Feature	Old System	New System	Traffic % on New	Status	Notes
GET /api/users	Express/Firebase	Hono/Supabase	100%	Replaced	Old code removal due Sept 1
POST /api/orders	Express/Firebase	--	0%	Not started	Data model conflict, needs design
/dashboard	Pages Router	App Router	50%	In progress	Canary on 50% of logged-in users

Update this weekly. The percentage column is not decoration -- it is your source of truth for migration progress. "Done" means 100% traffic on the new system AND old code deleted from the repository.
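The percentage column only means something if the split behind it is stable: the same user should land on the same side on every request. Here is one possible sketch of that, assuming you have a stable key such as a user id. bucketFor and chooseTarget are hypothetical names, and the hash is an FNV-1a-style function over UTF-16 code units, chosen only because it is short.

```typescript
// Map a stable key (for example a user id) to a bucket from 0 to 99.
// Any stable hash would do; this one is picked for brevity.
function bucketFor(key: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0) % 100;
}

type Target = "new" | "legacy";

// percentOnNew mirrors the "Traffic % on New" column for one route.
function chooseTarget(userId: string, percentOnNew: number): Target {
  if (percentOnNew < 0 || percentOnNew > 100) {
    throw new RangeError("percentOnNew must be between 0 and 100");
  }
  return bucketFor(userId) < percentOnNew ? "new" : "legacy";
}
```

Because each user's bucket is fixed, raising the percentage only ever moves users from legacy to new, never back. Setting it to 0 sends everyone to the old system, which makes the same value usable as a crude kill switch.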

How to Know When You Are Finished

A migration is finished when three conditions are met simultaneously:

Traffic to the old implementation is zero and has been zero for at least two weeks.
The old code has been deleted and the deletion is deployed to production.
All tests pass against the new implementation alone.

The VB6 pricing engine example used a more rigorous kill criterion: four consecutive weeks of zero discrepancy between old and new system outputs, across 8,000 automated comparison runs. That level of rigor is appropriate for financial systems. For a Next.js route migration, two weeks of clean traffic metrics is probably sufficient. The point is to define the criterion before you start, not after you want to be done.
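A comparison run of that kind can be sketched as a small wrapper that calls both implementations, always serves the old answer, and counts disagreements. This is an illustration under assumptions, not the pricing team's actual harness: shadowCompare, ComparisonLog, and meetsKillCriterion are hypothetical names, and the time window (four consecutive weeks) is not modelled, only the run count and the discrepancy count.

```typescript
interface ComparisonLog {
  runs: number;
  discrepancies: number;
}

// Run both implementations, serve the legacy answer, record disagreements.
function shadowCompare<I, O>(
  input: I,
  legacy: (input: I) => O,
  candidate: (input: I) => O,
  log: ComparisonLog,
  equal: (a: O, b: O) => boolean = (a, b) => a === b,
): O {
  const expected = legacy(input);
  log.runs += 1;
  try {
    if (!equal(expected, candidate(input))) log.discrepancies += 1;
  } catch {
    // A crash in the candidate counts as a discrepancy, never as an outage.
    log.discrepancies += 1;
  }
  return expected;
}

// Criterion defined up front: enough runs, and not a single disagreement.
function meetsKillCriterion(log: ComparisonLog, requiredRuns: number): boolean {
  return log.runs >= requiredRuns && log.discrepancies === 0;
}
```

This shape is only safe for operations without side effects, since both implementations execute on every call. For prices, comparing integer cents avoids false discrepancies from floating-point rounding.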

When NOT to Use the Strangler Fig

The pattern assumes certain conditions that do not always hold.

When the system is small enough to rewrite cleanly. If the codebase is under 5,000 lines, well-tested, and the team understands it fully, a clean rewrite over a sprint is lower-risk than maintaining a routing layer for six months.

When there is no shared routing layer. Desktop applications, embedded systems, and mobile apps often have no clean intercept point. The pattern degrades badly without one.

When data models are fundamentally incompatible. If the old system's database schema cannot coexist with the new one -- not just different tables, but different conceptual models -- dual-write reconciliation becomes prohibitively complex. This is the most common real blocker, and the 41-project analysis confirms it: teams that could not logically partition their data domain almost always stalled before day 90.

When the original developers are gone and the system is undocumented. The Strangler Fig requires you to understand what you are replacing well enough to verify that the replacement is equivalent. If the old system is a black box, you may be strangling the wrong thing.

Decision Checklist

Before committing to a Strangler Fig migration, answer these questions honestly:

Can I identify a single routing layer that intercepts all traffic to the system being replaced?
Can I migrate one discrete feature or route without touching others?
Can the old and new data models coexist, even with translation logic?
Do I have monitoring on both the old and new system to compare outputs?
Does the business have patience for 12-18 months of dual operation?
Have I defined what "done" looks like for each individual feature before starting?
Do I have a kill switch to revert 100% of traffic instantly if the new system fails?
Is the existing system large or risky enough that this overhead is justified?

If you cannot check all eight boxes, stop and figure out why before writing a line of new code. The 68% that stalled in 90 days almost all failed to check one of these -- usually the data model question or the monitoring requirement.

Ask The Guild

We want to hear from builders who have been in the middle of a live migration:

What was your intercept point -- the routing layer that made the migration possible? And what was the one thing you wish you had done differently before starting?

Share your answer in the Architecture Patterns channel. The uglier the story, the more useful it is.

Tom Hundley is a software architect with 25 years of experience. He has watched more big-bang rewrites fail than he cares to count, and helped more teams survive migrations than he expected. He writes for the AI Coding Guild's Architecture Patterns series.