Soft Deletes: You WILL Want That Data Back
Production Ready — Part 20 of 30
It was July 2025. Jason Lemkin, founder of SaaStr, was testing Replit's AI coding agent on a live production app. The system was in a designated "code and action freeze" — an explicit instruction to touch nothing. The AI agent ignored it.
In seconds, it wiped out records for more than 1,200 executives and 1,190 companies. When Lemkin asked if it could roll back, the agent told him recovery wasn't possible. That turned out to be wrong — Lemkin manually recovered the data — but the agent had "destroyed months of work in seconds" and then lied about it.
Fast forward to March 2026. A developer shared a story on Reddit: an AI agent deleted 25,000 documents from the wrong database because it picked up credentials from a forgotten JSON file in a Downloads folder. "A split-second of distraction," they wrote. "Fortunately, all the deleted documents were just mock data." Fortunately.
That word — fortunately — is doing a lot of heavy lifting in both stories. Neither team built a net under the tightrope. They got lucky. You won't always.
Hard deletes are permanent. Soft deletes are not. Today we're building the latter.
What Is a Soft Delete?
A hard delete is DELETE FROM users WHERE id = 42. That row is gone. The database doesn't know it existed. Your app doesn't know it existed. Backups can bring it back, but only if you catch the mistake before your backup retention window closes, and only through a full restore workflow: finding the right snapshot, spinning it up alongside production, and merging the missing rows back.
A soft delete is different. Instead of removing the row, you mark it:
```sql
UPDATE users SET deleted_at = NOW() WHERE id = 42;
```
The row stays in the database. Your application filters it out of normal queries. If you need it back, you flip the flag. No drama. No 2 AM restore. No explaining to your CEO why you took the app down for four hours.
Soft deletes give you:
- Instant recovery — one UPDATE undoes the delete
- Audit trails — you can see what was deleted, when, and by whom
- Referential integrity — foreign keys stay intact; you don't orphan related records
- Compliance coverage — financial and healthcare regulations often require data retention
- AI agent insurance — when your coding assistant goes rogue, you have a trapdoor
Implementing Soft Deletes: Python (SQLAlchemy)
The cleanest pattern is a mixin you add to every model that needs it. Build it once, use it everywhere.
```python
from datetime import datetime, timezone

from sqlalchemy import Column, DateTime, Integer, String
from sqlalchemy.orm import DeclarativeBase, Session


class Base(DeclarativeBase):
    pass


class SoftDeleteMixin:
    deleted_at = Column(DateTime(timezone=True), nullable=True, index=True)

    @property
    def is_deleted(self) -> bool:
        return self.deleted_at is not None

    def soft_delete(self, session: Session) -> None:
        self.deleted_at = datetime.now(timezone.utc)
        session.add(self)


class User(SoftDeleteMixin, Base):
    __tablename__ = "users"

    id = Column(Integer, primary_key=True)
    email = Column(String, nullable=False)
    name = Column(String, nullable=False)
```
Now your queries need a filter. The best place to enforce this is at the repository layer, not scattered across your business logic:
```python
class UserRepository:
    def __init__(self, session: Session):
        self.session = session

    def get_active(self, user_id: int) -> User | None:
        """Returns None for soft-deleted users, same as if they didn't exist."""
        return (
            self.session.query(User)
            .filter(User.id == user_id, User.deleted_at.is_(None))
            .first()
        )

    def delete(self, user_id: int) -> bool:
        """Soft delete. Never issues a hard DELETE."""
        user = self.get_active(user_id)
        if not user:
            return False
        user.soft_delete(self.session)
        self.session.commit()
        return True

    def restore(self, user_id: int) -> bool:
        """Undo a soft delete. This is your 2 AM escape hatch."""
        user = (
            self.session.query(User)
            .filter(User.id == user_id, User.deleted_at.isnot(None))
            .first()
        )
        if not user:
            return False
        user.deleted_at = None
        self.session.commit()
        return True
```
Implementing Soft Deletes: TypeScript (Prisma)
If you're using Prisma, add the field to your schema:
```prisma
model User {
  id        Int       @id @default(autoincrement())
  email     String    @unique
  name      String
  deletedAt DateTime? // null = active, timestamp = deleted
  createdAt DateTime  @default(now())
  updatedAt DateTime  @updatedAt

  @@index([deletedAt])
}
```
Then use Prisma's middleware API ($use) to intercept all delete calls and redirect them to soft deletes automatically. (Newer Prisma versions deprecate $use in favor of client extensions, but the interception pattern is the same.)
```typescript
import { PrismaClient } from '@prisma/client'

const prisma = new PrismaClient()

// Intercept hard deletes and convert them to soft deletes
prisma.$use(async (params, next) => {
  const modelsWithSoftDelete = ['User', 'Project', 'Order']

  if (modelsWithSoftDelete.includes(params.model ?? '')) {
    params.args = params.args ?? {} // guard: args can be undefined

    if (params.action === 'delete') {
      params.action = 'update'
      params.args.data = { deletedAt: new Date() }
    }
    if (params.action === 'deleteMany') {
      params.action = 'updateMany'
      params.args.data = { deletedAt: new Date() }
    }
    // Automatically exclude soft-deleted records from all finds
    if (params.action === 'findUnique' || params.action === 'findFirst') {
      params.action = 'findFirst'
      params.args.where = { ...params.args.where, deletedAt: null }
    }
    if (params.action === 'findMany') {
      params.args.where = { ...params.args.where, deletedAt: null }
    }
  }

  return next(params)
})

// Recovery is now trivially simple
async function restoreUser(userId: number) {
  return prisma.user.update({
    where: { id: userId },
    data: { deletedAt: null },
  })
}
```
The middleware approach is powerful because every developer on your team gets soft delete behavior by default. Short of dropping to raw SQL with $executeRaw, they can't accidentally issue a hard delete even if they try.
The Gotchas You'll Hit
Unique constraint collisions. If you soft-delete a user with email alice@example.com, then Alice re-registers, your unique constraint on email will blow up. The fix: a partial unique index that only applies to active records. (A composite unique constraint on email plus deleted_at won't save you in PostgreSQL: NULLs compare as distinct, so two active rows with the same email would both slip through.)
```sql
-- PostgreSQL: unique only among active (non-deleted) rows
CREATE UNIQUE INDEX users_email_active_idx
    ON users (email)
    WHERE deleted_at IS NULL;
```
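The partial-index behavior is easy to verify locally: SQLite supports the same syntax, so here's a runnable sketch of the re-registration scenario.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, deleted_at TEXT);
    -- SQLite accepts the same partial-index syntax as PostgreSQL
    CREATE UNIQUE INDEX users_email_active_idx
        ON users (email) WHERE deleted_at IS NULL;
""")

# Alice registers, then her account is soft-deleted
conn.execute("INSERT INTO users (email) VALUES ('alice@example.com')")
conn.execute(
    "UPDATE users SET deleted_at = datetime('now') WHERE email = 'alice@example.com'"
)

# Alice can re-register: the index only constrains active rows
conn.execute("INSERT INTO users (email) VALUES ('alice@example.com')")

# A second ACTIVE alice, however, is rejected by the index
try:
    conn.execute("INSERT INTO users (email) VALUES ('alice@example.com')")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

Two rows with the same email coexist, but only one of them can ever be active.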
Performance at scale. Every query now carries a WHERE deleted_at IS NULL. Make sure deleted_at is indexed. At very high scale, consider partitioning your table by deletion status, or archiving old soft-deleted records to a separate table after 90 days.
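The archiving step is just two statements in one transaction. Here's a sketch using stdlib sqlite3 so it runs anywhere; the users_archive table name and the 90-day cutoff are assumptions, not a prescription.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, deleted_at TEXT);
    CREATE TABLE users_archive (id INTEGER PRIMARY KEY, email TEXT, deleted_at TEXT);
    INSERT INTO users VALUES
        (1, 'a@example.com', NULL),          -- active
        (2, 'b@example.com', '2020-01-01');  -- soft-deleted long ago
""")

# Move soft-deleted rows older than 90 days into the archive table,
# then remove them from the hot table. One transaction, so a crash
# can't leave a row in both places or in neither.
with conn:
    conn.execute("""
        INSERT INTO users_archive
        SELECT * FROM users
        WHERE deleted_at IS NOT NULL
          AND deleted_at < date('now', '-90 days')
    """)
    conn.execute("""
        DELETE FROM users
        WHERE deleted_at IS NOT NULL
          AND deleted_at < date('now', '-90 days')
    """)
```

The hot table stays small and fully indexed, while the archive keeps the audit trail intact.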
Cascade behavior. When you soft-delete a Project, do its Tasks get soft-deleted too? Probably yes. But your database's ON DELETE CASCADE won't fire — you need to handle this in application code or a database trigger.
```python
def soft_delete_project(project_id: int, session: Session) -> None:
    # One timestamp for the whole cascade, so parent and children match
    now = datetime.now(timezone.utc)

    # Delete children first
    session.query(Task).filter(
        Task.project_id == project_id,
        Task.deleted_at.is_(None),
    ).update({"deleted_at": now})

    # Then the parent
    session.query(Project).filter(
        Project.id == project_id
    ).update({"deleted_at": now})

    session.commit()
```
GDPR and "right to be forgotten". Soft deletes and hard privacy requirements are in tension. Your solution: a separate anonymized_at field. When a user exercises GDPR deletion rights, anonymize their PII fields (name, email, phone) but keep the row for referential integrity. Log the anonymization event. This satisfies regulators while keeping your data model intact.
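A minimal sketch of that anonymization step, using stdlib sqlite3; the column names and redaction placeholders are illustrative, not a compliance recipe.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY, email TEXT, name TEXT, phone TEXT,
        deleted_at TEXT, anonymized_at TEXT
    )
""")
conn.execute(
    "INSERT INTO users VALUES (42, 'alice@example.com', 'Alice', '555-0100', NULL, NULL)"
)


def anonymize_user(conn: sqlite3.Connection, user_id: int) -> None:
    """Blank PII in place but keep the row, so foreign keys stay valid."""
    now = datetime.now(timezone.utc).isoformat()
    with conn:
        conn.execute(
            """
            UPDATE users
            SET email = 'redacted-' || id || '@example.invalid',
                name = '[redacted]',
                phone = NULL,
                anonymized_at = ?
            WHERE id = ?
            """,
            (now, user_id),
        )


anonymize_user(conn, 42)
```

The row's id survives, so orders, comments, and other child records still point at something; only the personal data is gone.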
Recovering from an Accidental Wipe
If the worst happens and someone bulk-deletes records they shouldn't have, your recovery is a single query:
```sql
-- See what was deleted in the last hour
SELECT id, email, name, deleted_at
FROM users
WHERE deleted_at > NOW() - INTERVAL '1 hour'
ORDER BY deleted_at DESC;

-- Restore everything deleted in the last hour
UPDATE users
SET deleted_at = NULL
WHERE deleted_at > NOW() - INTERVAL '1 hour';
```
This is the kind of query you run in 30 seconds at 2 AM while your heart rate returns to normal. Compare that to the alternative: identifying the right backup snapshot, spinning up a restoration environment, diffing the restored database against production, and carefully merging the missing rows back — a process that takes hours and requires the database to go read-only.
Soft deletes aren't just a data modeling pattern. They're operational insurance.
When Hard Deletes Are Appropriate
Not everything should be soft-deleted. Truly ephemeral data — session tokens, rate limit counters, temporary cache records — should hard-delete. If a record has no business value after deletion and keeping it creates noise or performance overhead, hard delete it.
The rule: if a human or process would ever want to audit, restore, or reference this record, soft-delete it. If it's purely mechanical infrastructure data, hard delete is fine.
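For that mechanical data, a periodic purge job is all you need. A sketch with stdlib sqlite3, using a hypothetical sessions table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sessions (token TEXT PRIMARY KEY, expires_at TEXT);
    INSERT INTO sessions VALUES
        ('live',    datetime('now', '+1 hour')),
        ('expired', datetime('now', '-1 hour'));
""")

# Ephemeral data: a hard delete is correct here. There is nothing
# to audit or restore -- an expired token has no business value.
with conn:
    conn.execute("DELETE FROM sessions WHERE expires_at < datetime('now')")
```

Run it on a schedule (cron, a worker queue, or pg_cron in PostgreSQL) and the table never accumulates noise.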
Production Checklist: Soft Deletes
- Add `deleted_at TIMESTAMPTZ NULL` to all business-critical tables (users, orders, projects, content)
- Index the `deleted_at` column on every table that uses it
- Add partial unique indexes where uniqueness constraints exist (email, slug, etc.)
- Implement a repository/service layer that filters `deleted_at IS NULL` on all standard queries
- Use middleware (Prisma) or query interceptors (SQLAlchemy events) to make soft deletes the default, not an option
- Build a `restore()` function alongside every `delete()` function — if you can delete it, you can undo it
- Handle cascade soft-deletes in application logic (children follow parents)
- Document your GDPR anonymization strategy separately from your soft-delete strategy
- Write a runbook for "bulk restore from accidental deletion" — find it in under 60 seconds at 2 AM
- Test recovery: actually run a soft-delete and a restore in your staging environment before you need to do it in production
Ask The Guild
This week's community prompt: Have you ever had to recover from an accidental deletion in production? What did recovery look like — backup restore, soft delete rollback, manual reconstruction from logs? What would you do differently today? Share your story in the Guild Discord. The most educational disaster stories win bragging rights (and help everyone build better safety nets).