API Error Handling: What to Return and What to Swallow
Production Ready — Part 6 of 30
The Stack Trace That Cost $4.5 Million
A mid-sized fintech company — let's call them what they were, careless — had a payments API running in production. It worked fine for normal requests. But when a malformed payload arrived, their Python/Flask app threw an unhandled exception and returned this to the caller:
HTTP/1.1 500 Internal Server Error
Content-Type: text/html

Traceback (most recent call last):
  File "/app/services/payment_processor.py", line 147, in process_payment
    result = db.execute(f"SELECT * FROM payments WHERE user_id = {user_id}")
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1802
    ...
OperationalError: (psycopg2.OperationalError) could not connect to server: Connection refused
    Is the server running on host "prod-db-01.internal" (10.0.1.45) and accepting
    TCP/IP connections on port 5432?
In one error response, they handed an attacker:
- The exact file path and line number of their payment processor
- The raw SQL query pattern (and the fact it was injectable)
- Their internal database hostname and IP address
- The database port
- Their Python version and dependency stack
This is not a horror story I made up. According to a 2026 analysis of API security costs, the average API security breach now costs $4.5 million — and 68% of organizations reported at least one API-related security incident in 2025. The Equixly 2025 API incident report documents case after case where excessive data exposure — including in error responses — provided the initial foothold for attackers.
Verbose errors are not a debugging feature in production. They're a reconnaissance service for attackers.
The Two-Layer Error Model
Every production API needs two completely separate error channels:
- The external response — what you return to the API caller
- The internal log — what you record for your own engineers
These two things should look nothing alike.
The external response exists to help the caller fix their request or understand what kind of failure occurred. It should never contain implementation details.
The internal log exists so your team can diagnose the problem. It should contain everything — stack traces, query parameters, database errors, the full context.
Here's the model visualized:
Incoming Request
       |
       v
[Your API Handler]
       |
       |-- Error occurs --
       |
       |----> [Internal Logger]  <-- Full stack trace, DB errors,
       |                             request context, user ID,
       |                             correlation ID, everything
       |
       v
[Error Sanitizer]
       |
       v
[Caller Response]  <-- Clean, structured, no internals
The sanitizer is the key piece most vibe coders skip. Let's build it properly.
What to Return: The RFC 9457 Standard
Before I show you code, you need to know that there's an actual internet standard for this: RFC 9457, Problem Details for HTTP APIs. It was published in July 2023, superseding RFC 7807, and it defines an application/problem+json content type that gives you a consistent, machine-readable error format.
Here's the RFC 9457 shape:
{
  "type": "https://api.example.com/errors/insufficient-funds",
  "title": "Insufficient funds",
  "status": 403,
  "detail": "Your balance is $30.00 but this transaction requires $50.00.",
  "instance": "/transactions/abc-123"
}
Breaking down each field:
| Field | Purpose | Example |
|---|---|---|
| type | A stable URI identifying the class of problem | https://api.example.com/errors/rate-limited |
| title | Short human-readable summary (same for all instances of this type) | "Too many requests" |
| status | The HTTP status code (mirrors the response code) | 429 |
| detail | Human-readable explanation specific to this occurrence | "You've made 100 requests in the last 60 seconds" |
| instance | URI identifying this specific occurrence (optional, but useful) | "/requests/req-xyz-789" |
The critical RFC 9457 philosophy: problem details are not a debugging tool for your implementation — they're a way to expose details about the HTTP interface itself. Internal implementation details have no place here.
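To make "machine-readable" concrete, here's a minimal sketch of how a caller might consume one of these responses. The helper names (parse_problem, should_retry) and the RATE_LIMITED URI are assumptions for illustration; only the member names, the media type, and the "about:blank" default come from the RFC.

```python
import json
from typing import Optional

PROBLEM_MEDIA_TYPE = "application/problem+json"

# Hypothetical problem type URI, borrowed from the table above.
RATE_LIMITED = "https://api.example.com/errors/rate-limited"


def parse_problem(content_type: str, body: str) -> Optional[dict]:
    """Return the problem details as a dict, or None if this is not a problem response."""
    # Ignore media type parameters such as "; charset=utf-8".
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != PROBLEM_MEDIA_TYPE:
        return None
    problem = json.loads(body)
    # RFC 9457: when "type" is absent, its value is assumed to be "about:blank".
    problem.setdefault("type", "about:blank")
    return problem


def should_retry(problem: dict) -> bool:
    """Branch on the stable "type" URI, never on the human-readable text."""
    return problem["type"] == RATE_LIMITED


raw = json.dumps({
    "type": RATE_LIMITED,
    "title": "Too many requests",
    "status": 429,
    "detail": "You've made 100 requests in the last 60 seconds",
})
problem = parse_problem("application/problem+json; charset=utf-8", raw)
```

The point of the sketch: clients key their logic off type, which is why it has to stay stable while title and detail remain free to change.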
Python Implementation
Here's how to build this properly in a FastAPI service:
# errors.py: your error module
import logging
import traceback
import uuid
from http import HTTPStatus
from typing import Optional

from fastapi import HTTPException, Request
from fastapi.exceptions import RequestValidationError
from fastapi.responses import JSONResponse

logger = logging.getLogger(__name__)

# Map internal exception types to safe external messages
ERROR_MAP = {
    "ResourceNotFoundError": (404, "Resource not found", "The requested resource does not exist."),
    "AuthenticationError": (401, "Authentication required", "Valid credentials are required for this endpoint."),
    "AuthorizationError": (403, "Access denied", "You do not have permission to perform this action."),
    "RateLimitError": (429, "Too many requests", "You have exceeded the rate limit. Please retry after 60 seconds."),
    "ValidationError": (422, "Invalid request", "The request data failed validation."),
}


def problem_response(status: int, title: str, detail: str,
                     instance: str, error_type: Optional[str] = None,
                     extra: Optional[dict] = None) -> JSONResponse:
    """Return an RFC 9457-compliant problem detail response."""
    body = {
        "type": error_type or f"https://api.example.com/errors/{status}",
        "title": title,
        "status": status,
        "detail": detail,
        "instance": instance,
    }
    if extra:
        body.update(extra)
    return JSONResponse(
        status_code=status,
        content=body,
        media_type="application/problem+json",
    )


async def global_exception_handler(request: Request, exc: Exception) -> JSONResponse:
    # Generate a correlation ID for this specific error occurrence
    correlation_id = str(uuid.uuid4())
    instance_uri = f"/errors/{correlation_id}"

    # LOG EVERYTHING internally: stack trace, request details, the works
    logger.error(
        "Unhandled exception",
        extra={
            "correlation_id": correlation_id,
            "path": request.url.path,
            "method": request.method,
            "exception_type": type(exc).__name__,
            # Format from the exception object itself, so the trace is right
            # even if this handler runs outside the original except block
            "stack_trace": "".join(
                traceback.format_exception(type(exc), exc, exc.__traceback__)
            ),
            # Include user context if available
            "user_id": getattr(request.state, "user_id", None),
        },
    )

    # Return ONLY the generic safe message to the caller
    return problem_response(
        status=500,
        title="Internal server error",
        detail="An unexpected error occurred. Use the instance ID to reference this error with support.",
        instance=instance_uri,
    )


async def http_exception_handler(request: Request, exc: HTTPException) -> JSONResponse:
    # HTTP exceptions (404, 403, etc.) are safe to surface: they describe
    # the HTTP interface, not implementation internals. This assumes your own
    # code never puts internal details into exc.detail.
    try:
        # title is the standard reason phrase, the same for every occurrence
        title = HTTPStatus(exc.status_code).phrase
    except ValueError:  # non-standard status code
        title = "HTTP error"
    return problem_response(
        status=exc.status_code,
        title=title,
        detail=str(exc.detail),
        instance=request.url.path,
    )


async def validation_exception_handler(request: Request, exc: RequestValidationError) -> JSONResponse:
    # Validation errors: safe to show field names and what was wrong
    # Do NOT echo back the submitted values (could contain sensitive data)
    errors = [
        {"field": ".".join(str(loc) for loc in err["loc"]), "message": err["msg"]}
        for err in exc.errors()
    ]
    return problem_response(
        status=422,
        title="Validation error",
        detail="One or more fields failed validation.",
        instance=request.url.path,
        error_type="https://api.example.com/errors/validation",
        extra={"errors": errors},
    )
Then wire up your handlers in main.py:
# main.py
from fastapi import FastAPI
from fastapi.exceptions import RequestValidationError
from starlette.exceptions import HTTPException
from .errors import global_exception_handler, http_exception_handler, validation_exception_handler
app = FastAPI()
app.add_exception_handler(Exception, global_exception_handler)
app.add_exception_handler(HTTPException, http_exception_handler)
app.add_exception_handler(RequestValidationError, validation_exception_handler)
JavaScript/TypeScript Implementation
Here's the Express.js equivalent with the same two-layer approach:
// errors.ts
import { Request, Response, NextFunction } from 'express';
import { v4 as uuidv4 } from 'uuid';
import { logger } from './logger';

interface ProblemDetail {
  type: string;
  title: string;
  status: number;
  detail: string;
  instance: string;
  [key: string]: unknown;
}
export function problemResponse(
  res: Response,
  status: number,
  title: string,
  detail: string,
  instance: string,
  extras?: Record<string, unknown>
): void {
  const body: ProblemDetail = {
    type: `https://api.example.com/errors/${status}`,
    title,
    status,
    detail,
    instance,
    ...extras,
  };
  res.status(status)
    .type('application/problem+json')
    .json(body);
}
// Global error middleware: must have 4 parameters for Express to treat it as an error handler
export function globalErrorHandler(
  err: Error,
  req: Request,
  res: Response,
  _next: NextFunction
): void {
  const correlationId = uuidv4();

  // Log everything internally
  logger.error('Unhandled error', {
    correlationId,
    path: req.path,
    method: req.method,
    errorType: err.constructor.name,
    errorMessage: err.message,
    stack: err.stack,
    userId: (req as any).userId ?? null,
  });

  // Return nothing useful to the attacker
  problemResponse(
    res,
    500,
    'Internal server error',
    `An unexpected error occurred. Reference ID: ${correlationId}`,
    req.path
  );
}
What to Swallow (And Why)
Here's a decision framework for every piece of error information:
Always swallow from responses (log internally only):
- Stack traces
- Database query text or error messages
- Internal hostnames, IPs, or ports
- File system paths
- Third-party service error messages (e.g., raw AWS or Stripe SDK errors)
- SQL state codes or ORM-level exceptions
- Environment variable names
- Dependency versions
Safe to return in responses:
- HTTP status code and meaning
- Field-level validation errors (field name + what rule failed)
- Business rule violations ("balance insufficient", "account suspended")
- Rate limit information with retry guidance
- A correlation/request ID for support reference
The trap vibe coders fall into: passing error.message directly into the response. The message might be totally safe ("Email already registered") or catastrophically unsafe ("Column 'ssn' doesn't exist in table 'users_v1'") — you can't tell without reading it. Always map exception types to safe messages explicitly, as shown in the ERROR_MAP pattern above.
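Note that ERROR_MAP is defined in errors.py but none of the handlers there actually consult it. Here's one way the lookup could work, as a sketch under assumptions: the two exception classes and the to_safe_error helper are hypothetical names, and the map is keyed on class names the same way the original is.

```python
from typing import Tuple


# Hypothetical domain exceptions; a real service would define its own.
class ResourceNotFoundError(Exception):
    pass


class RateLimitError(Exception):
    pass


# Same shape as ERROR_MAP in errors.py: class name -> (status, title, detail).
ERROR_MAP = {
    "ResourceNotFoundError": (404, "Resource not found", "The requested resource does not exist."),
    "RateLimitError": (429, "Too many requests", "You have exceeded the rate limit. Please retry after 60 seconds."),
}

# Anything not explicitly mapped collapses to this generic response.
FALLBACK = (500, "Internal server error", "An unexpected error occurred.")


def to_safe_error(exc: Exception) -> Tuple[int, str, str]:
    """Pick the caller-facing (status, title, detail) by exception type only.

    str(exc) is deliberately never read here; it belongs in the internal log.
    """
    # Walk the MRO so subclasses of a mapped exception inherit its mapping.
    for cls in type(exc).__mro__:
        if cls.__name__ in ERROR_MAP:
            return ERROR_MAP[cls.__name__]
    return FALLBACK


mapped = to_safe_error(ResourceNotFoundError("no row in payments for id=42"))
unmapped = to_safe_error(KeyError("Column 'ssn' doesn't exist in table 'users_v1'"))
```

The design choice worth copying is the default: an exception you forgot to map produces the generic 500, so a new failure mode can't leak its message just because nobody registered it.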
The Volkswagen Lesson: Error Responses as Recon
The 2025 Volkswagen API incident is a textbook case. According to the Equixly API incident analysis, their APIs returned entire JSON objects to clients, relying on client-side filtering to hide sensitive fields. When a researcher bypassed the client-side interface and called the API directly, the unfiltered responses included plaintext credentials for backend services including Salesforce, payment processors, and internal systems.
This wasn't a stack trace leak — it was business-logic-level over-exposure. The same principle applies: never rely on your frontend to hide what your API returns.
The Levo.ai 2026 API security checklist makes it explicit: "How your API fails often reveals more than how it succeeds." Test your error responses as carefully as you test your success paths. Try intentionally malformed requests, inject SQL characters, pass wrong types, send oversized payloads — and check exactly what comes back.
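That "check exactly what comes back" step can be partly automated. The sketch below scans a response body for tell-tale strings; find_leaks and the pattern list are illustrative assumptions, nowhere near exhaustive, and should be tuned to your own stack.

```python
import re
from typing import List

# Illustrative patterns only; extend them for your own frameworks and hosts.
LEAK_PATTERNS = {
    "python traceback": re.compile(r"Traceback \(most recent call last\)"),
    "file path": re.compile(r"(/usr/|/app/|/home/|site-packages)"),
    "private IPv4": re.compile(
        r"\b(10\.\d{1,3}|192\.168|172\.(1[6-9]|2\d|3[01]))\.\d{1,3}\.\d{1,3}\b"
    ),
    # Uppercase keywords only, to limit false positives on ordinary prose.
    "sql text": re.compile(r"\b(SELECT|INSERT|UPDATE|DELETE)\b.+\b(FROM|INTO|SET)\b"),
    "db driver": re.compile(r"(psycopg2|sqlalchemy|OperationalError)", re.IGNORECASE),
}


def find_leaks(body: str) -> List[str]:
    """Return the name of every pattern that matches the response body."""
    return [name for name, pattern in LEAK_PATTERNS.items() if pattern.search(body)]


leaky = (
    'File "/app/services/payment_processor.py", line 147, in process_payment\n'
    "OperationalError: (psycopg2.OperationalError) could not connect to server\n"
    'Is the server running on host "prod-db-01.internal" (10.0.1.45)?'
)
clean = '{"title": "Internal server error", "status": 500}'
```

In a test suite you would send each malformed request, pass the response text to find_leaks, and fail the test if the returned list is non-empty.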
The Correlation ID Pattern
One thing your error responses should include is a correlation ID — a random UUID generated per-request that links the safe external error to the full internal log entry.
This solves the support problem cleanly:
# What the user sees:
{
  "type": "https://api.example.com/errors/500",
  "title": "Internal server error",
  "detail": "An unexpected error occurred. Reference ID: 7f3d2a1c-9e4b-4f8d-b1c2-0a3e5f7d9b12",
  "instance": "/v1/payments/process",
  "status": 500
}

# What you grep in your logs:
$ grep "7f3d2a1c-9e4b-4f8d-b1c2-0a3e5f7d9b12" /var/log/api.log

# What you find:
{
  "correlation_id": "7f3d2a1c-9e4b-4f8d-b1c2-0a3e5f7d9b12",
  "path": "/v1/payments/process",
  "user_id": "usr_abc123",
  "exception_type": "OperationalError",
  "stack_trace": "...",
  "db_error": "could not connect to host prod-db-01.internal:5432"
}
The user gets a reference they can give to support. You get everything you need to debug. Attackers get nothing useful.
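One catch on the Python side: fields passed through logging's extra= argument are attached to the log record, but the standard Formatter only prints what its format string names, so the correlation ID may never reach the file you grep. A minimal sketch of a formatter that writes them out as JSON follows; JsonFormatter, the logger name, and the in-memory stream are assumptions for illustration.

```python
import io
import json
import logging
import uuid

# Attributes every LogRecord carries; anything else arrived via `extra=`.
_STANDARD_ATTRS = set(vars(logging.LogRecord("", 0, "", 0, "", (), None)))
_STANDARD_ATTRS |= {"message", "asctime"}  # added later by other formatters


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON line, including fields passed via `extra`."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {"level": record.levelname, "message": record.getMessage()}
        for key, value in vars(record).items():
            if key not in _STANDARD_ATTRS:
                payload[key] = value
        # default=str keeps the formatter from failing on non-JSON values.
        return json.dumps(payload, default=str)


stream = io.StringIO()  # stands in for /var/log/api.log in this sketch
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("api.errors.sketch")
logger.addHandler(handler)
logger.propagate = False

correlation_id = str(uuid.uuid4())
logger.error(
    "Unhandled exception",
    extra={"correlation_id": correlation_id, "path": "/v1/payments/process"},
)
log_line = stream.getvalue().strip()
```

With one JSON object per line, the grep shown above returns the whole entry for a given reference ID.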
Don't Forget: 404 vs. 403 Information Leakage
There's a subtle but critical error handling decision that catches vibe coders off guard: returning 403 when you should return 404.
Consider a user trying to access /api/users/456/profile when they're only authorized to access /api/users/123/profile.
Bad pattern:
GET /api/users/456/profile → 403 Forbidden
This tells the attacker: "User 456 exists. I just can't see them yet."
Better pattern:
GET /api/users/456/profile → 404 Not Found
Return 404 for any resource the caller isn't authorized to know exists. This is called authorization-aware 404 and it's the correct behavior for sensitive resources. The CybelAngel 2025 API threat report noted that 95% of API attacks in 2025 came from authenticated sessions — meaning attackers are already inside, probing what they can discover through response patterns. Don't help them enumerate.
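In code, the trick is to make "missing" and "not yours" take the same exit. A minimal sketch, assuming a hypothetical in-memory PROFILES store and a deliberately simplified rule that callers may read only their own profile:

```python
from typing import Optional, Tuple

# Hypothetical in-memory store standing in for a real database.
PROFILES = {"123": {"name": "Ada"}, "456": {"name": "Grace"}}


def get_profile(caller_id: str, target_id: str) -> Tuple[int, Optional[dict]]:
    """Return (status, body); missing and forbidden look identical to the caller."""
    profile = PROFILES.get(target_id)
    # Simplified rule for this sketch: callers may read only their own profile.
    if profile is None or caller_id != target_id:
        return 404, None
    return 200, profile


forbidden = get_profile("123", "456")  # exists, but belongs to someone else
missing = get_profile("123", "999")    # does not exist at all
own = get_profile("123", "123")
```

The caller can't tell the two failures apart, but your internal log still should: record the authorization failure there so probing shows up in monitoring.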
Quick Checklist
Before shipping any API endpoint to production, verify:
- Global exception handler exists — no uncaught exceptions ever reach the caller as raw errors
- Stack traces are never in response bodies — only in internal logs
- Database errors are swallowed — callers see "internal error", not SQL text
- Third-party SDK errors are wrapped — raw Stripe/AWS/Twilio messages stay internal
- All errors use a consistent JSON structure, ideally RFC 9457 application/problem+json
- Correlation IDs link external errors to internal logs, so support can look up any incident
- Validation errors expose field names but not echoed values — show what was wrong, not what was submitted
- 404 used for unauthorized resource access — don't confirm existence of resources the caller can't see
- Error responses are tested explicitly — malformed requests, auth failures, edge cases all checked
- Dev/staging verbose mode is disabled in production: no DEBUG=True or NODE_ENV=development
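The last item is easy to enforce at startup rather than by memory. A rough sketch: APP_ENV is a hypothetical variable naming the deployment environment, and the accepted truthy spellings of DEBUG are an assumption about your config conventions.

```python
from typing import Mapping


def assert_production_safe(env: Mapping[str, str]) -> None:
    """Raise if the environment looks like production with verbose errors on."""
    # APP_ENV is a hypothetical variable naming the deployment environment.
    if env.get("APP_ENV") != "production":
        return
    if env.get("DEBUG", "").strip().lower() in {"1", "true", "yes"}:
        raise RuntimeError("DEBUG must be off in production")
    # An unset NODE_ENV is tolerated here; only a non-production value fails.
    if env.get("NODE_ENV", "production") != "production":
        raise RuntimeError("NODE_ENV must be 'production' in production")


def is_safe(env: Mapping[str, str]) -> bool:
    """Convenience wrapper that reports the result instead of raising."""
    try:
        assert_production_safe(env)
    except RuntimeError:
        return False
    return True
```

Calling assert_production_safe(os.environ) before the app starts serving turns a forgotten flag into a failed deploy instead of a leaked stack trace.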
Ask The Guild
Community prompt: Have you ever shipped a verbose error message to production — or found one in a codebase you inherited? What was leaking, and how did you fix it? Share your war story (sanitized, of course) in the comments. Bonus points if you've implemented RFC 9457 and want to show off your error registry setup.