Unbounded user input flowing into LLM calls enables two distinct attack classes. First, context-window flooding: an attacker sends 50,000 characters designed to push your system prompt out of the model's effective attention range, weakening instruction adherence—a pattern consistent with OWASP LLM01:2025 evasion techniques. Second, cost amplification: a single API call with a maxed-out context window can cost 10–50× a normal request, making your AI feature a direct financial target (NIST AI RMF MANAGE 1.3). CWE-20 covers the missing input validation directly. For applications with per-seat or consumption billing models, uncontrolled token spend can eliminate margin on every request.
High, because unbounded input enables both cost-amplification attacks (immediate financial damage) and context-flooding evasion (degraded injection defenses at scale).
Enforce a maximum length constraint at the schema layer, before the value ever reaches the AI call. A 4,000-character ceiling works for most chat UIs; document-analysis flows should apply per-chunk limits instead.
// Zod schema with enforced upper bound
import { z } from 'zod'

const ChatInputSchema = z.object({
  message: z.string().min(1).max(4000),
  conversationId: z.string().uuid().optional()
})

export async function POST(req: Request) {
  const body = ChatInputSchema.safeParse(await req.json())
  if (!body.success) {
    return Response.json({ error: 'Invalid input' }, { status: 400 })
  }
  // body.data.message is now bounded before reaching the AI provider
}
For token-sensitive flows (RAG, document analysis), add a token-counting step after length validation, using tiktoken or your provider's counting endpoint. Character limits alone can under-count: compact Unicode such as CJK text or emoji may encode to several tokens per character, so input that passes a character check can still blow the token budget.
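As a cheap pre-screen before an exact count, UTF-8 byte length gives a guaranteed upper bound on BPE token count, since every token encodes at least one byte. The sketch below uses that bound; the constant and function names are illustrative, and anything over the byte ceiling still needs a real tokenizer (tiktoken or the provider endpoint) for an exact count.

```typescript
// Sketch: byte-length pre-screen before an exact token count.
// A BPE token always encodes at least one UTF-8 byte, so byte length is a
// safe upper bound on token count. MAX_TOKENS is an assumed budget.
const MAX_TOKENS = 4096

function utf8ByteLength(text: string): number {
  return new TextEncoder().encode(text).length
}

function cannotExceedTokenBudget(text: string, maxTokens = MAX_TOKENS): boolean {
  // true  => guaranteed within budget, skip the expensive tokenizer call
  // false => might exceed budget, run an exact count before sending
  return utf8ByteLength(text) <= maxTokens
}
```

Note the asymmetry this catches: 4,096 ASCII characters always pass, while 2,000 CJK characters occupy roughly 6,000 bytes and get flagged for an exact count even though they clear a 4,000-character limit.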
ID: ai-prompt-injection.input-sanitization.input-length-limits
Severity: high
What to look for: Count all LLM-facing input endpoints. For each, look for validation of user input length before it is passed to the AI provider. Check for: maximum length checks on request body fields that become prompt content, middleware that enforces content-length limits, or schema validation (Zod, Joi, Yup, etc.) with string .max() constraints on prompt-related fields. Also check for token counting before submission to catch cases where character limits alone are insufficient.
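One of the patterns above, a coarse content-length gate, can be sketched as a fetch-style handler. The 16 KB ceiling, route shape, and helper name are assumptions for illustration, not part of the rule:

```typescript
// Sketch: reject oversized request bodies before parsing or schema validation.
// MAX_BODY_BYTES is an assumed ceiling; tune it to your schema's limits.
const MAX_BODY_BYTES = 16_384

function exceedsBodyLimit(contentLength: string | null, max = MAX_BODY_BYTES): boolean {
  return Number(contentLength ?? '0') > max
}

export async function POST(req: Request): Promise<Response> {
  if (exceedsBodyLimit(req.headers.get('content-length'))) {
    return Response.json({ error: 'Payload too large' }, { status: 413 })
  }
  // Content-Length can be absent or dishonest; schema .max() checks after
  // parsing remain the authoritative limit. This gate just fails fast.
  return Response.json({ ok: true })
}
```

This kind of middleware is a complement to, not a substitute for, field-level `.max()` constraints, since a body can be small overall yet concentrate all its bytes in the prompt field.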
Pass criteria: User-provided text that flows into prompts has an enforced maximum length (via schema validation, an explicit length check, or token counting) before it reaches the AI API call, with a ceiling of no more than 4096 tokens or an equivalent character limit, enforced server-side. Report: "X LLM input endpoints found, all Y enforce length limits."
Fail criteria: No length validation found on user input fields that are included in prompt messages. Input is passed to the AI provider without any size constraint.
Skip (N/A) when: No AI provider integration detected.
Detail on fail: "No input length validation found on the userMessage field in POST /api/chat before it is included in the AI prompt" or "Zod schema for chat input uses z.string() without .max() constraint"
Remediation: Unbounded input allows attackers to craft very long prompts designed to overwhelm the context window, consume tokens expensively, or gradually shift model behavior through sheer volume of injected content. Enforce limits at the schema layer:
// Zod schema example
import { z } from 'zod'

const ChatInput = z.object({
  message: z.string().min(1).max(4000), // enforce a reasonable upper bound
  conversationId: z.string().uuid().optional()
})
Choose a limit that makes sense for your use case. A 4,000-character limit works for most chat applications. For RAG or document-analysis flows, set limits per document chunk, not just per message.
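A per-chunk limit for document flows might look like the following sketch; the constant and function name are illustrative:

```typescript
// Sketch: enforce a per-chunk ceiling before any chunk reaches the model.
const MAX_CHUNK_CHARS = 2000

function assertChunksWithinLimit(chunks: string[]): string[] {
  const oversized = chunks
    .map((chunk, i) => ({ length: chunk.length, i }))
    .filter(({ length }) => length > MAX_CHUNK_CHARS)
  if (oversized.length > 0) {
    const indices = oversized.map(({ i }) => i).join(', ')
    throw new Error(`Chunk(s) at index ${indices} exceed ${MAX_CHUNK_CHARS} characters`)
  }
  return chunks
}
```

Rejecting (or re-splitting) oversized chunks at ingestion keeps any single retrieval from dominating the context window, which matters because per-message limits do not constrain what the retriever injects.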