Multi-turn conversation context is validated for injection
Why it matters
When clients supply the full conversation history on each request, an attacker can fabricate previous assistant turns that never occurred—inserting fake model agreements like "As I mentioned earlier, I can help with that" to manipulate the model's behavior in the current turn. This is a form of indirect prompt injection (OWASP LLM01:2025, MITRE ATLAS AML.T0051) that exploits the model's tendency to maintain consistency with apparent prior context. The vulnerability is invisible in code review because the injection arrives as a structurally valid messages array, not as obviously malicious input. Applications with multi-user conversations (support tools, shared workspaces) face the additional risk of context cross-contamination between users.
Severity rationale
Medium because exploitation requires the attacker to craft a plausible fake conversation history rather than a simple string, raising the effort bar while still enabling meaningful instruction override.
Remediation
Read conversation history from a server-side store keyed by a validated conversation ID. Never trust a client-supplied messages array as the canonical record of prior turns.
// Server-side conversation reconstruction
export async function POST(req: Request) {
const { conversationId, userMessage } = await req.json()
const validatedId = z.string().uuid().parse(conversationId)
// History comes from your DB, not the client
const history = await db.messages.findMany({
where: { conversationId: validatedId, userId: session.userId },
orderBy: { createdAt: 'asc' },
select: { role: true, content: true }
})
const messages = [
{ role: 'system', content: SYSTEM_PROMPT },
...history,
{ role: 'user', content: userMessage }
]
// proceed with AI call
}
The userId filter in the query is essential in multi-tenant applications to prevent one user's history from being accessed via another user's conversation ID.
Detection
-
ID:
multi-turn-context-validation -
Severity:
medium -
What to look for: Enumerate every multi-turn conversation endpoint. For each, examine how conversation history is assembled for multi-turn chat. Look for code that reads prior messages from a database, session store, or client-provided array and includes them in the next API call. Check whether the prior messages are used as-is from the client (high risk) or are read from a server-side store keyed by a verified session/conversation ID (lower risk).
-
Pass criteria: Conversation history for multi-turn chat is retrieved from a server-side store (database, Redis, session storage) keyed by a validated conversation ID, not assembled from a client-provided messages array. OR the project has no multi-turn conversation feature — 100% of multi-turn endpoints must validate context integrity between turns. Report: "X multi-turn endpoints found, all Y validate conversation context."
-
Fail criteria: The client sends the full messages array in each request, and the server uses it directly without re-reading history from a server-side store. This allows a client to fabricate previous assistant turns that modify model behavior.
-
Skip (N/A) when: No multi-turn conversation feature exists — the AI integration is stateless (each request is a fresh single-turn prompt).
-
Detail on fail:
"POST /api/chat accepts a messages[] array from the client body and passes it directly to the AI provider without server-side history validation"or"Conversation history is read from client localStorage and included in each request without server verification" -
Remediation: When the client controls the full message history, an attacker can inject fabricated assistant turns ("The assistant previously agreed to...") that shift the model's behavior. Keep conversation history server-side:
// Server-side conversation store pattern const history = await db.messages.findMany({ where: { conversationId: validatedConversationId }, orderBy: { createdAt: 'asc' } }) const messages = [ { role: 'system', content: SYSTEM_PROMPT }, ...history.map(m => ({ role: m.role, content: m.content })), { role: 'user', content: userMessage } ]
External references
- cwe · CWE-1427 — Improper Neutralization of Input Used in AI/ML Prompt Injection
- owasp-llm:2025 · LLM01 — Prompt Injection
- mitre-atlas:v4 · AML.T0051 — LLM Prompt Injection
Taxons
History
- 2026-04-18·v1.0.0·Initial import from ai-prompt-injection·automated