When finish_reason: "length" fires and the application ignores it, users receive half-written code, truncated JSON that fails to parse, or advice that ends mid-clause — and they have no signal that anything was cut. They act on incomplete information, file bug reports against the wrong component, or lose trust in the assistant entirely. The error-resilience taxon requires surfacing partial-output states; silent truncation is the worst possible UX for a recoverable failure because it looks like a complete answer.
Severity is low because truncation is infrequent in practice and users can simply retry, but the failure mode is silent and misleading.
After every non-streaming API call, inspect finish_reason (OpenAI) or stop_reason (Anthropic) and propagate a truncation flag to the UI so the user sees a "Response may be incomplete" banner with a continue button. Wire this through api/chat/route.ts.
// After a non-streaming completion call, read the first choice and flag truncation.
const choice = response.choices[0]
return { content: choice.message.content, truncated: choice.finish_reason === 'length' }
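The flag can then drive UI state. A minimal sketch of that mapping, with hypothetical names (`ChatResult`, `truncationBanner` are illustrative, not from the codebase):

```typescript
// Shape returned by the route handler above (illustrative).
interface ChatResult {
  content: string;
  truncated: boolean;
}

// Hypothetical helper: derive banner state for the chat UI.
// Returns null when no banner is needed.
function truncationBanner(
  result: ChatResult
): { message: string; action: string } | null {
  if (!result.truncated) return null;
  return {
    message: "Response may be incomplete",
    // Button label; clicking would re-send the conversation asking the model to continue.
    action: "Continue",
  };
}
```

Keeping the banner logic in a pure function like this makes the truncation path trivial to unit-test without mocking the API.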
ID: ai-response-quality.response-formatting.truncation-handling
Severity: low
What to look for: Enumerate all relevant files, then check API call sites for handling of the finish_reason field (the OpenAI and Anthropic APIs return finish_reason: "length" or stop_reason: "max_tokens", respectively, when a response is cut short by max_tokens). Look for code that reads response.choices[0].finish_reason or response.stop_reason and branches on it. Check whether the UI surfaces any indication to the user when a response was truncated.
Pass criteria: At least one implementation must be present. The application checks finish_reason/stop_reason after receiving a response. When the value is "length" (or equivalent), the application either prompts for continuation or displays a notice to the user indicating the response was cut short.
Fail criteria: The application never checks finish_reason — truncated responses are silently delivered to users as if complete.
Skip (N/A) when: Application uses streaming with no max_tokens limit, making truncation impossible.
Detail on fail: "finish_reason not checked after API call in api/chat/route.ts — truncated responses delivered silently" (max 500 chars)
Remediation: Check the finish reason after every non-streaming API call:
const response = await openai.chat.completions.create({ ... })
const choice = response.choices[0]
// 'length' means the reply hit max_tokens and was cut off mid-generation.
if (choice.finish_reason === 'length') {
  return { content: choice.message.content, truncated: true }
}
return { content: choice.message.content, truncated: false }
Surface truncated: true in the UI with a "Response may be incomplete" notice.
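For the Anthropic Messages API, the equivalent check reads stop_reason instead. A sketch assuming the standard Messages response shape (text content blocks plus a stop_reason field), with the detection factored into a pure helper so it can be tested without network calls; the helper name is illustrative:

```typescript
// Minimal slice of an Anthropic Messages API response (only the fields read here).
interface AnthropicMessage {
  content: { type: string; text?: string }[];
  stop_reason: string | null; // "max_tokens" when the response was cut short
}

// Extract the text and a truncation flag, mirroring the OpenAI handling above.
function extractAnthropic(
  message: AnthropicMessage
): { content: string; truncated: boolean } {
  const content = message.content
    .filter((block) => block.type === "text")
    .map((block) => block.text ?? "")
    .join("");
  return { content, truncated: message.stop_reason === "max_tokens" };
}
```

The same `truncated` flag feeds the "Response may be incomplete" notice, so the UI layer stays provider-agnostic.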