Truncated responses are detected and handled gracefully
Why it matters
When finish_reason: "length" fires and the application ignores it, users receive half-written code, truncated JSON that fails to parse, or advice that ends mid-clause, with no signal that anything was cut. They act on incomplete information, file bug reports against the wrong component, or lose trust in the assistant entirely. The error-resilience taxon requires surfacing partial-output states; silent truncation is the worst possible UX for a recoverable failure because it looks like a complete answer.
Severity rationale
Low because truncation is infrequent in practice and users can retry, but the failure mode is silent and misleading.
Remediation
After every non-streaming API call, inspect finish_reason (OpenAI, where "length" signals truncation) or stop_reason (Anthropic, where the equivalent value is "max_tokens") and propagate a truncation flag to the UI so the user sees a "Response may be incomplete" banner with a continue button. Wire this through api/chat/route.ts.
// finish_reason is 'length' when the model stopped at the max_tokens cap
const choice = response.choices[0]
return { content: choice.message.content, truncated: choice.finish_reason === 'length' }
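Because the two providers name this signal differently, one option is to normalize both into a single flag before the result reaches the route handler. The following is a minimal sketch, not either SDK's own API: `ChatResult`, `fromOpenAI`, and `fromAnthropic` are hypothetical names, and the input types model only the response fields read here rather than the full SDK types.

```typescript
// Hypothetical normalized shape handed to the UI layer.
interface ChatResult {
  content: string;
  truncated: boolean;
}

// Minimal structural types covering only the fields read below.
// They mirror the provider responses but are not the SDK types.
interface OpenAIChoiceLike {
  finish_reason: string | null;
  message: { content: string | null };
}

interface AnthropicMessageLike {
  stop_reason: string | null;
  content: Array<{ type: string; text?: string }>;
}

// OpenAI Chat Completions: finish_reason is "length" when the token limit cut the output.
function fromOpenAI(choice: OpenAIChoiceLike): ChatResult {
  return {
    content: choice.message.content ?? "",
    truncated: choice.finish_reason === "length",
  };
}

// Anthropic Messages: stop_reason is "max_tokens" for the same condition.
function fromAnthropic(message: AnthropicMessageLike): ChatResult {
  // Concatenate only the text blocks; other block types are ignored in this sketch.
  const text = message.content
    .filter((block) => block.type === "text")
    .map((block) => block.text ?? "")
    .join("");
  return { content: text, truncated: message.stop_reason === "max_tokens" };
}
```

With a shape like this, the UI only has to branch on `truncated` and never needs to know which provider produced the response.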
Detection
- ID: truncation-handling
- Severity: low
- What to look for: Enumerate all relevant files and check API call sites for handling of the finish_reason field (OpenAI returns finish_reason: "length" and Anthropic returns stop_reason: "max_tokens" when a response is cut short by max_tokens). Look for code that reads response.choices[0].finish_reason or response.stop_reason and branches on it. Check whether the UI surfaces any indication to the user when a response was truncated.
- Pass criteria: At least 1 implementation must be present. The application checks finish_reason / stop_reason after receiving a response. When the value is "length" (or the equivalent, such as Anthropic's "max_tokens"), the application either prompts for continuation or displays a notice to the user indicating the response was cut short.
- Fail criteria: The application never checks finish_reason or stop_reason, so truncated responses are silently delivered to users as if complete.
- Skip (N/A) when: Application uses streaming with no max_tokens limit, making truncation impossible.
- Detail on fail: "finish_reason not checked after API call in api/chat/route.ts; truncated responses delivered silently" (max 500 chars)
- Remediation: Check the finish reason after every non-streaming API call:

      const response = await openai.chat.completions.create({ ... })
      const choice = response.choices[0]
      if (choice.finish_reason === 'length') {
        return { content: choice.message.content, truncated: true }
      }
      return { content: choice.message.content, truncated: false }

  Surface truncated: true in the UI with a "Response may be incomplete" notice.
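The pass criteria accept either a notice or a continuation prompt. As a rough sketch of the continuation path, the helpers below build the message list for a follow-up request and join the two parts for display. `ChatMessage`, `buildContinuation`, `mergeContinuation`, and the instruction wording are all assumptions made for illustration; only the role/content message convention comes from the chat APIs themselves.

```typescript
// Hypothetical chat message shape using the common role/content convention.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Assumed wording; tune it for the model in use.
const CONTINUE_INSTRUCTION =
  "Continue exactly where the previous response stopped. Do not repeat earlier text.";

// Build the messages for a follow-up call after a truncated response.
// The original history array is left unmodified.
function buildContinuation(history: ChatMessage[], partial: string): ChatMessage[] {
  return [
    ...history,
    { role: "assistant", content: partial },
    { role: "user", content: CONTINUE_INSTRUCTION },
  ];
}

// Join the partial answer and its continuation for display.
// The result stays flagged if the continuation was itself cut short.
function mergeContinuation(
  partial: string,
  next: { content: string; truncated: boolean }
): { content: string; truncated: boolean } {
  return { content: partial + next.content, truncated: next.truncated };
}
```

Carrying `truncated` through the merge means the "Response may be incomplete" notice can stay visible until a continuation finishes normally.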
Taxons
History
- 2026-04-18·v1.0.0·Initial import from ai-response-quality·automated