
Context window usage or limit is communicated

ab-000343 · ai-ux-patterns.transparency.context-window-indicator
Severity: high · Status: active

Why it matters

When the conversation exceeds the model's context window, the API either errors, truncates silently, or starts forgetting earlier turns — all three outcomes confuse users who have no mental model of why the AI suddenly lost track of something they told it ten messages ago. Surfacing context usage teaches users the constraint proactively so they can branch, summarize, or start fresh before quality degrades. The alternative is an invisible cliff that users hit repeatedly without understanding why responses get worse.

Severity rationale

High because silent context truncation causes the model to appear dumber without the user knowing why.

Remediation

Estimate token usage by summing message.content.length / 4 across the turn history and comparing the total against the model's context ceiling. Render a warning at 75% of the ceiling and a strong recommendation to start a new conversation at 90%. Once the rough heuristic is in place, replace it with a real tokenizer (tiktoken, @anthropic-ai/tokenizer) in src/lib/tokens.ts.

{contextUsed > 0.75 && <p>This conversation has used {Math.round(contextUsed * 100)}% of the context window. Consider starting fresh.</p>}
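The two thresholds described above (warn at 75%, strongly recommend a new conversation at 90%) can be factored into a small helper. A minimal TypeScript sketch, with function and tier names that are illustrative rather than prescribed by this rule:

```typescript
type Message = { content: string };

// Rough token estimate: ~4 characters per token. This is a heuristic;
// swap in a real tokenizer (e.g. tiktoken) once the indicator works.
function estimateTokens(messages: Message[]): number {
  return messages.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);
}

type ContextWarning = "none" | "warn" | "strong";

// Map a usage ratio to the warning tiers from the remediation text:
// a soft warning at 75%, a strong "start fresh" recommendation at 90%.
function contextWarningLevel(usedRatio: number): ContextWarning {
  if (usedRatio >= 0.9) return "strong";
  if (usedRatio >= 0.75) return "warn";
  return "none";
}
```

The UI snippet above then becomes a switch on contextWarningLevel(estimateTokens(history) / contextLimit) instead of a single hard-coded 0.75 check.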

Detection

  • ID: ai-ux-patterns.transparency.context-window-indicator

  • Severity: high

  • What to look for: Count all references to context window limits, token counting, or conversation length tracking in the codebase. Enumerate: token counting logic, "context full" warning messages, automatic conversation truncation with notification, or visible usage indicators. Check for graceful degradation when context limits are reached — does the app truncate silently, error loudly, or warn proactively? At least 1 proactive context-length warning or indicator must exist.

  • Pass criteria: Either (a) the conversation displays a context usage indicator showing a numeric percentage or token count, or (b) when context limits are approached or reached, the user receives a clear, actionable message (not a raw API error) explaining that the conversation is getting long and suggesting what to do (start a new conversation, summarize, etc.). Report on pass: "Context indicator shows X% usage" or "Warning triggers at X% threshold."

  • Fail criteria: Context window limits are handled silently or with a raw API error. Users have no indication that conversation length affects response quality or that a limit exists.

  • Skip (N/A) when: The application exclusively uses models with effectively unlimited context (>=1M tokens) and has architectural guarantees that context limits will never be approached in normal usage. This is rare — skip with documented justification only.

  • Detail on fail: "No context length indicator found. Context limit errors from the API would surface as raw error messages or empty responses with no explanation."

  • Remediation: Context limits are a uniquely AI-specific constraint that confuses users. A simple character or message count with a warning is sufficient:

    // Estimate token count from character count (~4 chars per token)
    const estimatedTokens = messages.reduce(
      (sum, m) => sum + Math.ceil(m.content.length / 4), 0
    )
    const contextLimit = 128000 // adjust per model
    const contextUsed = estimatedTokens / contextLimit
    
    {contextUsed > 0.75 && (
      <div className="text-xs text-amber-600 px-4 pb-2">
        This conversation is getting long ({Math.round(contextUsed * 100)}% of context used).
        Consider starting a new conversation for best results.
      </div>
    )}
    
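The detection criteria also accept automatic truncation with notification as a graceful fallback. A minimal sketch of dropping the oldest turns until the history fits, returning a flag the UI can surface instead of failing silently (helper names are hypothetical):

```typescript
type Message = { role: string; content: string };

// Same ~4 chars-per-token heuristic used in the indicator snippet above.
const estimate = (ms: Message[]) =>
  ms.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);

// Remove the oldest messages until the history fits the context limit,
// and report whether truncation happened so the UI can notify the user.
function fitToContext(
  messages: Message[],
  contextLimit: number
): { messages: Message[]; truncated: boolean } {
  const kept = [...messages];
  let truncated = false;
  while (kept.length > 1 && estimate(kept) > contextLimit) {
    kept.shift(); // drop the oldest turn first
    truncated = true;
  }
  return { messages: kept, truncated };
}
```

When `truncated` is true, render a notice such as "Earlier messages were removed to stay within the model's limit" rather than letting the model silently forget them.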
