A system prompt that instructs an AI to "always sound confident and authoritative" actively suppresses the model's natural uncertainty signaling — turning calibration failures into confident fabrications. Users in factual domains (legal, medical, financial, technical) rely on hedging language to know when to independently verify a claim. Stripping that signal is not a UX improvement; it is an epistemic hazard. OWASP LLM09 identifies overconfident AI output as a misinformation risk. NIST AI RMF MEASURE-2.5 requires that AI systems accurately represent their confidence levels to operators and users.
High, because suppressing uncertainty language leads users to act on fabricated or uncertain AI answers without the hedging signals that would prompt independent verification, directly enabling harm in factual-domain applications.
Replace any "always sound confident" instruction with calibrated uncertainty guidance in src/lib/ai/system.ts or wherever your system prompt is defined:
const systemPrompt = `
Be direct when you are confident. When you are uncertain, say so:
"I believe...", "I'm not certain, but...", "You may want to verify this, but..."
Never present a guess as an established fact. If you don't know, say so plainly.
`
Do not overcorrect with blanket hedging on every statement — the goal is calibrated confidence, not universal uncertainty.
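To make this check mechanical, the guidance above can be sketched as a small prompt-audit helper. This is a minimal sketch with hypothetical names (`auditSystemPrompt` and the pattern lists are not part of any library); the regexes only cover the example phrasings from this rule and would need tuning for a real codebase:

```typescript
// Hypothetical audit helper: flags instructions that suppress uncertainty
// and checks whether the prompt contains any calibration guidance.
const OVERCONFIDENCE_PATTERNS: RegExp[] = [
  /always sound confident/i,
  /never hedge/i,
  /sound authoritative/i,
];

const UNCERTAINTY_GUIDANCE_PATTERNS: RegExp[] = [
  /when you are uncertain/i,
  /i'?m not certain/i,
  /hedg(e|ing)/i,
];

function auditSystemPrompt(prompt: string): {
  overconfident: boolean;
  hasUncertaintyGuidance: boolean;
} {
  return {
    // True if the prompt orders the model to suppress uncertainty signals.
    overconfident: OVERCONFIDENCE_PATTERNS.some((p) => p.test(prompt)),
    // True if the prompt gives any calibrated-uncertainty instruction.
    hasUncertaintyGuidance: UNCERTAINTY_GUIDANCE_PATTERNS.some((p) =>
      p.test(prompt)
    ),
  };
}
```

A prompt fails this rule when `overconfident` is true, or when `hasUncertaintyGuidance` is false in a factual-domain application.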
ID: ai-response-quality.hallucination-prevention.uncertainty-signaling
Severity: high
What to look for: Enumerate all relevant files and check the system prompt for instructions requiring the AI to signal uncertainty when it is not confident in an answer. Look for guidance like "Use hedging language when uncertain", "Say 'I believe' or 'I'm not certain' when you are not confident", or "Do not present uncertain information as fact". Also check whether the system prompt prohibits excessive hedging (which can reduce usefulness): the goal is calibrated confidence, not blanket uncertainty.
Pass criteria: Zero violations are acceptable. The system prompt instructs the AI to use calibrated uncertainty language (hedging when uncertain, direct when confident) and does not instruct it to always sound confident or never to hedge.
Fail criteria: System prompt instructs the AI to "always sound confident and authoritative" or similar, or the system prompt has no uncertainty guidance at all for an application in a factual domain.
Skip (N/A) when: Application is a creative writing tool, brainstorming assistant, or similar where factual certainty is not a user expectation.
Detail on fail: "System prompt in lib/ai/system.ts says 'Respond with confidence' but provides no uncertainty signaling guidance" (max 500 chars)
Remediation: Add calibrated uncertainty instructions:
const systemPrompt = `
Be direct and clear when you are confident in your answer.
When you are uncertain, use appropriate hedging: "I believe...", "I'm not certain, but...",
"You may want to verify this, but...". Never present guesses as established facts.
If you do not know something, say so plainly.
`
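As a usage sketch, the remediated prompt is attached as the system message ahead of the conversation. The `ChatMessage` shape and `buildMessages` helper below are assumptions for illustration, not a specific SDK's API; most chat-completion APIs accept an equivalent role/content message array:

```typescript
// Assumed message shape; adapt to your chat SDK's types.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

const systemPrompt = `
Be direct and clear when you are confident in your answer.
When you are uncertain, use appropriate hedging: "I believe...", "I'm not certain, but...",
"You may want to verify this, but...". Never present guesses as established facts.
If you do not know something, say so plainly.
`;

// Hypothetical helper: prepends the calibrated-uncertainty system prompt
// to each request so every completion carries the guidance.
function buildMessages(userInput: string): ChatMessage[] {
  return [
    { role: "system", content: systemPrompt },
    { role: "user", content: userInput },
  ];
}
```

Keeping the prompt in one module (e.g. src/lib/ai/system.ts) and routing all requests through a helper like this makes the rule checkable in one place.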