Quality and trustworthiness assessment of AI-generated responses, including output formatting, context grounding, and communication of uncertainty or knowledge gaps.
Total Checks: 20
Delivery Formats: 3
Categories: 3
Versions: 4
[2026-04-02] Anti-sycophancy hardening: added enumeration requirements, quoting directives, negative guardrails, measurement-on-pass reporting, and cross-references. Added test infrastructure (golden and bare-minimum fixtures and manifests).
[2026-03-01] Added Step 3 submission instructions to the chunked format; improved Step 3 in the full format (the paste URL is now the primary submission method).
[2026-02-23] Hardened curl commands with -sS -L flags for redirect following and error visibility. Added response validation guidance to Step 3.
[2026-02-20] Initial release.
Related audits, picked by pack overlap with this audit:
UI/UX quality assessment for AI chat interfaces, covering response streaming, loading states, error communication, conversation history, and input handling polish.
Data handling assessment across the AI processing pipeline, covering storage, retention, PII protection, and user control over third-party model data sharing.
Safety assessment against prompt injection attacks, identifying vulnerabilities where untrusted user input might cause the AI to ignore instructions or exfiltrate data.
Token management and cost-efficiency patterns to prevent unexpected API bills, covering context growth, token limits, and efficient streaming and caching implementation.
AI-specific interaction conventions assessment covering regeneration controls, feedback mechanisms, and advanced patterns that distinguish polished AI interfaces from basic API wrappers.