Documentation
Everything you need to understand and use AuditBuffet audit prompts.
New: Visual Guide — step-by-step screenshots for running audits in Claude Code, Cursor, Replit, Bolt, ChatGPT, and more.
Getting Started
AuditBuffet is a library of adversarially-tested audit prompts for AI-built applications. Each audit prompt runs inside your existing AI coding tool — Claude Code, Cursor, Bolt, Lovable, v0, or any other tool — with your project open. No plugins, no integrations, no setup on AuditBuffet required.
The audit prompt instructs your AI tool to inspect your project and produce a structured JSON telemetry block. You paste that JSON into AuditBuffet to get scored results, category breakdowns, and percentile rankings against other projects.
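The shape of that telemetry block looks roughly like the sketch below. The field names here are illustrative assumptions, not the exact AuditBuffet schema:

```typescript
// Illustrative shape of a telemetry block. Field names are
// assumptions for the sake of example, not the real schema.
interface CheckResult {
  id: string;
  result: "pass" | "fail" | "skip" | "error";
  severity: "critical" | "warning" | "info";
  detail?: string; // failure message, capped at 500 characters
}

interface TelemetryBlock {
  audit: string;
  category: string;
  stack: { framework: string; language: string };
  checks: CheckResult[];
}

const example: TelemetryBlock = {
  audit: "security-headers",
  category: "Security",
  stack: { framework: "Next.js", language: "TypeScript" },
  checks: [
    { id: "csp-present", result: "pass", severity: "critical" },
    {
      id: "x-frame-options",
      result: "fail",
      severity: "warning",
      detail: "Header not set on any route",
    },
  ],
};

console.log(JSON.stringify(example, null, 2));
```

Notice there is no source code anywhere in the block — only check IDs, results, and metadata (see Telemetry & Privacy below).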
Start with the Stack Scan: open your project in your AI tool, paste the prompt, and it detects your tech stack in about 30 seconds. No account required — the Stack Scan creates your project on AuditBuffet automatically.
How to Run Audits
Every audit page has three prompt formats. Pick the one for your tool:
Claude Code (terminal)
Use the Full Format. Paste the prompt directly into Claude Code and press Enter. Claude Code has full project context and can run shell commands, so the audit will auto-submit results via curl. For long prompts, save to a file and tell Claude: “Read prompt.md and run this audit on the project.”
Cursor
Use the Full Format in Agent mode (the panel at the bottom of the editor). Agent mode has full file access and can run terminal commands, so auto-submission via curl works. Add @codebase at the start of your message for full project context.
Windsurf
Use the Full Format in Cascade mode. Cascade has full project context and shell access, so it works the same as Cursor Agent mode — paste the prompt, wait for the audit to complete.
ChatGPT / Gemini / Chat interfaces
Use the Chat Format. This format includes a preamble that walks you through sharing your code. You'll paste your project files into the conversation (or share them in pieces as the AI asks for them). The AI produces a JSON block at the end — copy it and paste it at auditbuffet.com/submit/telemetry. curl won't work in chat interfaces, so manual paste is the way.
Browser builders (Bolt, Lovable, v0, Replit, Base44)
Use the Chunked Format for Bolt, Lovable, and v0 (smaller context windows). Use the Chat Format for Replit AI and Base44 (they handle longer prompts but lack file access). The audit prompt will guide you through exporting and sharing your project files. Submit the JSON at auditbuffet.com/submit/telemetry.
Codex / Copilot / Cline / Aider
Use the Full Format. These tools have file system access and work the same way — paste the prompt, let the tool analyze your project, copy the JSON output. If your tool can run shell commands, it will auto-submit via curl.
CLI & MCP Server
The auditbuffet package includes both a command-line tool and an MCP server. Install once, use from your terminal or directly inside Claude Code / Cursor.
1. Get your API key
Sign in at auditbuffet.com/dashboard and go to Settings → API Keys. Copy your key (starts with ab_).
Fastest way: Go to Settings and click Copy Setup Command. It copies a ready-to-paste instruction with your API key that any AI coding tool can run to set up the CLI and MCP server automatically.
You can also pass your key directly with the --key flag, which works everywhere, including inside Claude Code, Cursor, and CI/CD pipelines. Your key is verified against the server and saved to ~/.auditbuffet/config.json.
2. Using the CLI
Typical workflow: Run auditbuffet run security-headers to get the prompt, paste it into your AI tool with your project open, save the JSON output to a file, then run auditbuffet submit ./audit-result.json to see your score.
3. MCP server setup
The MCP server lets your AI tool run audits autonomously. It works with any tool that supports the Model Context Protocol — currently Claude Code, Cursor, and Windsurf.
Claude Code (terminal): Run this one-liner:

claude mcp add auditbuffet -e AUDITBUFFET_API_KEY=... -- npx --package=auditbuffet auditbuffet-mcp

Cursor, Windsurf, or Claude Desktop: Add the server to your MCP config file (see the file locations below).

Restart your tool after saving. The MCP server provides 6 tools that your AI can call directly:

list_projects — find your existing projects (so audits link to the right dashboard)
list_audits — browse and filter available audits
get_audit_prompt — fetch the full prompt for any audit
submit_audit_results — submit telemetry and get scores back
get_project_scores — check a project's scores across all audits
get_benchmarks — see how scores compare to other projects
4. Using the MCP tools
Once configured, just ask your AI naturally. It will use the MCP tools automatically:
Ask your tool to run an audit: it fetches the prompt with get_audit_prompt, runs all checks against your codebase, builds the telemetry JSON, and submits it via submit_audit_results. You get back your score, grade, and benchmark ranking. Ask what security audits are available: it calls list_audits with the security pack filter and shows you the available audits with check counts.

5. Config file locations by tool
Claude Code: claude mcp add auditbuffet -e AUDITBUFFET_API_KEY=... -- npx --package=auditbuffet auditbuffet-mcp
Claude Desktop: ~/Library/Application Support/Claude/claude_desktop_config.json
Cursor: .cursor/mcp.json (or Settings → MCP)
Windsurf: .windsurf/mcp.json (or Settings → MCP)

The JSON config is the same for all tools — just put it in the right file. After saving, restart your tool. Then ask: “Run the security-headers audit on this project.”
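For the file-based tools, the config is a small JSON block pointing the tool at the auditbuffet-mcp command. A minimal sketch, assuming the common mcpServers convention (ab_... is a placeholder for your API key):

```json
{
  "mcpServers": {
    "auditbuffet": {
      "command": "npx",
      "args": ["--package=auditbuffet", "auditbuffet-mcp"],
      "env": { "AUDITBUFFET_API_KEY": "ab_..." }
    }
  }
}
```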
Understanding Your Results
Results are broken down by audit category (Security, SEO, Accessibility, Performance, Code Quality, Best Practices). Each category shows a score from 0 to 100 and a letter grade.
Grade Scale
Scores are calculated from check severity weights. Critical checks have a weight of 10, Warning checks have a weight of 3, and Info checks have a weight of 1. Your score is the percentage of total applicable weight that your project passed. A score of 80 means your project passed checks representing 80% of the total applicable weight in that category.
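As a concrete sketch of that formula (the check data here is hypothetical):

```typescript
// Severity maps to weight: critical 10, warning 3, info 1.
// Score = passed weight / applicable weight, as a percentage.
// Skipped (N/A) checks are excluded from the applicable pool.
type Severity = "critical" | "warning" | "info";
type Result = "pass" | "fail" | "skip" | "error";

const WEIGHTS: Record<Severity, number> = { critical: 10, warning: 3, info: 1 };

function categoryScore(
  checks: { severity: Severity; result: Result }[],
): number {
  let passed = 0;
  let applicable = 0;
  for (const c of checks) {
    if (c.result === "skip") continue; // N/A: not counted either way
    applicable += WEIGHTS[c.severity];
    if (c.result === "pass") passed += WEIGHTS[c.severity];
  }
  return applicable === 0 ? 0 : Math.round((passed / applicable) * 100);
}
```

For example, a passing critical check (10), a failing warning check (3), a passing info check (1), and one skipped check give 11 of 14 applicable weight, which rounds to 79.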
The overall project health score is a weighted average across all completed categories. It only displays once at least 50% of audit categories have been completed for your project.
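The 50% gate and the averaging can be sketched as follows. The per-category weighting used here (each category's applicable check weight) is an assumption for illustration; the docs don't pin down the exact weights:

```typescript
// Overall health: weighted average of completed category scores,
// shown only once at least half the categories are complete.
// Weighting by applicable check weight is an assumption.
function overallScore(
  completed: { score: number; weight: number }[],
  totalCategories: number,
): number | null {
  if (completed.length / totalCategories < 0.5) return null; // gated
  const totalWeight = completed.reduce((sum, c) => sum + c.weight, 0);
  const weighted = completed.reduce((sum, c) => sum + c.score * c.weight, 0);
  return Math.round(weighted / totalWeight);
}
```

With only 2 of 6 categories complete this returns null (nothing displayed); with 3 of 6 complete it returns the weighted average.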
Benchmark percentiles tell you how your score compares to other projects in the same segment. A percentile of 70 means your project scored higher than 70% of other projects with a similar tech stack running the same audit.
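That definition in code form, with made-up segment scores:

```typescript
// Percentile: share of other projects in the segment whose
// score is strictly below yours, as a whole-number percentage.
function percentile(yourScore: number, segmentScores: number[]): number {
  const below = segmentScores.filter((s) => s < yourScore).length;
  return Math.round((below / segmentScores.length) * 100);
}
```

For instance, a score of 80 against a segment of [50, 60, 70, 85, 90, 75, 79, 81, 95, 40] beats 6 of 10 projects, a percentile of 60.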
Telemetry & Privacy
AuditBuffet telemetry is designed to be safe to share publicly. The audit prompts include explicit instructions that prohibit including sensitive information in the output.
What the telemetry contains: check IDs, pass/fail/skip/error results, severity levels, failure detail messages (capped at 500 characters and sanitized), category scores, audit metadata, and tech stack information (framework names, language, deployment platform).
What the telemetry never contains: source code, file contents, environment variables, API keys, secrets, database connection strings, internal URLs, IP addresses, user data, or any personally identifiable information.
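A sketch of the kind of capping and sanitizing the prompts describe. The redaction pattern here is an invented illustration (matching common key prefixes like the ab_ prefix AuditBuffet uses), not the actual sanitizer:

```typescript
// Cap a failure detail message at 500 characters and redact
// obvious secret-looking tokens. The regex is illustrative only.
function sanitizeDetail(detail: string): string {
  const redacted = detail.replace(/\b(sk|ab|ghp)_[A-Za-z0-9]+\b/g, "[redacted]");
  return redacted.length <= 500 ? redacted : redacted.slice(0, 500);
}
```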
Submissions can be made anonymously — no account is required. If you submit without an account, your submission is stored but not linked to any user profile. You will not be able to track it over time without an account.
Aggregate, anonymized submission data is used to calculate benchmark percentiles and generate the quarterly benchmark reports published on this site.
FAQ
Is my source code sent to AuditBuffet?
No. The telemetry JSON contains only check results, scores, and metadata. The audit prompt instructs your AI tool to never include source code, file contents, environment variables, API keys, or PII in the output.
Can I submit without creating an account?
Yes. The Stack Scan and all audit submissions can be made anonymously. Creating an account lets you track your project over time, see trend charts, and access benchmark comparisons.
How are scores calculated?
Each check has a severity (Critical, Warning, or Info) which maps to a weight (10, 3, or 1). Your category score is the sum of passing check weights divided by the sum of applicable check weights, multiplied by 100. Overall score uses the same formula applied across all checks as a flat pool — category boundaries don’t affect it.
What does N/A mean on a check?
N/A (skip) means the check does not apply to your project — for example, a mobile-responsiveness check on a CLI tool. N/A is determined programmatically by the audit prompt, not by user selection. Skipped checks don't affect your score.
How do benchmarks work?
Benchmarks compare your score against other projects in the same segment (based on tech stack and audit type). Percentiles use a 90-day rolling window and require a minimum of 30 scores per segment before displaying.
Are audit prompts free?
Everything is free during our launch period. Create an account to unlock all audits at no cost.
Which AI tools does this work with?
Any tool that can follow a prompt. AuditBuffet works with ChatGPT, Claude Code, Cursor, Windsurf, Codex, Gemini, Bolt, Lovable, Replit, Base44, Copilot, Cline, Aider, and more. IDE tools use the Full format; chat interfaces use the Chat format. Same checks, same JSON, same scores.