All 18 checks with why-it-matters prose, severity, and cross-references to related audits.
An assertion-free test file is indistinguishable from a passing test in CI — it runs, it exits 0, and it tells you nothing. AI-generated test suites routinely produce these: scaffolded `describe`/`it` blocks that call setup helpers but never verify a return value, a side effect, or a thrown error. The gap only surfaces when a real defect slips to production and you realize your test suite was never checking anything. ISO-25010:2011 testability requires that tests can actually detect faults; assertion-free files by definition cannot.
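A minimal sketch of how this check might detect assertion-free files, assuming a regex-level scan. The pattern list and the function name are illustrative assumptions, not the audit's actual implementation; a production check would parse the AST.

```javascript
// Hypothetical detector: a file that declares test blocks but contains
// no recognizable assertion call is flagged as assertion-free.
// The matcher list below is an assumption for illustration.
const ASSERTION_PATTERNS = [/\bexpect\s*\(/, /\bassert\b/, /\.should\b/];
const TEST_BLOCK_PATTERN = /\b(?:it|test)\s*\(/;

function isAssertionFree(source) {
  const hasTestBlocks = TEST_BLOCK_PATTERN.test(source);
  const hasAssertions = ASSERTION_PATTERNS.some((p) => p.test(source));
  return hasTestBlocks && !hasAssertions; // tests exist, but verify nothing
}
```

A file with only setup calls inside `it` blocks would be flagged; a file with at least one `expect` would pass.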
Why this severity: Critical because a test that cannot fail provides zero defect-detection value regardless of how many times it runs.
ai-slop-test-theater.assertion-quality.every-test-file-has-assertion

Tautological assertions — `expect(true).toBe(true)`, `assert(true)`, `expect(1).toBe(1)` — are placeholders that AI scaffolding generates and never replaces. They pass unconditionally, so a completely broken implementation looks identical to a working one in your test output. Teams relying on green CI as a signal of correctness are flying blind: the assertion validates the literal `true` you typed, not the behavior of your code. ISO-25010:2011 testability requires assertions that can actually fail when the system under test is defective.
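A sketch of how the simplest tautologies might be matched, assuming regex-level detection of literal-vs-literal `expect(...).toBe(...)` pairs. This covers only the trivial forms named above; a real check would inspect the AST.

```javascript
// Hypothetical detector: flag assertions whose actual and expected
// values are both literals, so the assertion can never fail.
const TAUTOLOGY =
  /expect\s*\(\s*(?:true|false|\d+|'[^']*')\s*\)\s*\.\s*toBe\s*\(\s*(?:true|false|\d+|'[^']*')\s*\)/;

function hasTautology(source) {
  return TAUTOLOGY.test(source);
}
```

`expect(result).toBe(true)` is not flagged, because the actual value is a variable whose value depends on the code under test.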
Why this severity: Critical because a tautology assertion passes regardless of what the production code does, making the test meaningless as a fault detector.
ai-slop-test-theater.assertion-quality.no-tautology-assertions

A `try/catch` block in a test that has no assertion, no `throw`, and no `console.error` converts every runtime error into a silent pass. When the production function throws — due to a network failure, a missing DB record, an unexpected null — the test catches it, discards it, and reports success. This pattern is common in AI-generated tests that wrap async calls defensively without understanding that swallowing errors in tests defeats the entire purpose of having tests. ISO-25010:2011 fault-tolerance requires that faults be detectable, not hidden.
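A minimal plain-Node demonstration of the hazard (no test framework; the function names are illustrative): a bare `try/catch` turns a thrown error into a green result, while surfacing the error makes the failure visible.

```javascript
// The defect the test is supposed to catch.
function fetchUser() {
  throw new Error("network failure");
}

// Anti-pattern: the catch block discards the error entirely.
function swallowedTest() {
  try {
    fetchUser(); // throws, but nobody will ever know
  } catch (e) {}
  return "pass"; // reported green despite the failure
}

// Corrected shape: the error changes the test's outcome.
function honestTest() {
  try {
    fetchUser();
    return "pass";
  } catch (e) {
    return "fail"; // the failure is surfaced instead of hidden
  }
}
```

In a real suite the fix is usually to delete the `try/catch` and let the runner report the rejection, or to assert on it explicitly (e.g. `expect(...).rejects.toThrow()` in Jest/Vitest).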
Why this severity: High because silent error-swallowing means production failures that would be caught by tests instead generate a green CI run.
ai-slop-test-theater.assertion-quality.no-error-swallowing-try-catch

When the average assertion-to-test-block ratio falls below 1.0, the majority of test blocks have no assertions at all. This isn't a style issue — it means most of your test suite is exercising code paths without checking outcomes. Coverage tools will report lines as "covered" while the behavior of those lines remains completely unverified. ISO-25010:2011 testability requires that tests can detect deviations from specified behavior; a test block with no assertion can never detect anything.
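The ratio can be sketched as assertions per test block. Counting by regex is an assumption made for illustration; a real audit would parse the file rather than pattern-match it.

```javascript
// Hypothetical density metric: expect() calls divided by it()/test() blocks.
function assertionDensity(source) {
  const blocks = (source.match(/\b(?:it|test)\s*\(/g) || []).length;
  const assertions = (source.match(/\bexpect\s*\(/g) || []).length;
  return blocks === 0 ? 0 : assertions / blocks;
}
```

Two test blocks with a single assertion between them yield a density of 0.5 — below the 1.0 threshold this check describes.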
Why this severity: Medium because a low ratio signals systemic assertion absence rather than isolated gaps, but individual test blocks vary in severity depending on what they cover.
ai-slop-test-theater.assertion-quality.assertion-density-reasonable

When a test fails in CI, the first signal is the test name. A name like `test 1` or `should work` tells the developer nothing: which behavior failed, which input triggered it, which component is broken. AI-generated test scaffolding frequently produces placeholder names that were never replaced. The result is a test suite where CI failure messages require reading test bodies to understand what broke — slowing down incident response and increasing the chance a developer ignores the failure. ISO-25010:2011 maintainability requires that tests communicate intent clearly.
Why this severity: Low because placeholder names don't affect whether tests catch bugs, but they degrade the team's ability to act quickly on CI failures.
ai-slop-test-theater.assertion-quality.descriptive-test-names

When a test mocks every export from the file it is supposed to test, the test exercises the mock — not the production code. The function under test is never called. The test will pass even if the real implementation is completely deleted. This pattern appears in AI-generated tests where the model confuses "mock the dependencies of X" with "mock X itself." The test suite accumulates green results that have zero correlation with production behavior, creating false confidence. ISO-25010:2011 testability requires that tests exercise the actual system under test.
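A plain-Node demonstration of the failure mode (the `lib.add` module and both test functions are illustrative): when the test stubs out the function under test, a deliberately broken implementation still produces a pass.

```javascript
// Production module with a deliberate bug: add() subtracts.
const lib = {
  add(a, b) {
    return a - b; // broken on purpose
  },
};

// Anti-pattern: the "mock" replaces the system under test itself.
function testWithMockedSut() {
  const original = lib.add;
  lib.add = (a, b) => a + b; // the test now exercises this stub, not lib
  const result = lib.add(2, 3) === 5 ? "pass" : "fail";
  lib.add = original;
  return result; // green, even though the real add() is broken
}

// Corrected shape: call the real implementation.
function testRealSut() {
  return lib.add(2, 3) === 5 ? "pass" : "fail";
}
```

The same effect occurs with `vi.mock`/`jest.mock` pointed at the module being tested instead of at its dependencies.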
Why this severity: High because mock-saturated tests provide a false signal of correctness — the production code could be entirely broken while every test passes.
ai-slop-test-theater.mock-hygiene.mocks-do-not-swallow-system-under-test

When a `beforeAll` or `beforeEach` hook creates database rows, writes files, or inserts records without a matching `afterAll`/`afterEach` to remove them, test data accumulates across runs. Subsequent test runs operate on a progressively more polluted database: unique-constraint violations start appearing, count assertions become wrong, and test order begins to matter. This makes the test suite flaky and eventually unmaintainable. ISO-25010:2011 recoverability requires that test runs return the system to a clean state.
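A sketch of how the hook imbalance might be detected. The hook names are the standard Jest/Vitest ones; the pairing heuristic (setup present, teardown entirely absent) is a simplification assumed for illustration.

```javascript
// Hypothetical check: setup hooks without their teardown counterparts.
function unbalancedHooks(source) {
  const count = (re) => (source.match(re) || []).length;
  return {
    beforeEachWithoutAfterEach:
      count(/\bbeforeEach\s*\(/g) > 0 && count(/\bafterEach\s*\(/g) === 0,
    beforeAllWithoutAfterAll:
      count(/\bbeforeAll\s*\(/g) > 0 && count(/\bafterAll\s*\(/g) === 0,
  };
}
```

A file that inserts rows in `beforeEach` but never deletes them in `afterEach` would be flagged on the first field.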
Why this severity: Medium because unbalanced setup causes intermittent failures and cross-test contamination that are difficult to diagnose, though immediate test runs may still pass.
ai-slop-test-theater.mock-hygiene.setup-teardown-balanced

Calling `vi.useFakeTimers()` or `jest.useFakeTimers()` freezes the JavaScript clock at the moment of the call. Any code that depends on `setTimeout`, `setInterval`, or `Date.now()` then hangs indefinitely — the timer callback never fires because the clock never advances. Tests that use `useFakeTimers` without a corresponding `vi.advanceTimersByTime()` or `vi.runAllTimers()` will either time out or silently pass without actually executing the time-dependent code path. ISO-25010:2011 testability requires that all branches of code under test are actually exercised.
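The mechanics can be shown with a toy fake clock (a hand-rolled sketch, not the Vitest/Jest implementation; it assumes timers are scheduled in order): a scheduled callback never fires until the clock is explicitly advanced.

```javascript
// Toy fake clock illustrating why frozen timers never fire.
function makeFakeClock() {
  let now = 0;
  const pending = []; // assumed scheduled in chronological order
  return {
    setTimeout(fn, delay) {
      pending.push({ fn, at: now + delay });
    },
    advance(ms) {
      now += ms;
      while (pending.length && pending[0].at <= now) {
        pending.shift().fn(); // fire every timer that is now due
      }
    },
  };
}

const clock = makeFakeClock();
let fired = false;
clock.setTimeout(() => { fired = true; }, 1000);
const beforeAdvance = fired; // still false: the clock is frozen
clock.advance(1000);
const afterAdvance = fired; // true only after advancing the clock
```

Without the `advance` call, the callback — and the code path inside it — simply never runs, which is exactly the silent skip this check targets.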
Why this severity: Low because frozen-timer tests typically don't produce false passes — they usually hang or time out — but they do silently skip the time-dependent code branch.
ai-slop-test-theater.mock-hygiene.no-fake-timers-without-clock-control

Mock libraries like msw, nock, and sinon are designed to intercept network calls and fabricate responses. When they leak into production source files, real API requests get hijacked by test doubles — users see stubbed data, webhooks never reach their destinations, and payment flows silently return mock success responses. This also bloats production bundles with test-only code and can mask outages because the mock always returns 200 OK regardless of backend health.
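A sketch of the boundary check, assuming the common `.test.ts`/`.spec.ts` filename convention and bare-specifier imports (both assumptions; subpath imports like `msw/node` and other conventions would need extra handling).

```javascript
// Hypothetical check: flag mock-library imports outside test files.
const MOCK_LIBS = ["msw", "nock", "sinon"];

function mockImportInProduction(filePath, source) {
  const isTestFile = /\.(test|spec)\.[jt]sx?$/.test(filePath);
  if (isTestFile) return false; // mocks are fine inside tests
  return MOCK_LIBS.some((lib) =>
    new RegExp(`from ['"]${lib}['"]`).test(source)
  );
}
```

The same import is legitimate in `api.test.ts` and a defect in `api.ts`.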
Why this severity: Medium because the breakage ranges from silent data corruption to full feature outage depending on which request the mock intercepts.
ai-slop-test-theater.mock-hygiene.mock-imports-bounded-to-tests

Auth, payment, billing, and webhook handlers are the highest-risk code in any commercial application — bugs there cause account takeovers, double-charges, missed payments, and data exposure. CWE-1059 (Insufficient Documentation) and ISO-25010:2011 testability both flag untested critical paths as systemic risk. An AI agent generating a Stripe checkout route or a JWT auth handler without writing a corresponding test file leaves the most consequential code in the project completely unverified. A single refactor in `lib/billing.ts` with no test coverage can silently break revenue collection.
Why this severity: Critical because untested auth and payment paths are the highest-consequence failure modes in production — bugs there affect money, access, and user data directly.
ai-slop-test-theater.coverage-reality.critical-paths-have-tests

An E2E framework installed in `devDependencies` with no actual E2E tests is worse than not having one at all — it creates the impression of integration-level coverage that doesn't exist. Playwright, Cypress, and Puppeteer are designed to catch the class of bugs unit tests cannot: routing misconfigurations, broken form submissions, misconfigured auth redirects. A signup flow that works in isolation but fails end-to-end due to a missing CORS header or broken redirect will never be caught by unit tests. ISO-25010:2011 testability requires that critical user journeys be exercised as a system.
Why this severity: High because missing E2E coverage for the primary user flow means complete-flow regressions — auth, checkout, signup — go undetected until a user reports them.
ai-slop-test-theater.coverage-reality.e2e-tests-exist-for-primary-flow

A `.skip`, `xit`, or `xdescribe` without a justifying comment is a permanently disabled test. It stops failing in CI, the broken behavior it was covering goes undetected, and because nobody knows why it was skipped, nobody knows it's safe to re-enable. AI-generated test suites often skip tests for code paths that haven't been implemented yet and never re-enable them. Over time these accumulate into a graveyard of skipped tests that silently shrink effective test coverage. ISO-25010:2011 maintainability requires that disabled tests be traceable to a documented reason.
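A sketch of the skip scan, under the simplifying assumption that any `//` comment on the skip line or the line above counts as a justification (a real check would likely want a reason or ticket reference).

```javascript
// Hypothetical detector: .skip / xit / xdescribe with no nearby comment.
function unjustifiedSkips(source) {
  const SKIP =
    /\b(?:it|test|describe)\.skip\s*\(|\bx(?:it|describe|test)\s*\(/;
  const lines = source.split("\n");
  const flagged = [];
  lines.forEach((line, i) => {
    if (!SKIP.test(line)) return;
    const prev = i > 0 ? lines[i - 1] : "";
    const justified = /\/\//.test(line) || /\/\//.test(prev);
    if (!justified) flagged.push(i + 1); // 1-based line number
  });
  return flagged;
}
```

A bare `it.skip(...)` is flagged; the same skip preceded by `// blocked on #123` is not.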
Why this severity: High because unjustified skips reduce effective coverage silently — the skipped test no longer fails in CI even if the production code it covered is broken.
ai-slop-test-theater.coverage-reality.no-skip-without-justification

An empty `it('creates a user', () => {})` body or one containing only `// TODO` passes unconditionally in every test runner — it shows green in CI, increments the test count, and contributes nothing. AI coding tools scaffold test stubs frequently and treat them as "complete" once the file is syntactically valid. The downstream effect: the team believes behavior is tested when it isn't, and a growing backlog of TODO-only tests never gets implemented because they never fail to remind anyone. ISO-25010:2011 testability requires that tests verify actual behavior.
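A sketch of how empty or TODO-only bodies might be matched. Regex-level matching of arrow-function bodies is an assumption for illustration; it only covers the simple single-line forms, and a real check would inspect the AST.

```javascript
// Hypothetical detector: it()/test() whose arrow body is empty or
// contains nothing but a TODO comment.
const EMPTY_BODY =
  /\b(?:it|test)\s*\(\s*['"][^'"]*['"]\s*,\s*(?:async\s*)?\(\s*\)\s*=>\s*\{\s*(?:\/\/\s*TODO[^\n}]*)?\s*\}\s*\)/;

function hasEmptyTest(source) {
  return EMPTY_BODY.test(source);
}
```

The literal example from the prose above, `it('creates a user', () => {})`, is flagged; the same block with a real assertion inside is not.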
Why this severity: Medium because empty test bodies silently inflate the reported test count and create a false sense of coverage without catching any real defects.
ai-slop-test-theater.coverage-reality.tests-not-empty-or-todo

Tests that write to a database without a separate test database URL will insert, update, and delete rows in the development database. Over time this pollutes developer workflows with synthetic data, breaks count-based assertions as rows accumulate, and creates the risk that a destructive test (`deleteMany`, `truncate`) runs against real data if configuration is accidentally shared across environments. ISO-25010:2011 recoverability requires that test execution be isolated from other system states.
Why this severity: Low because the impact is typically pollution and flakiness rather than immediate data loss, unless a destructive query is involved.
ai-slop-test-theater.coverage-reality.database-or-network-tests-isolated

Tests that aren't wired into CI are effectively optional — they run on some developer machines, some of the time, with varying local environments. A PR that breaks a test can be merged because the author didn't run `npm test` locally and CI never did either. This is the most common failure mode for AI-generated test suites: the AI writes tests but doesn't update the CI workflow. SLSA Build L1 requires that the build process be scripted; ISO-25010:2011 maintainability requires that quality gates be automated and repeatable.
Why this severity: High because tests that don't run in CI provide no protection against regressions being merged to the main branch.
ai-slop-test-theater.test-operability.tests-run-in-ci

Without a coverage configuration, you have no data on how much of your production code is actually exercised by tests. You may have 50 test files covering the same 3 utility functions while auth, billing, and error-handling code is entirely untouched. Coverage tooling in Vitest and Jest is built-in — it costs nothing to enable. Without it you cannot identify coverage gaps, set minimum thresholds to protect against regression, or demonstrate to auditors or SOC 2 reviewers that critical paths are tested. ISO-25010:2011 testability explicitly covers the ability to measure test completeness.
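A minimal `vitest.config.ts` sketch enabling built-in coverage with thresholds. The threshold numbers are illustrative assumptions, and exact option names vary between Vitest versions (thresholds moved under `coverage.thresholds` in Vitest 1.0), so treat this as a shape to adapt rather than a drop-in config.

```javascript
// Hypothetical vitest.config.ts: enable coverage and fail the run
// if coverage drops below the (illustrative) thresholds.
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      provider: "v8",             // built-in provider, no extra setup
      reporter: ["text", "lcov"], // terminal summary + CI-friendly report
      thresholds: {
        lines: 80,    // illustrative minimum, tune per project
        branches: 70,
      },
    },
  },
});
```

With thresholds in place, a PR that deletes tests for covered code fails CI instead of silently shrinking coverage.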
Why this severity: Medium because the absence of coverage config doesn't break existing tests but eliminates the feedback loop needed to catch growing blind spots in test coverage.
ai-slop-test-theater.test-operability.coverage-config-present

Tests that call real `Date.now()`, `setTimeout`, or `setInterval` are non-deterministic — their behavior depends on how fast the CI machine is running, what other jobs are executing in parallel, and how close the system clock is to a boundary. A token-expiry check that takes 1ms on your laptop may take 150ms on an overloaded CI runner, making the test fail intermittently. Fake timers eliminate this class of flakiness by making the clock fully deterministic. ISO-25010:2011 testability requires that test outcomes be repeatable across environments.
Why this severity: Info because real-clock tests are annoying and flaky but don't represent a correctness gap — they still exercise the time-dependent code, just non-deterministically.
ai-slop-test-theater.test-operability.tests-use-fake-clock-for-time

When the test and development database URLs are identical, every test run that inserts, updates, or deletes records modifies the database you actively develop against. Count-based assertions drift over runs. A `deleteMany` in a cleanup hook that lacks a proper `where` clause can truncate tables you were relying on. More commonly, accumulating test records cause unique-constraint failures mid-development that look like bugs in new code. ISO-25010:2011 recoverability requires that test execution be reversible and isolated.
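One defensive pattern is a guard in global test setup that refuses to run against anything that doesn't look like a dedicated test database. The naming convention (the string `test` in the URL) and the function name are assumptions for illustration; adapt the heuristic to your own environment layout.

```javascript
// Hypothetical guard for a test setup file: abort the run rather than
// mutate a database whose URL does not look like a test database.
function assertTestDatabase(databaseUrl) {
  if (!databaseUrl || !databaseUrl.includes("test")) {
    throw new Error(
      `Refusing to run tests against non-test database: ${databaseUrl}`
    );
  }
  return databaseUrl;
}
```

Called once in `beforeAll` (or the runner's global setup), this turns a shared-URL misconfiguration into an immediate, loud failure instead of slow data pollution.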
Why this severity: Info because sharing dev and test DB is typically caught before catastrophic data loss, but it is a persistent source of confusion and intermittent failures.
ai-slop-test-theater.test-operability.test-database-isolated-from-devSee full patternRun this audit in your AI coding tool (Claude Code, Cursor, Bolt, etc.) and submit results here for scoring and benchmarks.