All 20 checks with why-it-matters prose, severity, and cross-references to related audits.
Without a root-level error boundary, a single thrown exception in any React component silently blanks the entire application — users see a white screen with no explanation and no recovery path. CWE-755 (Improper Handling of Exceptional Conditions) names this directly: the absence of fault tolerance at the application boundary turns transient bugs into total outages. For Next.js App Router projects, a route-level `error.tsx` alone is insufficient — errors in the root layout bypass it entirely, requiring `global-error.tsx` as a separate boundary. ISO 25010 reliability.fault-tolerance classifies the absence of this pattern as a direct reliability failure.
Why this severity: Critical because a missing root boundary converts any unhandled render exception into a full application crash with no user-visible recovery path.
saas-error-handling.error-boundaries.react-error-boundary

Unhandled promise rejections crash Node.js server processes outright in newer runtimes and pass silently in the browser, making failures invisible to both users and monitoring. CWE-391 (Unchecked Error Condition) and CWE-755 (Improper Handling of Exceptional Conditions) both apply: async errors in event handlers and `useEffect` callbacks that lack `try/catch` or `.catch()` chains produce no user feedback and no diagnostic signal. ISO 25010 reliability.fault-tolerance requires that faults be contained; unguarded async paths let errors propagate with no containment boundary. This is especially acute in authentication and payment flows, where a silently failed async call leaves users in an ambiguous state.
Why this severity: Critical because unguarded async errors in Node.js crash server processes and in browsers produce silent failures with no user feedback or diagnostic signal.
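A minimal sketch of a guarded async handler, assuming a hypothetical `save` operation and a `report` callback standing in for whatever monitoring hook the project uses. None of these names come from the audit itself; the point is only that the rejection is caught, reported, and turned into visible feedback.

```typescript
// Hypothetical status a form could render after a save attempt.
type SaveStatus =
  | { state: "saved" }
  | { state: "failed"; message: string };

// Awaits the operation inside try/catch so a rejection can never escape
// as an unhandled promise rejection.
async function handleSave(
  save: () => Promise<void>,        // assumed async call, e.g. an API request
  report: (error: unknown) => void, // assumed hook into error monitoring
): Promise<SaveStatus> {
  try {
    await save();
    return { state: "saved" };
  } catch (error) {
    report(error); // diagnostic signal for developers
    // User-facing feedback, deliberately free of internal detail.
    return {
      state: "failed",
      message: "We could not save your changes. Please try again.",
    };
  }
}
```

The caller renders `message` when `state` is `"failed"`, so the user is never left guessing whether the action went through.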
saas-error-handling.error-boundaries.unhandled-promise-rejections

Rendering `error.message` or `error.stack` to the DOM or returning it in an API JSON response exposes your internal architecture to anyone who triggers a 500: framework versions, file system paths, database error codes, and third-party service identifiers are all visible in unfiltered stack traces. OWASP A05 (Security Misconfiguration) and CWE-209 (Generation of Error Message Containing Sensitive Information) classify this as an active information-disclosure vulnerability, not merely a cosmetic issue. Attackers routinely trigger intentional errors to enumerate your stack before targeting known CVEs. CWE-215 (Insertion of Sensitive Information Into Debugging Code) further applies when dev-mode conditionals are left in production builds.
Why this severity: Critical because stack traces and raw error messages in production responses give attackers a direct map of your framework, file paths, and database schema.
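One way to keep internals out of a 500 response is to split every error into a full server-side log entry and a fixed public body. The sketch below assumes a hypothetical `requestId` correlation value and a `log` callback; the response shape is illustrative, not a format the audit prescribes.

```typescript
interface PublicErrorResponse {
  status: number;
  body: { error: { code: string; message: string; requestId: string } };
}

// Logs the full detail server-side and returns a response that contains
// no message, stack, or path taken from the original error.
function toPublicError(
  error: unknown,
  requestId: string, // hypothetical correlation ID
  log: (entry: Record<string, unknown>) => void,
): PublicErrorResponse {
  const detail =
    error instanceof Error
      ? { name: error.name, message: error.message, stack: error.stack }
      : { value: String(error) };
  log({ requestId, ...detail }); // internals stay in server logs

  return {
    status: 500,
    body: {
      error: {
        code: "internal_error",
        message: "An unexpected error occurred.",
        requestId, // lets support match a report to the log entry
      },
    },
  };
}
```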
saas-error-handling.error-boundaries.five-hundred-page-no-internals

An error boundary that returns null or an unstyled error dump gives users a white screen with no explanation and no recovery path, which is functionally identical to the crash it was meant to contain. The fallback is the last line of UX defense when a render throws, and a null return hides the `reset()` function that would let the user recover without a full reload. WCAG 2.2 SC 3.3.1 requires that error identification be perceivable, which a blank fallback fails by definition.
Why this severity: High because a broken fallback strands every user who hits an exception with no self-service recovery and forces a full page reload.
saas-error-handling.error-boundaries.error-boundaries-fallback-ui

Without an error reporting service, production failures are invisible until a user files a complaint, and most users don't. CWE-778 (Insufficient Logging) classifies the absence of error capture as a reliability and observability failure. ISO 25010 reliability.fault-tolerance requires that faults be detected and recorded. Practically, this means bugs introduced in a deploy go undetected until support volume spikes, by which time the defect may have affected thousands of sessions. A DSN hardcoded in source rather than an environment variable adds a credential-exposure risk on top of the observability gap.
Why this severity: High because production errors with no monitoring service are invisible until user-reported, creating an unbounded failure detection lag.
saas-error-handling.error-reporting.error-reporting-service

Error events captured with no context tell you something crashed but not who, where, or what the user was doing, which makes reproduction difficult and prioritization impossible. Attaching PII to error context (email addresses, full names, raw form values) violates GDPR Art. 5(1)(f) (integrity and confidentiality) and Art. 25 (data protection by design and by default, which includes data minimization). CWE-359 (Exposure of Private Personal Information) applies when email or identity fields are attached to error events that are stored and queryable in third-party services like Sentry. User IDs (non-identifiable UUIDs) are acceptable; email addresses are not, because they link errors to identifiable individuals stored in a vendor's system.
Why this severity: Medium because contextless errors are a productivity failure, and PII-in-errors is a GDPR data minimization violation requiring remediation before a data subject access request exposes it.
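A minimal sketch of scrubbing error context before it leaves the application. The key list and the email pattern are assumptions chosen for illustration; a real project would tune both to its own data model.

```typescript
// Hypothetical deny-list of context keys that usually carry PII.
const PII_KEYS = new Set(["email", "name", "fullname", "phone", "password", "address"]);
const LOOKS_LIKE_EMAIL = /\S+@\S+\.\S+/;

// Returns a copy of the context that keeps opaque identifiers such as
// userId and route, and drops PII keys and any string containing an email.
function scrubContext(context: Record<string, unknown>): Record<string, unknown> {
  const safe: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(context)) {
    if (PII_KEYS.has(key.toLowerCase())) continue;
    if (typeof value === "string" && LOOKS_LIKE_EMAIL.test(value)) continue;
    safe[key] = value;
  }
  return safe;
}
```

The scrubbed object is what would be passed to the reporting service's context or tag API, so the event still says which user and route were involved without naming the person.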
saas-error-handling.error-reporting.errors-include-context-no-pii

API routes and server actions that swallow exceptions without logging leave you with no server-side record of what failed. CWE-778 (Insufficient Logging) and CWE-391 (Unchecked Error Condition) both apply: a catch block that returns a 500 response without a log call means the failure is invisible outside of user complaints. ISO 25010 reliability.fault-tolerance requires that fault events be recorded. For Next.js server actions specifically, an unhandled throw reaches the client in production as a generic message with an opaque digest hash, so the original error is recoverable only from server-side logs. Without those logs, debugging production incidents requires guesswork from user descriptions alone.
Why this severity: High because API errors with no server-side log record make production debugging dependent on user reports rather than observable system state.
saas-error-handling.error-reporting.api-errors-logged-server

A server-side Sentry initialization without a client-side counterpart means rendering crashes, JavaScript exceptions, and failed client-side fetches are never reported. CWE-778 (Insufficient Logging) applies to both server and browser contexts: client errors that go unreported are failures with no observable signal. ISO 25010 reliability.fault-tolerance requires fault detection across all execution environments. In practice, many production JavaScript errors (component hydration mismatches, failed dynamic imports, browser-specific rendering bugs) occur exclusively in the browser and are invisible to server-side monitoring alone.
Why this severity: Low because client-side error capture is a coverage gap rather than a security or data-loss risk, but it leaves a significant class of production failures unobservable.
saas-error-handling.error-reporting.client-errors-reported

When API routes return `{ error: string }` in some places, `{ message: string }` in others, and `{ success: false, reason: string }` elsewhere, frontend error handling code must branch on shape, and branches that aren't tested get skipped. CWE-755 (Improper Handling of Exceptional Conditions) applies when inconsistency leads to unhandled error shapes in client code. ISO 25010 maintainability.consistency classifies format divergence as a direct maintainability defect. Practically, inconsistent formats cause error messages to silently go undisplayed: a client written against `{ error }` silently ignores a response shaped as `{ message }`, and users see no error feedback at all.
Why this severity: High because inconsistent error shapes cause client-side error handling branches to silently fail, producing invisible errors and confusing UX at the exact moments users need clarity.
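A sketch of one shared error envelope, with a single server-side constructor and a single client-side reader. The `{ error: { code, message } }` shape is an assumption for illustration; the audit only requires that one shape be used everywhere.

```typescript
// Hypothetical envelope that every route would return on failure.
interface ApiErrorBody {
  error: { code: string; message: string };
}

// Server side: the only place an error response body is constructed.
function apiError(status: number, code: string, message: string): { status: number; body: ApiErrorBody } {
  return { status, body: { error: { code, message } } };
}

// Client side: the only place an error body is interpreted. An unexpected
// shape yields a fallback message instead of silently showing nothing.
function readErrorMessage(body: unknown): string {
  if (typeof body === "object" && body !== null && "error" in body) {
    const error = (body as { error: unknown }).error;
    if (typeof error === "object" && error !== null && "message" in error) {
      const message = (error as { message: unknown }).message;
      if (typeof message === "string") return message;
    }
  }
  return "Unexpected error. Please try again.";
}
```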
saas-error-handling.user-errors.api-consistent-error-format

The framework default 404 page is a dead end: it names the HTTP status, nothing else, and offers no path back into the product. Users who mistype a URL, follow a stale email link, or hit a deleted resource bounce to the browser back button and often leave entirely. A custom 404 recovers that intent by handing users a home link, a search, or suggested pages, which directly affects bounce rate, session depth, and organic-traffic retention from broken inbound links.
Why this severity: Medium because every site serves 404s in production, but the failure only affects users who already took a wrong turn.
saas-error-handling.user-errors.not-found-custom-helpful

Generic error messages at the top of a form, such as "Something went wrong" or "Invalid input", require users to scan all fields to find what failed. WCAG 2.2 SC 3.3.1 (Error Identification) and SC 3.3.3 (Error Suggestion) mandate that validation errors identify the specific field and describe how to fix it, making inline field-level errors a legal accessibility requirement for many SaaS products. Beyond compliance, forms with global-only error banners have measurably higher abandonment rates: users who typed a complex password and can't tell which field is wrong often give up rather than re-enter everything.
Why this severity: High because field-level inline validation is a WCAG 2.2 requirement, and its absence directly increases form abandonment in authentication and onboarding flows.
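A minimal sketch of validation that reports errors per field, each with a suggested fix, rather than one global banner. The field names and rules (an email pattern, a 12-character minimum) are hypothetical.

```typescript
interface SignupInput {
  email: string;
  password: string;
}

// One optional message per field, so the UI can render each message
// next to the input it belongs to.
type SignupFieldErrors = Partial<Record<keyof SignupInput, string>>;

function validateSignup(input: SignupInput): SignupFieldErrors {
  const errors: SignupFieldErrors = {};
  if (!/^\S+@\S+\.\S+$/.test(input.email)) {
    // Names the field's problem and how to fix it.
    errors.email = "Enter an email address like name@example.com.";
  }
  if (input.password.length < 12) {
    errors.password = "Use at least 12 characters.";
  }
  return errors;
}
```

An empty result object means the form can be submitted; otherwise each key maps directly to the input that needs attention.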
saas-error-handling.user-errors.form-validation-inline

Network failures are not edge cases: they happen routinely on mobile connections, during backend deployments, and when third-party services degrade. CWE-755 (Improper Handling of Exceptional Conditions) applies when a component handles the success path but has no state for a network failure. ISO 25010 reliability.fault-tolerance requires that faults be recoverable. A component that shows a loading spinner indefinitely on a network failure, or renders nothing at all, strands users in an ambiguous state with no self-service recovery. Users without a retry button must reload the entire page, losing any other in-progress work.
Why this severity: Medium because missing retry options on network errors strand users without a recovery path, directly increasing support load and session abandonment.
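A sketch of a load state that always carries a retry action when the request fails, so the UI has something to bind a retry button to. The `LoadState` type and the message text are assumptions for illustration.

```typescript
// A failed load carries its own retry function; there is no state in
// which the caller is left with neither data nor a way to try again.
type LoadState<T> =
  | { status: "loaded"; data: T }
  | { status: "failed"; message: string; retry: () => Promise<LoadState<T>> };

async function loadWithRetry<T>(fetcher: () => Promise<T>): Promise<LoadState<T>> {
  try {
    return { status: "loaded", data: await fetcher() };
  } catch {
    return {
      status: "failed",
      message: "We couldn't reach the server. Check your connection and try again.",
      retry: () => loadWithRetry(fetcher), // re-runs only this request
    };
  }
}
```

Because `retry` re-runs a single request, the user recovers the failed section without reloading the page or losing work elsewhere.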
saas-error-handling.user-errors.network-errors-retry

Vercel Hobby-tier functions time out at 10 seconds and Pro at 60 or 300 seconds, and AI generation, file processing, and report exports regularly approach these limits. When they hit, users see a generic error or a permanent spinner with no indication of whether the operation is still running, failed, or timed out. CWE-755 (Improper Handling of Exceptional Conditions) applies: failing to distinguish a timeout from other errors means users can't make an informed decision about whether to wait or retry. ISO 25010 reliability.fault-tolerance requires that timeout conditions be surfaced to users with a recovery path, not collapsed into opaque generic errors.
Why this severity: Medium because timeout errors without specific messaging produce permanent spinners or generic failures, making it impossible for users to distinguish retryable timeouts from permanent errors.
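A sketch of making timeouts distinguishable from other failures by racing the operation against a timer and rejecting with a dedicated error type. `TimeoutError`, `withTimeout`, and the message wording are hypothetical names for this example.

```typescript
class TimeoutError extends Error {
  constructor(public readonly limitMs: number) {
    super(`Operation exceeded ${limitMs} ms`);
    this.name = "TimeoutError";
  }
}

// Rejects with TimeoutError if the operation has not settled in time.
// Note: the underlying operation is not cancelled, only abandoned.
function withTimeout<T>(operation: Promise<T>, limitMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new TimeoutError(limitMs)), limitMs);
  });
  return Promise.race([operation, timeout]).finally(() => clearTimeout(timer));
}

// Maps the failure to a message that tells the user what happened.
function describeFailure(error: unknown): string {
  const timedOut = error instanceof Error && error.name === "TimeoutError";
  return timedOut
    ? "This took longer than expected and timed out. You can safely try again."
    : "The request failed. Please try again.";
}
```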
saas-error-handling.user-errors.timeout-messaging

A 429 response with no `Retry-After` header and no client-side timing message forces users to guess when to try again, which produces two bad outcomes: they either give up, or they hammer retry and trigger a retry storm that extends the rate-limit window. RFC 6585, which defines the 429 status code, notes that the response may include a `Retry-After` header (itself specified in RFC 9110) precisely so clients and humans can coordinate backoff. Without it, the rate limiter is invisible to the user and indistinguishable from a 500, erasing the distinction between throttling and an outage.
Why this severity: Low because rate limiting still protects the backend even without timing information; only the user experience degrades.
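A sketch of turning a `Retry-After` value into a user-facing message. The header may carry either a number of seconds or an HTTP date, so both forms are handled; the function names and message wording are assumptions.

```typescript
// Parses a Retry-After header value (delay in seconds or an HTTP date)
// into a whole number of seconds, or null when absent or unparseable.
function retryAfterSeconds(headerValue: string | null, now: Date = new Date()): number | null {
  if (headerValue === null) return null;
  const trimmed = headerValue.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed);
  const retryAt = Date.parse(trimmed);
  if (Number.isNaN(retryAt)) return null;
  return Math.max(0, Math.ceil((retryAt - now.getTime()) / 1000));
}

// Builds the message shown to the user for a 429 response.
function rateLimitMessage(headerValue: string | null): string {
  const seconds = retryAfterSeconds(headerValue);
  return seconds === null
    ? "Too many requests. Please wait a moment before trying again."
    : `Too many requests. Try again in ${seconds} seconds.`;
}
```

On the client, `headerValue` would typically come from `response.headers.get("Retry-After")` on a 429 response.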
saas-error-handling.user-errors.rate-limit-retry-explain

A dashboard that fetches user profile, billing status, and activity feed in parallel should degrade gracefully when only one source fails, because the other two sections are still useful. CWE-755 (Improper Handling of Exceptional Conditions) applies when a single data source failure crashes an entire page that could have partially rendered. ISO 25010 reliability.fault-tolerance requires that failures in one subsystem do not propagate to others. Practically, a single top-level error boundary means a transient billing API timeout takes down the entire dashboard, blocking users from accessing unrelated functionality they need.
Why this severity: High because a single top-level boundary lets any one component failure crash an entire multi-source page, blocking access to all other independent sections.
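On the data-fetching side, the same isolation can be sketched with `Promise.allSettled`, which reports each source's outcome separately instead of rejecting as soon as one fails. The three sources and the `DashboardSection` type are hypothetical stand-ins for the dashboard described above.

```typescript
type DashboardSection<T> =
  | { status: "ready"; data: T }
  | { status: "unavailable"; message: string };

// Converts one settled result into something a section can render.
function toSection<T>(result: PromiseSettledResult<T>, label: string): DashboardSection<T> {
  return result.status === "fulfilled"
    ? { status: "ready", data: result.value }
    : { status: "unavailable", message: `${label} is temporarily unavailable.` };
}

// Loads all three sources in parallel; a rejection in one source only
// affects its own section.
async function loadDashboard(sources: {
  profile: () => Promise<string>;
  billing: () => Promise<string>;
  activity: () => Promise<string[]>;
}) {
  const [profile, billing, activity] = await Promise.allSettled([
    sources.profile(),
    sources.billing(),
    sources.activity(),
  ]);
  return {
    profile: toSection(profile, "Profile"),
    billing: toSection(billing, "Billing"),
    activity: toSection(activity, "Activity"),
  };
}
```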
saas-error-handling.graceful-degradation.partial-failures-dont-crash

A user who fills out a multi-field signup form, hits a server error, and finds all their input cleared will abandon rather than re-type. ISO 25010 reliability.fault-tolerance requires that fault recovery not introduce secondary failures; clearing user data on error is exactly that. This is especially damaging in multi-step onboarding flows, payment forms, and file upload workflows where re-entering data has high friction. Catch blocks that call `form.reset()` or `setState({})` on error are the most common defect pattern: the form correctly shows an error message but simultaneously destroys the data the user needs to re-submit.
Why this severity: High because error handlers that clear form state force users to re-enter all input after a server failure, causing direct abandonment in high-value conversion flows.
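A sketch of a form reducer in which the failure transition keeps the entered values and only success clears them. The state shape and action names are assumptions, written framework-free so the same function could back a `useReducer` call.

```typescript
interface SignupFormState {
  values: Record<string, string>;
  submitting: boolean;
  error: string | null;
}

type SignupFormAction =
  | { type: "submit" }
  | { type: "success" }
  | { type: "failure"; message: string };

function signupFormReducer(state: SignupFormState, action: SignupFormAction): SignupFormState {
  switch (action.type) {
    case "submit":
      return { ...state, submitting: true, error: null };
    case "success":
      // Only a successful submission is allowed to clear the fields.
      return { values: {}, submitting: false, error: null };
    case "failure":
      // Spread keeps `values` untouched so the user can resubmit as-is.
      return { ...state, submitting: false, error: action.message };
  }
}
```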
saas-error-handling.graceful-degradation.error-recovery-no-data-loss

Background jobs that fail silently, especially those processing payments, sending billing notifications, or syncing shared data, can stay broken for hours before anyone notices. CWE-391 (Unchecked Error Condition) applies directly: a Vercel Cron route that throws without a try/catch produces no log and no retry path. CWE-778 (Insufficient Logging) applies when errors are caught but not recorded. ISO 25010 reliability.fault-tolerance requires that background faults be detectable and recoverable. Payment notification jobs and benchmark snapshot jobs that fail silently result in real user impact (missed invoices, stale data) with no operational signal.
Why this severity: Low for general background jobs, but elevated for cron jobs handling payments or notifications where silent failures cause direct user-visible impact with no retry path.
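A sketch of a job runner that logs every attempt and retries a bounded number of times. The `log` and `wait` callbacks are injected assumptions: `log` stands in for the project's logger and `wait` for a backoff delay, so the function itself has no timing or platform dependency.

```typescript
interface JobLogEntry {
  job: string;
  attempt: number;
  outcome: "succeeded" | "failed";
  error?: string;
}

// Runs the task up to maxAttempts times. Every attempt is logged, and the
// boolean result lets the caller alert when all attempts have failed.
async function runJob(
  name: string,
  task: () => Promise<void>,
  options: {
    maxAttempts: number;
    log: (entry: JobLogEntry) => void;         // assumed logger hook
    wait: (attempt: number) => Promise<void>;  // assumed backoff delay
  },
): Promise<boolean> {
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      await task();
      options.log({ job: name, attempt, outcome: "succeeded" });
      return true;
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      options.log({ job: name, attempt, outcome: "failed", error: message });
      if (attempt < options.maxAttempts) await options.wait(attempt);
    }
  }
  return false;
}
```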
saas-error-handling.graceful-degradation.background-job-retry

A catch block that contains only `console.error()` and no further action is functionally equivalent to swallowing the error in production. CWE-778 (Insufficient Logging) applies when logging goes only to a console that no one monitors in real time. CWE-391 (Unchecked Error Condition) applies when the error is recorded but not acted upon: no user feedback, no monitoring service capture. ISO 25010 reliability.fault-tolerance requires that error conditions trigger corrective action, not just log output. Most hosted Node.js environments (Vercel, Railway, Fly.io) don't surface raw `console.error` output to developers in real time, making bare-console error handling invisible in practice.
Why this severity: Low because `console.error`-only catch blocks produce a log record but no user feedback and no monitoring alert, making errors practically invisible in production hosting environments.
saas-error-handling.graceful-degradation.no-console-error-without-boundary

A codebase where some async functions use try/catch, others use `.catch()`, some use a shared wrapper, and some have no error handling at all makes every error handling review a full context-reload. CWE-755 (Improper Handling of Exceptional Conditions) applies when inconsistency means individual developers add error handling differently, creating gaps that are hard to detect in review. ISO 25010 maintainability.consistency classifies ad hoc error handling as a direct maintainability defect. Practically, a boilerplate-heavy pattern copy-pasted across 15 server actions means adding cross-cutting behavior, such as logging or monitoring calls, requires editing every location individually rather than once in a shared utility.
Why this severity: Low because inconsistent async error handling is a code quality and maintainability defect rather than an immediate security or reliability risk, but it degrades both over time.
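A sketch of the shared-utility approach: a higher-order function that wraps any async action so the try/catch, the logging call, and the result shape live in one place. `createAction`, `ActionResult`, and the `logError` hook are hypothetical names.

```typescript
type ActionResult<T> =
  | { ok: true; data: T }
  | { ok: false; error: { code: string; message: string } };

// Wraps an async action once; cross-cutting behavior such as logging is
// added here rather than in every individual action.
function createAction<Args extends unknown[], T>(
  name: string,
  action: (...args: Args) => Promise<T>,
  logError: (entry: { action: string; error: unknown }) => void,
): (...args: Args) => Promise<ActionResult<T>> {
  return async (...args) => {
    try {
      return { ok: true, data: await action(...args) };
    } catch (error) {
      logError({ action: name, error });
      return {
        ok: false,
        error: { code: "action_failed", message: "The request could not be completed." },
      };
    }
  };
}
```

Each server action is then defined by passing its body to `createAction`, and a later change such as adding a monitoring call touches only this wrapper.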
saas-error-handling.graceful-degradation.async-error-handling-consistent

Error state variables that are set on failure but never cleared produce a user-hostile artifact: a stale error banner that survives successful navigations, retries, and even cross-session logins on shared devices. Users who have already resolved the issue still see the old message, which undermines trust in the UI's accuracy and can leak information about a previous session's activity on a shared browser. Clearing error state before retries also prevents the double-state flash where old and new errors render simultaneously.
Why this severity: Low because the bug is cosmetic and self-limiting, but it visibly erodes trust every time it appears.
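A small sketch of the reset rule: the banner text is derived from the latest event, and every event other than a new failure clears it. The event names are assumptions covering the cases mentioned above (retry, success, navigation, sign-out).

```typescript
type BannerEvent =
  | { type: "failed"; message: string }
  | { type: "retry-started" }
  | { type: "succeeded" }
  | { type: "navigated" }
  | { type: "signed-out" };

// Returns the banner text to show after the given event. Only a failure
// sets a message; every other transition resets it to null.
function nextErrorBanner(event: BannerEvent): string | null {
  return event.type === "failed" ? event.message : null;
}
```

Calling this on `retry-started` before the new request is sent is what prevents an old error from rendering alongside a new one.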
saas-error-handling.graceful-degradation.error-state-reset-mechanism