CWE-359 (Exposure of Private Personal Information to an Unauthorized Actor) and NIST SP 800-53 Rev. 5 SI-12 (Information Management and Retention) both call for systematic controls on PII exposure. Handwritten regex patterns for PII detection have well-documented failure modes: phone number formats vary by country and era, email local parts accept characters most patterns miss, and edge cases in credit card BIN ranges cause false negatives. An established PII detection library or managed service provides format coverage that ad-hoc patterns cannot match. Even a general-purpose validation library such as validator (npm, roughly 10M weekly downloads) covers email addresses, mobile phone numbers across locales, and credit card numbers with a single dependency.
Severity is info because this check signals the quality and completeness of the PII filtering approach rather than whether any filtering exists; a passing `pii-redacted-before-api` check may already provide adequate coverage.
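To make the failure modes above concrete, here is a minimal sketch of two handwritten patterns of the kind often found in ad-hoc filters. The patterns and the `naiveLooksLikePii` helper are hypothetical and written only for illustration; they are not taken from any particular project.

```typescript
// Hypothetical handwritten patterns, typical of ad-hoc PII filters.
const naiveEmail = /^[a-z0-9._]+@[a-z0-9.]+\.[a-z]{2,}$/i
const naiveUsPhone = /^\d{3}-\d{3}-\d{4}$/

// Returns true when either naive pattern matches the whole token.
function naiveLooksLikePii(token: string): boolean {
  return naiveEmail.test(token) || naiveUsPhone.test(token)
}

// Formats the patterns were written for are caught:
//   naiveLooksLikePii('jane.doe@example.com')   -> true
//   naiveLooksLikePii('555-867-5309')           -> true
// Equally valid formats slip through:
//   naiveLooksLikePii('jane+news@example.com')  -> false ("+" in the local part)
//   naiveLooksLikePii('jane@my-company.io')     -> false (hyphen in the domain)
//   naiveLooksLikePii('+442079460958')          -> false (E.164 number outside the US format)
```

Each miss here is a false negative: the value would reach the AI provider unredacted.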
Install validator for lightweight multi-format PII detection without a managed service dependency:
npm install validator
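// lib/ai/pii-guard.ts
import validator from 'validator'

// Flags input when any whitespace-delimited token validates as an email,
// mobile number, or card number. Values written with internal spaces
// (for example "4111 1111 1111 1111") are not caught by this token scan.
export function containsPii(input: string): boolean {
  return input
    .split(/\s+/)
    // Strip surrounding punctuation so "jane@example.com," still validates.
    .map(token => token.replace(/^[^\w+]+|[^\w]+$/g, ''))
    .some(token =>
      validator.isEmail(token) ||
      validator.isMobilePhone(token, 'any') ||
      validator.isCreditCard(token)
    )
}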
For higher-stakes applications handling medical or government data, Amazon Comprehend's DetectPiiEntities API detects a much wider range of PII entity types, including names, addresses, passport numbers, and national identifiers. The added latency is worth it when the data sensitivity warrants it.
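Managed detection services typically return entity spans rather than a yes/no answer. The sketch below shows one way such spans could be applied to redact text before the AI invocation. The `PiiSpan` type and `redactSpans` helper are hypothetical; the field names follow the entity objects returned by DetectPiiEntities, the service call itself is omitted, and the 0.8 score threshold is an arbitrary assumption.

```typescript
// Hypothetical local type; field names follow the entity spans returned by DetectPiiEntities.
interface PiiSpan {
  Type: string
  Score: number
  BeginOffset: number
  EndOffset: number
}

// Replaces each sufficiently confident span with a "[TYPE]" placeholder.
// Assumes offsets index the string the way JavaScript does and that spans do not overlap.
function redactSpans(text: string, spans: PiiSpan[], minScore = 0.8): string {
  // Work from the end of the string so earlier offsets stay valid.
  const ordered = spans
    .filter(span => span.Score >= minScore)
    .sort((a, b) => b.BeginOffset - a.BeginOffset)

  let result = text
  for (const span of ordered) {
    result = result.slice(0, span.BeginOffset) + `[${span.Type}]` + result.slice(span.EndOffset)
  }
  return result
}
```

Redacting by span keeps the surrounding prompt intact, which is usually preferable to rejecting the whole input. Text containing characters outside the Basic Multilingual Plane may need an offset conversion step that this sketch does not attempt.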
ID: ai-data-privacy.pii-protection.pii-detection-library
Severity: info
What to look for: List every PII-related dependency and pattern you find. Check package.json for libraries with PII detection or input validation capabilities: validator, joi, zod (with regex refinements), @aws-sdk/client-comprehend, @google-cloud/dlp, or presidio-ts. Also look for custom regex patterns covering email, phone, SSN, and credit card formats, either in a dedicated utility file or inline in validation code.
Pass criteria: At least one of the following is present: a PII detection library in package.json, a comprehensive custom regex utility covering multiple PII types, or an external PII detection API call before the AI invocation.
Fail criteria: No PII-related libraries detected and no comprehensive custom redaction patterns found; PII filtering appears absent or relies on incomplete single-pattern checks.
Skip (N/A) when: The `pii-redacted-before-api` check already passed with evidence of a robust redaction implementation, making this library-presence check redundant.
Detail on fail: "No PII detection library found in package.json and no comprehensive custom PII pattern coverage detected"
Remediation: Basic regex patterns miss edge cases in phone number and email formats. Using an established library or service provides more reliable coverage.
For a lightweight option, validator (npm) covers email, phone, and common format validation:
npm install validator
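import validator from 'validator'

// Flags input when any whitespace-delimited token validates as an email,
// mobile number, or card number.
export function containsPii(input: string): boolean {
  const tokens = input
    .split(/\s+/)
    // Strip surrounding punctuation so "jane@example.com," still validates.
    .map(token => token.replace(/^[^\w+]+|[^\w]+$/g, ''))
  return tokens.some(token =>
    validator.isEmail(token) ||
    validator.isMobilePhone(token, 'any') ||
    validator.isCreditCard(token)
  )
}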
For higher-stakes applications, Amazon Comprehend's DetectPiiEntities API detects a much wider range of PII entity types, including names, addresses, passport numbers, and national identifiers.
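The package.json part of "What to look for" could be automated roughly as follows. This is a sketch under stated assumptions: `findPiiLibraries` and `PackageManifest` are hypothetical names, the package list is copied from the check description and is illustrative rather than exhaustive, and only runtime `dependencies` are inspected.

```typescript
// Package names taken from the check description; illustrative, not exhaustive.
const PII_CAPABLE_PACKAGES = [
  'validator',
  'joi',
  'zod',
  '@aws-sdk/client-comprehend',
  '@google-cloud/dlp',
  'presidio-ts',
]

// Hypothetical minimal shape of a parsed package.json.
interface PackageManifest {
  dependencies?: Record<string, string>
}

// Returns the PII-capable packages declared as runtime dependencies.
function findPiiLibraries(manifest: PackageManifest): string[] {
  const declared = new Set(Object.keys(manifest.dependencies ?? {}))
  return PII_CAPABLE_PACKAGES.filter(name => declared.has(name))
}
```

An empty result is not a failure on its own, since a comprehensive custom regex utility or an external detection API call also satisfies the pass criteria. Likewise, a hit on a general validation library such as joi or zod only shows the capability is available, not that it is applied to PII before the AI invocation.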