Email lowercased and trimmed before storage

ab-000859 · data-quality-list-hygiene.dedup-normalization.email-normalization

Severity: critical · active

Why it matters

RFC 5321 makes the domain part of an email address case-insensitive and formally allows case-sensitive local parts, but it discourages relying on that, and nearly all mailbox providers treat the whole address as case-insensitive. Most databases, however, store addresses as case-sensitive strings. Without normalization, Alice@Example.com and alice@example.com are stored as separate contacts, breaking deduplication, suppression lookups, and unsubscribe processing. The CAN-SPAM Act (§ 5) requires that unsubscribe requests be honored within 10 business days; if the suppression lookup misses an address because of a case mismatch, the sender is in violation. GDPR Art. 17 right-to-erasure requests face the same risk: a deletion keyed on the normalized form misses records stored in a different case. CWE-20 applies to the failure to normalize input before persistence.

Severity rationale

Critical because case mismatch silently breaks suppression lookups and unsubscribe processing, creating direct CAN-SPAM compliance exposure and duplicate sends to the same person.

Remediation

Normalize email to lowercase and trim whitespace at the point of storage — not at display time. Enforce it at the database level as well:

const normalizedEmail = rawEmail.trim().toLowerCase()

await db.contact.upsert({
  where: { email: normalizedEmail },
  update: { updated_at: new Date() },
  create: { email: normalizedEmail }
})

Add a PostgreSQL check constraint to prevent non-lowercase values reaching the column:

ALTER TABLE contacts ADD CONSTRAINT email_lowercase
  CHECK (email = lower(email));

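Run a backfill migration before adding the constraint: UPDATE contacts SET email = lower(trim(email)); If the column has a unique index, first merge any rows that would collide once lowercased, otherwise the update fails. PostgreSQL's trim() strips only spaces by default, so stored tabs or newlines need an explicit character list or a regular expression. Apply the same normalization to suppression lookups: a suppression table keyed on mixed-case emails defeats the deduplication entirely.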
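One way to keep write paths and suppression lookups consistent is to route both through a single helper. The sketch below is illustrative only: normalizeEmail and SuppressionList are hypothetical names, and the in-memory Set stands in for whatever suppression table the system actually uses.

```typescript
// Hypothetical helper: the one place where addresses are normalized.
function normalizeEmail(raw: string): string {
  return raw.trim().toLowerCase();
}

// Hypothetical in-memory stand-in for a suppression table.
class SuppressionList {
  private readonly entries = new Set<string>();

  // Normalize on write so the stored key is always lowercase and trimmed.
  add(rawEmail: string): void {
    this.entries.add(normalizeEmail(rawEmail));
  }

  // Normalize on read so the lookup cannot miss because of case or padding.
  has(rawEmail: string): boolean {
    return this.entries.has(normalizeEmail(rawEmail));
  }
}

const suppressed = new SuppressionList();
suppressed.add("Alice@Example.com ");
console.log(suppressed.has("alice@example.com")); // true
```

Because both add and has call the same function, a change to the normalization rule cannot leave the two sides out of step.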

Detection

ID: email-normalization
Severity: critical
What to look for: Enumerate all code paths that store email addresses to the database. Count every write path found and for each, verify that .toLowerCase() and .trim() (or equivalent) are applied before the value is persisted — quote the actual normalization call. Without this, Alice@Example.com and alice@example.com are stored as different contacts, breaking deduplication and suppression lookups. Also check whether existing data has mixed-case records that need backfilling.
Pass criteria: Count all email write paths and report the ratio: "N of N write paths normalize before storage." At least one write path must exist, and every write path must normalize. All stored email addresses are lowercase with no leading or trailing whitespace. Database queries for email lookup also normalize the query input.
Fail criteria: Email addresses are stored as-entered without lowercasing, or the database column has a case-sensitive collation and no normalization is applied.
Skip (N/A) when: Never — any system that stores email addresses for later use must normalize them.
Cross-reference: Check data-quality-list-hygiene.dedup-normalization.plus-address-normalization — normalization should also cover plus-tag stripping when deduplication is the goal.
Detail on fail: Describe the specific failure. Example: "Email stored as-entered with no toLowerCase() call — database contains 'Alice@Example.com' and 'alice@example.com' as separate contacts" or "Normalization applied at display but not at storage — raw email column has mixed case"
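The pass criteria above can be sketched as two small checks. This is a minimal illustration, not part of the rule itself: WritePath, evaluateWritePaths, and findUnnormalized are hypothetical names, and the write-path inventory is assumed to be compiled by the reviewer after reading the code.

```typescript
// Hypothetical record for one code path that persists an email address.
interface WritePath {
  name: string;        // e.g. a handler or job name chosen by the reviewer
  normalizes: boolean; // true if trim + lowercase happen before the write
}

// Produces the ratio wording from the pass criteria and the pass/fail verdict.
function evaluateWritePaths(paths: WritePath[]): { ratio: string; pass: boolean } {
  const normalizing = paths.filter((p) => p.normalizes).length;
  return {
    ratio: `${normalizing} of ${paths.length} write paths normalize before storage.`,
    // At least one write path must exist, and all of them must normalize.
    pass: paths.length >= 1 && normalizing === paths.length,
  };
}

// Returns stored values that differ from their trimmed, lowercased form.
function findUnnormalized(storedEmails: string[]): string[] {
  return storedEmails.filter((email) => email !== email.trim().toLowerCase());
}
```

A non-empty result from findUnnormalized on a sample of stored addresses indicates that existing data needs a backfill, even if every current write path passes.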

Remediation: Normalize at the point of storage, not at the point of display:

// In your ingest handler:
const normalizedEmail = rawEmail.trim().toLowerCase()

await db.contact.upsert({
  where: { email: normalizedEmail },
  update: { updated_at: new Date() },
  create: { email: normalizedEmail /* , ...other fields */ }
})

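For PostgreSQL, either use the CITEXT column type so that comparisons ignore case, or add a CHECK constraint to enforce lowercase storage at the database level:

-- Option 1: convert the existing column to CITEXT (case-insensitive comparisons, stores as-is; needs the citext extension)
CREATE EXTENSION IF NOT EXISTS citext;
ALTER TABLE contacts ALTER COLUMN email TYPE CITEXT;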

-- Option 2: Check constraint enforcing lowercase storage
ALTER TABLE contacts ADD CONSTRAINT email_lowercase
  CHECK (email = lower(email));

Run a backfill migration to normalize existing records before adding the constraint.
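Before the backfill runs, it can help to know which existing rows would collapse into the same address, since those rows need merging rather than a plain update. The following is a rough sketch under the assumption that the stored addresses have been read into memory; findBackfillCollisions is a hypothetical name.

```typescript
// Groups stored addresses by their normalized form and returns only the
// groups with more than one stored variant, i.e. rows that would collide.
function findBackfillCollisions(storedEmails: string[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const email of storedEmails) {
    const key = email.trim().toLowerCase();
    const group = groups.get(key) ?? [];
    group.push(email);
    groups.set(key, group);
  }

  const collisions = new Map<string, string[]>();
  groups.forEach((variants, key) => {
    if (variants.length > 1) collisions.set(key, variants);
  });
  return collisions;
}
```

Each key in the result is a normalized address, and its value lists the stored variants that map to it. For large tables the same grouping is better done in the database; the in-memory version here only illustrates the logic.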

External references

cwe · CWE-20 — Improper Input Validation
iso-25010:2011 · functional-correctness — Functional Correctness (functional suitability)
gdpr · Art. 17 — Right to erasure ('right to be forgotten')

Taxons

data-integrity

History

2026-04-18·v1.0.0·Initial import from data-quality-list-hygiene·automated