Sending raw email addresses or full names to analytics platforms violates GDPR Art. 25 (data protection by design and by default) and Art. 5(1)(b) (purpose limitation), because most privacy policies do not authorize analytics processors to receive directly identifying PII. OWASP A02:2021 (Cryptographic Failures, formerly Sensitive Data Exposure) covers unnecessary exposure of sensitive data such as PII. CWE-359 specifically covers exposure of private personal information to unauthorized parties, which is exactly what happens when analytics.identify(user.email) fires. If a Segment, Mixpanel, or GA4 account is breached or subpoenaed, the attacker or requesting party receives a map from behavioral data to real identities. ISO-27001:2022 A.5.34 requires privacy controls to be embedded in processing, not bolted on afterward.
Critical because sending PII to third-party analytics processors creates an irrevocable exposure: once the data is transmitted, you cannot recall it or control downstream retention.
Replace direct PII in analytics calls with a salted SHA-256 hash. Add this to lib/analytics.ts and use it everywhere you call identify() or set user_id.
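// lib/analytics.ts
import { createHash } from 'crypto'
// One-way, consistent pseudonymous ID per user
export function pseudoId(email: string): string {
  const salt = process.env.ANALYTICS_SALT
  // Fail loudly: an unset salt would otherwise be hashed as the string "undefined"
  if (!salt) throw new Error('ANALYTICS_SALT is not set')
  return createHash('sha256')
    // Normalize so the same address always maps to the same ID
    .update(email.trim().toLowerCase() + salt)
    .digest('hex')
    .slice(0, 16) // keep the first 16 hex chars (64 bits) of the digest
}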
// Segment / Amplitude / Mixpanel:
analytics.identify(pseudoId(user.email), {
  plan: user.plan, // OK: not PII
  createdAt: user.createdAt
  // Never: email, name, phone, address
})
// GA4:
gtag('config', 'G-XXXXXXXX', { user_id: pseudoId(user.email) })
Audit logs must reference userId (UUID), not email or name. Keep PII-to-ID correlation in a secure internal admin tool only, never in the log store itself.
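The "never send email, name, phone, address" rule is easy to break when traits objects are built far from the analytics call. One way to guard against that is to filter traits at the call site. The following is a minimal sketch, not part of any analytics SDK: the helper name stripPii and the key list PII_KEYS are hypothetical and should be adapted to your own schema.

```typescript
// Hypothetical denylist of trait keys that must never reach an analytics vendor.
// Keys are stored lowercased with separators removed (see normalization below).
const PII_KEYS = new Set([
  'email', 'name', 'firstname', 'lastname', 'fullname', 'phone', 'address',
])

// Returns a copy of `traits` without any key that looks like direct PII.
function stripPii(traits: Record<string, unknown>): Record<string, unknown> {
  const safe: Record<string, unknown> = {}
  for (const [key, value] of Object.entries(traits)) {
    // Normalize so "first_name", "first-name", and "firstName" all match "firstname"
    const normalized = key.toLowerCase().replace(/[_-]/g, '')
    if (!PII_KEYS.has(normalized)) safe[key] = value
  }
  return safe
}
```

A call would then look like analytics.identify(pseudoId(user.email), stripPii(traits)). A denylist only catches the keys you thought of; an allowlist of known-safe keys such as plan and createdAt is stricter if your trait set is small.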
ID: data-protection.data-collection-consent.pii-separation
Severity: critical
What to look for: Enumerate every relevant item. Examine the database schema and analytics implementation. Does the users table store email, phone, and full name alongside the user_id? That is acceptable — separation means not using PII as the identifier itself (e.g., using email address as the primary key or sending it directly to analytics). Check analytics configuration: in GA4 look for gtag('config', 'G-XXXXXXXX', { user_id: ... }) — what value is passed? It should be a UUID or hash, not an email or name. Check Segment, Mixpanel, and Amplitude identify() calls for the same pattern. Search for identify(user.email) or setUser(user.name) — these are violations. Check audit logs (if present) to confirm log entries use IDs, not names or email addresses.
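The search described above can be partly automated. The sketch below scans source text line by line for identify()/setUser() calls and GA4 user_id settings that pass a PII-looking property directly. The function name findPiiIdentifiers and both regular expressions are illustrative assumptions; they will miss indirect cases (for example, a variable that was assigned user.email earlier), so treat hits as leads for manual review rather than a complete audit.

```typescript
interface Finding {
  line: number // 1-based line number within the scanned source
  text: string // the offending line, trimmed
}

// identify(user.email), setUser(user.name), etc. A wrapped call such as
// identify(pseudoId(user.email)) does not match, because "(" follows the helper name.
const PII_CALL = /\b(identify|setUser)\(\s*[\w.]*\.(email|name|fullName|phone)\b/
// GA4 config objects: { user_id: user.email }
const GA4_USER_ID = /user_id\s*:\s*[\w.]*\.(email|name|fullName|phone)\b/

function findPiiIdentifiers(source: string): Finding[] {
  const findings: Finding[] = []
  source.split('\n').forEach((text, index) => {
    if (PII_CALL.test(text) || GA4_USER_ID.test(text)) {
      findings.push({ line: index + 1, text: text.trim() })
    }
  })
  return findings
}
```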
Pass criteria: At least 1 of the following conditions is met. Analytics identify() calls use a pseudonymous ID (a UUID or a SHA-256 hash of the email, never the raw email). Audit logs reference user_id not plaintext email or name. The application does not use email address as a primary key or URL parameter that ends up in server logs.
Fail criteria: Analytics receives raw email addresses or full names as the user identity. Audit logs contain plaintext PII. Email is used as a primary key and appears in URLs (e.g., /user/alice@example.com/profile).
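To check the URL condition against real traffic, you can test request paths from server logs for embedded email addresses. This is a rough sketch under stated assumptions: the function name urlContainsEmail is hypothetical, and the pattern is a loose detector for things that look like an address, not an RFC 5322 validator.

```typescript
// Loose "looks like an email" pattern: something@something.something,
// with no whitespace, slashes, or extra @ signs inside the parts.
const EMAIL_LIKE = /[^\s/@]+@[^\s/@]+\.[^\s/@]+/

// True when a URL (path or full URL) appears to contain an email address,
// including percent-encoded forms such as alice%40example.com.
function urlContainsEmail(url: string): boolean {
  let decoded: string
  try {
    decoded = decodeURIComponent(url)
  } catch {
    // Malformed percent-encoding: fall back to the raw string
    decoded = url
  }
  return EMAIL_LIKE.test(decoded)
}
```

Run against the example above, urlContainsEmail('/user/alice@example.com/profile') is true, while a path keyed by an opaque ID is not flagged.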
Do NOT pass when: The item exists only as a placeholder, stub, or TODO comment — partial implementation does not count as passing.
Skip (N/A) when: The application has no analytics, no audit logging, and no user identification beyond a session-scoped token.
Cross-reference: For broader data handling practices, the Data Protection audit covers data lifecycle management.
Detail on fail: Specify the issue. Example: "Segment identify() called with user.email as userId in src/analytics.ts." or "Audit logs include plaintext email addresses in user_action column." or "GA4 user_id set to user.email in _app.tsx."
Remediation: Hash or replace PII before sending to analytics or logs:
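// lib/analytics.ts
import { createHash } from 'crypto'
// Pseudonymous ID: one-way hash of email, consistent per user
export function pseudoId(email: string): string {
  const salt = process.env.ANALYTICS_SALT
  // Fail loudly: an unset salt would otherwise be hashed as the string "undefined"
  if (!salt) throw new Error('ANALYTICS_SALT is not set')
  // Normalize so the same address always maps to the same ID
  return createHash('sha256').update(email.trim().toLowerCase() + salt).digest('hex').slice(0, 16)
}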
// Usage in Segment/Amplitude/Mixpanel:
analytics.identify(pseudoId(user.email), {
  plan: user.plan, // OK: not PII
  createdAt: user.createdAt // OK: not PII
  // Never send: email, name, phone, address
})
// For GA4:
gtag('config', 'G-XXXXXXXX', { user_id: pseudoId(user.email) })
Audit logs should reference userId (UUID) only. If you need to correlate a log entry back to an email during incident response, that lookup should happen in a secure internal admin tool, not be stored in the log itself.
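The audit-log rule can be enforced at the point where entries are created. The sketch below assumes user IDs are UUIDs, as stated above; the AuditEntry shape and the auditEntry() helper are hypothetical names for illustration, not an existing logging API.

```typescript
interface AuditEntry {
  userId: string // opaque UUID, never an email or name
  action: string // e.g. "password_reset"
  at: string     // ISO 8601 timestamp
}

// Matches the canonical 8-4-4-4-12 hex UUID layout, case-insensitive
const UUID = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i

// Builds a log entry and rejects anything that is not a UUID,
// so an email passed by mistake never reaches the log store.
function auditEntry(userId: string, action: string, now: Date = new Date()): AuditEntry {
  if (!UUID.test(userId)) {
    throw new Error('audit userId must be a UUID, not PII')
  }
  return { userId, action, at: now.toISOString() }
}
```

Rejecting non-UUID values at write time is a deliberate choice here: a thrown error during development is cheaper than scrubbing emails out of a log store later.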