GDPR Art. 5(1)(c) and Art. 25 (privacy by design) require that personal data be limited to what is necessary for the stated purpose. Collecting a phone number on a SaaS tool with no SMS or calling feature, or a date of birth on a product with no age gate, is not a grey area: it is excess collection with no purpose to justify it. This creates real risk, because every field you collect is a field that can be breached, subpoenaed, or scraped. Excess collection also signals to regulators that data minimization was never considered, which can compound penalties for other violations. AI-built apps are particularly prone to schema bloat because code generators default to comprehensive field lists rather than minimal ones.
Critical because collecting personal data with no active purpose violates GDPR Art. 5(1)(c) directly, and unnecessary fields expand the breach surface with no compensating benefit to users.
Audit every form and database schema column together. For each personal data field, ask: which specific feature breaks if this field is removed? If nothing breaks, remove it from both the form and the schema.
// BEFORE: over-collection at signup
type SignupForm = {
  email: string      // needed
  password: string   // needed
  phone: string      // not used anywhere; remove
  birthdate: string  // no age gate; remove
  gender: string     // no personalization; remove
}
// AFTER: minimal compliant collection
type SignupForm = {
  email: string
  password: string
  displayName?: string // optional; only if shown in UI
}
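The "which feature breaks?" question above can be made explicit with a small purpose registry. The following is only a sketch: the `FieldPurpose` type, the `fieldsWithoutPurpose` helper, and the sample entries are hypothetical names introduced for illustration, not part of any library or of this check's tooling.

```typescript
// Hypothetical registry: each collected field names the shipped feature that needs it.
type FieldPurpose = {
  field: string;
  feature: string | null; // null means no shipped feature depends on this field
  required: boolean;      // whether the form currently marks the field as required
};

// Returns the fields that fail the "what breaks if this is removed?" test.
function fieldsWithoutPurpose(registry: FieldPurpose[]): string[] {
  return registry
    .filter((entry) => entry.feature === null)
    .map((entry) => entry.field);
}

// Sample data (assumed for illustration only).
const signupFields: FieldPurpose[] = [
  { field: "email", feature: "account delivery", required: true },
  { field: "password", feature: "authentication", required: true },
  { field: "phone", feature: null, required: true },
  { field: "birthdate", feature: null, required: false },
];

const removable = fieldsWithoutPurpose(signupFields);
console.log(removable); // ["phone", "birthdate"]
```

Keeping a registry like this next to the form definition also gives reviewers a single place to see the documented purpose of each field.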
For the database, run SELECT column_name FROM information_schema.columns WHERE table_name = 'users' and cross-reference against actual application queries. Drop columns that are never read. Delay collection of additional fields until the feature that requires them actually ships.
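One rough way to do the cross-reference is a textual search of the application's queries for each column name. The sketch below assumes you have already loaded the column list and the relevant source text into memory; `findUnreferencedColumns` and `escapeRegExp` are hypothetical helpers, and the sample queries are invented for illustration.

```typescript
// Escapes characters that have special meaning in a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns the columns whose names never appear as a whole word in any source text.
function findUnreferencedColumns(columns: string[], sources: string[]): string[] {
  return columns.filter((column) => {
    const pattern = new RegExp(`\\b${escapeRegExp(column)}\\b`);
    return !sources.some((source) => pattern.test(source));
  });
}

// Sample inputs (assumed): columns from information_schema, queries from the codebase.
const userColumns = ["email", "password_hash", "phone", "birthdate"];
const querySources = [
  "SELECT email, password_hash FROM users WHERE email = $1",
  "UPDATE users SET password_hash = $1 WHERE id = $2",
];

const unreferenced = findUnreferencedColumns(userColumns, querySources);
console.log(unreferenced); // ["phone", "birthdate"]
```

Treat the result as a list of candidates rather than proof. A textual match will miss columns read through `SELECT *` or through an ORM that maps names (for example camelCase properties onto snake_case columns), so confirm each candidate before dropping it.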
ID: gdpr-readiness.lawful-basis.data-minimization
Severity: critical
What to look for: Review all forms in the application that collect personal data: signup, onboarding, profile, checkout, contact, and survey forms. For each field that collects personal information, trace whether it is actually used in a product feature. Check the database schema for columns that are populated at signup but never queried. Common over-collection patterns in AI-built apps: phone number on a SaaS with no SMS or call feature, date of birth on a tool with no age gate, gender with no personalization, company size collected on a developer tool, "How did you hear about us" alongside extensive demographic questions. Also check whether required fields are truly required for service delivery, or whether required status is set by default without justification. Count all instances found and enumerate each.
Pass criteria: All collected personal data serves a documented, active purpose. No database columns storing personal data are persistently null or never queried by application code. Optional fields are genuinely optional in the UI. No fields are collected "in case they're useful later." At least 1 implementation must be confirmed.
Fail criteria: Forms collect fields the product does not use. Required fields include data not needed for service delivery. Database schema has personal data columns that are always null or never read. Onboarding flow requests extensive demographic data for a product that has no personalization features.
Skip (N/A) when: No skip condition applies. Data minimization is evaluable for any application that collects personal data.
Detail on fail: Specify which fields are excessive. Example: "Signup form requires phone number, but no SMS, call, or 2FA-by-phone feature exists. Phone column populated in users table and never queried." or "Onboarding form collects birthdate, gender, and company revenue range. No personalization, age-gating, or segmentation logic found in codebase."
Remediation: Audit all forms and the database schema together. For each personal data field, ask: "What specific feature breaks if this field is removed?" If nothing breaks, remove it. If it may be needed in the future, leave it out for now and add it, as an optional field where possible, once the feature ships:
// BEFORE: collecting excessive data at signup
type SignupForm = {
  email: string        // needed: account delivery (contract basis)
  password: string     // needed: authentication
  fullName: string     // only shown in UI; keep as optional displayName
  phone: string        // not used anywhere; remove
  birthdate: string    // no age gate; remove
  gender: string       // no personalization; remove
  companySize: string  // no segmentation logic; remove
}
// AFTER: minimal compliant collection
type SignupForm = {
  email: string
  password: string
  displayName?: string // optional; collected only if used in UI
}
For the database, audit unused columns: SELECT column_name FROM information_schema.columns WHERE table_name = 'users', then cross-reference with actual application queries. Drop columns that are never read.
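The fail criteria also mention columns that are always null. A sample of rows can surface those candidates; the sketch below is one possible approach, with `alwaysEmptyColumns` and the sample rows being hypothetical. Treating an empty string as "empty" is an assumption you may want to change.

```typescript
type Row = Record<string, unknown>;

// Assumption: null, undefined, and "" all count as "no data stored".
function isEmpty(value: unknown): boolean {
  return value === null || value === undefined || value === "";
}

// Returns the columns that hold no data in any of the sampled rows.
// With no rows there is no evidence either way, so nothing is flagged.
function alwaysEmptyColumns(rows: Row[], columns: string[]): string[] {
  if (rows.length === 0) {
    return [];
  }
  return columns.filter((column) => rows.every((row) => isEmpty(row[column])));
}

// Sample rows (invented for illustration), e.g. the result of a SELECT on users.
const sampleRows: Row[] = [
  { email: "a@example.com", phone: null, gender: "" },
  { email: "b@example.com", phone: null, gender: "f" },
];

const emptyColumns = alwaysEmptyColumns(sampleRows, ["email", "phone", "gender"]);
console.log(emptyColumns); // ["phone"]
```

A column that is empty in a sample may still be populated in rows outside it, so run the check against a representative data set before treating a column as dead.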