In certain regions, Google operated googlemail.com as an alias for gmail.com, so alice@googlemail.com and alice@gmail.com are the same mailbox. Without alias resolution, the same person can appear as two separate contacts, receive duplicate sends, and have an unsubscribe applied to only one variant. googlemail.com is the most common example; additional alias relationships exist for other providers in specific markets. Duplicate contacts from unresolved aliases inflate list metrics, add send cost, and leave a suppression gap: an unsubscribe captured under one domain does not stop sends to the other.
Low because domain alias collisions are infrequent and affect a narrow set of known providers, creating manageable duplication rather than systematic data corruption.
Apply a domain alias map during normalization, resolving aliased domains to their canonical form before deduplication lookup:
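// Maps each aliased domain to its canonical domain.
// Confirm that an alias really shares mailboxes with its target before adding it.
const DOMAIN_ALIASES: Record<string, string> = {
  'googlemail.com': 'gmail.com',
  'googlemail.co.uk': 'gmail.com',
}

function normalizeEmail(email: string): string {
  const cleaned = email.trim().toLowerCase()
  // Split on the last '@' so a stray '@' cannot silently drop part of the address.
  const at = cleaned.lastIndexOf('@')
  if (at <= 0 || at === cleaned.length - 1) return cleaned
  // Strip the plus tag from the local part, then resolve the domain alias.
  const canonicalLocal = cleaned.slice(0, at).split('+')[0]
  const domain = cleaned.slice(at + 1)
  const canonicalDomain = DOMAIN_ALIASES[domain] ?? domain
  return `${canonicalLocal}@${canonicalDomain}`
}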
Combine this with plus-tag stripping (see ab-000860) in a single normalizeEmail function so all normalization logic runs in one pass. Add new alias entries as you discover them via duplicate detection reports.
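To show where the normalized address fits into the deduplication lookup, here is a minimal sketch that keeps the first contact seen for each canonical address. The `Contact` shape, the `ALIASES` map, and the `canonicalKey` and `dedupeContacts` names are assumptions made for illustration; they are not part of any existing schema or API.

```typescript
// Hypothetical alias map and contact shape, for illustration only.
const ALIASES: Record<string, string> = {
  'googlemail.com': 'gmail.com',
}

interface Contact {
  id: number
  email: string
}

// Lowercase, strip the plus tag, and resolve the domain alias in one pass.
function canonicalKey(email: string): string {
  const cleaned = email.trim().toLowerCase()
  const at = cleaned.lastIndexOf('@')
  // Leave malformed input (no local part or no domain) unchanged apart from casing.
  if (at <= 0 || at === cleaned.length - 1) return cleaned
  const rawLocal = cleaned.slice(0, at)
  // Fall back to the full local part if stripping the tag would leave it empty.
  const local = rawLocal.split('+')[0] || rawLocal
  const domain = cleaned.slice(at + 1)
  return `${local}@${ALIASES[domain] ?? domain}`
}

// Keep the first contact seen for each canonical address.
function dedupeContacts(contacts: Contact[]): Contact[] {
  const seen = new Map<string, Contact>()
  for (const contact of contacts) {
    const key = canonicalKey(contact.email)
    if (!seen.has(key)) seen.set(key, contact)
  }
  return Array.from(seen.values())
}
```

Keeping the first contact is only one possible merge policy; a real system might instead prefer the most recently active record or merge fields from both.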
ID: data-quality-list-hygiene.dedup-normalization.domain-alias-resolution
Severity: low
What to look for: Look for a domain alias map applied during normalization and count the mappings it defines. Check whether the system resolves known domain aliases before storage and deduplication. The most common example is googlemail.com, an alias of gmail.com in some regions: alice@googlemail.com and alice@gmail.com are the same mailbox.
Pass criteria: Known domain aliases are resolved to canonical domains before deduplication, with at least one alias mapping defined. At minimum, googlemail.com → gmail.com is handled.
Fail criteria: Domain aliases are not resolved, allowing the same person to appear as duplicate contacts under aliased domains.
Skip (N/A) when: The system serves a narrow, known audience (e.g., B2B enterprise only) where personal email domain aliases are not expected.
Detail on fail: Example: "alice@googlemail.com and alice@gmail.com stored as separate contacts"
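A failure like the example above can be surfaced from an existing list by grouping stored addresses under their alias-resolved form and reporting every group with more than one member. The sketch below assumes addresses are available as plain strings; `KNOWN_ALIASES`, `resolveAliasedAddress`, and `findAliasCollisions` are hypothetical names chosen for illustration.

```typescript
// Hypothetical alias map, for illustration only.
const KNOWN_ALIASES: Record<string, string> = {
  'googlemail.com': 'gmail.com',
}

// Lowercase the address and replace an aliased domain with its canonical domain.
function resolveAliasedAddress(email: string): string {
  const cleaned = email.trim().toLowerCase()
  const at = cleaned.lastIndexOf('@')
  if (at <= 0 || at === cleaned.length - 1) return cleaned
  const domain = cleaned.slice(at + 1)
  return `${cleaned.slice(0, at)}@${KNOWN_ALIASES[domain] ?? domain}`
}

// Return each group of stored addresses that collapse to the same resolved address.
function findAliasCollisions(emails: string[]): string[][] {
  const groups = new Map<string, string[]>()
  for (const email of emails) {
    const key = resolveAliasedAddress(email)
    const group = groups.get(key) ?? []
    group.push(email)
    groups.set(key, group)
  }
  return Array.from(groups.values()).filter((group) => group.length > 1)
}
```

Each returned group is a candidate for the "Detail on fail" message, and a report like this is also one way to discover alias entries that are missing from the map.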
Remediation: Apply a domain alias map during normalization:
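const DOMAIN_ALIASES: Record<string, string> = {
  'googlemail.com': 'gmail.com',
  'googlemail.co.uk': 'gmail.com',
  // Add others as discovered
}

// Returns the canonical domain for a known alias, or the lowercased input otherwise.
function resolveDomain(domain: string): string {
  const lowered = domain.toLowerCase()
  return DOMAIN_ALIASES[lowered] ?? lowered
}

function normalizeEmail(email: string): string {
  const cleaned = email.trim().toLowerCase()
  // Split on the last '@' so a stray '@' cannot silently drop part of the address.
  const at = cleaned.lastIndexOf('@')
  if (at <= 0 || at === cleaned.length - 1) return cleaned
  // Strip the plus tag from the local part, then resolve the domain alias.
  const canonicalLocal = cleaned.slice(0, at).split('+')[0]
  return `${canonicalLocal}@${resolveDomain(cleaned.slice(at + 1))}`
}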
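The suppression gap this check guards against arises when an unsubscribe is recorded under one variant of an address and a later send targets the other. As a rough sketch of closing that gap, the suppression list below stores and looks up the normalized address instead of the raw one. The `SuppressionList` class, `suppressionKey` function, and `SUPPRESSION_ALIASES` map are hypothetical and stand in for whatever storage the system actually uses.

```typescript
// Hypothetical alias map, for illustration only.
const SUPPRESSION_ALIASES: Record<string, string> = {
  'googlemail.com': 'gmail.com',
}

// Same normalization steps as the remediation: lowercase, strip plus tag, resolve alias.
function suppressionKey(email: string): string {
  const cleaned = email.trim().toLowerCase()
  const at = cleaned.lastIndexOf('@')
  if (at <= 0 || at === cleaned.length - 1) return cleaned
  const rawLocal = cleaned.slice(0, at)
  const local = rawLocal.split('+')[0] || rawLocal
  const domain = cleaned.slice(at + 1)
  return `${local}@${SUPPRESSION_ALIASES[domain] ?? domain}`
}

// In-memory stand-in for a suppression table keyed by normalized address.
class SuppressionList {
  private readonly keys = new Set<string>()

  // Record an unsubscribe under the normalized address.
  add(email: string): void {
    this.keys.add(suppressionKey(email))
  }

  // Check a send target against the list using the same normalization.
  isSuppressed(email: string): boolean {
    return this.keys.has(suppressionKey(email))
  }
}
```

Because writes and reads share one key function, an unsubscribe captured for alice@googlemail.com also blocks a send to alice@gmail.com.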