Enrichment from third-party services (Clearbit, Apollo, Hunter) supplements original data with fields those vendors sourced from their own pipelines, which may themselves have provenance gaps or consent issues. GDPR Art. 30 requires that the record of processing activities (RoPA) document all processing operations, including data augmentation from third parties. The provenance requirement of SLSA L1, written for software supply chains, is just as relevant to data supply chains. CWE-345 (Insufficient Verification of Data Authenticity) applies when enriched fields can no longer be traced to their origin. Without an enrichment log, you cannot tell a regulator which fields a third party supplied or on what legal basis.
Medium because without an enrichment chain of custody, the company cannot demonstrate GDPR Art. 30 compliance for enriched fields or reconstruct the data lineage required to respond to a data subject access request.
Create an enrichment_events table and write a record every time a contact is augmented from a third-party source. Capture at minimum: provider name, timestamp, and which fields were modified.
CREATE TABLE enrichment_events (
  id           TEXT PRIMARY KEY DEFAULT gen_random_uuid(),
  contact_id   TEXT NOT NULL REFERENCES contacts(id),
  provider     TEXT NOT NULL,                      -- 'clearbit', 'apollo', 'hunter', etc.
  enriched_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  fields_added TEXT[],                             -- column names that were updated
  provider_ref TEXT                                -- provider's record ID if available
);
Insert a row into this table inside the same transaction that updates the contact record, so the enrichment audit and the data change are always in sync.
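As a rough illustration of the same-transaction rule, the sketch below applies an enrichment and writes its audit row together, so a failure in either statement rolls back both. It is a minimal example under stated assumptions, not a reference implementation: it uses Python's built-in sqlite3 module in place of PostgreSQL so it runs standalone, stores fields_added as a comma-separated string because SQLite has no array type, and the contacts columns (email, company, title), the enrich_contact helper, and the sample values are all hypothetical.

```python
import sqlite3
import uuid
from datetime import datetime, timezone

# Assumed schema: SQLite stand-in for the PostgreSQL table described above.
SCHEMA = """
CREATE TABLE contacts (
    id      TEXT PRIMARY KEY,
    email   TEXT NOT NULL,
    company TEXT,
    title   TEXT
);
CREATE TABLE enrichment_events (
    id           TEXT PRIMARY KEY,
    contact_id   TEXT NOT NULL REFERENCES contacts(id),
    provider     TEXT NOT NULL,
    enriched_at  TEXT NOT NULL,
    fields_added TEXT,  -- comma-separated; SQLite has no TEXT[] type
    provider_ref TEXT
);
"""

# Column names cannot be bound as parameters, so restrict them to an allow-list.
ENRICHABLE_COLUMNS = {"company", "title"}


def enrich_contact(conn, contact_id, provider, new_fields, provider_ref=None):
    """Update the contact and record the audit row in a single transaction."""
    if not new_fields:
        raise ValueError("no fields to enrich")
    unknown = set(new_fields) - ENRICHABLE_COLUMNS
    if unknown:
        raise ValueError(f"unexpected enrichment fields: {sorted(unknown)}")
    assignments = ", ".join(f"{column} = ?" for column in new_fields)
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute(
            f"UPDATE contacts SET {assignments} WHERE id = ?",
            [*new_fields.values(), contact_id],
        )
        conn.execute(
            "INSERT INTO enrichment_events "
            "(id, contact_id, provider, enriched_at, fields_added, provider_ref) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (
                str(uuid.uuid4()),
                contact_id,
                provider,
                datetime.now(timezone.utc).isoformat(),
                ",".join(new_fields),
                provider_ref,
            ),
        )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO contacts (id, email) VALUES ('c1', 'ada@example.com')")
conn.commit()

enrich_contact(conn, "c1", "clearbit", {"company": "Acme", "title": "CTO"}, "cb_42")
```

Because both statements run inside one `with conn:` block, an audit insert that fails (for example, a missing provider violating NOT NULL) also undoes the contact update, which is the property the check is looking for.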
ID: data-sourcing-provenance.provenance-tracking.enrichment-chain-of-custody
Severity: medium
What to look for: Enumerate all enrichment code paths and count the fields captured per enrichment event. Look for code that enriches contact records with data from third-party services (Clearbit, Hunter, Apollo, etc.). Check whether the enrichment action is recorded with at least 3 fields: which enrichment provider was used, when the enrichment happened, and what fields were added or updated. This is distinct from the original provenance record — it documents what happened to the data after initial ingestion. Look for an enrichment_log table, a JSONB enrichment_history column, or event records that capture enrichment actions.
Pass criteria: When a contact is enriched from a third-party source, an audit record is created capturing at least 3 fields: the enrichment provider name, a timestamp, and the fields that were modified. Report the count of fields captured even on pass.
Fail criteria: Enrichment updates contact fields directly with no record of what was changed, by whom, or when; the enrichment log captures fewer than the 3 required fields; or no chain of custody exists for post-ingestion data modifications.
Skip (N/A) when: The system performs no post-ingestion enrichment — contacts are never supplemented with data from third-party services.
Detail on fail: "Clearbit enrichment updates contact fields directly with no audit log of what changed" or "No enrichment history found — chain of custody for enriched fields cannot be reconstructed".
Remediation: Add an enrichment audit table:
CREATE TABLE enrichment_events (
  id           TEXT PRIMARY KEY DEFAULT gen_random_uuid(),
  contact_id   TEXT NOT NULL REFERENCES contacts(id),
  provider     TEXT NOT NULL,                      -- 'clearbit', 'apollo', 'hunter', etc.
  enriched_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  fields_added TEXT[],                             -- which fields were updated
  provider_ref TEXT                                -- provider-specific record ID if available
);
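The pass, fail, and skip criteria above reduce to a per-path field count, which could be sketched roughly as follows. This is only an illustration of the decision logic: the function names, the labels for the three required fields, and the example code-path names are hypothetical, and how an auditor actually discovers which fields a path records is outside its scope.

```python
from typing import Dict, Optional, Set

# The three facts every enrichment audit record must capture (labels are assumed).
REQUIRED_FIELDS = {"provider", "timestamp", "fields_modified"}


def evaluate_path(captured: Optional[Set[str]]) -> dict:
    """Evaluate one enrichment code path.

    `captured` is the set of fields its audit record stores,
    or None when the path writes no audit record at all.
    """
    captured = captured or set()
    missing = sorted(REQUIRED_FIELDS - captured)
    return {
        "verdict": "fail" if missing else "pass",
        "fields_captured": len(captured),  # reported even on pass
        "missing": missing,
    }


def evaluate_check(paths: Dict[str, Optional[Set[str]]]) -> dict:
    """Combine per-path results; no enrichment paths at all means skip."""
    if not paths:
        return {"verdict": "skip", "paths": {}}
    results = {name: evaluate_path(captured) for name, captured in paths.items()}
    all_pass = all(result["verdict"] == "pass" for result in results.values())
    return {"verdict": "pass" if all_pass else "fail", "paths": results}


report = evaluate_check({
    "clearbit_sync": {"provider", "timestamp", "fields_modified", "provider_ref"},
    "hunter_lookup": None,  # updates contacts directly, no audit record
})
```

Here a single unaudited path fails the whole check, on the assumption that chain of custody is only as complete as its weakest enrichment path.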