GDPR Art. 17 (right to erasure) and Art. 5(1)(e) (storage limitation) require that personal data be erased when the basis for holding it no longer applies. If a purchased list is invalidated — the vendor is found to have collected it unlawfully, or the list exceeds its contractual retention period — you need a reliable mechanism to remove or anonymize all contacts derived from that source. CWE-459 (Incomplete Cleanup) applies when deletion of a parent record leaves orphaned child records with dangling foreign keys and no defined disposition.
Medium because orphaned contacts after a source deletion leave the system in a GDPR Art. 17 non-compliant state — records exist with no valid processing basis and no mechanism to find or remove them.
Define explicit cleanup behavior for source deletion via a database trigger. Anonymization (replacing source_id with a tombstone) is preferred over cascade deletion when you need to retain aggregate analytics.
-- Option 1: Anonymize on source delete (preferred)
CREATE OR REPLACE FUNCTION anonymize_contacts_on_source_delete()
RETURNS TRIGGER LANGUAGE plpgsql AS $$
BEGIN
UPDATE contacts
SET source_id = 'deleted:' || OLD.id
WHERE source_id = OLD.id;
RETURN OLD;
END;
$$;
CREATE TRIGGER on_source_delete
BEFORE DELETE ON data_sources
FOR EACH ROW EXECUTE FUNCTION anonymize_contacts_on_source_delete();
-- Option 2: Cascade delete (use when retention is not needed)
ALTER TABLE contacts
ADD CONSTRAINT fk_contacts_source
FOREIGN KEY (source_id) REFERENCES data_sources(id)
ON DELETE CASCADE;
ID: data-sourcing-provenance.provenance-tracking.source-deletion-cascade
Severity: medium
What to look for: Count all code paths that handle source deletion or deactivation and classify each by cleanup strategy (cascade, anonymize, or flag). Look at what happens when a data source is removed from the system (e.g., a purchased list is invalidated, a scraping source is decommissioned). Does the system handle the orphan contacts — either deleting them, anonymizing the provenance fields, or flagging them for review? Check for ON DELETE CASCADE on foreign key relationships between sources and contacts, or application-level cleanup logic triggered when a source is removed.
Pass criteria: At least 1 explicit cleanup mechanism exists. When a source record is deleted or marked inactive, the system either: cascades deletion to contacts derived from that source, anonymizes the provenance reference (replacing source_id with a tombstone), or flags affected contacts for manual review. The behavior is explicit and implemented.
Fail criteria: Deleting or deactivating a source leaves contacts with a dangling source_id reference, with no code path handling the orphaned records.
Skip (N/A) when: Sources are never deleted or deactivated — the system only appends sources.
Detail on fail: "No cascade or cleanup behavior defined for source deletion — orphaned contacts would retain a reference to a deleted source".
Remediation: Define explicit behavior on source deletion:
-- Option 1: Cascade delete
ALTER TABLE contacts
ADD CONSTRAINT fk_contacts_source
FOREIGN KEY (source_id) REFERENCES data_sources(id)
ON DELETE CASCADE;
-- Option 2: Anonymize (preferred when retaining for analytics)
CREATE OR REPLACE FUNCTION anonymize_contacts_on_source_delete()
RETURNS TRIGGER LANGUAGE plpgsql AS $$
BEGIN
UPDATE contacts
SET source_id = 'deleted:' || OLD.id
WHERE source_id = OLD.id;
RETURN OLD;
END;
$$;
CREATE TRIGGER on_source_delete
BEFORE DELETE ON data_sources
FOR EACH ROW EXECUTE FUNCTION anonymize_contacts_on_source_delete();