Contacts and suppression lists are the two most operationally critical datasets in an email system. Losing the suppression list means mailing previously unsubscribed contacts, a CAN-SPAM violation that can trigger FTC enforcement, regulatory fines, and recipient complaints. Losing contacts means losing revenue and user relationships with no recovery path. NIST SP 800-53 CP-9 (Information System Backup) requires that backup procedures are tested, not just configured. ISO 25010 reliability.recoverability requires that a restore procedure exists and is documented. A backup without a documented restore procedure is not a recovery plan; it is an archive of unknown integrity.
Critical because loss of the contacts or suppression list without a tested restore procedure means permanent data loss, CAN-SPAM violations, and irreversible damage to sender reputation.
Configure a daily backup job and document the restore procedure. At minimum, add a scheduled workflow:
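# .github/workflows/db-backup.yml
# Assumes repository secrets named DATABASE_URL, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY.
on:
  schedule:
    - cron: '0 2 * * *' # 2am UTC daily
jobs:
  backup:
    runs-on: ubuntu-latest
    env:
      DATABASE_URL: ${{ secrets.DATABASE_URL }}
      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    steps:
      - name: Dump contacts and suppression_list tables
        shell: bash # explicit bash runs with pipefail, so a pg_dump failure fails the step
        run: pg_dump "$DATABASE_URL" -t contacts -t suppression_list | gzip > backup-$(date +%Y%m%d).sql.gz
      - name: Upload to S3
        run: aws s3 cp backup-*.sql.gz s3://my-backups/email/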
Document the restore steps in docs/runbooks/db-restore.md with at least three explicit steps: retrieve the backup, restore to staging, verify row counts. Run a restore drill quarterly to confirm the procedure works end-to-end.
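The three runbook steps named above (retrieve the backup, restore to staging, verify row counts) could be scripted roughly as follows. This is a minimal sketch, not a definitive procedure: `STAGING_DATABASE_URL`, the function names, and the minimum row count of 1 are assumptions made for illustration. The bucket path and `backup-YYYYMMDD.sql.gz` naming follow the backup workflow, and the sketch presumes the staging database does not already contain the two tables.

```shell
#!/usr/bin/env bash
# Restore-drill sketch. Hypothetical names: STAGING_DATABASE_URL, latest_backup_key,
# check_row_count, restore_drill. Bucket and file naming follow the backup workflow.

BACKUP_PREFIX="s3://my-backups/email/"

# Step 1 helper: read `aws s3 ls` output on stdin and print the newest backup file name.
# Names embed YYYYMMDD, so a plain lexical sort is also a chronological sort.
latest_backup_key() {
  awk '$4 ~ /^backup-[0-9]+\.sql\.gz$/ { print $4 }' | sort | tail -n 1
}

# Step 3 helper: fail when a restored table has fewer rows than the expected minimum.
check_row_count() {
  local table="$1" actual="$2" minimum="$3"
  if [ "$actual" -lt "$minimum" ]; then
    echo "FAIL ${table}: ${actual} rows, expected at least ${minimum}" >&2
    return 1
  fi
  echo "OK ${table}: ${actual} rows"
}

# Full drill. Runs in a subshell so `set -euo pipefail` does not leak into the caller.
restore_drill() (
  set -euo pipefail
  : "${STAGING_DATABASE_URL:?set this to a disposable staging database}"

  # 1. Retrieve the latest backup.
  key="$(aws s3 ls "$BACKUP_PREFIX" | latest_backup_key)"
  [ -n "$key" ] || { echo "no backup found under ${BACKUP_PREFIX}" >&2; exit 1; }
  workdir="$(mktemp -d)"
  aws s3 cp "${BACKUP_PREFIX}${key}" "${workdir}/${key}"

  # 2. Restore to staging, stopping at the first SQL error.
  gunzip -c "${workdir}/${key}" | psql "$STAGING_DATABASE_URL" -v ON_ERROR_STOP=1

  # 3. Verify row counts for both tables.
  for table in contacts suppression_list; do
    rows="$(psql "$STAGING_DATABASE_URL" -At -c "SELECT count(*) FROM ${table}")"
    check_row_count "$table" "$rows" 1
  done
)
```

Sourcing the file and calling `restore_drill` on its own line runs all three steps. A stricter drill might compare the restored counts against the production counts recorded at backup time instead of a fixed minimum.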
ID: operational-resilience-email.failure-recovery.contact-db-backup-tested
Severity: critical
What to look for: Enumerate all backup mechanisms covering the contacts/subscribers table: backup scripts in scripts/, cron jobs in CI/CD workflows, managed backup service configuration (Supabase PITR enabled, RDS automated backup settings referenced in infrastructure code or documentation), or S3 sync jobs. Count the number of tables covered by backups (contacts and suppression_list at minimum). Also look for a documented restore procedure — a runbook file, a README section, or a docs/runbooks/ entry that describes the steps to restore from backup. The Data Quality & List Hygiene Audit's suppression list data is part of what must survive this backup.
Pass criteria: A backup configuration exists for the contact and suppression database — either a managed backup service is referenced in infrastructure code or documentation, or a backup script/cron job exists in the repository. A restore procedure is documented in a runbook or README with at least 3 steps (retrieve backup, restore to staging, verify row counts). Must not pass when backup exists but no restore procedure is documented — backup without tested restore is not a recovery plan.
Fail criteria: Any of the following: no backup configuration is found in infrastructure code, deployment configs, or documentation; no restore procedure is documented anywhere in the repository; or a backup exists but the documented restore procedure has fewer than 3 steps.
Skip (N/A) when: The project stores no contact data (purely transactional, contacts stored in external CRM only) — confirmed by the absence of contacts, subscribers, or suppression_list tables in the database schema.
Detail on fail: Describe what is missing. Example: "No backup scripts or managed backup configuration found — contact database has no recovery path" or "No restore procedure documented — recovery steps are not captured in any runbook or README"
Remediation: At minimum, configure a daily backup job and document the restore procedure:
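# .github/workflows/db-backup.yml
# Assumes repository secrets named DATABASE_URL, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY.
on:
  schedule:
    - cron: '0 2 * * *' # 2am UTC daily
jobs:
  backup:
    runs-on: ubuntu-latest
    env:
      DATABASE_URL: ${{ secrets.DATABASE_URL }}
      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    steps:
      - name: Dump contacts and suppression_list tables
        shell: bash # explicit bash runs with pipefail, so a pg_dump failure fails the step
        run: pg_dump "$DATABASE_URL" -t contacts -t suppression_list | gzip > backup-$(date +%Y%m%d).sql.gz
      - name: Upload to S3
        run: aws s3 cp backup-*.sql.gz s3://my-backups/email/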
Document the restore steps in docs/runbooks/db-restore.md — include how to retrieve the latest backup, restore to a staging database, and verify row counts. After implementing, run a restore drill quarterly to verify the procedure works end-to-end.
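The "What to look for" and pass criteria in this check can be approximated with a small repository scan. The sketch below is a heuristic under stated assumptions, not the audit's actual implementation: the function names are hypothetical, it only detects script or workflow backups (managed services such as Supabase PITR or RDS automated backups would need a separate check), and it assumes the runbook writes its steps as a numbered list.

```shell
#!/usr/bin/env bash
# Heuristic audit helpers (hypothetical names, not part of any existing tool).

# Print files under the given directories that appear to run a backup command.
find_backup_files() {
  grep -rlE 'pg_dump|aws s3 (cp|sync)' "$@" 2>/dev/null || true
}

# Print how many of the required tables a backup file mentions (0, 1, or 2).
tables_covered() {
  local file="$1" table covered=0
  for table in contacts suppression_list; do
    if grep -qw -- "$table" "$file"; then
      covered=$((covered + 1))
    fi
  done
  echo "$covered"
}

# Count numbered steps ("1. ..." or "1) ...") in a runbook.
count_runbook_steps() {
  grep -cE '^[[:space:]]*[0-9]+[.)][[:space:]]' "$1" || true
}

# Pass only when the runbook exists and has at least the minimum number of steps.
runbook_passes() {
  local file="$1" minimum="${2:-3}" steps
  [ -f "$file" ] || { echo "FAIL: ${file} not found" >&2; return 1; }
  steps="$(count_runbook_steps "$file")"
  if [ "$steps" -lt "$minimum" ]; then
    echo "FAIL: ${file} has ${steps} steps, expected at least ${minimum}" >&2
    return 1
  fi
  echo "PASS: ${file} documents ${steps} restore steps"
}
```

A typical invocation would be `find_backup_files scripts .github/workflows` followed by `runbook_passes docs/runbooks/db-restore.md`. Counting numbered lines only shows that steps are written down; it does not show that the procedure has been exercised, which is what the quarterly drill is for.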