Retries are idempotent — no duplicate sends
Why it matters
A worker that calls the ESP, receives a success response, and then crashes before updating the database leaves the job unacknowledged. BullMQ re-queues it, and the next worker to pick it up sends an identical transactional email (a password reset, an order confirmation, a payment receipt) to a real recipient who has already received it. CWE-362 covers race conditions on shared resources; this is the classic check-then-act race across a network boundary. Unlike the case covered by the queue-level dedup check, this failure occurs specifically on the retry after a successful but unacknowledged send.
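The failure sequence described above can be reproduced with a small in-memory simulation. This is only a sketch: `outbox`, `sentLog`, `processUnguarded`, and `runWithRetry` are hypothetical stand-ins for the ESP, the database, the job processor, and the queue's retry loop, and the calls are synchronous to keep the example short.

```typescript
// Hypothetical in-memory stand-ins; nothing here is a real BullMQ, database, or ESP API.
const outbox: string[] = [];                  // every entry is an email the "ESP" delivered
const sentLog = new Map<string, "sent">();    // the "database" record of completed sends

// Unguarded processor: sends first, records afterwards, with no check at the start.
function processUnguarded(recipientId: string, crashAfterSend: boolean): void {
  outbox.push(recipientId);                   // the ESP accepts the message
  if (crashAfterSend) {
    throw new Error("worker crashed before DB update");
  }
  sentLog.set(recipientId, "sent");
}

// Simulates the queue: re-run the job until the processor returns without throwing.
function runWithRetry(recipientId: string): number {
  let attempts = 0;
  for (;;) {
    attempts++;
    try {
      processUnguarded(recipientId, attempts === 1); // crash only on the first attempt
      return attempts;
    } catch {
      // the queue re-queues the job; loop again
    }
  }
}

const attempts = runWithRetry("user-1");
console.log(`attempts=${attempts} delivered=${outbox.length}`);
```

In this simulation the recipient receives the same email twice even though the job is recorded as sent only once, which is exactly the gap the check targets.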
Severity rationale
Critical because worker crashes during ESP network I/O are expected events, not edge cases. Without an idempotency guard, every crash that lands between the ESP's success response and the database update produces a duplicate transactional email to a real recipient.
Remediation
Establish the 'sending' status in the database before calling the ESP, and check for it on every retry, in workers/email.worker.ts:
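async function processEmailJob(job: Job<EmailJobData>) {
  const { campaignId, recipientId } = job.data

  // Look up any earlier attempt for this campaign/recipient pair.
  const log = await db.emailSendLog.findFirst({
    where: { campaignId, recipientId }
  })
  // A previous attempt completed; nothing left to do.
  if (log?.status === 'sent') return

  // First attempt: record the intent to send before calling the ESP.
  if (!log) {
    await db.emailSendLog.create({
      data: { campaignId, recipientId, status: 'sending', jobId: job.id }
    })
  }

  // The key is identical on every retry of this job, so an ESP that
  // honors idempotency keys can discard a repeated request.
  const idempotencyKey = `${campaignId}:${recipientId}:${job.id}`
  await esp.send({ ...job.data, idempotencyKey })

  await db.emailSendLog.update({
    where: { campaignId_recipientId: { campaignId, recipientId } },
    data: { status: 'sent', sentAt: new Date() }
  })
}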
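To see why the guarded ordering in the remediation above survives a crash, the following sketch replays the same crash-then-retry sequence against in-memory stand-ins. It rests on two assumptions that should be verified for a real deployment: the ESP deduplicates requests that carry the same idempotency key, and the retried job keeps its original job id. `fakeEspSend`, `processGuarded`, and the two collections are hypothetical names, and the calls are synchronous for brevity.

```typescript
// Hypothetical in-memory stand-ins; none of these are real BullMQ, Prisma, or ESP APIs.
type Status = "sending" | "sent";

const sendLog = new Map<string, Status>();   // "database" rows keyed by campaign:recipient
const acceptedKeys = new Set<string>();      // idempotency keys the fake ESP has already seen
let delivered = 0;                           // emails the fake ESP actually delivered

// Assumed ESP behavior: a repeated idempotency key is accepted but not delivered again.
function fakeEspSend(idempotencyKey: string): void {
  if (acceptedKeys.has(idempotencyKey)) return;
  acceptedKeys.add(idempotencyKey);
  delivered++;
}

function processGuarded(
  campaignId: string,
  recipientId: string,
  jobId: string,
  crashAfterSend: boolean
): void {
  const rowKey = `${campaignId}:${recipientId}`;
  if (sendLog.get(rowKey) === "sent") return;                // an earlier attempt completed
  if (!sendLog.has(rowKey)) sendLog.set(rowKey, "sending");  // record intent before sending
  fakeEspSend(`${rowKey}:${jobId}`);                         // same key on every retry
  if (crashAfterSend) {
    throw new Error("worker crashed before DB update");
  }
  sendLog.set(rowKey, "sent");
}

// First attempt crashes after the ESP accepted the message.
try {
  processGuarded("c1", "r1", "job-1", true);
} catch {
  // the queue re-queues the job
}
const statusAfterCrash = sendLog.get("c1:r1");

// The retry reuses the same job id, so the fake ESP discards the repeat.
processGuarded("c1", "r1", "job-1", false);
console.log(`statusAfterCrash=${statusAfterCrash} delivered=${delivered}`);
```

Note that the 'sending' row alone does not stop the second ESP call in this sketch; the duplicate is suppressed by the idempotency key. With an ESP that ignores the key, a retry that finds a 'sending' row would still send again.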
Detection
- ID: idempotent-retries
- Severity: critical
- What to look for: This check is distinct from the dedup guard in Queue Architecture; it specifically verifies retry behavior after a partial failure. Examine the sequence of operations in the job processor: does the worker mark a send as successful before or after confirming that the ESP has accepted it? If the worker calls the ESP, receives a success response, and then crashes before updating the database, the job will be retried and the email will be sent twice. Look for a guard that makes the retry safe: a send record written to the database before the ESP call, an ESP idempotency key, or the dedup check described in the queue-architecture category.
- Pass criteria: The job processor is structured so that a retry after partial failure does not result in a duplicate send. Enumerate all idempotency mechanisms: (a) dedup key check before sending, (b) ESP idempotency key, (c) "sending" status in database. At least one mechanism must be present. The dedup check must occur before the ESP API call, not after.
- Fail criteria: The job processor calls the ESP and then updates the database, with no dedup check at the start. A worker crash after the ESP call but before the DB update causes the next retry to send again.
- Skip (N/A) when: The queue provides exactly-once delivery guarantees and no retry is possible; note the mechanism in the detail field.
- Detail on fail: "Worker calls ESP then updates DB; a crash between these two operations causes the next retry to send a duplicate email" or "No idempotency key passed to Mailgun; retried jobs send duplicates if the original request succeeded but the response was lost"
- Remediation: Use an idempotency key or a pre-flight dedup check:
async function processEmailJob(job: Job<EmailJobData>) {
  const { campaignId, recipientId } = job.data

  // Option 1: Use an ESP idempotency key. This only helps if the ESP
  // deduplicates on it; confirm support in your provider's documentation.
  const idempotencyKey = `${campaignId}:${recipientId}:${job.id}`

  // Option 2: Pre-flight dedup check (see queue-architecture.at-least-once-dedup)
  const alreadySent = await db.emailSendLog.findFirst({
    where: { campaignId, recipientId, status: { in: ['sending', 'sent'] } }
  })
  if (alreadySent?.status === 'sent') return
  if (!alreadySent) {
    await db.emailSendLog.create({
      data: { campaignId, recipientId, status: 'sending', jobId: job.id }
    })
  }

  // Send with idempotency key
  const html = await renderTemplate(job.data)
  await sgMail.send({
    to: job.data.to,
    from: process.env.EMAIL_FROM!,
    subject: job.data.subject,
    html,
    text: convert(html),
    // Passed as a custom message header; it prevents duplicates only if
    // the ESP treats this header as an idempotency key.
    headers: { 'X-Idempotency-Key': idempotencyKey }
  })

  await db.emailSendLog.update({
    where: { campaignId_recipientId: { campaignId, recipientId } },
    data: { status: 'sent', sentAt: new Date() }
  })
}
External references
- cwe · CWE-362 — Concurrent Execution Using Shared Resource With Improper Synchronization
- iso-25010:2011 · reliability.fault-tolerance — Fault Tolerance
Taxons
History
- 2026-04-18·v1.0.0·Initial import from sending-pipeline-infrastructure·automated