Dezen Technology
Engineering · Apr 7, 2026 · 7 min read

Stripe + idempotency: the patterns every SaaS gets wrong

Dedupe by event ID BEFORE doing real work, anchor it in the database, and use Idempotency-Key on outgoing calls. The pattern we ship that ends 3am double-charge incidents.

The first time we got paged at 3am about Stripe charging a customer twice, the root cause was three lines of code: a webhook handler that did real work before it had marked the event as seen, then crashed before it could ACK. Stripe retried. We charged twice. Customer support spent the next week refunding.

Idempotency in Stripe (and any webhook-driven system) is one of those topics where most writeups say “use the idempotency key” and stop. Real production code needs a tighter pattern. Here’s the one we ship.

The retry-safe handler in one diagram

[Figure: idempotent webhook handling flow. Receive event, dedupe by event.id, run handler, ACK.]

The pattern: dedupe by event ID BEFORE doing real work, using the database’s UNIQUE constraint as the source of truth. Not application code, not Redis, not memory.

-- Once, in your migrations:
CREATE TABLE webhook_events (
  id        TEXT PRIMARY KEY,    -- the Stripe event.id
  received  TIMESTAMPTZ DEFAULT now(),
  payload   JSONB NOT NULL
);

-- In your handler (Postgres dialect):
INSERT INTO webhook_events (id, payload)
VALUES ($1, $2)
ON CONFLICT (id) DO NOTHING
RETURNING id;

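If RETURNING id gives you a row, this is the first time you’ve seen the event, so do the work. If it’s empty, you’ve seen this event before: ACK 200 and move on. Run the insert and the work in the same transaction, so a failure rolls back the dedupe row and Stripe’s retry gets a clean attempt. The database does the heavy lifting; your application code stays simple.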
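A minimal sketch of that handler shape, in Python. To keep it runnable anywhere, SQLite stands in for Postgres, so INSERT OR IGNORE plus a row count replaces ON CONFLICT ... RETURNING. The names open_db, first_delivery, handle_webhook and do_work are ours, invented for illustration.

```python
import json
import sqlite3


def open_db() -> sqlite3.Connection:
    """In-memory SQLite stand-in for the Postgres webhook_events table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE webhook_events ("
        " id TEXT PRIMARY KEY,"
        " received TEXT DEFAULT CURRENT_TIMESTAMP,"
        " payload TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def first_delivery(conn: sqlite3.Connection, event: dict) -> bool:
    """Try to record the event; True only the first time this id is seen."""
    cur = conn.execute(
        # SQLite spelling of: INSERT ... ON CONFLICT (id) DO NOTHING
        "INSERT OR IGNORE INTO webhook_events (id, payload) VALUES (?, ?)",
        (event["id"], json.dumps(event)),
    )
    return cur.rowcount == 1  # 1 row inserted = new event, 0 = duplicate


def handle_webhook(conn: sqlite3.Connection, event: dict, do_work) -> int:
    """Dedupe first, then work, then ACK. Returns the HTTP status."""
    with conn:  # one transaction: commits on success, rolls back on error
        if first_delivery(conn, event):
            do_work(event)  # hypothetical callback for the real work
    return 200  # ACK both first deliveries and duplicates
```

Delivering the same event twice calls do_work once and returns 200 both times.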

The four mistakes that cause double charges

1. Doing work before deduping

Common pattern: receive event → charge the customer → record the event. If the process crashes between step 2 and step 3, Stripe retries and you charge again. Wrong order. Always dedupe first.

2. Treating a 500 response as a retry signal but doing the work anyway

Stripe retries on any non-2xx response, on timeouts, and on connection errors. If your handler does the work and then returns 500 because of a downstream error, Stripe will deliver the event again. Make sure that EVERY path that does work writes the dedupe row first. EVERY path.

3. Skipping the same logic for outgoing API calls

Idempotency cuts both ways. When YOU call Stripe (charge a card, create a subscription), pass an Idempotency-Key header so retries from YOUR side don’t double-create on Stripe’s side. Generate the key deterministically from your own business event (e.g. `payment-${userId}-${invoiceId}`), not randomly.
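One way to build such a key, sketched in Python. The helper names idempotency_key and outgoing_headers are hypothetical, and the hashing fallback is our own choice for staying under Stripe’s 255-character limit on keys. It is not something Stripe requires.

```python
import hashlib

MAX_KEY_LENGTH = 255  # Stripe accepts idempotency keys up to 255 characters


def idempotency_key(operation: str, user_id: str, invoice_id: str) -> str:
    """Derive the key from the business event, so a retry reuses it."""
    raw = f"{operation}-{user_id}-{invoice_id}"
    if len(raw) <= MAX_KEY_LENGTH:
        return raw
    # Oversized inputs: fall back to a fixed-length, still deterministic hash.
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def outgoing_headers(key: str) -> dict:
    """Header to attach to the outgoing Stripe request."""
    return {"Idempotency-Key": key}
```

Because the key depends only on the operation, user and invoice, a retried job or a second worker picking up the same invoice sends the same key, and Stripe treats the call as a repeat rather than a new charge.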

4. Relying on application-level locks

“If event.id is in this in-memory set, skip it” doesn’t survive a deploy or a horizontal scale-out. The dedupe state has to be in the database, in the same transaction as the work itself, or you’ll lose the protection right when you most need it.
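To make the “same transaction” point concrete, here is a hedged Python sketch of the failure path. SQLite stands in for Postgres again, the paid_invoices table and the fail flag are invented for the demo, and the “work” is a database write so that it can be rolled back.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE webhook_events (id TEXT PRIMARY KEY)")
    # Hypothetical table standing in for the real work.
    conn.execute("CREATE TABLE paid_invoices (invoice_id TEXT PRIMARY KEY)")
    conn.commit()
    return conn


def handle_webhook(conn: sqlite3.Connection, event: dict, fail: bool = False) -> int:
    """Dedupe row and work commit together, or not at all."""
    try:
        with conn:  # commits on normal exit, rolls back on exception
            cur = conn.execute(
                "INSERT OR IGNORE INTO webhook_events (id) VALUES (?)",
                (event["id"],),
            )
            if cur.rowcount == 0:
                return 200  # duplicate delivery: ACK, do nothing
            conn.execute(
                "INSERT INTO paid_invoices (invoice_id) VALUES (?)",
                (event["invoice_id"],),
            )
            if fail:
                raise RuntimeError("simulated crash after the work")
    except Exception:
        return 500  # nothing was committed, so the retry starts clean
    return 200
```

A failed attempt leaves neither row behind, the retry does the work once, and any later duplicate is skipped. Side effects outside the database, such as a call to Stripe, cannot be rolled back this way, which is why the outgoing Idempotency-Key still matters.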

Edge cases that trip up most implementations

  • Event ordering. Stripe doesn’t guarantee webhooks arrive in order. Don’t code as if they do. Use event.created timestamps and a state machine.
  • Event types you don’t handle yet. Always ACK 200, even if you don’t know what to do with the event. Otherwise Stripe keeps retrying with exponential backoff, for up to three days in live mode.
  • Slow handlers. Stripe times out after 30s. If your work might take longer, ACK fast and enqueue the heavy work to a job queue.
  • Signature verification. Always verify the Stripe-Signature header before processing. Otherwise anyone with your endpoint URL can mint fake events.
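For the signature check, Stripe’s documented scheme is an HMAC-SHA256 over the timestamp, a dot, and the raw request body, sent in the header as t=...,v1=.... The Python sketch below follows that scheme by hand so the steps are visible. The function name and the injectable now parameter are ours, and in production we would reach for the official library’s stripe.Webhook.construct_event instead of rolling our own.

```python
import hashlib
import hmac
import time


def verify_stripe_signature(
    payload: bytes,
    header: str,
    secret: str,
    tolerance: int = 300,  # seconds; rejects replays of old events
    now=None,              # injectable clock, for tests only
) -> bool:
    """Return True if the Stripe-Signature header matches the raw body."""
    timestamp = None
    candidates = []
    for part in header.split(","):
        key, _, value = part.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            candidates.append(value)
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False

    # Signed payload is "<timestamp>.<raw body>", keyed by the endpoint secret.
    signed = timestamp.encode("utf-8") + b"." + payload
    expected = hmac.new(secret.encode("utf-8"), signed, hashlib.sha256).hexdigest()

    # Constant-time comparison against every v1 signature in the header.
    if not any(
        hmac.compare_digest(expected.encode("utf-8"), c.encode("utf-8"))
        for c in candidates
    ):
        return False

    current = time.time() if now is None else now
    return abs(current - int(timestamp)) <= tolerance
```

Verify against the raw bytes of the request body, before any JSON parsing, since re-serializing the payload changes the bytes and breaks the signature.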

The same pattern works everywhere

This isn’t Stripe-specific. The same “UNIQUE constraint on event_id” pattern works for SQS, EventBridge, webhooks from any vendor, S3 event notifications, and Slack interactions: anything where the producer might deliver the same event twice. Once you internalize it, the production-incident surface drops noticeably.

How we approach this

Every SaaS we ship via our SaaS Product Development service ships with this pattern as the webhook spine on day one. It’s a 5-minute decision that pays back for the lifetime of the product.

Takeaways

  • Dedupe by event.id BEFORE doing real work.
  • Use a Postgres UNIQUE constraint, not application state.
  • Use Idempotency-Key on outgoing Stripe calls too.
  • Always ACK 200 for events you don’t handle.
  • Verify the Stripe-Signature header. Always.