Every time someone tells me they need a queue, I ask them what they're actually trying to do. About nine times out of ten, the answer is: run a job later, retry it if it fails, and make sure two workers don't run the same one at the same time. Postgres can do that.
The shape of the problem
You have tasks. You want to run them asynchronously, distribute them across N workers, retry them if they fail, and never execute the same task twice concurrently. That's it. Ninety percent of background jobs look exactly like this.
You do not need Kafka. You probably don't need Redis. You almost certainly don't need SQS.
The whole solution
create table job (
  id         bigserial primary key,
  payload    jsonb not null,
  run_at     timestamptz not null default now(),
  started_at timestamptz,
  attempts   int not null default 0
);

(The `not null` on run_at matters: the worker query filters on `run_at <= now()`, and a null run_at would never match.)
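One addition worth making from day one: a partial index, so workers never scan rows that are already claimed or finished. A sketch against the schema above; the index name is my own.

```sql
-- Only unclaimed, due jobs ever match the worker query,
-- so index exactly that slice and keep it small.
create index job_ready_idx on job (run_at)
  where started_at is null;
```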
Then, from each worker:
begin;

select id, payload from job
where started_at is null and run_at <= now()
order by run_at
for update skip locked
limit 1;

-- claim it and bump the retry counter; $1 is the id the select returned.
-- the row stays locked until commit, so no other worker can grab it,
-- and a crash mid-work rolls this back and leaves the job eligible again.
update job set started_at = now(), attempts = attempts + 1
where id = $1;

-- do the work

commit;
FOR UPDATE SKIP LOCKED is the whole trick: it hands each worker a different row without making them step on each other. No polling loop explodes under load, no two workers pick the same job, no external dependencies.
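If you'd rather not hold a transaction open for the duration of the work, the claim can be collapsed into a single statement. A sketch, same schema as above:

```sql
-- claim and fetch in one round trip; the subquery's SKIP LOCKED
-- still guarantees each worker gets a different row.
update job
set started_at = now(), attempts = attempts + 1
where id = (
  select id from job
  where started_at is null and run_at <= now()
  order by run_at
  for update skip locked
  limit 1
)
returning id, payload;
```

The trade-off: since the claim commits immediately, a crashed worker no longer rolls back automatically, so you need an explicit failure path (or a stale-job sweep on started_at) to release abandoned rows.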
Retries and dead letters
Add a failed_at column (the schema above already has the attempts counter). If attempts > 5, flag the row instead of retrying. Show the flagged rows on a page somewhere. You now have a dead-letter queue.
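Concretely, the failure path can look like this. A sketch: the linear backoff and the threshold of 5 are arbitrary choices, not gospel.

```sql
-- on handler failure, below the retry limit: release with backoff
update job
set started_at = null,
    run_at = now() + interval '1 minute' * attempts
where id = $1 and attempts <= 5;

-- past the limit: park it instead
update job set failed_at = now()
where id = $1 and attempts > 5;

-- the "DLQ page" is just a query
select id, payload, attempts from job
where failed_at is not null;
```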
You'll know you need a real queue when Postgres tells you — not when a blog post tells you.
What about exactly-once?
If your handler is idempotent — and it should be — then "exactly-once delivery" stops mattering. Make the unit of work safe to re-run. Drop the anxiety. Go home.
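What "idempotent" looks like in practice: key the side effect on the job id, so a re-run is a no-op. A sketch, assuming a job that sends an email and a bookkeeping table of my own invention:

```sql
create table email_sent (
  job_id bigint primary key
);

-- in the handler, before performing the side effect:
insert into email_sent (job_id) values ($1)
on conflict do nothing;
-- if that inserted zero rows, this job already ran: skip the send.
```

The primary key does the deduplication; no matter how many times the job is delivered, the email goes out at most once.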