Messaging, Queues & Async Work

Why you push slow work off the request path, queue vs pub/sub, the three delivery guarantees, and why idempotency is the price of reliability.

Why async at all

If a request triggers slow work (send email, run an ML scoring job, evaluate alerts), doing it inline makes the user wait and couples your uptime to the downstream's. Put a queue between them: the request enqueues a job and returns fast; workers consume at their own pace.

Benefits: decoupling (producer/consumer scale independently), buffering (absorb spikes), resilience (retry on failure), and smoothing load. StockVision's APScheduler alert worker and LandAI's ingestion are exactly this: slow/bursty work moved off the hot path.
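
A minimal sketch of the enqueue-and-return pattern, using only the standard library. The in-process queue.Queue stands in for a real broker, and handle_request, send_email, and the sentinel shutdown are hypothetical names chosen for illustration, not anything from StockVision or LandAI.

```python
import queue
import threading
import time

jobs = queue.Queue()   # stands in for a real broker (assumption)
sent = []              # records what the worker actually did


def handle_request(user_email):
    """Request path: enqueue the slow work and return immediately."""
    jobs.put({"type": "send_email", "to": user_email})
    return {"status": "accepted"}


def send_email(to):
    """Hypothetical slow downstream call."""
    time.sleep(0.05)
    sent.append(to)


def worker():
    """Consume jobs at the worker's own pace until a None sentinel arrives."""
    while True:
        job = jobs.get()
        try:
            if job is None:
                return
            send_email(job["to"])
        finally:
            jobs.task_done()


t = threading.Thread(target=worker, daemon=True)
t.start()
response = handle_request("a@example.com")  # returns before the email is sent
jobs.join()      # demo only: wait until the worker has drained the queue
jobs.put(None)   # ask the worker to stop
t.join()
```

The caller gets its response as soon as the job is enqueued; the 50 ms of "email sending" happens on the worker thread. An in-process queue loses its jobs if the process dies, which is why real systems put a durable broker in this position.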

Queue vs pub/sub

Queue (point-to-point): each message is processed by one consumer — work distribution (e.g. task queues like RabbitMQ/SQS).
Pub/sub (fan-out): each message goes to every subscriber — event broadcast (e.g. Kafka topics, Redis pub/sub for live price ticks).

Kafka blurs the line: a partitioned, durable log where consumer groups split partitions (queue-like) while different groups each see all events (pub/sub-like).
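
To make the consumer-group idea concrete, here is a toy in-memory partitioned log. It is a sketch, not Kafka's API: TinyLog, partition_for, the round-robin partition assignment, and the group and member names are all assumptions made for illustration.

```python
from zlib import crc32

NUM_PARTITIONS = 4


def partition_for(key):
    """Stable hash so the same key always lands in the same partition."""
    return crc32(key.encode("utf-8")) % NUM_PARTITIONS


class TinyLog:
    """Toy partitioned log; events are kept, never deleted on read."""

    def __init__(self):
        self.partitions = [[] for _ in range(NUM_PARTITIONS)]

    def publish(self, key, value):
        self.partitions[partition_for(key)].append((key, value))

    def read(self, group_members, member):
        """Return events from the partitions assigned to one group member.

        Partitions are dealt out round-robin across the group (assumption).
        """
        slot = group_members.index(member)
        events = []
        for number, partition in enumerate(self.partitions):
            if number % len(group_members) == slot:
                events.extend(partition)
        return events


log = TinyLog()
for i, symbol in enumerate(["AAPL", "MSFT", "AAPL", "GOOG", "MSFT", "AAPL"]):
    log.publish(symbol, i)

alerts = ["alerts-1", "alerts-2"]   # one group, two members: they split the work
audit = ["audit-1"]                 # a second group: sees every event

split = log.read(alerts, "alerts-1") + log.read(alerts, "alerts-2")
everything = log.read(audit, "audit-1")
```

Within the alerts group each event is read by exactly one member (queue-like), while the audit group independently reads all six events (pub/sub-like). A real log also tracks a committed offset per group and partition, which this sketch leaves out.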

Delivery guarantees

Guarantee	Meaning	Cost
At-most-once	may drop, never duplicates	simplest, lossy
At-least-once	never drops, may duplicate	the common default
Exactly-once	no loss, no dupes	expensive; often "effectively-once" via idempotency
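
The difference between the first two rows comes down to whether the consumer acks before or after doing the work. A sketch under stated assumptions: ToyBroker, Crash, and make_handler are hypothetical stand-ins, and the "crash" is simulated with an exception.

```python
class Crash(Exception):
    """Simulates the consumer process dying."""


class ToyBroker:
    """Keeps a message until it is acked, so unacked messages are redelivered."""

    def __init__(self, messages):
        self.pending = list(messages)

    def receive(self):
        return self.pending[0] if self.pending else None

    def ack(self, msg):
        self.pending.remove(msg)


def at_most_once(broker, handle):
    msg = broker.receive()
    broker.ack(msg)   # ack first: a crash inside handle() loses the message
    handle(msg)


def at_least_once(broker, handle):
    msg = broker.receive()
    handle(msg)       # work first: a crash before ack() causes redelivery
    broker.ack(msg)


def make_handler(log, crash_after_work):
    """Build a handler that crashes on its first call only."""
    calls = {"n": 0}

    def handle(msg):
        calls["n"] += 1
        if calls["n"] == 1 and not crash_after_work:
            raise Crash("died before doing the work")
        log.append(msg)
        if calls["n"] == 1 and crash_after_work:
            raise Crash("died after the work, before the ack")

    return handle


lost_log = []
b1 = ToyBroker(["m1"])
try:
    at_most_once(b1, make_handler(lost_log, crash_after_work=False))
except Crash:
    pass   # the message was already acked, so it is gone

dup_log = []
b2 = ToyBroker(["m1"])
handler = make_handler(dup_log, crash_after_work=True)
while b2.pending:
    try:
        at_least_once(b2, handler)
    except Crash:
        pass   # not acked, so the next loop iteration sees it again
```

After the run, lost_log is empty (the message was dropped) and dup_log holds "m1" twice (the work was done twice).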

At-least-once means you WILL see duplicates

If a worker crashes after doing the work but before ack'ing, the broker redelivers the message and it gets processed again. So consumers must be idempotent: processing the same message twice has the same effect as processing it once. Use an idempotency key / dedupe table, or make the operation naturally idempotent (upsert, set-to-value).
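
One way to sketch the dedupe-table approach, using an in-memory SQLite database. The table names, the deposit example, and handle_deposit are hypothetical; the point is that the idempotency-key insert and the side effect commit in the same transaction.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE processed (message_id TEXT PRIMARY KEY)")
db.execute("CREATE TABLE balances (account TEXT PRIMARY KEY, cents INTEGER NOT NULL)")
db.execute("INSERT INTO balances VALUES ('acct-1', 0)")
db.commit()


def handle_deposit(message_id, account, cents):
    """Apply a deposit at most once per message_id.

    Returns True if the deposit was applied, False if it was a duplicate.
    """
    try:
        with db:  # one transaction: commits on success, rolls back on error
            # The primary key rejects a message_id that was already recorded.
            db.execute(
                "INSERT INTO processed (message_id) VALUES (?)", (message_id,)
            )
            db.execute(
                "UPDATE balances SET cents = cents + ? WHERE account = ?",
                (cents, account),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = handle_deposit("msg-1", "acct-1", 500)    # applied
second = handle_deposit("msg-1", "acct-1", 500)   # redelivery, ignored
balance = db.execute(
    "SELECT cents FROM balances WHERE account = 'acct-1'"
).fetchone()[0]
```

The balance ends at 500, not 1000. This only works when the dedupe record and the effect live in the same transactional store; if the side effect is an external call (sending an email, charging a card), the downstream needs its own idempotency key.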

Ordering, backpressure, DLQs

Ordering is only guaranteed within a partition/key, so design order-sensitive events to share a key.
Backpressure: if producers outrun consumers, the queue grows; bound it and shed or slow producers.
Dead-letter queue: messages that keep failing go to a DLQ for inspection instead of blocking the pipeline forever.
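
The backpressure and dead-letter points can be sketched together with a bounded in-process queue. MAX_ATTEMPTS, try_enqueue, drain, and the "poison" payload are assumptions for illustration; a real consumer would usually requeue with a delay between attempts rather than retry immediately.

```python
import queue

MAX_ATTEMPTS = 3                 # assumed retry budget per message
work = queue.Queue(maxsize=2)    # bounded: the queue cannot grow without limit
dead_letters = []                # stands in for a real DLQ
done = []


def try_enqueue(msg):
    """Backpressure: refuse new work instead of growing the queue forever."""
    try:
        work.put_nowait(msg)
        return True
    except queue.Full:
        return False   # caller can shed the message or slow down and retry


def handle(msg):
    """Hypothetical handler that always fails on one bad payload."""
    if msg == "poison":
        raise ValueError("bad payload")
    done.append(msg)


def drain():
    """Process everything queued; park repeat failures in the DLQ."""
    while True:
        try:
            msg = work.get_nowait()
        except queue.Empty:
            return
        for attempt in range(1, MAX_ATTEMPTS + 1):
            try:
                handle(msg)
                break
            except Exception as exc:
                if attempt == MAX_ATTEMPTS:
                    # Give up on this message so the ones behind it can proceed.
                    dead_letters.append((msg, repr(exc)))


accepted = [try_enqueue(m) for m in ["poison", "ok-1", "ok-2"]]
drain()
```

The third enqueue is rejected because the queue is full, the poisoned message lands in dead_letters after three attempts, and "ok-1" behind it still gets processed.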

Design drills

Queues are where reliability lives or dies. Drill the failure modes.

Whiteboard each one out loud for 5–10 minutes before checking it against a strong answer; the gap between your sketch and that answer is your study list.

Warm-up

When do you put a queue between two services — and when is it the wrong call?

Core

Exactly-once delivery is impossible to guarantee when brokers, networks, or consumers can fail. Design for at-least-once instead.

Core

A consumer keeps failing on one poisoned message. What happens to the queue, and how do you contain it?

Stretch

Task queue vs log (Kafka): when do you pick each, and why does ordering differ?