classic system design

Design a message queue

Move work between producers and consumers with leases, ordering boundaries, and backpressure.

delivery semantics, partitioning, backpressure, dead-letter queues

Prompt

Design a managed message queue used by internal services to process background work reliably.

Clarifying questions

  • Is ordering required globally, per key, or not at all?
  • Are messages small payloads or references to object storage?
  • How long should unconsumed messages be retained?

Functional requirements

  • Publish messages to named queues.
  • Let consumers lease, acknowledge, and retry messages.
  • Expose dead-letter queues and replay controls.

Nonfunctional requirements

  • Provide at-least-once delivery; consumers must tolerate duplicates, and leases and dedupe keys keep them rare.
  • Keep publish latency predictable during consumer outages.
  • Prevent one queue from starving other queues.
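
One possible way to keep a single queue from starving the others is a per-queue admission limit. The sketch below uses a token bucket; the class name, the `admit` helper, and the rates are all hypothetical and chosen only for illustration.

```python
from typing import Dict


class TokenBucket:
    """Admission limiter for one queue (hypothetical, not from the prompt)."""

    def __init__(self, rate_per_s: float, capacity: float, now: float = 0.0) -> None:
        self.rate_per_s = rate_per_s   # tokens refilled per second
        self.capacity = capacity       # largest burst the queue may send
        self.tokens = capacity         # start full so an idle queue can spike
        self.last_refill = now

    def try_acquire(self, now: float, cost: float = 1.0) -> bool:
        # Refill for the elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_s)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # the caller would reject or delay this publish


def admit(buckets: Dict[str, TokenBucket], queue: str, now: float) -> bool:
    """Admit one publish for `queue`; unknown queues are rejected."""
    bucket = buckets.get(queue)
    return bucket is not None and bucket.try_acquire(now)
```

Because each queue draws from its own bucket, a burst on one queue exhausts only that queue's tokens. A real service would likely combine this with a shared capacity ceiling.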

Scale assumptions

  • One million messages per minute at peak.
  • Most messages are under 16 KB.
  • Some queues are idle for days and then spike.
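
These assumptions can be turned into rough sizing numbers. The sketch below treats every message as if it were at the 16 KB ceiling, so the byte figures are an upper bound: about 16,700 messages per second and on the order of 270 MB/s of payload at peak. The constant and function names are illustrative.

```python
PEAK_MSGS_PER_MIN = 1_000_000
BODY_CEILING_BYTES = 16 * 1024  # "most messages are under 16 KB"

# Peak message rate and an upper bound on payload ingress.
peak_msgs_per_s = PEAK_MSGS_PER_MIN / 60
peak_ingress_bytes_per_s = peak_msgs_per_s * BODY_CEILING_BYTES


def backlog_bytes(outage_s: float) -> float:
    """Upper bound on bytes retained if consumers stop for `outage_s` seconds at peak."""
    return peak_ingress_bytes_per_s * outage_s


if __name__ == "__main__":
    print(f"peak rate: {peak_msgs_per_s:,.0f} msg/s")
    print(f"peak ingress: {peak_ingress_bytes_per_s / 1e6:,.0f} MB/s (upper bound)")
    print(f"1 h consumer outage: {backlog_bytes(3600) / 1e9:,.0f} GB retained (upper bound)")
```

If bodies live in object storage and the queue stores only `bodyRef`, the log itself carries far fewer bytes, but the payload store still has to absorb this volume.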

API sketch

  • POST /v1/queues/{name}/messages { bodyRef, orderingKey?, dedupeKey? }
  • POST /v1/queues/{name}/lease { maxMessages, leaseMs } -> messages[]
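
As a rough illustration of the two calls, the helpers below build the path and JSON body a client might send. Only the paths and field names come from the sketch above; the function names, the validation rules, and the example values are assumptions.

```python
import json
from typing import Optional, Tuple


def publish_request(
    queue: str,
    body_ref: str,
    ordering_key: Optional[str] = None,
    dedupe_key: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (path, JSON body) for a publish; optional fields are omitted when unset."""
    payload = {"bodyRef": body_ref}
    if ordering_key is not None:
        payload["orderingKey"] = ordering_key
    if dedupe_key is not None:
        payload["dedupeKey"] = dedupe_key
    return f"/v1/queues/{queue}/messages", json.dumps(payload)


def lease_request(queue: str, max_messages: int, lease_ms: int) -> Tuple[str, str]:
    """Return (path, JSON body) for a lease; rejects non-positive limits (assumed rule)."""
    if max_messages < 1 or lease_ms < 1:
        raise ValueError("maxMessages and leaseMs must be positive")
    body = json.dumps({"maxMessages": max_messages, "leaseMs": lease_ms})
    return f"/v1/queues/{queue}/lease", body
```

A producer retrying after a timeout would resend the same `dedupeKey`, which is what lets the service collapse the retry into one logical message.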

Data model

  • messages(queue, partition, offset, body_ref, status, available_at, lease_until).
  • consumer_offsets(queue, consumer_group, partition, committed_offset).
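
The two records could be modeled as below. The field names mirror the data model above; the `Status` values and the visibility rule are assumptions, since the source does not enumerate statuses.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Status(Enum):  # hypothetical status values
    AVAILABLE = "available"
    LEASED = "leased"
    ACKED = "acked"
    DEAD = "dead"


@dataclass
class Message:
    queue: str
    partition: int
    offset: int
    body_ref: str
    status: Status = Status.AVAILABLE
    available_at: float = 0.0            # earliest time the message may be leased
    lease_until: Optional[float] = None  # set while a consumer holds the lease

    def key(self) -> Tuple[str, int, int]:
        # (queue, partition, offset) identifies a message within the log.
        return (self.queue, self.partition, self.offset)

    def is_visible(self, now: float) -> bool:
        """True if a consumer may lease this message at time `now`."""
        if self.status is Status.AVAILABLE:
            return now >= self.available_at
        if self.status is Status.LEASED:
            # An expired lease makes the message visible again.
            return self.lease_until is not None and now >= self.lease_until
        return False


@dataclass
class ConsumerOffset:
    queue: str
    consumer_group: str
    partition: int
    committed_offset: int
```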

Architecture components

  • Producers write to partitioned append logs.
  • Consumers lease available messages and acknowledge completion.
  • A dead-letter policy moves repeatedly failing messages aside.
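
A minimal sketch of how a producer path might pick a partition, assuming per-key ordering is achieved by hashing `orderingKey` to a fixed partition and that unkeyed messages are spread round-robin. The function name and the choice of SHA-256 are illustrative, not part of the design above.

```python
import hashlib
import itertools
from typing import Optional

_round_robin = itertools.count()  # shared counter for unkeyed messages


def choose_partition(num_partitions: int, ordering_key: Optional[str] = None) -> int:
    """Map a message to a partition in [0, num_partitions)."""
    if ordering_key is None:
        # No ordering requirement: spread load evenly.
        return next(_round_robin) % num_partitions
    # A stable hash keeps every message for one key on one partition,
    # which preserves per-key order but caps that key's parallelism.
    digest = hashlib.sha256(ordering_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions
```

A stable digest is used instead of Python's built-in `hash`, which is randomized per process for strings. The same mapping also shows why a hot ordering key becomes a bottleneck: all of its traffic lands on one partition.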

Bottlenecks

  • Hot ordering keys limit partition parallelism.
  • Slow consumers cause retention growth and replay lag.

Failure modes

  • Consumer crash: lease expires and message becomes visible again.
  • Producer retry: dedupeKey prevents duplicate logical messages for a short window.
  • Poison message: move to the dead-letter queue after max attempts, recording the failure reason.
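
The three behaviors above can be sketched with a small in-memory queue. This is an illustration under assumptions, not a storage design: the class and method names, the explicit `nack` call, and the default limits are all hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Entry:
    msg_id: int
    body_ref: str
    attempts: int = 0
    lease_until: Optional[float] = None
    last_error: Optional[str] = None


class LeaseQueue:
    def __init__(self, max_attempts: int = 3, dedupe_window_s: float = 60.0) -> None:
        self.max_attempts = max_attempts
        self.dedupe_window_s = dedupe_window_s
        self.pending: Dict[int, Entry] = {}
        self.dead_letters: List[Entry] = []
        self._dedupe: Dict[str, Tuple[int, float]] = {}  # key -> (msg_id, first_seen)
        self._next_id = 0

    def publish(self, body_ref: str, now: float, dedupe_key: Optional[str] = None) -> int:
        if dedupe_key is not None:
            seen = self._dedupe.get(dedupe_key)
            if seen is not None and now - seen[1] < self.dedupe_window_s:
                return seen[0]  # producer retry inside the window: same message
        msg_id = self._next_id
        self._next_id += 1
        self.pending[msg_id] = Entry(msg_id, body_ref)
        if dedupe_key is not None:
            self._dedupe[dedupe_key] = (msg_id, now)
        return msg_id

    def lease(self, max_messages: int, lease_s: float, now: float) -> List[Entry]:
        leased: List[Entry] = []
        for entry in list(self.pending.values()):
            if len(leased) >= max_messages:
                break
            if entry.lease_until is not None and entry.lease_until > now:
                continue  # another consumer still holds the lease
            if entry.attempts >= self.max_attempts:
                # Poison message: set it aside with the last known failure reason.
                entry.last_error = entry.last_error or "lease expired"
                self.dead_letters.append(self.pending.pop(entry.msg_id))
                continue
            entry.attempts += 1
            entry.lease_until = now + lease_s
            leased.append(entry)
        return leased

    def ack(self, msg_id: int) -> None:
        self.pending.pop(msg_id, None)

    def nack(self, msg_id: int, reason: str) -> None:
        entry = self.pending.get(msg_id)
        if entry is not None:
            entry.lease_until = None  # visible again immediately
            entry.last_error = reason
```

A crashed consumer never calls `ack` or `nack`, so its message simply becomes leasable again once `lease_until` passes. That redelivery is why delivery is at-least-once and handlers should be idempotent.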

Observability

  • Queue depth, oldest visible age, consumer lag, retry count, dead-letter rate.
  • Publish and lease latency by queue tier.
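
Two of these signals can be derived directly from stored state. The sketch below assumes `committed_offset` means the next offset a consumer group will read, and that a message is visible once its `available_at` time has passed; both conventions and the function names are assumptions.

```python
from typing import Dict, Iterable


def consumer_lag(log_end: Dict[int, int], committed: Dict[int, int]) -> Dict[int, int]:
    """Per-partition lag: log end offset minus committed offset, floored at zero."""
    return {p: max(0, end - committed.get(p, 0)) for p, end in log_end.items()}


def oldest_visible_age(available_at: Iterable[float], now: float) -> float:
    """Age in seconds of the oldest message that is currently visible, or 0.0 if none."""
    visible = [t for t in available_at if t <= now]
    return now - min(visible) if visible else 0.0
```

Oldest visible age is often a better alert signal than raw depth, because a deep queue that drains quickly is healthy while a shallow queue with one very old message is not.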

Security / privacy

  • Authorize producers and consumers per queue.
  • Avoid raw PII in message bodies; prefer encrypted payload references.

Cost considerations

  • Retention cost grows with consumer lag and body size.
  • High fanout may need topic semantics instead of many duplicate queue writes.

Tradeoffs

  • Strict ordering simplifies consumers but limits throughput.
  • Push delivery reduces polling waste but complicates backpressure.

Rubric

| Criterion | Weight | Evidence |
| --- | --- | --- |
| clarification: Separates product behavior from infrastructure assumptions before drawing boxes. | 10 | The answer names users, write paths, read paths, retention, and what is explicitly out of scope. |
| scale: Turns traffic and data assumptions into concrete sizing constraints. | 15 | Uses RPS, storage growth, hot-key risk, fanout, latency budget, or memory budget where relevant. |
| architecture: Draws clear service, cache, queue, and storage boundaries with reasons for each split. | 20 | The component diagram has one owner per responsibility and names the synchronous path. |
| data: Defines durable state, indexes, keys, and idempotency records. | 15 | Tables or collections include primary keys, lookup paths, TTLs, and consistency expectations. |
| failure: Names failure modes and the recovery behavior users see. | 15 | Covers partial outages, retries, duplicate work, stale reads, overload, and backfill. |
| observability: Defines the small set of metrics and traces needed to debug the design. | 10 | Includes SLIs, saturation metrics, queue lag, error classes, and an alert tied to user harm. |
| tradeoffs: Explains what is being sacrificed and why that sacrifice fits the prompt. | 15 | Compares at least two viable designs and names the losing design's advantage. |