classic system design

Design a durable job scheduler

Run delayed and recurring work with retries, backoff, leases, and a clean story for duplicate execution.

leases, retry queues, cron semantics, idempotency

Prompt

Design a scheduler that can run delayed jobs, cron jobs, and retryable background work for a multi-tenant SaaS product.

jobs(id, tenant_id, run_at, cron_expr, status, attempts, payload_ref, idempotency_key).
job_attempts(job_id, attempt, lease_until, worker_id, started_at, finished_at).
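One way to read the two tables above is as a database-backed queue in which a worker claims a job by writing a lease row. The sketch below is a minimal illustration using SQLite from the Python standard library. The column types, the status values ('pending', 'running'), the `jobs_due` index, the uniqueness constraint on `(tenant_id, idempotency_key)`, and the `claim_next` helper are all assumptions made for the example and are not part of the prompt.

```python
import sqlite3

# Assumed column types and constraints; the prompt only names the columns.
SCHEMA = """
CREATE TABLE jobs (
    id              INTEGER PRIMARY KEY,
    tenant_id       TEXT NOT NULL,
    run_at          INTEGER NOT NULL,   -- epoch seconds of the next due time
    cron_expr       TEXT,               -- NULL for one-shot jobs
    status          TEXT NOT NULL DEFAULT 'pending',
    attempts        INTEGER NOT NULL DEFAULT 0,
    payload_ref     TEXT NOT NULL,
    idempotency_key TEXT NOT NULL,
    UNIQUE (tenant_id, idempotency_key)
);
CREATE TABLE job_attempts (
    job_id      INTEGER NOT NULL REFERENCES jobs(id),
    attempt     INTEGER NOT NULL,
    lease_until INTEGER NOT NULL,
    worker_id   TEXT NOT NULL,
    started_at  INTEGER NOT NULL,
    finished_at INTEGER,
    PRIMARY KEY (job_id, attempt)
);
CREATE INDEX jobs_due ON jobs (status, run_at);
"""


def claim_next(conn, worker_id, now, lease_seconds=30):
    """Claim one due job (or one whose lease expired).

    Returns (job_id, attempt) or None when nothing is claimable.
    """
    with conn:  # single transaction: commit on success, roll back on error
        row = conn.execute(
            """
            SELECT j.id, j.attempts FROM jobs j
            WHERE j.run_at <= :now
              AND (j.status = 'pending'
                   OR (j.status = 'running' AND NOT EXISTS (
                         SELECT 1 FROM job_attempts a
                         WHERE a.job_id = j.id
                           AND a.attempt = j.attempts
                           AND a.lease_until > :now)))
            ORDER BY j.run_at
            LIMIT 1
            """,
            {"now": now},
        ).fetchone()
        if row is None:
            return None
        job_id, attempts = row
        attempt = attempts + 1
        # Guard on the old attempt count so a racing claimer loses cleanly.
        cur = conn.execute(
            "UPDATE jobs SET status = 'running', attempts = ? "
            "WHERE id = ? AND attempts = ?",
            (attempt, job_id, attempts),
        )
        if cur.rowcount != 1:
            return None
        conn.execute(
            "INSERT INTO job_attempts "
            "(job_id, attempt, lease_until, worker_id, started_at) "
            "VALUES (?, ?, ?, ?, ?)",
            (job_id, attempt, now + lease_seconds, worker_id, now),
        )
        return job_id, attempt
```

A job whose worker dies is picked up again once `lease_until` has passed, which is where duplicate execution comes from. A production database would more likely use row locking (for example `SELECT ... FOR UPDATE SKIP LOCKED` in PostgreSQL) than this optimistic guard.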

Store payloads by reference with scoped access, not as raw secrets in queue messages.
Enforce tenant isolation on job listing and cancellation.
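Tenant isolation on listing and cancellation can be sketched by putting the tenant predicate in every statement instead of checking ownership after the fetch. The reduced `jobs` table, the `make_db` helper, the 'cancelled' status, and the function names below are hypothetical and exist only to make the example runnable.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory table with one job per tenant (example data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE jobs (
            id        INTEGER PRIMARY KEY,
            tenant_id TEXT NOT NULL,
            status    TEXT NOT NULL DEFAULT 'pending'
        );
        INSERT INTO jobs (tenant_id) VALUES ('t1'), ('t2');
        """
    )
    return conn


def list_jobs(conn, tenant_id):
    """Return only the caller's jobs; the tenant filter lives in the query."""
    return conn.execute(
        "SELECT id, status FROM jobs WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    ).fetchall()


def cancel_job(conn, tenant_id, job_id):
    """Cancel a pending job only if it belongs to tenant_id.

    Returns False both for a missing job and for another tenant's job,
    so the response does not reveal which job ids exist.
    """
    with conn:
        cur = conn.execute(
            "UPDATE jobs SET status = 'cancelled' "
            "WHERE id = ? AND tenant_id = ? AND status = 'pending'",
            (job_id, tenant_id),
        )
    return cur.rowcount == 1
```

In this sketch `tenant_id` is assumed to come from the authenticated session, never from the request body.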

Database-backed scheduling is simpler to reason about; a broker-first design can absorb spikes better.
At-least-once execution is realistic; exactly-once side effects belong in the job handler.
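The second point can be illustrated with a handler that records the job's idempotency key in the same transaction as its side effect, so a redelivered job becomes a no-op. This is a minimal sketch under assumptions: the `processed` and `ledger` tables and the `handle_charge` function are hypothetical, and it only works when the side effect lives in the same database as the key. Calls to external systems would need to pass the key along to a downstream API that deduplicates on it.

```python
import sqlite3


def make_db():
    """Create the hypothetical dedup table and a table the handler writes to."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE processed (idempotency_key TEXT PRIMARY KEY);
        CREATE TABLE ledger (tenant_id TEXT NOT NULL, amount INTEGER NOT NULL);
        """
    )
    return conn


def handle_charge(conn, idempotency_key, tenant_id, amount):
    """Apply the side effect at most once per key; return True if applied."""
    try:
        with conn:  # key insert and side effect commit or roll back together
            conn.execute(
                "INSERT INTO processed (idempotency_key) VALUES (?)",
                (idempotency_key,),
            )
            conn.execute(
                "INSERT INTO ledger (tenant_id, amount) VALUES (?, ?)",
                (tenant_id, amount),
            )
        return True
    except sqlite3.IntegrityError:
        # The key already exists: an earlier attempt finished this work.
        return False
```

The scheduler can then deliver the same job twice after a lease expiry without charging twice, because the second delivery fails the primary-key insert and is rolled back.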

Criterion	Category	Weight	Evidence
Separates product behavior from infrastructure assumptions before drawing boxes.	clarification	10	The answer names users, write paths, read paths, retention, and what is explicitly out of scope.
Turns traffic and data assumptions into concrete sizing constraints.	scale	15	Uses RPS, storage growth, hot-key risk, fanout, latency budget, or memory budget where relevant.
Draws clear service, cache, queue, and storage boundaries with reasons for each split.	architecture	20	The component diagram has one owner per responsibility and names the synchronous path.
Defines durable state, indexes, keys, and idempotency records.	data	15	Tables or collections include primary keys, lookup paths, TTLs, and consistency expectations.
Names failure modes and the recovery behavior users see.	failure	15	Covers partial outages, retries, duplicate work, stale reads, overload, and backfill.
Defines the small set of metrics and traces needed to debug the design.	observability	10	Includes SLIs, saturation metrics, queue lag, error classes, and an alert tied to user harm.
Explains what is being sacrificed and why that sacrifice fits the prompt.	tradeoffs	15	Compares at least two viable designs and names the losing design's advantage.