classic system design

Design a distributed cache

Protect a database from hot reads while staying honest about invalidation and stale data.

cache keysTTL designinvalidationconsistent hashing

Prompt

Design a cache layer for a read-heavy product catalog used by web, mobile, and partner API clients.

Clarifying questions

  • Which reads can be stale and which reads must reflect writes immediately?
  • Are keys small objects, large blobs, or precomputed query results?
  • Do partners require stronger consistency than public clients?

Functional requirements

  • Cache product records and common product-list queries.
  • Invalidate or refresh entries after product updates.
  • Expose cache behavior to operators through dashboards.

Nonfunctional requirements

  • Keep p95 read latency below 50 ms for cached objects.
  • Avoid a cache stampede after popular keys expire.
  • Survive one cache node failure without full cold-start behavior.

Scale assumptions

  • 500 million reads per day.
  • 10 million active products.
  • Top 10,000 keys produce a large fraction of traffic.

API sketch

  • GET /v1/products/{id} reads through cache.
  • POST /internal/cache/invalidate { key, version } for write-path invalidation.

Data model

  • Cache key: product:{productId}:v{version}.
  • Invalidation event: entity_type, entity_id, old_version, new_version, emitted_at.

Architecture components

  • Application services use read-through caching for product records.
  • Write service emits versioned invalidation events after database commit.
  • Hot-key protection uses request coalescing and stale-while-revalidate.

Bottlenecks

  • Popular list queries can cache too many combinations of filters.
  • Invalidation fanout can lag behind writes during import jobs.

Failure modes

  • Cache node failure: client library rehashes keys and falls back to database with rate limits.
  • Invalidation loss: TTL bounds maximum stale duration.
  • Cold start: warm top keys from access logs before promotion.

Observability

  • Hit rate by route and object type, not only global hit rate.
  • Origin database QPS, coalesced request count, stale response count.

Security / privacy

  • Never cache per-user authorization decisions under shared keys.
  • Encrypt sensitive values or keep them out of shared cache entirely.

Cost considerations

  • Memory cost follows object size times replication factor times hot set size.
  • Over-caching low-hit keys wastes memory and reduces eviction quality.

Tradeoffs

  • Versioned keys make invalidation safer but increase orphaned entries until TTL expiry.
  • Write-through cache improves freshness but couples writes to cache availability.

Rubric

CriterionWeightEvidence
Separates product behavior from infrastructure assumptions before drawing boxes.
clarification
10The answer names users, write paths, read paths, retention, and what is explicitly out of scope.
Turns traffic and data assumptions into concrete sizing constraints.
scale
15Uses RPS, storage growth, hot-key risk, fanout, latency budget, or memory budget where relevant.
Draws clear service, cache, queue, and storage boundaries with reasons for each split.
architecture
20The component diagram has one owner per responsibility and names the synchronous path.
Defines durable state, indexes, keys, and idempotency records.
data
15Tables or collections include primary keys, lookup paths, TTLs, and consistency expectations.
Names failure modes and the recovery behavior users see.
failure
15Covers partial outages, retries, duplicate work, stale reads, overload, and backfill.
Defines the small set of metrics and traces needed to debug the design.
observability
10Includes SLIs, saturation metrics, queue lag, error classes, and an alert tied to user harm.
Explains what is being sacrificed and why that sacrifice fits the prompt.
tradeoffs
15Compares at least two viable designs and names the losing design's advantage.