classic system design
Design a distributed cache
Protect a database from hot reads while staying honest about invalidation and stale data.
cache keysTTL designinvalidationconsistent hashing
Prompt
Design a cache layer for a read-heavy product catalog used by web, mobile, and partner API clients.
Clarifying questions
- Which reads can be stale and which reads must reflect writes immediately?
- Are keys small objects, large blobs, or precomputed query results?
- Do partners require stronger consistency than public clients?
Functional requirements
- Cache product records and common product-list queries.
- Invalidate or refresh entries after product updates.
- Expose cache behavior to operators through dashboards.
Nonfunctional requirements
- Keep p95 read latency below 50 ms for cached objects.
- Avoid a cache stampede after popular keys expire.
- Survive one cache node failure without full cold-start behavior.
Scale assumptions
- 500 million reads per day.
- 10 million active products.
- Top 10,000 keys produce a large fraction of traffic.
API sketch
- GET /v1/products/{id} reads through cache.
- POST /internal/cache/invalidate { key, version } for write-path invalidation.
Data model
- Cache key: product:{productId}:v{version}.
- Invalidation event: entity_type, entity_id, old_version, new_version, emitted_at.
Architecture components
- Application services use read-through caching for product records.
- Write service emits versioned invalidation events after database commit.
- Hot-key protection uses request coalescing and stale-while-revalidate.
Bottlenecks
- Popular list queries can cache too many combinations of filters.
- Invalidation fanout can lag behind writes during import jobs.
Failure modes
- Cache node failure: client library rehashes keys and falls back to database with rate limits.
- Invalidation loss: TTL bounds maximum stale duration.
- Cold start: warm top keys from access logs before promotion.
Observability
- Hit rate by route and object type, not only global hit rate.
- Origin database QPS, coalesced request count, stale response count.
Security / privacy
- Never cache per-user authorization decisions under shared keys.
- Encrypt sensitive values or keep them out of shared cache entirely.
Cost considerations
- Memory cost follows object size times replication factor times hot set size.
- Over-caching low-hit keys wastes memory and reduces eviction quality.
Tradeoffs
- Versioned keys make invalidation safer but increase orphaned entries until TTL expiry.
- Write-through cache improves freshness but couples writes to cache availability.
Rubric
| Criterion | Weight | Evidence |
|---|---|---|
Separates product behavior from infrastructure assumptions before drawing boxes. clarification | 10 | The answer names users, write paths, read paths, retention, and what is explicitly out of scope. |
Turns traffic and data assumptions into concrete sizing constraints. scale | 15 | Uses RPS, storage growth, hot-key risk, fanout, latency budget, or memory budget where relevant. |
Draws clear service, cache, queue, and storage boundaries with reasons for each split. architecture | 20 | The component diagram has one owner per responsibility and names the synchronous path. |
Defines durable state, indexes, keys, and idempotency records. data | 15 | Tables or collections include primary keys, lookup paths, TTLs, and consistency expectations. |
Names failure modes and the recovery behavior users see. failure | 15 | Covers partial outages, retries, duplicate work, stale reads, overload, and backfill. |
Defines the small set of metrics and traces needed to debug the design. observability | 10 | Includes SLIs, saturation metrics, queue lag, error classes, and an alert tied to user harm. |
Explains what is being sacrificed and why that sacrifice fits the prompt. tradeoffs | 15 | Compares at least two viable designs and names the losing design's advantage. |