Caching & CDNs

Where to cache, what to cache, how to invalidate, and the failure modes (stampede, penetration, avalanche) interviewers love to probe.

Tags: performance, redis, cdn, scalability

The layers

A request can be served from several caches before it ever touches your database. Push the cache as close to the user as you can afford to.

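| Layer               | Caches                              | TTL feel       | Invalidation                         |
|---------------------|-------------------------------------|----------------|--------------------------------------|
| Browser             | static assets, GET responses        | minutes–days   | Cache-Control, content-hash filenames |
| CDN / edge          | images, JS/CSS, cacheable API GETs  | minutes–days   | purge API, versioned URLs            |
| Application (Redis) | hot rows, computed views, sessions  | seconds–hours  | TTL + explicit delete on write       |
| Database            | buffer pool, query cache            | engine-managed |                                      |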
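
For the browser and CDN layers, most of the policy is expressed through the Cache-Control response header. The sketch below is one possible mapping, not a recommendation for every site: the function name `cache_control_for`, the `/api/` prefix, the hashed-filename pattern, and the specific lifetimes are all assumptions chosen for illustration.

```python
import re

# Assumption: content-hashed assets look like "app.3f9a1c2b.js"
# (8 or more hex characters just before the extension).
HASHED_ASSET = re.compile(r"\.[0-9a-f]{8,}\.(js|css|png|jpg|svg|woff2)$")


def cache_control_for(path: str) -> str:
    """Pick a Cache-Control value for a response path (illustrative policy)."""
    if HASHED_ASSET.search(path):
        # The URL changes whenever the content changes, so cache for a year.
        return "public, max-age=31536000, immutable"
    if path.startswith("/api/"):
        # Hypothetical cacheable, non-personalized API GET:
        # browsers keep it 60 s, shared caches (the CDN) keep it 300 s.
        return "public, max-age=60, s-maxage=300"
    # HTML and anything else: store it, but revalidate before each reuse.
    return "no-cache"


print(cache_control_for("/static/app.3f9a1c2b.js"))
print(cache_control_for("/api/products/42"))
print(cache_control_for("/index.html"))
```

The split between `max-age` and `s-maxage` lets the edge hold a response longer than browsers do, which matters because the edge can be purged and a browser cannot.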

Read and write strategies

  • Cache-aside (lazy): app checks cache → on miss, reads DB and populates. Simple, resilient, the default. Risk: first request after expiry is slow.
  • Read-through: the cache library loads from DB on miss. Cleaner app code, but it couples you to the cache layer.
  • Write-through / write-behind (write side): write to cache and DB together (through), or write to the cache and flush to the DB asynchronously (behind). Behind is fast but can lose writes if the cache crashes before the flush.
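
A minimal sketch of cache-aside with explicit delete on write, assuming an in-process `TTLCache` as a stand-in for Redis and a plain dict as a stand-in for the database. Every name here (`TTLCache`, `get_user`, `update_user`, `stats`) is hypothetical.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for Redis: entries expire after `ttl` seconds."""

    def __init__(self, clock=time.monotonic):
        self._data = {}          # key -> (expires_at, value)
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily drop the expired entry
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (self._clock() + ttl, value)

    def delete(self, key):
        self._data.pop(key, None)


db = {"user:1": {"name": "Ada"}}   # hypothetical database
stats = {"db_reads": 0}            # counts trips to the slow store
cache = TTLCache()


def get_user(key, ttl=60.0):
    """Cache-aside read: check the cache, fall back to the DB, then populate."""
    value = cache.get(key)
    if value is not None:
        return value               # hit
    stats["db_reads"] += 1         # miss: go to the database
    value = db.get(key)
    if value is not None:
        cache.set(key, value, ttl)
    return value


def update_user(key, new_value):
    """Write path: update the DB first, then delete (not update) the cached copy."""
    db[key] = new_value
    cache.delete(key)


get_user("user:1")                 # miss, reads the DB
get_user("user:1")                 # hit, served from memory
update_user("user:1", {"name": "Grace"})
print(get_user("user:1"), stats)   # miss again after the delete
```

Deleting on write rather than overwriting the cached value keeps the write path simple: the next read repopulates from the database, which stays the source of truth.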

The hit/miss mechanic plays out like this. The cache is cold at first: every request is a miss and a slow trip to the store. Then it warms up: repeated keys are served from memory, and once the cache is full the least-recently-used key is evicted to make room. That climbing hit rate is what every caching decision is really after.

[Interactive demo: Cache hit, miss and LRU eviction. Time O(1) per request, space O(capacity). Step 1 of 8: an empty LRU cache with capacity 3, hit rate 0%. A hit is served from memory; a miss must fetch from the slow store, then cache it, evicting the least-recently-used key once the cache is full.]

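
The same hit, miss and eviction behaviour can be sketched in a few lines. This is an illustrative LRU cache built on Python's `collections.OrderedDict`, with capacity 3 as in the description above; the class name `LRUCache`, the `load` callback, and the request sequence are assumptions made for the example.

```python
from collections import OrderedDict


class LRUCache:
    """LRU cache with O(1) lookups and evictions, plus hit/miss counters."""

    def __init__(self, capacity, load):
        self.capacity = capacity
        self.load = load              # called on a miss to fetch from the slow store
        self.items = OrderedDict()    # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.items:
            self.hits += 1
            self.items.move_to_end(key)       # mark as most recently used
            return self.items[key]
        self.misses += 1
        value = self.load(key)                # slow trip to the store
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)    # evict the least recently used key
        return value

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


# Hypothetical slow store: here it just upper-cases the key.
cache = LRUCache(capacity=3, load=lambda key: key.upper())
for key in ["a", "b", "c", "a", "d", "b"]:
    cache.get(key)

print(list(cache.items))   # keys from LRU to MRU
print(cache.hits, cache.misses)
```

In this run only the second request for "a" is a hit. Fetching "d" evicts "b", so the final request for "b" misses again and evicts "c".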
The three failure modes to name

  • Stampede / dogpile: a hot key expires and 10k requests hit the DB at once. Fix: a per-key lock, or "early recompute" with a probabilistic TTL.
  • Penetration: queries for keys that don't exist bypass the cache every time. Fix: cache the negative result (short TTL), or put a Bloom filter in front of the lookup.
  • Avalanche: many keys share an expiry instant. Fix: jitter the TTLs.
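
One way the three fixes can fit together in a single read path is sketched below. It assumes a single process, so a `threading.Lock` per key stands in for whatever distributed lock a multi-server deployment would need, and a dict stands in for Redis. The names `get`, `jittered_ttl`, `_MISSING`, and the TTL values are hypothetical; the Bloom filter and probabilistic early recompute variants are not shown.

```python
import random
import threading
import time
from collections import defaultdict

_MISSING = object()                       # sentinel: "the DB has no such key"
_cache = {}                               # key -> (expires_at, value); stand-in for Redis
_key_locks = defaultdict(threading.Lock)  # one lock per key
_locks_guard = threading.Lock()           # protects the lock table itself


def jittered_ttl(base, spread=0.1):
    """Avalanche fix: spread expiries over roughly base * (1 +/- spread)."""
    return base * random.uniform(1 - spread, 1 + spread)


def _cache_get(key):
    entry = _cache.get(key)
    if entry is None or time.monotonic() >= entry[0]:
        return None                       # absent or expired
    return entry[1]


def _lock_for(key):
    with _locks_guard:
        return _key_locks[key]


def get(key, loader, ttl=300.0, negative_ttl=30.0):
    cached = _cache_get(key)
    if cached is None:
        with _lock_for(key):              # stampede fix: one loader call per key
            cached = _cache_get(key)      # re-check: another thread may have filled it
            if cached is None:
                value = loader(key)
                if value is None:
                    # Penetration fix: remember the miss, but only briefly.
                    _cache[key] = (time.monotonic() + negative_ttl, _MISSING)
                    cached = _MISSING
                else:
                    _cache[key] = (time.monotonic() + jittered_ttl(ttl), value)
                    cached = value
    return None if cached is _MISSING else cached
```

The re-check inside the lock is what stops the stampede: threads that queued behind the first loader find the value already cached and never touch the database.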

Invalidation

"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton

Practical rules: prefer TTL + explicit delete on write over trying to keep caches perfectly coherent. Use content-hashed URLs for static assets so you never invalidate — you just stop referencing the old file. For derived data, store a version stamp and bump it on write.
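
Both tricks put the version into the name, so stale entries are simply never asked for again. A rough sketch, assuming SHA-256 truncated to 8 hex characters for asset names and an integer version stamp per entity; `hashed_name`, `bump_version`, `derived_key`, and the key layout are hypothetical.

```python
import hashlib


def hashed_name(filename: str, content: bytes) -> str:
    """Build a content-hashed filename such as 'app.<digest>.js'."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    stem, dot, ext = filename.rpartition(".")
    return f"{stem}.{digest}.{ext}" if dot else f"{filename}.{digest}"


# Assumption: the version stamp lives somewhere cheap to read,
# for example next to the row or in the cache itself.
versions = {}


def bump_version(entity: str) -> None:
    """Call on every write to the entity."""
    versions[entity] = versions.get(entity, 0) + 1


def derived_key(entity: str, view: str) -> str:
    """Cache key for derived data; it changes whenever the entity's version changes."""
    return f"{view}:{entity}:v{versions.get(entity, 0)}"


print(hashed_name("app.js", b"console.log('v1')"))
print(derived_key("user:1", "profile"))
bump_version("user:1")
print(derived_key("user:1", "profile"))
```

After the bump, readers compute a new key and miss once; entries under the old key are no longer referenced and can be left to expire by TTL.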

Design drills

You can recite cache-aside — now defend the decisions under pressure.

Whiteboard each one out loud for 5–10 minutes before you reveal what a strong answer covers; the gap between your sketch and the checklist is your study list.

Warm-up

A product page is read 100k times/min and edited a few times a day. Design its caching — layers, TTL, and invalidation.

Core

A hot key expires and 10,000 requests hit the DB in the same second. Name the failure and stop it.

Core

Attackers query millions of IDs that don't exist; each one sails past the cache to the DB. Diagnose and fix.

Stretch

Choose write-through vs write-behind (write-back) vs cache-aside for (a) a user's session and (b) a financial ledger. Justify each.

Stretch

Reads are global but your cache lives in one region. Serve a worldwide audience without stale chaos.