Design Twitter/X

The canonical fan-out interview: hybrid timelines under a power law, retweets, trending topics as a streaming problem, and search over everything ever said.

system-design · fan-out · trending · case-study

Prompt

Design a microblogging platform: short posts, follows, home timeline, retweets/likes, trending topics, search. Hundreds of millions of users; some have 100M followers.

1. Requirements

Functional: post (small text + media refs); follow; home timeline; retweet & like; trending topics; search.

Non-functional: timeline under ~200 ms; posting feels instant; extremely read-heavy; the follower distribution is a power law, and unlike Instagram, celebrity content is the product: a head-of-state's post must reach 100M timelines fast.

Why this design exists separately from Instagram: tweets are tiny and text-first (storage is easy, distribution is everything), reshares (retweets) amplify fan-out recursively, and two new subsystems — trending and search — are first-class. The feed machinery transfers; this page builds what doesn't.

2. Estimation (shapes the design)

~500 M tweets/day ≈ 6 K writes/sec (tweets are ~300 bytes — a day's text fits on a laptop; storage is not the problem). Timeline reads: ~100 K/sec. Average followers ~700 → naive full fan-out ≈ 4 M timeline-writes/sec average, with single posts spiking to 100 M. The power law isn't an edge case here — the head of the distribution carries the product, so the hybrid strategy isn't an optimization, it's the core design.
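The arithmetic behind those figures can be checked in a few lines. This is only a sketch using the round inputs quoted above (500 M tweets/day, ~300 bytes per tweet, ~700 average followers); the variable names are illustrative.

```python
# Back-of-envelope numbers from the estimate above. All inputs are the
# section's round figures, not measurements.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

tweets_per_day = 500_000_000
avg_tweet_bytes = 300
avg_followers = 700

writes_per_sec = tweets_per_day / SECONDS_PER_DAY
text_bytes_per_day = tweets_per_day * avg_tweet_bytes
naive_fanout_per_sec = writes_per_sec * avg_followers

print(f"tweet writes/sec         ~ {writes_per_sec:,.0f}")
print(f"tweet text per day       ~ {text_bytes_per_day / 1e9:,.0f} GB")
print(f"naive fan-out writes/sec ~ {naive_fanout_per_sec / 1e6:,.1f} M")
```

The write rate and the daily text volume are small; the fan-out multiplier is the only number that hurts, which is why the rest of the design is about distribution rather than storage.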

3. High-level design

The structural idea to announce: every tweet enters one durable event log, and four independent consumers build four derived views — timelines, trends, search, analytics. Adding a fifth view never touches the write path (event-driven decoupling, drawn large).
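A minimal in-memory sketch of "one log, N views" follows. The `EventLog` class, its method names, and the toy handlers are hypothetical; a real deployment would presumably use a durable, partitioned log rather than a Python list. The point it illustrates is that each consumer keeps its own offset, so views lag and fail independently, and a view added later replays the log without any change to `append`.

```python
from collections import defaultdict


class EventLog:
    """Append-only log; each consumer tracks its own read offset."""

    def __init__(self):
        self._events = []
        self._handlers = {}  # consumer name -> callback
        self._offsets = {}   # consumer name -> next index to read

    def append(self, event):
        # The write path: append and return. It knows nothing about consumers.
        self._events.append(event)
        return len(self._events) - 1

    def subscribe(self, name, handler):
        self._handlers[name] = handler
        self._offsets[name] = 0  # a new view replays from the start

    def poll(self, name):
        """Deliver unread events to one consumer; return how many were delivered."""
        start = self._offsets[name]
        for event in self._events[start:]:
            self._handlers[name](event)
        self._offsets[name] = len(self._events)
        return len(self._events) - start


log = EventLog()
term_counts = defaultdict(int)    # stand-in for the trends view
search_index = defaultdict(list)  # stand-in for the search view


def count_terms(event):
    for term in event["text"].lower().split():
        term_counts[term] += 1


def index_tweet(event):
    for term in set(event["text"].lower().split()):
        search_index[term].append(event["id"])


log.subscribe("trends", count_terms)
log.subscribe("search", index_tweet)

log.append({"id": 1, "author": "alice", "text": "hello world"})
log.append({"id": 2, "author": "bob", "text": "hello again"})

log.poll("trends")  # the search consumer has not polled yet: it lags on its own

# A view added later replays the log; append() was never touched.
authors = []
log.subscribe("analytics", lambda e: authors.append(e["author"]))
log.poll("analytics")
```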

4. Deep dives

Timelines under a power law (the canonical answer, compressed)

Same hybrid as Instagram (push for normal users, pull-and-merge for celebrities, skip dormant accounts), with the Twitter-specific twists:

  • The celebrity threshold matters more (the head is the traffic); celebrity tweets are served from a hot celebrity-recent cache that millions of timeline reads merge in: one cache entry doing 100M timelines' work.
  • Retweets amplify recursively: a retweet is a new timeline entry referencing the original (never a copy — the original's counters must stay singular), and a mid-tier user retweeting a celebrity triggers fan-out to their followers. Fan-out workers therefore consume from the event log with per-tweet dedup: a user following five retweeters sees the tweet once (timeline insertion checks the referenced id against recent entries).
  • Posting writes the tweet + appends to the log, then returns — fan-out is fully async; "instant post" means "durably logged," not "delivered to 100 M lists."
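The bullets above can be sketched as a toy in-memory model. Everything named here (`CELEBRITY_THRESHOLD`, `fan_out`, `home_timeline`, the dictionaries) is an assumption for illustration, and the threshold is set absurdly low so the example stays small. Entries carry the id of the tweet they reference, so a retweet and its original deduplicate to one timeline item.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # hypothetical; a real threshold would be far higher

followers = defaultdict(set)          # author -> users who follow them
following = defaultdict(set)          # user -> authors they follow
pushed = defaultdict(list)            # user -> [(ts, entry_id, referenced_id)]
celebrity_recent = defaultdict(list)  # celebrity -> recent entries (pull path)


def follow(user, author):
    followers[author].add(user)
    following[user].add(author)


def is_celebrity(author):
    return len(followers[author]) >= CELEBRITY_THRESHOLD


def fan_out(author, ts, entry_id, original_id=None):
    """Async worker step. original_id is set when the entry is a retweet."""
    ref = original_id if original_id is not None else entry_id
    entry = (ts, entry_id, ref)
    if is_celebrity(author):
        celebrity_recent[author].append(entry)  # one write, merged at read time
        return
    for user in followers[author]:
        # Dedup on the referenced id: skip if this tweet is already present.
        if all(existing[2] != ref for existing in pushed[user]):
            pushed[user].append(entry)


def home_timeline(user, limit=20):
    entries = list(pushed[user])
    for author in following[user]:
        if is_celebrity(author):
            entries.extend(celebrity_recent[author])
    seen, result = set(), []
    for ts, entry_id, ref in sorted(entries, reverse=True):
        if ref not in seen:  # the same tweet arriving via several paths shows once
            seen.add(ref)
            result.append(ref)
    return result[:limit]


for fan in ("u1", "u2", "u3"):
    follow(fan, "celeb")
follow("u1", "amy")
follow("u1", "ben")

fan_out("celeb", ts=1, entry_id=100)                 # cached, not pushed
fan_out("amy", ts=2, entry_id=101, original_id=100)  # retweet of 100
fan_out("ben", ts=3, entry_id=102, original_id=100)  # deduplicated for u1
fan_out("amy", ts=4, entry_id=103)                   # ordinary push
```

In this sketch u1 follows the celebrity and two retweeters, yet tweet 100 appears once in the merged timeline. The linear dedup scan is only acceptable because a real check would look at a bounded window of recent entries.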

[Figure: the famous trace, in which an 80M-follower account tweets.]

Trending topics (a streaming problem)

"What's spiking right now" is a windowed heavy-hitters problem over the firehose:

  1. Tokenize tweets into terms/hashtags as they flow through the log.
  2. Count per term over sliding windows (last 10–60 min) — at firehose volume, exact counts per term are wasteful; use count-min sketch (a probabilistic counting structure: small fixed memory, slight overestimates — name it, one sentence, move on) with exact counting for the current top-K candidates.
  3. Trending ≠ popular: "Monday" is always frequent. Score by acceleration — current window count vs the term's historical baseline; a 50× spike on a small base beats a 1.02× wiggle on a huge one.
  4. Compute per region; smooth over adjacent windows; apply the editorial/abuse filters every real system has.
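The count-min sketch named in step 2 fits in a few lines. This is a minimal sketch, assuming `hashlib.blake2b` with a per-row salt as the hash family and hypothetical `width`/`depth` values; window rotation (for example one sketch per minute bucket) and the exact top-K candidate counts are left out.

```python
import hashlib


class CountMinSketch:
    """Fixed-memory approximate counter: estimates never undercount."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _buckets(self, term):
        # One independent-looking hash per row, derived by salting with the row number.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                term.encode("utf-8"), digest_size=8, salt=bytes([row])
            ).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, term, count=1):
        for row, col in self._buckets(term):
            self.rows[row][col] += count

    def estimate(self, term):
        # Collisions only inflate cells, so the minimum across rows is the best guess.
        return min(self.rows[row][col] for row, col in self._buckets(term))


cms = CountMinSketch()
for _ in range(50):
    cms.add("#worldcup")
for _ in range(5):
    cms.add("monday")

print(cms.estimate("#worldcup"), cms.estimate("monday"))
```

Memory is `width * depth` counters regardless of how many distinct terms flow through, which is the property the firehose needs.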

Output: a tiny top-K list per region, recomputed every few seconds and cached — read by 100 M users, computed once (compute-once-broadcast-many, again).
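Scoring by acceleration rather than frequency might look like the sketch below. The ratio form and the `smoothing` constant are assumptions, one simple choice among many; the counts are made-up illustrations of the "50× on a small base versus 1.02× on a huge one" contrast from step 3.

```python
import heapq


def trend_score(current, baseline, smoothing=5.0):
    """Current-window count relative to the term's historical baseline.

    smoothing (a hypothetical constant) damps spikes on terms with almost
    no history, so a jump from 0 to 3 mentions does not dominate.
    """
    return (current + smoothing) / (baseline + smoothing)


def top_k_trending(current_counts, baseline_counts, k=3):
    scored = (
        (trend_score(count, baseline_counts.get(term, 0.0)), term)
        for term, count in current_counts.items()
    )
    return [term for _, term in heapq.nlargest(k, scored)]


# Illustrative counts for one region and one window.
current = {"monday": 102_000, "#earthquake": 5_000, "lunch": 900}
baseline = {"monday": 100_000, "#earthquake": 100, "lunch": 900}

print(top_k_trending(current, baseline, k=2))
```

"monday" has twenty times the raw volume of "#earthquake" but scores close to 1, so the small term with the large relative jump ranks first.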

Search (the second index)

Every tweet ever, searchable in milliseconds: an inverted index (term → list of tweet ids — the hash-map idea applied to text, the structure inside Elasticsearch/Lucene). Twitter-specific shape:

  • Shard by time, not just term — queries overwhelmingly want recent matches, so the freshest index shards are small, hot, in-memory, and searched first; older shards are bigger, colder, searched only when needed (pagination past recent results). The long-tail tiering of YouTube's caches, applied to an index.
  • Indexing lag of seconds is acceptable and is what the async indexer buys; ranking blends recency, engagement and relevance — acknowledge and move on.
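A toy version of the time-sharded inverted index is sketched below. The `TimeShardedIndex` class, the one-hour shard size, and whitespace tokenization are all assumptions for illustration; ranking is reduced to pure recency. What it shows is the search order: newest shard first, stopping as soon as enough results are found.

```python
from collections import defaultdict


class TimeShardedIndex:
    """Inverted index (term -> postings) split into time-bucketed shards."""

    def __init__(self, shard_seconds=3600):
        self.shard_seconds = shard_seconds
        self.shards = {}  # shard number -> {term: [(ts, tweet_id), ...]}

    def add(self, tweet_id, ts, text):
        shard = self.shards.setdefault(ts // self.shard_seconds, defaultdict(list))
        for term in set(text.lower().split()):
            shard[term].append((ts, tweet_id))

    def search(self, term, limit=10):
        """Return (tweet ids newest first, number of shards searched)."""
        results, shards_searched = [], 0
        for shard_no in sorted(self.shards, reverse=True):  # freshest shard first
            shards_searched += 1
            postings = self.shards[shard_no].get(term.lower(), [])
            results.extend(tid for _, tid in sorted(postings, reverse=True))
            if len(results) >= limit:
                break  # older, colder shards are never touched
        return results[:limit], shards_searched


index = TimeShardedIndex()
index.add(1, ts=100, text="goal scored")
index.add(2, ts=4000, text="what a goal")
index.add(3, ts=4100, text="goal again")

print(index.search("goal", limit=2))  # satisfied by the newest shard alone
print(index.search("goal", limit=3))  # has to reach back into the older shard
```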

Counters, briefly

Likes/retweet counts on viral tweets are the hot-counter problem: sharded counters / windowed aggregation; display lags by seconds; nobody pays per like, so no settlement pipeline is needed. One sentence in the room; the cross-reference is the point.
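For reference, the sharded-counter idea reduces to the sketch below. The `ShardedCounter` class and the shard count of 16 are hypothetical; in a real store each shard would be a separate row or key so concurrent writers rarely contend, and the total would be summed periodically and cached rather than on every read.

```python
import random


class ShardedCounter:
    """Spread increments for one hot tweet across several independent slots."""

    def __init__(self, num_shards=16):
        self.shards = [0] * num_shards

    def increment(self, amount=1):
        # Each write picks a random shard, so no single slot takes all the traffic.
        self.shards[random.randrange(len(self.shards))] += amount

    def total(self):
        # The read side sums the shards; a displayed value may lag by seconds.
        return sum(self.shards)


likes = ShardedCounter()
for _ in range(10_000):
    likes.increment()

print(likes.total())
```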

Think it through like the interview

  1. What's different from Instagram?
     You've done the feed problem. What does THIS prompt add that transfers nothing?
  2. One log, N views
     Timelines, trending, search, analytics all need every tweet. How many write paths?
  3. Timelines under a power law
     Average 700 followers, but the head of the distribution hits 100M. What does the hybrid look like HERE?
  4. Trending = acceleration, not frequency
     Why does 'Monday' never trend, and what structure counts a firehose in fixed memory?
  5. Search, and the closing trace
     Index every tweet ever: what's the Twitter-specific sharding? Then trace the 80M-follower tweet.

5. Bottlenecks & failure modes

  • Fan-out backlog during global events (World Cup final: tweet volume × 5, every tweet hot) → prioritized fan-out (active users first), the celebrity cache absorbs the head, timeline staleness degrades gracefully by minutes; posting never blocks.
  • Trending manipulation (bot brigades) → rate limits per account, spam-scoring upstream of the trend pipeline, editorial gates — name the adversarial reality; it's a product surface, not a footnote.
  • Timeline cache loss → rebuild lazily via pull-model (the rebuildable-derived-data story once more).
  • Search indexer lag → recent tweets invisible in search for minutes; trending (separate pipeline) keeps working — independent consumers fail independently, which is why the event-log architecture wins.
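"Prioritized fan-out (active users first)" from the first bullet can be illustrated with a small ordering function. The `prioritized_batches` helper and the last-active timestamps are hypothetical; the assumption is that the fan-out worker has some recency signal per follower and drains batches in that order, so a backlog delays dormant accounts first.

```python
def prioritized_batches(follower_last_active, now, batch_size=2):
    """Yield follower batches, most recently active users first.

    follower_last_active maps user -> timestamp of last activity
    (an assumed signal; any recency or engagement score would do).
    """
    ordered = sorted(
        follower_last_active,
        key=lambda user: now - follower_last_active[user],
    )
    for start in range(0, len(ordered), batch_size):
        yield ordered[start:start + batch_size]


last_active = {"a": 10, "b": 95, "c": 50, "d": 99}
batches = list(prioritized_batches(last_active, now=100))
print(batches)
```

Under backlog, a worker that stops after the first few batches has still served the users most likely to open their timeline soon.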

Design drills

The timeline is the fan-out problem in its purest form. Drill the extremes.

Whiteboard each one out loud for 5–10 minutes before you reveal what a strong answer covers; the gap between your sketch and the checklist is your study list.

Warm-up

Build a user's home timeline. Push (fan-out on write) or pull (fan-out on read)?

Core

A celebrity with 40M followers tweets. Why is it special, and what do you do?

Core

Likes and retweets storm in on a viral tweet. Keep the counters from melting a partition.

Stretch

Trending and search must reflect a tweet within seconds. How, without slowing the write path?