The structure where paths spell words
A trie (from retrieval, pronounced "try") is a tree for storing
strings where each edge is a character and each root-to-node path spells a
prefix. Store cat, car, card, dog:
(root)
/ \
c d
| |
a o
/ \ |
t● r● g● ● = "a word ends here" flag
|
d●
cat and car share the path c→a — that's the whole idea. A trie
physically merges common prefixes, so anything prefix-shaped becomes a walk
down the tree:
- "Does any word start with
ca?" — walk c, a: yes, in 2 steps. A hash table cannot answer this without scanning every key — hashing destroys prefix structure. - "All completions of
ca?" — walk to thecanode, then collect every flagged node below it (DFS). That's autocomplete, literally.
| Operation (word of length L) | Hash set | Trie |
|---|---|---|
| insert / exact lookup | O(L)* | O(L) |
| prefix exists? | O(n·L) scan | O(L) |
| all words with prefix | O(n·L) scan | O(L + results) |
| memory | compact | pointer-heavy (the trade-off) |
*hashing a string takes O(L) too — people forget.
The complexity teaching point: trie costs depend on word length, not on how many words are stored. A million-word dictionary answers a 3-letter prefix query in ~3 steps.
Watch it run
Insert words that share beginnings and watch the branches merge — "car", "cat" and "card" pay for c→a→r exactly once. The dashed ring marks "a word ends here"; look up "ca" to see why that mark matters.
1/25An empty trie — just a root with no letters. We'll insert: "car", "cat", "card", "do", "dog".
Implementation
The node is just "children by character, plus an end-flag" — a dict per node:
# Python
class TrieNode:
def __init__(self):
self.children = {} # char → TrieNode
self.is_word = False
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word): # O(L)
node = self.root
for ch in word:
if ch not in node.children:
node.children[ch] = TrieNode()
node = node.children[ch]
node.is_word = True
def search(self, word): # exact match — O(L)
node = self._walk(word)
return node is not None and node.is_word
def starts_with(self, prefix): # the trie's superpower — O(L)
return self._walk(prefix) is not None
def _walk(self, s):
node = self.root
for ch in s:
if ch not in node.children:
return None
node = node.children[ch]
return node
// Java — fixed array beats HashMap for lowercase a–z
class TrieNode {
TrieNode[] children = new TrieNode[26];
boolean isWord;
}
// child lookup: children[ch - 'a']
// C++
struct TrieNode {
TrieNode* children[26] = {};
bool isWord = false;
};
The dict-children version handles any alphabet; the array-of-26 version is faster and the standard choice when the problem says "lowercase English letters" — mention the choice, it's free signal.
Where tries earn their keep
- Autocomplete — search boxes, IDEs, your phone keyboard. Walk the prefix, DFS below it, rank the results. (Real engines bolt on ranking — at each node, cache the top-k completions underneath so queries don't DFS at all: precompute over recompute, a system-design instinct.)
- Spell-check & word games — "is this a word / could it become one?"
The
starts_withcheck is what lets word-search algorithms prune: if no word starts withxq, abandon that entire branch of exploration. This trie+backtracking combo is the famous Word Search II problem — the trie's marquee interview appearance. - IP routing — routers pick the longest matching prefix for a packet's destination address (the internet, Level 0) using binary tries over address bits. Your packets traverse tries to reach this page.
- T9 / contact search — digits map to letter sets; walk all matching branches at once.
Common mistakes
- Forgetting the end-of-word flag — then
search("ca")returns True because the path exists, even though onlycatwas inserted. Prefix existence ≠ word membership; the flag is the difference. - Marking the wrong node — set
is_wordon the last character's node, after the loop; off-by-one here breaks everything quietly. - Reaching for a trie when a set suffices — exact membership only? A hash set is simpler and leaner. The trie's justification is the word "prefix" (or "longest match") in the problem.
- Memory surprise — a node per character with 26 pointers each adds up; for huge static dictionaries, real systems compress chains (radix/Patricia tries — name-drop, don't implement).
- Recursing the collect step without bounding results — autocomplete
wants top-k, not all 40,000 completions of
a.
Interview perspective
Practice
- Build: implement the trie above from memory; insert
cat, car, card, dog; verifysearch("ca") == False,starts_with("ca") == True,search("card") == True. - Extend: add
count_with_prefix(prefix)— maintain apass_counton each node during insert so the answer is O(L) with no DFS. (This node-metadata trick is the heart of harder trie problems.) - Autocomplete: add
suggest(prefix, k)returning the first k completions in alphabetical order — DFS with early stop. - The boss fight: Word Search II (find dictionary words in a letter grid). Solve word-by-word first (backtracking per word), then re-solve with one trie guiding a single grid exploration — and measure the speedup the pruning buys you.
Next: Union-Find — the tiny structure that answers "are these connected?" almost instantly.
Practice — climb the ladder
The trie trigger: "prefix", "starts with", "autocomplete", or a dictionary queried many times. Build it once; every query rides the shared prefixes.
Climb in order — every rung assumes the one above it. Solve on LeetCode, then tick it here; progress is saved on this device.
Warm-up
build the structure once by handinsert / search / startsWith from scratch — children map + end-of-word flag.
Solvable without a trie — then see why a trie makes it trivial.
Core
tries + traversalWildcard search = DFS branching inside the trie — structure meets recursion.
- Replace WordsMedium
Shortest-root lookup — walking the trie and stopping at the first end flag.
Stretch
the famous combination- Word Search IIHard
Trie + grid backtracking — prune the board walk against the dictionary; the canonical trie interview.