Why sorting is the gateway algorithm
Sorting matters for three reasons: it's the classic vehicle for learning divide and conquer; sorted data unlocks faster everything (binary search, two pointers, dedupe, ranges); and "should I sort first?" is the highest-value first question on dozens of interview problems. You will almost never implement a sort at work — you must still know how they work, because the follow-ups ("why is quicksort O(n²) worst case?") are standard.
The complexity landscape
| Algorithm | Average | Worst | Space | Stable? | The one-liner |
|---|---|---|---|---|---|
| Bubble/Insertion | O(n²) | O(n²) | O(1) | ✓ | teaching tools; insertion wins on nearly-sorted data |
| Merge sort | O(n log n) | O(n log n) | O(n) | ✓ | guaranteed, stable, the linked-list choice |
| Quicksort | O(n log n) | O(n²) | O(log n) | ✗ | fastest in practice, in place |
| Heap sort | O(n log n) | O(n log n) | O(1) | ✗ | guaranteed + in place, slower constants |
| Counting/Radix | O(n + k) | O(n + k) | O(k) | ✓ | non-comparison; small integer ranges only |
O(n log n) is provably the floor for comparison sorting — counting sort beats it only by not comparing (it indexes by value, an array trick).
Watch them run
The complexity table above becomes obvious once you watch the work happen. Switch algorithms on the same array — count how many steps each one takes, then shuffle and try again.
1/48Bubble sort: repeatedly swap adjacent out-of-order pairs; the largest value bubbles to the end each pass.
Merge sort — divide and conquer, canonically
Split in half, sort each half (recursively), merge the two sorted halves. The merge is two pointers: repeatedly take the smaller front.
def merge_sort(a):
if len(a) <= 1:
return a
mid = len(a) // 2
left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
return merge(left, right)
def merge(l, r):
out, i, j = [], 0, 0
while i < len(l) and j < len(r):
if l[i] <= r[j]: # <= keeps it STABLE
out.append(l[i]); i += 1
else:
out.append(r[j]); j += 1
return out + l[i:] + r[j:] # one side has leftovers
Why n log n: log n levels of splitting, O(n) merge work per level. Why it matters beyond sorting: the merge step is "merge two sorted lists," "merge k sorted lists" (add a heap), and external sorting — how databases sort data bigger than RAM (sort chunks, merge streams — files, Level 1 meets scale).
Stable means equal elements keep their original order — sort orders by date, then by customer, and each customer's orders stay date-sorted. Interviewers love this word; merge sort has it, quicksort doesn't.
Quicksort — partition and recurse
Pick a pivot, partition the array into < pivot | pivot | > pivot
(one O(n) pass, in place), then recurse on the two sides.
def quicksort(a, lo=0, hi=None):
if hi is None: hi = len(a) - 1
if lo >= hi: return
p = partition(a, lo, hi)
quicksort(a, lo, p - 1)
quicksort(a, p + 1, hi)
def partition(a, lo, hi): # Lomuto: pivot = last element
pivot, write = a[hi], lo # 'write' trails: everything left of it < pivot
for read in range(lo, hi):
if a[read] < pivot:
a[write], a[read] = a[read], a[write]
write += 1
a[write], a[hi] = a[hi], a[write]
return write
The worst case is the famous follow-up: a bad pivot (e.g., always the last element on already-sorted input) splits n into 1 and n−1 → O(n²). Defenses: random pivot or median-of-three. With randomization, expected O(n log n) and the best cache behavior of the bunch — which is why real libraries build on it (with introsort fallbacks to heap sort when recursion gets suspicious).
Partition alone is a tool: quickselect finds the k-th smallest element in expected O(n) — partition, then recurse into one side only. "K-th largest element" is a top-10 interview question, and "quickselect or a heap?" is the expected discussion.
What you actually do: call sort, design the key
words.sort() # Timsort: O(n log n), stable
users.sort(key=lambda u: (-u.score, u.name)) # rank by score desc, tie → name
Arrays.sort(primitives); // dual-pivot quicksort
Collections.sort(users, Comparator.comparing(User::getScore).reversed());
std::sort(v.begin(), v.end()); // introsort
std::stable_sort(v.begin(), v.end(), byScore); // when stability matters
The real-world skill is the comparator/key: multi-field, ascending/
descending mixes, and custom orders ("largest number formed by
concatenation" — sort with comparator a+b > b+a). Python's Timsort is
merge-sort-based, stable, and exploits existing runs — nearly-sorted data
sorts in nearly O(n).
The searching ladder
- Linear scan — O(n), unsorted data, no setup. Fine once; a loop of them is the hash-table cue.
- Binary search — O(log n), needs sorted data. Halve the space each comparison; full treatment (and its killer generalization, binary search on the answer) in its own doc.
- Hash lookup — O(1), needs preprocessing, exact-match only.
- Structure-specific — BST range queries, trie prefixes.
The strategic table — given n items and q queries:
| Situation | Choice |
|---|---|
| One query | scan: O(n) beats sort-then-search O(n log n) |
| Many exact-match queries | hash set/map: O(n) build, O(1) each |
| Many order queries (range, nearest, rank) | sort once + binary search: O(n log n) + O(log n) each |
| Data changes between queries | BST/heap, not re-sorting |
Common mistakes
- Sorting when a hash map answers it — two-sum sorted costs O(n log n) and loses original indices; the dict is O(n).
- Destroying information by sorting — need the original index later?
Sort pairs
(value, i)instead. - Quadratic "sorts" hiding in code —
list.remove(min(list))in a loop is O(n²) selection sort wearing builtins. - Claiming quicksort is O(n log n) flat — it's expected; the worst case and its trigger is the exact follow-up question.
- Stability blindness — multi-pass sorting (by B, then by A) only works with stable sorts; say the word.
Interview perspective
Practice
- Implement from memory: merge sort, then quicksort with random pivot. Time both on 10⁵ random ints, then on 10⁵ already-sorted ints with a fixed last-element pivot — watch the O(n²) happen.
- Comparator reps: sort a list of
(name, dept, salary)by dept asc, salary desc, name asc — one sort call. Then: why does doing it as three separate stable sorts (in reverse priority order) also work? - The canon: k-th largest (all three ways); merge intervals (sort by start — see heaps & intervals); "sort colors" (Dutch national flag — partition with three pointers, no counting allowed).
- Judgment: 10⁹ 32-bit integers on disk, 4 GB RAM — sketch the external merge sort. (Chunks that fit, sort each, k-way merge with a heap. You just designed a database primitive.)
Next pattern: Greedy — when taking the locally best choice is provably globally optimal.
Practice — climb the ladder
Sorting is rarely the ANSWER — it is the move that unlocks two pointers, greedy sweeps, and binary search. Practice choosing it deliberately.
Climb in order — every rung assumes the one above it. Solve on LeetCode, then tick it here; progress is saved on this device.
Warm-up
sort mechanics you must ownThe merge step of merge sort, isolated — fill from the back.
- Sort ColorsMedium
Dutch national flag — three-way partition, the heart of quicksort.
Core
sorting as a tool, not a buttonQuickselect — partition without fully sorting; average O(n) and say why.
- Sort ListMedium
Merge sort on a linked list — the structure picks the algorithm.
- Largest NumberMedium
Custom comparator (a+b vs b+a) — sorting by a rule you invent.
Bucket sort by count — when O(n log n) is already beatable.
- H-IndexMedium
Sort then scan for a crossing point — sorted order creating meaning.
Stretch
divide-and-conquer counting- Reverse PairsHard
Count during merge — the merge-sort skeleton doing analytics on the side.