DBMS — Transactions, Isolation & Indexes

ACID, the isolation levels and the anomalies they prevent, how B-tree indexes really work (and when they hurt), and normalization vs denormalization.

DBMStransactionsisolationindexes

ACID

  • Atomicity — all-or-nothing; a partial transaction rolls back.
  • Consistency — a transaction moves the DB from one valid state to another (constraints hold).
  • Isolation — concurrent transactions don't see each other's partial work.
  • Durability — once committed, it survives a crash (write-ahead log).

StockStump's balance decrement needs all four — which is why the fix for the double-spend was a single atomic, isolated UPDATE.

Isolation levels & the anomalies they stop

LevelDirty readNon-repeatable readPhantom
Read Uncommitted✅ possible
Read Committed
Repeatable Read✅ (varies)
Serializable
  • Dirty read: you read another transaction's uncommitted change.
  • Non-repeatable read: you read a row twice and get different values (someone committed an update between).
  • Phantom: you re-run a range query and new rows appear.

Higher isolation = more correctness, less concurrency. Read Committed is the common default; bump to Serializable only for the operations that need it.

Indexes

An index is usually a B-tree: ordered, balanced, O(log n) lookups, and great for range scans and ORDER BY. It speeds reads but slows writes (every insert/update maintains the tree) and costs space.

  • Composite index follows the leftmost-prefix rule: an index on (a, b) helps queries filtering on a or a,b — not on b alone.
  • Covering index: if the index contains every column a query needs, the DB answers from the index without touching the row ("index-only scan").
  • A hash index is O(1) for equality but useless for ranges.
Indexes aren't free

Over-indexing slows writes and wastes space; an index on a low-cardinality column (e.g. a boolean) barely helps because it can't narrow the scan. Index the columns you actually filter/sort/join on, and measure with EXPLAIN.

Normalization vs denormalization

Normalize to remove redundancy and update anomalies (3NF: every non-key column depends on the key, the whole key, and nothing but the key). Denormalize (duplicate data) to avoid expensive joins on read-heavy paths — trading write complexity for read speed. OLTP leans normalized; analytics/read models lean denormalized.