The actual definition
A monolith is one deployable: all features in one codebase, one process, one database. Microservices split the backend into independently deployable services — each owning one business capability and its own data — talking over the network.
The crucial honesty up front: microservices are not a performance technique or a modernity badge. They are an organizational scaling technique that costs technical complexity. Every "should we?" conversation is really about that trade.
What they buy — and the price
| You gain | You pay |
|---|---|
| Independent deploys — checkout team ships 20×/day without coordinating with search team | Every function call you split becomes a network call: latency, timeouts, retries, partial failure |
| Independent scaling — 50 instances of search, 2 of invoicing (scalability) | No more ACID across features — the transaction that updated orders + inventory together is gone |
| Fault isolation — recommendations down ≠ checkout down | Debugging crosses machines: distributed tracing or blindness |
| Tech freedom per service — the right runtime per workload | Operational surface × N: deploys, monitoring, on-call, versioned contracts between teams |
| Team autonomy — Amazon's "two-pizza teams," each owning services end-to-end | A platform tax you pay before the first feature: service discovery, CI/CD per service, Level 10 machinery |
The rule that survives contact with reality: split when team coordination costs exceed distributed-system costs, which is rarely before several teams are stepping on each other's deploys. "Monolith-first" (build modular inside one deployable; extract services along the seams that prove painful) is not a junior compromise; it's what the Amazon, Shopify and Stack Overflow stories actually teach.
Boundaries: the design problem
Wrong splits are worse than no splits. The unit is a business capability (orders, payments, inventory, notifications) — not a layer (controllers-service! database-service!) and not a table. Two tests for a proposed boundary:
- Does it own its data? Each service gets its own database — others may never reach in; they ask via API/events. Shared databases recreate the coupling you split to escape, minus the transactions.
- Can it change alone? If every feature touches three services, you've built a distributed monolith — all of the costs, none of the autonomy. (This is the modal failure, and naming it is interview gold.)
The encapsulation principle, at organization scale: services are objects, APIs/events are their public methods, databases are their private fields.
Communication: sync vs async
- Synchronous (REST/gRPC): caller needs the answer now; checkout must know payment succeeded. Simple to reason about, but it couples uptime (callee down = caller failing) and adds latency per hop. gRPC is a common choice for internal calls: binary, with typed contracts (the API-contract discipline, compiled).
- Asynchronous (events via queues): caller announces a fact (OrderPlaced) and moves on; interested services react on their own time. Decouples uptime and teams (adding a consumer requires zero changes upstream); costs you eventual consistency and harder reasoning.
Default that earns senior nods: sync for queries and must-know-now commands; async events for everything that's a consequence. Most "service A calls B calls C calls D" latency chains were consequences wearing sync clothing.
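To make "announce a fact and move on" concrete, here is a minimal sketch using an in-memory queue as a stand-in for a real broker. The `EventBus` class, its method names, and the event payload are all hypothetical; a production bus would add persistence, acknowledgements and redelivery.

```python
from collections import defaultdict, deque


class EventBus:
    """In-memory stand-in for a message broker (hypothetical, for illustration)."""

    def __init__(self):
        self._handlers = defaultdict(list)  # event type -> consumer callbacks
        self._queue = deque()               # published but not yet delivered

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        # The publisher only records the fact and returns; it never calls a consumer.
        self._queue.append((event_type, payload))

    def deliver_pending(self):
        # Stands in for the broker delivering events later, on the consumers' time.
        while self._queue:
            event_type, payload = self._queue.popleft()
            for handler in self._handlers[event_type]:
                handler(payload)


bus = EventBus()
emails, reservations = [], []

bus.subscribe("OrderPlaced", lambda e: emails.append(e["order_id"]))
# Adding a second consumer requires no change to the publisher below.
bus.subscribe("OrderPlaced", lambda e: reservations.append(e["order_id"]))

bus.publish("OrderPlaced", {"order_id": "o-1"})
delivered_before = (list(emails), list(reservations))  # consumers have not run yet
bus.deliver_pending()
```

The gap between `publish` and `deliver_pending` is the eventual-consistency window: the order exists, but the email and the reservation do not yet.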
Surviving sync failure
Network calls fail; the patterns are vocabulary now: timeouts (always, on everything), retries with backoff + jitter (only on idempotent calls — idempotency keys exist for this), circuit breakers (after K failures, fail fast for a cooldown instead of hammering a dying service — preventing cascade failure), and fallbacks (recommendations down → show bestsellers, not a 500).
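The patterns above fit in a few dozen lines. The following is a sketch under stated assumptions, not a production library: the class and function names, the default thresholds, and the injectable `clock` and `rng` parameters are all illustrative choices. The jitter variant shown is "full jitter" (a random delay between zero and the exponential cap).

```python
import random
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses a call without trying it."""


class CircuitBreaker:
    def __init__(self, max_failures=3, cooldown_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.cooldown_s:
                raise CircuitOpenError("failing fast during cooldown")
            # Cooldown elapsed: allow one trial call; a single failure reopens.
            self._opened_at = None
            self._failures = self.max_failures - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        return result


def backoff_delays(attempts, base_s=0.1, cap_s=2.0, rng=random.random):
    """Delays for successive retries: random in [0, min(cap, base * 2**n))."""
    return [rng() * min(cap_s, base_s * 2 ** n) for n in range(attempts)]


def with_fallback(breaker, fn, fallback):
    """Return fn() through the breaker, or the fallback if it fails or is open."""
    try:
        return breaker.call(fn)
    except Exception:
        return fallback()
```

A caller would wrap the recommendations request as `with_fallback(breaker, fetch_recommendations, fetch_bestsellers)` (both function names hypothetical) and sleep for each value of `backoff_delays(...)` between retries of idempotent calls only. The timeout itself belongs on the underlying network call and is not shown here.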
The hard part: transactions across services
Order placement must create the order, charge the payment, and reserve inventory: three services, three databases, no shared transaction. What happens if the inventory reservation fails after the payment has been charged?
The saga pattern: a sequence of local transactions, each publishing an event that triggers the next — and for every step, a defined compensating action that undoes it on later failure:
Happy path: OrderCreated → PaymentCharged → InventoryReserved → OrderConfirmed
If the inventory reservation fails (✗), compensation flows back: ReservationFailed → RefundPayment → CancelOrder
Two flavors: choreography (services react to each other's events — no coordinator, but the flow lives nowhere and is hard to follow) vs orchestration (an order-saga coordinator commands each step — the flow is explicit, the coordinator is a dependency). Either way you've traded ACID's automatic rollback for explicit, designed rollback — and accepted eventual consistency: for a few seconds, the system is mid-saga. (CAP & consistency is the theory; this is where it bites.)
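As an illustration of the orchestration flavor, the sketch below runs the order flow as a list of (name, action, compensation) steps and undoes completed steps in reverse when one fails. Everything here is hypothetical: `run_saga`, the step names, and the in-memory `state` dict stand in for three services and their databases.

```python
def run_saga(steps):
    """Run (name, action, compensate) steps in order.

    If an action raises, run the compensations of the already completed
    steps in reverse order and report failure.
    """
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception as exc:
            log.append(f"{name} failed: {exc}")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"{done_name} compensated")
            return False, log
        completed.append((name, compensate))
        log.append(f"{name} ok")
    return True, log


# In-memory stand-ins for the order and payment services' local state.
state = {"order": "none", "payment": "none"}

def create_order(): state["order"] = "created"
def cancel_order(): state["order"] = "cancelled"
def charge_payment(): state["payment"] = "charged"
def refund_payment(): state["payment"] = "refunded"
def reserve_inventory(): raise RuntimeError("out of stock")
def release_inventory(): pass

ok, log = run_saga([
    ("CreateOrder", create_order, cancel_order),
    ("ChargePayment", charge_payment, refund_payment),
    ("ReserveInventory", reserve_inventory, release_inventory),
])
```

A real coordinator would also persist saga progress so it can resume after a crash, and would retry compensations that fail; this sketch assumes both actions and compensations run in-process and compensations always succeed.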
Idempotency is the saga's load-bearing wall: events get redelivered (at-least-once), so every handler must tolerate duplicates — process-once semantics built from dedupe keys.
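A minimal sketch of a dedupe-key consumer follows. It assumes each event carries a unique `event_id` field (an assumption, not something every broker provides) and keeps processed ids in memory; the class name and the event shape are hypothetical.

```python
class IdempotentConsumer:
    """Wraps a handler so a redelivered event triggers its side effect only once."""

    def __init__(self, handler):
        self._handler = handler
        self._processed_ids = set()  # in a real service: a table in its own database

    def handle(self, event):
        event_id = event["event_id"]
        if event_id in self._processed_ids:
            return False  # duplicate delivery: skip the side effect
        self._handler(event)
        self._processed_ids.add(event_id)
        return True


refunds = []
consumer = IdempotentConsumer(lambda e: refunds.append(e["amount"]))

event = {"event_id": "evt-42", "type": "ReservationFailed", "amount": 1999}
first = consumer.handle(event)   # processed
second = consumer.handle(event)  # same event redelivered: ignored
```

In a real service, recording the id and applying the side effect should commit in the same local transaction; otherwise a crash between the two steps reintroduces the duplicate.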
Seeing anything: observability
A request now crosses 6 services; "it's slow" means where? Three pillars, non-negotiable at this scale: structured logs with a correlation id propagated through every hop (the one header that turns 6 services' logs into one story), metrics per service (rate/errors/duration — the load balancer's health checks plus dashboards), and distributed tracing (OpenTelemetry/Jaeger: one waterfall showing the request's path and where the 800 ms went). Level 10 operationalizes all three.
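Correlation-id propagation is the cheapest of the three to sketch. The example below simulates two hops in one process; the `X-Correlation-ID` header name is a common convention rather than a standard, and the handler and helper names are hypothetical.

```python
import json
import uuid

CORRELATION_HEADER = "X-Correlation-ID"  # assumed header name; conventions vary


def correlation_id_from(headers):
    # Reuse the caller's id if present; otherwise this hop starts the story.
    return headers.get(CORRELATION_HEADER) or uuid.uuid4().hex


def log_event(service, correlation_id, message, **fields):
    # One JSON object per line, so logs from every service can be joined by id.
    record = {"service": service, "correlation_id": correlation_id,
              "message": message, **fields}
    return json.dumps(record, sort_keys=True)


def payments_handler(headers):
    cid = correlation_id_from(headers)
    return [log_event("payments", cid, "charge accepted", duration_ms=42)]


def checkout_handler(headers):
    cid = correlation_id_from(headers)
    lines = [log_event("checkout", cid, "order received")]
    # Forward the same id on the outgoing call to the next service.
    lines += payments_handler({CORRELATION_HEADER: cid})
    return lines


log_lines = checkout_handler({})
```

Filtering the combined logs on one `correlation_id` value then yields the request's full path. Distributed tracing builds on the same idea, carrying span context across hops instead of a single id.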
Common mistakes
- Microservices while you are still searching for the product: a two-person startup paying Amazon's coordination tax with nobody to coordinate. Build a modular monolith first.
- Shared database "just for now" — the distributed monolith's origin story.
- Splitting by noun-table instead of capability — UserService, OrderService, and every request visiting both.
- Sync chains five deep — availability multiplies down: if each service is up 99.9% of the time, a request needing all five in a row succeeds only 0.999⁵ ≈ 99.5% of the time — you made everything less reliable by wiring it in series. Worst-case latency stacks the same way (estimation instincts).
- No timeouts/circuit breakers — one slow dependency exhausts your thread pool and the cascade takes the platform down.
- Events without idempotent consumers: duplicate PaymentCharged, duplicate refund, real money.
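The availability arithmetic behind the "sync chains five deep" mistake can be checked in a few lines. This sketch assumes failures are independent across services, which is the simplification the 0.999⁵ figure already makes; the function names are illustrative.

```python
def serial_availability(per_service, hops):
    """Probability that every service in a chain of `hops` is up at once."""
    return per_service ** hops


def downtime_minutes_per_30_days(availability):
    return (1 - availability) * 30 * 24 * 60


single = serial_availability(0.999, 1)
chain = serial_availability(0.999, 5)

print(f"one service:    {single:.4%}, "
      f"about {downtime_minutes_per_30_days(single):.0f} min down per 30 days")
print(f"five in series: {chain:.4%}, "
      f"about {downtime_minutes_per_30_days(chain):.0f} min down per 30 days")
```

Under these assumptions, the expected downtime grows from roughly 43 minutes to roughly 216 minutes per 30 days: about five times worse, purely from wiring the services in series.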
Practice
- Paper-split BookMyShow (the LLD): propose services, mark each arrow sync or event, and find the saga (booking = seat-lock + payment + ticket + notification). Where does the compensation flow run when payment fails?
- Build a mini-saga: two FastAPI/Express services (orders, payments) + Redis as a bus: OrderPlaced → charge → PaymentResult → confirm/cancel. Kill the payment service mid-flow; make recovery work; then send a duplicate event and watch idempotency save you (or not).
- Break a chain: add a 5 s sleep to one downstream service and watch the caller's latency without a timeout; add timeout + fallback; then a counter-based circuit breaker. Three patterns, one afternoon.
- Connect: Praxivo's deep-dive — argue against microservices for it in three sentences. (Knowing when not to is the Level 7 graduation exam.)
Level 7 complete. The roadmap turns to what the user sees: Level 8 — Frontend Development.