Two problems you created by splitting
The moment a monolith becomes services, two navigation problems appear that the monolith never had:
- Outside → in: clients can't be expected to know that orders live at one address, payments at another, each with its own auth quirks. Who is the front door? → the API gateway.
- Inside → inside: the order service must call the payment service — but instances autoscale, die, and move (the cloud's nature); IP addresses are weather, not geography. How do callers find callees? → service discovery.
The API gateway: one front door
A gateway is a reverse proxy with opinions — the single entry point where edge concerns live exactly once instead of in every service:
What it does, and why there:
- Routing — path/host-based dispatch to services; clients see one API (the contract), the topology behind it can refactor freely. Versioning and canary routing ("send 5% of traffic to v2") live here too.
- Authentication at the edge — verify the JWT/session once; pass a trusted identity header inward. Services still authorize (what may this user do — that's domain logic), but they stop re-implementing who is this.
- Rate limiting & quotas: rate-limiting algorithms such as the token bucket, enforced before traffic touches anything expensive.
- TLS termination, request logging, body limits, CORS — the unglamorous edge checklist, centralized.
- Aggregation/BFF — optionally, compose "the mobile home screen" from three service calls so the phone makes one (the BFF pattern, often GraphQL-shaped).
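The routing bullet above, including the "send 5% of traffic to v2" canary, can be sketched in a few lines. This is a minimal illustration and not how any particular gateway implements it: the `Route` fields, the upstream URLs, and the hash-based bucketing are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    prefix: str                    # path prefix the gateway owns, e.g. "/orders"
    stable: str                    # upstream that receives most traffic
    canary: Optional[str] = None   # optional upstream under test
    canary_percent: int = 0        # share of clients sent to the canary (0-100)


# Hypothetical route table; in practice this is gateway config kept in version control.
ROUTES = [
    Route("/orders", "http://orders-v1:8080", "http://orders-v2:8080", 5),
    Route("/payments", "http://payments:8080"),
]


def bucket(client_id: str) -> int:
    """Map a client to a fixed bucket in 0..99 so its canary decision is sticky."""
    digest = hashlib.sha256(client_id.encode()).digest()
    return int.from_bytes(digest[:4], "big") % 100


def resolve(path: str, client_id: str) -> Optional[str]:
    """Return the upstream URL for a request, or None if no route matches."""
    matches = [
        r for r in ROUTES
        if path == r.prefix or path.startswith(r.prefix + "/")
    ]
    if not matches:
        return None
    route = max(matches, key=lambda r: len(r.prefix))  # longest prefix wins
    if route.canary and bucket(client_id) < route.canary_percent:
        return route.canary + path
    return route.stable + path
```

Calling `resolve("/payments/42", "alice")` returns `"http://payments:8080/payments/42"`, since that route has no canary. Hashing the client id, rather than rolling a random number per request, keeps each client on one version for the duration of the experiment; that is a design choice for this sketch, not the only way to split traffic.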
Real ones you'll name: NGINX/Kong/Envoy, cloud-native (AWS API Gateway), or Spring Cloud Gateway. The danger to name in the same breath: the gateway sits on every request — keep it thin (translate and route, never business logic), horizontally scaled, and treat its config as code, or it becomes a single point of failure and a single point of politics (ESB scars, historically).
Service discovery: the phone book
Hardcoding payment-svc:8080 works until the third deploy. Discovery
has three moving parts: a registry (live address book), a
registration flow, and a lookup flow:
- Registration: instances announce themselves on boot ("payments @ 10.0.3.7:8080") and are removed when unhealthy — via TTL'd heartbeats (the session-registry trick) or active health checks. The removal half is the entire point: a registry of dead instances is worse than none.
- Client-side discovery: the caller queries the registry and load-balances itself (Netflix's Eureka+Ribbon lineage). Fewer hops, smarter per-client choices — but discovery logic in every codebase.
- Server-side discovery: the caller hits a stable virtual name; a load balancer/proxy resolves it. Callers stay dumb. This is the dominant pattern today because platforms provide it: in Kubernetes, http://payments just works, because DNS plus a virtual IP route to live pods, with registration and health-eviction handled by the platform (Level 10 makes discovery ambient).
Registries you'll name: Consul, etcd, ZooKeeper (the same coordination stores — not a coincidence: discovery is small-but-critical shared state), or the platform's built-in.
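The three moving parts (registry, registration with TTL'd heartbeats, lookup) fit in a small in-memory sketch. This is a toy under stated assumptions, not how Consul, etcd, or Eureka work internally: the class names, the injectable `clock` parameter, and the round-robin policy are all made up for illustration.

```python
import itertools
import time
from typing import Callable, Dict, List


class Registry:
    """In-memory registry: an instance stays listed only while it keeps heartbeating."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the eviction logic can be tested without sleeping
        self._last_seen: Dict[str, Dict[str, float]] = {}  # service -> address -> last heartbeat

    def heartbeat(self, service: str, address: str) -> None:
        """Register on the first call; renew the lease on every later call."""
        self._last_seen.setdefault(service, {})[address] = self._clock()

    def lookup(self, service: str) -> List[str]:
        """Return live addresses, evicting any whose lease has expired."""
        now = self._clock()
        entries = self._last_seen.get(service, {})
        expired = [addr for addr, seen in entries.items() if now - seen > self._ttl]
        for addr in expired:
            del entries[addr]
        return sorted(entries)


class RoundRobinClient:
    """Client-side discovery: the caller queries the registry and balances itself."""

    def __init__(self, registry: Registry, service: str):
        self._registry = registry
        self._service = service
        self._counter = itertools.count()

    def pick(self) -> str:
        live = self._registry.lookup(self._service)
        if not live:
            raise LookupError(f"no live instances of {self._service}")
        return live[next(self._counter) % len(live)]
```

An instance would call `registry.heartbeat("payments", "10.0.3.7:8080")` on boot and then on a timer shorter than the TTL; if it dies, its entry disappears at the next lookup after the lease runs out. The eviction in `lookup` is the part that matters: without it, callers would keep being handed dead addresses.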
The service mesh (one honest paragraph)
Once every service also wants retries, timeouts, mutual TLS, circuit breaking and per-call metrics (the resilience kit), you can either build them into every codebase (×N languages) or push them into a sidecar proxy next to each instance — and the fleet of sidecars plus their control plane is a service mesh (Istio/Linkerd, built on Envoy). Mesh = the gateway's concerns, applied to east-west (service-to-service) traffic, uniformly and outside application code. The cost is real operational complexity — the honest take: meshes earn their keep at many-teams/many-services scale, and below that, library-based resilience (or just the platform's basics) is the right call.
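For the library-based alternative, one piece of the resilience kit, a circuit breaker, might look like the sketch below. It is a deliberately small version under assumptions of my own (consecutive-failure counting, a single trial call after the cool-down, hypothetical names throughout) and leaves out retries, timeouts, and thread safety.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker refuses a call without trying the downstream."""


class CircuitBreaker:
    def __init__(self, max_failures: int, reset_after: float,
                 clock: Callable[[], float] = time.monotonic):
        self._max_failures = max_failures   # consecutive failures before opening
        self._reset_after = reset_after     # seconds to fail fast before a trial call
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after:
                raise CircuitOpenError("downstream marked unhealthy; failing fast")
            # Cool-down elapsed: half-open, so let this one call through as a trial.
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self._max_failures:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would wrap each outbound request, for example `breaker.call(lambda: fetch_payment(order_id))`, where `fetch_payment` is a hypothetical function that already applies its own timeout. A mesh sidecar applies the same idea outside the process, which is why it only pays off once this logic would otherwise be duplicated across many codebases.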
Common mistakes
- Business logic in the gateway: once discounts are computed at the edge, there are two backends, and one of them lives in YAML with nobody's tests covering it.
- Clients bypassing the gateway ("just this once") — every edge control now has a hole; network policy should make bypass impossible, not impolite.
- Treating the registry as optional hygiene — stale entries mean traffic to dead instances; health-eviction is the feature, not the add-on.
- Gateway as singleton — the front door needs the same horizontal scaling and health checks as everything behind it.
- Auth only at the edge — edge authN is necessary, not sufficient; services must still authorize per request (zero-trust instinct: never trust the caller, even the internal one).
- Adopting a mesh because the talk was good — two teams and six services don't need Istio; they need timeouts in a library.
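The "auth only at the edge" mistake is easier to see with the two halves side by side: the gateway answers "who is this" once, and the service still answers "what may this user do". The sketch below uses a toy HMAC-signed token as a stand-in for a JWT so it stays stdlib-only; the token format, the `X-User-Id` header name, the shared `SECRET`, and the `ORDER_OWNERS` data are all assumptions for illustration. A real gateway would use a JWT library and managed keys.

```python
import base64
import hashlib
import hmac
import json
from typing import Dict, Optional

SECRET = b"demo-secret"  # assumption: a shared signing key, hardcoded only for the demo


def sign(claims: dict) -> str:
    """Issue a toy token: base64url(payload) + "." + hex HMAC-SHA256 of the payload."""
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"


def gateway_authenticate(token: str) -> Optional[Dict[str, str]]:
    """Edge authN: verify the signature once and return the identity header to forward."""
    payload, _, sig = token.partition(".")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None  # the gateway would answer 401 and forward nothing
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return {"X-User-Id": claims["sub"]}


ORDER_OWNERS = {"order-1": "alice", "order-2": "bob"}  # hypothetical domain data


def order_service_get(headers: Dict[str, str], order_id: str) -> int:
    """Service-side authZ: trust who the caller is, still decide what they may do.

    Returns an HTTP-style status code.
    """
    user = headers.get("X-User-Id")
    if user is None:
        return 401  # request did not come through the gateway's identity step
    if ORDER_OWNERS.get(order_id) != user:
        return 403  # also covers unknown orders, so existence is not leaked
    return 200
```

With a token from `sign({"sub": "alice"})`, the gateway forwards `{"X-User-Id": "alice"}`; the order service then returns 200 for `order-1` and 403 for `order-2`. The identity header is only trustworthy if nothing can reach the service except through the gateway, which is the same point as the bypass bullet above.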
Interview perspective
Practice
- Build a toy gateway: an Express proxy routing /orders/* and /payments/* to two local services; add JWT verification and a per-IP token bucket at the edge; confirm the services receive a trusted identity header.
- Feel discovery: run two instances of one service; first hardcode the port (kill it; watch failure), then route through a registry file your gateway re-reads: your own crude registration/eviction loop.
- In Kubernetes (or minikube): two Deployments + Services; curl one from the other by DNS name; scale and kill pods and watch traffic stay routed — discovery as ambient platform.
- Design rep: sketch the gateway config for BookMyShow: routes, where the idempotency-key check lives, what rate limits protect seat-locking, and which one endpoint you'd canary first.
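As a starting point for the per-IP token bucket in the toy gateway exercise, here is a minimal sketch of the algorithm itself. It is written in Python rather than as Express middleware, and the class names, parameters, and injectable `clock` are assumptions for illustration; porting it to a middleware function is part of the exercise.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Holds up to `capacity` tokens, refilled continuously at `refill_per_second`."""

    def __init__(self, capacity: float, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic):
        self._capacity = capacity
        self._refill = refill_per_second
        self._clock = clock
        self._tokens = capacity      # start full so a new client may burst
        self._updated = clock()

    def allow(self) -> bool:
        """Spend one token if available; otherwise reject the request."""
        now = self._clock()
        elapsed = now - self._updated
        self._tokens = min(self._capacity, self._tokens + elapsed * self._refill)
        self._updated = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


class PerIpLimiter:
    """One bucket per client IP, created lazily on first request."""

    def __init__(self, capacity: float, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic):
        self._capacity = capacity
        self._refill = refill_per_second
        self._clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, ip: str) -> bool:
        bucket = self._buckets.get(ip)
        if bucket is None:
            bucket = TokenBucket(self._capacity, self._refill, self._clock)
            self._buckets[ip] = bucket
        return bucket.allow()
```

The gateway would call `limiter.allow(client_ip)` before proxying and answer 429 when it returns False. This sketch never removes idle buckets and keeps state in one process; a horizontally scaled gateway would need shared state or per-instance limits, which the sketch does not attempt.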
Next: Distributed Locks — when "only one of us should do this" crosses machine boundaries.