API Gateway & Service Discovery

The front door and the phone book of microservices: edge concerns in one place, how services find each other, and where the mesh fits.

Tags: backend, api-gateway, service-discovery, microservices

Two problems you created by splitting

The moment a monolith becomes services, two navigation problems appear that the monolith never had:

  1. Outside → in: clients can't be expected to know that orders live at one address, payments at another, each with its own auth quirks. Who is the front door? → the API gateway.
  2. Inside → inside: the order service must call the payment service — but instances autoscale, die, and move (the cloud's nature); IP addresses are weather, not geography. How do callers find callees? → service discovery.

The API gateway: one front door

A gateway is a reverse proxy with opinions — the single entry point where edge concerns live exactly once instead of in every service:

What it does, and why there:

  • Routing — path/host-based dispatch to services; clients see one API (the contract), the topology behind it can refactor freely. Versioning and canary routing ("send 5% of traffic to v2") live here too.
  • Authentication at the edge — verify the JWT/session once; pass a trusted identity header inward. Services still authorize (what may this user do — that's domain logic), but they stop re-implementing who is this.
  • Rate limiting & quotas: the rate-limiting algorithms, enforced before traffic touches anything expensive.
  • TLS termination, request logging, body limits, CORS — the unglamorous edge checklist, centralized.
  • Aggregation/BFF — optionally, compose "the mobile home screen" from three service calls so the phone makes one (the BFF pattern, often GraphQL-shaped).
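
The routing and canary bullets above can be sketched in a few lines. This is a minimal illustration, not how any particular gateway implements it: the `Route` and `Gateway` names, the upstream labels, and the longest-prefix rule are all assumptions made for the example.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Route:
    prefix: str                    # path prefix the gateway owns, e.g. "/orders"
    upstreams: dict[str, float]    # hypothetical upstream name -> traffic weight


@dataclass
class Gateway:
    routes: list[Route] = field(default_factory=list)

    def resolve(self, path, rng=None):
        """Return the upstream name for a path, or None if no route matches."""
        rng = rng or random
        # Match whole path segments only, so "/ordersX" does not hit "/orders".
        matches = [r for r in self.routes
                   if path == r.prefix or path.startswith(r.prefix + "/")]
        if not matches:
            return None
        # Longest prefix wins, so a more specific route can override a general one.
        route = max(matches, key=lambda r: len(r.prefix))
        names = list(route.upstreams)
        weights = list(route.upstreams.values())
        # A weighted pick is the canary split: 95/5 sends roughly 5% to v2.
        return rng.choices(names, weights=weights, k=1)[0]


gateway = Gateway([
    Route("/orders", {"orders-v1": 95, "orders-v2": 5}),
    Route("/payments", {"payments-v1": 1}),
])
```

Clients only ever see `/orders` and `/payments`; moving the canary from 5% to 50% is a config change to the weights, which is why the text says to treat gateway config as code.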

Real ones you'll name: NGINX/Kong/Envoy, cloud-native (AWS API Gateway), or Spring Cloud Gateway. The danger to name in the same breath: the gateway sits on every request — keep it thin (translate and route, never business logic), horizontally scaled, and treat its config as code, or it becomes a single point of failure and a single point of politics (ESB scars, historically).

Service discovery: the phone book

Hardcoding payment-svc:8080 works until the third deploy. Discovery has three moving parts: a registry (live address book), a registration flow, and a lookup flow:

  • Registration: instances announce themselves on boot ("payments @ 10.0.3.7:8080") and are removed when unhealthy — via TTL'd heartbeats (the session-registry trick) or active health checks. The removal half is the entire point: a registry of dead instances is worse than none.
  • Client-side discovery: the caller queries the registry and load-balances itself (Netflix's Eureka+Ribbon lineage). Fewer hops, smarter per-client choices — but discovery logic in every codebase.
  • Server-side discovery: the caller hits a stable virtual name; a load balancer/proxy resolves it. Callers stay dumb — the dominant pattern today because platforms provide it: in Kubernetes, http://payments just works — DNS + a virtual IP route to live pods, registration and health-eviction handled by the platform (Level 10 makes discovery ambient).
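
The three moving parts above fit in a small in-memory sketch: a registry that evicts on missed heartbeats, plus a client that looks up and load-balances itself. The class names, the TTL value, and the injectable clock are assumptions for illustration; real registries such as Consul or Eureka add replication, watches, and much more.

```python
import itertools
import time


class Registry:
    """In-memory registry: instances stay listed only while they heartbeat."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so expiry can be exercised in tests
        self._last_seen = {}        # (service, address) -> time of last heartbeat

    def heartbeat(self, service, address):
        # Registration and renewal are the same call: record "seen now".
        self._last_seen[(service, address)] = self.clock()

    def lookup(self, service):
        now = self.clock()
        # Eviction: drop every entry whose last heartbeat is older than the TTL.
        self._last_seen = {key: seen for key, seen in self._last_seen.items()
                           if now - seen <= self.ttl}
        return sorted(addr for (svc, addr) in self._last_seen if svc == service)


class RoundRobinClient:
    """Client-side discovery: query the registry, then balance locally."""

    def __init__(self, registry, service):
        self.registry = registry
        self.service = service
        self._counter = itertools.count()

    def pick(self):
        live = self.registry.lookup(self.service)
        if not live:
            raise LookupError(f"no live instances of {self.service}")
        return live[next(self._counter) % len(live)]
```

An instance would call `heartbeat("payments", "10.0.3.7:8080")` on boot and then on a timer shorter than the TTL; if it dies, it simply stops renewing and disappears from `lookup`. Server-side discovery moves the `RoundRobinClient` half into a proxy or the platform, so callers never see the registry at all.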

Registries you'll name: Consul, etcd, ZooKeeper (the same coordination stores — not a coincidence: discovery is small-but-critical shared state), or the platform's built-in.

The service mesh (one honest paragraph)

Once every service also wants retries, timeouts, mutual TLS, circuit breaking and per-call metrics (the resilience kit), you can either build them into every codebase (×N languages) or push them into a sidecar proxy next to each instance. The fleet of sidecars plus their control plane is a service mesh (Istio, built on Envoy; Linkerd, with its own purpose-built proxy). Mesh = the gateway's concerns, applied to east-west (service-to-service) traffic, uniformly and outside application code. The cost is real operational complexity. The honest take: meshes earn their keep at many-teams/many-services scale, and below that, library-based resilience (or just the platform's basics) is the right call.
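
To make "library-based resilience" concrete, here is a minimal circuit-breaker sketch of the kind a small team might keep in a shared library instead of a sidecar. The class name, thresholds, and half-open behavior are assumptions chosen for the example, not the API of any specific library or mesh.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker refuses a call without trying the upstream."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures    # consecutive failures before opening
        self.reset_after = reset_after      # seconds to fail fast before a retry
        self.clock = clock
        self.failures = 0
        self.opened_at = None               # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("failing fast; upstream recently unhealthy")
            # Cool-down elapsed: half-open, so let this one trial call through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()   # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller wraps each outbound request, for example `breaker.call(fetch_payment, order_id)` with a hypothetical `fetch_payment`. A mesh does the same job in the sidecar, which is the trade described above: identical behavior in every language, paid for with more infrastructure to run.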

Common mistakes

  • Business logic in the gateway: once discounts are computed at the edge, there are two backends, and one of them lives in YAML that nobody's tests cover.
  • Clients bypassing the gateway ("just this once") — every edge control now has a hole; network policy should make bypass impossible, not impolite.
  • Treating the registry as optional hygiene — stale entries mean traffic to dead instances; health-eviction is the feature, not the add-on.
  • Gateway as singleton — the front door needs the same horizontal scaling and health checks as everything behind it.
  • Auth only at the edge — edge authN is necessary, not sufficient; services must still authorize per request (zero-trust instinct: never trust the caller, even the internal one).
  • Adopting a mesh because the talk was good — two teams and six services don't need Istio; they need timeouts in a library.
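
The "auth only at the edge" split is easier to see in code. The sketch below assumes an HS256-signed JWT and two hypothetical header names, `X-User-Id` and `X-User-Roles`; the refund rule is invented for illustration. Header lookups are case-sensitive here for brevity, and a production gateway would use a vetted JWT library rather than hand-rolled verification.

```python
import base64
import hashlib
import hmac
import json
import time

IDENTITY_HEADERS = ("X-User-Id", "X-User-Roles")   # hypothetical header names


def _b64url_decode(part):
    # JWT segments drop base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def verify_hs256(token, secret, now=None):
    """Gateway side (authN): return the claims if signature and expiry check out."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except ValueError as exc:
        raise PermissionError("malformed token") from exc
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        raise PermissionError("unexpected algorithm")
    signed = f"{header_b64}.{payload_b64}".encode()
    expected = hmac.new(secret, signed, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise PermissionError("bad signature")
    current = time.time() if now is None else now
    if not isinstance(claims, dict) or "sub" not in claims:
        raise PermissionError("missing subject")
    if claims.get("exp", 0) <= current:
        raise PermissionError("token expired")
    return claims


def gateway_forward_headers(incoming, secret, now=None):
    """Authenticate once at the edge, then pass a trusted identity inward."""
    token = incoming.get("Authorization", "").removeprefix("Bearer ")
    claims = verify_hs256(token, secret, now)
    # Drop the raw token and any identity headers the client supplied itself.
    forwarded = {k: v for k, v in incoming.items()
                 if k != "Authorization" and k not in IDENTITY_HEADERS}
    forwarded["X-User-Id"] = str(claims["sub"])
    forwarded["X-User-Roles"] = ",".join(claims.get("roles", []))
    return forwarded


def may_refund(headers, order_owner_id):
    """Service side (authZ): the domain still decides what this user may do."""
    roles = headers.get("X-User-Roles", "").split(",")
    return "support" in roles or headers.get("X-User-Id") == order_owner_id
```

The gateway answers "who is this" exactly once; `may_refund` stays in the service because only the service knows who owns the order. The identity headers are trustworthy only if clients cannot reach services directly, which is the bypass mistake listed above.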

Practice

  1. Build a toy gateway: an Express proxy routing /orders/* and /payments/* to two local services; add JWT verification and a per-IP token bucket at the edge; confirm the services receive a trusted identity header.
  2. Feel discovery: run two instances of one service; first hardcode the port (kill it; watch failure), then route through a registry file your gateway re-reads — your own crude registration/eviction loop.
  3. In Kubernetes (or minikube): two Deployments + Services; curl one from the other by DNS name; scale and kill pods and watch traffic stay routed — discovery as ambient platform.
  4. Design rep: sketch the gateway config for BookMyShow: routes, where the idempotency-key check lives, what rate limits protect seat-locking, and which one endpoint you'd canary first.
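
For the per-IP token bucket in exercise 1, one possible starting point is sketched below in Python; the logic should carry over to an Express middleware largely unchanged. The class names, rate, and capacity are illustrative assumptions, and a real gateway would also expire idle buckets so the map does not grow without bound.

```python
import time


class TokenBucket:
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum burst size
        self.clock = clock
        self.tokens = capacity        # start full so the first burst is allowed
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill lazily based on elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class PerIpLimiter:
    """One bucket per client IP, created on first sight of that IP."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self._bucket_args = (rate, capacity, clock)
        self._buckets = {}

    def allow(self, ip):
        if ip not in self._buckets:
            self._buckets[ip] = TokenBucket(*self._bucket_args)
        return self._buckets[ip].allow()
```

At the edge, a request for which `allow(ip)` returns False would be rejected (typically with HTTP 429) before it is proxied to any service.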

Next: Distributed Locks — when "only one of us should do this" crosses machine boundaries.