Load Balancing Basics

Spreading traffic across multiple servers. Done right, it's invisible. Done wrong, it's the outage.


Layers

L4 (TCP/UDP)
Routes raw bytes by IP + port. Fast, simple. AWS NLB, HAProxy TCP mode.
L7 (HTTP)
Understands requests — route by host / path / header / cookie. AWS ALB, nginx, Cloudflare, Envoy.
DNS-based
Weighted DNS answers across regions. Coarse — resolvers cache answers, so shifts take minutes to propagate. Use for geo-routing, not fine-grained load balancing.
Anycast
Same IP in multiple datacenters; routers pick closest. Cloudflare, AWS Global Accelerator.
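The L7 layer is the one you usually script against. A minimal sketch of host/path routing — the route table and `pick_backend` are hypothetical names for illustration, not any real LB's API:

```python
# Hypothetical L7 routing table: pick a backend pool by Host header and
# path prefix. First match wins, so specific prefixes go before "/".
ROUTES = [
    ("api.example.com", "/v1/", "api-pool"),
    ("api.example.com", "/",    "api-legacy-pool"),
    ("www.example.com", "/",    "web-pool"),
]

def pick_backend(host, path):
    for route_host, prefix, pool in ROUTES:
        if host == route_host and path.startswith(prefix):
            return pool
    return None  # no route: the LB would return 404 / a default pool

print(pick_backend("api.example.com", "/v1/users"))  # api-pool
```

An L4 balancer can't do this — it never sees the Host header or path, only the TCP 4-tuple.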

Algorithms

Round-robin
Next server in rotation. Ignores load — fine for stateless, uniform work.
Least connections
Send to server with fewest active. Better for long-lived or variable-latency work.
Weighted
Bigger boxes get more traffic. Useful during migrations (shift 10% → new fleet).
IP hash / sticky
Same client always routes to the same server. Needed for in-memory sessions, but kills rebalancing.
Consistent hash
Keys (e.g. user_id) map to same server even when adding/removing. Good for caches.
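Consistent hashing is the least obvious of these, so here's a minimal ring sketch (virtual nodes included; class and names are illustrative, not a production implementation):

```python
import bisect
import hashlib

def _hash(s):
    # Stable hash -> position on the ring (Python's hash() is per-process).
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, servers, vnodes=100):
        # Each server gets `vnodes` points on the ring so keys spread evenly.
        self._points = sorted(
            (_hash(f"{srv}#{i}"), srv) for srv in servers for i in range(vnodes)
        )
        self._keys = [p for p, _ in self._points]

    def lookup(self, key):
        # First point clockwise of the key's hash, wrapping at the end.
        i = bisect.bisect(self._keys, _hash(key)) % len(self._keys)
        return self._points[i][1]

ring = Ring(["cache-a", "cache-b", "cache-c"])
server = ring.lookup("user:42")  # same answer on every call
```

The payoff: removing one of three servers remaps only the keys that hashed to its points (~1/3), where a plain `hash(key) % n` would remap almost everything.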

Health checks

  • Active: LB probes `/health` every few seconds. Fast failover, wastes a few requests/s.
  • Passive: LB watches real traffic for errors/latency. No overhead, slower to react.
  • Health endpoint should check the things that could break — DB connection, required downstream APIs.
  • But not TOO deep — a failing 3rd-party shouldn't take your whole fleet out of rotation.
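One way to thread that needle is to report third-party failures as "degraded" without failing the check. A sketch — `check_db` and `check_payment_api` are placeholders for whatever your service actually depends on:

```python
# "Deep enough" health check: 503 only for failures that make this instance
# useless (DB down); third-party trouble is surfaced but stays HTTP 200, so
# the LB keeps the instance in rotation.
def health(check_db, check_payment_api):
    if not check_db():
        return 503, {"status": "down", "db": "fail"}
    body = {"status": "ok", "db": "ok"}
    if not check_payment_api():
        body["status"] = "degraded"    # visible to monitoring/alerting...
        body["payment_api"] = "fail"   # ...but not to the LB's pass/fail view
    return 200, body

status, body = health(lambda: True, lambda: False)
# status == 200, body["status"] == "degraded"
```

The LB only looks at the status code; humans and dashboards get the detail from the body.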

Common failure modes

  • Thundering herd on restart — 10 servers + 10k connections + rolling deploy = every survivor gets slammed. Solution: slow-start, connection draining.
  • No timeouts on LB → slow backend = LB exhausts connections → LB dies.
  • Sticky sessions + one dead server = those users are logged out. Use a shared session store instead.
  • Missing `X-Forwarded-For` / `X-Real-IP` trust config — all your users appear to come from the LB's IP.
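The last pitfall has a standard fix: walk `X-Forwarded-For` right to left and skip the hops you trust. A sketch using the stdlib `ipaddress` module — the `TRUSTED` ranges are placeholders for your own LB/proxy networks:

```python
import ipaddress

# Networks whose appended XFF entries we believe (our own LBs/proxies).
TRUSTED = [ipaddress.ip_network("10.0.0.0/8"),
           ipaddress.ip_network("127.0.0.0/8")]

def client_ip(xff_header, peer_addr):
    # Chain of hops, oldest first; the direct TCP peer goes last.
    hops = [h.strip() for h in xff_header.split(",")] + [peer_addr]
    # Right to left: the first untrusted address is the real client as seen
    # by our first trusted proxy. Anything left of it is client-supplied
    # and spoofable.
    for hop in reversed(hops):
        ip = ipaddress.ip_address(hop)
        if not any(ip in net for net in TRUSTED):
            return hop
    return hops[0]  # entire chain was our own proxies

print(client_ip("203.0.113.9, 10.0.0.5", "10.0.0.7"))  # 203.0.113.9
```

Note what this buys you: a client sending a forged `X-Forwarded-For: 1.2.3.4` just adds an untrusted entry further left, which the right-to-left scan never reaches.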

Further reading