---
name: load-balancing-patterns
description: When distributing traffic across multiple servers or regions, use this skill to select and configure the appropriate load balancing solution (L4/L7, cloud-managed, self-managed, or Kubernetes ingress) with proper health checks and session management.
---
# Load Balancing Patterns
Distribute traffic across infrastructure using the appropriate load balancing approach, from simple round-robin to global multi-region failover.
## When to Use This Skill
Use load-balancing-patterns when:
- Distributing traffic across multiple application servers
- Implementing high availability and failover
- Routing traffic based on URLs, headers, or geographic location
- Managing session persistence across multiple backends
- Deploying applications to Kubernetes clusters
- Configuring global traffic management across regions
- Implementing zero-downtime deployments (blue-green, canary)
- Selecting between cloud-managed and self-managed load balancers
## Core Load Balancing Concepts
### Layer 4 vs Layer 7
**Layer 4 (L4) - Transport Layer:**
- Routes based on IP address and port (TCP/UDP packets)
- No application data inspection, lower latency, higher throughput
- Protocol agnostic; can preserve the client source IP (for example in pass-through mode or via PROXY protocol)
- Use for: Database connections, video streaming, gaming, financial transactions, non-HTTP protocols
**Layer 7 (L7) - Application Layer:**
- Routes based on HTTP URLs, headers, cookies, request body
- Full application data visibility, SSL/TLS termination, caching, WAF integration
- Content-based routing capabilities
- Use for: Web applications, REST APIs, microservices, GraphQL endpoints, complex routing logic
For detailed comparison including performance benchmarks and hybrid approaches, see `references/l4-vs-l7-comparison.md`.
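
To make the distinction concrete, here is a minimal Python sketch of the two kinds of routing decision. It is an illustration only, not how any particular load balancer is implemented: the backend addresses, pool names, the `x-canary` header, and the function names `l4_pick` and `l7_pick` are all hypothetical.

```python
import hashlib

# Hypothetical backend pool; addresses are placeholders.
BACKENDS = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]

# Hypothetical L7 routing table: (path prefix, pool name), checked in order.
L7_ROUTES = [("/api/", "api-pool"), ("/static/", "static-pool")]


def l4_pick(src_ip, src_port, dst_ip, dst_port, proto="tcp"):
    """L4 decision: hash the connection tuple; the payload is never read."""
    key = f"{proto}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return BACKENDS[int.from_bytes(digest[:8], "big") % len(BACKENDS)]


def l7_pick(path, headers):
    """L7 decision: inspect the parsed HTTP request (path and headers)."""
    if headers.get("x-canary") == "true":
        return "canary-pool"
    for prefix, pool in L7_ROUTES:
        if path.startswith(prefix):
            return pool
    return "web-pool"
```

The L4 function only ever sees addresses and ports, so every packet of a connection maps to the same backend without parsing anything. The L7 function needs a parsed request, which is where the extra latency and the extra routing flexibility both come from.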
### Load Balancing Algorithms
| Algorithm | Distribution Method | Use Case |
|-----------|-------------------|----------|
| **Round Robin** | Sequential | Stateless, similar servers |
| **Weighted Round Robin** | Capacity-based | Different server specs |
| **Least Connections** | Fewest active connections | Long-lived connections |
| **Least Response Time** | Fastest server | Performance-sensitive |
| **IP Hash** | Client IP-based | Session persistence |
| **Resource-Based** | CPU/memory metrics | Varying workloads |
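
The selection logic behind several of these algorithms is small enough to sketch in Python. The classes below are illustrative only (class and method names are hypothetical, and real load balancers add locking, health state, and slow start); the weighted variant uses the "smooth" weighted round robin scheme, which interleaves servers instead of sending bursts to the heaviest one.

```python
import hashlib
import itertools


class RoundRobin:
    """Hand out servers in a fixed rotation."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class WeightedRoundRobin:
    """Smooth weighted round robin over a {server: weight} mapping."""

    def __init__(self, weights):
        self._weights = dict(weights)
        self._current = {server: 0 for server in self._weights}
        self._total = sum(self._weights.values())

    def pick(self):
        # Raise every server's score by its weight, pick the highest,
        # then penalize the winner by the total weight.
        for server, weight in self._weights.items():
            self._current[server] += weight
        chosen = max(self._current, key=self._current.get)
        self._current[chosen] -= self._total
        return chosen


class LeastConnections:
    """Pick the server with the fewest in-flight connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1


def ip_hash(client_ip, servers):
    """Map a client IP to a server deterministically."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

Note that `ip_hash` with a plain modulo remaps most clients whenever the server list changes size; consistent hashing is the usual remedy when that matters.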
### Health Check Types
**Shallow (Liveness):** Is the process alive?
- Endpoint: `/health/live` or `/live`
- Returns: 200 if process running
- Use for: Process monitoring, container health
**Deep (Readiness):** Can the service handle requests?
- Endpoint: `/health/ready` or `/ready`
- Validates: Database, cache, external API connectivity
- Use for: Load balancer routing decisions
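
As a rough sketch of the difference, the two checks can be written as plain Python functions that return a status code and a JSON-serializable body. The function names and the shape of the `checks` mapping are assumptions for illustration; in a real service these would be wired to the `/health/live` and `/health/ready` routes of whatever web framework is in use.

```python
from typing import Callable, Dict, Tuple


def liveness() -> Tuple[int, dict]:
    """Shallow check: if this code runs, the process is alive."""
    return 200, {"status": "alive"}


def readiness(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, dict]:
    """Deep check: run every dependency probe and report 503 if any fails."""
    results = {}
    for name, probe in checks.items():
        try:
            results[name] = bool(probe())
        except Exception:
            # A probe that raises counts as a failed dependency.
            results[name] = False
    ok = all(results.values())
    body = {"status": "ready" if ok else "not_ready", "checks": results}
    return (200 if ok else 503), body
```

Each probe (database ping, cache ping, and so on) should carry its own short timeout so that the readiness endpoint answers well inside the load balancer's health check timeout.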
**Health Check Hysteresis:** Different thresholds for marking up vs down to prevent flapping
- Example: 3 failures to mark down, 2 successes to mark up
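
The hysteresis rule can be sketched as a small state tracker. This is a minimal illustration, not any product's implementation; the class name `HealthState` is hypothetical, and the `fall`/`rise` parameter names are borrowed from HAProxy's server options of the same names.

```python
class HealthState:
    """Track backend health with separate down (fall) and up (rise) thresholds."""

    def __init__(self, fall=3, rise=2, healthy=True):
        self.fall = fall          # consecutive failures needed to mark down
        self.rise = rise          # consecutive successes needed to mark up
        self.healthy = healthy
        self._failures = 0
        self._successes = 0

    def record(self, success):
        """Record one health check result and return the current state."""
        if success:
            self._failures = 0
            self._successes += 1
            if not self.healthy and self._successes >= self.rise:
                self.healthy = True
        else:
            self._successes = 0
            self._failures += 1
            if self.healthy and self._failures >= self.fall:
                self.healthy = False
        return self.healthy
```

Because a single success resets the failure counter (and vice versa), an occasional dropped probe never changes the state; only a sustained run of results does.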
For complete health check implementation patterns, see `references/health-check-strategies.md`.
## Cloud Load Balancers
### AWS Load Balancing
**Application Load Balancer (ALB) - Layer 7:**
- Use for: HTTP/HTTPS applications, microservices, WebSocket
- Features: Path/host/header routing, AWS WAF integration, Lambda targets
- Choose when: Content-based routing needed
**Network Load Balancer (NLB) - Layer 4:**
- Use for: Ultra-low latency (<1ms), TCP/UDP, static IPs, millions of requests per second
- Features: Preserves source IP, TLS termination
- Choose when: Non-HTTP protocols, performance critical
**Global Accelerator - Layer 4 Global:**
- Use for: Multi-region applications, global users, DDoS protection
- Features: Anycast IPs, automatic regional failover
### GCP Load Balancing
**Application LB (L7):** Global HTTPS LB, Cloud CDN integration, Cloud Armor (WAF/DDoS)
**Network LB (L4):** Regional TCP/UDP, pass-through balancing, session affinity
**Cloud Load Balancing:** Single anycast IP, global distribution, backend buckets
### Azure Load Balancing
**Application Gateway (L7):** WAF integration, URL-based routing, SSL termination, autoscaling
**Load Balancer (L4):** Standard SKU (the Basic SKU has been retired), health probes, HA ports
**Traffic Manager (Global):** DNS-based routing (priority, weighted, performance, geographic)
For complete cloud provider configurations and Terraform examples, see `references/cloud-load-balancers.md`.
## Self-Managed Load Balancers
### NGINX
**Best for:** General-purpose HTTP/HTTPS load balancing, web application stacks
**Capabilities:**
- HTTP reverse proxy with multiple algorithms
- TCP/UDP stream load balancing
- SSL/TLS termination
- Passive health checks (open source), active health checks (NGINX Plus)
- Cookie-based sticky sessions (NGINX Plus)
**Basic configuration:**
```nginx
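upstream backend {
    least_conn;                                  # fewest active connections wins
    server backend1.example.com:8080 weight=3;
    server backend2.example.com:8080 weight=2;
    keepalive 32;                                # idle upstream connections kept per worker
}

server {
    listen 80;

    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;                  # needed for upstream keepalive
        proxy_set_header Connection "";          # do not forward "Connection: close"
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}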
```
For complete NGINX patterns and advanced configurations, see `references/nginx-patterns.md`.
### HAProxy
**Best for:** Maximum performance, database load balancing, resource efficiency
**Capabilities:**
- Highest raw throughput, lowest memory footprint
- 10+ load balancing algorithms
- Sophisticated health checks (HTTP, TCP, Redis, MySQL, etc.)
- Cookie or IP-based persistence
**Basic configuration:**
```haproxy
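defaults
    mode http                       # HAProxy defaults to tcp mode
    timeout connect 5s              # example values; tune per workload
    timeout client  30s
    timeout server  30s

frontend http_front
    bind *:80
    default_backend web_servers

backend web_servers
    balance roundrobin
    option httpchk GET /health      # active HTTP health check
    server web1 192.168.1.101:8080 check
    server web2 192.168.1.102:8080 check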
```
For complete HAProxy patterns, see `references/haproxy-patterns.md`.
### Envoy
**Best for:** Microservices, Kubernetes, service mesh integration
**Capabilities:**
- Cloud-native design with dynamic configuration (xDS APIs)
- Circuit breakers, retries, timeouts
- Advanced health checks (TCP, HTTP, gRPC)
- Excellent observability
For complete Envoy patterns, see `references/envoy-patterns.md`.
### Traefik
**Best for:** Docker/Kubernetes environments, dynamic configuration, ease of use
**Capabilities:**
- Automatic service discovery
- Native Kubernetes integration
- Built-in Let's Encrypt support
- Middleware system (auth, rate limiting)
For complete Traefik patterns, see `references/traefik-patterns.md`.
## Kubernetes Ingress Controllers
### Selection Guide
| Controller | Best For | Strengths |
|------------|----------|-----------|
| **NGINX Ingress** (community ingress-nginx or F5 NGINX Ingress Controller) | General purpose | Stability, wide adoption, mature features |
| **Traefik** | Dynamic environments | Easy configuration, service discovery |
| **HAProxy Ingress** | High performance | Advanced L7 routing, reliability |
| **Envoy** (Contour/Gateway) | Service mesh | Rich L7 features, extensibility |
| **Kong** | API-heavy apps | JWT auth, rate limiting, plugins |
| **Cloud Provider** | Single-cloud | Native cloud integration |
### Basic Ingress Example
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    # These annotations are read by the community ingress-nginx controller
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/affinity: "cookie"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - app.example.com
      secretName: app-tls
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 80
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-service
                port:
                  number: 80
```
For complete Kubernetes ingress examples and Gateway API patterns, see `references/kubernetes-ingress.md`.
## Session Persistence
### Sticky Sessions (Use Sparingly)
**Cookie-Based:** Load balancer sets cookie to track server affinity
- Accurate routing, works with NAT/proxies
- HTTP only, adds cookie overhead
**IP Hash:** Hash client IP to select backend server
- No cookie required, works for non-HTTP
- Poor distribution with NAT/proxies
**Drawbacks:** Uneven load distribution, session lost on server failure, complicates scaling
### Shared Session Store (Recommended)
Architecture: Stateless application servers + centralized session storage (Redis, Memcached)
**Benefits:**
- No sticky sessions needed
- True load balancing
- Server failures don't lose sessions
- Horizontal scaling trivial
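
The architecture can be sketched in a few lines of Python. The `SessionStore` class below is an in-memory stand-in for a shared store such as Redis, and `AppServer` is a hypothetical application instance; all names and the default TTL are assumptions made for illustration.

```python
import secrets
import time
from typing import Optional


class SessionStore:
    """In-memory stand-in for a shared session store (e.g. Redis)."""

    def __init__(self, ttl_seconds=1800.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = {}  # session id -> (expires_at, payload)

    def create(self, payload: dict) -> str:
        sid = secrets.token_urlsafe(32)
        self._data[sid] = (self._clock() + self._ttl, dict(payload))
        return sid

    def get(self, sid: str) -> Optional[dict]:
        entry = self._data.get(sid)
        if entry is None:
            return None
        expires_at, payload = entry
        if self._clock() >= expires_at:
            del self._data[sid]  # expired sessions are dropped lazily
            return None
        return payload


class AppServer:
    """Stateless app instance: all session state lives in the shared store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, user):
        return self.store.create({"user": user})

    def whoami(self, sid):
        session = self.store.get(sid)
        return session["user"] if session else None
```

A session created through one `AppServer` is readable through any other instance that shares the store, so the load balancer is free to send each request to any backend.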
### Client-Side Tokens (Best for APIs)
JWT (JSON Web Tokens): Server generates signed token, client stores and sends with requests
**Benefits:**
- Fully stateless servers
- Perfect load balancing
- No session storage needed
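
To show why no server-side state is needed, here is a minimal Python sketch of an HS256-signed token in the JWT compact format, using only the standard library. It is a teaching sketch under stated assumptions: the function names are hypothetical, only the `exp` claim is checked, and a production system should use a maintained JWT library and proper key management instead.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, secret: bytes) -> str:
    """Return header.payload.signature, signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [_b64url(json.dumps(p, separators=(",", ":")).encode())
             for p in (header, claims)]
    signing_input = ".".join(parts)
    sig = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(sig)}"


def verify_token(token: str, secret: bytes, now=None):
    """Return the claims if the signature and expiry are valid, else None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
        expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
    except (ValueError, TypeError):
        return None
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return None
    if header.get("alg") != "HS256":
        return None
    exp = claims.get("exp")
    current = time.time() if now is None else now
    if exp is not None and current >= exp:
        return None
    return claims
```

Any backend that holds the secret can verify the token on its own, which is what lets the load balancer route each request independently. The trade-off is that a token cannot be revoked before it expires without reintroducing some shared state.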
For complete session management patterns and code examples, see `references/session-persistence.md`.
## Global Load Balancing
### GeoDNS Routing
Route users to nearest server based on geographic location:
- DNS returns different IPs based on client location
- Reduces latency, supports compliance and regional content
- Implementation: AWS Route 53, GCP Cloud DNS, Azure Traffic Manager
### Multi-Region Failover
Primary/secondary region configuration:
- Health checks determine primary region health
- Automatic DNS failover to secondary
- Largely transparent to clients, though failover time depends on DNS TTLs and resolver caching
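
The failover decision itself is simple priority ordering, sketched below in Python. The `Region` type, the `resolve` function, and the documentation-range IP addresses are hypothetical; managed DNS services implement this logic for you, and this sketch ignores DNS TTLs and resolver caching.

```python
from typing import Dict, List, NamedTuple


class Region(NamedTuple):
    name: str
    ip: str
    priority: int  # lower number = preferred


def resolve(regions: List[Region], healthy: Dict[str, bool]) -> str:
    """Return the IP of the highest-priority healthy region."""
    ordered = sorted(regions, key=lambda region: region.priority)
    for region in ordered:
        if healthy.get(region.name, False):
            return region.ip
    # Nothing is healthy: fail open to the primary rather than return nothing.
    return ordered[0].ip


REGIONS = [
    Region("us-east", "192.0.2.10", priority=1),
    Region("eu-west", "198.51.100.10", priority=2),
]
```

Failing open when every region reports unhealthy is a design choice in this sketch: a broken health checker then degrades to normal routing instead of taking the whole site offline.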
### CDN Integration
Combine load balancing with CDN:
- GeoDNS routes to closest CDN PoP
- CDN caches content globally
- Origin load balancing for cache misses
For complete global load balancing examples with Terraform, see `references/global-load-balancing.md`.
## Decision Frameworks
### L4 vs L7 Selection
Choose **L4** when:
- Protocol is TCP/UDP (not HTTP)
- Ultra-low latency critical (<1ms)
- High throughput required (millions of requests per second)
- Client source IP preservation needed
Choose **L7** when:
- Protocol is HTTP/HTTPS
- Content-based routing needed (URL, headers)
- SSL termination required
- WAF integration needed
- Microservices architecture
### Cloud vs Self-Managed
Choose **Cloud-Managed** when:
- Single cloud deployment
- Auto-scaling required
- Team lacks load balancer expertise
- Managed service preferred
Choose **Self-Managed** when:
- Multi-cloud or hybrid deployment
- Advanced routing requirements
- Cost optimization important
- Full control needed
- Vendor lock-in avoidance
### Self-Managed Selection
- **NGINX:** General-purpose, web stacks, HTTP/3 support
- **HAProxy:** Maximum performance, database LB, lowest resource usage
- **Envoy:** Microservices, service mesh, dynamic configuration
- **Traefik:** Docker/Kubernetes, automatic discovery, easy configuration
## Configuration Examples
Complete working examples are available in the `examples/` directory:
**Cloud Providers:**
- `examples/aws/alb-terraform.tf` - AWS ALB with path-based routing
- `examples/aws/nlb-terraform.tf` - AWS NLB for TCP load balancing
**Self-Managed:**
- `examples/nginx/http-load-balancing.conf` - NGINX HTTP reverse proxy
- `examples/haproxy/http-lb.cfg` - HAProxy configuration
- `examples/envoy/basic-lb.yaml` - Envoy cluster configuration
- `examples/traefik/kubernetes-ingress.yaml` - Traefik IngressRoute
**Kubernetes:**
- `examples/kubernetes/nginx-ingress.yaml` - NGINX Ingress with TLS
- `examples/kubernetes/traefik-ingress.yaml` - Traefik IngressRoute
- `examples/kubernetes/gateway-api.yaml` - Gateway API configuration
## Monitoring and Observability
### Key Metrics
**Throughput:** Requests per second, bytes transferred, connection rate
**Latency:** Request duration (p50, p95, p99), backend response time, SSL handshake time
**Errors:** HTTP error rates (4xx, 5xx), backend connection failures, health check failures
**Resource Utilization:** CPU, memory, active connections, connection queue depth
**Health:** Healthy/unhealthy backend count, health check success rate
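
As a small worked example of the latency and error metrics, the Python sketch below computes nearest-rank percentiles and a 5xx error rate from raw samples. The function names are hypothetical, and production systems usually derive percentiles from histograms rather than from stored raw samples.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def error_rate(status_codes):
    """Fraction of responses with a 5xx status code."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if 500 <= code <= 599)
    return errors / len(status_codes)
```

Tracking p95 and p99 alongside p50 matters because a single slow backend often leaves the median untouched while the tail grows sharply.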
### Load Balancer Logs
Enable access logs to capture request/response details, client IPs, response times, and errors:
- **AWS ALB:** Store in S3, analyze with Athena
- **NGINX:** Custom log format, ship to centralized logging
- **HAProxy:** Syslog integration, structured logging
## Troubleshooting
### Uneven Load Distribution
**Symptoms:** One server receives disproportionate traffic
**Causes:** Sticky sessions with few clients, IP hash with NAT concentration, long-lived connections
**Solutions:** Switch to least connections, disable sticky sessions, implement connection draining
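
The NAT concentration cause is easy to reproduce in a small Python simulation. The client addresses (documentation ranges), the server names, and the 900/100 split are made up for illustration.

```python
import hashlib
from collections import Counter


def ip_hash(client_ip, servers):
    """Map a client IP to a server deterministically."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def distribution(client_ips, servers):
    """Count how many clients land on each server under IP hash."""
    return Counter(ip_hash(ip, servers) for ip in client_ips)


SERVERS = ["web1", "web2", "web3"]

# 900 users share one corporate NAT address; 100 users have distinct addresses.
CLIENTS = ["203.0.113.7"] * 900 + [f"198.51.100.{i}" for i in range(100)]

counts = distribution(CLIENTS, SERVERS)
```

Whichever server the NAT address hashes to receives at least 900 of the 1,000 clients, regardless of how evenly the remaining 100 spread out. Least connections does not have this failure mode because it ignores the source address entirely.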
### Health Check Flapping
**Symptoms:** Servers rapidly transition between healthy/unhealthy
**Causes:** Health check timeout too short, threshold too low, network instability
**Solutions:** Increase interval and timeout, implement hysteresis, and make sure deep health checks respond well within the timeout
### Session Loss After Failover
**Symptoms:** Users logged out when server fails
**Causes:** Sticky sessions without replication, in-memory sessions
**Solutions:** Implement shared session store (Redis), use client-side tokens (JWT)
## Integration Points
**Related Skills:**
- `infrastructure-as-code` - Deploy load balancers via Terraform/Pulumi
- `kubernetes-operations` - Ingress controllers for K8s traffic management
- `network-architecture` - Network design and topology for load balancing
- `deploying-applications` - Blue-green and canary deployments via load balancers
- `observability` - Load balancer metrics, access logs, distributed tracing
- `security-hardening` - WAF integration, rate limiting, DDoS protection
- `service-mesh` - Envoy as both ingress and service mesh proxy
- `implementing-tls` - TLS termination and certificate management
## Quick Reference
### Selection Matrix
| Use Case | Recommended Solution |
|----------|---------------------|
| HTTP web app (AWS) | ALB |
| Non-HTTP protocol (AWS) | NLB |
| Kubernetes HTTP ingress | NGINX Ingress or Traefik |
| Maximum performance | HAProxy |
| Service mesh | Envoy |
| Docker Swarm | Traefik |
| Multi-cloud portable | NGINX or HAProxy |
| Global distribution | Cloudflare, AWS Global Accelerator |
### Algorithm Selection
| Traffic Pattern | Algorithm |
|-----------------|-----------|
| Stateless, similar servers | Round Robin |
| Stateless, different capacity | Weighted Round Robin |
| Long-lived connections | Least Connections |
| Performance-sensitive | Least Response Time |
| Session persistence needed | IP Hash or Cookie |
| Varying server load | Resource-Based |
### Health Check Configuration
| Service Type | Check Type | Interval | Timeout |
|--------------|------------|----------|---------|
| Web app | HTTP /health | 10s | 3s |
| API | HTTP /health/ready | 10s | 5s |
| Database | TCP connect | 5s | 2s |
| Critical service | HTTP deep check | 5s | 3s |
| Background worker | HTTP /live | 30s | 5s |
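
Interval and timeout combine with the failure threshold to determine how long a dead backend keeps receiving traffic. The Python sketch below estimates the worst case under a simplified model that is an assumption, not a vendor formula: checks start every `interval` seconds, a failed check is only reported after the full `timeout`, and the backend dies just after passing a check.

```python
def worst_case_detection_seconds(interval, timeout, fall=3):
    """Upper bound on time from backend failure to being marked down.

    Model: the failure happens right after a passing check, so the
    fall-th failing check starts fall * interval seconds later and is
    reported as failed after one more timeout.
    """
    if timeout > interval:
        raise ValueError("this model assumes timeout <= interval")
    return fall * interval + timeout


# Rows from the table above, with the 3-failure threshold used earlier.
web_app = worst_case_detection_seconds(interval=10, timeout=3)
database = worst_case_detection_seconds(interval=5, timeout=2)
```

Under this model the web app settings take up to 33 seconds to eject a failed backend and the database settings up to 17 seconds, which is the trade-off to weigh against the extra probe traffic of shorter intervals.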
## Summary
Load balancing is essential for distributing traffic, ensuring high availability, and enabling horizontal scaling. Choose L4 for raw performance and non-HTTP protocols, L7 for intelligent content-based routing. Prefer cloud-managed load balancers for simplicity and auto-scaling, self-managed for multi-cloud portability and advanced features. Implement proper health checks with hysteresis, avoid sticky sessions when possible, and monitor key metrics continuously.
For deployment patterns, see examples in `examples/aws/`, `examples/nginx/`, `examples/kubernetes/`, and other provider directories.