Scaling High Traffic Backend
A backend handling heavy traffic suffered p95 latency spikes, database contention, and failures under load.
Requirements
- Thousands of requests per second (RPS)
- p95 latency < 200 ms
- 99.9% uptime
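To make the targets above concrete, here is a small sketch of the arithmetic behind them. The function names and the sample latencies are illustrative assumptions, not part of the original system; the p95 uses the nearest-rank method, which is one of several common percentile definitions.

```python
def downtime_budget_minutes(availability: float, period_days: float) -> float:
    """Minutes of downtime allowed over period_days at the given availability."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1.0 - availability)


def p95(samples_ms: list) -> float:
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer math
    return ordered[rank - 1]


monthly = downtime_budget_minutes(0.999, 30)   # 30-day month
yearly = downtime_budget_minutes(0.999, 365)   # non-leap year
print(f"{monthly:.1f} min/month, {yearly / 60:.2f} h/year")

# Hypothetical samples: 100 requests taking 101..200 ms.
latencies = list(range(101, 201))
meets_target = p95(latencies) < 200
print(p95(latencies), meets_target)
```

At 99.9% the budget is roughly 43 minutes per 30-day month, so a single long incident can consume it.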
Architecture
Client → API Gateway → Service → Cache → DB
Async:
Service → Queue → Worker → DB
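The async path above can be sketched with the standard library alone. This is a minimal illustration, assuming an in-process queue.Queue in place of a real message broker and a dict in place of the database; submit, worker, and the job fields are hypothetical names.

```python
import queue
import threading

db = {}                 # stand-in for the database
jobs = queue.Queue()    # stand-in for the message queue


def worker() -> None:
    """Drain jobs and persist each one; None is the shutdown sentinel."""
    while True:
        job = jobs.get()
        try:
            if job is None:
                return
            db[job["order_id"]] = {**job, "status": "processed"}
        finally:
            jobs.task_done()  # runs for the sentinel as well


def submit(order_id: str, amount: int) -> None:
    """Service side: enqueue and return without waiting on the DB write."""
    jobs.put({"order_id": order_id, "amount": amount})


thread = threading.Thread(target=worker, daemon=True)
thread.start()
for i in range(3):
    submit(f"o-{i}", 100 + i)
jobs.put(None)   # ask the worker to stop after the queued jobs
jobs.join()      # block until every queued item has been handled
thread.join()
print(sorted(db))
```

The request handler only pays for the enqueue, which is why moving slow writes behind a queue tends to help tail latency.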
Low-Level Design (LLD)
GET /orders/{id} → cache → DB fallback on a miss
POST /orders → DB → queue → worker
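The two flows above can be sketched as a cache-aside read and a write that persists first and then enqueues follow-up work. Everything here is an assumption for illustration: the in-memory dicts, the 30-second TTL, invalidate-on-write, and the names get_order, create_order, and run_worker_once.

```python
import time
import uuid
from collections import deque

CACHE_TTL_SECONDS = 30.0   # assumed TTL, not from the original system
db = {}                    # stand-in for the database
cache = {}                 # order_id -> (expires_at, order)
work_queue = deque()       # stand-in for the message queue
db_reads = 0               # counts cache misses that reached the DB


def get_order(order_id):
    """GET path: serve from cache, fall back to the DB on a miss or expiry."""
    global db_reads
    now = time.monotonic()
    entry = cache.get(order_id)
    if entry is not None and entry[0] > now:
        return entry[1]
    db_reads += 1
    order = db.get(order_id)
    if order is not None:
        cache[order_id] = (now + CACHE_TTL_SECONDS, order)
    return order


def create_order(payload):
    """POST path: write to the DB, then enqueue the order for the worker."""
    order_id = str(uuid.uuid4())
    order = {"id": order_id, "status": "pending", **payload}
    db[order_id] = order
    work_queue.append(order_id)
    return order


def run_worker_once():
    """Worker: process one queued order and invalidate its cache entry."""
    order_id = work_queue.popleft()
    db[order_id] = {**db[order_id], "status": "processed"}
    cache.pop(order_id, None)  # next read reloads the updated row


order = create_order({"item": "book"})
first = get_order(order["id"])    # miss: reads the DB and fills the cache
second = get_order(order["id"])   # hit: served from the cache
run_worker_once()
third = get_order(order["id"])    # miss again after invalidation
print(first["status"], third["status"], db_reads)
```

Invalidating on write keeps the sketch simple; a real deployment would also need to decide how much staleness the TTL may allow.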
API
GET /orders/{id}
POST /orders
POST /payments
Failure Handling
Retries • Dead-letter queue (DLQ) • Circuit breaker • Idempotency
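One way the four mechanisms above can fit together in a message handler is sketched below. It is a hedged illustration, not the original implementation: CircuitBreaker, handle, the thresholds, the delays, and the idempotency_key field are all hypothetical, and the state lives in memory rather than in a shared store.

```python
import time


class CircuitBreaker:
    """Opens after `threshold` consecutive failures; allows a trial call
    once `cooldown` seconds have passed."""

    def __init__(self, threshold=3, cooldown=5.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        return time.monotonic() - self.opened_at >= self.cooldown

    def record(self, ok: bool) -> None:
        if ok:
            self.failures, self.opened_at = 0, None
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()


processed = {}      # idempotency key -> stored result
dead_letters = []   # stand-in for the DLQ


def handle(message, operation, breaker, max_attempts=3, base_delay=0.01):
    key = message["idempotency_key"]
    if key in processed:
        return processed[key]       # duplicate delivery: replay the result
    for attempt in range(max_attempts):
        if not breaker.allow():
            break                   # fail fast while the circuit is open
        try:
            result = operation(message)
        except Exception:
            breaker.record(False)
            if attempt < max_attempts - 1:
                time.sleep(base_delay * 2 ** attempt)  # exponential backoff
            continue
        breaker.record(True)
        processed[key] = result
        return result
    dead_letters.append(message)    # retries exhausted: park for inspection
    return None


calls = {"n": 0}


def flaky_charge(msg):
    """Hypothetical downstream call that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return f"charged:{msg['idempotency_key']}"


def always_fail(msg):
    raise ConnectionError("downstream unavailable")


breaker = CircuitBreaker(threshold=5)
r1 = handle({"idempotency_key": "pay-1"}, flaky_charge, breaker)
r2 = handle({"idempotency_key": "pay-1"}, flaky_charge, breaker)  # duplicate
r3 = handle({"idempotency_key": "pay-2"}, always_fail, breaker)
print(r1, r2, r3, len(dead_letters))
```

The duplicate delivery returns the stored result without calling the downstream again, which is the property that makes retries safe for operations like POST /payments.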
Improvements
Caching + async queues + DB optimization
Impact
Latency: 220 ms → 150 ms (about 32% lower)
Better throughput
Stable system