Scaling a High-Traffic Backend

A system handling heavy traffic faced p95 latency spikes, DB contention, and failures under load.

Requirements

- Thousands of RPS
- p95 latency < 200 ms
- 99.9% uptime
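To make these targets concrete, the sketch below converts the 99.9% uptime target into a downtime budget and checks a latency sample against the 200 ms p95 limit. The 30-day window, the nearest-rank percentile method, and the sample values are illustrative assumptions, not measurements from this system.

```python
def downtime_budget_minutes(availability: float, window_days: int = 30) -> float:
    """Minutes of downtime allowed in the window for a given availability target."""
    return (1.0 - availability) * window_days * 24 * 60


def p95(latencies_ms):
    """Nearest-rank 95th percentile of a list of latencies."""
    if not latencies_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(latencies_ms)
    # Integer form of ceil(0.95 * n), which avoids float rounding surprises.
    rank = (95 * len(ordered) + 99) // 100  # 1-based rank
    return ordered[rank - 1]


# Hypothetical sample: 20 request latencies in milliseconds.
sample = [120, 95, 110, 130, 180, 90, 105, 140, 150, 100,
          115, 125, 135, 160, 170, 98, 102, 108, 112, 240]

budget = downtime_budget_minutes(0.999)   # roughly 43.2 minutes per 30 days
meets_target = p95(sample) < 200          # one slow outlier does not break p95
```

Under these assumptions, 99.9% uptime leaves about 43 minutes of downtime per 30-day window, and a single 240 ms outlier in 20 requests still passes the p95 check because only the slowest 5% are excluded.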

Architecture

- Client → API Gateway → Service → Cache → DB
- Async: Service → Queue → Worker → DB

Low-Level Design (LLD)

- GET /orders → cache → DB fallback
- POST /orders → DB → queue → worker
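The two flows can be sketched as a cache-aside read and a write that persists first and then enqueues follow-up work. This is a minimal in-memory illustration: the dictionaries stand in for the cache and DB, `queue.Queue` stands in for the message queue, and the function names, the `status` field, and the cache invalidation in the worker are assumptions rather than details from the original design.

```python
import queue
from typing import Optional

db = {}                      # stand-in for the orders table
cache = {}                   # stand-in for the cache layer
order_queue = queue.Queue()  # stand-in for the message queue


def get_order(order_id: str) -> Optional[dict]:
    """Read path: serve from cache, fall back to the DB on a miss."""
    order = cache.get(order_id)
    if order is not None:
        return order
    order = db.get(order_id)
    if order is not None:
        cache[order_id] = order  # populate so the next read skips the DB
    return order


def create_order(order_id: str, payload: dict) -> dict:
    """Write path: persist first, then hand follow-up work to the queue."""
    order = {"id": order_id, "status": "pending", **payload}
    db[order_id] = order
    order_queue.put(order_id)
    return order


def run_worker_once() -> None:
    """Worker: take one queued order, finish processing, write the result back."""
    order_id = order_queue.get_nowait()
    db[order_id] = {**db[order_id], "status": "processed"}
    cache.pop(order_id, None)  # assumed invalidation so readers do not see stale data
```

Writing to the DB before enqueueing means a queued message always refers to a row that exists; the request returns as soon as the row is stored, and the slower work happens off the request path.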

API

- GET /orders/{id}
- POST /orders
- POST /payments

Failure Handling

- Retries
- Dead-letter queue (DLQ)
- Circuit breaker
- Idempotency
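A rough sketch of how these four mechanisms can fit together is shown below. Everything here is illustrative: the names (`handle_message`, `call_with_breaker`, `idempotency_key`), the thresholds, and the in-memory set and list standing in for an idempotency store and a DLQ are assumptions, not the system's actual implementation.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is rejected without trying."""


class CircuitBreaker:
    """Opens after max_failures consecutive failures; allows a trial call after reset_after seconds."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        # Half-open: let a trial call through once the cool-down has passed.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()


def call_with_breaker(breaker, fn, *args):
    """Fail fast while the breaker is open; otherwise call fn and record the outcome."""
    if not breaker.allow():
        raise CircuitOpenError("downstream marked unhealthy")
    try:
        result = fn(*args)
    except Exception:
        breaker.record_failure()
        raise
    breaker.record_success()
    return result


processed_keys = set()  # idempotency keys already handled (stand-in for a durable store)
dead_letters = []       # stand-in for the DLQ


def handle_message(message, handler, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Process a message at most once per key, retrying with exponential backoff."""
    key = message["idempotency_key"]
    if key in processed_keys:
        return "duplicate"  # already applied; skip side effects
    for attempt in range(1, max_attempts + 1):
        try:
            handler(message)
        except Exception:
            if attempt == max_attempts:
                break
            sleep(base_delay * 2 ** (attempt - 1))
        else:
            processed_keys.add(key)
            return "processed"
    dead_letters.append(message)  # retries exhausted: park for inspection
    return "dead-lettered"
```

Retries absorb transient faults, the DLQ keeps poison messages from blocking the queue, the breaker stops callers from piling onto an unhealthy dependency, and the idempotency key makes a redelivered message (for example a repeated POST /payments) safe to receive twice.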

Improvements

Caching + async queues + DB optimization

Impact

- Latency: 220 → 150 ms
- Better throughput
- Stable system
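As a quick check on the latency figure, the arithmetic below works out the size of the improvement. It assumes the 220 ms and 150 ms values are p95 measurements comparable to the 200 ms requirement, which the notes do not state explicitly.

```python
before_ms, after_ms = 220, 150
target_ms = 200  # p95 target from the requirements

reduction_ms = before_ms - after_ms                       # 70 ms faster
reduction_pct = round(100 * reduction_ms / before_ms, 1)  # about 31.8% lower
headroom_ms = target_ms - after_ms                        # 50 ms under the target
```

Under that assumption, the system moved from 20 ms over the target to 50 ms under it.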
