Understanding Scalability: From 0 to 10 Million Users | EngMock Blog

The Single Server Setup

Every great startup begins with a single server. Your web app, database, and background workers all live on one machine. It's simple, cheap, and easy to deploy. But what happens when you go viral?

Stage 1: Separate the Database

The first bottleneck is usually the database fighting your web server for memory and CPU. Moving the database to a dedicated instance lets you scale each tier independently. You can now vertically scale (upgrade) your DB server without touching the web tier.

Stage 2: Load Balancing & Horizontal Scaling

Eventually, one web server isn't enough. You add a Load Balancer (Nginx, ALB) and run multiple web servers behind it, so requests are distributed across them. This introduces a new problem: Session State. Consecutive requests from the same user can land on different servers, so you can no longer keep sessions in one server's memory; you need a shared store like Redis.
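
To make the session problem concrete, here is a minimal sketch in Python. Everything in it is hypothetical: SessionStore is an in-memory stand-in for a shared store like Redis, and RoundRobinBalancer imitates only the simplest thing a load balancer does. It is an illustration, not how Nginx or ALB work internally.

```python
import itertools


class SessionStore:
    """Hypothetical in-memory stand-in for a shared store such as Redis."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        # Return a copy, the way a remote store hands back a fresh value.
        return dict(self._data.get(session_id, {}))

    def set(self, session_id, session):
        self._data[session_id] = dict(session)


class WebServer:
    """A web server that keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, session_id):
        session = self.store.get(session_id)
        session["visits"] = session.get("visits", 0) + 1
        self.store.set(session_id, session)
        return self.name, session["visits"]


class RoundRobinBalancer:
    """Sends each request to the next server in turn."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def route(self, session_id):
        return next(self._cycle).handle(session_id)


store = SessionStore()
balancer = RoundRobinBalancer([WebServer("web-1", store), WebServer("web-2", store)])

# The same user hits alternating servers, yet the visit count keeps growing.
results = [balancer.route("session-abc") for _ in range(3)]
print(results)
```

Because both servers read and write the same store, it doesn't matter which one the balancer picks. If each WebServer kept its own dictionary instead, the visit count would reset every time the user switched servers.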

Stage 3: Database Replication

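Your database is now the bottleneck again. Most apps are read-heavy. With Master-Slave replication, you send all writes to the Master and spread reads across Slaves (Read Replicas). This relieves pressure on the Master. Replication is usually asynchronous, so a Slave can lag slightly behind; reads that must see a just-written value may still need to go to the Master.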
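
The read/write split can be sketched as a small router that sits in front of your connections. This is a simplified illustration under stated assumptions: FakeDB is a hypothetical stand-in for a real database connection, the write detection is a naive check on the first SQL keyword, and the require_fresh flag is an invented name for "this read must not hit a lagging replica".

```python
import random


class FakeDB:
    """Hypothetical stand-in for a database connection."""

    def __init__(self, name):
        self.name = name
        self.executed = []

    def execute(self, sql):
        self.executed.append(sql)
        return self.name  # Report which node served the statement.


class ReplicatedRouter:
    """Routes writes to the master and reads to a randomly chosen replica."""

    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE")

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = list(replicas)

    def execute(self, sql, require_fresh=False):
        # Naive classification: real SQL needs more care (e.g. transactions).
        is_write = sql.lstrip().upper().startswith(self.WRITE_VERBS)
        if is_write or require_fresh or not self.replicas:
            target = self.master
        else:
            target = random.choice(self.replicas)
        return target.execute(sql)


master = FakeDB("master")
replicas = [FakeDB("replica-1"), FakeDB("replica-2")]
router = ReplicatedRouter(master, replicas)

write_node = router.execute("INSERT INTO users (name) VALUES ('ada')")
read_node = router.execute("SELECT name FROM users")
fresh_node = router.execute("SELECT name FROM users", require_fresh=True)
print(write_node, read_node, fresh_node)
```

The require_fresh escape hatch is a design choice, not a requirement: it covers cases like showing a user their own profile right after they edited it, where a slightly stale replica would look like a bug.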

Stage 4: Caching

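The fastest query is the one you don't make. Implementing caching at multiple layers (a CDN for static assets, Redis for the results of frequent database queries) drastically reduces latency and load. The trade-off is staleness: cached data needs an expiry time or explicit invalidation when the underlying data changes.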
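
One common way to cache query results is the cache-aside pattern: check the cache first, fall back to the database on a miss, then store the result with an expiry. The sketch below assumes a hypothetical TTLCache class standing in for Redis and a made-up load_user_from_db function standing in for a real query.

```python
import time


class TTLCache:
    """Hypothetical in-memory stand-in for a cache like Redis, with expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # Expired: treat as a miss.
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self.clock() + self.ttl)


db_calls = {"count": 0}  # Tracks how often the "database" is actually hit.


def load_user_from_db(user_id):
    """Made-up slow query; a real app would run SQL here."""
    db_calls["count"] += 1
    return {"id": user_id, "name": f"user-{user_id}"}


cache = TTLCache(ttl_seconds=60)


def get_user(user_id):
    """Cache-aside read: try the cache, fall back to the database on a miss."""
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is None:
        user = load_user_from_db(user_id)
        cache.set(key, user)
    return user


first = get_user(42)   # Miss: goes to the database.
second = get_user(42)  # Hit: served from the cache.
print(first == second, db_calls["count"])
```

The TTL of 60 seconds is an arbitrary value for illustration. Shorter TTLs mean fresher data and more database load; longer TTLs mean the opposite.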

Stage 5: Sharding

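When your data volume or write load exceeds what a single server can handle, you must shard (partition) your database across multiple servers. This is complex: queries and transactions that span shards get harder, and moving data between shards later is painful. You might shard by UserID range or by a hash of the UserID. Avoid this until absolutely necessary.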
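
The two strategies boil down to a function that maps a UserID to a shard number. Here is a rough sketch of both, with assumed values throughout: four shards, range boundaries at every million users, and SHA-256 as the hash (any stable hash would do).

```python
import bisect
import hashlib

NUM_SHARDS = 4  # Assumed shard count for illustration.

# Exclusive upper bounds for range sharding (assumed values):
# shard 0 holds IDs below 1,000,000, shard 1 holds 1,000,000 to 1,999,999,
# shard 2 holds 2,000,000 to 2,999,999, and shard 3 holds everything above.
RANGE_UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]


def shard_by_hash(user_id, num_shards=NUM_SHARDS):
    """Hash sharding: spreads IDs evenly, but neighbors land on different shards."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_by_range(user_id, upper_bounds=RANGE_UPPER_BOUNDS):
    """Range sharding: keeps neighboring IDs together on the same shard."""
    return bisect.bisect_right(upper_bounds, user_id)


for uid in (999_999, 1_000_000, 2_500_000, 5_000_000):
    print(uid, "range ->", shard_by_range(uid), "hash ->", shard_by_hash(uid))
```

Range sharding makes range scans cheap but can create hot shards (for example, when the newest users are the most active). Hash sharding balances load better, but with a plain modulo like this one, changing the shard count remaps most keys, which is part of why resharding is so costly.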

Summary

Scalability isn't a feature you add at the end; it's an evolution. Don't over-engineer early, but know the path ahead.