News Feed Architecture: Push vs Pull Models in Social Media
How Twitter and Instagram fan out billions of posts to users in milliseconds using hybrid architecture patterns.
The Core Challenge of the News Feed
When you open Twitter, Instagram, or Facebook, you expect to instantly see a reverse-chronological (or algorithmically sorted) list of posts from people you follow.
Generating this feed on the fly with a SQL query (a `JOIN` or subquery) across millions of rows does not hold up at scale:
```sql
SELECT *
FROM posts
WHERE user_id IN (SELECT following_id FROM follows WHERE follower_id = MY_ID)
ORDER BY created_at DESC
```
Run on every feed load, this query has to gather and sort posts from every followed account, which overloads the database once traffic grows. So, how do we solve it?
The Two Distinct Approaches
1. The Pull Model (Fan-out on Load)
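When a user opens the app, the system looks up everyone they follow, fetches those accounts' recent posts, merges them, sorts them in memory, and returns the result.
- Pros: Writing a post is extremely fast, `O(1)`: a single insert into the author's own timeline. No storage is spent on duplicated feed entries.
- Cons: Reading the feed is slow, `O(N)` in the number of accounts followed, because every feed load queries and merges N timelines. This is terrible for user experience.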
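The read path described above can be sketched in a few lines. This is a minimal illustration, not a production design: `FOLLOWS`, `TIMELINES`, and `pull_feed` are hypothetical names, and plain dictionaries stand in for the follows table and the per-author post store.

```python
import heapq
from itertools import islice

# Hypothetical in-memory stand-ins for the follows table and per-author timelines.
# Each timeline is a list of (created_at, post_id) tuples, newest first.
FOLLOWS = {"alice": ["bob", "carol"]}
TIMELINES = {
    "bob": [(105, "b2"), (101, "b1")],
    "carol": [(103, "c1")],
}

def pull_feed(user_id, limit=10):
    """Build the feed at read time by merging every followed author's timeline."""
    followed = FOLLOWS.get(user_id, [])
    # One lookup per followed account: this is the O(N) part of the read.
    per_author = [TIMELINES.get(author, []) for author in followed]
    # heapq.merge lazily merges already-sorted inputs; reverse=True keeps newest first.
    merged = heapq.merge(*per_author, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]

print(pull_feed("alice"))  # ['b2', 'c1', 'b1']
```

Note that the work grows with the number of followed accounts on every single feed load, which is exactly the cost the Pull Model pays for its cheap writes.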
2. The Push Model (Fan-out on Write)
When a user publishes a post, the system immediately pushes (copies) that post's ID into the pre-computed "Feed Cache" (a Redis List) of every single person who follows them.
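- Pros: Reading the feed is blazing fast, effectively `O(1)` for a fixed-size page. You just `LRANGE` from Redis.
- Cons: Writing a post is slow and computationally expensive: one write per follower, so `O(F)` for an author with F followers.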
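A rough sketch of the write and read paths follows. To keep it runnable without a Redis server, a `deque` per user stands in for each Redis list; `FOLLOWERS`, `FEEDS`, `publish`, and `read_feed` are hypothetical names chosen for illustration.

```python
from collections import defaultdict, deque

# Hypothetical stand-ins: FOLLOWERS maps an author to their followers, and
# FEEDS plays the role of one Redis list per user (newest post ID at the head).
FOLLOWERS = {"bob": ["alice", "dave"]}
FEEDS = defaultdict(deque)

def publish(author_id, post_id):
    """Fan out on write: copy the post ID into every follower's feed."""
    followers = FOLLOWERS.get(author_id, [])
    for follower in followers:
        FEEDS[follower].appendleft(post_id)  # analogous to LPUSH
    return len(followers)  # number of writes this one post cost

def read_feed(user_id, limit=10):
    """Read the first page of the pre-computed feed (analogous to LRANGE 0 limit-1)."""
    return list(FEEDS[user_id])[:limit]
```

Calling `publish("bob", "b1")` and then `publish("bob", "b2")` costs two writes each, after which `read_feed("alice")` returns `["b2", "b1"]` without touching any other user's data. The read is cheap because the write already did the work.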
The Justin Bieber Problem (Celebrity Accounts)
The Push Model works great for normal users with 500 followers. But what happens when Justin Bieber (with 100 million followers) posts a photo? Pushing that post ID into 100 million different Redis lists will choke your worker servers, delay notifications, and consume massive amounts of memory. This is known as the Celebrity Fan-out Problem.
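A back-of-envelope calculation makes the scale concrete. The throughput and ID size below are assumptions picked for illustration (1 million list writes per second across the worker fleet, 8-byte post IDs, no per-entry Redis overhead); they are not measurements from any real system.

```python
def fanout_seconds(followers, writes_per_second):
    """Time to push one post ID to every follower at a given aggregate write rate."""
    return followers / writes_per_second

def raw_id_bytes(followers, bytes_per_id):
    """Memory for the copied IDs alone, ignoring data-structure overhead."""
    return followers * bytes_per_id

FOLLOWERS = 100_000_000          # follower count from the text
ASSUMED_WRITES_PER_SEC = 1_000_000  # assumption, not a benchmark
ASSUMED_BYTES_PER_ID = 8            # assumption: 64-bit post IDs

print(fanout_seconds(FOLLOWERS, ASSUMED_WRITES_PER_SEC))  # 100.0 seconds for one post
print(raw_id_bytes(FOLLOWERS, ASSUMED_BYTES_PER_ID))      # 800000000 bytes, about 0.8 GB
```

Under these assumptions a single celebrity post occupies the entire write capacity for well over a minute, which is why the last followers in the queue would see the post late.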
The Industry Solution: Hybrid Architecture
Modern social networks use a Hybrid approach to get the best of both worlds.
- For Normal Users (Push): When a normal user posts, workers use the Push model to fan out the post to their followers' pre-computed Redis feeds.
- For Celebrities (Pull): Celebrities (for example, users with more than 100k followers) are flagged in the database. When they post, their post is NOT fanned out. It is simply saved to their own profile timeline.
- The Merge on Read: When you load your feed, the system runs two fetches in parallel and then merges the results:
  - Fetches your pre-computed feed from Redis (containing posts from ordinary friends).
  - Fetches the recent posts of any celebrities you follow.
  - Merges and sorts these two lists in memory at the edge/application layer before returning them to your phone.
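The merge-on-read step might look roughly like the sketch below. All names and data are hypothetical, dictionaries stand in for Redis and the post store, and the two fetches run sequentially here for brevity, whereas a real service would issue them concurrently.

```python
import heapq
from itertools import islice

CELEBRITY_THRESHOLD = 100_000  # follower cutoff mentioned in the text; tune per system

# Hypothetical data. Feed entries are (created_at, post_id) tuples, newest first.
FOLLOWER_COUNTS = {"bob": 500, "popstar": 100_000_000}
FOLLOWS = {"alice": ["bob", "popstar"]}
PRECOMPUTED_FEEDS = {"alice": [(105, "b2"), (101, "b1")]}    # filled by push fan-out
PROFILE_TIMELINES = {"popstar": [(110, "p2"), (102, "p1")]}  # celebrities are not fanned out

def is_celebrity(user_id):
    return FOLLOWER_COUNTS.get(user_id, 0) > CELEBRITY_THRESHOLD

def hybrid_feed(user_id, limit=10):
    """Merge the pushed feed with timelines pulled from followed celebrities."""
    pushed = PRECOMPUTED_FEEDS.get(user_id, [])
    pulled = [
        PROFILE_TIMELINES.get(author, [])
        for author in FOLLOWS.get(user_id, [])
        if is_celebrity(author)
    ]
    # All inputs are already newest-first, so a lazy k-way merge is enough.
    merged = heapq.merge(pushed, *pulled, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]

print(hybrid_feed("alice"))  # ['p2', 'b2', 'p1', 'b1']
```

The pull side stays cheap because most users follow only a handful of accounts above the threshold, so the merge touches a few timelines rather than hundreds.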
Data Storage Strategy
- Post Content: Stored in a highly scalable NoSQL database like Cassandra or DynamoDB.
- Feed Caches: Stored in Redis (Lists or Sorted Sets). Since users rarely scroll past the first few hundred posts, the Redis list is usually capped at ~500 item IDs. Anything older falls back to the database.
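The cap-and-fallback behaviour can be sketched as follows. In Redis the cap is commonly enforced with `LPUSH` followed by `LTRIM`; here a bounded `deque` plays that role so the example runs standalone. `CappedFeed` and the `load_older` callback are hypothetical names, with `load_older(offset, count)` standing in for a query against the post database.

```python
from collections import deque

FEED_CAP = 500  # cap from the text

class CappedFeed:
    """Newest-first list of post IDs that never grows past `cap` entries."""

    def __init__(self, cap=FEED_CAP):
        # With maxlen set, appendleft on a full deque drops the oldest ID from the tail.
        self.ids = deque(maxlen=cap)

    def push(self, post_id):
        self.ids.appendleft(post_id)

    def page(self, offset, limit, load_older):
        """Serve a page from the cache, falling back to the database past the cap."""
        cached = list(self.ids)[offset:offset + limit]
        if len(cached) < limit:
            # Cache exhausted: ask the database for the remainder of the page.
            start = max(offset, len(self.ids))
            cached += load_older(start, limit - len(cached))
        return cached
```

With a cap of 3 and posts 1 through 5 pushed in order, the cache holds `[5, 4, 3]`; a page that starts at offset 2 and asks for three items returns the cached `3` plus two items from `load_older`.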
System Design Interview Tip
If an interviewer asks you to design Twitter, identifying the "Celebrity Problem" and proposing the Hybrid Model is often the difference between a Hire and a Strong Hire.
Practice articulating this trade-off clearly using the Twitter Design scenario on EngMock.com. Our AI will evaluate your ability to identify the bottleneck and propose the Hybrid solution dynamically.
