
Design Twitter

Last Updated: March 31, 2026

Difficulty: Medium · 4 min read

Understanding the Problem

We are building a simplified social media feed system. There are four operations: posting a tweet, following someone, unfollowing someone, and retrieving a news feed. The interesting operation is getNewsFeed because it needs to merge tweets from multiple users and return the 10 most recent ones.

Think of it this way. Each user has their own timeline of tweets (ordered by when they were posted). When we call getNewsFeed, we need to look at the timelines of the user and everyone they follow, then pick the 10 most recent tweets across all those timelines. This is essentially a "merge K sorted lists" problem, where K is the number of users whose tweets we care about, and each list is sorted by recency.

The key observation is that this merging step is the bottleneck. The follow/unfollow and posting operations are straightforward, but efficiently merging multiple sorted tweet lists to extract the top 10 is where the real design challenge lies.

Key Constraints:

  • 1 <= userId, followerId, followeeId <= 500 -> There are at most 500 users. This is small enough that even storing a follow set per user is fine.
  • 0 <= tweetId <= 10^4 -> Tweet IDs are small, but we can have up to 30,000 total tweets since each call can be a postTweet.
  • At most 3 * 10^4 calls -> With 30,000 operations, we have some flexibility. Even an O(n) scan per getNewsFeed call where n is total tweets would be at most 30,000 * 30,000 = 900 million in the absolute worst case, which is too slow. But since we only need the top 10, we can be smarter.
  • A user cannot follow themselves -> We still include the user's own tweets in their feed, but we don't need to handle self-follow logic.

Approach 1: Brute Force (Collect and Sort)

Intuition

The most straightforward approach: when getNewsFeed is called, gather all tweets from the user and everyone they follow into one big list, sort that list by recency, and return the first 10.

For the other operations, postTweet just appends to a user's tweet list, follow adds to a set, and unfollow removes from a set. Nothing fancy.

The sorting step is where the cost lives. If a user follows many people and those people have posted many tweets, we are sorting a potentially large list every single time someone checks their feed.

Algorithm

  1. Maintain a Map<Integer, List<int[]>> where each user maps to their list of tweets. Each tweet is stored as [timestamp, tweetId].
  2. Maintain a Map<Integer, Set<Integer>> for follow relationships.
  3. Keep a global timestamp counter, incremented on every postTweet call.
  4. For postTweet(userId, tweetId): append [timestamp, tweetId] to the user's tweet list.
  5. For follow(followerId, followeeId): add followeeId to followerId's follow set.
  6. For unfollow(followerId, followeeId): remove followeeId from followerId's follow set.
  7. For getNewsFeed(userId): collect all tweets from the user and everyone in their follow set into one list. Sort by timestamp descending. Return the first 10 tweet IDs.

Example Walkthrough

Step 1/8 — Twitter(): initialize an empty tweetMap (userId -> list of [timestamp, tweetId]) and an empty followMap (userId -> set of followeeIds). The constructor produces no output (null).

Code
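Below is a minimal Java sketch of the brute-force steps above. The field and method names (`tweets`, `following`, and so on) are illustrative choices, not an official reference solution; the structure follows the two maps and global timestamp counter described in the algorithm.

```java
import java.util.*;

// Brute force: collect every candidate tweet, sort by recency, return the top 10.
class Twitter {
    private int timestamp = 0;
    // userId -> list of [timestamp, tweetId], appended in chronological order
    private final Map<Integer, List<int[]>> tweets = new HashMap<>();
    // followerId -> set of followeeIds
    private final Map<Integer, Set<Integer>> following = new HashMap<>();

    public void postTweet(int userId, int tweetId) {
        tweets.computeIfAbsent(userId, k -> new ArrayList<>())
              .add(new int[]{timestamp++, tweetId});
    }

    public List<Integer> getNewsFeed(int userId) {
        // Relevant users: the user themselves plus everyone they follow.
        Set<Integer> sources = new HashSet<>(following.getOrDefault(userId, Set.of()));
        sources.add(userId);

        List<int[]> candidates = new ArrayList<>();
        for (int source : sources) {
            candidates.addAll(tweets.getOrDefault(source, List.of()));
        }
        candidates.sort((a, b) -> b[0] - a[0]);   // newest first

        List<Integer> feed = new ArrayList<>();
        for (int i = 0; i < Math.min(10, candidates.size()); i++) {
            feed.add(candidates.get(i)[1]);
        }
        return feed;
    }

    public void follow(int followerId, int followeeId) {
        following.computeIfAbsent(followerId, k -> new HashSet<>()).add(followeeId);
    }

    public void unfollow(int followerId, int followeeId) {
        Set<Integer> followees = following.get(followerId);
        if (followees != null) {
            followees.remove(followeeId);
        }
    }
}
```

Sorting the full candidate list costs O(n log n) per getNewsFeed call, where n is the total number of candidate tweets, even though we only keep 10 of them.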

The bottleneck is in getNewsFeed. We collect every single tweet from the user and all their followees, then sort the entire collection, just to return 10 tweets. What if we could take advantage of the fact that each user's tweets are already in chronological order, and merge them efficiently without sorting everything?

Approach 2: Heap-Based Merge (Optimal)

Intuition

Here is the key insight: each user's tweet list is already sorted by time (newest at the end, since we append in order). So getNewsFeed is really asking us to merge K sorted lists and return the top 10 elements. This is the classic "merge K sorted lists" pattern, and the tool for that job is a max-heap (priority queue).

Instead of dumping all tweets into one pile and sorting, we start by putting only the most recent tweet from each relevant user into the heap. We pop the most recent tweet overall, add it to our result, and then push that user's next most recent tweet into the heap. We repeat this until we have 10 tweets or the heap is empty.

This way, we examine at most K + 10 tweets in total (one initial tweet per relevant user, plus one more for each of the up to 10 pops), rather than every tweet those users have ever posted. The heap ensures we always pick the globally most recent tweet across all users in O(log K) time per operation, where K is the number of users in the feed.

Algorithm

  1. Maintain a Map<Integer, List<int[]>> for tweets (each entry is [timestamp, tweetId]), a Map<Integer, Set<Integer>> for follow relationships, and a global timestamp counter.
  2. postTweet: append [timestamp++, tweetId] to the user's tweet list.
  3. follow: add followeeId to followerId's set.
  4. unfollow: remove followeeId from followerId's set.
  5. getNewsFeed:
    • Build a list of relevant users: the user themselves plus everyone they follow.
    • For each relevant user who has tweets, push their most recent tweet into a max-heap. The heap entry is [timestamp, tweetId, userId, index], where index points to the position in that user's tweet list.
    • Pop from the heap up to 10 times. Each time we pop, add the tweetId to the result, then push the same user's next most recent tweet (if it exists) into the heap.

Example Walkthrough

Step 1/8 — Twitter(): initialize an empty tweetMap (userId -> list of [timestamp, tweetId]) and an empty followMap (userId -> set of followeeIds). The heap is not used until getNewsFeed is called. The constructor produces no output (null).

Code
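Here is a Java sketch of the heap-based merge. As in the brute-force version, the names are illustrative; the heap entries follow the [timestamp, tweetId, userId, index] layout described in the algorithm, where index points into that user's tweet list.

```java
import java.util.*;

// Heap-based merge: seed the heap with each user's newest tweet,
// then repeatedly pop the global newest and push that user's next one.
class Twitter {
    private int timestamp = 0;
    // userId -> list of [timestamp, tweetId], appended in chronological order
    private final Map<Integer, List<int[]>> tweets = new HashMap<>();
    // followerId -> set of followeeIds
    private final Map<Integer, Set<Integer>> following = new HashMap<>();

    public void postTweet(int userId, int tweetId) {
        tweets.computeIfAbsent(userId, k -> new ArrayList<>())
              .add(new int[]{timestamp++, tweetId});
    }

    public List<Integer> getNewsFeed(int userId) {
        // Max-heap ordered by timestamp; entries are [timestamp, tweetId, userId, index].
        PriorityQueue<int[]> heap = new PriorityQueue<>((a, b) -> b[0] - a[0]);

        Set<Integer> sources = new HashSet<>(following.getOrDefault(userId, Set.of()));
        sources.add(userId);
        for (int source : sources) {
            List<int[]> list = tweets.get(source);
            if (list != null && !list.isEmpty()) {
                int i = list.size() - 1;          // most recent tweet is at the end
                int[] t = list.get(i);
                heap.offer(new int[]{t[0], t[1], source, i});
            }
        }

        List<Integer> feed = new ArrayList<>();
        while (!heap.isEmpty() && feed.size() < 10) {
            int[] top = heap.poll();
            feed.add(top[1]);
            int next = top[3] - 1;                // that user's next most recent tweet
            if (next >= 0) {
                int[] t = tweets.get(top[2]).get(next);
                heap.offer(new int[]{t[0], t[1], top[2], next});
            }
        }
        return feed;
    }

    public void follow(int followerId, int followeeId) {
        following.computeIfAbsent(followerId, k -> new HashSet<>()).add(followeeId);
    }

    public void unfollow(int followerId, int followeeId) {
        Set<Integer> followees = following.get(followerId);
        if (followees != null) {
            followees.remove(followeeId);
        }
    }
}
```

The heap never holds more than K entries, so each pop and push is O(log K), and a full getNewsFeed call does O(K + 10 log K) work instead of sorting every candidate tweet.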