
Top K Frequent Words

Last Updated: March 29, 2026

medium
4 min read

Understanding the Problem

We have a list of words, some of which repeat, and we need to find the k most frequently occurring ones. So far this sounds identical to "Top K Frequent Elements," but there's a twist: when two words share the same frequency, they must be ordered alphabetically. The output must also be sorted by frequency (highest first) rather than returned in arbitrary order.

This tie-breaking requirement changes the problem significantly. With numbers, you could return results in any order. Here, the custom sort has two keys: frequency (descending) and then lexicographic order (ascending). Every approach needs to handle this dual ordering correctly, which makes data structure choices more interesting.
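In Python, for instance, this dual ordering falls out of tuple comparison: a key of (-count, word) sorts higher counts first (because of the negation) and breaks ties alphabetically, with no custom comparator class needed. A minimal illustration:

```python
# Tuples compare element by element, so a (-count, word) key gives
# frequency-descending, then word-ascending order "for free".
pairs = [(-2, "love"), (-1, "coding"), (-2, "i"), (-1, "leetcode")]
print(sorted(pairs))  # [(-2, 'i'), (-2, 'love'), (-1, 'coding'), (-1, 'leetcode')]
```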

Key Constraints:

  • 1 <= words.length <= 500 — With n at most 500, even O(n^2) solutions are fast enough. But the follow-up asks for O(n log k), so we should aim higher.
  • 1 <= words[i].length <= 10 — Words are short. String comparisons are effectively O(1) since they're bounded by 10 characters.
  • words[i] consists of lowercase English letters — No unicode or special character edge cases to worry about.
  • k is valid — We won't need to handle k larger than the number of unique words.

Approach 1: Sort by Custom Comparator

Intuition

The most direct approach: count every word's frequency, then sort the unique words using a comparator that handles both criteria. Sort primarily by frequency in descending order, and break ties by alphabetical order ascending. Then grab the first k words.

This is the first thing most people think of, and honestly, for n up to 500, it runs in well under a millisecond. The logic is clean and easy to verify.

Algorithm

  1. Build a hash map freq mapping each word to its count.
  2. Collect all unique words into a list.
  3. Sort the list with a custom comparator: compare by frequency descending first, then by word lexicographically ascending for ties.
  4. Return the first k words from the sorted list.

Example Walkthrough

Input: words = ["i", "love", "leetcode", "i", "love", "coding"], k = 2

Step 1: Scan words to build the frequency map: { i: 2, love: 2, leetcode: 1, coding: 1 }
Step 2: Collect the unique words: [i, love, leetcode, coding]
Step 3: Sort by frequency descending, breaking ties alphabetically: [i, love, coding, leetcode]
Step 4: Take the first k = 2 words: [i, love]

Code
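
A minimal Python sketch of this approach (the function name is illustrative, not from the original):

```python
from collections import Counter

def top_k_frequent(words, k):
    freq = Counter(words)
    # Sort key: negate the count so higher frequencies come first;
    # ties fall back to the word itself, giving alphabetical order.
    return sorted(freq, key=lambda w: (-freq[w], w))[:k]
```

Counting is O(n); sorting the m unique words costs O(m log m), which is the dominant term.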

This approach sorts all unique words when we only need the top k. What if we maintained only k candidates as we scanned, discarding the rest immediately?

Approach 2: Min-Heap

Intuition

Instead of sorting everything, we can use a min-heap of size k. The idea: scan through each unique word and push it into the heap. When the heap grows larger than k, we pop the "least desirable" word. After processing all words, the heap holds exactly the top k.

The comparator for the min-heap needs to be the reverse of what we want in the final output. We want high frequency and low alphabetical order in our result. So the heap should evict the word that's "worst" by our criteria: lowest frequency, and for ties, highest alphabetical order (since alphabetically later words are less desirable).

Algorithm

  1. Build a frequency map from words.
  2. Create a min-heap with a custom comparator: compare by frequency ascending first (lower frequency = smaller = gets popped first), and for equal frequencies, compare by word descending alphabetically (alphabetically later = less desirable = gets popped first).
  3. For each unique word, add it to the heap. If heap size exceeds k, pop the minimum.
  4. Extract elements from the heap. Since the heap gives them in min-first order, reverse the result.

Example Walkthrough

Input: words = ["i", "love", "leetcode", "i", "love", "coding"], k = 2

Step 1: Frequency map: { i: 2, love: 2, leetcode: 1, coding: 1 }
Step 2: Start with an empty heap []; we will maintain size k = 2.
Step 3: Push each unique word; whenever the heap grows past size 2, pop the worst entry (lowest frequency, alphabetically last on ties).
Step 4: Pop the remaining entries and reverse them: [i, love]

Code
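
A Python sketch of the heap approach. Since heapq has no comparator parameter, one way to express the "reversed" ordering is a small wrapper class with __lt__ (the Entry name is an assumption for illustration):

```python
import heapq
from collections import Counter

class Entry:
    """Heap entry ordered so the 'worst' word sits at the top of the min-heap:
    lowest frequency first; on ties, the alphabetically later word first."""
    def __init__(self, count, word):
        self.count = count
        self.word = word

    def __lt__(self, other):
        if self.count != other.count:
            return self.count < other.count  # lower frequency = worse
        return self.word > other.word        # alphabetically later = worse

def top_k_frequent(words, k):
    freq = Counter(words)
    heap = []
    for word, count in freq.items():
        heapq.heappush(heap, Entry(count, word))
        if len(heap) > k:
            heapq.heappop(heap)  # evict the current worst candidate
    # The heap pops worst-first, so collect and reverse for best-first order.
    result = []
    while heap:
        result.append(heapq.heappop(heap).word)
    return result[::-1]
```

Each push/pop on a heap capped at k elements costs O(log k), so the scan over m unique words runs in O(m log k), on top of O(n) counting.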

The heap approach meets the follow-up requirement. But what if we could skip comparison-based ordering entirely by grouping words directly by frequency?

Approach 3: Bucket Sort

Intuition

The maximum frequency any word can have is n (if every element is the same word). So we create an array of n+1 buckets, where bucket[i] holds all words with frequency i. Then we walk from the highest bucket down, collecting words until we have k.

Within each bucket, words need to be in alphabetical order to satisfy the tie-breaking rule. We can sort each bucket individually. Since bucket sizes are small in practice, this adds minimal overhead.

Algorithm

  1. Build a frequency map from words.
  2. Create a bucket array of size n+1. For each unique word, add it to bucket[frequency].
  3. Sort each non-empty bucket alphabetically.
  4. Walk from the highest index down to 0. For each bucket, collect words into the result until we have k words total.
  5. Return the result.

Example Walkthrough

Step 1: Build the frequency map from words: { the: 4, day: 1, is: 3, sunny: 2 }
Step 2: Create the bucket array, where index = frequency, with every bucket initially empty: bucket[0] = [], bucket[1] = [], bucket[2] = [], bucket[3] = [], bucket[4] = []
Step 3: Place each word into bucket[frequency]: bucket[1] = [day], bucket[2] = [sunny], bucket[3] = [is], bucket[4] = [the]
Step 4: Walk from the highest bucket down, collecting words until we have k: the, is, sunny, day

Code
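
A Python sketch of the bucket-sort approach (function name illustrative, as above):

```python
from collections import Counter

def top_k_frequent(words, k):
    freq = Counter(words)
    n = len(words)
    # bucket[i] holds every word that occurs exactly i times;
    # a frequency can never exceed n, so n+1 buckets suffice.
    buckets = [[] for _ in range(n + 1)]
    for word, count in freq.items():
        buckets[count].append(word)
    result = []
    # Walk from the highest frequency down, emitting each bucket
    # in alphabetical order to satisfy the tie-breaking rule.
    for count in range(n, 0, -1):
        for word in sorted(buckets[count]):
            result.append(word)
            if len(result) == k:
                return result
    return result
```

The per-bucket sorts sum to at most O(m log m) over all m unique words, but in the common case where frequencies are spread out, most buckets are tiny and the walk is close to linear.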