AlgoMaster Logo

Group Anagrams

Last Updated: April 3, 2026

medium
4 min read

Understanding the Problem

Two strings are anagrams if they contain exactly the same characters with exactly the same frequencies. "eat" and "tea" are anagrams because both have one 'e', one 'a', and one 't'. Order doesn't matter, only the character composition does.

So the real question becomes: how do we create a "fingerprint" for each string such that all anagrams share the same fingerprint? If we can do that, we just group strings by their fingerprint using a hash map.

This is fundamentally a grouping problem, and the entire challenge lies in choosing the right key for your hash map.

Key Constraints:

  • 1 <= strs.length <= 10^4 → Up to 10,000 strings. We need an approach where the per-string work is efficient. O(n^2) comparisons between all pairs would mean up to 10^8 operations, which is too slow.
  • 0 <= strs[i].length <= 100 → Individual strings are short (at most 100 characters). This means sorting each string is cheap, and building a frequency array of 26 characters is trivial.
  • lowercase English letters only → Only 26 possible characters. This is a huge hint. We can use a fixed-size array of 26 to represent character frequencies.

Approach 1: Sorting

Intuition

Here's the simplest observation: if two strings are anagrams, they contain the same characters. So if you sort both strings alphabetically, you get the exact same result. "eat" sorts to "aet". "tea" also sorts to "aet". "ate" sorts to "aet" too. That sorted version acts as a canonical form, a fingerprint that all anagrams share.

So the plan is straightforward. For each string, sort its characters to produce a key. Then use a hash map to group all strings that share the same sorted key.

Algorithm

  1. Create a hash map where the key is a sorted string and the value is a list of original strings.
  2. For each string in the input array:
    • Sort the characters of the string to produce a key.
    • Add the original string to the list associated with that key in the hash map.
  3. Return all the lists from the hash map as the result.

Example Walkthrough

1Initialize: empty hash map
0
eat
1
tea
2
tan
3
ate
4
nat
5
bat
1/7

Code

The bottleneck here is sorting each string. For every string of length k, we do O(k log k) work just to produce a key. But since the input only contains lowercase English letters (26 possible characters), we could build the frequency count directly in O(k) time instead of sorting.

Approach 2: Character Frequency Count (Optimal)

Intuition

Since the input only contains lowercase English letters, there are exactly 26 possible characters. Instead of sorting a string to get its canonical form, we can count how many times each character appears and use that frequency count as the key.

For "eat": a=1, e=1, t=1. For "tea": a=1, e=1, t=1. Same frequency count, same key.

The trick is converting this frequency array into a hashable key. A common approach is to join the counts with a delimiter, like #1#0#0#0#1#0#0...#1#0#0... (where each number represents the count for a, b, c, ..., z). This produces a unique string for each distinct frequency pattern.

This avoids sorting entirely and brings the per-string cost down from O(k log k) to O(k).

Algorithm

  1. Create a hash map where the key is a frequency-encoded string and the value is a list of original strings.
  2. For each string in the input array:
    • Create an array of size 26 (initialized to zeros) to count character frequencies.
    • Iterate through the string, incrementing the count for each character.
    • Convert the frequency array into a string key (e.g., join counts with a delimiter).
    • Add the original string to the list associated with that key in the hash map.
  3. Return all the lists from the hash map as the result.

Example Walkthrough

1Initialize: empty hash map, process each string
0
eat
1
tea
2
tan
3
ate
4
nat
5
bat
1/7

Code

The frequency count approach is already optimal in terms of time complexity. You can't do better than O(n * k) since you must read every character of every string at least once. But there's an alternative approach using prime numbers that produces a single numeric key instead of a string key, which is worth knowing because interviewers sometimes ask for it.

Approach 3: Prime Number Hashing (Alternative)

Intuition

Here's a clever mathematical trick. Assign each letter a unique prime number: a=2, b=3, c=5, d=7, e=11, and so on. For each string, multiply the primes corresponding to its characters. Because of the fundamental theorem of arithmetic (every integer has a unique prime factorization), two strings produce the same product if and only if they contain the same characters with the same frequencies.

For "eat": 11 * 2 * 71 = 1562. For "tea": 71 * 11 * 2 = 1562. Same product. For "bat": 3 * 2 * 71 = 426. Different product.

This gives us a numeric key that requires no string building, no sorting, and no delimiter handling.

Algorithm

  1. Define a mapping from each letter (a-z) to a unique prime number.
  2. Create a hash map where the key is a long integer (the prime product) and the value is a list of original strings.
  3. For each string in the input array:
    • Compute the product of primes for each character.
    • Add the original string to the list associated with that product in the hash map.
  4. Return all the lists from the hash map as the result.

Example Walkthrough

1Primes: a=2, b=3, e=11, n=47, t=71. Compute product for each string
0
eat
1
tea
2
tan
3
ate
4
nat
5
bat
1/7

Code