AlgoMaster Logo

Remove Duplicate Letters

Last Updated: March 29, 2026

medium
4 min read

Understanding the Problem

At first glance, this problem might look like a simple "remove duplicates" task, but the lexicographical ordering requirement is what makes it tricky. We cannot just take the first occurrence of each letter, because that might not give us the smallest result. For example, in "cbabc", taking first occurrences gives "cba", but the answer is "abc".

We need to select exactly one occurrence of each distinct character, and the characters we pick must appear in the same relative order as in the original string (it is a subsequence). Among all such valid subsequences, we want the lexicographically smallest one.

The key insight is that this is a greedy problem with a monotonic stack flavor. At each position, we want to keep the result as small as possible by removing larger characters from our answer, but only when we know those characters appear again later in the string.

Key Constraints:

  • 1 <= s.length <= 10^4 --> An O(n^2) approach would give at most 10^8 operations, which is borderline. O(n) or O(n log n) is preferred.
  • s consists of lowercase English letters --> Only 26 possible characters. This means frequency arrays and boolean arrays are O(1) in space.

Approach 1: Brute Force (Generate All Subsequences)

Intuition

The most straightforward way to think about this: generate every possible subsequence that contains each distinct character exactly once, then pick the lexicographically smallest one.

If the string has k distinct characters, we need to choose exactly one position for each character. For each character, there might be multiple positions to choose from. We try every combination and keep the smallest valid subsequence.

This is conceptually simple but wildly inefficient. If a character appears m times and we have k distinct characters, the number of combinations can blow up fast.

Algorithm

  1. Find all distinct characters in the string and their positions.
  2. Generate all possible ways to pick one position for each distinct character.
  3. For each combination, check that the selected positions form a valid subsequence (positions are in increasing order).
  4. Among all valid subsequences, return the lexicographically smallest one.

Example Walkthrough

1Input: s = "bcabc", distinct chars: {a, b, c}
0
b
1
c
2
a
3
b
4
c
1/4

Code

The brute force explores an exponential number of combinations, doing huge amounts of wasted work by not making smart choices along the way. What if instead of trying all possibilities, we built the answer one character at a time, always making the greedily best choice?

Approach 2: Greedy with Monotonic Stack

Intuition

Here is the key observation: we want the lexicographically smallest result, and we are building a subsequence. Think of it as building a string character by character. At each step, we want the smallest character possible at each position.

Imagine you are building the result left to right. You have already added some characters. Now you see a new character that is smaller than the last character in your result. Should you keep the last character or remove it? That depends on one thing: does the last character appear again later in the string? If yes, you can safely remove it now and pick it up later. If no, you are stuck with it.

This "remove the top if it is larger and appears later" logic is exactly what a monotonic stack does. The stack maintains characters in a way that keeps the result as small as possible, while ensuring we do not permanently lose any character we need.

We also need to avoid adding a character that is already in the stack, since we want each character exactly once.

Algorithm

  1. Count the frequency of each character in the string (or record the last index of each character).
  2. Initialize an empty stack and a boolean array tracking which characters are currently in the stack.
  3. Iterate through each character in the string:
    • If the character is already in the stack, skip it.
    • Otherwise, while the stack is not empty, and the top of the stack is greater than the current character, and the top character appears later in the string (last index > current index), pop the top and mark it as not in the stack.
    • Push the current character onto the stack and mark it as in the stack.
  4. The stack contains the answer.

Example Walkthrough

1Initialize: scan string left to right, stack is empty
0
c
i
1
b
2
a
3
c
4
d
5
c
6
b
7
c
1/9

Code