Last Updated: April 6, 2026
We need to find every index i (from 1 to n-1) where we cut the string into s[0..i-1] and s[i..n-1], and both halves have the same number of distinct characters.
The brute force idea is obvious: try every split point and count distinct characters on each side. But counting distinct characters from scratch at every position is wasteful. The key observation is that as we slide the split point one position to the right, only one character moves from the right half to the left half. So the distinct counts change incrementally, and we can track them efficiently.
1 <= s.length <= 10^5 → We need O(n) or O(n log n). An O(n^2) brute force that rebuilds sets at every split will be too slow.s consists of only lowercase English letters → At most 26 distinct characters. This means any frequency map or set we maintain has bounded size, so operations on it are O(1).The most direct approach: for every possible split position, build a set of characters for the left half and a set for the right half, then check if their sizes match. There are n-1 split positions, and at each one we scan the left and right substrings.
i from 1 to n-1:s[0..i-1] using a set.s[i..n-1] using a set.At every split point, we rebuild the left and right sets from scratch. But moving the split one position to the right only changes one character. What if we precomputed the distinct counts from both ends so each check is O(1)?
Instead of recomputing distinct counts at every split, we can precompute them. Scan left-to-right and record how many distinct characters we've seen up to each index. Do the same from right-to-left. Then for each split point, just compare the two precomputed values.
This is essentially the prefix sum idea, but instead of summing values, we're tracking set sizes.
prefixDistinct of size n, where prefixDistinct[i] is the number of distinct characters in s[0..i].suffixDistinct of size n, where suffixDistinct[i] is the number of distinct characters in s[i..n-1].i from 1 to n-1, check if prefixDistinct[i-1] == suffixDistinct[i].The prefix array captures a monotonically non-decreasing sequence: as we include more characters from the left, we can only see the same or more distinct characters. The suffix array works the same way from the right. At any split point i, prefixDistinct[i-1] tells us the distinct count of the left half and suffixDistinct[i] tells us the distinct count of the right half, both computed in O(1) after the preprocessing.
This approach achieves O(n) time but uses two auxiliary arrays of size n. Can we track the same information using just two counters and a single sweep, eliminating both arrays?
Here's the idea: start by putting all character frequencies into a "right" frequency map. Then sweep the split point from left to right. At each step, move the current character from the right side to the left side. When a character is added to the left for the first time, the left distinct count goes up. When a character's frequency in the right map drops to zero, the right distinct count goes down. We just need to track these two counts.
This eliminates both arrays and solves the problem in a single pass (after the initial frequency count).
s (this represents the "right" side initially).rightDistinct as the number of keys in this map and leftDistinct as 0.i from 0 to n-2 (these are the split points):s[i] to the left side: if it's the first occurrence on the left, increment leftDistinct.s[i] from the right side: decrement its frequency. If frequency reaches 0, decrement rightDistinct.leftDistinct == rightDistinct, this is a good split.