Last Updated: April 9, 2026
We need to find the shortest contiguous substring of s that contains every character in t, counting duplicates. So if t = "AAB", our window must contain at least two 'A's and one 'B'.
Notice that order doesn't matter. We're not looking for t as a subsequence in some particular arrangement. We just need the right characters in the right quantities. This is fundamentally a frequency-matching problem layered on top of a substring search.
The key observation: once we find a valid window, we can try shrinking it from the left to find a smaller one. And once the window becomes invalid, we expand it from the right. This expand-then-shrink pattern is the signature of a sliding window approach.
m and n can be up to 10^5 → An O(m^2) brute force that checks every substring is borderline, and verifying each substring adds more cost. We need something closer to O(m).t matter → We need frequency counts, not just a set of characters.The most straightforward approach: generate every possible substring of s and check whether it contains all the characters of t. Among all valid substrings, return the shortest one.
To check whether a substring contains all characters of t, we compare character frequencies. Build a frequency map for t, build one for the substring, and verify that every character in t appears at least as many times in the substring.
ti from 0 to m - 1j from i to m - 1, maintain a running frequency map of s[i..j]s[j], check if the current window satisfies all frequency requirements of t"" if none existsThe bottleneck is that for every starting position i, we expand rightward until we find a valid window, then move to i + 1 and start scanning rightward again from scratch. What if instead of restarting, we just shrank the window from the left?
Instead of restarting a fresh window for every starting position, we maintain a single window defined by two pointers, left and right. We expand right to include more characters until the window is valid (contains all of t). Then we shrink left to try to make the window smaller while keeping it valid. Every time we have a valid window, we check if it's the smallest we've seen.
The trick to making the validity check O(1) is tracking a counter called formed. We count how many distinct characters in t have their frequency requirement fully met by the current window. When formed equals the number of distinct characters in t, the window is valid. Each time we add or remove a character, we only need to update one frequency and potentially increment or decrement formed.
The left pointer never moves backward. The right pointer never moves backward. So each pointer traverses s at most once, giving us O(m) pointer movements total. At each step, we do O(1) work: updating a frequency, comparing two integers, and possibly updating the best result.
The formed counter is the key optimization. Without it, we'd need to scan the entire frequency map every time to check validity, which would cost O(C) per step. By maintaining formed incrementally, we reduce the validity check to O(1).
tCount for t. Count the number of distinct characters in t as requiredleft = 0, formed = 0, and a window frequency map windowCountright from 0 to m - 1:s[right] to windowCountwindowCount[s[right]] now equals tCount[s[right]], increment formedformed == required (window is valid):s[left] from windowCountwindowCount[s[left]] drops below tCount[s[left]], decrement formedleft forwardtCount takes O(n). The two pointers together visit each character of s at most twice (once when right expands, once when left shrinks), so the main loop is O(m). Total: O(m + n).windowCount and O(n) entries for tCount. In practice, both are bounded by the character set size (52 or 128), but in Big-O terms with respect to input, it's O(m + n).The sliding window approach is already O(m + n), so we can't improve the asymptotic complexity. But when s contains many characters not in t, we waste time stepping through irrelevant positions. What if we could skip over characters that don't appear in t at all?
When s is much longer than t and contains many characters not in t, the standard sliding window wastes time stepping through irrelevant positions. We can pre-filter s to create a list of only the positions (and characters) that appear in t. Then we apply the exact same sliding window logic on this filtered list.
For example, with s = "XXXXAXXXXBXXXXCXXXX" and t = "ABC", the filtered list would be [(4,'A'), (9,'B'), (14,'C')]. The sliding window runs on 3 elements instead of 19.
This doesn't change the worst-case complexity (if every character in s is in t, the filtered list is the same size as s). But in practice, when s has many irrelevant characters, this runs significantly faster.
tCount for t(index, character) pairs from s, keeping only characters that appear in tright to include more entries until the window is validleft while the window remains valids spans from filtered[left].index to filtered[right].indextCount is O(n). Building the filtered list is O(m). The sliding window iterates over the filtered list, where each entry is visited at most twice (once by right, once by left). The filtered list has at most m entries, so the total is still O(m + n).tCount and O(n) for windowCount (bounded by distinct characters in t).