Last Updated: April 5, 2026
We need to find the shortest path from beginWord to endWord, where each step changes exactly one letter, and every intermediate word must exist in the given wordList. The answer is the total number of words in that path (including both beginWord and endWord), not the number of steps.
This is fundamentally a shortest-path problem on an implicit graph. Each word is a node, and an edge exists between two words if they differ by exactly one letter. Since every edge has the same "cost" (one transformation), BFS is the natural choice for finding the shortest path.
The key challenge is efficiency. With up to 5000 words of length up to 10, naively comparing every pair of words to check if they differ by one letter would be expensive. The trick is to find a smarter way to discover neighbors.
Let's read the constraints:

- 1 <= wordList.length <= 5000 -> We have at most 5000 words. An O(n^2) approach that compares every pair would be 25 million comparisons, each costing O(m) for string comparison. That's 250 million operations at word length 10, which is borderline.
- 1 <= beginWord.length <= 10 -> Words are short, so we can enumerate all possible one-letter mutations of a word (26 * 10 = 260 possibilities) very quickly.
- beginWord != endWord -> No trivial case where the start equals the end.

The most straightforward way to solve this is to treat it as a graph problem. Each word is a node, and two nodes are connected if they differ by exactly one character. We want the shortest path from beginWord to endWord, so BFS is the right tool: it explores nodes level by level and guarantees the shortest path in an unweighted graph.
For each word we dequeue during BFS, we scan the entire word list to find all words that differ by exactly one letter. Those are our neighbors.
The algorithm:

1. If endWord is not in wordList, return 0 immediately since no transformation is possible.
2. Convert wordList into a set for O(1) lookup.
3. Initialize a queue with beginWord and a level counter starting at 1.
4. For each dequeued word, if it equals endWord, return the current level. Otherwise, enqueue every unvisited neighbor, and increment the level once the current BFS layer is exhausted.
5. If the queue empties without reaching endWord, return 0.

Input: beginWord = "hit", endWord = "cog", wordList = ["hot", "dot", "dog", "lot", "log", "cog"]
Output: 5
BFS explores level by level. Starting from "hit", we find "hot" (one letter different). From "hot", we find "dot" and "lot". From "dot", we find "dog". From "dog", we find "cog", our target. That gives us the path hit -> hot -> dot -> dog -> cog, which is 5 words.
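This brute-force BFS can be sketched as follows (a minimal Python version; the function name and the `differ_by_one` helper are mine, and it assumes all words have equal length, as the problem guarantees):

```python
from collections import deque

def ladder_length_naive(begin_word: str, end_word: str, word_list: list[str]) -> int:
    """Naive BFS: for each dequeued word, scan the whole word set for neighbors."""
    words = set(word_list)
    if end_word not in words:
        return 0  # no valid transformation can end at end_word

    def differ_by_one(a: str, b: str) -> bool:
        # Words have equal length; count mismatched positions.
        return sum(x != y for x, y in zip(a, b)) == 1

    queue = deque([(begin_word, 1)])  # (word, number of words on the path so far)
    visited = {begin_word}
    while queue:
        word, level = queue.popleft()
        if word == end_word:
            return level
        for candidate in words:  # O(n) scan per dequeued word -- the bottleneck
            if candidate not in visited and differ_by_one(word, candidate):
                visited.add(candidate)
                queue.append((candidate, level + 1))
    return 0
```

On the example above, this returns 5 for the path hit -> hot -> dot -> dog -> cog.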
This approach works but is slow for large inputs because we scan the entire word set for every dequeued word. What if we could generate all possible one-letter mutations and check if they exist in constant time?
Instead of comparing every pair of words, we can flip the problem around. For a word like "hot", we can generate wildcard patterns by replacing each character with a placeholder: "*ot", "h*t", "ho*". Any word matching the same pattern is exactly one letter away.
So we preprocess the word list: for each word, generate all its wildcard patterns, and store them in a map. The map key is the pattern, and the value is a list of words matching that pattern. Now during BFS, finding neighbors is just a matter of generating the current word's patterns and looking them up in the map.
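The preprocessing step might look like this (a sketch; the helper name is mine):

```python
from collections import defaultdict

def build_pattern_map(words: list[str]) -> dict[str, list[str]]:
    """Map each wildcard pattern to the words that match it.

    Two words differ by exactly one letter iff they share a pattern.
    """
    patterns = defaultdict(list)
    for word in words:
        for i in range(len(word)):
            # Replace position i with a wildcard, e.g. "hot" -> "*ot", "h*t", "ho*".
            patterns[word[:i] + "*" + word[i + 1:]].append(word)
    return patterns
```

For example, `build_pattern_map(["hot", "dot", "dog"])` maps "*ot" to ["hot", "dot"] and "do*" to ["dot", "dog"], so "dot" finds both its neighbors via two O(1) lookups.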
This is much faster: generating the patterns for a word of length m takes O(m) time, and each pattern lookup is O(1). Instead of comparing the current word against all n words, we perform m lookups and examine only words that are genuinely one letter away, bounded by 26 * m possible mutations.
The pattern map is essentially a precomputed adjacency list. Two words are neighbors if and only if they share at least one wildcard pattern. By grouping words under their patterns, we avoid the O(n) scan per word during BFS. Instead, for each word, we generate m patterns (one per character position) and look up each pattern's word list in O(1). The total neighbors we examine per word is bounded by the actual number of neighbors, not the entire word list.
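Putting the pattern map and BFS together gives a complete solution (a sketch under the same assumptions; the function name is mine):

```python
from collections import defaultdict, deque

def ladder_length_pattern(begin_word: str, end_word: str, word_list: list[str]) -> int:
    """BFS where neighbors are discovered through shared wildcard patterns."""
    words = set(word_list)
    if end_word not in words:
        return 0

    m = len(begin_word)
    patterns = defaultdict(list)  # pattern -> all words matching it
    for word in words | {begin_word}:
        for i in range(m):
            patterns[word[:i] + "*" + word[i + 1:]].append(word)

    queue = deque([begin_word])
    visited = {begin_word}
    level = 1  # number of words on the path so far
    while queue:
        for _ in range(len(queue)):  # process one BFS layer at a time
            word = queue.popleft()
            for i in range(m):
                for neighbor in patterns[word[:i] + "*" + word[i + 1:]]:
                    if neighbor == end_word:
                        return level + 1
                    if neighbor not in visited:
                        visited.add(neighbor)
                        queue.append(neighbor)
        level += 1
    return 0
```

The layer-by-layer loop lets us keep a single `level` counter instead of storing a level with every queue entry.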
The algorithm:

1. If endWord is not in wordList, return 0.
2. Preprocess: for each word in wordList (plus beginWord), replace each character position with '*' in turn, and map each resulting pattern to all words matching it.
3. Initialize a queue with beginWord, a visited set containing beginWord, and a level counter starting at 1.
4. For each dequeued word, generate its patterns and look up the matching words. If a match equals endWord, return level + 1. Otherwise, mark it as visited and add it to the queue.
5. If the queue empties without reaching endWord, return 0.

The pattern-map BFS is efficient, but it still explores outward from one direction. What if we searched from both ends simultaneously and stopped when the two frontiers meet?
Standard BFS explores outward from the start like a ripple in water. The search space grows exponentially at each level. Bidirectional BFS is a clever optimization: instead of searching only from beginWord, we also search from endWord and stop when the two frontiers meet.
Why is this faster? If the shortest path has length d and the branching factor is b, standard BFS explores roughly b^d nodes. Bidirectional BFS explores about 2 * b^(d/2) nodes, which is exponentially smaller. The trick is to always expand from the smaller frontier, keeping both frontiers roughly balanced.
The algorithm:

1. If endWord is not in wordList, return 0.
2. Initialize frontSet containing beginWord and backSet containing endWord. Also maintain a visited set.
3. At each level, expand the smaller of the two sets. For each of its words, generate every one-letter mutation. If a mutation appears in the other set, the frontiers have met: return level + 1. Otherwise, if the mutation is an unvisited dictionary word, add it to nextSet.
4. Replace the expanded frontier with nextSet and increment the level.
5. If either frontier becomes empty, return 0.
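These steps can be sketched in Python as follows (the function name is mine; it assumes lowercase English words, as in the original problem):

```python
def ladder_length_bidirectional(begin_word: str, end_word: str, word_list: list[str]) -> int:
    """Bidirectional BFS: expand the smaller frontier until the two searches meet."""
    words = set(word_list)
    if end_word not in words:
        return 0

    front, back = {begin_word}, {end_word}
    visited = {begin_word, end_word}
    level = 1
    while front and back:
        # Always expand the smaller frontier to keep the two searches balanced.
        if len(front) > len(back):
            front, back = back, front
        next_set = set()
        for word in front:
            for i in range(len(word)):
                for c in "abcdefghijklmnopqrstuvwxyz":
                    mutation = word[:i] + c + word[i + 1:]
                    if mutation in back:
                        return level + 1  # the two frontiers meet
                    if mutation in words and mutation not in visited:
                        visited.add(mutation)
                        next_set.add(mutation)
        front = next_set
        level += 1
    return 0
```

Note that `level` counts layers expanded from both ends combined, so checking membership in the opposite frontier before the visited check is what detects the meeting point correctly.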