AlgoMaster Logo

Shortest Path Visiting All Nodes

Last Updated: April 2, 2026

hard
5 min read

Understanding the Problem

We need to find the shortest path (fewest edges) that visits every node in an undirected graph. There are a few things that make this different from typical shortest path problems.

First, we can start at any node. There's no fixed source. Second, we can revisit nodes and reuse edges. This is critical because the graph might not have a Hamiltonian path (a path that visits every node exactly once), so backtracking through already-visited nodes might be necessary. Third, "shortest" means fewest edges traversed, not fewest unique nodes.

The key challenge is tracking which nodes we've visited along the way. A simple BFS won't work because the "state" isn't just our current position. Two paths that arrive at the same node but have visited different subsets of nodes are fundamentally different states. That observation, that our state includes both where we are and what we've visited, is the core insight that drives every approach to this problem.

Key Constraints:

  • 1 <= n <= 12 → With at most 12 nodes, this is a strong signal toward exponential approaches. Bitmask techniques are very feasible since 2^12 = 4096 is tiny.
  • The graph is connected → We're guaranteed we can reach every node from every other node. No need to worry about disconnected components.
  • We can start at any node → We need to consider all possible starting positions.

Approach 1: Brute Force (Backtracking)

Intuition

The most straightforward idea is to try every possible order of visiting nodes. Since we need to visit all n nodes, one way to think about it is: what's the shortest walk through the graph that covers every node?

A brute force approach would try starting from each node and use backtracking to explore all possible next moves. At each step, we pick an unvisited neighbor (or walk through visited nodes to reach an unvisited one). We track the minimum number of edges needed to visit all nodes.

This is essentially enumerating paths, and it's wildly expensive. But it establishes what we're trying to optimize.

Algorithm

  1. For each starting node s from 0 to n-1, run a DFS/backtracking search.
  2. Maintain a visited set. At each step, try moving to each neighbor.
  3. If a neighbor hasn't been visited, mark it visited and recurse.
  4. If all nodes are visited, record the current path length.
  5. To reach an unvisited node, we might need to traverse through visited nodes.
  6. Return the minimum path length found across all starting nodes.

Example Walkthrough

Input:

0
[1,2,3]
1
[0]
2
[0]
3
[0]
graph

We try starting from each node. Starting from node 1: visit 1, go to 0 (visit 0), go to 2 (visit 2), back to 0, go to 3 (visit 3). Total: 4 edges.

4
result

Code

The backtracking approach explores a massive search tree and doesn't remember whether it's already found the shortest path for a given (currentNode, visitedSet) combination. What if we could guarantee each unique state is explored at most once?

Approach 2: BFS on State Space (Bitmask BFS)

Intuition

Here's the key insight: we can redefine what "state" means. In normal BFS on a graph, the state is just the current node. But in this problem, two paths arriving at the same node with different sets of visited nodes are completely different situations. The path that's visited {0, 1, 2} has different future possibilities than the one that's visited {0, 1, 3}.

So our state becomes a pair: (current node, set of visited nodes). Since n ≤ 12, we can represent the visited set as a bitmask. Bit i is set if node i has been visited. This gives us at most n 2^n total states, which for n=12 is 12 4096 = 49,152. That's tiny.

Now we just run BFS over this state space. BFS guarantees that the first time we reach a state where all bits are set (meaning all nodes are visited), that's the shortest path.

Since we can start at any node, we seed the BFS queue with all n starting states: (node i, bitmask with only bit i set) for each node i. The target is any state where the bitmask equals (1 << n) - 1, meaning all n bits are on.

Algorithm

  1. Let n be the number of nodes and fullMask = (1 << n) - 1 (all n bits set).
  2. Create a queue and a visited set for the state space.
  3. For each node i from 0 to n-1, enqueue the state (i, 1 << i) with distance 0. Mark it visited.
  4. Run BFS: dequeue a state (node, mask).
  5. If mask equals fullMask, return the current distance.
  6. For each neighbor of node, compute newMask = mask | (1 << neighbor).
  7. If state (neighbor, newMask) hasn't been visited, enqueue it with distance + 1 and mark visited.
  8. The BFS guarantees we find the shortest path first.

Example Walkthrough

1Seed BFS: start from all nodes. States: (0,0001), (1,0010), (2,0100), (3,1000)
0mask=00011mask=00102mask=01003mask=1000
1/5
1Seed all starting states at distance 0
Front
(0,0001)
(1,0010)
(2,0100)
(3,1000)
Rear
1/5

Code

The BFS approach is already efficient, but we can also solve this with dynamic programming. The DP approach decomposes the problem differently, separating shortest-path computation from the visiting-order optimization.

Approach 3: Bitmask DP

Intuition

Instead of BFS, we can solve this with dynamic programming. The idea is similar: our state is (current node, visited mask). But instead of exploring layer by layer, we build up solutions from smaller visited sets to larger ones.

Define dp[mask][i] as the minimum number of edges to reach node i with the set of visited nodes represented by mask. We want the minimum of dp[fullMask][i] across all nodes i.

The base case is straightforward: dp[1 << i][i] = 0 for all nodes i. Starting at node i with only node i visited costs 0 edges.

For the transitions, we precompute the shortest distance between every pair of nodes. If we know dist[i][j] for all pairs, the DP transition becomes simple: from state (mask, i), to visit a new node j, the cost is dp[mask][i] + dist[i][j]. The dist[i][j] value already accounts for any intermediate steps through other nodes.

This decomposition is elegant because it separates two concerns. The BFS precomputation answers "how do I get from A to B?" while the bitmask DP answers "in what order should I visit all nodes?"

Algorithm

  1. Precompute shortest distances between all pairs using BFS from each node.
  2. Initialize dp[1 << i][i] = 0 for each node i.
  3. Iterate through all masks from smallest to largest (by value).
  4. For each mask and each node i where bit i is set in mask:
    • For each node j not in mask, update dp[mask | (1 << j)][j] = min(dp[mask | (1 << j)][j], dp[mask][i] + dist[i][j]).
  5. Return the minimum of dp[fullMask][i] for all i.

Example Walkthrough

1Initialize: dp[00001][0]=0, dp[00010][1]=0, ..., dp[10000][4]=0
0cost=01cost=02cost=04cost=03cost=0
1/5
1Base case: each node starts with cost 0 in its own singleton mask
dp[00001][0]
:
0
dp[00010][1]
:
0
dp[00100][2]
:
0
dp[01000][3]
:
0
dp[10000][4]
:
0
1/5

Code