Last Updated: March 28, 2026
We have a grid of non-negative integers with m rows and n columns. Starting from the top-left cell (0, 0), we need to reach the bottom-right cell (m - 1, n - 1). At each step, we can only move right or down. Among all possible paths from start to destination, we want the one whose cell values add up to the smallest total.
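To make this concrete, here is a small illustrative grid (our own example, not from the problem statement) and one candidate path:

```python
grid = [
    [1, 3, 1],
    [1, 5, 1],
    [4, 2, 1],
]
# One candidate path: right, right, down, down.
path = [grid[0][0], grid[0][1], grid[0][2], grid[1][2], grid[2][2]]
print(sum(path))  # 1 + 3 + 1 + 1 + 1 = 7, the cheapest total for this grid
```

Other paths through the same grid, such as going straight down first (1 + 1 + 4 + 2 + 1 = 9), cost more.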
This is closely related to the "Unique Paths" problem, but instead of counting paths, we are optimizing over them. The restricted movement (only right or down) means there are no cycles, which is exactly the structure dynamic programming needs. Every path makes exactly (m - 1) down moves and (n - 1) right moves, so every path visits the same number of cells: m + n - 1. The question is which combination of cells gives us the smallest sum.
The key observation: the minimum-cost path to any cell (i, j) must arrive from either (i - 1, j) or (i, j - 1), whichever offers the cheaper route. This gives us an optimal substructure property that makes DP the natural approach.
What the constraints tell us:

- 1 <= m, n <= 200 → The grid has at most 200 x 200 = 40,000 cells. An O(m * n) solution runs in well under a millisecond. Anything exponential is out.
- 0 <= grid[i][j] <= 200 → All values are non-negative. This means we never benefit from taking a longer path. The minimum sum fits easily in an integer: worst case is 200 * (200 + 200 - 1) = 79,800.

The most natural starting point is to think recursively from the destination backward. To find the minimum path sum to cell (i, j), we ask: what is the cheapest way to get to (i - 1, j) and to (i, j - 1)? We pick the cheaper one and add grid[i][j].
The base case is the top-left cell (0, 0), where the cost is just grid[0][0]. If we go out of bounds (negative row or column), we return infinity since that is not a valid path.
This works correctly, but it is brutally slow. The recursion tree branches into two at every cell, and the same subproblems get solved over and over.
The brute-force recursion:

- Define a function minCost(i, j) that returns the minimum path sum from (0, 0) to (i, j).
- If i == 0 and j == 0, return grid[0][0].
- If i < 0 or j < 0, return a very large number (infinity).
- Otherwise, return grid[i][j] + min(minCost(i - 1, j), minCost(i, j - 1)).
- The answer is minCost(m - 1, n - 1).

The recursion recomputes the same cells over and over. There are only m * n distinct subproblems, but we solve each one multiple times. What if we computed the minimum cost for each cell exactly once, building up from the top-left corner?
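The steps above can be sketched directly in Python (the function name is ours):

```python
import math

def min_path_sum_brute(grid):
    """Exponential-time recursion: correct, but recomputes subproblems."""
    m, n = len(grid), len(grid[0])

    def min_cost(i, j):
        if i == 0 and j == 0:
            return grid[0][0]      # base case: the starting cell
        if i < 0 or j < 0:
            return math.inf        # out of bounds: not a valid path
        # Arrive from above or from the left, whichever is cheaper.
        return grid[i][j] + min(min_cost(i - 1, j), min_cost(i, j - 1))

    return min_cost(m - 1, n - 1)
```

This branches into two recursive calls at almost every cell, so the running time grows exponentially with m + n, which is why it only works for tiny grids.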
Instead of recursing backward and recomputing the same cells, we flip the direction and build the answer forward. We create a 2D table dp where dp[i][j] represents the minimum path sum from (0, 0) to (i, j).
The recurrence is clean. Since the only ways to reach (i, j) are from above or from the left:
dp[i][j] = grid[i][j] + min(dp[i - 1][j], dp[i][j - 1])
The first row is special: you can only reach any cell in the first row by going right from the start, so dp[0][j] is just the prefix sum of the first row. Same idea for the first column, going straight down.
Every path to (i, j) must pass through either (i - 1, j) or (i, j - 1) as its second-to-last cell. So the cheapest path to (i, j) extends the cheapest path to one of those two predecessors: the final step adds the same grid[i][j] either way, so a more expensive route to a predecessor can never produce a cheaper overall path.
The algorithm:

- Create a 2D table dp of size m x n.
- Set dp[0][0] = grid[0][0].
- Fill the first row: dp[0][j] = dp[0][j - 1] + grid[0][j].
- Fill the first column: dp[i][0] = dp[i - 1][0] + grid[i][0].
- For every other cell (i, j): set dp[i][j] = grid[i][j] + min(dp[i - 1][j], dp[i][j - 1]).
- The answer is dp[m - 1][n - 1].

The time complexity is already optimal, but we are using O(m * n) extra space for the DP table. When filling row i, we only look at row i - 1 and the current row. What if we used a single 1D array and updated it row by row?
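A sketch of the tabulated version (again, the function name is ours):

```python
def min_path_sum(grid):
    """O(m * n) time, O(m * n) space bottom-up DP."""
    m, n = len(grid), len(grid[0])
    dp = [[0] * n for _ in range(m)]
    dp[0][0] = grid[0][0]
    for j in range(1, n):                  # first row: prefix sums
        dp[0][j] = dp[0][j - 1] + grid[0][j]
    for i in range(1, m):                  # first column: prefix sums
        dp[i][0] = dp[i - 1][0] + grid[i][0]
    for i in range(1, m):
        for j in range(1, n):
            # Cheaper of "from above" and "from the left".
            dp[i][j] = grid[i][j] + min(dp[i - 1][j], dp[i][j - 1])
    return dp[m - 1][n - 1]
```

Each cell is computed exactly once from its two already-filled neighbors, so the exponential blowup of the naive recursion disappears.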
Since each row of the DP table only depends on the row directly above it, we can compress the 2D table into a single 1D array of length n. When we process columns left to right, dp[j] still holds the value from the previous row (the "from above" value) because we have not updated it yet, and dp[j - 1] holds the current row's value (the "from left" value) because we just updated it.
The update becomes: dp[j] = grid[i][j] + min(dp[j], dp[j - 1])
The data dependency structure makes this safe. Processing columns left to right means dp[j - 1] has already been updated for the current row (so it is the "left" value), and dp[j] has not been touched yet (so it still holds the previous row's value, which is the "above" value). After the update, dp[j] stores the current row's value, ready to serve as the "above" value when processing the next row.
The space-optimized algorithm:

- Create a 1D array dp of size n.
- Initialize dp with the prefix sums of the first row.
- For each row i from 1 to m - 1:
  - Set dp[0] = dp[0] + grid[i][0] (first column, can only come from above).
  - For each column j from 1 to n - 1: update dp[j] = grid[i][j] + min(dp[j], dp[j - 1]).
- The answer is dp[n - 1].
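Putting the rolling-array version together (function name ours):

```python
def min_path_sum_1d(grid):
    """O(m * n) time, O(n) extra space: one row reused across iterations."""
    m, n = len(grid), len(grid[0])
    dp = [0] * n
    dp[0] = grid[0][0]
    for j in range(1, n):              # prefix sums of the first row
        dp[j] = dp[j - 1] + grid[0][j]
    for i in range(1, m):
        dp[0] += grid[i][0]            # first column: only reachable from above
        for j in range(1, n):
            # dp[j] still holds the previous row ("above");
            # dp[j - 1] already holds the current row ("left").
            dp[j] = grid[i][j] + min(dp[j], dp[j - 1])
    return dp[n - 1]
```

If the problem allows it, the same idea can go further: overwriting grid in place drops the extra space to O(1), at the cost of destroying the input.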