AlgoMaster Logo

Longest Common Subsequence

Last Updated: March 28, 2026

medium
5 min read

Understanding the Problem

We have two strings and need to find the longest sequence of characters that appears in both, in the same order, but not necessarily consecutively. This is different from the longest common substring problem, where characters must be consecutive.

For instance, given "abcde" and "ace", the character a appears in both, c appears in both after a, and e appears in both after c. So "ace" is a common subsequence of length 3. There is no common subsequence of length 4, so the answer is 3.

The key observation is that this problem has optimal substructure. If we compare the last characters of both strings and they match, then the LCS includes that character plus the LCS of the remaining prefixes. If they don't match, the LCS is the better of two choices: skip the last character of the first string, or skip the last character of the second string. This recursive structure is what makes dynamic programming the right tool here.

Key Constraints:

  • 1 <= text1.length, text2.length <= 1000 -> Both strings can be up to 1000 characters. With m and n up to 1000, an O(m * n) solution means about 1 million operations, which is very comfortable. An O(2^n) brute force would be astronomically slow.
  • text1 and text2 consist of only lowercase English characters. -> No special character handling needed. Only 26 possible characters.

Approach 1: Brute Force (Recursion)

Intuition

The most natural way to think about this problem is recursively. Start from the end of both strings and compare characters. If the last characters match, they are part of the LCS, so we add 1 and recurse on the remaining prefixes. If they don't match, we have two choices: drop the last character of text1 and recurse, or drop the last character of text2 and recurse. We take whichever gives a longer subsequence.

This directly mirrors the mathematical definition of LCS:

  • If text1[i] == text2[j], then LCS(i, j) = 1 + LCS(i-1, j-1)
  • Otherwise, LCS(i, j) = max(LCS(i-1, j), LCS(i, j-1))
  • Base case: if either index is less than 0, return 0

Algorithm

  1. Define a recursive function solve(i, j) where i is the current index in text1 and j is the current index in text2.
  2. Base case: if i < 0 or j < 0, return 0 (one string is exhausted).
  3. If text1[i] == text2[j], return 1 + solve(i - 1, j - 1).
  4. Otherwise, return max(solve(i - 1, j), solve(i, j - 1)).
  5. Start the recursion with solve(m - 1, n - 1) where m and n are the lengths of the two strings.

Example Walkthrough

Input:

0
a
1
b
2
c
3
d
4
e
text1
0
a
1
c
2
e
text2

The recursion starts at solve(4, 2). Since text1[4]='e' matches text2[2]='e', we get 1 + solve(3, 1). At solve(3, 1), text1[3]='d' does not match text2[1]='c', so we branch into solve(2, 1) and solve(3, 0). At solve(2, 1), text1[2]='c' matches text2[1]='c', giving 1 + solve(1, 0). This pattern continues until we reach the base cases, ultimately returning 3.

3
result

Code

The recursion recomputes the same subproblems over and over. The number of unique (i, j) pairs is only m * n, but the brute force explores exponentially many paths. What if we cached each result the first time we compute it?

Approach 2: Top-Down DP (Memoization)

Intuition

The brute force recursion has the right logic, it just does too much redundant work. The fix is memoization: cache the result of solve(i, j) so we never compute the same subproblem twice. Since i ranges from 0 to m-1 and j ranges from 0 to n-1, we need an m x n table to store results.

This is a classic example of overlapping subproblems, one of the two ingredients that make a problem suitable for dynamic programming. The other ingredient, optimal substructure, is already present: the LCS of the full strings can be built from the LCS of smaller prefixes.

Algorithm

  1. Create a 2D memo table of size m x n, initialized to -1 (meaning "not computed yet").
  2. Define solve(i, j) with the same logic as Approach 1.
  3. Before computing, check if memo[i][j] != -1. If so, return the cached value.
  4. After computing, store the result in memo[i][j] before returning.
  5. Call solve(m - 1, n - 1).

Example Walkthrough

Input:

0
a
1
b
2
c
3
d
4
e
text1
0
a
1
c
2
e
text2

The same recursion as Approach 1, but now each solve(i, j) stores its result in memo[i][j]. The first call to solve(2, 1) computes and caches the value. Any later call to solve(2, 1) returns instantly from the cache. Instead of exponential calls, we make at most m * n = 15 unique computations, each doing O(1) work.

3
result

Code

The memoized solution has the right time complexity, but the recursion still adds stack overhead. What if we filled the table iteratively instead, row by row, eliminating recursion entirely?

Approach 3: Bottom-Up DP (Tabulation)

Intuition

Instead of recursion with memoization, we can build the DP table iteratively. The idea is the same: dp[i][j] represents the LCS length of text1[0..i-1] and text2[0..j-1]. We shift indices by 1 so that dp[0][...] and dp[...][0] serve as the base cases (empty string has LCS 0 with anything).

We fill the table row by row, left to right. When we compute dp[i][j], we already have dp[i-1][j-1], dp[i-1][j], and dp[i][j-1] available. This eliminates recursion entirely and gives us a clean iterative solution.

Algorithm

  1. Create a 2D table dp of size (m + 1) x (n + 1), initialized to 0.
  2. For each i from 1 to m and each j from 1 to n:
    • If text1[i-1] == text2[j-1], set dp[i][j] = 1 + dp[i-1][j-1].
    • Otherwise, set dp[i][j] = max(dp[i-1][j], dp[i][j-1]).
  3. Return dp[m][n].

Example Walkthrough

1Initialize 6x4 dp table with all zeros. Rows=text1 chars, Cols=text2 chars.
0
1
2
3
0
0
0
0
0
1
0
0
0
0
2
0
0
0
0
3
0
0
0
0
4
0
0
0
0
5
0
0
0
0
1/7

Code

The full 2D table works, but when filling row i, we only ever look at row i-1 and the current row. What if we only kept two rows in memory at a time?

Approach 4: Space-Optimized DP

Intuition

Looking at the recurrence, dp[i][j] depends on dp[i-1][j-1], dp[i-1][j], and dp[i][j-1]. That means we only need the previous row and the current row. We can reduce space from O(m * n) to O(n) by using two 1D arrays.

Algorithm

  1. Create two 1D arrays prev and curr of size n + 1, initialized to 0.
  2. For each row i from 1 to m:
    • For each column j from 1 to n:
      • If text1[i-1] == text2[j-1], set curr[j] = 1 + prev[j-1].
      • Otherwise, set curr[j] = max(prev[j], curr[j-1]).
    • Swap prev and curr, then reset curr to zeros.
  3. Return prev[n].

Example Walkthrough

1Initialize: prev = [0, 0, 0, 0]. Columns represent text2 = "abc".
0
0
1
0
2
0
3
0
1/7

Code