Definition
Longest Common Subsequence Problem
Given two strings and over an alphabet . The longest common subsequence problem asks for the length of a longest string that is a subsequence of both and .
Input:
Brute-Force Algorithm
Brute-Force Algorithm
A brute-force approach enumerates all subsequences of , checks for each whether it is also a subsequence of , and keeps the maximum length found.
int lcs_bruteforce(String S, String T) int best := 0 for each subsequence A of S: if is_subsequence(A, T): best := max(best, |A|) return bestChecking whether is a subsequence of takes time by scanning left to right.
Time complexity: where and .
Recursive Algorithm
Recursive Dynamic Programme
Use zero-based indices. For a string , is the letter at index , and is the substring from index to index , including both endpoints.
Let be the length of an LCS of the prefixes and .
If one prefix is empty, no non-empty common subsequence is possible:
Now consider an LCS of and .
- If ends with and , then the last letters can be matched. This gives .
- If ends with but , then is not used. This case is covered by .
- If does not end with , then is not used. This case is covered by .
In the matching case, taking the matching last letters dominates dropping one of them. Hence the recurrence simplifies to:
int lcs_rec(int i, int j) if i < 0 or j < 0: return 0 if memo[i][j] is known: return memo[i][j] if S[i] == T[j]: memo[i][j] := lcs_rec(i-1, j-1) + 1 else: memo[i][j] := max(lcs_rec(i, j-1), lcs_rec(i-1, j)) return memo[i][j]Correctness follows from the case split above. The diagram shows the two behaviours of a recursive call.
With memoisation or bottom-up table filling, there are values to compute, and each value takes work after its dependencies are known.
Time complexity: . Without memoisation, the recursion tree can be exponential because the same subproblems reappear.
Examples
The longest common subsequence is with length .
Both and are longest common subsequences, each of length .
The longest common subsequence is with length .
No character appears in both strings. The only common subsequence is the empty string , so .