Leetcode 1092. Shortest Common Supersequence
Table of Contents
Problem Informations
- Problem Index: 1092
- Problem Link: Shortest Common Supersequence
- Topics: String, Dynamic Programming
Problem Description
Given two strings str1
and str2
, return the shortest string that has both str1
and str2
as subsequences. If there are multiple valid strings, return any of them.
A string s
is a subsequence of string t
if deleting some number of characters from t
(possibly 0
) results in the string s
.
Example 1:
Input: str1 = "abac", str2 = "cab"
Output: "cabac"
Explanation:
str1 = "abac" is a subsequence of "cabac" because we can delete the first "c".
str2 = "cab" is a subsequence of "cabac" because we can delete the last "ac".
The answer provided is the shortest such string that satisfies these properties.
Example 2:
Input: str1 = "aaaaaaaa", str2 = "aaaaaaaa"
Output: "aaaaaaaa"
Constraints:
- $1 \leq \text{str1.length}, \text{str2.length} \leq 1000$
str1
andstr2
consist of lowercase English letters.
Intuition
The problem at hand is to find the shortest string that contains two given strings as subsequences. Recognizing that subsequences do not require consecutive characters but rather a preservation of relative order, this problem bears similarity to finding a common ancestor in terms of sequence alignment. It shares conceptual grounding with the concept of a shortest common supersequence (SCS), which is closely associated with the longest common subsequence (LCS). By determining the LCS, we can strategically interweave the differing characters from both strings around this sequence to achieve the desired SCS length.
Approach
The approach to solving this problem involves dynamic programming to build a table where each entry at index $(i, j)$ represents the length of the shortest common supersequence for the substrings of str1
and str2
up to indices $i$ and $j$, respectively.
Initialization: Create a 2D list
dp
of dimensions $(\text{len1} + 1) \times (\text{len2} + 1)$, wherelen1
andlen2
are the lengths ofstr1
andstr2
. Initialize the base cases where one string is empty:- When one string is empty, the SCS is simply the length of the other string. Therefore,
dp[i][0] = i
anddp[0][j] = j
.
- When one string is empty, the SCS is simply the length of the other string. Therefore,
DP Table Construction:
- Iterate over each character index of
str1
andstr2
. - If characters at the current positions in both strings match,
dp[i][j] = dp[i-1][j-1] + 1
because matching characters can contribute directly to the SCS. - If characters do not match,
dp[i][j] = \min(dp[i-1][j], dp[i][j-1]) + 1
because the optimal SCS would include the character requiring less deletions either fromstr1
orstr2
.
- Iterate over each character index of
Reconstruction of the Shortest Common Supersequence:
- Starting from
dp[len1][len2]
, backtrack throughdp
to construct the SCS. - If characters match, include this character and move diagonally.
- If
dp[i][j-1]
leads to the current cell, include the character fromstr2
. - If
dp[i-1][j]
leads to the current cell, include the character fromstr1
. - Continue until one of the strings is fully traversed, at which point append the remaining characters of the other string.
- Starting from
By following this procedure, the shortest common supersequence is constructed efficiently, capturing both input strings as subsequences.
Code
C++
class Solution {
public:
string shortestCommonSupersequence(string str1, string str2) {
int len1 = str1.size();
int len2 = str2.size();
vector<vector<int>> dp(len1 + 1, vector<int>(len2 + 1));
// Initialize the base cases: the length of supersequence
// if one of the strings is empty.
for (int i = 0; i <= len1; i++) dp[i][0] = i;
for (int j = 0; j <= len2; j++) dp[0][j] = j;
// Build the dp table, where dp[i][j] represents
// the length of the shortest common supersequence for
// substrings str1[0...i-1] and str2[0...j-1].
for (int i = 1; i <= len1; i++) {
for (int j = 1; j <= len2; j++) {
if (str1[i - 1] == str2[j - 1]) {
dp[i][j] = dp[i - 1][j - 1] + 1;
} else {
dp[i][j] = min(dp[i - 1][j], dp[i][j - 1]) + 1;
}
}
}
// Reconstruct the shortest common supersequence using the dp table.
int curIndex1 = len1;
int curIndex2 = len2;
string result = "";
while (curIndex1 != 0 || curIndex2 != 0) {
if (curIndex2 == 0) {
result = str1[--curIndex1] + result;
} else if (curIndex1 == 0) {
result = str2[--curIndex2] + result;
} else if (dp[curIndex1 - 1][curIndex2 - 1] == dp[curIndex1][curIndex2] - 1 && str1[curIndex1 - 1] == str2[curIndex2 - 1]) {
result = str1[--curIndex1] + result;
--curIndex2;
} else if (dp[curIndex1][curIndex2 - 1] == dp[curIndex1][curIndex2] - 1) {
result = str2[--curIndex2] + result;
} else {
result = str1[--curIndex1] + result;
}
}
return result;
}
};
Python
class Solution:
def shortestCommonSupersequence(self, str1, str2):
len1 = len(str1)
len2 = len(str2)
dp = [[0] * (len2 + 1) for _ in range(len1 + 1)]
# Initialize the base cases: the length of supersequence
# if one of the strings is empty.
for i in range(len1 + 1):
dp[i][0] = i
for j in range(len2 + 1):
dp[0][j] = j
# Build the dp table, where dp[i][j] represents
# the length of the shortest common supersequence for
# substrings str1[0...i-1] and str2[0...j-1].
for i in range(1, len1 + 1):
for j in range(1, len2 + 1):
if str1[i - 1] == str2[j - 1]:
dp[i][j] = dp[i - 1][j - 1] + 1
else:
dp[i][j] = min(dp[i - 1][j], dp[i][j - 1]) + 1
# Reconstruct the shortest common supersequence using the dp table.
curIndex1 = len1
curIndex2 = len2
result = ""
while curIndex1 != 0 or curIndex2 != 0:
if curIndex2 == 0:
result = str1[curIndex1 - 1] + result
curIndex1 -= 1
elif curIndex1 == 0:
result = str2[curIndex2 - 1] + result
curIndex2 -= 1
elif dp[curIndex1 - 1][curIndex2 - 1] == dp[curIndex1][curIndex2] - 1 and str1[curIndex1 - 1] == str2[curIndex2 - 1]:
result = str1[curIndex1 - 1] + result
curIndex1 -= 1
curIndex2 -= 1
elif dp[curIndex1][curIndex2 - 1] == dp[curIndex1][curIndex2] - 1:
result = str2[curIndex2 - 1] + result
curIndex2 -= 1
else:
result = str1[curIndex1 - 1] + result
curIndex1 -= 1
return result
Complexity
Time complexity: The time complexity is $O(\text{len1} \times \text{len2})$, where $\text{len1}$ and $\text{len2}$ are the lengths of
str1
andstr2
, respectively. The solution constructs a dynamic programming table (dp
) which has dimensions $(\text{len1} + 1) \times (\text{len2} + 1)$. Both the building of the table and the reconstruction process of the shortest common supersequence take $O(\text{len1} \times \text{len2})$ time.Space complexity: The space complexity is $O(\text{len1} \times \text{len2})$. This space is used to maintain the
dp
table to store the lengths of shortest common supersequences for all possible substrings ofstr1
andstr2
.
Disclaimer: All reference materials on this website are sourced from the internet and are intended for learning purposes only. If you believe any content infringes upon your rights, please contact me at csnote.cc@gmail.com, and I will remove the relevant content promptly.
Feedback Welcome: If you notice any errors or areas for improvement in the articles, I warmly welcome your feedback and corrections. Your input will help this blog provide better learning resources. This is an ongoing process of learning and improvement, and your suggestions are valuable to me. You can reach me at csnote.cc@gmail.com.