Leetcode 1055. Shortest Way to Form String
Table of Contents
Problem Informations
- Problem Index: 1055
- Problem Link: Shortest Way to Form String
- Topics: Two Pointers, String, Binary Search, Greedy
Problem Description
A subsequence of a string is a new string that is formed from the original string by deleting some (can be none) of the characters without disturbing the relative positions of the remaining characters. (i.e., “ace” is a subsequence of “abcde” while “aec” is not).
Given two strings source and target, return the minimum number of subsequences of source such that their concatenation equals target. If the task is impossible, return -1.
Example 1:
- Input: source = “abc”, target = “abcbc”
- Output: 2
- Explanation: The target “abcbc” can be formed by “abc” and “bc”, which are subsequences of source “abc”.
Example 2:
- Input: source = “abc”, target = “acdbc”
- Output: -1
- Explanation: The target string cannot be constructed from the subsequences of source string due to the character “d” in target string.
Example 3:
- Input: source = “xyz”, target = “xzyxz”
- Output: 3
- Explanation: The target string can be constructed as follows “xz” + “y” + “xz”.
Constraints:
- $1 \leq \text{source.length}, \text{target.length} \leq 1000$
- source and target consist of lowercase English letters.
Intuition
The problem involves forming the target string by concatenating subsequences of the source string. A subsequence can be formed by deleting characters, while preserving the order of the remaining characters. The crux of the problem is to determine how many such subsequences are required at a minimum. If any character in the target does not exist in the source, it is impossible to form the target string, and the solution should return -1. Otherwise, the task is to navigate through both strings and count how many complete traversals or partial traversals of the source are needed to match the target.
Approach
The solution leverages a greedy and two-pointer approach to solve the problem efficiently.
Set Initialization: First, store each unique character from the
sourcestring into a set. This helps in quickly determining if a character in thetargetis absent from thesource.Immediate Impossibility Check: Before proceeding, verify if every character in the
targetis present in thesource. If the set does not contain any character from thetarget, return -1 as it’s impossible to form such a character with any subsequences ofsource.Two-Pointer Technique: Utilize two pointers:
sourceIdx, which traverses thesource, andtargetIdx, which traverses thetarget.- Begin by initializing
subsequencesCountto 1, signifying that at least one subsequence will be needed. - Loop through until the
targetIdxreaches the length oftarget.
- Begin by initializing
Traversal Logic:
- Increment the
sourceIdxto progress through thesourcestring. - If
sourceIdxsurpasses the length of thesource, increment thesubsequencesCount, indicating the need for a new subsequence, and resetsourceIdxto 0.
- Increment the
Matching Characters: Use another loop within the main loop to locate the next character in the
sourcethat matches the current character of thetarget. If a match is found, advance thetargetIdx.Subsequence Count: Continue advancing through the
sourceand resetting as needed until all characters oftargetcan be formed. The value ofsubsequencesCountat the end of this traversal will represent the minimal number of subsequences of thesourcerequired to form thetarget.
Overall, this approach efficiently combines scanning and matching tasks under the control of two pointers, while accounting for full resets whenever a traversal of the source is exhausted, ensuring optimal subsequences enumeration.
Code
C++
class Solution {
public:
int shortestWay(string source, string target) {
// Create a set to hold unique characters from the `source` string.
set<char> sourceSet;
// Insert each character from `source` into the set for quick lookup.
for (char c : source) {
sourceSet.insert(c);
}
// Check if every character in `target` exists in `source`. If not, return -1.
for (char c : target) {
if (sourceSet.count(c) == 0) {
return -1;
}
}
// Initialize the number of subsequences needed.
int subsequencesCount = 1;
int sourceIdx = -1; // Index in the `source` string.
int targetIdx = 0; // Index in the `target` string.
// Traverse the `target` string.
while (targetIdx < target.size()) {
sourceIdx++;
// If `sourceIdx` exceeds the length of `source`, we need an additional subsequence.
if (sourceIdx == source.size()) {
subsequencesCount++;
sourceIdx = 0;
}
// Move through `source` until we find a matching character for `target[targetIdx]`.
while (source[sourceIdx] != target[targetIdx]) {
sourceIdx++;
if (sourceIdx == source.size()) {
subsequencesCount++;
sourceIdx = 0;
}
}
// Move to the next character in `target`.
targetIdx++;
}
// Return the total number of subsequences needed.
return subsequencesCount;
}
};
Python
class Solution:
def shortestWay(self, source: str, target: str) -> int:
# Create a set to hold unique characters from the `source` string.
sourceSet = set()
# Insert each character from `source` into the set for quick lookup.
for c in source:
sourceSet.add(c)
# Check if every character in `target` exists in `source`. If not, return -1.
for c in target:
if c not in sourceSet:
return -1
# Initialize the number of subsequences needed.
subsequencesCount = 1
sourceIdx = -1 # Index in the `source` string.
targetIdx = 0 # Index in the `target` string.
# Traverse the `target` string.
while targetIdx < len(target):
sourceIdx += 1
# If `sourceIdx` exceeds the length of `source`, we need an additional subsequence.
if sourceIdx == len(source):
subsequencesCount += 1
sourceIdx = 0
# Move through `source` until we find a matching character for `target[targetIdx]`.
while source[sourceIdx] != target[targetIdx]:
sourceIdx += 1
if sourceIdx == len(source):
subsequencesCount += 1
sourceIdx = 0
# Move to the next character in `target`.
targetIdx += 1
# Return the total number of subsequences needed.
return subsequencesCount
Complexity
- Time complexity: The time complexity is $O(n \times m)$, where $n$ is the length of the
sourcestring and $m$ is the length of thetargetstring. This is because for each character in thetarget, the algorithm may need to traverse the entiresourcestring to find a matching character. - Space complexity: The space complexity is $O(n)$, where $n$ is the length of the
sourcestring. This is due to the space used by the setsourceSetto store unique characters from thesourcestring. The set can contain at most all characters of thesource.
Disclaimer: All reference materials on this website are sourced from the internet and are intended for learning purposes only. If you believe any content infringes upon your rights, please contact me at csnote.cc@gmail.com, and I will remove the relevant content promptly.
Feedback Welcome: If you notice any errors or areas for improvement in the articles, I warmly welcome your feedback and corrections. Your input will help this blog provide better learning resources. This is an ongoing process of learning and improvement, and your suggestions are valuable to me. You can reach me at csnote.cc@gmail.com.