Problem Informations

Problem Index: 3
Problem Link: Longest Substring Without Repeating Characters
Topics: Hash Table, String, Sliding Window

Problem Description

Given a string $s$, find the length of the longest substring without duplicate characters.

Example 1:

Input: s = "abcabcbb"
Output: 3
Explanation: The answer is "abc", with the length of 3.

Example 2:

Input: s = "bbbbb"
Output: 1
Explanation: The answer is "b", with the length of 1.

Example 3:

Input: s = "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3.
Notice that the answer must be a substring, "pwke" is a subsequence and not a substring.

Constraints:

$0 \leq s.length \leq 5 \times 10^4$
$s$ consists of English letters, digits, symbols, and spaces.

Intuition

The problem of finding the length of the longest substring without duplicate characters can be effectively addressed using the concept of a sliding window. The sliding window approach provides a way to examine a subset of the overall data (a section of the string in this case) while adjusting its size dynamically. By utilizing two pointers (one for the start and one for the end of the window), we can efficiently traverse the string. The key insight here is to maintain a record of characters’ occurrences and adjust the window boundaries to ensure that no character repeats within the window.

Approach

To solve this problem, we use a sliding window approach with two pointers: left and right. Additionally, a hashmap (or map in C++) is employed to keep track of the last occurrence index of each character. The main goal is to adjust the right pointer to expand the window and use the left pointer to ensure the window remains valid (i.e., without duplicate characters).

Initialization: We begin by initializing a map characterCount to store the count of each character within the current window. We also set two integer variables, left and right, to 0, representing the beginning and end of the window. Another integer, maxLength, is initialized to 0 to store the maximum length of a valid substring encountered so far.
Iterating with Right Pointer: We iterate through the string using the right pointer. For each character in the string, perform the following steps:
- Increment the count of the character at the current right pointer in the characterCount map.
- If a character count becomes 2 (indicating a duplicate is found within the window), enter a while loop to adjust the window:
  - Decrease the count of the character at the current left pointer.
  - Move the left pointer one step to the right, effectively shrinking the window from the left until no duplicates remain.
Updating Maximum Length: After ensuring the current window [left, right] is valid (no duplicates), calculate its length (right - left + 1) and update maxLength if the current window length surpasses the previous maximum.
Return Result: After completing the traversal of the string with the right pointer, return maxLength, which holds the length of the longest valid substring found.

This approach ensures that each character is processed at most twice (once by right and once by left), resulting in a time complexity of $O(n)$. This efficiency is suitable given the problem constraints.

Code

C++

class Solution {
public:
    int lengthOfLongestSubstring(string s) {
        // Map to store the last occurrence index of each character
        map<char, int> characterCount;
        
        int maxLength = 0; // Stores the maximum length of substring found
        int left = 0; // Left pointer for the sliding window
        int right = 0; // Right pointer for the sliding window

        // Iterate through the string with the right pointer
        for (; right < s.size(); right++) {
            characterCount[s[right]]++; // Increment the count of the current character
            
            // If a duplicate character is found (count becomes 2)
            while (characterCount[s[right]] == 2) {
                characterCount[s[left]]--; // Remove the character at the left pointer
                left++; // Move the left pointer to shrink the window
            }
            
            // Update the maximum substring length found
            maxLength = max(maxLength, right - left + 1);
        }
        
        return maxLength;
    }
};

Python

class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        # Map to store the last occurrence index of each character
        characterCount = {}

        maxLength = 0  # Stores the maximum length of substring found
        left = 0  # Left pointer for the sliding window
        right = 0  # Right pointer for the sliding window

        # Iterate through the string with the right pointer
        for right in range(len(s)):
            characterCount[s[right]] = characterCount.get(s[right], 0) + 1  # Increment the count of the current character
            
            # If a duplicate character is found (count becomes 2)
            while characterCount[s[right]] == 2:
                characterCount[s[left]] -= 1  # Remove the character at the left pointer
                left += 1  # Move the left pointer to shrink the window
            
            # Update the maximum substring length found
            maxLength = max(maxLength, right - left + 1)
        
        return maxLength

Complexity

Time complexity: $O(n)$. The algorithm uses a sliding window approach with two pointers, left and right, traversing the string s of length $n$. Each character is processed at most twice, once when it is encountered by the right pointer and potentially again if it is counted as duplicate and processed by the left pointer. This results in a linear time complexity of $O(n)$.
Space complexity: $O(m)$, where $m$ is the number of unique characters in the input string s. The characterCount map may store each unique character from the string along with its occurrence count. In the worst case, if all characters are unique, the space complexity is $O(m)$, which is bounded by the character set size. If considering English letters, digits, symbols, and spaces, $m$ can be at most 128.

CS Note
中

Leetcode 3. Longest Substring Without Repeating Characters