leecode/problems/1371.find-the-longest-substring-containing-vowels-in-even-counts.en.md
2020-05-22 18:17:19 +08:00

10 KiB

Problem

https://leetcode.com/problems/find-the-longest-substring-containing-vowels-in-even-counts/

Description

Given the string s, return the size of the longest substring containing each vowel an even number of times. That is, 'a', 'e', 'i', 'o', and 'u' must appear an even number of times.

Example 1:

Input: s = "eleetminicoworoep"
Output: 13
Explanation: The longest substring is "leetminicowor" which contains two each of the vowels: e, i and o and zero of the vowels: a and u.

Example 2:

Input: s = "leetcodeisgreat"
Output: 5
Explanation: The longest substring is "leetc" which contains two e's.

Example 3:

Input: s = "bcbcbc"
Output: 6
Explanation: In this case, the given string "bcbcbc" is the longest because all vowels: a, e, i, o and u appear zero times.
 
Constraints:

1 <= s.length <= 5 x 10^5
s contains only lowercase English letters.

Approach1: Brute Force and Pruning

Algorithm

My first thought on this problem is to try it with the sliding window technique, which is abandoned immediately because we need to expand or narrow the range of the sliding window (a resizable one), which is not an easy job in this particular case for that this problem involves parity, not like the ones asking for "the longest substring with the most vowels".

Suddenly I'm at a loss, so I decide to try brute force, which is simple and straightforward.

  • Find all the substrings by using two loops.
  • Use a variable max to record the maximum length of the substring that meets the condition.
  • For each substring, count the numbers of vowels. If all numbers are even, update max.

I do a little trick in the implementation. While enumerating all possible substrings, I start with the longest one, in which case an early return can be achieved, that is, the desired result will be returned once it's found, eliminating the necessity to maintain maximum value. We get less code and higher efficiency in this way.

Code(Python3/JavaScript)

Python3 Code:

class Solution:
    def findTheLongestSubstring(self, s: str) -> int:
        for i in range(len(s), 0, -1):
            for j in range(len(s) - i + 1):
                sub = s[j:j + i]
                has_odd_vowel = False
                for vowel in ['a', 'e', 'i', 'o', 'u']:
                    if sub.count(vowel) % 2 != 0:
                        has_odd_vowel = True
                        break
                if not has_odd_vowel: return  i
        return 0

JavaScript Code:

 * @param {string} s
 * @return {number}
 */
var findTheLongestSubstring = function (s) {
  const vowels = ['a', 'e', 'i', 'o', 'u']
  const hasEvenVowels = s => !vowels.some(v => (s.match(new RegExp(v, 'g'))||[]).length % 2 !== 0)
  
  for (let subStrLen = s.length; subStrLen >= 0; subStrLen--) {
    let remove = s.length - subStrLen + 1

    for (let start = 0; start < remove; start++) {
      let subStr = s.slice(start, start + subStrLen)
      if (hasEvenVowels(subStr)) {
        return subStrLen
      }
    }
  }
};

Complexity Analysis

  • Time complexity: O(n^3). Considering every substring takes O(n^2) time. For each of the subarray we calculate the numbers of vowels taking O(n^2) time in the worst case, taking a total of O(n^3) time.
  • Space complexity: O(1).

Approach2: Prefix Sum + Pruning

Algorithm

Notice that in the last approach there is a step for counting the numbers of vowels for each substring. If we look closely, we will discover a lot of duplicate computations. Optimization at this part will get us better efficiency.

For problems involving consecutive numbers, we can consider using prefix sum to get to a better solution.

By using this strategy we trade space complexity for time complexity, reducing the time complexity to O(n ^ 2), while increasing the space complexity to O(n), which is a worthwhile trade-off in many situations.

Code(Python3/Java/JavaScript)

Python3 Code:

class Solution:
    i_mapper = {
        "a": 0,
        "e": 1,
        "i": 2,
        "o": 3,
        "u": 4
    }
    def check(self, s, pre, l, r):
        for i in range(5):
            if s[l] in self.i_mapper and i == self.i_mapper[s[l]]: cnt = 1
            else: cnt = 0
            if (pre[r][i] - pre[l][i] + cnt) % 2 != 0: return False
        return True
    def findTheLongestSubstring(self, s: str) -> int:
        n = len(s)

        pre = [[0] * 5 for _ in range(n)]

        # pre
        for i in range(n):
            for j in range(5):
                if s[i] in self.i_mapper and self.i_mapper[s[i]] == j:
                    pre[i][j] = pre[i - 1][j] + 1
                else:
                    pre[i][j] = pre[i - 1][j]
        for i in range(n - 1, -1, -1):
            for j in range(n - i):
                if self.check(s, pre, j, i + j):
                    return i + 1
        return 0

Java Code:

class Solution {
    public int findTheLongestSubstring(String s) {

        int len = s.length();

        if (len == 0)
            return 0;

        int[][] preSum = new int[len][5];
        int start = getIndex(s.charAt(0));
        if (start != -1)
            preSum[0][start]++;

        // preSum
        for (int i = 1; i < len; i++) {

            int idx = getIndex(s.charAt(i));

            for (int j = 0; j < 5; j++) {

                if (idx == j)
                    preSum[i][j] = preSum[i - 1][j] + 1;
                else
                    preSum[i][j] = preSum[i - 1][j];
            }
        }

        for (int i = len - 1; i >= 0; i--) {

            for (int j = 0; j < len - i; j++) {
                if (checkValid(preSum, s, i, i + j))
                    return i + 1
            }
        }
        return 0
    }


    public boolean checkValid(int[][] preSum, String s, int left, int right) {

        int idx = getIndex(s.charAt(left));

        for (int i = 0; i < 5; i++)
            if (((preSum[right][i] - preSum[left][i] + (idx == i ? 1 : 0)) & 1) == 1)
                return false;

        return true;
    }
    public int getIndex(char ch) {

        if (ch == 'a')
            return 0;
        else if (ch == 'e')
            return 1;
        else if (ch == 'i')
            return 2;
        else if (ch == 'o')
            return 3;
        else if (ch == 'u')
            return 4;
        else
            return -1;
    }
}

JavaScript Code:

/**
 * @param {string} s
 * @return {number}
 */
var findTheLongestSubstring = function (s) {
  const prefixes = Array(s.length + 1).fill(0).map(el => Array(5).fill(0))
  const vowels = {
    a: 0,
    e: 1,
    i: 2,
    o: 3,
    u: 4
  }

  for (let i = 1; i < s.length + 1; i++) {
    const letter = s[i - 1]
    for (let j = 0; j < 5; j++) {
      prefixes[i][j] = prefixes[i - 1][j]
    }
    if (letter in vowels) {
      prefixes[i][vowels[letter]] = prefixes[i - 1][vowels[letter]] + 1
    }
  }

  const check = (s, prefixes, l, r) => {
    for (let i = 0; i < 5; i++) {
      const count = s[l] in vowels && vowels[s[l]] === i
      if ((prefixes[r + 1][i] - prefixes[l + 1][i] + count) % 2 !== 0) {
        return false
      }
    }
    return true
  }

  for (let r = s.length - 1; r >= 0; r--) {
    for (let l = 0; l < s.length - r; l++) {
      if (check(s, prefixes, l, l + r)) {
        return r + 1
      }
    }
  }

  return 0
};

Complexity Analysis

  • Time complexity: O(n^2).
  • Space complexity: O(n).

Approach 3: Prefix Sum + State Compression

Algorithm

In approach 2 we reduce the time complexity by trading space (prefix) for time. However, the time complexity of O(n^2) is still a lot. Is there still room for optimization?

All we care about is parity. We don't need to count the specific number of occurrences of each vowel. Instead, we can use two states odd or even. Since we only need to deal with two states, we can consider using bit operation.

  • Use a 5-bit binary to represent the parity of the number of occurrences of each vowel with 0 for even and 1 for odd.
  • The 5 bits of the binary represent 'uoiea' respectively. For example, 10110 means the current substring includes even numbers of 'a' and 'o' and odd numbers of 'e', 'i', and 'u'.
  • This binary is assigned to cur in the code below.

Why are we using 0 for even numbers and 1 for odd numbers? Keep reading.

This algorithm involves elementary mathematics knowledge.

  • If two numbers are of the same parity, then the subtraction must be even.
  • If two numbers are of different parity, then the subtraction must be odd.

Now let's look at the question again. Why are we using 0 for even numbers and 1 for odd numbers? Because we want to use the XOR bitwise operation, which works as follows.

  • If XOR is performed on the two binaries, each bit will be bit-operated. -If two bits are the same, we get 0, otherwise, we get 1.

This is very similar to the above mathematics knowledge. If the parity of two numbers is the same, we get an even number, otherwise, we get an odd number. So it is natural to use 0 for even numbers and 1 for odd numbers.

Code(Python3/JavaScript)

Python3 Code:

class Solution:
    def findTheLongestSubstring(self, s: str) -> int:
        mapper = {
            "a": 1,
            "e": 2,
            "i": 4,
            "o": 8,
            "u": 16
        }
        seen = {0: -1}
        res = cur = 0

        for i in range(len(s)):
            if s[i] in mapper:
                cur ^= mapper.get(s[i])
            # If all numbers are of the same parity, then the subtraction must be even.
            if cur in seen:
                res = max(res, i - seen.get(cur))
            else:
                seen[cur] = i
        return res

JavaScript Code:

/**
 * @param {string} s
 * @return {number}
 */
var findTheLongestSubstring = function (s) {
  const mapper = {
    "a": 1,
    "e": 2,
    "i": 4,
    "o": 8,
    "u": 16
  }

  let max = 0, cur = 0
  const seen = { 0: -1 }
  for (let i = 0; i < s.length; i++) {
    if (s[i] in mapper) {
      cur ^= mapper[s[i]]
    }
    if (cur in seen) {
      max = Math.max(max, i - seen[cur])
    }
    else {
      seen[cur] = i
    }
  }

  return max
};

Complexity Analysis

  • Time complexity: O(n).
  • Space complexity: O(n).

Keypoints

  • Prefix Sum
  • State Compression

Extension