Description

You are given a string s and an array of strings words. All the strings of words are of the same length.

concatenated string is a string that exactly contains all the strings of any permutation of words concatenated.

  • For example, if words = ["ab","cd","ef"], then "abcdef""abefcd""cdabef""cdefab""efabcd", and "efcdab" are all concatenated strings. "acdbef" is not a concatenated string because it is not the concatenation of any permutation of words.

Return an array of the starting indices of all the concatenated substrings in s. You can return the answer in any order.

Idea

Sliding window with frequency counter

from collections import defaultdict
 
class Solution:
    def findSubstring(self, s: str, words: List[str]) -> List[int]:
        ans = []
        words_dict = defaultdict(int)
        found_freq = defaultdict(int)
        window_size = len(words[0])
        concat_size = window_size * len(words)
 
        for i in range(len(words)):
            words_dict[words[i]] += 1
 
        for w in range(window_size):
            counter = 0
            found_freq.clear()
            i = w
            while i < len(s) - window_size + 1:
                candidate = s[i:i+window_size]
                
                if not candidate in words_dict:
                    found_freq.clear()
                    counter = 0
                    i += window_size
                    continue
                found_freq[candidate] += 1
                counter += 1
                i += window_size
 
                while found_freq[candidate] > words_dict[candidate] or counter > len(words):
                    last_word_idx = i - counter * window_size
                    last_word = s[last_word_idx:last_word_idx + window_size]
                    counter -= 1
                    found_freq[last_word] -= 1
                    if counter == 0:
                        break
                
                if counter == len(words):
                    ans.append(i - counter * window_size)
 
        return ans