【LetMeFly】792.匹配子序列的单词数

力扣题目链接：https://leetcode.cn/problems/number-of-matching-subsequences/

给定字符串 s 和字符串数组 words, 返回 words[i] 中是s的子序列的单词个数 。

字符串的 子序列 是从原始字符串中生成的新字符串，可以从中删去一些字符(可以是none)，而不改变其余字符的相对顺序。

例如， “ace” 是 “abcde” 的子序列。

示例 1:

输入: s = "abcde", words = ["a","bb","acd","ace"]
输出: 3
解释: 有三个是 s 的子序列的单词: "a", "acd", "ace"。

Example 2:

输入: s = "dsahjpjauf", words = ["ahjpjau","ja","ahbwzgqnuk","tnmlanowax"]
输出: 2

提示:

1 <= s.length <= 5 * 10⁴
1 <= words.length <= 5000
1 <= words[i].length <= 50
words[i]和 s 都只由小写字母组成。

方法一：二分查找

方法一的思路是每个字符串单独处理。

首先需要预处理字符串s，记录下来s中每个字母的出现位置。假如s = "aba"，那么a出现的下标为[0, 2]，b出现的下标为[1]

这样，在处理words中每个字符串的时候，只需要从前到后遍历字符串，在s中二分查找当前遍历到的字母即可。

时间复杂度$O(len(s) + N\times len(s))$，其中$N$是$words$中所有单词的个数
空间复杂度$O(len(s))$

AC代码

C++

class Solution {
public:
    int numMatchingSubseq(string& s, vector<string>& words) {
        vector<int> a[26];
        for (int i = 0; i < s.size(); i++)
            a[s[i] - 'a'].push_back(i);
        int ans = 0;
        for (string& word : words) {
            bool ok = true;
            int loc = -1;
            for (char c : word) {
                vector<int>::iterator it = lower_bound(a[c - 'a'].begin(), a[c - 'a'].end(), loc + 1);  // 在s中所有出现过字符c的下标中，找到大于loc的第一个下标
                if (it == a[c - 'a'].end()) {
                    ok = false;
                    break;
                }
                loc = *it;
            }
            ans += ok;
        }
        return ans;
    }
};

方法二：多指针

方法二的思路是遍历字符串s，在遍历的过程中，不断将这个字符对应的字符串的指针后移。

例如样例一：s = "abcde", words = ["a","bb","acd","ace"]

首先建立$4$个指针（因为有$4$个字符串）

a
↑

bb
↑

acd
↑

ace
↑

然后建立一个大小为26的队列数组，队列中存放二十六个字母对应的指针

[0]: 0, 2, 3  // 是因为四个指针(0, 1, 2, 3)中，第0、2、3个指针所指的元素为a
[1]: 1  // 是因为四个指针中，第1号指针所指元素为b
[2]:
[3]:
[4]:
...
[25]:

接下来遍历字符串s

s的第一个字母为a，看a的队列，有三个指针0, 2, 3

将它们分别后移一位：

0号指针对应字符串为a，指针后移一位达到了字符串的末尾，也就是说0号指针把字符串a“指完了”，因此a是s的子序列
2号指针对应字符串为acd，指针后移一位，移动到c。因此队列[2]: 2
3号指针对应字符串为ace，指针后移一位，移动到c。因此队列[2]: 3

[0]:
[1]: 1  // 是因为四个指针中，第1号指针所指元素为b
[2]: 2, 3
[3]:
[4]:
...
[25]:

s的第二个字母为b，看b的队列，有一个指针1

将它后移一位：

1号指针对应字符串为bb，指针后移一位，移动到第二个b。因此队列[1]: 1

[0]:
[1]: 1
[2]: 2, 3
[3]:
[4]:
...
[25]:

s的第三个字母为c，看c的队列，有两个指针2, 3

将它们分别后移一位：

2号指针对应字符串为acd，指针后移一位，移动到d。因此队列[3]: 2
3号指针对应字符串为ace，指针后移一位，移动到e。因此队列[4]: 3

[0]:
[1]: 1
[2]:
[3]: 2
[4]: 3
...
[25]:

s的第四个字母为d，看d的队列，有一个指针2

将它后移一位：

2号指针对应字符串为acd，指针后移一位达到了字符串的末尾，也就是说2号指针把字符串acd“指完了”，因此acd是s的子序列

[0]:
[1]: 1
[2]:
[3]:
[4]: 3
...
[25]:

s的第五个字母为e，看e的队列，有一个指针3

将它后移一位：

3号指针对应字符串为ace，指针后移一位达到了字符串的末尾，也就是说3号指针把字符串ace“指完了”，因此ace是s的子序列

[0]:
[1]: 1
[2]:
[3]:
[4]:
...
[25]:

字符串s遍历结束，words中三个字符串是s的子序列

时间复杂度$O(len(s) + N)$，其中$N$是$words$中所有单词的个数
空间复杂度$O(N + C)$，其中$C$是字符种类数小写字母个数$26$

AC代码

C++

class Solution {
public:
    int numMatchingSubseq(string& s, vector<string>& words) {
        queue<int> q[26];  // q[0]: 下一个是'a'的word在words中的index
        for (int index = 0; index < words.size(); index++)
            q[words[index][0] - 'a'].push(index);
        vector<int> loc(words.size(), 0);  // loc[0]: words[0]该匹配哪个单词了
        int ans = 0;
        for (char c : s) {
            for (int i = q[c - 'a'].size(); i > 0; i--) {
                int index = q[c - 'a'].front();
                q[c - 'a'].pop();
                loc[index]++;
                if (loc[index] == words[index].size()) {
                    ans++;
                    continue;
                }
                q[words[index][loc[index]] - 'a'].push(index);
            }
        }
        return ans;
    }
};

同步发文于CSDN，原创不易，转载请附上原文链接哦~
Tisfy：https://letmefly.blog.csdn.net/article/details/127908867

#题解 #中等 #字符串 #排序 #LeetCode #哈希表 #字典树

792.匹配子序列的单词数

https://blog.letmefly.xyz/2022/11/17/LeetCode 0792.匹配子序列的单词数/

作者

Tisfy

发布于

2022年11月17日

许可协议

CSDN - CSDN9题解上一篇

775.全局倒置与局部倒置下一篇