LeetCode算法题整理(单词查找树篇)Trie

此篇为预留篇章,其中的算法有些还未使用Trie进行优化。

1023. Camelcase Matching

驼峰匹配。原题

1
2
3
4
5
6
7
8
9
10
Input: queries = ["FooBar","FooBarTest","FootBall","FrameBuffer","ForceFeedBack"], pattern = "FB"
Output: [true,false,true,true,false]
Explanation:
"FooBar" can be generated like this "F" + "oo" + "B" + "ar".
"FootBall" can be generated like this "F" + "oot" + "B" + "all".
"FrameBuffer" can be generated like this "F" + "rame" + "B" + "uffer".

["CompetitiveProgramming","CounterPick","ControlPanel"]
"CooP"
[false,false,true]

方法一:竞赛时AC的方法。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
def camelMatch(self, queries: List[str], pattern: str) -> List[bool]:
ans = []
for word in queries:
p = 0
for i, c in enumerate(word):
if p == len(pattern):
ans.append(all(char.islower() for char in word[i:]))
break
if c == pattern[p]:
p += 1
else:
if c.isupper():
ans.append(False)
break
else:
ans.append(all(char.islower() for char in pattern[p:]))
return ans
方法二:首先判断单词和pattern是否具有相同的大写字母。iter保证了每次c开始的位置是从上一个匹配结束的地方开始的。这个非常重要,保证了比较的顺序。
1
2
3
4
5
6
7
8
9
def camelMatch(self, qs: List[str], p: str) -> List[bool]:

def u(s):
return [c for c in s if c.isupper()]

def issub(s, t):
it = iter(t) # importand, pass the `Coop` testcase
return all(c in it for c in p)
return [u(q)==u(p) and issub(p, q) for q in qs]

212. Word Search II

在矩阵中搜索多个单词,返回存在的单词。原题

1
2
3
4
5
6
7
8
9
10
Input: 
board = [
['o','a','a','n'],
['e','t','a','e'],
['i','h','k','r'],
['i','f','l','v']
]
words = ["oath","pea","eat","rain"]

Output: ["eat","oath"]

方法一:和单词搜索一样,大体的原理还是回溯法,但是要考虑一些优化,所以用Trie 这里有个地方注意要在循环刚开始前将g[i][j]设为’#’不能过早地放在return 前。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
class TrieNode:

def __init__(self):
self.child = collections.defaultdict(TrieNode)
self.is_word = False

class Trie:

def __init__(self):
self.root = TrieNode()

def insert(self, word: str) -> None:
cur = self.root
for letter in word:
cur = cur.child[letter]
cur.is_word = True

def search(self, word: str) -> bool:
cur = self.root
for letter in word:
cur = cur.child.get(letter)
if not cur:
return False
return cur.is_word

class Solution:
def findWords(self, g: List[List[str]], words: List[str]) -> List[str]:
def dfs(i, j, node, path):
if node.is_word:
ans.append(path)
node.is_word = False
if not R>i>=0<=j<C:
return
tmp = g[i][j]
node = node.child.get(tmp)
if not node:
return
g[i][j] = '#'
for x, y in ((i-1, j), (i+1, j), (i, j-1), (i, j+1)):
dfs(x, y, node, path+tmp)
g[i][j] = tmp

R, C = len(g), len(g[0])
ans = []
trie = Trie()
node = trie.root
for w in words:
trie.insert(w)
for i in range(R):
for j in range(C):
dfs(i, j, node, "")
return ans

211. Add and Search Word - Data structure design

和前缀单词查找树题一样,区别在于,这里搜索的时候可以使用’.’来匹配所有的字母。原题

1
2
3
4
5
6
7
addWord("bad")
addWord("dad")
addWord("mad")
search("pad") -> false
search("bad") -> true
search(".ad") -> true
search("b..") -> true

方法一:Trie + dfs.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
class TrieNode:
def __init__(self):
self.child = collections.defaultdict(TrieNode)
self.is_word = False

class WordDictionary:

def __init__(self):
self.root = TrieNode()

def addWord(self, word: str) -> None:
cur = self.root
for c in word:
cur = cur.child[c]
cur.is_word = True

def search(self, word: str) -> bool:
def dfs(cur, w):
if not cur:
return False
if not w:
return cur.is_word
if w[0] != '.':
return dfs(cur.child.get(w[0]), w[1:])
else:
return any(dfs(c, w[1:]) for c in cur.child.values())

return dfs(self.root, word)

820. Short Encoding of Words

这个题描述的有点乱,简单来说就是将一个单词列表转成另一个单词列表,使前者的单词每一个都是后者的后缀。求列表的长度加上分隔符#。原题

1
2
3
Input: words = ["time", "me", "bell"]
Output: 10
Explanation: S = "time#bell#" and indexes = [0, 2, 5].

方法一:因为数据不都大,所以直接用set。

1
2
3
4
5
6
def minimumLengthEncoding(self, words: List[str]) -> int:
s = set(words)
for w in words:
for i in range(1, len(w)):
s.discard(w[i:])
return sum(len(w)+1 for w in s)
方法二:单词查找树。最后不为空的node表示只是后缀的一部分。
1
2
3
4
5
6
7
8
9
def minimumLengthEncoding(self, words: List[str]) -> int:
root = dict()
leaves = []
for word in set(words):
cur = root
for w in reversed(word):
cur[w] = cur = cur.get(w, dict())
leaves.append((cur, len(word)+1))
return sum(d for node, d in leaves if len(node)==0)

1032. Stream of Characters

给定一些单词,查询字母,判断是否能和前k步的字母组成某个单词。原题

1
2
3
4
5
6
7
8
9
10
11
12
13
StreamChecker streamChecker = new StreamChecker(["cd","f","kl"]); // init the dictionary.
streamChecker.query('a'); // return false
streamChecker.query('b'); // return false
streamChecker.query('c'); // return false
streamChecker.query('d'); // return true, because 'cd' is in the wordlist
streamChecker.query('e'); // return false
streamChecker.query('f'); // return true, because 'f' is in the wordlist
streamChecker.query('g'); // return false
streamChecker.query('h'); // return false
streamChecker.query('i'); // return false
streamChecker.query('j'); // return false
streamChecker.query('k'); // return false
streamChecker.query('l'); // return true, because 'kl' is in the wordlist

方法一:我写的方法和这个差不多,但是超时了,这里Lee用了一些内置的方法,才不会超时。但是耗时也很长。

1
2
3
4
5
6
7
8
9
10
11
12
class StreamChecker:

def __init__(self, words: List[str]):
T = lambda: collections.defaultdict(T)
self.trie = T()
for w in words:
reduce(dict.__getitem__, w, self.trie)['#'] = True
self.waiting = []

def query(self, letter: str) -> bool:
self.waiting = [node[letter] for node in self.waiting + [self.trie] if letter in node]
return any('#' in node for node in self.waiting)
方法二:这个方法倒置单词存储,这样可以提前退出循环,而且维持一个最大的单词的长度即可。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
class StreamChecker:

def __init__(self, words: List[str]):
T = lambda: collections.defaultdict(T)
self.trie = T()
for w in words:
reduce(dict.__getitem__, w[::-1], self.trie)['#'] = True
self.S = ""
self.W = max(map(len, words))

def query(self, letter: str) -> bool:
self.S = (letter + self.S)[:self.W]
cur = self.trie
for c in self.S:
if c in cur:
cur = cur[c]
if cur['#']:
return True
else:
break
return False