前缀树结构
Trie树,即字典树,又称单词查找树或键树,是一种树形结构,是一种哈希树的变种。典型应用是用于统计和排序大量的字符串(但不仅限于字符串),所以经常被搜索引擎系统用于文本词频统计。它的优点是:利用字符串的公共前缀来减少查询时间,最大限度地减少无谓的字符串比较。
前缀树的基本操作
插入字符串:从根节点开始,依次插入字符串的每个字符。如果路径不存在,则创建新节点。最后一个节点标记为字符串的结束。
删除字符串:从根节点开始,依次减少路径上节点的计数。如果某节点的计数为零,则删除该节点。
查询字符串出现次数:沿路径遍历字符串,若路径完整且结束节点存在,则返回字符串的计数。
查询前缀数量:沿路径遍历前缀,若路径完整,则返回最后节点的通过计数。
节点的结构
1 2 3 4 5 6 7 8 9 10 11
| class TrieNode { public int pass; public int end; public TrieNode[] nexts;
public TrieNode() { pass = 0; end = 0; nexts = new TrieNode[26]; } }
|
插入字符串
1 2 3 4 5 6 7 8 9 10 11 12 13
| public void insert(String word) { TrieNode node = root; node.pass++; for (int i = 0, path; i < word.length(); i++) { path = word.charAt(i) - 'a'; if (node.nexts[path] == null) { node.nexts[path] = new TrieNode(); } node = node.nexts[path]; node.pass++; } node.end++; }
|
查询字符串出现次数
1 2 3 4 5 6 7 8 9 10 11 12
| public int countWordsEqualTo(String word) { TrieNode node = root; for (int i = 0, path; i < word.length(); i++) { path = word.charAt(i) - 'a'; if (node.nexts[path] == null) { return 0; } node = node.nexts[path]; } return node.end; }
|
查询前缀数量
1 2 3 4 5 6 7 8 9 10 11 12
| public int countWordsStartingWith(String pre) { TrieNode node = root; for (int i = 0, path; i < pre.length(); i++) { path = pre.charAt(i) - 'a'; if (node.nexts[path] == null) { return 0; } node = node.nexts[path]; } return node.pass; }
|
删除字符串
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
|
public void erase(String word) { if (countWordsEqualTo(word) > 0) { TrieNode node = root; node.pass--; for (int i = 0, path; i < word.length(); i++) { path = word.charAt(i) - 'a'; if (--node.nexts[path].pass == 0) { node.nexts[path] = null; return; } node = node.nexts[path]; } node.end--; } }
|
使用静态数组实现前缀树
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78
| public static int MAXN = 150001;
public static int[][] tree = new int[MAXN][26];
public static int[] end = new int[MAXN];
public static int[] pass = new int[MAXN];
public static int cnt;
public static void build() { cnt = 1; }
public static void insert(String word) { int cur = 1; pass[cur]++; for (int i = 0, path; i < word.length(); i++) { path = word.charAt(i) - 'a'; if (tree[cur][path] == 0) { tree[cur][path] = ++cnt; } cur = tree[cur][path]; pass[cur]++; } end[cur]++; }
public static int search(String word) { int cur = 1; for (int i = 0, path; i < word.length(); i++) { path = word.charAt(i) - 'a'; if (tree[cur][path] == 0) { return 0; } cur = tree[cur][path]; } return end[cur]; }
public static int prefixNumber(String pre) { int cur = 1; for (int i = 0, path; i < pre.length(); i++) { path = pre.charAt(i) - 'a'; if (tree[cur][path] == 0) { return 0; } cur = tree[cur][path]; } return pass[cur]; }
public static void delete(String word) { if (search(word) > 0) { int cur = 1; pass[cur]--; for (int i = 0, path; i < word.length(); i++) { path = word.charAt(i) - 'a'; if (--pass[tree[cur][path]] == 0) { tree[cur][path] = 0; return; } cur = tree[cur][path]; } end[cur]--; } }
public static void clear() { for (int i = 1; i <= cnt; i++) { Arrays.fill(tree[i], 0); end[i] = 0; pass[i] = 0; } }
|
相关题目
查询数组中两个数的异或最大值
思路:我们使用每一个数,根据他的每一位去构建一个前缀树。然后依次遍历每一个数,从最高位开始遍历,我希望找到跟我不一样的位,于是我在前缀树里面查找。如果存在那么把这个位置变成1加入答案,如果不存在那么把这个位置变成0加入答案。然后跳到下一个节点树去。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76
|
public static int findMaximumXOR1(int[] nums) { build(nums); int ans = 0; for (int num : nums) { ans = Math.max(ans, maxXor(num)); } clear(); return ans; }
public static int MAXN = 3000001;
public static int[][] tree = new int[MAXN][2];
public static int cnt;
public static int high;
public static void build(int[] nums) { cnt = 1; int max = Integer.MIN_VALUE; for (int num : nums) { max = Math.max(num, max); } high = 31 - Integer.numberOfLeadingZeros(max); for (int num : nums) { insert(num); } }
public static void insert(int num) { int cur = 1; for (int i = high, path; i >= 0; i--) { path = (num >> i) & 1; if (tree[cur][path] == 0) { tree[cur][path] = ++cnt; } cur = tree[cur][path]; } }
public static int maxXor(int num) { int ans = 0; int cur = 1; for (int i = high, status, want; i >= 0; i--) { status = (num >> i) & 1; want = status ^ 1; if (tree[cur][want] == 0) { want ^= 1; } ans |= (status ^ want) << i; cur = tree[cur][want]; } return ans; }
public static void clear() { for (int i = 1; i <= cnt; i++) { tree[i][0] = tree[i][1] = 0; } }
|
在二维网格中,搜索单词组返回能找到的单词组
思路:回溯+前缀树减枝
关键点:1.把走过的格子,标为0,这样不会重复走,但是需要恢复现场
2.利用前缀树减枝,首先利用单词组构建出前缀树。每次递归的时候根据前缀树有没有路径来判断这个格子里面的单词能不能要。除了路径还要判断这条路径下还有没有剩余的未探索单词(利用pass数组)。
3.巧用pass跟end数组。pass[i] 表示当前节点剩余还未收集的单词,end[i] 表示在当前节点完结的单词,可以直接加入ans。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92
| public static List<String> findWords(char[][] board, String[] words) { build(words); List<String> ans = new ArrayList<>(); for (int i = 0; i < board.length; i++) { for (int j = 0; j < board[0].length; j++) { dfs(board, i, j, 1, ans); } } clear(); return ans; }
public static int dfs(char[][] board, int i, int j, int t, List<String> ans) { if (i < 0 || i == board.length || j < 0 || j == board[0].length || board[i][j] == 0) { return 0; } char tmp = board[i][j]; int road = tmp - 'a'; t = tree[t][road]; if (pass[t] == 0) { return 0; } int fix = 0; if (end[t] != null) { fix++; ans.add(end[t]); end[t] = null; } board[i][j] = 0; fix += dfs(board, i - 1, j, t, ans); fix += dfs(board, i + 1, j, t, ans); fix += dfs(board, i, j - 1, t, ans); fix += dfs(board, i, j + 1, t, ans); pass[t] -= fix; board[i][j] = tmp; return fix; }
public static int MAXN = 10001;
public static int[][] tree = new int[MAXN][26];
public static int[] pass = new int[MAXN];
public static String[] end = new String[MAXN];
public static int cnt;
public static void build(String[] words) { cnt = 1; for (String word : words) { int cur = 1; pass[cur]++; for (int i = 0, path; i < word.length(); i++) { path = word.charAt(i) - 'a'; if (tree[cur][path] == 0) { tree[cur][path] = ++cnt; } cur = tree[cur][path]; pass[cur]++; } end[cur] = word; } }
public static void clear() { for (int i = 1; i <= cnt; i++) { Arrays.fill(tree[i], 0); pass[i] = 0; end[i] = null; } }
|