CodeBook

1 Trie


Last updated 6 years ago

A trie is a tree-like data structure used to store strings. It consists of nodes and edges. Each node has at most 26 children, and edges connect each parent node to its children. The 26 child pointers correspond to the 26 letters of the English alphabet, and a separate edge is maintained for each letter.

Strings are stored from top to bottom in a trie on the basis of their prefixes. All prefixes of length 1 are stored at level 1, all prefixes of length 2 are stored at level 2, and so on.

A trie is very commonly used to represent the words stored in a dictionary. Each level represents one character of the word being formed. A word in the dictionary can be read off from the trie by starting at the root and following edges down to a node that marks the end of a word (often, but not always, a leaf).

There are several other data structures, such as balanced trees and hash tables, that let us search for a word in a dataset of strings. Then why do we need a trie? Although a hash table has O(1) time complexity for looking up a key, it is not efficient for the following operations:

  • Finding all keys with a common prefix.

  • Enumerating a dataset of strings in lexicographical order.

Another reason a trie can outperform a hash table is that as a hash table grows, hash collisions become more frequent and the search time complexity could deteriorate to O(n), where n is the number of keys inserted. A trie could also use less space than a hash table when storing many keys with the same prefix. In this case, a trie lookup has only O(m) time complexity, where m is the key length. By comparison, searching for a key in a balanced tree costs O(m log n) time.
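The two operations listed above can be sketched as a single depth-first walk. This is a minimal illustration, not a definitive implementation: the class name `PrefixEnumerationSketch` and the method `keysWithPrefix` are hypothetical, and keys are assumed to contain only the lowercase letters 'a' to 'z'. Passing an empty prefix enumerates the whole dataset in lexicographical order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: enumerate all keys that share a prefix, in lexicographical order.
class PrefixEnumerationSketch {
    static final class Node {
        final Node[] links = new Node[26]; // assumption: keys use only 'a'..'z'
        boolean isEnd;                     // true if an inserted key ends at this node
    }

    private final Node root = new Node();

    void insert(String key) {
        Node node = root;
        for (char ch : key.toCharArray()) {
            int idx = ch - 'a';
            if (node.links[idx] == null) {
                node.links[idx] = new Node();
            }
            node = node.links[idx];
        }
        node.isEnd = true;
    }

    // Returns every stored key that starts with the given prefix.
    List<String> keysWithPrefix(String prefix) {
        List<String> out = new ArrayList<>();
        Node node = root;
        for (char ch : prefix.toCharArray()) {
            node = node.links[ch - 'a'];
            if (node == null) {
                return out; // no stored key has this prefix
            }
        }
        collect(node, new StringBuilder(prefix), out);
        return out;
    }

    // Visiting links in index order ('a' first) yields lexicographical order.
    private void collect(Node node, StringBuilder path, List<String> out) {
        if (node.isEnd) {
            out.add(path.toString());
        }
        for (int i = 0; i < 26; i++) {
            if (node.links[i] != null) {
                path.append((char) ('a' + i));
                collect(node.links[i], path, out);
                path.deleteCharAt(path.length() - 1);
            }
        }
    }
}
```

A hash table would have to scan and sort all of its keys to answer the same query, whereas here the work is limited to the subtree below the prefix.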

Trie node structure

A trie is a rooted tree. Its nodes have the following fields:

  • A maximum of R links to its children, where each link corresponds to one of the R character values of the dataset alphabet. In this article we assume that R is 26, the number of lowercase Latin letters.

  • Boolean field which specifies whether the node corresponds to the end of the key, or is just a key prefix.
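The two fields above could be sketched in Java roughly as follows. The names `TrieNode`, `links`, and the helper methods are illustrative assumptions, and the index arithmetic `ch - 'a'` only works for keys made of the lowercase letters 'a' to 'z'.

```java
// Minimal sketch of a trie node with R = 26 links and an end-of-key flag.
class TrieNode {
    private static final int R = 26;                  // alphabet size (assumption: 'a'..'z')
    private final TrieNode[] links = new TrieNode[R]; // links[i] is the child for letter 'a' + i
    private boolean isEnd;                            // true if a key ends at this node

    boolean containsKey(char ch) {
        return links[ch - 'a'] != null;
    }

    TrieNode get(char ch) {
        return links[ch - 'a'];
    }

    void put(char ch, TrieNode node) {
        links[ch - 'a'] = node;
    }

    void setEnd() {
        isEnd = true;
    }

    boolean isEnd() {
        return isEnd;
    }
}
```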

Figure: Representation of a key in a trie

Insertion of a key to a trie

We insert a key by searching the trie. We start from the root and look for a link that corresponds to the first key character. There are two cases:

  • A link exists. Then we move down the tree following the link to the next child level. The algorithm continues with searching for the next key character.

  • A link does not exist. Then we create a new node and link it with the parent's link matching the current key character. We repeat this step until we encounter the last character of the key, then we mark the current node as an end node and the algorithm finishes.

Complexity Analysis

  • Time complexity: O(m), where m is the key length.

In each iteration of the algorithm, we either examine or create a node in the trie till we reach the end of the key. This takes only m operations.

  • Space complexity: O(m)

In the worst case, the newly inserted key doesn't share a prefix with the keys already inserted in the trie. We then have to add m new nodes, which takes O(m) space.
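The insertion steps and the space bound can be illustrated with a short sketch. The class `InsertSketch` and its `nodeCount` field are hypothetical names added only to make node creation visible, and keys are assumed to contain only 'a' to 'z'.

```java
// Hypothetical sketch of trie insertion that also counts allocated nodes.
class InsertSketch {
    static final class Node {
        final Node[] links = new Node[26]; // assumption: keys use only 'a'..'z'
        boolean isEnd;                     // true if an inserted key ends at this node
    }

    final Node root = new Node();
    int nodeCount = 1; // the root; illustration-only counter

    void insert(String key) {
        Node node = root;
        for (int i = 0; i < key.length(); i++) {
            int idx = key.charAt(i) - 'a';
            if (node.links[idx] == null) {
                // Case 2: the link does not exist, so create a node for this character.
                node.links[idx] = new Node();
                nodeCount++;
            }
            // Case 1 (or after creation): follow the link to the next level.
            node = node.links[idx];
        }
        node.isEnd = true; // mark the node reached by the last character
    }
}
```

Inserting "le" and then "leet" into an empty trie creates two nodes each time, because "leet" reuses the path for "le". A key with no shared prefix, such as "code", adds one node per character.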

Search for a key in a trie

Each key is represented in the trie as a path from the root to an internal node or a leaf. We start from the root with the first key character. We examine the current node for a link corresponding to the key character. There are two cases:

  • A link exists. We move to the next node in the path following this link, and proceed to search for the next key character.

  • A link does not exist. If there are no key characters left and the current node is marked as isEnd, we return true. Otherwise there are two possible cases, and in each of them we return false:

    • There are key characters left, but it is impossible to follow the key path in the trie, so the key is missing.

    • No key characters are left, but the current node is not marked as isEnd. Therefore the search key is only a prefix of another key in the trie.

Complexity Analysis

  • Time complexity: O(m). In each step of the algorithm we search for the next key character. In the worst case the algorithm performs m operations.

  • Space complexity: O(1)
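A minimal sketch of this search, under the same assumption that keys contain only 'a' to 'z'. The class name `TrieSearchSketch` is hypothetical, and `insert` is included only so the example runs on its own.

```java
// Hypothetical sketch of whole-key search in a trie.
class TrieSearchSketch {
    static final class Node {
        final Node[] links = new Node[26]; // assumption: keys use only 'a'..'z'
        boolean isEnd;                     // true if an inserted key ends at this node
    }

    private final Node root = new Node();

    // Included only to make the sketch self-contained.
    void insert(String key) {
        Node node = root;
        for (char ch : key.toCharArray()) {
            int idx = ch - 'a';
            if (node.links[idx] == null) {
                node.links[idx] = new Node();
            }
            node = node.links[idx];
        }
        node.isEnd = true;
    }

    boolean search(String key) {
        Node node = root;
        for (char ch : key.toCharArray()) {
            node = node.links[ch - 'a'];
            if (node == null) {
                return false; // key characters left, but the path cannot be followed
            }
        }
        // No key characters left: true only if a whole key ends here.
        return node.isEnd;
    }
}
```

The loop uses a single node reference and no extra storage, which matches the O(1) space bound.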

Search for a key prefix in a trie

The approach is very similar to the one we used for searching for a key in a trie. We traverse the trie from the root until there are no characters left in the key prefix or it is impossible to continue the path in the trie with the current key character. The only difference from the search-for-a-key algorithm described above is that when we reach the end of the key prefix, we always return true. We don't need to consider the isEnd mark of the current trie node, because we are searching for a prefix of a key, not for a whole key.

Complexity Analysis

  • Time complexity: O(m)

  • Space complexity: O(1)
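The prefix variant can be sketched as below. The class name `TriePrefixSketch` and the method name `startsWith` are assumptions, keys are again assumed to contain only 'a' to 'z', and `insert` is included only so the example runs on its own.

```java
// Hypothetical sketch of prefix search in a trie.
class TriePrefixSketch {
    static final class Node {
        final Node[] links = new Node[26]; // assumption: keys use only 'a'..'z'
        boolean isEnd;                     // set on insert, deliberately ignored by startsWith
    }

    private final Node root = new Node();

    // Included only to make the sketch self-contained.
    void insert(String key) {
        Node node = root;
        for (char ch : key.toCharArray()) {
            int idx = ch - 'a';
            if (node.links[idx] == null) {
                node.links[idx] = new Node();
            }
            node = node.links[idx];
        }
        node.isEnd = true;
    }

    boolean startsWith(String prefix) {
        Node node = root;
        for (char ch : prefix.toCharArray()) {
            node = node.links[ch - 'a'];
            if (node == null) {
                return false; // the path breaks before the prefix is consumed
            }
        }
        return true; // the whole prefix was matched; isEnd is not consulted
    }
}
```

With only "leet" inserted, `startsWith("le")` and `startsWith("leet")` both return true, while `startsWith("lex")` returns false.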

Hash

Hash vs Trie

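The two are interchangeable.

A trie uses less space, while a single lookup in a trie takes more time (the complexity is the same, but the trie has a larger constant factor).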

Figures: Insertion of keys into a trie; Search of a key in a trie; Search of a key prefix in a trie.