CodeBook
  • Introduction
  • Array
    • Maximum Product of Three Numbers
    • Set Mismatch
    • Find the Duplicate Number
    • Find All Duplicates in an Array
    • Find All Numbers Disappeared in an Array
    • Missing Number
    • Single Number
    • Find Difference
    • Find the Celebrity
    • Word Distance
    • Product of Array Except Self
  • Binary Search
    • First Bad Version
    • Search in a Big Sorted Array
    • Search Range
    • Find the Peak
    • Maximum Number in Mountain Sequence
    • Search in Rotated Sorted Array
    • Find Minimum in Rotated Sorted Array
    • Search a 2D Matrix
    • Search a 2D Matrix II
    • Smallest Rectangle Enclosing Black Pixels
    • [Binary Search] Merge Two Sorted Array
    • Single Element in a Sorted Array
  • Two Pointers
    • 1 Forward
      • Moving Zeros
      • Remove Elements
      • Remove Duplicated
      • Longest Continuous Increasing Subsequence
      • Replace all Occurrences of String AB with C
    • 2 Oppsite
      • Rotate Array
      • Container With Most Water
      • Trapping Rain Water
      • Triangle Count
    • 3 Sliding Windows
      • Permutation in String
      • Find All Anagrams in a String
      • Longest Substring with At Most K Distinct Characters
      • Max Consecutive Ones II
      • Minimum Size Subarray Sum
      • Longest Substring Without Repeating Characters
      • Minimum Window Substring
      • Subarrays with K Different Integers
    • 4 Partition
      • Color Sort
      • Color Sort II
      • Partition Array
    • # Sum
      • Two Sum
      • Two Sum - Unique Pairs
      • Two Sum - Less Or Equal
      • Two Sum - Greater Than Target
      • Two Sum - Closest
      • Two Sum - Difference
      • Two Sum - Data Structure Design
      • Three Sum
      • Three Sum With Multiplicity
      • Three Sum Smaller
      • Three Sum - Triangle Count
      • Four Sum
  • BFS
    • 1 Traverse
      • Number of Islands
      • Clone Gragh
      • Number of Distinct Islands
      • Pacific Atlantic Water Flow
      • Surrounded Regions
      • Walls and Gates
      • Max Area of Island
    • 2 Shortest
      • 01 Matrix
      • Knight Shortest Path
      • Shortest Distance from All Buildings
      • Best Meet Point
      • Shortest Bridge
      • Snakes and Ladders
      • Bus Route
  • DFS
    • 0 Basic
      • Subsets
      • Subsets II
      • Permutations
      • Permutations II
      • Prev/Next Permutation
      • Kth Permutation
      • Permutation Index
      • Combination Sum
      • Combination Sum II
      • Combination Sum III
      • Combination
      • Path Sum
    • 1 Enumeration
      • Cartesian Product
      • Letter Combinations of a Phone Number
      • Split String
      • Palindrome Partitioning
      • Expression Add Operators
      • Target Sum
      • Restore IP Addresses
      • Generate Parentheses
      • Generalized Abbreviation
      • Remove Invalid Parentheses
      • Letter Case Permutation
      • Factor Combinations
      • Find the Missing Number II
    • 2 Search
      • N-Queens
      • Sudoku
      • Employee Importance
      • Increasing Subsequences
      • Nested List Weight Sum
    • 3 Flood Fill
      • Flood Fill
      • Number of Enclaves
    • 4 Path
      • Longest Increasing Path in a Matrix
      • Unique Paths III
    • 5 Memo
      • Knight Dialer
      • Regular Expression Matching
      • Wildcard Matching
    • # Word Big Four
      • Word Break
      • Word Break II
      • Word Pattern
      • Word Pattern II
      • Word Search
      • Word Search II
      • Word Ladder
      • Word Ladder II
  • Tree
    • 0 Binary Search Tree
      • Validate Binary Search Tree
      • Recover Binary Search Tree
      • Minimum Absolute Difference in BST
      • Find Mode in Binary Search Tree
      • Verify Preorder Sequence in Binary Search Tree
      • Unique Binary Search Trees
      • Count of Smaller Numbers After Self
      • Trim a Binary Search Tree
      • Closest Binary Search Tree Value
      • Closest Binary Search Tree Value II
    • 1 Traversal
      • Binary Tree Inorder Traversal
      • Binary Tree Preorder Traversal
      • Binary Tree Postorder Traversal
      • Binary Tree Level Order Traversal
    • 2 Divide and Conquer
      • Balanced Binary Tree
      • Max/Min Depth of Binary Tree
      • Diameter of Tree
      • DiffSum
      • Find Leaves of Binary Tree
    • 3 SubTree
      • Same/Symmetric Tree
      • TreeIsomorphism
      • Subtree of Another Tree
      • Find Duplicate Subtrees
      • Most Frequent Subtree Sum
      • Minimum Subtree
      • Subtree with Maximum Average
      • Equal Tree Partition
      • Flip Binary Tree To Match Preorder Traversal
    • 4 Path
      • Path Sum
      • Path Sum II
      • Path Sum III
      • Path Sum IV
      • Path Sum with Digits Representation
      • Binary Tree Paths
      • Binary Tree Longest Consecutive Sequence
      • Binary Tree Longest Consecutive Sequence II
      • Binary Tree Maximum Path Sum
      • Sum Root to Leaf Numbers
      • Boundary of Binary Tree
      • Smallest String Starting From Leaf
    • 5 Level Order
      • Level Order Traversal
      • Maximum Width of Binary Tree
      • Binary Tree Right Side View
      • Binary Tree Vertical Order Traversal
    • 6 LCA
      • LCA
      • LCA II
      • LCA III
      • LCA IV
      • Smallest Deepest Subtree
      • LCA of a BST
      • Cousins in Binary Tree
    • 7 Build Tree
      • Build Maximum Binary Tree
      • Convert Sorted List to Binary Search Tree
      • Serialize Deserialize
      • Verify Preorder Serialization of a Binary Tree
      • Construct Binary Tree from Traversals
    • 8 Distance
      • Closest Leaf in a Binary Tree
      • All Nodes Distance K in Binary Tree
    • 9 Structure
      • Flatten Binary Tree to Linked List
      • Binary Tree Upside Down
      • BST to Doubly LinkedList
      • Populating Next Right Pointers in Each Node
      • Populating Next Right Pointers in Each Node II
      • Invert Binary Tree
    • # N-ary Tree
  • String
    • 1 Pattern
      • Is Subsequence
      • One Edit Distance
      • Backspace String Compare
    • 2 Implementation
      • Reverse
      • Find the Closest Palindrome
      • Reverse Words in a String
      • Text Justification
    • 3 Substring
      • Implement Str
      • Longest Substring with At Least K Repeating Characters
      • Longest Common Prefix
    • 4 Number
      • Maximum Swap
      • Add Strings
      • Nth Digit
      • Compare Version Numbers
      • String to Integer (atoi)
      • Integer to English Words
      • Integer to Roman
      • Roman to Integer
      • Multiply Strings
      • Reverse Integer
    • 5 Decode
      • Decode String
      • Encode and Decode Strings
    • 6 Palindrome
      • Valid Palindrome
      • Valid Palindrome II
      • Palindrome Number
      • Palindrome Linked List
      • Palindromic Substrings
      • Palindrome Permutation
      • Palindrome Partitioning
      • Find Longest Palindromic Substring
      • Longest Palindromic Subsequence
      • Longest Palindromic Substrings
    • 7 Evaluation
      • Solve the Equation
      • Simplify Path
      • Valid Number
    • 8 Binary String
      • Count Binary Substrings
    • 9 Parenthesis
      • Valid Parenthesis String
      • Valid Parentheses
  • Data Structures
    • 0 Design
      • LRU
      • LFU
    • 1 LinkedList
      • Merge Sort
      • Find Cycle
      • Palindrome Linked List
      • Remove Duplicates
      • Flatten a Multilevel Doubly Linked List
      • Copy List with Random Pointer
    • 2 Stack
      • Min Stack
      • Max Stack
      • Implement Queue by Stacks
      • Implement Stack by Queues
    • 3 Queue/Deque
      • Moving Average
      • Design Circular Queue
      • Design Circular Deque
    • 4 Heap
      • Median for Data Stram
      • Kth Largest Data Stream
      • Top K Words
      • Top K Elements
      • Kth Smallest Number in Sorted Matrix
    • 5 Interval
      • Merge Intervals
      • Insert Interval
      • Non-overlapping Intervals
      • Maximum Length of Pair Chain
      • Meeting Room
      • Merge Two Sorted Interval List
      • Merge K Sorted Interval List
      • Intersection of Two Sorted Intervals
      • Meeting Room II
    • 6 Matrix
      • Multiply Sparse Matrix
      • Matrix Diagonal Traverse
      • Valid Sudoku
      • Spiral Matrix
    • 7 Iterator
      • Flatten 2D Vector
      • Pair Iterator
      • Peeking Iterator
      • Zigzag Iterator
    • 8 Hash
      • Design HashSet
      • Design HashMap
      • Hash Function
      • ReHash
      • Consistent Hash
      • Bloom Filter
      • Robin-Karp Algorithm
  • Advanced Data Structures
    • 1 Trie
      • Implement Trie
      • Stream of Characters
    • 2 Union Find
      • Number of Islands II
      • [Union Find]Graph Connect Tree
      • Minimum Spanning Tree
      • Bricks Falling When Hit
      • Most Stones Removed with Same Row or Column
      • Satisfiability of Equality Equations
    • 3 Monotonous Stack
      • Increasing Triplet Subsequence
      • Largest Rectangle in Histogram
      • Maximal Rectangle
      • Remove K Digits
      • Remove Duplicate Letters
      • Next Greater Element I
      • Next Greater Element II
    • 4 TreeSet/TreeMap
      • My Calendar I
      • My Calendar II
    • 5 Random
      • Shuffle an Array
      • Random Pick with Weight
      • Random Pick Index
    • 6 Binary Index Tree
    • 7 Segment Tree
      • Range Sum Query - Mutable
  • Graph
    • 1 General
      • Graph Deep Copy
    • 2 Topological Sorting
      • Course Schedule
      • Sequence Reconstruction
      • Alien Dictionary
    • 3 Bipartition
      • Is Graph Bipartition
      • Possible Bipartition
    • Detect Cycle in an Undirected Graph
    • Shortest Path in Undirected Graph
    • All Paths From Source to Target
    • Graph Valid Tree
    • Number of Connected Components in an Undirected Graph
    • Minimum Height Trees
  • Dynamic Programming
    • 0 Basic DP
      • Triangle
      • House Robber
      • House Robber II
      • Paint Fence
      • Paint House
    • 1 Sequence DP
      • Decode Ways
    • 2 Match Sequence DP
      • Edit Distance
      • K Edit Distance
      • Longest Common Subsequence
      • Minimum Swaps To Make Sequences Increasing
      • Scramble String
    • 3 Interval DP
      • Burst Ballons
      • Stone Game
    • 4 Matrix DP
      • Number Of Corner Rectangles
      • Max Square
      • Longest Increasing Path in a Matrix
    • 5 Backpack
      • K-Sum
      • Backpack1-01
      • Backpack2-01
      • Backpack4-Complete
      • Backpack3-Complete
      • Backpack7-Multiply
    • 6 Game DP
      • Predict the Winner
      • Can I Win
      • Coins In Line I
      • Coins In Line II
      • Coins In Line III
  • Common Methods
    • 1 Presum
      • Subarray Sum Equals K
      • Continuous Subarray Sum
      • Path Sum II
      • Min/Max Subarry
      • Contiguous 01-Array
      • Flip 01String to Monotone Increasing
      • Maximum Sum of Two Non-Overlapping Subarrays
    • 2 Bucket
      • Maximum Gap
    • 3 Simulation
      • Pour Water
    • 4 Buffer
      • Read N Characters Given Read4 II - Call multiple times
      • Read N Characters Given Read4
      • Third Maximum Number
    • 5 Merge/Union
      • Merge k Sorted Lists
      • Merge k Sorted Arrays
      • Merge k Sorted Intervals
      • Intersection of Three Sorted Array
      • Intersection of Two Arrays
      • Intersection of Two Arrays II
    • 6 Geometry
      • Max Points on a Line
    • 7 Math
      • GCD
      • Matrix Coordinate
      • Sqrt(x)
      • Divide Two Integers
      • pow(x, n)
    • 8 Sorting
      • Merge Sort
      • Quick Sort
      • Quick Select
  • Design/OOD
    • Design Rate Limiter
    • Design Hit Counter
    • Design Twitter
    • Design MapWithExpiration
    • Design Tiny URL
  • Appendix
    • Java Built-in
      • Comparator
      • Stream
      • String Pool
    • Multithreading
      • Synchronized
      • Producer-Consumer
      • CountDownLatch
      • Semaphore
      • Thread Pool
      • DeadLock
      • Inter-thread Communication
Powered by GitBook
On this page
  • HASH
  • 哈希函数(Hash Function)
  • 冲突(Collision)
  • 重哈希(Rehashing)
  • HashMap与HashTable的主要区别
  • 问题
  1. Data Structures

8 Hash

HASH

Get

Find

Add/Remove

Space

O(N/Bucket_size)

O(N/Bucket_size)

O(N/Bucket_size)

O(N)

可以看出HashMap整体上性能都非常不错,但是不稳定,为O(N/Buckets),N就是以数组中没有发生碰撞的元素,Buckets是因碰撞产生的链表。(发生碰撞实际上是非常稀少的,所以N/Bucket_size约等于1)

哈希函数(Hash Function)

Keys -> Hashing Function(sha1/sha2/md5) -> Buckets

In data structure Hash, hash function is used to convert a string(or any other type) into an integer smaller than hash size and bigger or equal to zero. The objective of designing a hash function is to "hash" the key as unreasonable as possible. A good hash function can avoid collision as less as possible. A widely used hash function algorithm is using a magic number 33, consider any string as a 33 based big integer like follow:

hashcode("abcd") = (ascii(a) * 33^3+ ascii(b) * 33^2+ ascii(c) * 33^1 + ascii(d) * 33^0) % HASH_SIZE

                          = (97* 33^3+ 98 * 33^2 + 99 * 33 +100) % HASH_SIZE

                          = 3595978 % HASH_SIZE

here HASH_SIZE is the capacity of the hash table (you can assume a hash table is like an array with index 0 ~ HASH_SIZE-1).

冲突(Collision)

是说两个不同的 key 经过哈希函数的计算后,得到了两个相同的值。解决冲突的方法,主要有两种:

  1. 开散列法(Open Hashing)。是指哈希表所基于的数组中,每个位置是一个 Linked List 的头结点。这样冲突的 <key, value> 二元组,就都放在同一个链表中。

  2. 闭散列法(Closed Hashing)。是指在发生冲突的时候,后来的元素,往下一个位置去找空位。

重哈希(Rehashing)

The size of the hash table is not determinate at the very beginning. If the total size of keys is too large (e.g. size >= capacity / 10), we should double the size of the hash table and rehash every keys.

HashMap与HashTable的主要区别

在很多的Java基础书上都已经说过了,他们的主要区别其实就是Table全局加了线程同步保护

  • HashTable线程更加安全,代价就是因为它粗暴的添加了同步锁,所以会有性能损失。

  • 其实有更好的concurrentHashMap可以替代HashTable,一个是方法级,一个是Class级。

问题

1. 什么时候会使用HashMap?他有什么特点? 是基于Map接口的实现,存储键值对时,它可以接收null的键值,是非同步的,HashMap存储着Entry(hash, key, value, next)对象。

2. 你知道HashMap的工作原理吗? 通过hash的方法,通过put和get存储和获取对象。存储对象时,我们将K/V传给put方法时,它调用hashCode计算hash从而得到bucket位置,进一步存储,HashMap会根据当前bucket的占用情况自动调整容量(超过Load Facotr则resize为原来的2倍)。获取对象时,我们将K传给get,它调用hashCode计算hash从而得到bucket位置,并进一步调用equals()方法确定键值对。如果发生碰撞的时候,Hashmap通过链表将产生碰撞冲突的元素组织起来,在Java 8中,如果一个bucket中碰撞冲突的元素超过某个限制(默认是8),则使用红黑树来替换链表,从而提高速度。

3. 你知道get和put的原理吗?equals()和hashCode()的都有什么作用? 通过对key的hashCode()进行hashing,并计算下标( n-1 & hash),从而获得buckets的位置。如果产生碰撞,则利用key.equals()方法去链表或树中去查找对应的节点

4. 你知道hash的实现吗?为什么要这样实现? 在Java 1.8的实现中,是通过hashCode()的高16位异或低16位实现的:(h = k.hashCode()) ^ (h >>> 16),主要是从速度、功效、质量来考虑的,这么做可以在bucket的n比较小的时候,也能保证考虑到高低bit都参与到hash的计算中,同时不会有太大的开销。

5. 如果HashMap的大小超过了负载因子(load factor)定义的容量,怎么办? 如果超过了负载因子(默认0.75),则会重新resize一个原来长度两倍的HashMap,并且重新调用hash方法。

以Entry[]数组实现的哈希桶数组,用Key的哈希值取模桶数组的大小可得到数组下标。

插入元素时,如果两条Key落在同一个桶(比如哈希值1和17取模16后都属于第一个哈希桶),Entry用一个next属性实现多个Entry以单向链表存放,后入桶的Entry将next指向桶当前的Entry。

查找哈希值为17的key时,先定位到第一个哈希桶,然后以链表遍历桶里所有元素,逐个比较其key值。

当Entry数量达到桶数量的75%时(很多文章说使用的桶数量达到了75%,但看代码不是),会成倍扩容桶数组,并重新分配所有原来的Entry,所以这里也最好有个预估值。

取模用位运算(hash & (arrayLength-1))会比较快,所以数组的大小永远是2的N次方, 你随便给一个初始值比如17会转为32。默认第一次放入元素时的初始值是16。

iterator()时顺着哈希桶数组来遍历,看起来是个乱序。

在JDK8里,新增默认为8的閥值,当一个桶里的Entry超过閥值,就不以单向链表而以红黑树来存放以加快Key的查找速度。

PreviousZigzag IteratorNextDesign HashSet

Last updated 6 years ago

中是这样描述的:

关于Java集合的小抄