A hash table (also called a hash map) is a data structure that uses a hash function to map keys to indices in an array, allowing fast insertion, lookup, and deletion. The core of a hash table lies in two things: the hash function and the strategy for resolving collisions.
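As a minimal illustration of the key-to-index idea, here is a tiny standalone sketch (the `indexFor` helper is hypothetical, not part of the implementations below); it masks off the sign bit so negative hash codes cannot produce a negative index:

```java
public class HashIndexDemo {
    // Hypothetical helper: reduce a key's hashCode to a bucket index.
    // The 0x7fffffff mask clears the sign bit, keeping the result non-negative.
    static int indexFor(Object key, int capacity) {
        return (key.hashCode() & 0x7fffffff) % capacity;
    }

    public static void main(String[] args) {
        int capacity = 8;
        // Integer.hashCode() is the int value itself, so these are easy to check by hand.
        System.out.println(indexFor(1, capacity)); // prints: 1
        System.out.println(indexFor(9, capacity)); // prints: 1 -- same bucket as key 1: a collision
        System.out.println(indexFor("one", capacity));
    }
}
```

Keys 1 and 9 land in the same bucket, which is exactly the collision case the strategies below must handle.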
In Java, a hash table implementation typically involves a few key parts: an array of buckets, a hash function that maps each key to a bucket index, a collision-resolution strategy, and a resizing policy driven by the load factor.
Below is a simple hash table implementation that resolves collisions with linked lists (separate chaining):
```java
import java.util.LinkedList;

public class SimpleHashTable<K, V> {
    private static final float DEFAULT_LOAD_FACTOR = 0.75f;

    private LinkedList<Entry<K, V>>[] buckets;
    private int size;
    private float loadFactor;
    private int capacity;

    public SimpleHashTable(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

    @SuppressWarnings("unchecked")
    public SimpleHashTable(int initialCapacity, float loadFactor) {
        // capacity 0 would make the modulo below divide by zero, so require > 0
        if (initialCapacity <= 0) {
            throw new IllegalArgumentException("Illegal capacity: " + initialCapacity);
        }
        if (loadFactor <= 0 || Float.isNaN(loadFactor)) {
            throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
        }
        this.loadFactor = loadFactor;
        this.capacity = initialCapacity;
        this.buckets = new LinkedList[capacity];
    }

    public V put(K key, V value) {
        // mask off the sign bit so a negative hashCode cannot yield a negative index
        int index = (key.hashCode() & 0x7fffffff) % capacity;
        LinkedList<Entry<K, V>> bucket = buckets[index];
        if (bucket == null) {
            bucket = new LinkedList<>();
            buckets[index] = bucket;
        }
        for (Entry<K, V> entry : bucket) {
            if (entry.getKey().equals(key)) {
                V oldValue = entry.getValue();
                entry.setValue(value);
                return oldValue;
            }
        }
        bucket.add(new Entry<>(key, value));
        size++;
        if ((float) size / capacity >= loadFactor) {
            resize();
        }
        return null;
    }

    public V get(K key) {
        int index = (key.hashCode() & 0x7fffffff) % capacity;
        LinkedList<Entry<K, V>> bucket = buckets[index];
        if (bucket != null) {
            for (Entry<K, V> entry : bucket) {
                if (entry.getKey().equals(key)) {
                    return entry.getValue();
                }
            }
        }
        return null;
    }

    @SuppressWarnings("unchecked")
    private void resize() {
        LinkedList<Entry<K, V>>[] oldBuckets = buckets;
        capacity *= 2;
        buckets = new LinkedList[capacity];
        size = 0; // the put() calls below re-count every entry
        for (LinkedList<Entry<K, V>> bucket : oldBuckets) {
            if (bucket != null) {
                for (Entry<K, V> entry : bucket) {
                    put(entry.getKey(), entry.getValue());
                }
            }
        }
    }

    private static class Entry<K, V> {
        private K key;
        private V value;

        public Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }

        public K getKey() { return key; }
        public V getValue() { return value; }
        public void setValue(V value) { this.value = value; }
    }
}
```
```java
public class Main {
    public static void main(String[] args) {
        SimpleHashTable<String, Integer> table = new SimpleHashTable<>(16);
        table.put("one", 1);
        table.put("two", 2);
        table.put("three", 3);
        System.out.println(table.get("one"));   // prints: 1
        System.out.println(table.get("two"));   // prints: 2
        System.out.println(table.get("three")); // prints: 3
    }
}
```
Java's built-in `HashMap` class provides a far more sophisticated and efficient hash table implementation, including a better hash function and collision-resolution strategy. The example above shows how to implement a basic hash table in Java; in a real project you would adjust and optimize it for your needs, for example by replacing the linked lists with a more efficient structure or by using a stronger hash function.
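For comparison, the same usage pattern with the standard-library `java.util.HashMap` (only its documented API is used here; note that `put` returns the previous value for a key, just like the implementation above):

```java
import java.util.HashMap;
import java.util.Map;

public class HashMapDemo {
    public static void main(String[] args) {
        Map<String, Integer> table = new HashMap<>(16);
        table.put("one", 1);
        table.put("two", 2);
        Integer old = table.put("one", 11);           // replaces the old mapping
        System.out.println(old);                      // prints: 1
        System.out.println(table.get("one"));         // prints: 11
        System.out.println(table.containsKey("two")); // prints: true
    }
}
```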
To further optimize the hash table above, we can make a few improvements: switch from separate chaining to open addressing with linear probing, use a more efficient hash computation, and add utility methods such as `remove`, `containsKey`, and `size`. Below is a hash table implementation that uses open addressing with linear probing, together with a more efficient hash function:
```java
public class OptimizedHashTable<K, V> {
    private static final float DEFAULT_LOAD_FACTOR = 0.75f;

    private Entry<K, V>[] table;
    private int size;
    private float loadFactor;
    private int capacity;

    public OptimizedHashTable(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

    @SuppressWarnings("unchecked")
    public OptimizedHashTable(int initialCapacity, float loadFactor) {
        if (initialCapacity <= 0) {
            throw new IllegalArgumentException("Illegal capacity: " + initialCapacity);
        }
        if (loadFactor <= 0 || Float.isNaN(loadFactor)) {
            throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
        }
        this.loadFactor = loadFactor;
        // hash & (capacity - 1) is only equivalent to a modulo when capacity
        // is a power of two, so round the requested capacity up if necessary.
        int cap = 1;
        while (cap < initialCapacity) {
            cap <<= 1;
        }
        this.capacity = cap;
        this.table = new Entry[capacity];
    }

    public V put(K key, V value) {
        int hash = key.hashCode();
        int index = hash & (capacity - 1); // bitwise AND instead of %
        int originalIndex = index;
        while (table[index] != null && !table[index].getKey().equals(key)) {
            index = (index + 1) & (capacity - 1); // linear probing
            if (index == originalIndex) {
                throw new IllegalStateException("Table is full.");
            }
        }
        if (table[index] == null) {
            table[index] = new Entry<>(key, value);
            size++;
        } else {
            V oldValue = table[index].getValue();
            table[index].setValue(value);
            return oldValue;
        }
        if ((float) size / capacity >= loadFactor) {
            resize();
        }
        return null;
    }

    public V get(K key) {
        int hash = key.hashCode();
        int index = hash & (capacity - 1);
        int originalIndex = index;
        while (table[index] != null) {
            if (table[index].getKey().equals(key)) {
                return table[index].getValue();
            }
            index = (index + 1) & (capacity - 1);
            if (index == originalIndex) {
                break; // guard against an infinite loop
            }
        }
        return null;
    }

    @SuppressWarnings("unchecked")
    private void resize() {
        Entry<K, V>[] oldTable = table;
        capacity *= 2;
        table = new Entry[capacity];
        size = 0;
        for (Entry<K, V> entry : oldTable) {
            if (entry != null) {
                put(entry.getKey(), entry.getValue());
            }
        }
    }

    private static class Entry<K, V> {
        private K key;
        private V value;

        public Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }

        public K getKey() { return key; }
        public V getValue() { return value; }
        public void setValue(V value) { this.value = value; }
    }
}
```
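The bitwise trick in `put()`/`get()` deserves a caveat worth checking directly: `hash & (capacity - 1)` matches `hash % capacity` only when the capacity is a power of two. A tiny standalone check:

```java
public class MaskVsModDemo {
    public static void main(String[] args) {
        int pow2 = 16, notPow2 = 12;
        int hash = 41;
        System.out.println(hash % pow2);          // prints: 9
        System.out.println(hash & (pow2 - 1));    // prints: 9  -- same result
        System.out.println(hash % notPow2);       // prints: 5
        System.out.println(hash & (notPow2 - 1)); // prints: 9  -- differs: the mask is wrong here
    }
}
```

This is why the constructor rounds the requested capacity up to a power of two before using the mask.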
In this version, we use linear probing to resolve collisions and replace the modulo with a bitwise AND (which requires the capacity to be a power of two). Note that open addressing can degrade badly as the table approaches full, so the load factor and rehashing strategy must be chosen carefully.
Also, the code above is not thread-safe; thread safety usually adds complexity and overhead. In a multithreaded environment, consider `ConcurrentHashMap` or another thread-safe hash table implementation.
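For instance, `ConcurrentHashMap` lets multiple threads update a shared table without explicit locking on the caller's side; operations such as `merge` and `putIfAbsent` are atomic:

```java
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        // Atomic read-modify-write without external synchronization:
        counts.merge("hits", 1, Integer::sum);
        counts.merge("hits", 1, Integer::sum);
        counts.putIfAbsent("hits", 100);        // no effect: the key is already present
        System.out.println(counts.get("hits")); // prints: 2
    }
}
```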
To round out the code, we can add some practical methods: `remove` to delete a key-value pair, `containsKey` to check whether a key exists, and `size` to return the current number of entries. To provide thread safety, we can also add synchronization. Here is the completed code:
```java
public class OptimizedHashTable<K, V> {
    private static final float DEFAULT_LOAD_FACTOR = 0.75f;

    private Entry<K, V>[] table;
    private int size;
    private float loadFactor;
    private int capacity;

    public OptimizedHashTable(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

    @SuppressWarnings("unchecked")
    public OptimizedHashTable(int initialCapacity, float loadFactor) {
        if (initialCapacity <= 0) {
            throw new IllegalArgumentException("Illegal capacity: " + initialCapacity);
        }
        if (loadFactor <= 0 || Float.isNaN(loadFactor)) {
            throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
        }
        this.loadFactor = loadFactor;
        // Round up to a power of two so that hash & (capacity - 1) is valid.
        int cap = 1;
        while (cap < initialCapacity) {
            cap <<= 1;
        }
        this.capacity = cap;
        this.table = new Entry[capacity];
    }

    public synchronized V put(K key, V value) {
        int hash = key.hashCode();
        int index = hash & (capacity - 1);
        int originalIndex = index;
        while (table[index] != null && !table[index].getKey().equals(key)) {
            index = (index + 1) & (capacity - 1);
            if (index == originalIndex) {
                throw new IllegalStateException("Table is full.");
            }
        }
        if (table[index] == null) {
            table[index] = new Entry<>(key, value);
            size++;
        } else {
            V oldValue = table[index].getValue();
            table[index].setValue(value);
            return oldValue;
        }
        if ((float) size / capacity >= loadFactor) {
            resize();
        }
        return null;
    }

    public synchronized V get(K key) {
        int hash = key.hashCode();
        int index = hash & (capacity - 1);
        int originalIndex = index;
        while (table[index] != null) {
            if (table[index].getKey().equals(key)) {
                return table[index].getValue();
            }
            index = (index + 1) & (capacity - 1);
            if (index == originalIndex) {
                break;
            }
        }
        return null;
    }

    public synchronized V remove(K key) {
        int hash = key.hashCode();
        int index = hash & (capacity - 1);
        int originalIndex = index;
        while (table[index] != null) {
            if (table[index].getKey().equals(key)) {
                V removedValue = table[index].getValue();
                table[index] = null;
                size--;
                // Re-insert the rest of the cluster so probe chains stay intact.
                rehashAfterRemove(index);
                return removedValue;
            }
            index = (index + 1) & (capacity - 1);
            if (index == originalIndex) {
                break;
            }
        }
        return null;
    }

    private synchronized void rehashAfterRemove(int removedIndex) {
        for (int i = (removedIndex + 1) & (capacity - 1);
             table[i] != null;
             i = (i + 1) & (capacity - 1)) {
            Entry<K, V> entry = table[i];
            table[i] = null;
            size--;
            put(entry.getKey(), entry.getValue());
        }
    }

    public synchronized boolean containsKey(K key) {
        int hash = key.hashCode();
        int index = hash & (capacity - 1);
        int originalIndex = index;
        while (table[index] != null) {
            if (table[index].getKey().equals(key)) {
                return true;
            }
            index = (index + 1) & (capacity - 1);
            if (index == originalIndex) {
                break;
            }
        }
        return false;
    }

    public synchronized int size() {
        return size;
    }

    @SuppressWarnings("unchecked")
    private synchronized void resize() {
        Entry<K, V>[] oldTable = table;
        capacity *= 2;
        table = new Entry[capacity];
        size = 0;
        for (Entry<K, V> entry : oldTable) {
            if (entry != null) {
                put(entry.getKey(), entry.getValue());
            }
        }
    }

    private static class Entry<K, V> {
        private K key;
        private V value;

        public Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }

        public K getKey() { return key; }
        public V getValue() { return value; }
        public void setValue(V value) { this.value = value; }
    }
}
```
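The synchronized-methods approach above is essentially coarse-grained locking: one intrinsic lock guards every operation. The JDK exposes the same pattern via `Collections.synchronizedMap`, which makes for a self-contained sketch of safe concurrent use (thread count and key names here are illustrative):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SynchronizedMapDemo {
    public static void main(String[] args) throws InterruptedException {
        // A single lock serializes all access, like the synchronized methods above.
        Map<String, Integer> table = Collections.synchronizedMap(new HashMap<>());

        Runnable writer = () -> {
            for (int i = 0; i < 1000; i++) {
                table.merge("count", 1, Integer::sum); // atomic under the wrapper's lock
            }
        };
        Thread t1 = new Thread(writer);
        Thread t2 = new Thread(writer);
        t1.start(); t2.start();
        t1.join(); t2.join();

        System.out.println(table.get("count")); // prints: 2000
    }
}
```

With a plain `HashMap` here, the two writers would race and the final count would be unpredictable (or the map corrupted); the wrapper makes the outcome deterministic at the cost of serializing every call.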
In this version we have added the `remove`, `containsKey`, and `size` methods, and every method that touches the table's state is declared `synchronized` to guarantee thread safety. Note, however, that full synchronization can hurt performance, so in practice you may need to weigh thread safety against throughput for your specific scenario.
In addition, the `rehashAfterRemove` method re-inserts the entries that follow a deleted slot. This keeps probe chains intact (a lookup that would have probed past the removed entry still succeeds) and avoids the local clustering that deletions would otherwise leave behind, helping the table maintain good performance.
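To see why this re-insertion matters, here is a tiny standalone model of a probe sequence (a plain `Integer` array of capacity 8, separate from the class above): if a slot in the middle of a cluster is simply set to null, a later lookup that must probe past it stops too early.

```java
public class DeletionDemo {
    static final int CAP = 8;

    // Linear-probe lookup: scan from key % CAP until the key or an empty slot is found.
    static boolean contains(Integer[] table, int key) {
        for (int i = key % CAP, n = 0; n < CAP; i = (i + 1) % CAP, n++) {
            if (table[i] == null) return false; // an empty slot ends the probe chain
            if (table[i] == key) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Integer[] table = new Integer[CAP];
        // Keys 1, 9, 17 all hash to index 1 and form one probe cluster: slots 1, 2, 3.
        table[1] = 1; table[2] = 9; table[3] = 17;
        System.out.println(contains(table, 17)); // prints: true

        table[2] = null; // naive delete of key 9 punches a hole in the cluster
        System.out.println(contains(table, 17)); // prints: false -- 17 is now unreachable
    }
}
```

`rehashAfterRemove` repairs exactly this situation by re-inserting 17 (and anything after it in the cluster) once 9 is removed.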
Let's walk through how the optimized hash table above operates, step by step, using tables. We use a hash table with an initial capacity of 8 and a load factor of 0.75, and perform a series of insertions, a lookup, a deletion, and finally a resize. For these small non-negative keys, `hash & (capacity - 1)` gives the same result as `hash % 8`.

Initial state, all slots empty:

| Index | Key | Value |
|---|---|---|
| 0 | | |
| 1 | | |
| 2 | | |
| 3 | | |
| 4 | | |
| 5 | | |
| 6 | | |
| 7 | | |

Insert Key=1, Value=A: since 1 % 8 = 1 and index 1 is empty, it is inserted directly.

| Index | Key | Value |
|---|---|---|
| 0 | | |
| 1 | 1 | A |
| 2 | | |
| 3 | | |
| 4 | | |
| 5 | | |
| 6 | | |
| 7 | | |

Insert Key=9, Value=B: since 9 % 8 = 1 but index 1 is already occupied, linear probing moves on; index 2 is empty, so it is inserted at index 2.

| Index | Key | Value |
|---|---|---|
| 0 | | |
| 1 | 1 | A |
| 2 | 9 | B |
| 3 | | |
| 4 | | |
| 5 | | |
| 6 | | |
| 7 | | |

Insert Key=17, Value=C: since 17 % 8 = 1 and indices 1 and 2 are both occupied, probing continues until the empty index 3 is found; it is inserted at index 3.

| Index | Key | Value |
|---|---|---|
| 0 | | |
| 1 | 1 | A |
| 2 | 9 | B |
| 3 | 17 | C |
| 4 | | |
| 5 | | |
| 6 | | |
| 7 | | |

Insert Key=25, Value=D: since 25 % 8 = 1 and indices 1 through 3 are all occupied, probing continues until the empty index 4 is found; it is inserted at index 4.

| Index | Key | Value |
|---|---|---|
| 0 | | |
| 1 | 1 | A |
| 2 | 9 | B |
| 3 | 17 | C |
| 4 | 25 | D |
| 5 | | |
| 6 | | |
| 7 | | |

Look up Key=17: starting from index 1, linear probing proceeds until Key=17 is found at index 3. The table is unchanged:

| Index | Key | Value |
|---|---|---|
| 0 | | |
| 1 | 1 | A |
| 2 | 9 | B |
| 3 | 17 | C |
| 4 | 25 | D |
| 5 | | |
| 6 | | |
| 7 | | |

Remove Key=9: the entry at index 2 is cleared, and `rehashAfterRemove` re-inserts the entries that follow it in the cluster: Key=17 moves from index 3 to index 2, and Key=25 moves from index 4 to index 3, so probes for them still succeed.

| Index | Key | Value |
|---|---|---|
| 0 | | |
| 1 | 1 | A |
| 2 | 17 | C |
| 3 | 25 | D |
| 4 | | |
| 5 | | |
| 6 | | |
| 7 | | |

Insert Key=33, Value=E: since 33 % 8 = 1 and indices 1 through 3 are occupied, it is inserted at index 4.

| Index | Key | Value |
|---|---|---|
| 0 | | |
| 1 | 1 | A |
| 2 | 17 | C |
| 3 | 25 | D |
| 4 | 33 | E |
| 5 | | |
| 6 | | |
| 7 | | |

Insert Key=41, Value=F: since 41 % 8 = 1 and indices 1 through 4 are occupied, it is inserted at index 5. At this point size=5, so the load is 5/8 = 0.625. The next insertion would raise the load to 6/8 = 0.75, meeting the resize condition (`size / capacity >= loadFactor`): the capacity doubles to 16, and every entry is re-inserted using the new mask `hash & 15`. After that rehash, the five entries above end up as follows (17 & 15 = 1 collides with Key=1 and probes to index 2; 33 & 15 = 1 probes to index 3; 25 & 15 = 9; 41 & 15 = 9 probes to index 10):

| Index | Key | Value |
|---|---|---|
| 0 | | |
| 1 | 1 | A |
| 2 | 17 | C |
| 3 | 33 | E |
| … | … | … |
| 9 | 25 | D |
| 10 | 41 | F |
| … | … | … |
| 15 | | |
Through the steps above, we can see clearly how a hash table uses its hash function and linear probing to manage insertion, lookup, and deletion, and how it automatically resizes when the load threshold is reached in order to maintain good performance.
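The index arithmetic in the walkthrough can be checked mechanically (same keys as the tables above; these are the home slots before probing):

```java
public class WalkthroughCheck {
    public static void main(String[] args) {
        int[] keys = {1, 9, 17, 25, 33, 41};
        for (int k : keys) {
            // home slot before the resize (capacity 8) and after it (capacity 16)
            System.out.println(k + " -> " + (k & 7) + " (cap 8), " + (k & 15) + " (cap 16)");
        }
    }
}
```

All six keys share home slot 1 at capacity 8, which is what produced the long probe cluster; at capacity 16 they split between slots 1 and 9, illustrating how resizing breaks clusters apart.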