赞
踩
heapq模块是python库中实现的小顶堆相关的函数的集合。对外暴露的接口函数有’heappush’, ‘heappop’, ‘heapify’, ‘heapreplace’, ‘merge’, ‘nlargest’, ‘nsmallest’, ‘heappushpop’。我挑选前两个进行详细的讲解。
在Python中,heapq模块提供了对小顶堆(Min Heap)的支持。小顶堆是一种数据结构,其中每个节点的值都小于或等于其子节点的值。这意味着堆的根节点始终是最小的元素。
堆的定义如下,n个关键字序列L[0…n-1】称为堆,当且仅当该序列满足∶
①L(i)>=L(2i+1)且L(i)>=L(2i+2)或
②L(i)<=L(2i+1)且L(i)<=L(2i+2)(0≤i≤n//2)
可以将该一维数组视为一棵完全二叉树,满足条件①的堆称为大根堆(大顶堆)。大根堆的最大元素存放在根结点。满足条件②的堆称为小根堆(小顶堆), 小根堆的定义刚好相反, 根结点是最小元素。
让我们通过一个简单的例子来理解小顶堆的特性:
import heapq # 创建一个空的小顶堆 heap = [] # 往堆中添加元素 heapq.heappush(heap, 4) heapq.heappush(heap, 2) heapq.heappush(heap, 8) heapq.heappush(heap, 1) # 弹出堆中最小的元素 min_element = heapq.heappop(heap) print("堆中最小的元素是:", min_element) print("剩余堆的元素:", heap)
堆中最小的元素是: 1
剩余堆的元素: [2, 4, 8]
上述例子中,我们使用heappush向堆中添加元素,然后使用heappop弹出最小的元素。小顶堆的特性确保每次弹出的元素都是堆中最小的。
heapq模块主要用于处理堆数据结构,提供了一些函数接口来有效地进行堆操作。其中,heappush和heappop是两个常用的函数,让我们深入了解它们的作用。
heapq.heappush(heap, item)函数用于将元素item添加到堆heap中,并保持堆的特性。即在添加元素后,堆仍然是一个有效的小顶堆。
import heapq
heap = [1, 3, 5, 7, 9]
print("原始堆:", heap)
# 使用heappush添加元素到堆中
heapq.heappush(heap, 4)
print("添加元素后的堆:", heap)
原始堆: [1, 3, 5, 7, 9]
添加元素后的堆: [1, 3, 4, 7, 9, 5]
heapq.heappop(heap)函数用于弹出并返回堆heap中的最小元素。在弹出元素后,堆的特性仍然被维护。
import heapq
heap = [1, 3, 5, 7, 9]
print("原始堆:", heap)
# 使用heappop弹出最小元素
min_element = heapq.heappop(heap)
print("弹出的最小元素是:", min_element)
print("剩余堆的元素:", heap)
原始堆: [1, 3, 5, 7, 9]
弹出的最小元素是: 1
剩余堆的元素: [3, 7, 5, 9]
def heappush(heap, item): """Push item onto heap, maintaining the heap invariant.""" heap.append(item) _siftdown(heap, 0, len(heap)-1) # 'heap' is a heap at all indices >= startpos, except possibly for pos. pos # is the index of a leaf with a possibly out-of-order value. Restore the # heap invariant. def _siftdown(heap, startpos, pos): newitem = heap[pos] # Follow the path to the root, moving parents down until finding a place # newitem fits. while pos > startpos: parentpos = (pos - 1) >> 1 # 计算父节点的位置 parent = heap[parentpos] if newitem < parent: heap[pos] = parent # 如果新元素小于父节点,交换它们的位置 pos = parentpos continue break heap[pos] = newitem # 将新元素放置在正确的位置
当调用heapq.heappush(heap, item)时,heappush函数会将元素item添加到堆heap中,并确保堆的特性得以维持。在heappush函数内部,使用_siftdown函数来执行以下功能:将新添加的元素item放置在正确的位置,以保持小顶堆的性质。具体而言,它通过比较元素的值和其父节点的值,并在需要时进行交换,直到满足小顶堆的条件为止。因为heappush函数将新元素是放在堆底,然后让堆底元素上浮到合适的位置,所有称为新加元素的上浮操作,_siftdown的down有点让人混淆视听。用面向对象的编程方法改写heapq.heappush函数如下节。
class HeapLittle: def __init__(self): self.size = 0 self.array = [] @staticmethod def get_parent_index(child_index): return (child_index - 1) // 2 def heap_push(self, item): self.array.append(item) self.size += 1 self.heap_item_up(0, self.size - 1) def heap_item_up(self, start_position, end_position): new_item = self.array[end_position] iteration_index = end_position iteration_parent_index = self.get_parent_index(iteration_index) while iteration_parent_index >= start_position and new_item < self.array[iteration_parent_index]: self.array[iteration_index] = self.array[iteration_parent_index] iteration_index = iteration_parent_index iteration_parent_index = self.get_parent_index(iteration_index) self.array[iteration_index] = new_item
上面的heap_item_up函数改进了原版_siftdown函数的while循环的continue,break的逻辑,并增加函数封装更加表意。
def heappop(heap): """Pop the smallest item off the heap, maintaining the heap invariant.""" lastelt = heap.pop() # raises appropriate IndexError if heap is empty if heap: returnitem = heap[0] heap[0] = lastelt _siftup(heap, 0) return returnitem return lastelt def _siftup(heap, pos): endpos = len(heap) startpos = pos newitem = heap[pos] # Bubble up the smaller child until hitting a leaf. childpos = 2*pos + 1 # leftmost child position while childpos < endpos: # Set childpos to index of smaller child. rightpos = childpos + 1 if rightpos < endpos and not heap[childpos] < heap[rightpos]: childpos = rightpos # 如果右子节点存在且比左子节点小,则选择右子节点 # Move the smaller child up. heap[pos] = heap[childpos] # 如果pos位置的节点有子节点,选最小子节点的位置继续往下沉,直到沉到叶子节点。 pos = childpos childpos = 2*pos + 1 # The leaf at pos is empty now. Put newitem there, and bubble it up # to its final resting place (by sifting its parents down). heap[pos] = newitem _siftdown(heap, startpos, pos) #将沉到叶子节点(pos位置)往上浮到合适位置。
当调用heapq.heappop(heap)时,heappop函数会弹出并返回堆heap中的最小元素也就是堆顶元素,将堆的最后一个元素移动到堆顶,堆的长度减1。此时是新加入到堆顶的元素破坏了堆的性质,就需要把这个元素安置到合理位置。使用_siftup函数来执行以下功能:如果新加入到堆顶的元素有子节点,选它的最小子节点的位置往下沉,直到沉到叶子节点,此时叶子节点的位置假设为pos。那么就要执行_siftdown(heap, startpos, pos)函数将沉到的叶子节点(pos位置)往上浮到合适位置。以确保堆的特性得以维持。
分析上述_siftup函数,总是感觉怪怪的,是否有点脱裤子放屁–多次一举的操作。先将堆顶元素下沉到叶子节点,然后将叶子节点上浮到合适位置。这是否将操作量变大了?或许有我无法理解的深刻含义,读者如果有知道的可以告知一二。
def heap_pop(self): """Pop the smallest item off the heap, maintaining the heap invariant.""" last_item = self.array.pop() # raises appropriate IndexError if heap is empty self.size -= 1 if self.size != 0: return_item = self.array[0] self.array[0] = last_item # print(self) # print(f"first not adjusted, item is {last_item}") self.heap_item_down(0) return return_item return last_item def heap_item_down(self, start_position): new_item = self.array[start_position] iteration_index = start_position while self.has_left_child(iteration_index): smaller_child_index = self.get_left_child_index(iteration_index) right_child_index = self.get_right_child_index(iteration_index) if self.has_right_child(iteration_index) and self.array[right_child_index] < self.array[smaller_child_index]: smaller_child_index = right_child_index if new_item > self.array[smaller_child_index]: self.array[iteration_index] = self.array[smaller_child_index] iteration_index = smaller_child_index else: break # print(self) # print(f"iteration_index is {iteration_index}, item value is {new_item}") self.array[iteration_index] = new_item # print(self) # print("heap_item_down") def has_left_child(self, index): return self.get_left_child_index(index) < self.size def get_left_child(self, index): return self.array[self.get_left_child_index(index)] def get_right_child(self, index): return self.array[self.get_right_child_index(index)]
新加入到堆顶的元素破坏了堆的性质,就需要把这个元素安置到合理位置。使用heap_item_down函数来执行以下功能:如果新加入到堆顶的元素有子节点,并且它的子节点有比它值小的,那就选它的最小子节点的位置往下沉。直到已经沉到叶子节点了,或它的节点值都比它子节点值都小。
import numpy as np import heapq class HeapLittle: def __init__(self): self.size = 0 self.array = [] def print_heap(self): for i in self.array: print(i) print("heap is above") def __str__(self): if self.size == 0: return "Heap is empty" else: return str(self.array) @staticmethod def get_left_child_index(parent_index): return 2 * parent_index + 1 @staticmethod def get_right_child_index(parent_index): return 2 * parent_index + 2 @staticmethod def get_parent_index(child_index): return (child_index - 1) // 2 def has_left_child(self, index): return self.get_left_child_index(index) < self.size def has_right_child(self, index): return self.get_right_child_index(index) < self.size def has_parent(self, index): return self.get_parent_index(index) >= 0 def get_left_child(self, index): return self.array[self.get_left_child_index(index)] def get_right_child(self, index): return self.array[self.get_right_child_index(index)] def get_parent(self, index): return self.array[self.get_parent_index(index)] def heap_item_up(self, start_position, end_position): new_item = self.array[end_position] iteration_index = end_position iteration_parent_index = self.get_parent_index(iteration_index) while iteration_parent_index >= start_position and new_item < self.array[iteration_parent_index]: self.array[iteration_index] = self.array[iteration_parent_index] iteration_index = iteration_parent_index iteration_parent_index = self.get_parent_index(iteration_index) self.array[iteration_index] = new_item def heap_item_down(self, start_position): new_item = self.array[start_position] iteration_index = start_position while self.has_left_child(iteration_index): smaller_child_index = self.get_left_child_index(iteration_index) right_child_index = self.get_right_child_index(iteration_index) if self.has_right_child(iteration_index) and self.array[right_child_index] < self.array[smaller_child_index]: smaller_child_index = right_child_index if new_item > self.array[smaller_child_index]: self.array[iteration_index] = self.array[smaller_child_index] iteration_index = smaller_child_index else: break # print(self) # print(f"iteration_index is {iteration_index}, item value is {new_item}") self.array[iteration_index] = new_item # print(self) # print("heap_item_down") def heap_push(self, item): self.array.append(item) self.size += 1 self.heap_item_up(0, self.size - 1) def heap_pop(self): """Pop the smallest item off the heap, maintaining the heap invariant.""" last_item = self.array.pop() # raises appropriate IndexError if heap is empty self.size -= 1 if self.size != 0: return_item = self.array[0] self.array[0] = last_item # print(self) # print(f"first not adjusted, item is {last_item}") self.heap_item_down(0) return return_item return last_item heap_little_object = HeapLittle() array_normal = [np.random.randint(1, 50) for i in range(0, 10)] print(array_normal) for i in array_normal: heap_little_object.heap_push(i) print(heap_little_object) for i in range(0, 10): pop_item = heap_little_object.heap_pop() print(heap_little_object) print('python heapq') array_heapq = [] for i in array_normal: heapq.heappush(array_heapq, i) print(array_heapq) for i in range(0, 10): pop_item = heapq.heappop(array_heapq) print(array_heapq)
验证效果如下
[9, 13, 45, 3, 23, 2, 4, 49, 19, 36] [9] [9, 13] [9, 13, 45] [3, 9, 45, 13] [3, 9, 45, 13, 23] [2, 9, 3, 13, 23, 45] [2, 9, 3, 13, 23, 45, 4] [2, 9, 3, 13, 23, 45, 4, 49] [2, 9, 3, 13, 23, 45, 4, 49, 19] [2, 9, 3, 13, 23, 45, 4, 49, 19, 36] [3, 9, 4, 13, 23, 45, 36, 49, 19] [4, 9, 19, 13, 23, 45, 36, 49] [9, 13, 19, 49, 23, 45, 36] [13, 23, 19, 49, 36, 45] [19, 23, 45, 49, 36] [23, 36, 45, 49] [36, 49, 45] [45, 49] [49] Heap is empty python heapq [9] [9, 13] [9, 13, 45] [3, 9, 45, 13] [3, 9, 45, 13, 23] [2, 9, 3, 13, 23, 45] [2, 9, 3, 13, 23, 45, 4] [2, 9, 3, 13, 23, 45, 4, 49] [2, 9, 3, 13, 23, 45, 4, 49, 19] [2, 9, 3, 13, 23, 45, 4, 49, 19, 36] [3, 9, 4, 13, 23, 45, 36, 49, 19] [4, 9, 19, 13, 23, 45, 36, 49] [9, 13, 19, 49, 23, 45, 36] [13, 23, 19, 49, 36, 45] [19, 23, 45, 49, 36] [23, 36, 45, 49] [36, 49, 45] [45, 49] [49] []
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。