这个模块(build-in)实现了一个堆的数据结构,完美的解决了Top-K问题,以后解决Top-K问题的时候,直接把这个模块拿来用就可以了
注意,默认的heap是一个小顶堆!
heapq模块提供了如下几个函数:
heapq.heappush(heap, item) 把item添加到heap中(heap是一个列表)
heapq.heappop(heap) 把堆顶元素弹出,返回的就是堆顶
heapq.heappushpop(heap, item) 先把item加入到堆中,然后再pop,比heappush()再heappop()要快得多
heapq.heapreplace(heap, item) 先pop,然后再把item加入到堆中,比heappop()再heappush()要快得多
heapq.heapify(x) 将列表x进行堆调整,默认的是小顶堆
heapq.merge(*iterables) 将多个列表合并,并进行堆调整,返回的是合并后的列表的迭代器
heapq.nlargest(n, iterable, key=None) 返回最大的n个元素(Top-K问题)
heapq.nsmallest(n, iterable, key=None) 返回最小的n个元素(Top-K问题)
下面看下示例:
- import heapq
- import random
-
- # Top-K
- mylist = list(random.sample(range(100), 10))
- k = 3
- largest = heapq.nlargest(k, mylist)
- smallest = heapq.nsmallest(k, mylist)
- print('original list is', mylist)
- print('largest-'+str(k), ' is ', largest)
- print('smallest-'+str(k), ' is ', smallest)
-
- # heapify
- print('original list is', mylist)
- heapq.heapify(mylist)
- print('heapify list is', mylist)
-
- # heappush & heappop
- heapq.heappush(mylist, 105)
- print('pushed heap is', mylist)
- heapq.heappop(mylist)
- print('popped heap is', mylist)
-
- # heappushpop & heapreplace
- heapq.heappushpop(mylist, 130) # heappush -> heappop
- print('heappushpop', mylist)
- heapq.heapreplace(mylist, 2) # heappop -> heappush
- print('heapreplace', mylist)
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
- 18
- 19
- 20
- 21
- 22
- 23
- 24
- 25
- 26
- 27
- 28
- 29
输出结果为:
- original list is [18, 6, 10, 24, 48, 2, 9, 25, 16, 89]
- largest-3 is [89, 48, 25]
- smallest-3 is [2, 6, 9]
- original list is [18, 6, 10, 24, 48, 2, 9, 25, 16, 89]
- heapify list is [2, 6, 9, 16, 48, 10, 18, 25, 24, 89]
- pushed heap is [2, 6, 9, 16, 48, 10, 18, 25, 24, 89, 105]
- popped heap is [6, 16, 9, 24, 48, 10, 18, 25, 105, 89]
- heappushpop [9, 16, 10, 24, 48, 130, 18, 25, 105, 89]
- heapreplace [2, 16, 10, 24, 48, 130, 18, 25, 105, 89]
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
希望能对你有所帮助!