📄 compatheapq.py
# -*- coding: Latin-1 -*-

"""Heap queue algorithm (a.k.a. priority queue).

Heaps are arrays for which a[k] <= a[2*k+1] and a[k] <= a[2*k+2] for
all k, counting elements from 0.  For the sake of comparison,
non-existing elements are considered to be infinite.  The interesting
property of a heap is that a[0] is always its smallest element.

Usage:

heap = []            # creates an empty heap
heappush(heap, item) # pushes a new item on the heap
item = heappop(heap) # pops the smallest item from the heap
item = heap[0]       # smallest item on the heap without popping it
heapify(x)           # transforms list into a heap, in-place, in linear time
item = heapreplace(heap, item) # pops and returns smallest item, and adds
                               # new item; the heap size is unchanged

Our API differs from textbook heap algorithms as follows:

- We use 0-based indexing.  This makes the relationship between the
  index for a node and the indexes for its children slightly less
  obvious, but is more suitable since Python uses 0-based indexing.

- Our heappop() method returns the smallest item, not the largest.

These two make it possible to view the heap as a regular Python list
without surprises: heap[0] is the smallest item, and heap.sort()
maintains the heap invariant!
"""

# Original code by Kevin O'Connor, augmented by Tim Peters

__about__ = """Heap queues

[explanation by François Pinard]

Heaps are arrays for which a[k] <= a[2*k+1] and a[k] <= a[2*k+2] for
all k, counting elements from 0.  For the sake of comparison,
non-existing elements are considered to be infinite.  The interesting
property of a heap is that a[0] is always its smallest element.

The strange invariant above is meant to be an efficient memory
representation for a tournament.  The numbers below are `k', not a[k]:

                                   0

                  1                                 2

          3               4                5               6

      7       8       9       10      11      12      13      14

    15 16   17 18   19 20   21 22   23 24   25 26   27 28   29 30


In the tree above, each cell `k' is topping `2*k+1' and `2*k+2'.  In
a usual binary tournament we see in sports, each cell is the winner
over the two cells it tops, and we can trace the winner down the tree
to see all opponents s/he had.  However, in many computer applications
of such tournaments, we do not need to trace the history of a winner.
To be more memory efficient, when a winner is promoted, we try to
replace it by something else at a lower level, and the rule becomes
that a cell and the two cells it tops contain three different items,
but the top cell "wins" over the two topped cells.

If this heap invariant is protected at all times, index 0 is clearly
the overall winner.  The simplest algorithmic way to remove it and
find the "next" winner is to move some loser (let's say cell 30 in the
diagram above) into the 0 position, and then percolate this new 0 down
the tree, exchanging values, until the invariant is re-established.
This is clearly logarithmic on the total number of items in the tree.
By iterating over all items, you get an O(n ln n) sort.

A nice feature of this sort is that you can efficiently insert new
items while the sort is going on, provided that the inserted items are
not "better" than the last 0'th element you extracted.  This is
especially useful in simulation contexts, where the tree holds all
incoming events, and the "win" condition means the smallest scheduled
time.  When an event schedules other events for execution, they are
scheduled into the future, so they can easily go into the heap.  So, a
heap is a good structure for implementing schedulers (this is what I
used for my MIDI sequencer :-).

Various structures for implementing schedulers have been extensively
studied, and heaps are good for this, as they are reasonably speedy,
the speed is almost constant, and the worst case is not much different
than the average case.  However, there are other representations which
are more efficient overall, yet the worst cases might be terrible.

Heaps are also very useful in big disk sorts.  You most probably all
know that a big sort implies producing "runs" (which are pre-sorted
sequences, whose size is usually related to the amount of CPU memory),
followed by merging passes for these runs, which merging is often
very cleverly organised[1].  It is very important that the initial
sort produces the longest runs possible.  Tournaments are a good way
to achieve that.  If, using all the memory available to hold a
tournament, you replace and percolate items that happen to fit the
current run, you'll produce runs which are twice the size of the
memory for random input, and much better for input fuzzily ordered.

Moreover, if you output the 0'th item on disk and get an input which
may not fit in the current tournament (because the value "wins" over
the last output value), it cannot fit in the heap, so the size of the
heap decreases.  The freed memory could be cleverly reused immediately
for progressively building a second heap, which grows at exactly the
same rate the first heap is melting.  When the first heap completely
vanishes, you switch heaps and start a new run.  Clever and quite
effective!

In a word, heaps are useful memory structures to know.  I use them in
a few applications, and I think it is good to keep a `heap' module
around. :-)

--------------------
[1] The disk balancing algorithms which are current, nowadays, are
more annoying than clever, and this is a consequence of the seeking
capabilities of the disks.  On devices which cannot seek, like big
tape drives, the story was quite different, and one had to be very
clever to ensure (far in advance) that each tape movement will be the
most effective possible (that is, will best participate at
"progressing" the merge).  Some tapes were even able to read
backwards, and this was also used to avoid the rewinding time.
Believe me, real good tape sorts were quite spectacular to watch!
From all times, sorting has always been a Great Art! :-)
"""

# Python 3 has no xrange(); fall back to range() there so that the module
# runs unchanged on both Python 2 and Python 3.
try:
    xrange
except NameError:
    xrange = range

def heappush(heap, item):
    """Push item onto heap, maintaining the heap invariant."""
    heap.append(item)
    _siftdown(heap, 0, len(heap)-1)

def heappop(heap):
    """Pop the smallest item off the heap, maintaining the heap invariant."""
    lastelt = heap.pop()    # raises appropriate IndexError if heap is empty
    if heap:
        returnitem = heap[0]
        heap[0] = lastelt
        _siftup(heap, 0)
    else:
        returnitem = lastelt
    return returnitem

def heapreplace(heap, item):
    """Pop and return the current smallest value, and add the new item.

    This is more efficient than heappop() followed by heappush(), and can be
    more appropriate when using a fixed-size heap.  Note that the value
    returned may be larger than item!  That constrains reasonable uses of
    this routine.
    """
    returnitem = heap[0]    # raises appropriate IndexError if heap is empty
    heap[0] = item
    _siftup(heap, 0)
    return returnitem

def heapify(x):
    """Transform list into a heap, in-place, in O(len(heap)) time."""
    n = len(x)
    # Transform bottom-up.  The largest index there's any point to looking at
    # is the largest with a child index in-range, so must have 2*i + 1 < n,
    # or i < (n-1)/2.  If n is even = 2*j, this is (2*j-1)/2 = j-1/2 so
    # j-1 is the largest, which is n//2 - 1.  If n is odd = 2*j+1, this is
    # (2*j+1-1)/2 = j so j-1 is the largest, and that's again n//2-1.
    for i in xrange(n//2 - 1, -1, -1):
        _siftup(x, i)

# 'heap' is a heap at all indices >= startpos, except possibly for pos.  pos
# is the index of a leaf with a possibly out-of-order value.  Restore the
# heap invariant.
def _siftdown(heap, startpos, pos):
    newitem = heap[pos]
    # Follow the path to the root, moving parents down until finding a place
    # newitem fits.
    while pos > startpos:
        parentpos = (pos - 1) >> 1
        parent = heap[parentpos]
        if parent <= newitem:
            break
        heap[pos] = parent
        pos = parentpos
    heap[pos] = newitem

# The child indices of heap index pos are already heaps, and we want to make
# a heap at index pos too.  We do this by bubbling the smaller child of
# pos up (and so on with that child's children, etc) until hitting a leaf,
# then using _siftdown to move the oddball originally at index pos into place.
#
# We *could* break out of the loop as soon as we find a pos where newitem <=
# both its children, but turns out that's not a good idea, and despite that
# many books write the algorithm that way.  During a heap pop, the last array
# element is sifted in, and that tends to be large, so that comparing it
# against values starting from the root usually doesn't pay (= usually doesn't
# get us out of the loop early).  See Knuth, Volume 3, where this is
# explained and quantified in an exercise.
#
# Cutting the # of comparisons is important, since these routines have no
# way to extract "the priority" from an array element, so that intelligence
# is likely to be hiding in custom __cmp__ methods, or in array elements
# storing (priority, record) tuples.  Comparisons are thus potentially
# expensive.
#
# On random arrays of length 1000, making this change cut the number of
# comparisons made by heapify() a little, and those made by exhaustive
# heappop() a lot, in accord with theory.  Here are typical results from 3
# runs (3 just to demonstrate how small the variance is):
#
# Compares needed by heapify     Compares needed by 1000 heappops
# --------------------------     --------------------------------
# 1837 cut to 1663               14996 cut to 8680
# 1855 cut to 1659               14966 cut to 8678
# 1847 cut to 1660               15024 cut to 8703
#
# Building the heap by using heappush() 1000 times instead required
# 2198, 2148, and 2219 compares:  heapify() is more efficient, when
# you can use it.
#
# The total compares needed by list.sort() on the same lists were 8627,
# 8627, and 8632 (this should be compared to the sum of heapify() and
# heappop() compares):  list.sort() is (unsurprisingly!) more efficient
# for sorting.

def _siftup(heap, pos):
    endpos = len(heap)
    startpos = pos
    newitem = heap[pos]
    # Bubble up the smaller child until hitting a leaf.
    childpos = 2*pos + 1    # leftmost child position
    while childpos < endpos:
        # Set childpos to index of smaller child.
        rightpos = childpos + 1
        if rightpos < endpos and heap[rightpos] <= heap[childpos]:
            childpos = rightpos
        # Move the smaller child up.
        heap[pos] = heap[childpos]
        pos = childpos
        childpos = 2*pos + 1
    # The leaf at pos is empty now.  Put newitem there, and bubble it up
    # to its final resting place (by sifting its parents down).
    heap[pos] = newitem
    _siftdown(heap, startpos, pos)

if __name__ == "__main__":
    # Simple sanity test
    heap = []
    data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
    for item in data:
        heappush(heap, item)
    sort = []
    while heap:
        sort.append(heappop(heap))
    print(sort)
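
The scheduler use described in the module notes (events that schedule further events into the future) can be sketched as follows. This is only an illustration: it uses the standard library `heapq` module, which exposes the same `heappush`/`heappop`/`heapify` API as this file, and the names `run_events`, `follow_ups`, and `limit` are hypothetical, not part of the original code.

```python
import heapq


def run_events(initial_events, follow_ups, limit=100):
    """Fire (time, name) events in time order and return them as a list.

    follow_ups (a hypothetical structure) maps an event name to a list of
    (delay, name) pairs that the event schedules when it fires. Delays are
    assumed to be non-negative, so a new event never precedes the event
    that was just popped.
    """
    heap = list(initial_events)
    heapq.heapify(heap)              # linear-time build of the event queue
    fired = []
    while heap and len(fired) < limit:
        now, name = heapq.heappop(heap)      # earliest scheduled event
        fired.append((now, name))
        for delay, next_name in follow_ups.get(name, ()):
            # Follow-up events land in the future, so they fit in the heap.
            heapq.heappush(heap, (now + delay, next_name))
    return fired


if __name__ == "__main__":
    events = [(5, "b"), (1, "a")]
    chain = {"a": [(2, "c"), (10, "d")], "c": [(1, "e")]}
    print(run_events(events, chain))
```

Storing `(time, name)` tuples keeps the priority in the first tuple slot, which matches the "(priority, record) tuples" convention mentioned in the comments of this file. The `limit` argument is only a guard against event chains that never terminate.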
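
The run-generation idea from the disk-sort discussion (output the smallest item, then put the next input either in the current heap or in a second heap that grows as the first one shrinks) might look like the sketch below. It is an in-memory illustration only, written against the standard library `heapq` module; `make_runs`, `memory_size`, and `_EXHAUSTED` are hypothetical names, and a real external sort would write each run to disk instead of keeping it in a list.

```python
import heapq
from itertools import islice

_EXHAUSTED = object()  # sentinel marking the end of the input


def make_runs(items, memory_size):
    """Split items into sorted runs, holding at most memory_size items."""
    if memory_size < 1:
        raise ValueError("memory_size must be at least 1")
    source = iter(items)
    current = list(islice(source, memory_size))  # fill the available memory
    heapq.heapify(current)
    pending = []          # second heap: items that must wait for the next run
    runs, run = [], []
    while current:
        smallest = heapq.heappop(current)
        run.append(smallest)
        incoming = next(source, _EXHAUSTED)
        if incoming is not _EXHAUSTED:
            if incoming >= smallest:
                # Still fits after the last output value: same run.
                heapq.heappush(current, incoming)
            else:
                # Would break the ordering of this run: hold it back.
                heapq.heappush(pending, incoming)
        if not current:
            # First heap has vanished: close the run and switch heaps.
            runs.append(run)
            run = []
            current, pending = pending, []
    return runs


if __name__ == "__main__":
    print(make_runs([5, 1, 9, 2, 8, 3, 7, 0, 6, 4], 3))
```

With a memory of three items, the ten-item input above yields two runs of five items each, longer than the memory itself. The runs can then be combined with `heapq.merge(*runs)`, which performs the merging pass lazily.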
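
The `heapreplace()` docstring mentions fixed-size heaps. One common use, shown here as a sketch with the standard library `heapq` module, is keeping the `n` largest items of a stream: the heap holds the current candidates, and its smallest element is the one to evict. The function name `n_largest` is hypothetical.

```python
import heapq


def n_largest(items, n):
    """Return the n largest items, largest first, using a heap of size n."""
    if n <= 0:
        return []
    heap = []
    for item in items:
        if len(heap) < n:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            # heap[0] is the smallest kept item; swap it for the new one
            # in a single sift instead of a pop followed by a push.
            heapq.heapreplace(heap, item)
    return sorted(heap, reverse=True)


if __name__ == "__main__":
    print(n_largest([5, 1, 9, 2, 8, 3, 7, 0, 6, 4], 3))
```

The `item > heap[0]` check matters because `heapreplace()` always pops first: without the check, an item smaller than everything in the heap would evict a larger one, which is the caveat the docstring warns about.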