The Heap data structure is a data structure that maintains a set of items in a semi-ordered manner where the smallest (or largest) item can be retrieved quickly. The beauty of it comes from its ability to efficiently support updates, such as deletions or insertions of items. The heap can either be a min-heap or a max-heap. The min-heap allows fast retrieval of the smallest item, while the max-heap allows fast retrieval of the largest item.
In this post, we will only talk about the min-heap because similar principles apply to the max-heap. A min-heap is essentially a complete binary tree that satisfies the heap property – if node A is the parent of node B then the value of A is less than the value of B. The heap supports the following two operations:
To go into further detail, the heapify operation on node A works as follow:
It is easy to see that the operations can be executed at most the depth of the tree times. Since the tree is a complete binary tree, the depth is O(log n) and thus, heapify is done in O(log n).
Since the heap is also a complete binary tree, it is much simpler to implement the data structure by using an array. Index 1 represents the root of the tree while indices 2i to 2(i + 1) - 1 represents the nodes on the i-th level of the tree.
The heap can be used to implement a priority queue, where a stream of tasks with different priorities arrives at different times. The task with the highest priority can be extracted quickly in O(1), while each update (tasks arriving or ending) takes O(log n).
The heap is also a crucial component in the Dijkstra’s shortest path algorithm, where at each iteration the node with the minimum value has to be retrieved and updated.