We can possibly use a heap instead of a Deque, hopefully giving better performance. Here is how it might work.
Assuming the input is (a, Maybe a), where the first element of the tuple is the element being inserted in the window and the second element is the one being ejected from the window. Assuming a min heap to find the minimum:
- When the input is
(x, Nothing), insert x into the heap
- When inserting
x into the heap, discard any of the existing heap elements that are greater than x.
- When the input is
(x, Just y), since we know that y is the last element in the window, we know that y has to be minimum if it is present in the heap. So remove y from the top of the heap if it is the top element in the heap. Then insert x as described in the previous steps. If there are multiple instances of y in the window then we should not remove it, we can keep a refcount for that and decrement the refcount until it becomes 0. Increment the refcount when inserting a duplicate.
Note that we would need a custom heap implementation as we need to cut the top of the heap in one case and the bottom of the heap in another.
The cost would likely be n * log w where n is the number of elements in the stream and w is the window size. The worst case for minimum would be when the input stream is sorted ascending order. The best case would be when it is sorted in the descending order.
Note that we can perform a partial sort of the stream by scanning it using the min or the max fold.
We can possibly use a heap instead of a
Deque, hopefully giving better performance. Here is how it might work.Assuming the input is (a, Maybe a), where the first element of the tuple is the element being inserted in the window and the second element is the one being ejected from the window. Assuming a min heap to find the minimum:
(x, Nothing), insertxinto the heapxinto the heap, discard any of the existing heap elements that are greater thanx.(x, Just y), since we know thatyis the last element in the window, we know thatyhas to be minimum if it is present in the heap. So removeyfrom the top of the heap if it is the top element in the heap. Then insertxas described in the previous steps. If there are multiple instances ofyin the window then we should not remove it, we can keep a refcount for that and decrement the refcount until it becomes 0. Increment the refcount when inserting a duplicate.Note that we would need a custom heap implementation as we need to cut the top of the heap in one case and the bottom of the heap in another.
The cost would likely be
n * log wwherenis the number of elements in the stream andwis the window size. The worst case forminimumwould be when the input stream is sorted ascending order. The best case would be when it is sorted in the descending order.Note that we can perform a partial sort of the stream by scanning it using the
minor themaxfold.