Why is a Fibonacci heap called a Fibonacci heap?

The Fibonacci heap data structure has the word "Fibonacci" in its name, but nothing in the data structure seems to use Fibonacci numbers. According to the Wikipedia article:
The name of Fibonacci heap comes from Fibonacci numbers which are used in the running time analysis.
How do these Fibonacci numbers arise in the Fibonacci heap?

The Fibonacci heap is made up of a collection of smaller heap-ordered trees of different "orders" that obey certain structural constraints. The Fibonacci sequence arises because these trees are constructed in a way such that a tree of order n has at least F(n+2) nodes in it, where F(n+2) is the (n + 2)nd Fibonacci number.
To see why this result is true, let's begin by seeing how the trees in the Fibonacci heap are constructed. Initially, whenever a node is put into a Fibonacci heap, it is put into a tree of order 0 that contains just that node. Whenever a value is removed from the Fibonacci heap, some of the trees in the Fibonacci heap are coalesced together such that the number of trees doesn't grow too large.
When combining trees together, the Fibonacci heap only combines together trees of the same order. To combine two trees of order n into a tree of order n + 1, the Fibonacci heap takes whichever of the two trees has a greater root value than the other, then makes that tree a child of the other tree. One consequence of this fact is that trees of order n always have exactly n children.
The main attraction of the Fibonacci heap is that it supports the decrease-key operation efficiently (in amortized O(1) time). In order to support this, the Fibonacci heap implements decrease-key as follows: to decrease the key of a value stored in some node, that node is cut from its parent and treated as its own separate tree. When this happens, the order of its old parent node is decreased by one. For example, if an order 4 tree has a child cut from it, it shrinks to an order 3 tree, which makes sense because the order of a tree is supposed to be the number of children it contains.
The problem with doing this is that if too many trees get cut off from the same tree, we might have a tree with a large order but which contains a very small number of nodes. The time guarantees of a Fibonacci heap are only possible if trees with large orders contain a huge number of nodes, and if we can just cut any nodes we'd like from trees we could easily get into a situation where a tree with a huge order only contains a small number of nodes.
To address this, Fibonacci heaps make one requirement - if you cut two children from a tree, you have to in turn cut that tree from its parent. This means that the trees that form a Fibonacci heap won't be too badly damaged by decrease-key.
And now we can get to the part about Fibonacci numbers. At this point, we can say the following about the trees in a Fibonacci heap:
A tree of order n has exactly n children.
Trees of order n are formed by taking two trees of order n - 1 and making one the child of another.
If a tree loses two children, that tree is cut away from its parent.
So now we can ask - what are the smallest possible trees that you can make under these assumptions?
Let's try out some examples. There is only one possible tree of order 0, which is just a single node:
Smallest possible order 0 tree: *
The smallest possible tree of order 1 would have to be at least a node with a child. The smallest possible child we could make is a single node with the smallest order-0 tree as a child, which is this tree:
Smallest possible order 1 tree: *
|
*
What about the smallest tree of order 2? This is where things get interesting. This tree certainly has to have two children, and it would be formed by merging together two trees of order 1. Consequently, the tree would initially have two children - a tree of order 0 and a tree of order 1. But remember - we can cut away children from trees after merging them! In this case, if we cut away the child of the tree of order 1, we would be left with a tree with two children, both of which are trees of order 0:
Smallest possible order 2 tree: *
/ \
* *
How about order 3? As before, this tree would be made by merging together two trees of order 2. We would then try to cut away as much of the subtrees of this order-3 tree as possible. When it's created, the tree has subtrees of orders 2, 1, and 0. We can't cut anything from the order-0 tree, but we can cut a single child from each of the order-2 and order-1 trees. If we do this, we're left with a tree with three children, one of order 1 and two of order 0:
Smallest possible order 3 tree: *
/|\
* * *
|
*
Now we can spot a pattern. The smallest possible order-(n + 2) tree would be formed as follows: start by creating a normal order (n + 2) tree, which has children of orders n + 1, n, n - 1, ..., 2, 1, 0. Then, make those trees as small as possible by cutting away nodes from them without cutting two children from the same node. This leaves a tree with children of orders n, n - 1, ..., 1, 0, and 0.
We can now write a recurrence relation to try to determine how many nodes are in these trees. If we do this, we get the following, where NC(n) represents the smallest number of nodes that could be in a tree of order n:
NC(0) = 1
NC(1) = 2
NC(n + 2) = NC(n) + NC(n - 1) + ... + NC(1) + NC(0) + NC(0) + 1
Here, the final +1 accounts for the root node itself.
If we expand out these terms, we get the following:
NC(0) = 1
NC(1) = 2
NC(2) = NC(0) + NC(0) + 1 = 3
NC(3) = NC(1) + NC(0) + NC(0) + 1 = 5
NC(4) = NC(2) + NC(1) + NC(0) + NC(0) + 1 = 8
If you'll notice, this is exactly the Fibonacci sequence offset by two positions. In other words, each of these trees has to have at least F(n+2) nodes in it, where F(n+2) is the (n + 2)nd Fibonacci number.
So why is it called a Fibonacci heap? Because each tree of order n has to have at least F(n+2) nodes in it!
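As a quick sanity check (not part of the original argument), here is a small Python sketch that evaluates the NC recurrence above and compares it against the Fibonacci numbers; the function names are made up for illustration:

def smallest_tree_sizes(max_order):
    # NC(n): the fewest nodes a tree of order n can contain,
    # following the recurrence above.
    nc = [1, 2]                                    # NC(0) = 1, NC(1) = 2
    for order in range(2, max_order + 1):
        # children of orders order-2, ..., 1, 0, plus an extra
        # order-0 child, plus the root itself
        nc.append(sum(nc[:order - 1]) + nc[0] + 1)
    return nc

def fibonacci(k):
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a

print(smallest_tree_sizes(10))                     # [1, 2, 3, 5, 8, 13, ...]
print([fibonacci(n + 2) for n in range(11)])       # the same list: F(2)..F(12)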
If you're curious, the original paper on Fibonacci heaps has pictures of these smallest possible trees. It's pretty nifty to see! Also, check out this CS Theory Stack Exchange Post for an alternative explanation as to why Fibonacci heap trees have the sizes they do.
Hope this helps!

I want to give an intuitive explanation that I myself had an "Aha!" moment with.
Tree structures achieve O(log n) runtimes because they can store an exponential number of items relative to their height. A binary tree can store 2^h items, a ternary tree 3^h, and in general x^h.
Can x be less than 2? Sure it can! As long as x > 1, we still get logarithmic runtimes and the ability to store an exponentially large number of items for a given height. But how do we build such a tree? The Fibonacci heap is a data structure where x ≈ 1.62, the golden ratio. And whenever we encounter the golden ratio, there are Fibonacci numbers lurking somewhere...
A Fibonacci heap is actually a forest of trees. After a process called "consolidation", each of these trees holds a distinct number of items that is an exact power of 2. For example, if your Fibonacci heap has 15 items, it would have 4 trees holding 1, 2, 4, and 8 items respectively.
The details of the "consolidation" process are not relevant here, but in essence it keeps unioning trees of the same degree in the forest until all trees have distinct degrees (try a visualization to see how these trees get built). Since you can write any n as a sum of distinct powers of 2, it's easy to imagine what the consolidated trees of a Fibonacci heap would look like for any n.
OK, so far we still don't see any direct connection to Fibonacci numbers. Where do they come into the picture? They actually appear in a very special case, and this is also key to why a Fibonacci heap can offer O(1) time for the DECREASE-KEY operation. When we decrease a key, if the new key is still larger than the parent's key then we don't need to do anything else, because the min-heap property is not violated. But if it isn't, we can't leave the smaller child buried under a larger parent, so we need to cut its subtree out and make a new tree out of it. Obviously we can't keep doing this for every decrease or delete, because otherwise we would end up with trees that no longer hold exponentially many items for their height, which means no more O(log n) time for the extract operation. So the question is: what rule can we set up so that trees still hold exponentially many items for their height? The clever insight is this:
We will allow each parent to lose at most one child. If there is a subsequent attempt to remove another child from the same parent, then we cut that parent out of its tree as well and put it in the root list as a new tree.
The above rule makes sure trees still hold exponentially many items for their height, even in the worst case. What is the worst case? The worst case occurs when we make every parent lose one child. If a parent has more than one child, we get to choose which one to get rid of; in that case, let's get rid of the child with the largest subtree. So in the forest above (trees of 1, 2, 4, and 8 items), if you do that for each parent, you will be left with trees of sizes 1, 1, 2, and 3. See a pattern here? Just for fun, add a new tree of order 4 (i.e. 16 items) to that forest and guess what you would be left with after running this rule for each parent: 5. We have a Fibonacci sequence here!
As the Fibonacci sequence grows exponentially, each tree still retains an exponential number of items for its height and thus EXTRACT-MIN still manages to run in O(log n) time. However, notice that DECREASE-KEY now takes only O(1). One other cool thing is that you can represent any number as a sum of Fibonacci numbers. For example, 32 = 21 + 8 + 3, which means that if you needed to hold 32 items in a Fibonacci heap, you could do so using 3 trees holding 21, 8, and 3 items respectively. However, the "consolidation" process does not produce trees with Fibonacci numbers of nodes; those only occur when this worst case of DECREASE-KEY or DELETE happens.
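If you want to see the 1, 1, 2, 3, 5 pattern come out of the worst case described above, here is a rough simulation (my own sketch, not the answer's code): build the consolidated trees as binomial trees, then let every parent lose its largest child.

def tree_size(node):
    # a node is represented by the list of its children's subtrees
    return 1 + sum(tree_size(child) for child in node)

def binomial_tree(order):
    # the root of an order-k tree has children of orders k-1, ..., 1, 0
    return [binomial_tree(i) for i in range(order - 1, -1, -1)]

def worst_case_size(node):
    # every parent loses its single allowed child: the largest one
    survivors = sorted(node, key=tree_size)[:-1]
    return 1 + sum(worst_case_size(child) for child in survivors)

for order in range(6):
    tree = binomial_tree(order)
    print(order, tree_size(tree), worst_case_size(tree))
# sizes 1, 2, 4, 8, 16, 32 shrink to 1, 1, 2, 3, 5, 8 -- Fibonacci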
Further reading
Binomial heap - Fibonacci heaps are essentially lazy binomial heaps. It's a cool data structure because it shows a different way, other than the binary heap, of storing an exponential number of items in a tree relative to its height.
Intuition behind Fibonacci heaps - required reading for anyone who wants to understand Fibonacci heaps at their core. Textbooks are often either too shallow or too cluttered on this subject, but the author of this SO answer has nailed it hands down.

Related

Runtime and space complexity of the recursive determinant algorithm for an n x n matrix

I am trying to figure out the runtime and space complexity of the algorithm below.
Some say that the runtime complexity of this is O(n!), and I am guessing it is because there are n! recursive calls for a recursive algorithm that solves an n*n matrix, but I am not sure if I am right.
Also, is the space complexity n! as well?
It might help to write out an explicit recurrence relation that governs the runtime of a straightforward implementation of the recursive algorithm. Notice that, in working on an n × n matrix, evaluating the sum requires making n recursive calls on matrices of size (n - 1) × (n - 1). Each recursive call requires about (n - 1)^2 additional time to set up, since we need to extract a submatrix of that size from the original matrix, so the total per-call overhead of the algorithm would be Θ(n^3) because we’re doing quadratic work linearly many times. That means that our work done is roughly
T(n) = nT(n - 1) + n^3.
Completely ignoring the cubic term here, notice that expanding out the recursion will have the following effect:
T(n) = nT(n - 1) + ...
= n(n-1)T(n-2) + ...
= n(n-1)(n-2)T(n-3) + ...
and eventually we’ll get an n! term showing up, plus a bunch of extra terms from the cubic. So the work done here is at least Ω(n!), and probably a lot more once we factor in the cubic term.
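The question's original code isn't reproduced here, but a naive cofactor-expansion implementation of the kind this analysis assumes might look like the following sketch (the function name is made up for illustration):

def det(matrix):
    # Determinant by cofactor expansion along the first row.
    # The runtime satisfies T(n) = n T(n - 1) + Theta(n^3), as above.
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0
    for col in range(n):
        # drop row 0 and this column: the Theta((n - 1)^2) setup cost
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        sign = -1 if col % 2 else 1
        total += sign * matrix[0][col] * det(minor)
    return total

print(det([[1, 2], [3, 4]]))                     # -2
print(det([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))    # 18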
As for the space complexity - when working with the space complexity, remember that once one branch of the recursion terminates we can reuse the space that branch was using. This means that we only really need to look at any one branch to see how much space is needed.
With a naive implementation of this summation where we explicitly compute the submatrices for the recursive calls, we’ll need space to store one matrix of size n × n, one of size (n-1) × (n-1), one of size (n-2) × (n-2), etc. That space usage sums up to Θ(n^3).
There are a bunch of other algorithms you can use to compute determinants in much less time and space. Some are based on Gaussian elimination and run in time O(n^3), for example.
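For contrast, here is a minimal sketch of the O(n^3) Gaussian-elimination approach mentioned above (my own illustration, using partial pivoting and floating point, so the result is approximate):

def det_gauss(matrix):
    a = [row[:] for row in matrix]     # work on a copy
    n = len(a)
    sign = 1.0
    for i in range(n):
        # choose the largest pivot in column i for numerical stability
        pivot = max(range(i, n), key=lambda r: abs(a[r][i]))
        if abs(a[pivot][i]) < 1e-12:
            return 0.0                 # (numerically) singular
        if pivot != i:
            a[i], a[pivot] = a[pivot], a[i]
            sign = -sign               # a row swap flips the sign
        for r in range(i + 1, n):
            factor = a[r][i] / a[i][i]
            for c in range(i, n):
                a[r][c] -= factor * a[i][c]
    product = sign
    for i in range(n):
        product *= a[i][i]             # determinant = +/- product of pivots
    return product

print(det_gauss([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))   # ~18.0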

calculate time complexity in recursive and dynamic solution of “number of ways to move from top left to bottom right in a matrix”

There is an m*n matrix and we need to find all possible paths from top left to bottom right.
It can be traversed only in right and down directions.
I have the following doubts:
In the recursive approach I understand that the time complexity will be O(2^(m+n)). How can I get this using induction?
How do I find the complexity in case of dynamic programming solution?
In dynamic programming you fill an array dp[i][j], where dp[i][j] means the number of ways to reach cell (i,j) from the top-left cell. Also, dp[i][j] = dp[i][j-1] + dp[i-1][j] (handling the corner cases where i=1 or j=1 separately). So in total you have to fill the dp table with n*m entries, and each entry depends on a constant number of entries (at most 2): dp[i-1][j] and dp[i][j-1]. Thus the complexity will be O(2*n*m), which is O(n*m).
Secondly, if we don't do DP or memoization (you can google it) and solve it recursively, then we are basically tracing all the possible paths while finding the count. So the complexity would be the number of paths from the top-left cell to the bottom-right cell. All paths have m-1 horizontal and n-1 vertical moves, so the number of paths is (m+n-2)! / ((m-1)! * (n-1)!). That is the complexity, not exponential as you suggested.
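To make the two counts from this answer concrete, here is a short sketch (my own, with made-up function names) that fills the dp table and checks it against the closed-form binomial coefficient:

from math import comb

def count_paths_dp(m, n):
    # dp[i][j] = number of ways to reach cell (i, j) from the top-left,
    # moving only right or down; O(m*n) entries, O(1) work per entry
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if i == 1 or j == 1:
                dp[i][j] = 1           # a single way along the edges
            else:
                dp[i][j] = dp[i - 1][j] + dp[i][j - 1]
    return dp[m][n]

m, n = 5, 4
print(count_paths_dp(m, n))            # 35
print(comb(m + n - 2, m - 1))          # (m+n-2)! / ((m-1)! (n-1)!) = 35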
For the first question without memoisation:
1) In the recursive approach I understand that the time complexity will be
O(2^(m+n)). How can I get this using induction?
When we represent the successive calls of the recursive function as a binary tree, each call at floor k (floor 0 being the root of the tree, the start position) makes two new recursive calls at floor k + 1. Besides, as stated in sachas's answer, all paths have m-1 horizontal and n-1 vertical moves. There are therefore (m-1) + (n-1) floors, one for each possible kth move.
Then, because:
there are 2^k calls at floor k,
calls at every floor add up,
there is a total of (m-1) + (n-1) floors,
there are therefore 2^0 + 2^1 + ... + 2^((m-1)+(n-1)) = 2^((m-1)+(n-1)+1) - 1 calls of the function (according to the formula for the sum of a geometric sequence), and since each call of the recursive function itself does O(1) work, the complexity is then O(2^((m-1)+(n-1)+1)) = O(2^(m+n)). Hence the result.
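A quick way to check this bound empirically (my own sketch, not part of the answer): count the calls made by the naive recursion and compare against 2^(m+n).

def count_calls(m, n):
    # naive recursion plus a counter for the number of invocations
    calls = 0

    def paths(i, j):
        nonlocal calls
        calls += 1
        if i == 1 or j == 1:
            return 1
        return paths(i - 1, j) + paths(i, j - 1)

    return paths(m, n), calls

for m, n in [(3, 3), (5, 4), (6, 6)]:
    result, calls = count_calls(m, n)
    print(m, n, result, calls, 2 ** (m + n))
# the call count tracks the number of paths and stays below 2^(m+n)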

Number of movements in a dynamic array

A dynamic array is an array that doubles its size when an element is added to an already full array, copying the existing elements to a new place (more details here). It is clear that there will be ceil(log(n)) bulk copy operations.
In a textbook I have seen the number of movements M computed this way:
M=sum for {i=1} to {ceil(log(n))} of i*n/{2^i} with the argument that "half the elements move once, a quarter of the elements twice"...
But I thought that for each bulk copy operation the number of copied/moved elements is actually n/2^i, as every bulk operation is triggered by reaching and exceeding the 2^i th element, so that the number of movements is
M=sum for {i=1} to {ceil(log(n))} of n/{2^i} (for n=8 it seems to be the correct formula).
Who is right, and what is wrong in the other argument?
Both versions are O(n), so there is no big difference.
The textbook version counts the initial write of each element as a move operation but doesn't consider the very first element, which will move ceil(log(n)) times. Other than that they are equivalent, i.e.
(sum for {i=1} to {ceil(log(n))} of i*n/{2^i}) - (n - 1) + ceil(log(n))
== sum for {i=1} to {ceil(log(n))} of n/{2^i}
when n is a power of 2. Both are off by different amounts when n is not a power of 2.
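Here is a small simulation (my own sketch) that counts the actual bulk-copy moves for n = 8 and compares them with both formulas, using the adjustment described above:

from math import ceil, log2

def count_copies(n):
    # simulate n appends into a doubling array; count copied elements
    capacity, size, copies = 1, 0, 0
    for _ in range(n):
        if size == capacity:
            copies += size             # bulk copy into the bigger array
            capacity *= 2
        size += 1
    return copies

n = 8
k = ceil(log2(n))
asker = sum(n // 2 ** i for i in range(1, k + 1))           # n/2 + n/4 + ...
textbook = sum(i * n // 2 ** i for i in range(1, k + 1))    # i * n / 2^i
print(count_copies(n), asker)          # 7 7  -> the pure copy count
print(textbook - (n - 1) + k)          # 7    -> textbook formula, adjusted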

Big O of Recursive Methods

I'm having difficulty determining the big O of simple recursive methods. I can't wrap my head around what happens when a method is called multiple times. I would be more specific about my areas of confusion, but at the moment I'm trying to answer some homework questions, and since I don't want to cheat, I ask that anyone responding to this post come up with a simple recursive method and provide a simple explanation of its big O. (Preferably in Java... a language I'm learning.)
Thank you.
You can define the order recursively as well. For instance, let's say you have a function f, and calculating f(n) takes k steps. Now you want to calculate f(n+1). Let's say f(n+1) calls f(n) once; then f(n+1) takes k plus some constant number of steps. Each invocation adds a constant number of extra steps, so this method is O(n).
Now look at another example. Let's say you implement Fibonacci naively by adding the two previous results:
fib(n) = { return fib(n-1) + fib(n-2) }
Now let's say you can calculate fib(n-2) and fib(n-1) both in about k steps, so calculating fib(n) needs k + k = 2*k steps. Now let's say you want to calculate fib(n+1): you need roughly twice as many steps again. So the work roughly doubles each time n grows by one, which is why this seems to be O(2^N).
Admittedly, this is not very formal, but hopefully this way you can get a bit of a feel.
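To make that feel concrete, here is a tiny sketch (my own, not tied to any particular homework problem) that counts the calls the naive fib makes; the count grows exponentially, multiplying by roughly a constant factor for each fixed increase in n:

calls = 0

def fib(n):
    global calls
    calls += 1
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)     # the two recursive calls

for n in range(10, 26, 5):
    calls = 0
    fib(n)
    print(n, calls)
# 10 -> 177, 15 -> 1973, 20 -> 21891, 25 -> 242785: exponential growth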
You might want to refer to the master theorem for finding the big O of recursive methods. Here is the wikipedia article: http://en.wikipedia.org/wiki/Master_theorem
You want to think of a recursive problem like a tree. Then, consider each level of the tree and the amount of work required. Problems will generally fall into 3 categories: root heavy (first iteration >> rest of tree), balanced (each level has equal amounts of work), and leaf heavy (last iteration >> rest of tree).
Taking merge sort as an example:
import heapq

def mergeSort(toSort):
    # a list of 0 or 1 elements is already sorted
    if len(toSort) <= 1:
        return toSort
    mid = len(toSort) // 2
    left = toSort[:mid]      # [0, len/2)
    right = toSort[mid:]     # [len/2, len)
    # merge the two sorted halves in time proportional to their total length
    return list(heapq.merge(mergeSort(left), mergeSort(right)))
You can see that each call of mergeSort in turn calls 2 more mergeSorts of 1/2 the original length. We know that the merge procedure will take time proportional to the number of values being merged.
The recurrence relationship is then T(n) = 2*T(n/2) + O(n). The 2 comes from the two recursive calls, and the n/2 from each call operating on only half the elements. At each level there are n elements in total which need to be merged, so the work at each level is O(n).
We know the work is evenly distributed (O(n) each depth) and the tree is log_2(n) deep, so the big O of the recursive function is O(n*log(n)).

How many different partitions with exactly n parts can be made of a set with k elements?

How many different partitions with exactly two parts can be made of the set {1,2,3,4}?
There are 4 elements in this list that need to be partitioned into 2 parts. I wrote these out and got a total of 7 different possibilities:
{{1},{2,3,4}}
{{2},{1,3,4}}
{{3},{1,2,4}}
{{4},{1,2,3}}
{{1,2},{3,4}}
{{1,3},{2,4}}
{{1,4},{2,3}}
Now I must answer the same question for the set {1,2,3,...,100}.
There are 100 elements in this list that need to be partitioned into 2 parts. I know the largest size a part of the partition can be is 50 (that's 100/2) and the smallest is 1 (so one part has 1 number and the other part has 99). How can I determine how many different possibilities there are for partitions of two parts without writing out extraneous lists of every possible combination?
Can the answer be simplified into a factorial (such as 12!)?
Is there a general formula one can use to find how many different partitions with exactly n parts can be made of a set with k elements?
1) Stack Overflow is about programming. Your question belongs in the https://math.stackexchange.com/ realm.
2) There are 2^n subsets of a set of n elements (because each of the n elements may either be or not be contained in a specific subset). This gives us 2^(n-1) different partitions of an n-element set into two subsets. One of these partitions is the trivial one (with one part being the empty set and the other part being the entire original set), and from your example it seems you don't want to count the trivial partition. So the answer is 2^(n-1) - 1 (which gives 2^3 - 1 = 7 for n = 4).
The general answer for n parts and k elements would be the Stirling number of the second kind S(k,n).
Be aware that the usual convention takes n as the total number of elements, so the number is usually written S(n,k).
The general formula, S(n,k) = (1/k!) * sum for j=0 to k of (-1)^(k-j) * C(k,j) * j^n, is quite ugly to compute in general, but it is easy to evaluate for k = 2 (with the common notation):
Thus S(n,2) = 1/2 * ( (+1) * 1 * 0^n + (-1) * 2 * 1^n + (+1) * 1 * 2^n ) = (0 - 2 + 2^n)/2 = 2^(n-1) - 1
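If you want to check this programmatically, here is a short sketch evaluating the general formula (the function name is just for illustration):

from math import comb, factorial

def stirling2(n, k):
    # S(n, k): ways to partition an n-element set into k non-empty parts
    total = sum((-1) ** (k - j) * comb(k, j) * j ** n for j in range(k + 1))
    return total // factorial(k)

print(stirling2(4, 2))                     # 7, matching the example above
print(stirling2(100, 2) == 2 ** 99 - 1)    # True: S(n, 2) = 2^(n-1) - 1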
