I'm studying for an exam and I don't know how to get the height of the recursion tree generated by this relation:
T(n) = 3T(n/2) + n
I know the tree will look like this and that I have to add all the terms:
            c*n
           / | \
          /  |  \
    c*n/2  c*n/2  c*n/2
      .      .      .
      .      .      .
Thank you!!
When you have a recurrence relation of the form
T(n) = aT(n / b) + f(n)
then the height of the recursion tree depends only on the choice of b (assuming, of course, that a > 0). The reason for this is that each node in the tree represents what happens when you expand out the above recurrence, and the only place in the above recurrence where you can expand something is in the T(n / b) term. If you increase or decrease a, you will increase or decrease the branching factor of the tree (for example, 2T(n / b) means that two nodes are generated when you expand a node, and 3T(n / b) means that three nodes are generated), but the branching factor is independent of the number of levels; it only determines how many nodes appear on each level. Similarly, changing f(n) will only increase or decrease the total amount of work done at each node, which doesn't impact the shape of the recursion tree.
So how specifically does b impact the height of the tree? Well, in this recurrence relation, each time we expand out T, we end up dividing the size of the input by a factor of b. This means that at the top level of the tree, we'll have problems of size n. Below that are problems of size n / b. Below that are problems of size (n / b) / b = n / b^2. Generally, at level k of the tree, the problem size will be n / b^k. The recursion stops when the problem size drops to 0 or 1, which happens when k = log_b(n). In other words, the height of the recursion tree will be O(log_b n).
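As a quick sanity check, here is a minimal Python sketch (mine, not part of the original question) that counts how many times a problem of size n can be divided by b = 2 before it reaches size 1; the count tracks log_2(n) and does not depend on the branching factor at all:

```python
import math

def tree_height(n, b=2):
    # Number of times n can be divided by b before the subproblem size hits 1;
    # this is the height of the recursion tree for T(n) = aT(n/b) + f(n).
    height = 0
    while n > 1:
        n //= b
        height += 1
    return height

for n in [16, 1024, 10**6]:
    print(n, tree_height(n), math.log2(n))
```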
Now, just knowing the tree height won't tell you the total amount of work done, because we also need to know the branching factor and the work done per level. There are a lot of different ways that these can interact with one another, but fortunately there's a beautiful theorem called the master theorem that lets you read off the solution pretty elegantly by just looking at a, b, and f(n). In your case, the recurrence is
T(n) = 3T(n / 2) + O(n)
Plugging this into the master theorem, we see that the solution to the recurrence is T(n) = O(n^(log_2 3)). This is approximately O(n^1.58).
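If you want to check that numerically, here is a minimal sketch (mine, assuming a base case T(1) = 1, which the question doesn't specify) that evaluates the recurrence directly and divides by n^(log_2 3); the ratio levels off at a constant, as the master theorem predicts:

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # T(n) = 3*T(n // 2) + n, with an assumed base case T(1) = 1
    if n <= 1:
        return 1
    return 3 * T(n // 2) + n

exponent = math.log2(3)             # about 1.585
for n in [2**10, 2**15, 2**20]:
    print(n, T(n) / n ** exponent)  # ratio approaches a constant (about 3 here)
```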
I'm trying to solve this problem, but I don't think I've understood how to do it correctly. The first thing I do in this type of exercise is take the biggest value in the row (in this case n^2) and divide it repeatedly, so I can find the relation between the values at each level. After finding the relation, I try to work out its value mathematically, and then, as the final step, I multiply the result by the root. In this case the result should be n^3. How is that possible?
Unfortunately @vahidreza's solution seems wrong to me because it contradicts the Master theorem. In terms of the Master theorem, a = 8, b = 2, c = 2, so log_b(a) = 3. Thus log_b(a) > c, so this is the case of a recursion dominated by the subproblems, and the answer should be T(n) = Θ(n^3) rather than the O(m^(2+1/3)) that @vahidreza has.
The main issue is probably in this statement:
Also you know that the tree has log_8 m levels. Because at each level, you divide the number by 8.
Let's try to solve it properly:
On the zeroth level you have n^2 (I prefer to start counting from 0 as it simplifies notation a bit)
on the first level you have 8 nodes of (n/2)^2 or a total of 8*(n/2)^2
on the second level you have 8 * 8 nodes of (n/(2^2))^2 or a total of 8^2*(n/(2^2))^2
on the i-th level you have 8^i nodes of (n/(2^i))^2 or a total of 8^i*(n/(2^i))^2 = n^2 * 8^i/2^(2*i) = n^2 * 2^i
At each level your value n is divided by two, so at level i the value is n/2^i, and so you'll have log_2(n) levels. So what you need to calculate is the sum for i from 0 to log_2(n) of n^2 * 2^i. That's a geometric progression with a ratio of 2, so its sum is
Σ (n^2 * 2^i) = n^2 * Σ(2^i) = n^2 * (2^(log_2(n)+1) - 1)/(2 - 1)
Since we are talking about Θ/O we can ignore constants and so we need to estimate
n^2 * 2^log_2(n)
Obviously 2^log_2(n) is just n so the answer is
T(n) = Θ(n^3)
exactly as predicted by the Master theorem.
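If it helps to see the arithmetic, here is a minimal Python sketch (mine, not part of the answer) that adds up the per-level totals n^2 * 2^i listed above and compares the sum to n^3; the ratio settles near a constant:

```python
def level_sum(n):
    # Per-level work for T(n) = 8*T(n/2) + n^2:
    # level i has 8^i subproblems of size n / 2^i, each doing (n / 2^i)^2 work,
    # i.e. 8^i * (n / 2^i)^2 = n^2 * 2^i work per level.
    total, i = 0, 0
    while n / 2**i >= 1:
        total += 8**i * (n / 2**i) ** 2
        i += 1
    return total

for n in [2**8, 2**12, 2**16]:
    print(n, level_sum(n) / n**3)   # settles near a constant (about 2)
```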
I have recently encountered a recurrence problem:
T(n) = 2*T(ceil(sqrt(n))) + 1
T(1) = 1
I am unable to see this function terminate at all when I draw my recurrence tree. The general form of a node in the tree, n^(1/2^i), becomes 1 only when 1/2^i becomes 0. This means i should tend to infinity.
You're right that if sqrt is the ceiling of the square root, then you'll never reach 1 by repeatedly applying square roots. I'm going to assume that you meant to use the floor, which means that you will indeed eventually hit 1 as the recurrence unwinds.
In this case, your recurrence is more properly
T(1) = 1
T(n) = 2T(⌊√n⌋) + 1
A standard technique for solving recurrences involving square roots is to make a substitution. Let's define a new value k such that n = 2^k. Notice that √n = (2^k)^(1/2) = 2^(k/2). In other words, taking the square root of n is equivalent to halving the value of k. Because of this, we can convert the above recurrence, which involves square roots, into a new recurrence that will more closely match the form used by the Master Theorem and other recurrence-solving techniques. Specifically, let's define S(k) = T(2^k). Then we get the recurrence
S(0) = 1
S(k) = 2S(⌊k / 2⌋) + 1
It's a lot easier to see how to solve this recurrence. Either by recognizing this recurrence from elsewhere or by using the Master Theorem, we get that S(k) = Θ(k). Now, we wanted to solve for T(n), so we can use the fact that S(k) = T(2^k) = T(n). Since S(k) = Θ(k), we see that T(n) = Θ(k). Since we chose k such that 2^k = n, we have k = lg n. Therefore, the recurrence works out to T(n) = Θ(log n).
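If you'd like to see this numerically, here is a minimal sketch (mine, assuming the floor interpretation and T(1) = 1) that evaluates the recurrence and prints it next to log_2(n); T(n) grows proportionally to log n:

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # T(1) = 1, T(n) = 2*T(floor(sqrt(n))) + 1
    if n <= 1:
        return 1
    return 2 * T(math.isqrt(n)) + 1

for n in [2**8, 2**16, 2**32, 2**64]:
    print(n, T(n), math.log2(n))   # T(n) stays within a constant factor of log n
```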
Hope this helps!
What is the solution of this recurrence?
T(n) = T(n/1000) + T(999n/1000) + cn.
I think its O(n log n) since the work done per level is going to be cn and the height of the tree will be log n to the base of 1000/999, but I'm not sure if the reasoning is valid. Is that correct?
One thing to notice is that for the first log_1000(n) layers, all branches of the recursion will be active (i.e. the branches for the n / 1000 cases won't have bottomed out) and the work done per layer will be Θ(n). This gives you an immediate lower bound on the runtime of Ω(n log n), since there are Θ(log n) layers doing Θ(n) work each.
For layers below that, the work begins to drop off because the branches for the n / 1000 case will bottom out. However, you can upper bound the work done by pretending that each layer in the tree will do Θ(n) work. In that case, there will be log_(1000/999)(n) layers before the 999n/1000 case bottoms out, so you get an upper bound of O(n log n) because you have Θ(log n) layers doing Θ(n) work each.
Since the work done is Ω(n log n) and O(n log n), the runtime is Θ(n log n).
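For a rough numerical check, here is a minimal sketch (mine, with c = 1 and subproblems of size less than 1 treated as constant work) that evaluates the recurrence with an explicit stack and divides by n log n; the ratio is roughly constant, though the constant is large because the split is so lopsided:

```python
import math

def T(n):
    # Evaluate T(n) = T(n/1000) + T(999n/1000) + n with an explicit stack;
    # the recursion is about log_{1000/999}(n) levels deep, which would
    # overflow Python's default recursion limit if written recursively.
    total, stack = 0.0, [float(n)]
    while stack:
        m = stack.pop()
        if m < 1:           # subproblems of size < 1 count as constant work
            total += 1
            continue
        total += m          # the cn term, with c = 1
        stack.append(m / 1000)
        stack.append(999 * m / 1000)
    return total

for n in [10**3, 10**4, 10**5]:
    print(n, T(n) / (n * math.log2(n)))   # roughly constant, drifting slowly
```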
Hope this helps!
When implementing a hash table using a good hash function (one where the probability of any two elements colliding is 1 / m, where m is the number of buckets), it is well-known that the average-case running time for looking up an element is Θ(1 + α), where α is the load factor. The worst-case running time is O(n), though, if all the elements end up put into the same bucket.
I was recently doing some reading on hash tables and found this article which claims (on page 3) that if α = 1, the expected worst-case complexity is Θ(log n / log log n). By "expected worst-case complexity," I mean, on expectation, the maximum amount of work you'll have to do if the elements are distributed by a uniform hash function. This is different from the actual worst-case, since the worst-case behavior (all elements in the same bucket) is extremely unlikely to actually occur.
My question is the following - the author seems to suggest that varying the value of α can change the expected worst-case complexity of a lookup. Does anyone know of a formula, table, or article somewhere that discusses how changing α changes the expected worst-case runtime?
For fixed α, the expected worst time is always Θ(log n / log log n). However if you make α a function of n, then the expected worst time can change. For instance if α = Θ(n) then the expected worst time is O(n) (that's the case where you have a fixed number of hash buckets).
In general the distribution of items into buckets is approximately a Poisson distribution; the odds of a random bucket having i items are α^i e^(-α) / i!. The worst case is just the m'th worst out of m close-to-independent observations. (Not entirely independent, but fairly close to it.) The m'th worst out of m observations tends to be something whose odds of happening are about 1/m. (More precisely the distribution is given by a Beta distribution, but for our analysis 1/m is good enough.)
As you head into the tail of the Poisson distribution the growth of the i! term dominates everything else, so the cumulative probability of everything above a given i is smaller than the probability of selecting i itself. So to a good approximation you can figure out the expected value by solving for:
α^i e^(-α) / i! = 1/m = 1/(n/α) = α/n
Take logs of both sides and we get:
i log(α) - α - (i log(i) - i + O(log(i))) = log(α) - log(n)
log(n) - α = i log(i) - i - i log(α) + O(log(i))
If we hold α constant then this is:
log(n) = i log(i) + O(i)
Can this work if i has the form k log(n) / log(log(n)) with k = Θ(1)? Let's try it:
log(n) = (k log(n) / log(log(n))) (log(k) + log(log(n)) - log(log(log(n)))) + O(log(log(n)))
= k (log(n) + o(log(n))) + o(log(n))
And then we get the sharper estimate that, for any fixed load factor α, the expected worst time is (1 + o(1)) log(n) / log(log(n)).
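To see this behavior empirically, here is a minimal simulation sketch (mine, assuming α = 1 and a uniform hash function) that throws n items into n buckets and compares the fullest bucket to log(n) / log(log(n)); the simulated maximum tracks the prediction up to a modest constant factor:

```python
import math
import random
from collections import Counter

def average_max_load(n, trials=20):
    # Throw n balls into n buckets uniformly at random (load factor 1)
    # and average the size of the fullest bucket over several trials.
    total = 0
    for _ in range(trials):
        counts = Counter(random.randrange(n) for _ in range(n))
        total += max(counts.values())
    return total / trials

for n in [10**3, 10**4, 10**5]:
    predicted = math.log(n) / math.log(math.log(n))
    print(n, average_max_load(n), round(predicted, 2))
```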
After some searching, I came across this research paper that gives a complete analysis of the expected worst-case behavior of a whole bunch of different types of hash tables, including chained hash tables. The author gives as an answer that the expected length of the longest chain is approximately Γ^(-1)(m), where m is the number of buckets and Γ is the Gamma function. Assuming that α is a constant, this is approximately ln m / ln ln m.
Hope this helps!
Can someone explain the solution of this problem to me?
Suppose that you are given a sequence of n elements to sort. The input sequence
consists of n/k subsequences, each containing k elements. The elements in a given
subsequence are all smaller than the elements in the succeeding subsequence and
larger than the elements in the preceding subsequence. Thus, all that is needed to
sort the whole sequence of length n is to sort the k elements in each of the n/k
subsequences. Show an Ω(n lg k) lower bound on the number of comparisons
needed to solve this variant of the sorting problem.
Solution:
Let S be a sequence of n elements divided into n/k subsequences each of length k
where all of the elements in any subsequence are larger than all of the elements
of a preceding subsequence and smaller than all of the elements of a succeeding
subsequence.
Claim
Any comparison-based sorting algorithm to sort S must take Ω(n lg k) time in the
worst case.
Proof
First notice that, as pointed out in the hint, we cannot prove the lower
bound by multiplying together the lower bounds for sorting each subsequence.
That would only prove that there is no faster algorithm that sorts the subsequences
independently. This is not what we are asked to prove; we cannot introduce any
extra assumptions.
Now, consider the decision tree of height h for any comparison sort for S. Since
the elements of each subsequence can be in any order, any of the k! permutations
correspond to the final sorted order of a subsequence. And, since there are n/k such
subsequences, each of which can be in any order, there are (k!)^(n/k) permutations
of S that could correspond to the sorting of some input order. Thus, any decision
tree for sorting S must have at least (k!)^(n/k) leaves. Since a binary tree of height h
has no more than 2^h leaves, we must have 2^h ≥ (k!)^(n/k), or h ≥ lg((k!)^(n/k)). We
therefore obtain
h ≥ lg((k!)^(n/k))
  = (n/k) lg(k!)
  ≥ (n/k) lg((k/2)^(k/2))
  = (n/2) lg(k/2)
The third line comes from k! having its k/2 largest terms being at least k/2 each.
(We implicitly assume here that k is even. We could adjust with floors and ceilings
if k were odd.)
Since there exists at least one path in any decision tree for sorting S that has length
at least (n/2) lg(k/2), the worst-case running time of any comparison-based sorting
algorithm for S is Ω(n lg k).
Can someone walk me through the steps in the code block? Especially the step where lg(k!) becomes lg((k/2)^(k/2)).
I've reprinted the math below:
(1) h ≥ lg((k!)^(n/k))
(2)   = (n/k) lg(k!)
(3)   ≥ (n/k) lg((k/2)^(k/2))
(4)   = (n/2) lg(k/2)
Let's walk through this. Going from line (1) to line (2) uses properties of logarithms. Similarly, going from line (3) to line (4) uses properties of logarithms and the fact that (n / k)(k / 2) = (n / 2). So the tricky step is going from line (2) to line (3).
The claim here is the following:
For all k, k! ≥ (k / 2)^(k / 2)
Intuitively, the idea is as follows. Consider k! = k(k - 1)(k - 2)...(2)(1). If you'll notice, half of these terms are greater than k / 2 and half of them are smaller. If we drop all the terms that are less than k / 2, we get something (close to) the following:
k! ≥ k(k - 1)(k - 2)...(k / 2)
Now, each of these remaining terms is at least k / 2, so we have that
k! ≥ k(k - 1)(k - 2)...(k / 2) ≥ (k/2)(k/2)...(k/2)
This is the product of (k / 2) with itself (k / 2) times, so it's equal to (k / 2)^(k/2). This math isn't precise because the logic for odd and even values of k is a bit different, but using essentially this idea you get a sketch of the proof of the earlier result.
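If a quick numeric check helps, here is a tiny Python snippet (mine, illustrative only) that compares ln(k!) with ln((k/2)^(k/2)) for a few even values of k:

```python
import math

# Check k! >= (k/2)^(k/2) by comparing logarithms (not a substitute for the proof)
for k in [2, 4, 8, 16, 32, 64]:
    lhs = math.lgamma(k + 1)          # ln(k!)
    rhs = (k / 2) * math.log(k / 2)   # ln((k/2)^(k/2))
    print(k, round(lhs, 2), round(rhs, 2), lhs >= rhs)
```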
To summarize: from (1) to (2) and from (3) to (4) uses properties of logarithms, and from (2) to (3) uses the above result.
Hope this helps!