Calculating average case complexity of Quicksort - recursion

I'm trying to calculate the big-O for the Worst/Best/Average case of QuickSort using recurrence relations. My understanding is that the efficiency of the implementation depends on how well the partition function splits the input.
Worst Case: the pivot always leaves one side empty
T(N) = N + T(N-1) + T(0)
T(N) = N + T(N-1)
Unrolling gives T(N) = N + (N-1) + ... + 1 ~ N^2/2 => O(n^2)
Best Case: the pivot divides the elements into two equal halves
T(N) = N + T(N/2) + T(N/2)
T(N) = N + 2T(N/2) [Master Theorem with a = 2, b = 2, f(N) = N]
T(N) ~ N log(N) => O(n log n)
Average Case: This is where I'm confused about how to represent the recurrence relation, or how to approach it in general.
I know the average-case big-O for Quicksort is O(n log n); I'm just unsure how to derive it.

When you pick the pivot, the worst split you can get is 0 | n-1 and the best is n/2 | n/2. Assuming a uniformly random pivot, the typical split looks more like n/4 | 3n/4. Plug that in as T(N) = N + T(N/4) + T(3N/4): the recursion tree still does O(n) work per level and has O(log n) levels (the larger side shrinks by a factor of 3/4 each time, giving depth log_{4/3} n), so you get O(n log n) once constants are eliminated.
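As an empirical sanity check, here is a minimal sketch (my own, not from the thread) that counts the comparisons a randomized quicksort makes on distinct keys; the expected count is known to be about 1.39 n log2(n), so the printed ratio should settle near that constant.

import math
import random

def quicksort_comparisons(a):
    # Returns the number of comparisons a randomized quicksort makes on a.
    if len(a) <= 1:
        return 0
    pivot = random.choice(a)
    left = [x for x in a if x < pivot]
    right = [x for x in a if x > pivot]
    # Model the standard partition cost: len(a) - 1 comparisons against the pivot.
    return (len(a) - 1) + quicksort_comparisons(left) + quicksort_comparisons(right)

for n in (1_000, 10_000, 100_000):
    data = list(range(n))
    random.shuffle(data)
    c = quicksort_comparisons(data)
    print(n, c, round(c / (n * math.log2(n)), 2))  # ratio hovers around 1.39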

Related

Big-O complexity of n/1 + n/2 + n/3 + ... + n/n

Is the big-O complexity of n/1 + n/2 + n/3 + ... + n/n O(n log n) or O(n)? I want to know this for calculating all divisors of all numbers from 1 to n. My approach would be to go over all the numbers and mark their multiples, which would take the above-mentioned time.
You have n multiplied by the harmonic series sum 1/1 + 1/2 + ... + 1/n, which has logarithmic growth (it is about ln n).
So it's O(n log n).
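For intuition, here is a small sketch (my own, not from the answer) of the divisor-marking approach; ops counts the inner-loop steps, which is exactly the sum n/1 + n/2 + ... + n/n (rounded down), and the printed ratio to n ln n approaches 1.

import math

def count_divisors_upto(n):
    # For each k in 1..n, count its divisors by marking multiples of every d.
    divisors = [0] * (n + 1)
    ops = 0
    for d in range(1, n + 1):
        for multiple in range(d, n + 1, d):  # about n/d iterations
            divisors[multiple] += 1
            ops += 1
    return divisors, ops

for n in (1_000, 10_000, 100_000):
    _, ops = count_divisors_upto(n)
    print(n, ops, round(ops / (n * math.log(n)), 3))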

Quadratic Equation in Big-Oh Notation

I am reviewing for a midterm test on Big-Oh runtime. One of the questions I have difficulty with gives the following recurrence and asks for its Big-Oh bound.
T(n) = 2T(n/2) + (2n^2 + 3n + 5)
By the Master Theorem, the relevant case is k > log_b(a). In this question, k = 2 (from the 2n^2 term), a = 2 (from 2T), and b = 2 (from n/2). Since k > log_b(a), that is 2 > log_2(2) = 1, we get T(n) = O(n^2).
Is my thinking correct? I have never seen a quadratic runtime inside T(n) but I am fairly certain it is O(n^2) in this question.
Thank you for your input!
Yes, O(n^2) is correct. There is a similar example in the Wikipedia article on the master theorem. The additive function could be anything; essentially you compare the depth and width of the recursion tree with the cost of that function and check which one dominates the complexity.
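To see the domination concretely, here is a quick sketch (my own) that evaluates the recurrence numerically, assuming T(1) = 1 and integer halving; T(n)/n^2 settles toward a constant (about 4), as the O(n^2) bound predicts.

from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # T(n) = 2T(n/2) + (2n^2 + 3n + 5), with T(1) = 1 assumed.
    if n <= 1:
        return 1
    return 2 * T(n // 2) + (2 * n * n + 3 * n + 5)

for n in (2**10, 2**15, 2**20):
    print(n, round(T(n) / n**2, 3))  # ratio approaches a constant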

Big Theta runtime analysis

I don't really understand the two questions below about T(n). I understand what Theta means, but I'm not sure about the answers. Can someone explain?
I thought the first one, T(n) = T(2n/3) + 1 = Theta(log n), was false because
the constant 1 added doesn't make a difference
and log corresponds to repeated halving, but 2n/3 is not halving
I thought the second one, T(n) = T(n/2) + n = Theta(n log n), was true because
the linear "n *" in Theta represents the "+ n" in T(n/2) + n
the "n/2" represents the log n in Theta...
The first is Θ(log n).
Intuitively, when you multiply n by a constant factor, T(n) increases by a constant amount.
Example: T(n) = log(n)/log(3/2) satisfies the recurrence, since T(2n/3) = (log(n) - log(3/2))/log(3/2) = T(n) - 1.
The second is Θ(n).
Intuitively, when you multiply n by a constant factor, T(n) increases by an amount proportional to n.
Example: T(n) = 2n satisfies the recurrence, since T(n/2) + n = n + n = 2n = T(n).
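Here is a tiny sketch (mine, not from the answer) that iterates both recurrences with assumed base cases and compares them to log n and n respectively; the ratios stabilize, confirming Θ(log n) and Θ(n).

import math

def T1(n):
    # T(n) = T(2n/3) + 1, with T(1) = 0 assumed.
    steps = 0
    while n > 1:
        n = (2 * n) // 3
        steps += 1
    return steps

def T2(n):
    # T(n) = T(n/2) + n, with T(1) = 1 assumed.
    total = 0
    while n > 1:
        total += n
        n //= 2
    return total + 1

for n in (10**3, 10**6, 10**9):
    print(n, round(T1(n) / math.log(n), 2), round(T2(n) / n, 2))
    # first ratio about 1/log(3/2) = 2.47, second ratio about 2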

Running time of a recursive power function

I am trying to perform asymptotic analysis of the following recursive function, which computes a power of a number efficiently. I am having trouble determining the recurrence equation because there are different equations for when the power is even and when it is odd, and I am unsure how to handle this situation. I understand that the running time is Theta(log n), so any advice on how to reach that result would be appreciated.
Recursive-Power(x, n):
    if n == 1
        return x
    if n is even
        y = Recursive-Power(x, n/2)
        return y*y
    else
        y = Recursive-Power(x, (n-1)/2)
        return y*y*x
In either case (n even or odd), the following condition holds:
T(n) = T(floor(n/2)) + Θ(1)
where floor(x) is the biggest integer not greater than x.
Since the floor has no influence on the asymptotic result, the recurrence is informally written as:
T(n) = T(n/2) + Θ(1)
You have guessed the asymptotic bound correctly. The result can be proved using the substitution method or the Master theorem. It is left as an exercise for you.
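As an illustration (my own sketch, not part of the original answer), a direct Python translation that counts recursive calls shows the call count tracking log2(n), matching the Θ(log n) bound.

import math

def recursive_power(x, n, counter):
    # Compute x**n by repeated squaring, counting how many calls are made.
    counter[0] += 1
    if n == 1:
        return x
    if n % 2 == 0:
        y = recursive_power(x, n // 2, counter)
        return y * y
    y = recursive_power(x, (n - 1) // 2, counter)
    return y * y * x

for n in (15, 1_000, 1_000_000):
    calls = [0]
    assert recursive_power(3, n, calls) == 3**n
    print(n, calls[0], math.floor(math.log2(n)) + 1)  # calls == floor(log2 n) + 1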

Recursion and Big O

I've been working through a recent computer science homework involving recursion and big-O notation. I believe I understand this pretty well (certainly not perfectly, though!), but there is one question in particular that is giving me the most trouble. The odd thing is that, looking at it, it appears to be the simplest one on the homework.
Provide the best rate of growth, using big-Oh notation, for the solution to the following recurrence:
T(1) = 2
T(n) = 2T(n - 1) + 1 for n>1
And the choices are:
O(n log n)
O(n^2)
O(2^n)
O(n^n)
I understand that big O works as an upper bound, describing the largest amount of calculation, or the longest running time, that a program or process will take. I feel like this particular recurrence should be O(n), since, at most, the recursion occurs only once for each value of n. Since O(n) isn't available, the answer must be either better than that, O(n log n), or worse, which the other three options are.
So, my question is: Why isn't this O(n)?
There are a couple of different ways to solve recurrences: substitution, the recursion tree, and the master theorem. The master theorem won't work in this case, because the recurrence doesn't fit the master theorem's form.
You could use the other two methods, but the easiest way for this problem is to unroll it iteratively.
T(n) = 2T(n-1) + 1
T(n) = 4T(n-2) + 2 + 1
T(n) = 8T(n-3) + 4 + 2 + 1
T(n) = ...
See the pattern?
T(n) = 2^(n-1)*T(1) + 2^(n-2) + 2^(n-3) + ... + 1
T(n) = 2^(n-1)*2 + 2^(n-2) + 2^(n-3) + ... + 1
T(n) = 2^n + 2^(n-2) + 2^(n-3) + ... + 1
Therefore, the tightest bound is Θ(2^n).
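A quick check (my own sketch) that the unrolled form matches the recurrence: note that 2^(n-2) + ... + 1 = 2^(n-1) - 1, so T(n) = 2^n + 2^(n-1) - 1 = 3*2^(n-1) - 1.

def T(n):
    # Evaluate the recurrence directly: T(1) = 2, T(n) = 2T(n-1) + 1.
    return 2 if n == 1 else 2 * T(n - 1) + 1

for n in range(1, 11):
    assert T(n) == 2**n + 2**(n - 1) - 1 == 3 * 2**(n - 1) - 1
    print(n, T(n))  # 2, 5, 11, 23, 47, ... roughly doubling each step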
I think you have misunderstood the question a bit. It does not ask how long it would take to solve the recurrence. It is asking what the big-O (the asymptotic bound) of the solution itself is.
What you have to do is come up with a closed-form solution, i.e., the non-recursive formula for T(n), and then determine what the big-O of that expression is.
The question is asking for the big-Oh notation of the solution to the recurrence, not the cost of calculating the recurrence.
Put another way: the recurrence produces:
1 -> 2
2 -> 5
3 -> 11
4 -> 23
5 -> 47
What big-Oh notation best describes the sequence 2, 5, 11, 23, 47, ...?
The correct way to solve that is to solve the recurrence equations.
I think this will be exponential. Each increment to n makes the value roughly twice as large:
T(2) = 2*T(1) + 1 = 5
T(3) = 2*T(2) + 1 = 11
...
T(x) would be the running time of the following program (for example):

def fn(x):
    if x == 1:
        return  # constant time
    # do the calculation for x - 1 twice
    fn(x - 1)
    fn(x - 1)
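If you instrument that program (a sketch of mine, not from the thread), the number of calls satisfies the same kind of recurrence and grows as Θ(2^x):

def fn(x, counter):
    # Same shape as above, but counts how many times fn is entered.
    counter[0] += 1
    if x == 1:
        return
    fn(x - 1, counter)
    fn(x - 1, counter)

for x in range(1, 8):
    calls = [0]
    fn(x, calls)
    print(x, calls[0])  # 1, 3, 7, 15, ... = 2^x - 1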
"I think this will be exponential. Each increment to n brings twice as much calculation."
No, it doesn't. Quite on the contrary:
Consider that computing T for n takes running time R. Then computing it for n + 1 takes exactly R plus a constant.
Thus, the growth rate of the running time is constant per step, and evaluating the recurrence is indeed O(n).
However, I think Dima's reading of the question is right, although his solution is overly complicated:
"What you have to do is come up with a closed-form solution, i.e., the non-recursive formula for T(n), and then determine what the big-O of that expression is."
It's sufficient to examine the relative sizes of T(n) and T(n + 1) to determine the relative growth rate. The value obviously doubles with each increment of n, which directly gives the exponential asymptotic growth.
First off, all four answer choices are worse than O(n)... O(n log n) is more complex than plain old O(n). Which is bigger: 8 or 8 * 3? 16 or 16 * 4? And so on.
On to the actual question. The closed form can obviously be evaluated in constant time if you're not doing recursion
(T(n) = 2^(n-1) + 2^n - 1), so that's not what they're asking.
And as you can see, if we write the recursive code:
int T( int N )
{
    if (N == 1) return 2;
    return( 2*T(N-1) + 1 );
}
It's obviously O(n).
So it appears to be a badly worded question, and they are probably asking for the growth of the function itself, not the complexity of the code. That's 2^n. Now go do the rest of your homework... and study up on O(n log n).
Computing a closed-form solution to the recurrence is easy.
By inspection, you guess that the solution is
T(n) = 3*2^(n-1) - 1
Then you prove by induction that this is indeed a solution. Base case:
T(1) = 3*2^0 - 1 = 3 - 1 = 2. OK.
Induction:
Suppose T(n) = 3*2^(n-1) - 1. Then
T(n+1) = 2*T(n) + 1 = 3*2^n - 2 + 1 = 3*2^((n+1)-1) - 1. OK.
where the first equality stems from the recurrence definition,
and the second from the inductive hypothesis. QED.
3*2^(n-1) - 1 is clearly Theta(2^n), hence the right answer is the third.
To the folks who answered O(n): I couldn't agree more with Dima. The problem does not ask for the tightest upper bound on the computational complexity of an algorithm that computes T(n) (which would now be O(1), since the closed form has been provided). It asks for the tightest upper bound on T(n) itself, and that is the exponential one.
