How to prove the fairness of AIMD in TCP?

How to prove the fairness of AIMD in TCP? - tcp

I am currently studying the Additive Increase Multiplicative Decrease method, used in TCP as congestion avoidance technique. If we have K TCP sessions sharing a common link of bandwidth R, it is said that this technique guarantees fairness for all the sessions, i.e, each session will have a throughput of R/K.
Now, I'd like to prove this fairness mathematically (reaching the conclusion that, regardless of the initial values of the throughput of each session, they will all eventually tend to R/K).
Thanks !

A very intuitive answer is illustrated in the Chiu-Jain paper. From there, you can easily see a sort of general argument for AIMD that can be formalized further. All you really need is in that paper.
If you insist on proof by letters rather than proof by graph, you can reason as follows:
Assume x, y are the initial shares of the bandwidth, which is R. Let 2d = R for convenience.
The sequence of window size ratios evolves like follows:
(init) x/y ->
(MD) (x/2)/(y/2) = x/y ->
(AI) (x + d)/(y + d) ->
(MD) ((x + d)/2)/((y + d)/2) ->
(AI) ((x + d)/2 + d) / ((y + d) /2 + d) = (x + d + 2d)/(y + d + 2d)
This last one is (x + d + 2d)/(y + d + 2d). And you can see if you write it down how it is going to evolve after that.
In general, (x + k /y + k) -> 1 as k grows, and in our specific case the convergence is even exponential because k grows exponentially with time. This is what the convergence to fairness looks like for a starting point of x = 5 and y = 3.
To prove it for more than 2 flows you can take the flows with largest and smallest starting share of the bandwidth. Everyone else is in between, so the proof should still be simple. The convergence we talk about here is convergence to fairness, not convergence to R/K because in reality the system will oscillate between R/2K and R/K forever.

If you're wanting scholarly level feedback, I recommend searching scholar.google.com. The one entitled A Note on the Fairness of Additive Increase and Multiplicative Decrease with fancy looking maths might be right up your alley.

Related

Solving the recurrence T(n) = T(n / 1000) + T(999n/1000) + cn?

What is the solution of this recurrence?
T(n) = T(n/1000) + T(999n/1000) + cn.
I think its O(n log n) since the work done per level is going to be cn and the height of the tree will be log n to the base of 1000/999, but I'm not sure if the reasoning is valid. Is that correct?

One thing to notice is that for the first log1000n layers, all branches of the recursion will be active (i.e. the branches for the n / 1000 cases won't have bottomed out) and the work done per layer will be Θ(n). This gives you an immediate lower bound on the runtime as Ω(n log n), since there are Θ(log n) layers doing Θ(n) work each.
For layers below that, the work begins to drop off because the branches for the n / 1000 case will bottom out. However, you can upper bound the work done by pretending that each layer in the tree will do Θ(n) work. In that case, there will be log1000/999 n layers before the 999n/1000 case bottoms out, so you get an upper bound of O(n log n) because you have Θ(log n) layers doing Θ(n) work each.
Since the work done is Ω(n log n) and O(n log n), the runtime is Θ(n log n).
Hope this helps!

Expected worst-case time complexity of chained hash table lookups?

When implementing a hash table using a good hash function (one where the probability of any two elements colliding is 1 / m, where m is the number of buckets), it is well-known that the average-case running time for looking up an element is Θ(1 + α), where α is the load factor. The worst-case running time is O(n), though, if all the elements end up put into the same bucket.
I was recently doing some reading on hash tables and found this article which claims (on page 3) that if α = 1, the expected worst-case complexity is Θ(log n / log log n). By "expected worst-case complexity," I mean, on expectation, the maximum amount of work you'll have to do if the elements are distributed by a uniform hash function. This is different from the actual worst-case, since the worst-case behavior (all elements in the same bucket) is extremely unlikely to actually occur.
My question is the following - the author seems to suggest that differing the value of α can change the expected worst-case complexity of a lookup. Does anyone know of a formula, table, or article somewhere that discusses how changing α changes the expected worst-case runtime?

For fixed α, the expected worst time is always Θ(log n / log log n). However if you make α a function of n, then the expected worst time can change. For instance if α = O(n) then the expected worst time is O(n) (that's the case where you have a fixed number of hash buckets).
In general the distribution of items into buckets is approximately a Poisson distribution, the odds of a random bucket having i items is αi e-α / i!. The worst case is just the m'th worst out of m close to independent observations. (Not entirely independent, but fairly close to it.) The m'th worst out of m observations tends to be something whose odds of happening are about 1/m times. (More precisely the distribution is given by a Β distribution, but for our analysis 1/m is good enough.)
As you head into the tail of the Poisson distribution the growth of the i! term dominates everything else, so the cumulative probability of everything above a given i is smaller than the probability of selecting i itself. So to a good approximation you can figure out the expected value by solving for:
αi e-α / i! = 1/m = 1/(n/α) = α/n
Take logs of both sides and we get:
i log(α) - α - (i log(i) - i + O(log(i)) = log(α) - log(n)
log(n) - α = i log(i) - i - i log(α) + O(log(i))
If we hold α constant then this is:
log(n) = i log(i) + O(i)
Can this work if i has the form k log(n) / log(log(n)) with k = Θ(1)? Let's try it:
log(n) = (k log(n) / log(log(n))) (log(k) + log(log(n)) - log(log(log(n)))) + O(log(log(n)))
= k (log(n) + o(log(n)) + o(log(n))
And then we get the sharper estimate that, for any fixed load average α, the expected worst time is (1 + o(1)) log(n) / log(log(n))

After some searching, I came across this research paper that gives a complete analysis of the expected worst-case behavior of a whole bunch of different types of hash tables, including chained hash tables. The author gives as an answer that the expected length is approximately Γ-1(m), where m is the number of buckets and Γ is the Gamma function. Assuming that α is a constant, this is approximately ln m / ln ln m.
Hope this helps!

how to evaluate derivative of function in matlab?

This should be very simple. I have a function f(x), and I want to evaluate f'(x) for a given x in MATLAB.
All my searches have come up with symbolic math, which is not what I need, I need numerical differentiation.
E.g. if I define: fx = inline('x.^2')
I want to find say f'(3), which would be 6, I don't want to find 2x

If your function is known to be twice differentiable, use
f'(x) = (f(x + h) - f(x - h)) / 2h
which is second order accurate in h. If it is only once differentiable, use
f'(x) = (f(x + h) - f(x)) / h (*)
which is first order in h.
This is theory. In practice, things are quite tricky. I'll take the second formula (first order) as the analysis is simpler. Do the second order one as an exercise.
The very first observation is that you must make sure that (x + h) - x = h, otherwise you get huge errors. Indeed, f(x + h) and f(x) are close to each other (say 2.0456 and 2.0467), and when you substract them, you lose a lot of significant figures (here it is 0.0011, which has 3 significant figures less than x). So any error on h is likely to have a huge impact on the result.
So, first step, fix a candidate h (I'll show you in a minute how to chose it), and take as h for your computation the quantity h' = (x + h) - x. If you are using a language like C, you must take care to define h or x as volatile for that computation not to be optimized away.
Next, the choice of h. The error in (*) has two parts: the truncation error and the roundoff error. The truncation error is because the formula is not exact:
(f(x + h) - f(x)) / h = f'(x) + e1(h)
where e1(h) = h / 2 * sup_{x in [0,h]} |f''(x)|.
The roundoff error comes from the fact that f(x + h) and f(x) are close to each other. It can be estimated roughly as
e2(h) ~ epsilon_f |f(x) / h|
where epsilon_f is the relative precision in the computation of f(x) (or f(x + h), which is close). This has to be assessed from your problem. For simple functions, epsilon_f can be taken as the machine epsilon. For more complicated ones, it can be worse than that by orders of magnitude.
So you want h which minimizes e1(h) + e2(h). Plugging everything together and optimizing in h yields
h ~ sqrt(2 * epsilon_f * f / f'')
which has to be estimated from your function. You can take rough estimates. When in doubt, take h ~ sqrt(epsilon) where epsilon = machine accuracy. For the optimal choice of h, the relative accuracy to which the derivative is known is sqrt(epsilon_f), ie. half the significant figures are correct.
In short: too small a h => roundoff error, too large a h => truncation error.
For the second order formula, same computation yields
h ~ (6 * epsilon_f / f''')^(1/3)
and a fractional accuracy of (epsilon_f)^(2/3) for the derivative (which is typically one or two significant figures better than the first order formula, assuming double precision).
If this is too imprecise, feel free to ask for more methods, there are a lot of tricks to get better accuracy. Richardson extrapolation is a good start for smooth functions. But those methods typically compute f quite a few times, this may or not be what you want if your function is complex.
If you are going to use numerical derivatives a lot of times at different points, it becomes interesting to construct a Chebyshev approximation.

To get a numerical difference (symmetric difference), you calculate (f(x+dx)-f(x-dx))/(2*dx)
fx = #(x)x.^2;
fPrimeAt3 = (fx(3.1)-fx(2.9))/0.2;
Alternatively, you can create a vector of function values and apply DIFF, i.e.
xValues = 2:0.1:4;
fValues = fx(xValues);
df = diff(fValues)./0.1;
Note that diff takes the forward difference, and that it assumes that dx equals to 1.
However, in your case, you may be better off to define fx as a polynomial, and evaluating the derivative of the function, rather than the function values.

Lacking the symbolic toolbox, nothing stops you from using Derivest, a tool for automatic adaptive numerical differentiation.
derivest(#sin,pi)
ans =
-1
For your example it does very nicely. In fact, it even provides an estimate of the error in the resulting approximation.
fx = inline('x.^2');
[fp,errest] = derivest(fx,3)
fp =
6
errest =
3.6308e-14

did you try diff (calculates differences and approximates a derivative), gradient, or polyder (calculates the derivative of a polynomial) functions?
You can read more on these functions by using help <commandname> on MATLAB console, or use the function browser in the Help menu.

For a given function in analytical form, you can evaluate the derivative at a desired point with the following code:
syms x
df = diff(x^2);
df3 = subs(df, 'x', 3);
fprintf('f''(3)=%f\n', df3);
For pure numerical derivatives use the already given solutions by Jonas and posdef.

What is O value for naive random selection from finite set?

This question on getting random values from a finite set got me thinking...
It's fairly common for people to want to retrieve X unique values from a set of Y values. For example, I may want to deal a hand from a deck of cards. I want 5 cards, and I want them to all be unique.
Now, I can do this naively, by picking a random card 5 times, and try again each time I get a duplicate, until I get 5 cards. This isn't so great, however, for large numbers of values from large sets. If I wanted 999,999 values from a set of 1,000,000, for instance, this method gets very bad.
The question is: how bad? I'm looking for someone to explain an O() value. Getting the xth number will take y attempts...but how many? I know how to figure this out for any given value, but is there a straightforward way to generalize this for the whole series and get an O() value?
(The question is not: "how can I improve this?" because it's relatively easy to fix, and I'm sure it's been covered many times elsewhere.)

Variables
n = the total amount of items in the set
m = the amount of unique values that are to be retrieved from the set of n items
d(i) = the expected amount of tries needed to achieve a value in step i
i = denotes one specific step. i ∈ [0, n-1]
T(m,n) = expected total amount of tries for selecting m unique items from a set of n items using the naive algorithm
Reasoning
The first step, i=0, is trivial. No matter which value we choose, we get a unique one at the first attempt. Hence:
d(0) = 1
In the second step, i=1, we at least need 1 try (the try where we pick a valid unique value). On top of this, there is a chance that we choose the wrong value. This chance is (amount of previously picked items)/(total amount of items). In this case 1/n. In the case where we picked the wrong item, there is a 1/n chance we may pick the wrong item again. Multiplying this by 1/n, since that is the combined probability that we pick wrong both times, gives (1/n)2. To understand this, it is helpful to draw a decision tree. Having picked a non-unique item twice, there is a probability that we will do it again. This results in the addition of (1/n)3 to the total expected amounts of tries in step i=1. Each time we pick the wrong number, there is a chance we might pick the wrong number again. This results in:
d(1) = 1 + 1/n + (1/n)2 + (1/n)3 + (1/n)4 + ...
Similarly, in the general i:th step, the chance to pick the wrong item in one choice is i/n, resulting in:
d(i) = 1 + i/n + (i/n)2 + (i/n)3 + (i/n)4 + ... = = sum( (i/n)k ), where k ∈ [0,∞]
This is a geometric sequence and hence it is easy to compute it's sum:
d(i) = (1 - i/n)-1
The overall complexity is then computed by summing the expected amount of tries in each step:
T(m,n) = sum ( d(i) ), where i ∈ [0,m-1] = = 1 + (1 - 1/n)-1 + (1 - 2/n)-1 + (1 - 3/n)-1 + ... + (1 - (m-1)/n)-1
Extending the fractions in the series above by n, we get:
T(m,n) = n/n + n/(n-1) + n/(n-2) + n/(n-3) + ... + n/(n-m+2) + n/(n-m+1)
We can use the fact that:
n/n ≤ n/(n-1) ≤ n/(n-2) ≤ n/(n-3) ≤ ... ≤ n/(n-m+2) ≤ n/(n-m+1)
Since the series has m terms, and each term satisfies the inequality above, we get:
T(m,n) ≤ n/(n-m+1) + n/(n-m+1) + n/(n-m+1) + n/(n-m+1) + ... + n/(n-m+1) + n/(n-m+1) = = m*n/(n-m+1)
It might be(and probably is) possible to establish a slightly stricter upper bound by using some technique to evaluate the series instead of bounding by the rough method of (amount of terms) * (biggest term)
Conclusion
This would mean that the Big-O order is O(m*n/(n-m+1)). I see no possible way to simplify this expression from the way it is.
Looking back at the result to check if it makes sense, we see that, if n is constant, and m gets closer and closer to n, the results will quickly increase, since the denominator gets very small. This is what we'd expect, if we for example consider the example given in the question about selecting "999,999 values from a set of 1,000,000". If we instead let m be constant and n grow really, really large, the complexity will converge towards O(m) in the limit n → ∞. This is also what we'd expect, since while chosing a constant number of items from a "close to" infinitely sized set the probability of choosing a previously chosen value is basically 0. I.e. We need m tries independently of n since there are no collisions.

If you already have chosen i values then the probability that you pick a new one from a set of y values is
(y-i)/y.
Hence the expected number of trials to get (i+1)-th element is
y/(y-i).
Thus the expected number of trials to choose x unique element is the sum
y/y + y/(y-1) + ... + y/(y-x+1)
This can be expressed using harmonic numbers as
y (Hy - Hy-x).
From the wikipedia page you get the approximation
Hx = ln(x) + gamma + O(1/x)
Hence the number of necessary trials to pick x unique elements from a set of y elements
is
y (ln(y) - ln(y-x)) + O(y/(y-x)).
If you need then you can get a more precise approximation by using a more precise approximation for Hx. In particular, when x is small it is possible to
improve the result a lot.

If you're willing to make the assumption that your random number generator will always find a unique value before cycling back to a previously seen value for a given draw, this algorithm is O(m^2), where m is the number of unique values you are drawing.
So, if you are drawing m values from a set of n values, the 1st value will require you to draw at most 1 to get a unique value. The 2nd requires at most 2 (you see the 1st value, then a unique value), the 3rd 3, ... the mth m. Hence in total you require 1 + 2 + 3 + ... + m = [m*(m+1)]/2 = (m^2 + m)/2 draws. This is O(m^2).
Without this assumption, I'm not sure how you can even guarantee the algorithm will complete. It's quite possible (especially with a pseudo-random number generator which may have a cycle), that you will keep seeing the same values over and over and never get to another unique value.
==EDIT==
For the average case:
On your first draw, you will make exactly 1 draw.
On your 2nd draw, you expect to make 1 (the successful draw) + 1/n (the "partial" draw which represents your chance of drawing a repeat)
On your 3rd draw, you expect to make 1 (the successful draw) + 2/n (the "partial" draw...)
...
On your mth draw, you expect to make 1 + (m-1)/n draws.
Thus, you will make 1 + (1 + 1/n) + (1 + 2/n) + ... + (1 + (m-1)/n) draws altogether in the average case.
This equals the sum from i=0 to (m-1) of [1 + i/n]. Let's denote that sum(1 + i/n, i, 0, m-1).
Then:
sum(1 + i/n, i, 0, m-1) = sum(1, i, 0, m-1) + sum(i/n, i, 0, m-1)
= m + sum(i/n, i, 0, m-1)
= m + (1/n) * sum(i, i, 0, m-1)
= m + (1/n)*[(m-1)*m]/2
= (m^2)/(2n) - (m)/(2n) + m
We drop the low order terms and the constants, and we get that this is O(m^2/n), where m is the number to be drawn and n is the size of the list.

There's a beautiful O(n) algorithm for this. It goes as follows. Say you have n items, from which you want to pick m items. I assume the function rand() yields a random real number between 0 and 1. Here's the algorithm:
items_left=n
items_left_to_pick=m
for j=1,...,n
if rand()<=(items_left_to_pick/items_left)
Pick item j
items_left_to_pick=items_left_to_pick-1
end
items_left=items_left-1
end
It can be proved that this algorithm does indeed pick each subset of m items with equal probability, though the proof is non-obvious. Unfortunately, I don't have a reference handy at the moment.
Edit The advantage of this algorithm is that it takes only O(m) memory (assuming the items are simply integers or can be generated on-the-fly) compared to doing a shuffle, which takes O(n) memory.

Your actual question is actually a lot more interesting than what I answered (and harder). I've never been any good at statistitcs (and it's been a while since I did any), but intuitively, I'd say that the run-time complexity of that algorithm would probably something like an exponential. As long as the number of elements picked is small enough compared to the size of the array the collision-rate will be so small that it will be close to linear time, but at some point the number of collisions will probably grow fast and the run-time will go down the drain.
If you want to prove this, I think you'd have to do something moderately clever with the expected number of collisions in function of the wanted number of elements. It might be possible do to by induction as well, but I think going by that route would require more cleverness than the first alternative.
EDIT: After giving it some thought, here's my attempt:
Given an array of m elements, and looking for n random and different elements. It is then easy to see that when we want to pick the ith element, the odds of picking an element we've already visited are (i-1)/m. This is then the expected number of collisions for that particular pick. For picking n elements, the expected number of collisions will be the sum of the number of expected collisions for each pick. We plug this into Wolfram Alpha (sum (i-1)/m, i=1 to n) and we get the answer (n**2 - n)/2m. The average number of picks for our naive algorithm is then n + (n**2 - n)/2m.
Unless my memory fails me completely (which entirely possible, actually), this gives an average-case run-time O(n**2).

The worst case for this algorithm is clearly when you're choosing the full set of N items. This is equivalent to asking: On average, how many times must I roll an N-sided die before each side has come up at least once?
Answer: N * HN, where HN is the Nth harmonic number,
a value famously approximated by log(N).
This means the algorithm in question is N log N.
As a fun example, if you roll an ordinary 6-sided die until you see one of each number, it will take on average 6 H6 = 14.7 rolls.

Before being able to answer this question in details, lets define the framework. Suppose you have a collection {a1, a2, ..., an} of n distinct objects, and want to pick m distinct objects from this set, such that the probability of a given object aj appearing in the result is equal for all objects.
If you have already picked k items, and radomly pick an item from the full set {a1, a2, ..., an}, the probability that the item has not been picked before is (n-k)/n. This means that the number of samples you have to take before you get a new object is (assuming independence of random sampling) geometric with parameter (n-k)/n. Thus the expected number of samples to obtain one extra item is n/(n-k), which is close to 1 if k is small compared to n.
Concluding, if you need m unique objects, randomly selected, this algorithm gives you
n/n + n/(n-1) + n/(n-2) + n/(n-3) + .... + n/(n-(m-1))
which, as Alderath showed, can be estimated by
m*n / (n-m+1).
You can see a little bit more from this formula:
* The expected number of samples to obtain a new unique element increases as the number of already chosen objects increases (which sounds logical).
* You can expect really long computation times when m is close to n, especially if n is large.
In order to obtain m unique members from the set, use a variant of David Knuth's algorithm for obtaining a random permutation. Here, I'll assume that the n objects are stored in an array.
for i = 1..m
k = randInt(i, n)
exchange(i, k)
end
here, randInt samples an integer from {i, i+1, ... n}, and exchange flips two members of the array. You only need to shuffle m times, so the computation time is O(m), whereas the memory is O(n) (although you can adapt it to only save the entries such that a[i] <> i, which would give you O(m) on both time and memory, but with higher constants).

Most people forget that looking up, if the number has already run, also takes a while.
The number of tries nessesary can, as descriped earlier, be evaluated from:
T(n,m) = n(H(n)-H(n-m)) ⪅ n(ln(n)-ln(n-m))
which goes to n*ln(n) for interesting values of m
However, for each of these 'tries' you will have to do a lookup. This might be a simple O(n) runthrough, or something like a binary tree. This will give you a total performance of n^2*ln(n) or n*ln(n)^2.
For smaller values of m (m < n/2), you can do a very good approximation for T(n,m) using the HA-inequation, yielding the formula:
2*m*n/(2*n-m+1)
As m goes to n, this gives a lower bound of O(n) tries and performance O(n^2) or O(n*ln(n)).
All the results are however far better, that I would ever have expected, which shows that the algorithm might actually be just fine in many non critical cases, where you can accept occasional longer running times (when you are unlucky).

Recursion and Big O

I've been working through a recent Computer Science homework involving recursion and big-O notation. I believe I understand this pretty well (certainly not perfectly, though!) But there is one question in particular that is giving me the most problems. The odd thing is that by looking it, it looks to be the most simple one on the homework.
Provide the best rate of growth using the big-Oh notation for the solution to the following recurrence?
T(1) = 2
T(n) = 2T(n - 1) + 1 for n>1
And the choices are:
O(n log n)
O(n^2)
O(2^n)
O(n^n)
I understand that big O works as an upper bound, to describe the most amount of calculations, or the highest running time, that program or process will take. I feel like this particular recursion should be O(n), since, at most, the recursion only occurs once for each value of n. Since n isn't available, it's either better than that, O(nlogn), or worse, being the other three options.
So, my question is: Why isn't this O(n)?

There's a couple of different ways to solve recurrences: substitution, recurrence tree and master theorem. Master theorem won't work in the case, because it doesn't fit the master theorem form.
You could use the other two methods, but the easiest way for this problem is to solve it iteratively.
T(n) = 2T(n-1) + 1
T(n) = 4T(n-2) + 2 + 1
T(n) = 8T(n-3) + 4 + 2 + 1
T(n) = ...
See the pattern?
T(n) = 2n-1⋅T(1) + 2n-2 + 2n-3 + ... + 1
T(n) = 2n-1⋅2 + 2n-2 + 2n-3 + ... + 1
T(n) = 2n + 2n-2 + 2n-3 + ... + 1
Therefore, the tightest bound is Θ(2n).

I think you have misunderstood the question a bit. It does not ask you how long it would take to solve the recurrence. It is asking what the big-O (the asymptotic bound) of the solution itself is.
What you have to do is to come up with a closed form solution, i. e. the non-recursive formula for T(n), and then determine what the big-O of that expression is.

The question is asking for the big-Oh notation for the solution to the recurrence, not the cost of calculation the recurrence.
Put another way: the recurrence produces:
1 -> 2
2 -> 5
3 -> 11
4 -> 23
5 -> 47
What big-Oh notation best describes the sequence 2, 5, 11, 23, 47, ...
The correct way to solve that is to solve the recurrence equations.

I think this will be exponential. Each increment to n makes the value to be twice as large.
T(2) = 2 * T(1) = 4
T(3) = 2 * T(2) = 2 * 4
...
T(x) would be the running time of the following program (for example):
def fn(x):
if (x == 1):
return # a constant time
# do the calculation for n - 1 twice
fn(x - 1)
fn(x - 1)

I think this will be exponential. Each increment to n brings twice as much calculation.
No, it doesn't. Quite on the contrary:
Consider that for n iterations, we get running time R. Then for n + 1 iterations we'll get exactly R + 1.
Thus, the growth rate is constant and the overall runtime is indeed O(n).
However, I think Dima's assumption about the question is right although his solution is overly complicated:
What you have to do is to come up with a closed form solution, i. e. the non-recursive formula for T(n), and then determine what the big-O of that expression is.
It's sufficient to examine the relative size of T(n) and T(n + 1) iterations and determine the relative growth rate. The amount obviously doubles which directly gives the asymptotic growth.

First off, all four answers are worse than O(n)... O(n*log n) is more complex than plain old O(n). What's bigger: 8 or 8 * 3, 16 or 16 * 4, etc...
On to the actual question. The general solution can obviously be solved in constant time if you're not doing recursion
( T(n) = 2^(n - 1) + 2^(n) - 1 ), so that's not what they're asking.
And as you can see, if we write the recursive code:
int T( int N )
{
if (N == 1) return 2;
return( 2*T(N-1) + 1);
}
It's obviously O(n).
So, it appears to be a badly worded question, and they are probably asking you the growth of the function itself, not the complexity of the code. That's 2^n. Now go do the rest of your homework... and study up on O(n * log n)

Computing a closed form solution to the recursion is easy.
By inspection, you guess that the solution is
T(n) = 3*2^(n-1) - 1
Then you prove by induction that this is indeed a solution. Base case:
T(1) = 3*2^0 - 1 = 3 - 1 = 2. OK.
Induction:
Suppose T(n) = 3*2^(n-1) - 1. Then
T(n+1) = 2*T(n) + 1 = 3*2^n - 2 + 1 = 3*2^((n+1)-1) - 1. OK.
where the first equality stems from the recurrence definition,
and the second from the inductive hypothesis. QED.
3*2^(n-1) - 1 is clearly Theta(2^n), hence the right answer is the third.
To the folks that answered O(n): I couldn't agree more with Dima. The problem does not ask the tightest upper bound to the computational complexity of an algorithm to compute T(n) (which would be now O(1), since its closed form has been provided). The problem asks for the tightest upper bound on T(n) itself, and that is the exponential one.

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex