How to solve the 50 noodles/shoelaces puzzle?

There are 50 noodles in a bowl. You can tie two ends of either one noodle or two different noodles, forming a knot.
Q: What is the expected number of loops we can have in the bowl?

∑(1/i) for i from 1 to 50.
When you have n noodles, let's take a look at noodle number n. Assuming its partner is equally likely to be any of the n noodles, it is tied either to itself with probability 1/n or to some other noodle with probability (n-1)/n. When it is tied to itself, a loop is formed and we need to find the expected value for the remaining n-1 noodles. When it is tied to some other noodle, the two merge into one longer noodle, which is the same as taking this noodle away, so the answer is the expected value for the remaining n-1 noodles.
f(n) = 1/n * (f(n-1) + 1) + (n-1)/n * f(n-1);
f(n) = 1/n * f(n-1) + 1/n + (n-1)/n * f(n-1);
f(n) = f(n-1) + 1/n
f(n) = 1 + 1/2 + ... + 1/n
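The recurrence above can be evaluated directly as a sanity check. This is a minimal sketch that takes the answer's probabilities (1/n for tying to itself) as given; a different tying model would change those probabilities. The function name `expected_loops` is made up for illustration.

```python
from fractions import Fraction


def expected_loops(n):
    """Evaluate f(k) = 1/k * (f(k-1) + 1) + (k-1)/k * f(k-1) with f(0) = 0."""
    f = Fraction(0)
    for k in range(1, n + 1):
        # With probability 1/k a loop closes; otherwise nothing is gained.
        f = Fraction(1, k) * (f + 1) + Fraction(k - 1, k) * f
    return f


# Exact rational arithmetic, so small cases can be compared by hand.
print(expected_loops(3))           # 1 + 1/2 + 1/3 = 11/6
print(float(expected_loops(50)))   # the 50th harmonic number, roughly 4.5
```

Using `Fraction` keeps the result exact, so it can be compared term by term with 1 + 1/2 + ... + 1/n.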


Getting the closed form of a recurrence and comparing which is faster

Get closed form of these equations if possible. Then, determine which would be faster than the other.
f(n) = 0.25f(n/3)+ f(n/10) + logn, f(1) = 1
g(n) = n + log(n-1)^2 + 1
In these equations, do I have to expand the recursions and try to discover patterns within them? I really don't know how to calculate a closed form intuitively.
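Short answer: g(n) grows faster than f(n).
Long answer: g is not even recursive, so you can see immediately that g(n) = Θ(n).
Assuming f is nondecreasing, f(n/10) is at most f(n/3), so you can bound f(n) <= 1.25f(n/3) + logn
which, by the master theorem, is O(n^(log_3 1.25)), about O(n^0.21), so f grows much more slowly than g.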
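A quick numerical comparison can back up the short answer. This is only a sketch under stated assumptions: n/3 and n/10 are taken as integer division, f(n) is set to 1 for every n <= 1, log is the natural logarithm, and log(n-1)^2 is read as (log(n-1))^2.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def f(n):
    # Assumption: integer division for the arguments, and f(n) = 1 for n <= 1.
    if n <= 1:
        return 1.0
    return 0.25 * f(n // 3) + f(n // 10) + math.log(n)


def g(n):
    # Assumption: log(n-1)^2 means (log(n-1))^2; defined for n >= 2.
    return n + math.log(n - 1) ** 2 + 1


for n in (10, 1000, 10**6):
    print(n, round(f(n), 3), round(g(n), 3))
```

The printed values of f stay tiny next to g, which is dominated by its linear term.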

What is "h" in numerical differentiation?

I would like to know what h from the numerical differentiation formulas is and how I can calculate it when I have a function.
I am speaking about these formulas:
f'(x0) = (f(x0 + h) - f(x0)) / h
f'(x0) = (f(x0) - f(x0 - h)) / h
f'(x0) = (f(x0 + h) - f(x0 - h)) / (2*h)
I would really appreciate any kind of help!
In such formulae h is usually a "very small number", similar to epsilon in Calculus.
For example, the derivative of f at a is defined as:
f'(a) = lim (h -> 0) (f(a + h) - f(a)) / h
Note how h is defined as approaching 0.
When programming, e.g. doing numerical gradient computation, it usually works to set h to something very small - many programming environments have an "epsilon" quantity; lacking that, you can just use a very small floating-point number.
Using the usual 8 byte floats, sensible values for h are 1e-8 for the first and second formula and 1e-5 for the third central difference quotient. This is valid for medium values of x, for larger x one would have to include the scale of x in some way.
In general, for a kth order difference quotient with error order p, the balance between floating point noise and numerical error is reached for h about pow(2e-16, 1.0/(p+k)).
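As a rough illustration of the step sizes suggested above, here is a minimal sketch of the forward and central difference quotients. The function names are made up for this example, and scaling h by max(1, |x0|) is just one common way to account for the size of x; it is an assumption, not something the formulas prescribe.

```python
import math


def forward_diff(f, x0, h=1e-8):
    """First formula: (f(x0 + h) - f(x0)) / h."""
    h = h * max(1.0, abs(x0))  # assumed scaling for larger x0
    return (f(x0 + h) - f(x0)) / h


def central_diff(f, x0, h=1e-5):
    """Third formula: (f(x0 + h) - f(x0 - h)) / (2*h)."""
    h = h * max(1.0, abs(x0))  # assumed scaling for larger x0
    return (f(x0 + h) - f(x0 - h)) / (2 * h)


exact = math.cos(1.0)  # derivative of sin at 1.0
print(abs(forward_diff(math.sin, 1.0) - exact))
print(abs(central_diff(math.sin, 1.0) - exact))
```

Trying much smaller or much larger h in either function shows the trade-off: too small and floating point cancellation dominates, too large and the truncation error does.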

Algorithm for Solving a Linear Combination?

I have run into the following problem that I need to solve in a project that I'm working on:
Given some number of vectors v_i (in the math sense), and a target vector H, compute a linear combination of the vectors v_i that most closely matches the target vector H, with the constraint that the coefficients must be in [0, 1].
I do not know much about what kind of algorithms / math should be used to approach such a problem. Any prods in the right general direction would be much appreciated!
It's a constrained least squares problem. Basically you want to solve the optimization problem:
argmin ||Ax-H||
s.t. 0<=x_j<=1
where x=(x_1, ..., x_j, ..., x_n) consists of the coefficients you are seeking, and each column of A corresponds to a vector v_i.
Assuming that you want to solve in the least squares sense, then you have a quadratic programming problem. For example, say that your set of vectors is
x1 = [1 2 3]'    x2 = [3 2 1]'
and your target vector is
H = [1 -1 1]'
Then you can create the matrix whose columns are your vectors:
A = [1 3;
2 2;
3 1]
and the thing you are trying to minimize is
norm(A*x - H)^2 = (A*x - H)' * (A*x - H) = x' * (A'*A) * x - (2*H'*A) * x + const
If you define
B = A' * A
C = -2 * H' * A
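then you have a problem that can be solved optimally by Matlab's quadprog function. Note that quadprog minimizes 0.5 * x' * Q * x + f' * x, so the quadratic term has to be passed as 2*B:
x = quadprog(2*B, C', [], [], [], [], [0; 0], [1; 1])
which returns x = [1/12; 1/12] (about 0.0833 each),
so the optimal solution in this case is
1/12 * x1 + 1/12 * x2 = [1/3, 1/3, 1/3]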
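Without Matlab, the same box-constrained least squares problem can be approximated with a few lines of plain Python. The sketch below uses projected gradient descent, which is a different technique from quadprog's solver, and the function name `box_least_squares` is hypothetical. It is meant to illustrate the idea, not to replace a proper QP solver.

```python
def box_least_squares(A, H, steps=2000):
    """Minimize ||A x - H||^2 subject to 0 <= x_j <= 1 (projected gradient descent)."""
    m, n = len(A), len(A[0])
    # Safe step size: the squared Frobenius norm of A bounds the largest
    # eigenvalue of A'A, which is the Lipschitz constant of the gradient.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n
    for _ in range(steps):
        # Residual r = A x - H.
        r = [sum(A[i][j] * x[j] for j in range(n)) - H[i] for i in range(m)]
        # Gradient of 0.5 * ||r||^2 is A' r.
        grad = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step, then clip each coefficient back into [0, 1].
        x = [min(1.0, max(0.0, x[j] - step * grad[j])) for j in range(n)]
    return x


A = [[1, 3],
     [2, 2],
     [3, 1]]   # columns are the vectors x1 and x2 from the example
H = [1, -1, 1]
print(box_least_squares(A, H))
```

For the example vectors the iteration settles at coefficients of about 0.0833 (1/12) each. For larger problems a dedicated solver will converge faster and more reliably.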
This is a combinatorial optimization problem. These kinds of problems are NP-hard. But I guess for the binary one, there should be polynomial algorithms that can solve it, or there may be some relaxation that gives an approximate solution. Some googling on "integer programming" may help.

The I in Proportional Integral Derivative

The I in PID (Proportional Integral Derivative) is the sum of the last few previous errors, weighted only by its gain.
Using error(-1) to mean the previous error, error(-2) to mean the error before that etc... 'I' can be described as:
I = (error(-1) + error(-2) + error(-3) + error(-4) etc...) * I_gain
Why when PID was designed was 'I' not instead designed to slope off in importance into the past, for example:
I = (error(-1) + (error(-2) * 0.9) + (error(-3) * 0.81) + (error(-4) * 0.729) + etc...) * I_gain
The integral term is the sum of ALL the past errors. You simply add the error to the "integrator" at each time step. If this needs to be limited, clamp it to a min or max value if it goes out of range. Then copy this accumulated value to your output and add the proportional and derivative terms and clamp the output again if necessary.
The Derivative term is the difference between the present and previous error (the rate of change in the error). P of course is just proportional to the error.
err = reference - new_measurement
I += kI * err
Derivative = err - old_err
output = I + kD * Derivative + kP * err
old_err = err
And there you have it. Limits omitted of course.
Once the controller reaches the reference value, the error will become zero and the integrator will stop changing. Noise will naturally make it bounce around a bit, but it will stay at the steady state value required to meet your objective, while the P and D terms do most of the work to reduce transients.
Notice that in a steady state condition, the I term is the ONLY thing providing any output. If the control has reached the reference and this requires a non-zero output, it is provided solely by the integrator since the error will be zero. If the I term used weighted errors, it would start to decay back to zero and not sustain the output as needed.
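To see the steady-state argument in numbers, here is a minimal sketch of the update steps in Python, run against a toy plant. Everything outside the five update lines is an assumption made for illustration: the class and function names, the `decay` parameter (1.0 is the ordinary integrator, 0.9 reproduces the weighted sum from the question), the symmetric clamp, and the first-order plant model. The derivative term is added with a positive gain, as in the standard PID form.

```python
class PID:
    """Discrete PID controller; decay < 1.0 makes old errors fade out."""

    def __init__(self, kp, ki, kd, decay=1.0, limit=float("inf")):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.decay = decay            # hypothetical knob, not part of plain PID
        self.limit = limit            # symmetric clamp for integrator and output
        self.integrator = 0.0
        self.old_err = 0.0

    def _clamp(self, value):
        return max(-self.limit, min(self.limit, value))

    def update(self, reference, measurement):
        err = reference - measurement
        # Accumulate the error (optionally fading the past) and clamp it.
        self.integrator = self._clamp(self.decay * self.integrator + self.ki * err)
        derivative = err - self.old_err
        self.old_err = err
        return self._clamp(self.integrator + self.kp * err + self.kd * derivative)


def simulate(pid, reference=1.0, steps=1000):
    """Drive a toy first-order plant and return its final value."""
    y = 0.0
    for _ in range(steps):
        u = pid.update(reference, y)
        y += 0.1 * (u - y)   # holding y at the reference needs u == reference
    return y


print(simulate(PID(0.5, 0.1, 0.1)))              # plain integrator
print(simulate(PID(0.5, 0.1, 0.1, decay=0.9)))   # weighted errors
```

With the plain integrator the plant settles on the reference value of 1.0. With `decay=0.9` it settles around 0.6 instead, because the fading integrator cannot sustain the output on its own and a permanent error is needed to keep feeding it.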

Recursion and Big O

I've been working through a recent Computer Science homework involving recursion and big-O notation. I believe I understand this pretty well (certainly not perfectly, though!). But there is one question in particular that is giving me the most problems. The odd thing is that by looking at it, it looks to be the simplest one on the homework.
Provide the best rate of growth using the big-Oh notation for the solution to the following recurrence?
T(1) = 2
T(n) = 2T(n - 1) + 1 for n>1
And the choices are:
O(n log n)
I understand that big O works as an upper bound, to describe the largest number of calculations, or the highest running time, that a program or process will take. I feel like this particular recursion should be O(n), since, at most, the recursion only occurs once for each value of n. Since O(n) isn't available, it's either better than that, O(nlogn), or worse, being the other three options.
So, my question is: Why isn't this O(n)?
There are a couple of different ways to solve recurrences: substitution, recurrence tree and master theorem. The master theorem won't work in this case, because the recurrence doesn't fit the master theorem form.
You could use the other two methods, but the easiest way for this problem is to solve it iteratively.
T(n) = 2T(n-1) + 1
T(n) = 4T(n-2) + 2 + 1
T(n) = 8T(n-3) + 4 + 2 + 1
T(n) = ...
See the pattern?
T(n) = 2^(n-1)⋅T(1) + 2^(n-2) + 2^(n-3) + ... + 1
T(n) = 2^(n-1)⋅2 + 2^(n-2) + 2^(n-3) + ... + 1
T(n) = 2^n + 2^(n-2) + 2^(n-3) + ... + 1
Therefore, the tightest bound is Θ(2^n).
I think you have misunderstood the question a bit. It does not ask you how long it would take to solve the recurrence. It is asking what the big-O (the asymptotic bound) of the solution itself is.
What you have to do is to come up with a closed form solution, i. e. the non-recursive formula for T(n), and then determine what the big-O of that expression is.
The question is asking for the big-Oh notation for the solution to the recurrence, not the cost of calculating the recurrence.
Put another way: the recurrence produces:
1 -> 2
2 -> 5
3 -> 11
4 -> 23
5 -> 47
What big-Oh notation best describes the sequence 2, 5, 11, 23, 47, ...?
The correct way to solve that is to solve the recurrence equations.
I think this will be exponential. Each increment to n makes the value to be twice as large.
T(2) = 2 * T(1) = 4
T(3) = 2 * T(2) = 2 * 4
T(x) would be the running time of the following program (for example):
def fn(x):
    if (x == 1):
        return # a constant time
    # do the calculation for n - 1 twice
    fn(x - 1)
    fn(x - 1)
I think this will be exponential. Each increment to n brings twice as much calculation.
No, it doesn't. Quite on the contrary:
Consider that for n iterations, we get running time R. Then for n + 1 iterations we'll get exactly R + 1.
Thus, the growth rate is constant and the overall runtime is indeed O(n).
However, I think Dima's assumption about the question is right although his solution is overly complicated:
What you have to do is to come up with a closed form solution, i. e. the non-recursive formula for T(n), and then determine what the big-O of that expression is.
It's sufficient to examine the relative size of T(n) and T(n + 1) iterations and determine the relative growth rate. The amount obviously doubles which directly gives the asymptotic growth.
First off, all four answers are worse than O(n)... O(n*log n) is more complex than plain old O(n). What's bigger: 8 or 8 * 3, 16 or 16 * 4, etc...
On to the actual question. The general solution can obviously be solved in constant time if you're not doing recursion
( T(n) = 2^(n - 1) + 2^(n) - 1 ), so that's not what they're asking.
And as you can see, if we write the recursive code:
int T( int N )
{
    if (N == 1) return 2;
    return( 2*T(N-1) + 1);
}
It's obviously O(n).
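The two quantities being mixed up in this thread can be put side by side: the number of recursive calls needed to evaluate T(n), and the value T(n) itself. The sketch below is a Python version of the recursive code with a call counter bolted on; the name `T_counting` is made up for illustration.

```python
def T_counting(n):
    """Return (T(n), number of calls made) for T(1) = 2, T(n) = 2*T(n-1) + 1."""
    if n == 1:
        return 2, 1
    value, calls = T_counting(n - 1)   # a single recursive call per level
    return 2 * value + 1, calls + 1


for n in (1, 5, 10, 20):
    value, calls = T_counting(n)
    print(n, calls, value)
```

The call count grows by one per step, while the value roughly doubles per step.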
So, it appears to be a badly worded question, and they are probably asking you the growth of the function itself, not the complexity of the code. That's 2^n. Now go do the rest of your homework... and study up on O(n * log n)
Computing a closed form solution to the recursion is easy.
By inspection, you guess that the solution is
T(n) = 3*2^(n-1) - 1
Then you prove by induction that this is indeed a solution. Base case:
T(1) = 3*2^0 - 1 = 3 - 1 = 2. OK.
Suppose T(n) = 3*2^(n-1) - 1. Then
T(n+1) = 2*T(n) + 1 = 3*2^n - 2 + 1 = 3*2^((n+1)-1) - 1. OK.
where the first equality stems from the recurrence definition,
and the second from the inductive hypothesis. QED.
3*2^(n-1) - 1 is clearly Theta(2^n), hence the right answer is the third.
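The induction can also be spot-checked numerically. This short sketch computes T(n) straight from the recurrence with a loop and compares it with the guessed closed form; the function names are illustrative only.

```python
def T(n):
    """T(1) = 2, T(n) = 2*T(n-1) + 1, computed with a loop."""
    t = 2
    for _ in range(n - 1):
        t = 2 * t + 1
    return t


def closed_form(n):
    return 3 * 2 ** (n - 1) - 1


# The two agree on every value tried.
print(all(T(n) == closed_form(n) for n in range(1, 200)))
# The ratio T(n) / 2^n approaches the constant 3/2, consistent with Theta(2^n).
print(T(30) / 2 ** 30)
```

A check like this is no substitute for the proof, but it catches a wrong guess quickly before you attempt the induction.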
To the folks that answered O(n): I couldn't agree more with Dima. The problem does not ask the tightest upper bound to the computational complexity of an algorithm to compute T(n) (which would be now O(1), since its closed form has been provided). The problem asks for the tightest upper bound on T(n) itself, and that is the exponential one.
