how to solve mathematical expectation in hackerrank 20/20 hack february 2014

how to solve mathematical expectation in hackerrank 20/20 hack february 2014 - math

So this problem was given in Hackerrank 20/20 hack february :
Let’s consider a random permutation p1, p2, …, pN of numbers 1, 2, …, N and calculate the value F=(X2+…+XN-1)^K, where Xi equals 1 if one of the following two conditions holds: pi-1 < pi > pi+1 or pi-1 > pi < pi+1 and Xi equals 0 otherwise. What is the expected value of F?
Constraints: 1000 <= N <= 10^9, 1 <= K <= 5
I thought it was Eulerian number related problem. As the contest is over,I can see the solutions. But I don't understand any of them. Is there any tricks?

so a few words about my "solution" ;)
What I basically did:
1) write a brute force solver (obviously for N << 20)
-> this solver won't handle high values of N, as given in the constraints
2) analyze the output of the solutions to these (invalid) inputs
-> observe that with K=1, the output follows a straight line
-> K=2, is a quadratic function
-> K=3, is a cubic function, and so on
3) find the parameters for each function (K=1 - 5) by using a solver, or how I did it, wolfram alpha ;)
-> additionally I "normalized" each parameter to only have one division afterwards
4) use any programming language / big integer class to solve the correct inputs in O(1)
I'm pretty sure that one can come up with these parameters in a very clever way, but for me, during the contest, this solution was easy and fast enough without having to think too much about the "why" ;)

Related

When have enough bits of my series with non-negative terms been calculated?

I have a power series with all terms non-negative which I want to evaluate to some arbitrarily set precision p (the length in binary digits of a MPFR floating-point mantissa). The result should be faithfully rounded. The issue is that I don't know when should I stop adding terms to the result variable, that is, how do I know when do I already have p + 32 accurate summed bits of the series? 32 is just an arbitrarily chosen small natural number meant to facilitate more accurate rounding to p binary digits.
This is my original series
0 <= h <= 1
series_orig(h) := sum(n = 0, +inf, a(n) * h^n)
But I actually need to calculate an arbitrary derivative of the above series (m is the order of the derivative):
series(h, m) := sum(n = m, +inf, a(n) * (n - m + 1) * ... * n * h^(n - m))
The rational number sequence a is defined like so:
a(n) := binomial(1/2, n)^2
= (((2*n)!/(n!)) / (n! * 4^n * (2*n - 1)))^2
So how do I know when to stop summing up terms of series?
Is the following maybe a good strategy?
compute in p * 4 (which is assumed to be greater than p + 32).
at each point be able to recall the current partial sum and the previous one.
stop looping when the previous and current partial sums are equal if rounded to precision p + 32.
round to precision p and return.
Clarification
I'm doing this with MPFI, an interval arithmetic addon to MPFR. Thus the [mpfi] tag.
Attempts to get relevant formulas and equations
Guided by Eric in the comments, I have managed to derive a formula for the required working precision and an equation for the required number of terms of the series in the sum.
A problem, however, is that a nice formula for the required number of terms is not possible.
Someone more mathematically capable might instead be able to achieve a formula for a useful upper bound, but that seems quite difficult to do for all possible requested result precisions and for all possible values of m (the order of the derivative). Note that the formulas need to be easily computable so they're ready before I start computing the series.
Another problem is that it seems necessary to assume the worst case for h (h = 1) for there to be any chance of a nice formula, but this is wasteful if h is far from the worst case, that is if h is close to zero.

Sum of the powers

I have to code to evaluate the value of following sequence :
( pow(1,k) + pow(2,k) + ... + pow(n,k) ) % MOD
for given value of n,k and MOD.
I have tried searching it on internet. I got an equation . It contains zeta functions and it seems difficult in implementation. I want any simple approach for implementing the same. Note that the value of n is large, so that we cannot simply use brute force to pass the time limit.

Newton's identities might be of help. Calculate the coefficients of the polynomial with 1..n as roots. That pretty trivial. Then use the identities.
It's just the first thing that comes to mind when I see sums of powers.
I think it is nicely compatible with modular arithmetics - there are only multiplications and additions.
I must admit, that Newton's identities are only the rearrangement of the terms, so not much speed gain here.

JUST USE PYTHON
k=input("Enter value for K: ")
n=input("Enter value for N: ")
mod=input("Enter value for MOD: ")
sum=0
for i in range(1,n+1):
sum+=pow(i,k)
result=sum % mod
print mod
May be this code is gonna help.

I agree that math.stackexchange.com is a better bet.
But here are random facts that, depending on parameters, may make the problem more manageable.
First, factor MOD, solve for each prime power factor, then use the Chinese Remainder Theorem to find the answer for MOD. Thus without loss of generality, you may assume that MOD is a prime power.
Next, note that 1^k + ... + MOD^k is always divisible by MOD. Therefore you can replace n by n mod MOD.
Next, if MOD = p^i and j is not divisible by p, then j^((p-1) * p^(i-1)) is 1 mod MOD, so we can reduce the size of k.
Of course if (k, n) < MOD and MOD is prime, this will not help you at all. (Which, depending on how this problem arises, may well be the case.)
(If k is small enough, there are explicit formulas that you can produce for the sum. But it seems that for you k can be large enough to make that approach intractable.)

Get branch points of equation

If I have a general function,f(z,a), z and a are both real, and the function f takes on real values for all z except in some interval (z1,z2), where it becomes complex. How do I determine z1 and z2 (which will be in terms of a) using Mathematica (or is this possible)? What are the limitations?
For a test example, consider the function f[z_,a_]=Sqrt[(z-a)(z-2a)]. For real z and a, this takes on real values except in the interval (a,2a), where it becomes imaginary. How do I find this interval in Mathematica?
In general, I'd like to know how one would go about finding it mathematically for a general case. For a function with just two variables like this, it'd probably be straightforward to do a contour plot of the Riemann surface and observe the branch cuts. But what if it is a multivariate function? Is there a general approach that one can take?

What you have appears to be a Riemann surface parametrized by 'a'. Consider the algebraic (or analytic) relation g(a,z)=0 that would be spawned from this branch of a parametrized Riemann surface. In this case it is simply g^2 - (z - a)*(z - 2*a) == 0. More generally it might be obtained using Groebnerbasis, as below (no guarantee this will always work without some amount of user intervention).
grelation = First[GroebnerBasis[g - Sqrt[(z - a)*(z - 2*a)], {x, a, g}]]
Out[472]= 2 a^2 - g^2 - 3 a z + z^2
A necessary condition for the branch points, as functions of the parameter 'a', is that the zero set for 'g' not give a (single valued) function in a neighborhood of such points. This in turn means that the partial derivative of this relation with respect to g vanishes (this is from the implicit function theorem of multivariable calculus). So we find where grelation and its derivative both vanish, and solve for 'z' as a function of 'a'.
Solve[Eliminate[{grelation == 0, D[grelation, g] == 0}, g], z]
Out[481]= {{z -> a}, {z -> 2 a}}
Daniel Lichtblau
Wolfram Research

For polynomial systems (and some class of others), Reduce can do the job.
E.g.
In[1]:= Reduce[Element[{a, z}, Reals]
&& !Element[Sqrt[(z - a) (z - 2 a)], Reals], z]
Out[1]= (a < 0 && 2a < z < a) || (a > 0 && a < z < 2a)
This type of approach also works (often giving very complicated solutions for functions with many branch cuts) for other combinations of elementary functions I checked.
To find the branch cuts (as opposed to the simple class of branch points you're interested in) in general, I don't know of a good approach. The best place to find the detailed conventions that Mathematica uses is at the functions.wolfram site.
I do remember reading a good paper on this a while back... I'll try to find it....
That's right! The easiest approach I've seen for branch cut analysis uses the unwinding number. There's a paper "Reasoning about the elementary functions of complex analysis" about this the the journal "Artificial Intelligence and Symbolic Computation". It and similar papers can be found at one of the authors homepage: http://www.apmaths.uwo.ca/~djeffrey/offprints.html.

For general functions you cannot make Mathematica calculate it.
Even for polynomials, finding an exact answer takes time.
I believe Mathematica uses some sort of quantifier elimination when it uses Reduce,
which takes time.
Without any restrictions on your functions (are they polynomials, continuous, smooth?)
one can easily construct functions which Mathematica cannot simplify further:
f[x_,y_] := Abs[Zeta[y+0.5+x*I]]*I
If this function is real for arbitrary x and any -0.5 < y < 0 or 0<y<0.5,
then you will have found a counterexample to the Riemann zeta conjecture,
and I'm sure Mathematica cannot give a correct answer.

how to evaluate derivative of function in matlab?

This should be very simple. I have a function f(x), and I want to evaluate f'(x) for a given x in MATLAB.
All my searches have come up with symbolic math, which is not what I need, I need numerical differentiation.
E.g. if I define: fx = inline('x.^2')
I want to find say f'(3), which would be 6, I don't want to find 2x

If your function is known to be twice differentiable, use
f'(x) = (f(x + h) - f(x - h)) / 2h
which is second order accurate in h. If it is only once differentiable, use
f'(x) = (f(x + h) - f(x)) / h (*)
which is first order in h.
This is theory. In practice, things are quite tricky. I'll take the second formula (first order) as the analysis is simpler. Do the second order one as an exercise.
The very first observation is that you must make sure that (x + h) - x = h, otherwise you get huge errors. Indeed, f(x + h) and f(x) are close to each other (say 2.0456 and 2.0467), and when you substract them, you lose a lot of significant figures (here it is 0.0011, which has 3 significant figures less than x). So any error on h is likely to have a huge impact on the result.
So, first step, fix a candidate h (I'll show you in a minute how to chose it), and take as h for your computation the quantity h' = (x + h) - x. If you are using a language like C, you must take care to define h or x as volatile for that computation not to be optimized away.
Next, the choice of h. The error in (*) has two parts: the truncation error and the roundoff error. The truncation error is because the formula is not exact:
(f(x + h) - f(x)) / h = f'(x) + e1(h)
where e1(h) = h / 2 * sup_{x in [0,h]} |f''(x)|.
The roundoff error comes from the fact that f(x + h) and f(x) are close to each other. It can be estimated roughly as
e2(h) ~ epsilon_f |f(x) / h|
where epsilon_f is the relative precision in the computation of f(x) (or f(x + h), which is close). This has to be assessed from your problem. For simple functions, epsilon_f can be taken as the machine epsilon. For more complicated ones, it can be worse than that by orders of magnitude.
So you want h which minimizes e1(h) + e2(h). Plugging everything together and optimizing in h yields
h ~ sqrt(2 * epsilon_f * f / f'')
which has to be estimated from your function. You can take rough estimates. When in doubt, take h ~ sqrt(epsilon) where epsilon = machine accuracy. For the optimal choice of h, the relative accuracy to which the derivative is known is sqrt(epsilon_f), ie. half the significant figures are correct.
In short: too small a h => roundoff error, too large a h => truncation error.
For the second order formula, same computation yields
h ~ (6 * epsilon_f / f''')^(1/3)
and a fractional accuracy of (epsilon_f)^(2/3) for the derivative (which is typically one or two significant figures better than the first order formula, assuming double precision).
If this is too imprecise, feel free to ask for more methods, there are a lot of tricks to get better accuracy. Richardson extrapolation is a good start for smooth functions. But those methods typically compute f quite a few times, this may or not be what you want if your function is complex.
If you are going to use numerical derivatives a lot of times at different points, it becomes interesting to construct a Chebyshev approximation.

To get a numerical difference (symmetric difference), you calculate (f(x+dx)-f(x-dx))/(2*dx)
fx = #(x)x.^2;
fPrimeAt3 = (fx(3.1)-fx(2.9))/0.2;
Alternatively, you can create a vector of function values and apply DIFF, i.e.
xValues = 2:0.1:4;
fValues = fx(xValues);
df = diff(fValues)./0.1;
Note that diff takes the forward difference, and that it assumes that dx equals to 1.
However, in your case, you may be better off to define fx as a polynomial, and evaluating the derivative of the function, rather than the function values.

Lacking the symbolic toolbox, nothing stops you from using Derivest, a tool for automatic adaptive numerical differentiation.
derivest(#sin,pi)
ans =
-1
For your example it does very nicely. In fact, it even provides an estimate of the error in the resulting approximation.
fx = inline('x.^2');
[fp,errest] = derivest(fx,3)
fp =
6
errest =
3.6308e-14

did you try diff (calculates differences and approximates a derivative), gradient, or polyder (calculates the derivative of a polynomial) functions?
You can read more on these functions by using help <commandname> on MATLAB console, or use the function browser in the Help menu.

For a given function in analytical form, you can evaluate the derivative at a desired point with the following code:
syms x
df = diff(x^2);
df3 = subs(df, 'x', 3);
fprintf('f''(3)=%f\n', df3);
For pure numerical derivatives use the already given solutions by Jonas and posdef.

Pohlig–Hellman algorithm for computing discrete logarithms

I'm working on coding the Pohlig-Hellman Algorithm but I am having problem understand the steps in the algorithm based on the definition of the algorithm.
Going by the Wiki of the algorithm:
I know the first part 1) is to calculate the prime factor of p-1 - which is fine.
However, I am not sure what I need to do in steps 2) where you calculate the co-efficents:
Let x2 = c0 + c1(2).
125(180/2) = 12590 1 mod (181) so c0 = 0.
125(180/4) = 12545 1 mod (181) so c1 = 0.
Thus, x2 = 0 + 0 = 0.
and 3) put the coefficents together and solve in the chinese remainder theorem.
Can someone help with explaining this in plain english (i) - or pseudocode. I want to code the solution myself obviously but I cannot make any more progress unless i understand the algorithm.
Note: I have done a lot of searching for this and I read S. Pohlig and M. Hellman (1978). "An Improved Algorithm for Computing Logarithms over GF(p) and its Cryptographic Significance but its still not really making sense to me.
Thanks in advance
Update:
how come q(125) stays constant in this example.
Where as in this example is appears like he is calculating a new q each time.
To be more specific I don't understand how the following is computed:
Now divide 7531 by a^c0 to get
7531(a^-2) = 6735 mod p.

Let's start with the main idea behind Pohlig-Hellman. Assume that we are given y, g and p and that we want to find x, such that
y == gx (mod p).
(I'm using == to denote an equivalence relation). To simplify things, I'm also assuming that the order of g is p-1, i.e. the smallest positive k with 1==gk (mod p) is k=p-1.
An inefficient method to find x, would be to simply try all values in the range 1 .. p-1.
Somewhat better is the "Baby-step giant-step" method that requires O(p0.5) arithmetic operations. Both methods are quite slow for large p. Pohlig-Hellman is a significant improvement when p-1 has many factors. I.e. assume that
p-1 = n r
Then what Pohlig and Hellman propose is to solve the equation
yn == (gn)z
(mod p).
If we take logarithms to the basis g on both sides, this is the same as
n logg(y) == logg(yn) == nz (mod p-1).
n can be divided out, giving
logg(y) == z (mod r).
Hence x == z (mod r).
This is an improvement, since we only have to search a range 0 .. r-1 for a solution of z. And again "Baby-step giant-step" can be used to improve the search for z. Obviously, doing this once is not a complete solution yet. I.e. one has to repeat the algorithm above for every prime factor r of p-1 and then to use the Chinese remainder theorem to find x from the partial solutions. This works nicely if p-1 is square free.
If p-1 is divisible by a prime power then a similiar idea can be used. For example let's assume that p-1 = m qk.
In the first step, we compute z such that x == z (mod q) as shown above. Next we want to extend this to a solution x == z' (mod q2). E.g. if p-1 = m q2 then this means that we have to find z' such that
ym == (gm)z' (mod p).
Since we already know that z' == z (mod q), z' must be in the set {z, z+q, z+2q, ..., z+(q-1)q }. Again we could either do an exhaustive search for z' or improve the search with "baby-step giant-step". This step is repeated for every exponent of q, this is from knowing x mod qi we iteratively derive x mod qi+1.

I'm coding it up myself right now (JAVA). I'm using Pollard-Rho to find the small prime factors of p-1. Then using Pohlig-Hellman to solve a DSA private key. y = g^x. I am having the same problem..
UPDATE: "To be more specific I don't understand how the following is computed: Now divide 7531 by a^c0 to get 7531(a^-2) = 6735 mod p."
if you find the modInverse of a^c0 it will make sense
Regards