PyTorch derivative calculation - math

I have this simple PyTorch code:
import torch
x = torch.arange(3, dtype=float)
x.requires_grad_(True)
y = 3*x + x.sum()
y.backward(torch.ones(3))
x.grad
This gives me [6,6,6], but shouldn't it be [4,4,4]?
Because if we have f(x) = 3*x0 + 3*x1 + 3*x2 + x0 + x1 + x2, each partial derivative would be 3+1=4?

The result is correct, and here is why.
I will refer to the first element of your result; the same reasoning extends to the other elements. You expect to get dy1/dx1, but that is not what the code computes. The result your code computes is dy1/dx1 + dy2/dx1 + dy3/dx1. Since yi = 3*xi + (x1 + x2 + x3), we have dy1/dx1 = 4 and dy2/dx1 = dy3/dx1 = 1, so the sum is 6.
The ones you pass to .backward mean that the computed result is dot_product(ones, dy/dx), that is, a vector-Jacobian product. Note that dy/dx is a 3x3 matrix.
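To make the answer above concrete, here is a minimal sketch in plain Python (no PyTorch) that approximates the 3x3 matrix dy/dx by central finite differences and then forms dot_product(ones, dy/dx). The helper names `f` and `jacobian` are made up for this illustration, and the step size `h` is an arbitrary choice.

```python
def f(x):
    """y_i = 3*x_i + sum(x), the same function as in the question."""
    s = sum(x)
    return [3.0 * xi + s for xi in x]

def jacobian(func, x, h=1e-6):
    """Central finite differences: J[i][j] approximates dy_i/dx_j."""
    n = len(x)
    cols = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        yp, ym = func(xp), func(xm)
        cols.append([(a - b) / (2.0 * h) for a, b in zip(yp, ym)])
    # cols[j][i] holds dy_i/dx_j, so transpose into J[i][j]
    return [[cols[j][i] for j in range(n)] for i in range(len(cols[0]))]

x = [0.0, 1.0, 2.0]
J = jacobian(f, x)          # diagonal entries near 4, off-diagonal near 1
v = [1.0, 1.0, 1.0]         # the "ones" passed to backward
# Vector-Jacobian product: grad_j = sum_i v_i * dy_i/dx_j
grad = [sum(v[i] * J[i][j] for i in range(3)) for j in range(3)]
print(grad)
```

Each entry of `grad` is a column sum of the Jacobian (4 + 1 + 1), which is why the question sees 6 rather than 4.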


Evaluate the Binomial polynomial expression in R

I need to evaluate a binomial polynomial expression in R. I can compute the polynomial expression using the polynomial() function in R, but on top of that I want the result to obey binomial arithmetic (addition modulo 2) as well.
For example: in binomial,
we know
1+1 = 0, which is also 1 XOR 1 = 0,
Now, if we do the same in polynomial expressions, it can be done in the following way:
(1+x) + x = 1
Here we suppose,
x + x is similar to 1 + 1 which is equal to zero. Or in other words x XOR x = 0.
Earlier I posted my whole R code; perhaps a few people did not understand the question and thought it better to close it.
I need to know how to implement the XOR operation on binomial polynomial expressions in R.
It needs to be applied in the following manner:
let f(x) = (1 + x + x^3) and g(x) = (x + x^3),
Therefore for the sum of f(x) and g(x), I need to do the following:
f(x) + g(x) = (1 + x + x^3) + (x + x^3)
= 1 + (1 + 1)x + (1 + 1)x^3 (using addition modulo 2 in Z2)
= 1 + (0)x + (0)x^3
= 1.
I hope that this time it is clearer what exactly I want and that my question is more understandable.
Thanks in advance
XOR <- function(x,y) (x+y) %% 2
would give you an XOR function fitting your definition.
Adding a solution to my own question.
Basically, we first compute the ordinary polynomial sum, just as we normally would. That is the first step. For example, to add f(x) and g(x) (assuming the polynom package, which provides polynomial()), create a function like the one below:
library(polynom)
x <- polynomial(c(0, 1)) # the polynomial "x"
bPolynomial <- function(f, g) {
  f + g # ordinary sum, e.g. f is (1 + x + x^3) and g is (x + x^3)
}
C_D <- bPolynomial(1 + x + x^3, x + x^3)
The second step is to extract the coefficients from the above polynomial and reduce them modulo 2, using the code below:
coeff <- coefficients(C_D) %% 2
print(coeff)
C_D <- polynomial(coeff)
That is it. You will get the desired result. I do feel stupid for getting stuck on something so basic, but implementing mathematical computations can sometimes be confusing, and that is what happened to me.
I hope it will be helpful to other people. Thanks.
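For readers who want to check the arithmetic outside R, here is a small sketch of the same idea in Python: add the coefficients term by term and reduce each one modulo 2. The representation (a list of coefficients, lowest degree first) and the function name `add_mod2` are assumptions made for this illustration, not part of the polynom package.

```python
def add_mod2(f, g):
    """Add two polynomials over Z2; coefficient lists are lowest degree first."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))   # pad the shorter list with zeros
    g = g + [0] * (n - len(g))
    out = [(a + b) % 2 for a, b in zip(f, g)]   # coefficient-wise XOR
    while len(out) > 1 and out[-1] == 0:        # drop trailing zero terms
        out.pop()
    return out

f = [1, 1, 0, 1]   # 1 + x + x^3
g = [0, 1, 0, 1]   # x + x^3
result = add_mod2(f, g)
print(result)      # coefficients of f(x) + g(x) over Z2
```

With the f and g from the question, the x and x^3 terms cancel and only the constant term 1 remains.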

How to solve equation with rotation and translation matrices?

I am working on a computer vision task and have this equation:
R0*c + t0 = R1*c + t1 = Ri*c + ti = ... = Rn*c + tn ,
n is about 20 (but can be more if needed)
where each pair R,t (rotation matrix and translation vector in 3D) is the result of the i-th measurement and is known, and the vector c is what I want to know.
I got a result with Ceres Solver. It's good that it can handle outliers, but I think it's overkill for this task.
So what methods should I use for these two situations:
With outliers
Without outliers
To handle outliers you can use RANSAC:
* In each iteration randomly pick i,j (a "sample") and solve c:
Ri*c + ti = Rj*c + tj
- Set Y = Ri*c + ti
* Apply to a larger population:
- Select S={k} for which ||Rk*c + tk - Y||<e
e ~ 3*RMS of errors without outliers
- Find optimal c for all k equations (with least mean square)
- Give it a "grade": size of S
* After a few iterations, use the optimal c found for the maximum "grade".
* Number of iterations: log(1-p)/log(1-w^2)
[https://en.wikipedia.org/wiki/Random_sample_consensus]
p = 0.999 (for example; it is the required probability that the result is good)
w is the assumed fraction of non-outliers (nonoutliers/n).
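The "find optimal c for all k equations" step (which is also the whole job when there are no outliers) can be sketched as a linear least-squares problem. The sketch below treats both c and the common point Y as unknowns and solves Ri*c - Y = -ti over all measurements through the normal equations. All function names are hypothetical, the rotations are synthetic test data, and the code uses plain Python only to stay self-contained. One caveat worth keeping in mind: for 3D rotations Ri - Rj is always singular (it has the relative rotation axis in its null space), so a single pair i,j does not determine c uniquely, which is why the sketch uses several measurements at once.

```python
import math

def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def mat_vec(M, v):
    return [sum(M[r][k] * v[k] for k in range(len(v))) for r in range(len(M))]

def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[r]) + [b[r]] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z

def fit_common_point(Rs, ts):
    """Least-squares c and Y such that R_i c + t_i ~= Y for every i."""
    AtA = [[0.0] * 6 for _ in range(6)]
    Atb = [0.0] * 6
    for R, t in zip(Rs, ts):
        for r in range(3):
            # One scalar equation: R[r] . c - Y[r] = -t[r]
            row = list(R[r]) + [-1.0 if k == r else 0.0 for k in range(3)]
            for i in range(6):
                Atb[i] += row[i] * (-t[r])
                for j in range(6):
                    AtA[i][j] += row[i] * row[j]
    z = solve_linear(AtA, Atb)
    return z[:3], z[3:]

def ransac_iterations(p, w, sample_size=2):
    """Iteration count log(1-p)/log(1-w^s), rounded up."""
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - w ** sample_size))

# Synthetic, noise-free measurements built from a known c and Y
c_true = [1.0, 2.0, 3.0]
Y_true = [4.0, -1.0, 0.5]
Rs = [rot_z(0.3), rot_x(0.7), rot_z(1.1), rot_x(-0.4)]
ts = [[Y_true[r] - Rc[r] for r in range(3)]
      for Rc in (mat_vec(R, c_true) for R in Rs)]
c_est, Y_est = fit_common_point(Rs, ts)
print(c_est, Y_est, ransac_iterations(0.999, 0.8))
```

Inside a RANSAC loop the same `fit_common_point` call would be applied to the inlier set S only. Forming the normal equations is the simplest option here; a QR or SVD based solver is usually preferred when the data are noisy or poorly conditioned.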

How to calculate log(sum of terms) from its component log-terms

(1) The simple version of the problem:
How to calculate log(P1+P2+...+Pn), given log(P1), log(P2), ..., log(Pn), without taking the exp of any terms to get the original Pi. I don't want to get the original Pi because they are super small and may cause numeric computer underflow.
(2) The long version of the problem:
I am using Bayes' Theorem to calculate a conditional probability P(Y|E).
P(Y|E) = P(E|Y)*P(Y) / P(E)
I have a thousand probabilities multiplying together.
P(E|Y) = P(E1|Y) * P(E2|Y) * ... * P(E1000|Y)
To avoid computer numeric underflow, I used log(p) and calculate the summation of 1000 log(p) instead of calculating the product of 1000 p.
log(P(E|Y)) = log(P(E1|Y)) + log(P(E2|Y)) + ... + log(P(E1000|Y))
However, I also need to calculate P(E), which is
P(E) = sum of P(E|Y)*P(Y)
log(P(E)) does not equal the sum of log(P(E|Y)*P(Y)). How should I get log(P(E)) without computing the terms P(E|Y)*P(Y) themselves (they are extremely small numbers) and adding them?
You can use
log(P1+P2+...+Pn) = log(P1[1 + P2/P1 + ... + Pn/P1])
= log(P1) + log(1 + P2/P1 + ... + Pn/P1)
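which works for any Pi. So factoring out maxP = max_i Pi (relabel the terms so that it is P1) results in
log(P1+P2+...+Pn) = log(maxP) + log(1+P2/maxP + ... + Pn/maxP)
where all the ratios are at most 1. Each ratio can be computed from the logs alone as Pi/maxP = exp(log(Pi) - log(maxP)), so the tiny Pi themselves are never formed.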
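The factoring trick above is often called log-sum-exp. A minimal Python sketch, assuming the inputs are natural logs and using a made-up function name `log_sum_from_logs`:

```python
import math

def log_sum_from_logs(log_terms):
    """Return log(P1 + ... + Pn) given only log(P1), ..., log(Pn)."""
    m = max(log_terms)                 # log(maxP)
    if m == float("-inf"):             # every Pi is zero
        return m
    # exp(lp - m) is the ratio Pi/maxP, always between 0 and 1
    return m + math.log(sum(math.exp(lp - m) for lp in log_terms))

logs = [-1000.0, -1001.0, -1002.0]     # each exp(log Pi) underflows to 0.0
result = log_sum_from_logs(logs)
print(result)
```

Exponentiating the inputs directly would give 0.0 for every term here, while the shifted version only exponentiates differences of logs. Ratios that are too small to represent simply contribute 0 to the sum, which is harmless because the largest term contributes exactly 1.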

How to obtain the numerical solution of these differential equations with matlab

I have differential equations derived from epidemic spreading, and I want to obtain their numerical solutions. Here are the equations:
t is an independent variable and ranges over [0,100].
The initial values are
y1 = 0.99; y2 = 0.01; y3 = 0;
At first I planned to solve these with the ode45 function in MATLAB; however, I don't know how to express the series and the combination. So I'm asking for help here.
The problem is how to express the right side of the equations as the odefun, which is a parameter in the ode45 function.
Matlab has functions to calculate binomial coefficients (number of combinations) and the finite series can be expressed just as matrix multiplication. I'll demonstrate how that works for the sum in the first equation. Note the use of the element-wise "dotted" forms of the arithmetic operators.
Calculate a row vector coefs with the constant coefficients in the sum as:
octave-3.0.0:33> a = 0:20;
octave-3.0.0:34> coefs = log2(a * 0.05 + 1) .* bincoeff(20, a);
The variables get combined into another vector:
octave-3.0.0:35> y1 = 0.99;
octave-3.0.0:36> y2 = 0.01;
octave-3.0.0:37> z = (y2 .^ a) .* ((1 - y2) .^ a) .* (y1 .^ a);
And the sum is then just evaluated as the inner product:
octave-3.0.0:38> coefs * z'
The other sums are similar.
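For comparison, the same inner product can be written in Python. This sketch only mirrors the expressions typed in the Octave session above (the original equations are not reproduced in this thread, so whether they match the model is not checked here), and the name `series_sum` is hypothetical.

```python
import math

n = 20
# coefs[a] = log2(0.05*a + 1) * C(20, a) for a = 0..20, as in the session above
coefs = [math.log2(0.05 * a + 1) * math.comb(n, a) for a in range(n + 1)]

def series_sum(y1, y2):
    """Inner product of the constant coefficients with the variable terms."""
    z = [(y2 ** a) * ((1 - y2) ** a) * (y1 ** a) for a in range(n + 1)]
    return sum(c * t for c, t in zip(coefs, z))

total = series_sum(0.99, 0.01)
print(total)
```

Because `coefs` does not depend on y1 or y2, it can be computed once outside the ODE right-hand side and reused at every time step.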
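function demo(a_in)
    X = [0;0;0];      % initial values (the question uses [0.99; 0.01; 0])
    T = 0:.1:100;     % time points at which the solution is reported
    a = a_in;         % for nested scope
    [Tout, Xout] = ode45(@myFunc, T, X);   % ode45 returns times first, then states
    function [dxdt] = myFunc( t, x )
        % nested function accesses "a"
        dxdt = 0*x + a;
        % Todo: real value of dxdt.
    end
end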
What about this, and you simply need to fill in the dxdt from your math above? It remains to be seen if the numerical roundoff matters...
Edit: there's a serious issue due to the 1=y1+y2+y3 constraint. Is that even allowed, since you have an IVP with 3 initial values given and 3 first-order ODEs? If that constraint is a natural consequence of the equations, it may not be needed.

Calculating a parabola: What am I doing wrong?

I was following this thread and copied the code into my project. Playing around with it, it turns out that it does not seem to be very precise.
Recall the formula: y = ax^2 + bx +c
Since the first given point I have is at x1 = 0, we already have c=y1 . We just need to find a and b. Using:
y2 = ax2^2 + bx2 +c
y3 = ax3^2 + bx3 +c
Solving the equations for b yields:
b = y/x - ax - cx
Now setting both equations equal to each other so b falls out
y2/x2 - ax2 - cx2 = y3/x3 - ax3 - cx3
Now solving for a gives me:
a = ( x3*(y2 - c) + x2*(y3 - c) ) / ( x2*x3*(x2 - x3) )
(is that correct?!)
And then using again b = y2/x2 - ax2 - cx2 to find b. However so far I haven't found the correct a and b coeffs. What am I doing wrong?
Edit
Ok I figured out, but had to use a CAS because I don't know how to invert symbolic matrices by hand. (Gauss algo doesn't seem to work)
Writing it down in Matrix form:
| 0     0   1 |   |a|
| x2^2  x2  1 | * |b| = Y
| x3^2  x3  1 |   |c|
Let's call the Matrix M and multiply from the left with M^(-1)
|a|
|b| = M^(-1)*Y
|c|
Then I got out of maple:
a = (-y1 * x2 + y1 * x3 - y2 * x3 + y3 * x2) / x2 / x3 / (-x2 + x3)
Which gives me the same result as the formula in the thread quoted above.
Guess I did a stupid mistake somewhere above.
Your problem is that you have three unknowns (the coefficients a, b, and c) and only one equation that I can see: y = y1 when x = 0; this gives c = y1, as you said.
Without more information, all you can do is tell how b is related to a. That's it. There isn't one solution, there are many solutions.
If you're telling me that you have two other points (x2, y2) and (x3, y3), then you should substitute all of them into the equation and solve. Start with:
y = a*x^2 + b*x + c
Now substitute the three points (x1, y1), (x2, y2), and (x3, y3):
| x1^2  x1  1 |   |a|   |y1|
| x2^2  x2  1 | * |b| = |y2|
| x3^2  x3  1 |   |c|   |y3|
This is the matrix equation that you need to invert. You can use Cramer's rule or LU decomposition. Another possibility is Wolfram Alpha:
http://www.wolframalpha.com/input/?i=inverse{{x1*x1,+x1,+1},+{x2*x2,+x2,+1},+{x3*x3,+x3,+1}}
Take the inverse that the link gives you and multiply the right hand side vector by it to solve for your three coefficients.
It's a pretty easy thing to code if you note that
det = x2*x1^2 - x3*x1^2 - x2^2*x1 + x3^2*x1 - x2*x3^2 + x2^2*x3
Divide all the entries in the matrix by this value. The numerators are pretty simple:
| x2 - x3          x3 - x1          x1 - x2         |
| x3^2 - x2^2      x1^2 - x3^2      x2^2 - x1^2     |
| x2*x3*(x2 - x3)  x1*x3*(x3 - x1)  x1*x2*(x1 - x2) |
Divide this by the determinant and you've got your inverse.
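As a quick way to check the algebra, here is a short Python sketch of the three-point case. It applies the inverse of the 3x3 matrix (cofactors divided by the determinant) to the vector (y1, y2, y3). The function name `parabola_through` is made up for this example, and the three x values are assumed to be distinct so that the determinant is nonzero.

```python
def parabola_through(p1, p2, p3):
    """Coefficients (a, b, c) of y = a*x^2 + b*x + c through three points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Determinant of [[x1^2, x1, 1], [x2^2, x2, 1], [x3^2, x3, 1]]
    det = (x1 * x1 * (x2 - x3)
           - x1 * (x2 * x2 - x3 * x3)
           + x2 * x3 * (x2 - x3))
    if det == 0:
        raise ValueError("the x values must be distinct")
    a = ((x2 - x3) * y1 + (x3 - x1) * y2 + (x1 - x2) * y3) / det
    b = ((x3 * x3 - x2 * x2) * y1
         + (x1 * x1 - x3 * x3) * y2
         + (x2 * x2 - x1 * x1) * y3) / det
    c = (x2 * x3 * (x2 - x3) * y1
         + x1 * x3 * (x3 - x1) * y2
         + x1 * x2 * (x1 - x2) * y3) / det
    return a, b, c

# Points taken from y = x^2 - 2x + 3
a, b, c = parabola_through((1.0, 2.0), (2.0, 3.0), (4.0, 11.0))
print(a, b, c)
```

With x1 = 0 the expression for `a` reduces to the Maple result quoted in the question, and `c` reduces to y1.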
If you have more points than three you need to do a least squares fit. Do the same trick of substituting all the points you have (x1, y1)...(xn, yn). You'll have more equations than unknowns. Multiply both sides by the transpose of the nx3 matrix and solve. Voila - you'll have the set of coefficients that minimize the squares of errors between the points and the function values.
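The least-squares recipe above (multiply both sides by the transpose of the nx3 matrix, then solve the resulting 3x3 system) can be sketched in Python as follows. The names `det3` and `fit_parabola` are hypothetical, and the 3x3 system is solved with Cramer's rule purely to keep the example short; a library solver would be the usual choice in practice.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_parabola(points):
    """Least-squares (a, b, c) for y = a*x^2 + b*x + c via normal equations."""
    s = [sum(x ** k for x, _ in points) for k in range(5)]      # s[k] = sum of x^k
    t = [sum(y * x ** k for x, y in points) for k in range(3)]  # t[k] = sum of y*x^k
    # A^T A and A^T y for rows [x^2, x, 1]
    M = [[s[4], s[3], s[2]],
         [s[3], s[2], s[1]],
         [s[2], s[1], s[0]]]
    rhs = [t[2], t[1], t[0]]
    d = det3(M)
    if d == 0:
        raise ValueError("need at least three distinct x values")
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column of M by the right-hand side
        Mc = [[rhs[r] if k == col else M[r][k] for k in range(3)] for r in range(3)]
        coeffs.append(det3(Mc) / d)
    return tuple(coeffs)

# Five exact samples of y = 2x^2 - 3x + 1
pts = [(-1.0, 6.0), (0.0, 1.0), (1.0, 0.0), (2.0, 3.0), (3.0, 10.0)]
a, b, c = fit_parabola(pts)
print(a, b, c)
```

When the points lie exactly on a parabola, as in this example, the fit reproduces its coefficients; with noisy data it returns the coefficients that minimize the sum of squared vertical errors.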
