I am working on a computer vision task and have this equation:
R0*c + t0 = R1*c + t1 = ... = Ri*c + ti = ... = Rn*c + tn,
n is about 20 (but can be more if needed),
where each pair R, t (a rotation matrix and a translation vector in 3D) is the result of the i-th measurement and is known, and the vector c is what I want to know.
I got a result with the Ceres solver. It's good that it can handle outliers, but I think it's overkill for this task.
So what methods should I use in these two situations:
With outliers
Without outliers
To handle outliers you can use RANSAC:
* In each iteration randomly pick three measurements i, j, m (a "sample") and solve for c:
Ri*c + ti = Rj*c + tj = Rm*c + tm
(a single pair is not enough: Ri - Rj has rank at most 2, so one pair leaves c undetermined along a line)
- Set Y = Ri*c + ti
* Apply to the larger population:
- Select S = {k} for which ||Rk*c + tk - Y|| < e
e ~ 3*RMS of the errors without outliers
- Find the optimal c for all equations in S (in the least-squares sense)
- Give it a "grade": the size of S
* After enough iterations, use the optimal c found for the maximum "grade".
* Number of iterations: log(1-p)/log(1-w^3)
[https://en.wikipedia.org/wiki/Random_sample_consensus]
p = 0.999 (for example; it is the required probability that at least one sample is free of outliers)
w is the assumed fraction of inliers (non-outliers / n).
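For the case without outliers (and for the refit step inside RANSAC), a plain linear least-squares fit is enough. The sketch below is one possible way to do it, not a reference implementation: it treats the common point Y as unknown, eliminates it through the mean of the equations, and solves the resulting 3x3 normal equations. The names fit_c, solve3 and ransac_iterations are made up for this illustration, and the sample size of 3 is an assumption about the minimal sample.

```python
import math

def solve3(M, v):
    """Solve the 3x3 system M x = v by Gaussian elimination with partial pivoting."""
    A = [list(M[r]) + [v[r]] for r in range(3)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        if abs(A[piv][col]) < 1e-12:
            raise ValueError("singular system: the rotations are not varied enough")
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, 3):
            factor = A[r][col] / A[col][col]
            for k in range(col, 4):
                A[r][k] -= factor * A[col][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = A[r][3] - sum(A[r][k] * x[k] for k in range(r + 1, 3))
        x[r] = s / A[r][r]
    return x

def fit_c(Rs, ts):
    """Least-squares c for R_i c + t_i = Y, with the unknown Y eliminated.

    For a fixed c the best Y is the mean of R_i c + t_i, so the residual of
    measurement i becomes (R_i - R_mean) c + (t_i - t_mean).
    """
    n = len(Rs)
    R_mean = [[sum(R[a][b] for R in Rs) / n for b in range(3)] for a in range(3)]
    t_mean = [sum(t[a] for t in ts) / n for a in range(3)]
    M = [[0.0] * 3 for _ in range(3)]  # sum of D_i^T D_i
    v = [0.0] * 3                      # minus sum of D_i^T d_i
    for R, t in zip(Rs, ts):
        D = [[R[a][b] - R_mean[a][b] for b in range(3)] for a in range(3)]
        d = [t[a] - t_mean[a] for a in range(3)]
        for a in range(3):
            for b in range(3):
                M[a][b] += sum(D[k][a] * D[k][b] for k in range(3))
            v[a] -= sum(D[k][a] * d[k] for k in range(3))
    return solve3(M, v)

def ransac_iterations(p, w, sample_size=3):
    """Iterations needed so that, with probability p, one sample has no outliers."""
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - w ** sample_size))
```

The fit needs at least three measurements whose relative rotations have different axes; otherwise the normal equations are singular and solve3 raises an error.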
I have written R code to solve the following equations jointly. They are closed-form expressions, but solving them jointly requires a numerical procedure.
I further divided the numerator and denominator of (B) by N to get arithmetic means.
Here is my code:
y=cbind(Sta,Zta,Ste,Zte) # combine the variables
St=as.matrix(y[,c(1,3)])
Stm=c(mean(St[,1]), mean(St[,2])); # Arithmetic means of St's
Zt=as.matrix(y[,c(2,4)])
Ztm=c(mean(Zt[,1]), mean(Zt[,2])); # Arithmetic means of Zt's
theta=c(-20, -20); # starting values for thetas
tol=c(10^-4, 10^-4);
err=c(0,0);
epscon=-0.1
while (any(abs(err) > tol) || epscon < 0) {
### A
eps = ((mean(y[,2]^2))+mean(y[,4]^2))/(-mean(y[,1]*y[,2])+theta[1]*mean(y[,2])-mean(y[,3]*y[,4])+theta[2]*mean(y[,4]))
### B
thetan = Stm + (1/eps)*Ztm
err=thetan-theta
theta=thetan
epscon=1-eps
print(c(eps, theta))
}
The iteration does not stop because the second condition of the while loop is never met: the solution has a positive epsilon. I would like to get a negative epsilon. This, I guess, requires a grid search or a range of starting values for the thetas.
Can anyone please help me code this process differently and more efficiently, or help correct my code if there are flaws in it?
Thank you
If I am right, using linearity your equations have the form
ΘA = a + b / ε
ΘB = c + d / ε
1/ε = e ΘA + f ΘB + g
This is an easy 3x3 linear system in the unknowns ΘA, ΘB and 1/ε.
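A minimal sketch of that linear solve, written in Python for illustration. Substituting the first two equations into the third gives u = 1/ε directly. The function name is hypothetical, and the mapping of the coefficients onto the R code is my reading of it: a, c are the two entries of Stm; b, d are the two entries of Ztm; and with S = mean(Zta^2) + mean(Zte^2), e = mean(Zta)/S, f = mean(Zte)/S, g = -(mean(Sta*Zta) + mean(Ste*Zte))/S.

```python
def solve_theta_eps(a, b, c, d, e, f, g):
    """Solve thetaA = a + b*u, thetaB = c + d*u, u = e*thetaA + f*thetaB + g, with u = 1/eps.

    Substituting the first two equations into the third gives
    u * (1 - e*b - f*d) = e*a + f*c + g.
    """
    denom = 1.0 - e * b - f * d
    if denom == 0.0:
        raise ValueError("the system has no unique solution")
    u = (e * a + f * c + g) / denom
    if u == 0.0:
        raise ValueError("1/eps is zero, so eps is undefined")
    theta_a = a + b * u
    theta_b = c + d * u
    return theta_a, theta_b, 1.0 / u
```

Since the system is linear, it has a single solution whenever the denominator is non-zero, so no starting values or iteration are involved.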
I'm trying to find one of the roots of a nonlinear (roughly quartic) equation.
The equation always has four roots, a pair of them close to zero, a large positive, and a large negative root. I'd like to identify either of the near zero roots, but nlsolve, even with an initial guess very close to these roots, seems to always converge on the large positive or negative root.
A plot of the function essentially looks like a constant negative value, with a (very narrow) even-ordered pole near zero, and gradually rising to cross zero at the large positive and negative roots.
Is there any way I can limit the region searched by nlsolve, or do something to make it more sensitive to the presence of this pole in my function?
EDIT:
Here's some example code reproducing the problem:
using NLsolve
function f!(F,x)
x = x[1]
F[1] = -15000 + x^4 / (x+1e-5)^2
end
# nlsolve will find the root at -122
nlsolve(f!,[0.0])
As output, I get:
Results of Nonlinear Solver Algorithm
* Algorithm: Trust-region with dogleg and autoscaling
* Starting Point: [0.0]
* Zero: [-122.47447713915808]
* Inf-norm of residuals: 0.000000
* Iterations: 15
* Convergence: true
* |x - x'| < 0.0e+00: false
* |f(x)| < 1.0e-08: true
* Function Calls (f): 16
* Jacobian Calls (df/dx): 6
We can find the exact roots in this case by transforming the objective function into a polynomial:
using PolynomialRoots
roots([-1.5e-6,-0.3,-15000,0,1])
produces
4-element Array{Complex{Float64},1}:
122.47449713915809 - 0.0im
-122.47447713915808 + 0.0im
-1.0000000813048448e-5 + 0.0im
-9.999999186951818e-6 + 0.0im
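For this particular example the polynomial also factors as a difference of squares, x^4 - k*(x+a)^2 = (x^2 - s*(x+a)) * (x^2 + s*(x+a)) with s = sqrt(k), so the four roots come from two quadratics. The Python sketch below is only a cross-check for this known form (the helper names are made up, and it does not address the general case where the form is unknown); it uses the cancellation-free variant of the quadratic formula so that the two roots near the pole keep full precision.

```python
import math

def quad_roots(b, c):
    """Real roots of x^2 + b*x + c = 0, avoiding catastrophic cancellation.

    Returns (large-magnitude root, small-magnitude root).
    """
    disc = b * b - 4.0 * c
    if disc < 0:
        raise ValueError("complex roots")
    q = -0.5 * (b + math.copysign(math.sqrt(disc), b))
    return q, c / q

def example_roots(a=1e-5, k=15000.0):
    """All four roots of -k + x^4 / (x + a)^2 = 0, sorted in ascending order."""
    s = math.sqrt(k)
    big1, small1 = quad_roots(s, s * a)    # x^2 + s*(x + a) = 0
    big2, small2 = quad_roots(-s, -s * a)  # x^2 - s*(x + a) = 0
    return sorted([big1, small1, big2, small2])
```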
I would love a way to identify the pair of roots around the pole at x = -1e-5 without knowing the exact form of the objective function.
EDIT2:
Trying out Roots.jl :
using Roots
f(x) = -15000 + x^4 / (x+1e-5)^2
find_zero(f,0.0) # finds +122... root
find_zero(f,(-1e-4,0.0)) # error, not a bracketing interval
find_zeros(f,-1e-4,0.0) # finds 0-element Array{Float64,1}
find_zeros(f,-1e-4,0.0,no_pts=6) # finds root slightly less than -1e-5
find_zeros(f,-1e-4,0.0,no_pts=10) # finds 0-element Array{Float64,1}, sensitive to value of no_pts
I can get find_zeros to work, but it's very sensitive to the no_pts argument and the exact values of the endpoints I pick. Doing a loop over no_pts and taking the first non-empty result might work, but something more deterministic to converge would be preferable.
EDIT3 :
Here's applying the tanh transformation suggested by Bogumił
using NLsolve
function f_tanh!(F,x)
x = x[1]
x = -1e-4 * (tanh(x)+1) / 2
F[1] = -15000 + x^4 / (x+1e-5)^2
end
nlsolve(f_tanh!,[100.0]) # doesn't converge
nlsolve(f_tanh!,[1e5]) # doesn't converge
using Roots
function f_tanh(x)
x = -1e-4 * (tanh(x)+1) / 2
return -15000 + x^4 / (x+1e-5)^2
end
find_zeros(f_tanh,-1e10,1e10) # 0-element Array
find_zeros(f_tanh,-1e3,1e3,no_pts=100) # 0-element Array
find_zero(f_tanh,0.0) # convergence failed
find_zero(f_tanh,0.0,max_evals=1_000_000,maxfnevals=1_000_000) # convergence failed
EDIT4 : This combination of techniques identifies at least one root somewhere around 95% of the time, which is good enough for me.
using Peaks
using Primes
using Roots
# randomize pole location
a = 1e-4*rand()
f(x) = -15000 + x^4 / (x+a)^2
# do an initial sample to find the pole location
l = 1000
minval = -1e-4
maxval = 0
m = []
sample_r = []
while l < 1e6
sample_r = range(minval,maxval,length=l)
rough_sample = f.(sample_r)
m = maxima(rough_sample)
if length(m) > 0
break
else
l *= 10
end
end
guess = sample_r[m[1]]
# functions to compress the range around the estimated pole
cube(x) = (x-guess)^3 + guess
uncube(x) = cbrt(x-guess) + guess
f_cube(x) = f(cube(x))
shift = l ÷ 1000
low = sample_r[m[1]-shift]
high = sample_r[m[1]+shift]
# search only over prime no_pts, so no samplings divide into each other
# possibly not necessary?
for i in primes(500)
z = find_zeros(f_cube,uncube(low),uncube(high),no_pts=i)
if length(z)>0
println(i)
println(cube.(z))
break
end
end
More comments could be given if you provided more information on your problem.
However, in general:
It seems that your problem is univariate, in which case you can use Roots.jl, where find_zero and find_zeros give the interface you ask for (i.e. they allow you to specify the search region).
If a problem is multivariate, you have several options for handling this in the problem specification for nlsolve (as by default it does not allow you to specify a bounding box, AFAICT). The simplest is to use a variable transformation. E.g. you can apply an ai * tanh(xi) + bi transformation, selecting ai and bi for each variable so that it is bounded to the desired interval.
The first problem you have in your definition is that the way you define f it never crosses 0 near the two roots you are looking for because Float64 does not have enough precision when you write 1e-5. You need to use greater precision of computations:
julia> using Roots
julia> f(x) = -15000 + x^4 / (x+1/big(10.0^5))^2
f (generic function with 1 method)
julia> find_zeros(f,big(-2*10^-5), big(-8*10^-6), no_pts=100)
2-element Array{BigFloat,1}:
-1.000000081649671426108658262468117284940444265467160592853348997523986352593615e-05
-9.999999183503552405580084054429938261707450678661727461293670518591720605751116e-06
and set no_pts to be sufficiently large to find intervals bracketing the roots.
I'm reading Deep Learning by Goodfellow et al. and am trying to implement gradient descent as shown in Section 4.5 Example: Linear Least Squares. This is page 92 in the hard copy of the book.
The algorithm can be viewed in detail at https://www.deeplearningbook.org/contents/numerical.html with R implementation of linear least squares on page 94.
I've tried implementing in R, and the algorithm as implemented converges on a vector, but this vector does not seem to minimize the least squares function as required. Adding epsilon to the vector in question frequently produces a "minimum" less than the minimum outputted by my program.
options(digits = 15)
dim_square = 2 ### set dimension of square matrix
# Generate random vector, random matrix, and
set.seed(1234)
A = matrix(nrow = dim_square, ncol = dim_square, byrow = T, rlnorm(dim_square ^ 2)/10)
b = rep(rnorm(1), dim_square)
# having fixed A & B, select X randomly
x = rnorm(dim_square) # vector length of dim_square--supposed to be arbitrary
f = function(x, A, b){
total_vector = A %*% x + b # this is the function that we want to minimize
total = 0.5 * sum(abs(total_vector) ^ 2) # L2 norm squared
return(total)
}
f(x,A,b)
# how close do we want to get?
epsilon = 0.1
delta = 0.01
value = (t(A) %*% A) %*% x - t(A) %*% b
L2_norm = (sum(abs(value) ^ 2)) ^ 0.5
steps = vector()
while(L2_norm > delta){
x = x - epsilon * value
value = (t(A) %*% A) %*% x - t(A) %*% b
L2_norm = (sum(abs(value) ^ 2)) ^ 0.5
print(L2_norm)
}
minimum = f(x, A, b)
minimum
minimum_minus = f(x - 0.5*epsilon, A, b)
minimum_minus # less than the minimum found by gradient descent! Why?
The algorithm appears on page 94 of the PDF at https://www.deeplearningbook.org/contents/numerical.html
I am trying to find the values of the vector x such that f(x) is minimized. However, as demonstrated by minimum and minimum_minus in my code, minimum is not the actual minimum, as it exceeds minimum_minus.
Any idea what the problem might be?
Original Problem
Finding the value of x such that the norm of Ax - b is minimized is equivalent to finding the value of x such that Ax - b = 0, or x = (A^-1)*b, provided A is invertible. This is because the L2 norm is the Euclidean norm, more commonly known as the distance formula. By definition, distance cannot be negative, making its minimum identically zero.
This algorithm, as implemented, actually comes quite close to estimating x. However, because of recursive subtraction and rounding one quickly runs into the problem of underflow, resulting in massive oscillation, as shown in the figures:
[Figure: Value of L2 norm as a function of step size]
[Figure: Above algorithm vs. solve function in R]
The second figure shows the results of A %*% x followed by A %*% min_x, with x estimated by the implemented algorithm and min_x estimated by the solve function in R.
The problem of underflow, well known to those familiar with numerical analysis, is probably best tackled by the programmers of lower-level libraries best equipped to tackle it.
To summarize, the algorithm appears to work as implemented. Important to note, however, is that not every function will have a minimum (think of a straight line), and also be aware that this algorithm should only be able to find a local, as opposed to a global minimum.
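One thing worth checking in the question's code: the objective f is written with A %*% x + b, while the update uses t(A) %*% A %*% x - t(A) %*% b, which is the gradient of 0.5 * ||Ax - b||^2 (the form used in the book). With that sign mismatch, the point the loop converges to is not a minimizer of f as coded. Below is a minimal Python sketch of the same loop with the objective and gradient kept consistent; the function names, step size, tolerance and test matrix are arbitrary choices for illustration.

```python
def grad(A, b, x):
    """Gradient of f(x) = 0.5 * ||A x - b||^2, which is A^T (A x - b)."""
    rows, cols = len(b), len(x)
    residual = [sum(A[i][j] * x[j] for j in range(cols)) - b[i] for i in range(rows)]
    return [sum(A[i][j] * residual[i] for i in range(rows)) for j in range(cols)]

def gradient_descent(A, b, x0, step=0.05, tol=1e-8, max_iter=100000):
    """Repeat x <- x - step * gradient until the gradient norm is at most tol."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(A, b, x)
        if sum(gi * gi for gi in g) ** 0.5 <= tol:
            break
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

# Small example: the exact solution of A x = b here is x = (0.2, 0.6).
A = [[2.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_min = gradient_descent(A, b, [0.0, 0.0])
```

The step size has to be smaller than 2 divided by the largest eigenvalue of A^T A, otherwise the iterates diverge.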
I'm really confused on simplifying this recurrence relation: c(n) = c(n/2) + n^2.
So I first got:
c(n/2) = c(n/4) + n^2
so
c(n) = c(n/4) + n^2 + n^2
c(n) = c(n/4) + 2n^2
c(n/4) = c(n/8) + n^2
so
c(n) = c(n/8) + 3n^2
I do sort of notice a pattern though:
2 raised to the power of whatever coefficient is in front of "n^2" gives the denominator of what n is over.
I'm not sure if that would help.
I just don't understand how I would simplify this recurrence relation and then find the theta notation of it.
EDIT: Actually I just worked it out again and I got c(n) = c(n/n) + n^2*lgn.
I think that is correct, but I'm not sure. Also, how would I find the theta notation of that? Is it just theta(n^2lgn)?
Firstly, make sure to substitute n/2 everywhere n appears in the original recurrence relation when placing c(n/2) on the lhs.
i.e.
c(n/2) = c(n/4) + (n/2)^2
Your intuition is correct, in that it is a very important part of the problem. How many times can you divide n by 2 before we reach 1?
Let's take 8 for an example
8/2 = 4
4/2 = 2
2/2 = 1
You see it's 3, which as it turns out is log2(8)
In order to prove the theta notation, it might be helpful to check out the master theorem. This is a very useful tool for proving complexity of a recurrence relation.
Using the master theorem case 3, we can see
a = 1
b = 2
logb(a) = 0
c = 2
n^2 = Omega(n^2)
k = 9/10
(n/2)^2 < k*n^2
c(n) = Theta(n^2)
The intuition as to why the answer is Theta(n^2) is that you have n^2 + (n^2)/4 + (n^2)/16 + ... + (n^2)/4^(log2(n)), which does not give us log(n) copies of n^2 but instead a geometric series of ever smaller terms, bounded by (4/3)*n^2.
Let's answer a more generic question for recurrences of the form:
r(n) = r(d(n)) + f(n). There are some restrictions for the functions, that need further discussion, e.g. if x is a fix point of d, then f(x) should be 0, otherwise there isn't any solution. In your specific case this condition is satisfied.
Rearranging the equation we get that r(n) - r(d(n)) = f(n), and we get the intuition that r(n) and r(d(n)) are both a sum of some terms, but r(n) has one more term than r(d(n)), that's why the f(n) as the difference. On the other hand, r(n) and r(d(n)) have to have the same 'form', so the number of terms in the previously mentioned sum has to be infinite.
Thus we are looking for a telescopic sum, in which the terms for r(d(n)) cancel out all but one terms for r(n):
r(n) = f(n) + a_0(n) + a_1(n) + ...
- r(d(n)) = - a_0(n) - a_1(n) - ...
This latter means that
r(d(n)) = a_0(n) + a_1(n) + ...
And just by substituting d(n) into the place of n into the equation for r(n), we get:
r(d(n)) = f(d(n)) + a_0(d(n)) + a_1(d(n)) + ...
So by choosing a_0(n) = f(d(n)), a_1(n) = a_0(d(n)) = f(d(d(n))), and so on: a_k(n) = f(d(d(...d(n)...))) (with k+1 pieces of d in each other), we get a correct solution.
Thus in general, the solution is of the form r(n) = sum{i=0..infinity}(f(d[i](n))), where d[i](n) denotes the function d(d(...d(n)...)) with i number of iterations of the d function.
For your case, d(n)=n/2 and f(n)=n^2, hence you can get the solution in closed form by using identities for geometric series. The final result is r(n)=4/3*n^2.
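A quick numerical check of that closed form, as a Python sketch. The base case c(1) = 1 is an assumption (the question does not give one), and n is restricted to powers of two so that the halving is exact. Under that assumption the unrolled sum is exactly (4*n^2 - 1)/3, and c(n)/n^2 approaches 4/3.

```python
def c(n, base=1):
    """c(n) = c(n/2) + n^2 for n a power of two, with assumed base case c(1) = base."""
    total = base
    while n > 1:
        total += n * n  # the n^2 term contributed at this level
        n //= 2         # move on to c(n/2)
    return total

# Ratio c(n) / n^2 for growing n; it tends to 4/3.
ratios = [c(2 ** k) / (2 ** k) ** 2 for k in (1, 5, 10, 20)]
```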
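Use the extended ("advanced") Master Theorem.
T(n) = a*T(n/b) + Theta(n^k * log^p(n))
where a >= 1, b > 1, k >= 0 and p is a real number.
Case 1: a > b^k
T(n) = Theta(n^(log_b(a))), where b is the base of the logarithm.
Case 2: a = b^k
1. If p > -1, then T(n) = Theta(n^(log_b(a)) * log^(p+1)(n))
2. If p = -1, then T(n) = Theta(n^(log_b(a)) * log(log(n)))
3. If p < -1, then T(n) = Theta(n^(log_b(a)))
Case 3: a < b^k
1. If p >= 0, then T(n) = Theta(n^k * log^p(n))
2. If p < 0, then T(n) = O(n^k)
For c(n) = c(n/2) + n^2 we have a = 1, b = 2, k = 2, p = 0, so a < b^k and case 3.1 gives Theta(n^2).
Constant factors are ignored because they do not change the time complexity and vary from processor to processor (e.g. n/2 = n * 1/2 is treated as n).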
(1) The simple version of the problem:
How to calculate log(P1+P2+...+Pn), given log(P1), log(P2), ..., log(Pn), without taking the exp of any terms to get the original Pi? I don't want to recover the original Pi because they are extremely small and may cause numerical underflow.
(2) The long version of the problem:
I am using Bayes' Theorem to calculate a conditional probability P(Y|E).
P(Y|E) = P(E|Y)*P(Y) / P(E)
I have a thousand probabilities multiplying together.
P(E|Y) = P(E1|Y) * P(E2|Y) * ... * P(E1000|Y)
To avoid computer numeric underflow, I used log(p) and calculate the summation of 1000 log(p) instead of calculating the product of 1000 p.
log(P(E|Y)) = log(P(E1|Y)) + log(P(E2|Y)) + ... + log(P(E1000|Y))
However, I also need to calculate P(E), which is
P(E) = sum of P(E|Y)*P(Y)
log(P(E)) does not equal the sum of log(P(E|Y)*P(Y)). How should I get log(P(E)) without computing the terms P(E|Y)*P(Y) (they are extremely small numbers) and adding them?
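You can use
log(P1+P2+...+Pn) = log(P1*[1 + P2/P1 + ... + Pn/P1])
= log(P1) + log(1 + P2/P1 + ... + Pn/P1)
which works for any Pi in place of P1. So factoring out maxP = max_i Pi results in
log(P1+P2+...+Pn) = log(maxP) + log(P1/maxP + P2/maxP + ... + Pn/maxP)
where all the ratios are at most 1 (the one for the maximum is exactly 1). Each ratio is computed from the logs as Pi/maxP = exp(log(Pi) - log(maxP)), so the tiny Pi themselves are never formed.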
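This is usually called the log-sum-exp trick. A minimal Python sketch using only the standard library (the function name is my own):

```python
import math

def log_sum_exp(log_ps):
    """Return log(P1 + ... + Pn) given the list [log(P1), ..., log(Pn)]."""
    m = max(log_ps)
    if m == -math.inf:
        # Every Pi is zero, so the sum is zero as well.
        return -math.inf
    # exp(lp - m) is at most 1; terms far below the maximum underflow harmlessly to 0.
    return m + math.log(sum(math.exp(lp - m) for lp in log_ps))
```

For the Bayes problem above, log(P(E)) would be log_sum_exp applied to the values log(P(E|Y)) + log(P(Y)), one per value of Y, and then log(P(Y|E)) = log(P(E|Y)) + log(P(Y)) - log(P(E)).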