I'm trying to find the root of the following function in R f <- x^3 + 2 * x^2 - 7 using the bisection method and the repeat function. This code results in an error:
x <- 1.3
tolerance <- 0.000001
repeat {
f <- x^3 + 2 * x^2 - 7
if (abs(f) < tolerance) break
x <- (x^3 + 2 * x^2 - 7)/2
}
Error in if (abs(f) < tolerance) break :
missing value where TRUE/FALSE needed
I've set the initial x to be 1.3, the tolerance to be 0.000001 and I know that the root lies between 1 and 2. I have already tried to substitute the last line of the code for f instead of retyping the function, but the same error appears. Can someone help me?
Based on a very brief reading of the bisection method, I think you're adjusting x incorrectly. You should be bisecting the domain of x (the x value fed into f), not the range of f.
There are many reasons your function does not do what you want, but a primary one is that you are not even using the information you have about reasonable values for x, that is values of x that are near the root of the function. You should never be setting your x value to some value of the function for which you are trying to find a root...there's no reason these two values need to be related. For example, if the root of a function is near 100, the value of the function, f, for f(100) will be some low number. Then perhaps the value of f near 0 is some very high number. So if you start with f(x=100), you'll move x to around 0, then run f(0) and get some very big number so you'll move x to that big number, and so on. You'll be bouncing around according to f's values but not in a way that has anything to do with finding the root.
Let's try this:
x <- 1.3
tolerance <- 0.001
repeat {
message(x)
f <- x^3 + 2 * x^2 - 7
if (abs(f) < tolerance) break
x <- (x^3 + 2 * x^2 - 7)/2
}
#1.3
#-0.7115
#-3.1738598729375
#-9.4123720943868
#-331.84120816267
#-18160827.4603406
#-2.99486226359293e+21
#-1.34307592742051e+64
#-1.21135573473788e+192
#Error in if (abs(f) < tolerance) break :
# missing value where TRUE/FALSE needed
As you see the x value becomes more and more negative until it's absolute value is too large for double:
x <- -1.21135573473788e+192
x^3 + 2 * x^2 - 7
#[1] NaN
You should look up the algorithm of the bisection method, because what you have implemented here is clearly not the correct algorithm.
Related
I have an expression that contains several parts. However, for simplicity, consider only the following part as MWE:
Let's assume we have the inverse of a matrix Y that I want to differentiate w.r.t. x.
Y is given as I - (x * b * t(b)), where I is the identity matrix, x is a scalar, and b is a vector.
According to The Matrix Cookbook Equ. 59, the partial derivative of an inverse is:
Normally I would use the function D from the package stats to calculate the derivatives. But that is not possible in this case, because e.g. solve to specify Y as inverse and t() is not in the table of derivatives.
What is the best workaround to circumvent this problem? Are there any other recommended packages that can handle such input?
Example that doesn't work:
f0 <- expression(solve(I - (x * b %*% t(b))))
D(f0, "x")
Example that works:
f0 <- expression(x^3)
D(f0, "x")
3 * x^2
I assume that the question is how to get an explicit expression for the derivative of the inverse of Y with respect to x. In the first section we compute it and in the second section we double check it by computing it numerically and show that the two approaches give the same result.
b and the null space of b are both eigenspaces of Y which we can readily verify by noting that Yb = (1-(b'b)x)b and if z belongs to the nullspace of b then Yz = z. This also shows that the corresponding eigenvalues are 1 - x(b'b) with multiplicity 1 and 1 with multiplicity n-1 (since the nullspace of b has that dimension).
As a result of the fact that we can expand such a matrix into the sum of each eigenvalue times the projection onto its eigenspace we can express Y as the following where bb'/b'b is the projection onto the eigenspace spanned by b and the part pre-multiplying it is the eigenvalue. The remaining terms do not involve x because they involve an eigenvalue of 1 independently of x and the nullspace of b is independent of x as well.
Y = (1-x(b'b))(bb')/(b'b) + terms not involving x
The inverse of Y is formed by taking the reciprocals of the eigenvalues so:
Yinv = 1/(1-x(b'b)) * (bb')/(b'b) + terms not involving x
and the derivative of that wrt x is:
(b'b) / (1 - x(b'b))^2 * (bb')/(b'b)
Cancelling the b'b and writing the derivative in terms of R code:
1/(1 - x*sum(b*b))^2*outer(b, b)
Double check
Using specific values for b and x we can verify it against the numeric derivative as follows:
library(numDeriv)
x <- 1
b <- 1:3
# Y inverse as a function of x
Yinv <- function(x) solve(diag(3) - x * outer(b, b))
all.equal(matrix(jacobian(Yinv, x = 1), 3),
1/(1 - x*sum(b*b))^2*outer(b, b))
## [1] TRUE
I'm trying to find one of the roots of a nonlinear (roughly quartic) equation.
The equation always has four roots, a pair of them close to zero, a large positive, and a large negative root. I'd like to identify either of the near zero roots, but nlsolve, even with an initial guess very close to these roots, seems to always converge on the large positive or negative root.
A plot of the function essentially looks like a constant negative value, with a (very narrow) even-ordered pole near zero, and gradually rising to cross zero at the large positive and negative roots.
Is there any way I can limit the region searched by nlsolve, or do something to make it more sensitive to the presence of this pole in my function?
EDIT:
Here's some example code reproducing the problem:
using NLsolve
function f!(F,x)
x = x[1]
F[1] = -15000 + x^4 / (x+1e-5)^2
end
# nlsolve will find the root at -122
nlsolve(f!,[0.0])
As output, I get:
Results of Nonlinear Solver Algorithm
* Algorithm: Trust-region with dogleg and autoscaling
* Starting Point: [0.0]
* Zero: [-122.47447713915808]
* Inf-norm of residuals: 0.000000
* Iterations: 15
* Convergence: true
* |x - x'| < 0.0e+00: false
* |f(x)| < 1.0e-08: true
* Function Calls (f): 16
* Jacobian Calls (df/dx): 6
We can find the exact roots in this case by transforming the objective function into a polynomial:
using PolynomialRoots
roots([-1.5e-6,-0.3,-15000,0,1])
produces
4-element Array{Complex{Float64},1}:
122.47449713915809 - 0.0im
-122.47447713915808 + 0.0im
-1.0000000813048448e-5 + 0.0im
-9.999999186951818e-6 + 0.0im
I would love a way to identify the pair of roots around the pole at x = -1e-5 without knowing the exact form of the objective function.
EDIT2:
Trying out Roots.jl :
using Roots
f(x) = -15000 + x^4 / (x+1e-5)^2
find_zero(f,0.0) # finds +122... root
find_zero(f,(-1e-4,0.0)) # error, not a bracketing interval
find_zeros(f,-1e-4,0.0) # finds 0-element Array{Float64,1}
find_zeros(f,-1e-4,0.0,no_pts=6) # finds root slightly less than -1e-5
find_zeros(f,-1e-4,0.0,no_pts=10) # finds 0-element Array{Float64,1}, sensitive to value of no_pts
I can get find_zeros to work, but it's very sensitive to the no_pts argument and the exact values of the endpoints I pick. Doing a loop over no_pts and taking the first non-empty result might work, but something more deterministic to converge would be preferable.
EDIT3 :
Here's applying the tanh transformation suggested by Bogumił
using NLsolve
function f_tanh!(F,x)
x = x[1]
x = -1e-4 * (tanh(x)+1) / 2
F[1] = -15000 + x^4 / (x+1e-5)^2
end
nlsolve(f_tanh!,[100.0]) # doesn't converge
nlsolve(f_tanh!,[1e5]) # doesn't converge
using Roots
function f_tanh(x)
x = -1e-4 * (tanh(x)+1) / 2
return -15000 + x^4 / (x+1e-5)^2
end
find_zeros(f_tanh,-1e10,1e10) # 0-element Array
find_zeros(f_tanh,-1e3,1e3,no_pts=100) # 0-element Array
find_zero(f_tanh,0.0) # convergence failed
find_zero(f_tanh,0.0,max_evals=1_000_000,maxfnevals=1_000_000) # convergence failed
EDIT4 : This combination of techniques identifies at least one root somewhere around 95% of the time, which is good enough for me.
using Peaks
using Primes
using Roots
# randomize pole location
a = 1e-4*rand()
f(x) = -15000 + x^4 / (x+a)^2
# do an initial sample to find the pole location
l = 1000
minval = -1e-4
maxval = 0
m = []
sample_r = []
while l < 1e6
sample_r = range(minval,maxval,length=l)
rough_sample = f.(sample_r)
m = maxima(rough_sample)
if length(m) > 0
break
else
l *= 10
end
end
guess = sample_r[m[1]]
# functions to compress the range around the estimated pole
cube(x) = (x-guess)^3 + guess
uncube(x) = cbrt(x-guess) + guess
f_cube(x) = f(cube(x))
shift = l ÷ 1000
low = sample_r[m[1]-shift]
high = sample_r[m[1]+shift]
# search only over prime no_pts, so no samplings divide into each other
# possibly not necessary?
for i in primes(500)
z = find_zeros(f_cube,uncube(low),uncube(high),no_pts=i)
if length(z)>0
println(i)
println(cube.(z))
break
end
end
More comment could be given if you provided more information on your problem.
However in general:
It seems that your problem is univariate, in which case you can use Roots.jl where find_zero and find_zeros give the interface you ask for (i.e. allowing to specify the search region)
If a problem is multivariate you have several options how to do it in the problem specification for nlsolve (as it by default does not allow to specify a bounding box AFAICT). The simplest is to use variable transformation. E.g. you can apply a ai * tanh(xi) + bi transformation selecting ai and bi for each variable so that it is bounded to the desired interval
The first problem you have in your definition is that the way you define f it never crosses 0 near the two roots you are looking for because Float64 does not have enough precision when you write 1e-5. You need to use greater precision of computations:
julia> using Roots
julia> f(x) = -15000 + x^4 / (x+1/big(10.0^5))^2
f (generic function with 1 method)
julia> find_zeros(f,big(-2*10^-5), big(-8*10^-6), no_pts=100)
2-element Array{BigFloat,1}:
-1.000000081649671426108658262468117284940444265467160592853348997523986352593615e-05
-9.999999183503552405580084054429938261707450678661727461293670518591720605751116e-06
and set no_pts to be sufficiently large to find intervals bracketing the roots.
Currently doing a homework exercise based on root finding algorithms:
A root finding algorithm can also be used to approximate certain functions. Show mathematically how the evaluation of the square root function f(x) = √x can be expressed as a root finding problem.4 Use both Newton’s method and the bisection method to approximate √x for different values of x. Compare your approximations with the R function sqrt. For which values of x does the approximation work well? Does Newton’s method or the bisection method perform better? How do the answers to these questions depend
on your starting value?
I have the following code that worked for every function so far:
newton.function <- function(f, fPrime, nmax, eps, x0){
n <- 1
x1 <- x0
result <- c()
while((n <= nmax) && (abs(f(x1)) >= eps)){
x1 <- (x0 - (f(x0)/fPrime(x0)))
result <- c(result, x1)
n <- n + 1
x0 <- x1
}
iterations <- n - 1
return(c(iterations, result[length(result)]))
}
Sqrt functions:
g <- function(x){
x^(1/2)
}
gPrime <- function(x){
1/(2*x^(1/2))
}
When I execute the function I either get Error in if (abs(f(x1)) <= eps) break :
missing value where TRUE/FALSE needed or if the x0 = 0 I get 1 and 0 as a result.
newton.function(f = g, fPrime = gPrime, nmax = 1000, eps = 1E-8, x0 = 0)
My bisection function works equally as bad, I am stuck answering the question.
From a programming point of view, your code works as expected.
If you start with 0, which is the exact solution, you get 0, fine.
Now look what happens when starting with any other number:
x1 <- (x0 - (f(x0)/fPrime(x0))) = (x0 - (x0^(1/2)/(1/(2*x^(1/2)))))
= x0-2x0 = -x0
So if you start with a positive number, x1 will be negative after the first iteration, and the next call to f(x1) returns NaN since you ask the square root of a negative number.
The error message tells you that R can not evaluate abs(f(x1)) >= eps to TRUE or FALSE, indeed, abs(f(x1)) returns NaN and the >= operator returns also NaN in this case. This is exactly what the error message tells you.
So I advice you to look at some mathematics source to check you algorithm, but the R part is ok.
I have a rather simple optimization question and while I'm fairly decent with R, optimization is something I haven't done a lot.
my.function <- function(parameters){
x <- parameters[1]
y <- parameters[2]
z <- parameters[3]
((10*x^2) - ((y/2) * (z/4)))^2
}
result <- optim(c(7,10,18),fn = my.function, method = 'L-BFGS-B',
lower = c(2,7,7),
upper = c(15,20,20))
result$par
#[1] 2.205169 19.546621 19.902243
This is a made up version of the problem I'm working on, so please forgive it if its purpose makes no sense. I have limits in place using the 'L-BFGS-B' method but I need to add a constraint and I'm unsure how to do it. My rules that I'm trying to implement are as follows:
x must be between 2 and 15
y must be between 7 and 20
z must be between 7 and 20
z <= y
It's the last one I don't know how to implement. Any help would be appreciated. Thank you.
Add a large number to the objective function if the constraint is violated, i.e. change the last line of my.function to:
((10*x^2) - ((y/2) * (z/4)))^2 + ifelse(y > z, 10^5, 0)
The result in this case is the following which does satisfy the constraint. Also, since the objective is non-negative its value cannot be less than 0 so we have achieved the minimum to numeric tolerance.
result$par
## [1] 2.223537 19.776462 20.000000
result$value
## [1] 1.256682e-11
y <- matrix(c(7, 9, -5, 0, 2, 6), ncol = 1)
try <- t(y)
tryy <- try %*% y
i <- solve(tryy)
h <- y %*% i %*% try
uniroot(as.vector(solve(((1-x) * diag(6)) + h)), c(-Inf, Inf))
Error in (1 - x) * diag(6) : non-conformable arrays
The purpose of this command uniroot(as.vector(solve(((1-x) * diag(6)) + h)), c(-Inf, Inf)) is to solve the characteristics equation det[(1-λ)I+h] = 0
where, λ=eigenvalues , I=identity matrix , h=hat matrix=y(y'y)^(-1)y'
here λ is unknown ,we have to solve for it.
I am not understanding where is the problem here? I have tried as:
as.vector(solve(6*diag(6)+h))
This is not non-conformable. But why is not working inside the uniroot function?
Your question is a bit confusing, so I have to make a couple of assumptions. If you want the eigenvalues of h, then the characteristic equation is:
det(h - I*λ) = 0
not
det[(1-λ)I+h] = 0
So I used the former.
Given the above, the short answer is: do it this way.
f <- function(lambda) det(h -lambda*diag(6))
F <- Vectorize(f)
library(rootSolve)
uniroot.all(F,c(-1000,1000),n=2000)
# [1] 0 1
# or, much more simply
eigen(h)$values
# [1] 1.000000e+00 2.220446e-16 0.000000e+00 -2.731318e-18 -6.876381e-18 -7.365903e-17
So h has 2 eigenvalues, 0 and 1. Note that the built-in function eigen(...) finds 6 roots, but 5 of them are within the machine tolerance of 0.
The question about why your code fails is a bit more involved.
First, your code:
tryy <- try %*% y
is the dot product of y with itself (so, a scalar), returned as a matrix with one element. When you "invert" that using solve(...)
i <- solve(tryy)
you simply take the reciprocal, so i is also a matrix with 1 element. I'm not sure if this is what you had in mind.
Second, uniroot(...) does not work this way. The first argument must be a function; you've passed an expression which depends on x, which in turn is undefined. You could try:
f <- function(x) det(h-x*diag(6))
uniroot(f,c(-Inf,Inf))
but this wouldn't work either because (a) uniroot(...) works on a finite interval, (b) it requires that the function f(...) have different sign at the ends of the interval, and (c) in any event it would return only one root (the smaller one).
So you could use uniroot.all(...) in package rootSolve. uniroot.all(...) also requires a function as it's first argument, but there's a twist: the function must be "vectorized". This means that if you pass a vector of lambda values, f(...) should return a vector of the same length. Fortunately in R there is an easy way to "vectorize" a given function, as in:
F <- Vectorize(f).
Even this has it's limits. uniroot.all(...) also requires a finite interval, so we have to guess what that is, and also it evaluates F on n sub-intervals. So if your interval does not contain all the roots, or if the sub-intervals are not small enough, you will not find all the roots.
Using the built-in eigen(...) function is definitely the best option.