Correct way of using fsolve in R (pracma package)

I am trying to execute the following code to solve a system of nonlinear equations using R:
library(pracma)
t <- read.csv("values-try.csv", header=F, sep=",")
x0 <- as.matrix( c(0, 1, 0, 1, 0, 1))
Gr <- 9.807
F <- function(x) {
  x1 <- x[1]; x2 <- x[2]; x3 <- x[3]; x4 <- x[4]; x5 <- x[5]; x6 <- x[6]
  as.matrix(c((t[1,1] - x1)^2/x2^2 + (t[1,2] - x3)^2/x4^2 + (t[1,3] - x5)^2/x6^2 - Gr^2,
              (t[2,1] - x1)^2/x2^2 + (t[2,2] - x3)^2/x4^2 + (t[2,3] - x5)^2/x6^2 - Gr^2,
              (t[3,1] - x1)^2/x2^2 + (t[3,2] - x3)^2/x4^2 + (t[3,3] - x5)^2/x6^2 - Gr^2,
              (t[4,1] - x1)^2/x2^2 + (t[4,2] - x3)^2/x4^2 + (t[4,3] - x5)^2/x6^2 - Gr^2,
              (t[5,1] - x1)^2/x2^2 + (t[5,2] - x3)^2/x4^2 + (t[5,3] - x5)^2/x6^2 - Gr^2,
              (t[6,1] - x1)^2/x2^2 + (t[6,2] - x3)^2/x4^2 + (t[6,3] - x5)^2/x6^2 - Gr^2),
            ncol = 1)
}
fsolve(F, x0)
I keep getting the following error:
Error in if (norm(s, "F") < tol || norm(as.matrix(ynew), "F") < tol) break :
missing value where TRUE/FALSE needed
Calls: fsolve -> broyden
Execution halted
Any hints on resolving the error?
The values-try.csv looks like this:
0.1191419256974832, -0.2806359683994824, -9.755712465258934
0.3194200198415491, 0.05681698915395282, -9.711375649078391
0.05320046522270569, 0.21071993729858585, -9.711942750423542
0.056291795600583824, 0.20746318577998762, -9.697096562782926
-0.18870002789891743, -0.03873042128470452, -9.70831243701548
0.13239301222057243, -9.790554976542873, -0.9744148062871234

Finding the common zeroes of a set of polynomials is always tricky business. I somehow doubt that the polynomials in your example have an exact common zero. Anyway, implementations like the one in fsolve will have problems with gradients or step lengths that are too small.
A better idea might be to apply a least-squares solver, i.e., minimize the sum of squares of the components of F. Function pracma::lsqnonlin will do this, squaring and summing the components of F automatically.
library(pracma)
x0 <- as.matrix( c(0, 1, 0, 1, 0, 1))
sol = lsqnonlin(F, x0, options=list(tolx=1e-12, tolg=1e-12))
sol$x
## [1] 0.1061871 32.9875053 -0.5361180
## [4] 59.1224428 68975.6833271 7034.3066917
F(sol$x)
## [,1]
## [1,] 1.838934e-07
## [2,] 9.420962e-08
## [3,] 2.146091e-05
## [4,] -2.161610e-05
## [5,] -1.225254e-07
## [6,] -3.836504e-10
Please note that you will find other minima with different starting points. You didn’t say whether you want to restrict the problem domain; I am quite certain there are no ‘near’ zeros in [-10, 10]^6.
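For example, here is a minimal sketch (my addition, reusing F, the data t and the tolerance options from above) that restarts lsqnonlin from a few random starting points in [-10, 10]^6 and compares the residual sums of squares of the solutions found:
set.seed(1)
starts <- replicate(5, runif(6, -10, 10), simplify = FALSE)
fits <- lapply(starts, function(s)
  lsqnonlin(F, as.matrix(s), options = list(tolx = 1e-12, tolg = 1e-12)))
## residual sum of squares at each solution found
sapply(fits, function(fit) sum(F(fit$x)^2))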


Is it mathematically possible to solve this problem?

x <- abs(rnorm(8))
C <- (x[1]*x[2]*x[3])^(1/3)
y <- log(x/C)
Is it mathematically possible to determine x[1:3] given you only have y? Here, x and y are always vectors of length 8. I should note that x is known for some of my dataset, which could be useful to find a solution for the other portion of the data where x is unknown. All of my code is implemented in R, so R code would be appreciated if this is solvable!
Defining f as
f <- function(x) {
  C <- (x[1]*x[2]*x[3])^(1/3)
  log(x/C)
}
we first note that if k is any scalar constant, then f(x) and f(k*x) give the same result, because C scales by the same factor k and so x/C is unchanged. Therefore, if we have y = f(x), we can't tell whether y came from x or from k*x. That is, y could have come from any scalar multiple of x; therefore, we cannot recover x from y.
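A quick numerical check of this scale invariance (my own snippet, reusing f as defined above):
set.seed(123)
x <- abs(rnorm(8))
all.equal(f(x), f(7 * x))   # TRUE: scaling x by any positive constant leaves f(x) unchanged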
Linear formulation
Although we cannot recover x we can determine x up to a scalar multiple. Define the matrix A:
ones <- rep(1, 8)
a <- c(1, 1, 1, 0, 0, 0, 0, 0)
A <- diag(8) - outer(ones, a) / 3
in which case f(x) equals:
A %*% log(x)
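A quick check that this linear formulation reproduces f (my own snippet, using f and A from above):
set.seed(123)
x <- abs(rnorm(8))
all.equal(c(A %*% log(x)), f(x))   # should be TRUE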
Inverting formula
From this formula, given y and solving for x, the value of x would equal
exp(solve(A) %*% y) ## would equal x if A were invertible
if A were invertible, but unfortunately it is not. For example, rowSums(A) is the zero vector, which shows that the columns of A are linearly dependent, and hence that A is not invertible.
all.equal(rowSums(A), rep(0, 8))
## [1] TRUE
Rank and nullspace
Note that A is a projection matrix. This follows from the fact that it is idempotent, i.e. A %*% A equals A.
all.equal(A %*% A, A)
## [1] TRUE
It also follows from the fact that its eigenvalues are all 0 and 1:
zapsmall(eigen(A)$values)
## [1] 1 1 1 1 1 1 1 0
From the eigenvalues we see that A has rank 7 (the number of nonzero eigenvalues) and the dimension of the nullspace is 1 (the number of zero eigenvalues).
Another way to see this: since A is a projection matrix, its rank equals its trace, which is 7, so its nullspace must have dimension 8 - 7 = 1.
sum(diag(A)) # rank of A
## [1] 7
On the log scale, taking scalar multiples of x corresponds to adding multiples of the all-ones vector, which spans a one-dimensional space; since the nullspace also has dimension 1, the scalar multiples of x must be the entirety of the values that map to the same y.
Key formula
Now, replacing solve in the formula marked ## above with the generalized inverse, ginv, we have this key formula for our approximation to x, given that y = f(x) for some x:
library(MASS)
exp(ginv(A) %*% y) # approximation to x accurate up to scalar multiple
or equivalently if y = f(x)
exp(y - mean(y))
While these do not give x, they do determine x up to a scalar multiple. That is, if x' is the value produced by the above expressions, then x equals k * x' for some scalar constant k.
For example, using x and y from the question:
exp(ginv(A) %*% y)
## [,1]
## [1,] 1.2321318
## [2,] 0.5060149
## [3,] 3.4266146
## [4,] 0.1550034
## [5,] 0.2842220
## [6,] 3.7703442
## [7,] 1.0132635
## [8,] 2.7810703
exp(y - mean(y)) # same
## [1] 1.2321318 0.5060149 3.4266146 0.1550034 0.2842220 3.7703442 1.0132635
## [8] 2.7810703
exp(y - mean(y))/x
## [1] 2.198368 2.198368 2.198368 2.198368 2.198368 2.198368 2.198368 2.198368
Note
Note that y - mean(y) can be written as
B <- diag(8) - outer(ones, ones) / 8
B %*% y
and if y = f(x) then y must be in the range of A so we can verify that:
all.equal(ginv(A) %*% A, B %*% A)
## [1] TRUE
It is not true that the matrix ginv(A) equals B. It is only true that they act the same on the range of A which is all that we need.
No, it's not possible. You have three unknowns. That means you need three independent pieces of information (equations) to solve for all three. y gives you only one piece of information. Knowing that the x's are positive imposes a constraint, but doesn't necessarily allow you to solve. For example:
x1 + x2 + x3 = 6
Doesn't allow you to solve. x1 = 1, x2 = 2, x3 = 3 is one solution, but so is x1 = 1, x2 = 1, x3 = 4. There are many other solutions. [Imposing your "all positive" constraint would rule out solutions such as x1 = 100, x2 = 200, x3 = -294, but in general would leave more than one remaining solution.]
x1 + x2 + x3 = 6,
x1 + x2 - x3 = 0
Constrains x3 to be 3, but allows arbitrary solutions for x1 and x2, subject to x1 + x2 = 3.
x1 + x2 + x3 = 6,
x1 + x2 - x3 = 0,
x1 - x2 + x3 = 2
Gives the unique solution x1 = 1, x2 = 2, x3 = 3.
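For completeness, a quick check of that last system in R (the names A3 and b3 are mine):
A3 <- rbind(c(1,  1,  1),
            c(1,  1, -1),
            c(1, -1,  1))
b3 <- c(6, 0, 2)
solve(A3, b3)
## [1] 1 2 3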

Definite, improper and multiple integration?

I have a question: is it possible in R to implement something like Excel's Solver ("search for a solution") feature? I need to create a script in R to solve an integral equation.
To solve the 4 integrals below manually I just need paper, a pencil and 10 minutes:
(The four integrals (improper, double, triple and definite) were given as images; their LaTeX source is listed below.)
I don't want to solve integrals like these manually; how can they be solved using R?
LaTeX formula editor code:
improper
\int_{2}^{\infty} \frac{1}{\left(x - 1\right)^{2}}\, dx
double integrals
\int_{0}^{1}\int_{-x/2}^{x/2} e^{-x - y}\, dy\, dx
triple integrals
\int_{0}^{1}\int_{-x/2}^{x/2}\int_{-y/3}^{y/3} e^{-x - y - z}\, dz\, dy\, dx
definite integrals
\int_{0}^{1} x^{2} \sin{\left (x \right )}\, dx
You can use the rSymPy package to integrate all four expressions as below:
Improper:
library(rSymPy)
x <- Var("x")
sympy("integrate(1 / (x - 1) ** 2, (x, 2, oo))")
# [1] "1"
Double:
library(rSymPy)
x <- Var("x")
y <- Var("y")
# double
sympy("integrate(exp(-x - y), (y, -x/2, x/2), (x, 0, 1))")
# [1] "4/3 + 2*exp(-3/2)/3 - 2*exp(-1/2)"
Triple:
library(rSymPy)
x <- Var("x")
y <- Var("y")
z <- Var("z")
sympy("integrate(exp(-x - y - z), (z, -y/3, y/3), (y, -x/2, x/2), (x, 0, 1))")
# [1] "-27/40 - 9*exp(-5/3)/20 + 9*exp(-4/3)/8 - 9*exp(-2/3)/4 + 9*exp(-1/3)/4"
Definite:
library(rSymPy)
x <- Var("x")
sympy("integrate(x ** 2 * sin(x), (x, 0, 1))")
# [1] "-2 + 2*sin(1) + cos(1)"

Constants in terms of ideal: "stdio:4:11:(3): error: can't promote number to ring" in Macaulay2

I am trying to demonstrate Handelman's theorem and example 1 here with Macaulay2. I cannot understand the error I get when defining the ideal for the polytope restricted by the intervals.
R=QQ[x1,x2,x3,MonomialOrder=>Lex];
I=ideal(x1-0.2,-x1+0.5,x2,-x2+1,x3-1,-x3+1)
stdio:2:11:(3): error: can't promote number to ring
What is the error about, and how should I define the constants?
For some reason, Macaulay2 only accepts the computation for a polynomial ring over RR, not QQ:
i1 : R=RR[x1,x2,x3,MonomialOrder=>Lex]
o1 = R
o1 : PolynomialRing
i2 : I=ideal(x1-0.2,-x1+0.5,x2,-x2+1,x3-1,-x3+1)
o2 = ideal (x1 - .2, - x1 + .5, x2, - x2 + 1, x3 - 1, - x3 + 1)
o2 : Ideal of R
You get the error because M2 views decimals as real numbers as opposed to rationals:
i1 : .2
o1 = .2
o1 : RR (of precision 53)
So .2 isn't in your base ring.
Use fraction notation (as opposed to decimal notation) to input your ideal and you'll be in business.
i2 : R=QQ[x1,x2,x3, MonomialOrder => Lex];
i3 : I=ideal(x1-1/5,-x1+1/2,x2,-x2+1,x3-1,-x3+1)
o3 = ideal (x1 - 1/5, - x1 + 1/2, x2, - x2 + 1, x3 - 1, - x3 + 1)
o3 : Ideal of R

Calibration of the posterior probabilities

Currently I am working on probability calibration. I use the calibration approach called the rescaling algorithm; the source is http://lem.cnrs.fr/Portals/2/actus/DP_201106.pdf (page 7).
The algorithm I wrote is:
rescaling_fun = function(x, y, z) {
  P_korg = z                                  # yhat_test_prob$BAD
  P_k_C1 = sum(as.numeric(y) - 1)/length(y)   # testset$BAD
  P_kt_C1 = sum(as.numeric(x) - 1)/length(x)  # trainset$BAD
  P_k_C0 = sum(abs(as.numeric(y) - 2))/length(y)
  P_kt_C0 = sum(abs(as.numeric(x) - 2))/length(x)
  P_new <- ((P_k_C1/P_kt_C1) * P_korg)/((P_k_C0/P_k_C0) * (1 - P_korg) + (P_k_C0/P_k_C1) * (P_korg))
  return(P_new)
}
The input values are:
1. x - train_set$BAD (actuals of `train set`)
2. y - test_set$BAD (actuals of `test set`)
3. z - yhat_test_prob$BAD (prediction on `test set`)
The problem: the resulting values are not within the range 0 to 1. Could you please help me solve this?
Your formulas to obtain the probabilities (P_k_C1, ...) need to be modified. For example, according to the paper, y is a binary variable (0, 1), and the formula sum(y - 1)/length(y) is almost certain to be negative: it converts the y values to -1 or 0 and then adds them up. I think it should be (sum(y) - 1)/length(y). Below is an example.
set.seed(1237)
y <- sample(0:1, 10, replace = T)
y
[1] 0 1 0 0 0 1 1 0 1 1
# it must be negative as it is sum(y - 1) - y is 0 or 1
sum(as.numeric(y) - 1)/length(y)
[1] -0.5
# modification
(sum(as.numeric(y)) - 1)/length(y)
[1] 0.4

Optimization with Constraints

I am working with the output from a model in which there are parameter estimates that may not follow a-priori expectations. I would like to write a function that forces these utility estimates back in line with those expectations. To do this, the function should minimize the sum of the squared deviance between the starting values and the new estimates. Since we have a-priori expectations, the optimization should be subject to the following constraints:
B0 < B1
B1 < B2
...
Bj < Bj+1
For example, the raw parameter estimates below are flipflopped for B2 and B3. The columns Delta and Delta^2 show the deviance between the original parameter estimate and the new coefficient. I am trying to minimize the column Delta^2. I've coded this up in Excel and shown how Excel's Solver would optimize this problem providing the set of constraints:
Beta  BetaRaw  Delta  Delta^2  BetaNew
B0    1.2       0      0        1.2
B1    1.3       0      0        1.3
B2    1.6      -0.2    0.04     1.4
B3    1.4       0      0        1.4
B4    2.2       0      0        2.2
After reading through ?optim and ?constrOptim, I'm not able to grok how to set this up in R. I'm sure I'm just being a bit dense, but could use some pointers in the right direction!
3/24/2012 - Added bounty since I'm not smart enough to translate the first answer.
Here's some R code that should be on the right path. Assuming that the betas start with:
betas <- c(1.2,1.3,1.6,1.4,2.2)
I want to minimize the following function such that b0 <= b1 <= b2 <= b3 <= b4
f <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  x3 <- x[3]
  x4 <- x[4]
  x5 <- x[5]
  loss <- (x1 - betas[1]) ^ 2 +
          (x2 - betas[2]) ^ 2 +
          (x3 - betas[3]) ^ 2 +
          (x4 - betas[4]) ^ 2 +
          (x5 - betas[5]) ^ 2
  return(loss)
}
To show that the function works, the loss should be zero if we pass the original betas in:
> f(betas)
[1] 0
And relatively large with some random inputs:
> set.seed(42)
> f(rnorm(5))
[1] 8.849329
And minimized at the values I was able to calculate in Excel:
> f(c(1.2,1.3,1.4,1.4,2.2))
[1] 0.04
1.
Since the objective is quadratic and the constraints linear,
you can use solve.QP.
It finds the b that minimizes
(1/2) * t(b) %*% Dmat %*% b - t(dvec) %*% b
under the constraints
t(Amat) %*% b >= bvec.
Here, we want b that minimizes
sum( (b-betas)^2 ) = sum(b^2) - 2 * sum(b*betas) + sum(betas^2)
                   = t(b) %*% b - 2 * t(b) %*% betas + sum(betas^2).
Since the last term, sum(betas^2), is constant, we can drop it,
and we can set
Dmat = diag(n)
dvec = betas.
The constraints are
b[1] <= b[2]
b[2] <= b[3]
...
b[n-1] <= b[n]
i.e.,
-b[1] + b[2] >= 0
- b[2] + b[3] >= 0
...
- b[n-1] + b[n] >= 0
so that t(Amat) is
[ -1   1              ]
[     -1   1          ]
[         -1   1      ]
[        ...          ]
[             -1   1  ]
and bvec is zero.
This leads to the following code.
# Sample data
betas <- c(1.2, 1.3, 1.6, 1.4, 2.2)
# Optimization
n <- length(betas)
Dmat <- diag(n)
dvec <- betas
Amat <- matrix(0,nr=n,nc=n-1)
Amat[cbind(1:(n-1), 1:(n-1))] <- -1
Amat[cbind(2:n, 1:(n-1))] <- 1
t(Amat) # Check that it looks as it should
bvec <- rep(0,n-1)
library(quadprog)
r <- solve.QP(Dmat, dvec, Amat, bvec)
# Check the result, graphically
plot(betas)
points(r$solution, pch=16)
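As a non-graphical cross-check of my own (not part of the original answer): projecting onto the monotone cone is exactly isotonic regression, so base R's isoreg() should reproduce the solve.QP solution, pooling the violating pair 1.6, 1.4 to their common mean 1.5.
isoreg(betas)$yf                          # expected: 1.2 1.3 1.5 1.5 2.2
all.equal(r$solution, isoreg(betas)$yf)   # should be TRUE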
2.
You can use constrOptim in the same way (the objective function can be arbitrary, but the constraints have to be linear).
3.
More generally, you can use optim if you reparametrize the problem into an unconstrained optimization problem, for instance
b[1] = exp(x[1])
b[2] = b[1] + exp(x[2])
...
b[n] = b[n-1] + exp(x[n]).
There are a few examples
here
or there.
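A minimal sketch of approach 3 (my own illustration, not taken from the linked examples): the exp() increments guarantee an increasing sequence, and plain optim() then gets arbitrarily close to the constrained optimum.
betas <- c(1.2, 1.3, 1.6, 1.4, 2.2)
to_b <- function(p) cumsum(exp(p))                  # b[1] = exp(p[1]), b[k] = b[k-1] + exp(p[k])
obj  <- function(p) sum((to_b(p) - betas)^2)
p0   <- log(pmax(c(betas[1], diff(betas)), 1e-3))   # rough starting values
fit  <- optim(p0, obj, method = "BFGS")
round(to_b(fit$par), 3)   # close to the solve.QP solution; exact ties are only approached in the limit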
Alright, this is starting to take form, but still has some bugs. Based on the conversation in chat with #Joran, it seems I can include a conditional that will set the loss function to an arbitrarily large value if the values are not in order. This seems to work IF the discrepancy occurs between the first two coefficients, but not thereafter. I'm having a hard time parsing out why that would be the case.
Function to minimize:
f <- function(x, x0) {
  x1 <- x[1]
  x2 <- x[2]
  x3 <- x[3]
  x4 <- x[4]
  x5 <- x[5]
  loss <- (x1 - x0[1]) ^ 2 +
          (x2 - x0[2]) ^ 2 +
          (x3 - x0[3]) ^ 2 +
          (x4 - x0[4]) ^ 2 +
          (x5 - x0[5]) ^ 2
  # Make sure the coefficients are in order
  if (any(diff(c(x1, x2, x3, x4, x5)) > 0)) loss <- 10000000
  return(loss)
}
Working example (sort of, it seems the loss would be minimized if b0 = 1.24?):
> betas <- c(1.22, 1.24, 1.18, 1.12, 1.10)
> optim(betas, f, x0 = betas)$par
[1] 1.282 1.240 1.180 1.120 1.100
Non-working example (note that the third element is still larger than the second):
> betas <- c(1.20, 1.15, 1.18, 1.12, 1.10)
> optim(betas, f, x0 = betas)$par
[1] 1.20 1.15 1.18 1.12 1.10
