Is it possible to solve an algebraic equation in R?

I want to find the solution of:
-x^3+6*x^2+51*x+44=0
but with R. Is it possible?
I found the package Ryacas, but nobody seems to be able to make it work.
May sound trivial, but I'm not able to find an easy way to do this...
Do you have an alternative?
Thanks guys!

You can use the polynom package:
library(polynom)
p <- polynomial(c(44,51,6,-1))
# 44 + 51*x + 6*x^2 - x^3
solve(p)
# [1] -4 -1 11
But you can simply use the function polyroot from the base package:
polyroot(c(44,51,6,-1))
# [1] -1+0i -4+0i 11+0i
You can keep just the real parts with Re:
Re(polyroot(c(44,51,6,-1)))
# [1] -1 -4 11
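As a quick sanity check (a sketch, not part of the original answer), you can plug the real roots back into the polynomial and confirm that the results are numerically zero:
roots <- Re(polyroot(c(44, 51, 6, -1)))
sapply(roots, function(x) -x^3 + 6*x^2 + 51*x + 44)
# should all be ~0, up to floating-point error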

Here we solve for the roots using the relationship between a matrix and its characteristic polynomial.
Given the polynomial a0 + a1*x^1 + a2*x^2 + x^3, define the matrix:
0 0 -a0
1 0 -a1
0 1 -a2
The eigenvalues of this matrix are the roots of the polynomial.
Substituting y = -x into your polynomial equation gives
y^3 + 6*y^2 - 51*y + 44 = 0
which leads to this example:
> z <- matrix(c(0,1,0,0,0,1,-44,51,-6),3,3)
> z
[,1] [,2] [,3]
[1,] 0 0 -44
[2,] 1 0 51
[3,] 0 1 -6
> eigen(z)
$values
[1] -11 4 1
$vectors
[,1] [,2] [,3]
[1,] 0.6172134 0.73827166 0.98733164
[2,] -0.7715167 -0.67115606 -0.15707549
[3,] 0.1543033 -0.06711561 -0.02243936
Or, since we substituted y = -x, negate z to recover the roots of the original equation:
> eigen(-z)$values
[1] 11 -4 -1
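As a side note (a sketch of the same idea, not from the original answer), the substitution can be avoided by normalizing the leading coefficient instead: multiplying the equation by -1 gives the monic polynomial x^3 - 6*x^2 - 51*x - 44 = 0, whose companion matrix yields the roots in x directly:
z2 <- matrix(c(0, 1, 0,
               0, 0, 1,
               44, 51, 6), 3, 3)  # here a0 = -44, a1 = -51, a2 = -6
eigen(z2)$values
# expected: 11 -4 -1 (possibly in a different order)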
See: http://www-math.mit.edu/~edelman/publications/polynomial_roots.pdf

I just stumbled upon this question and I am not sure if anything inherently changed around the Ryacas package, but it seems to work great in 2020. Here is a helpful vignette to get started: https://cran.r-project.org/web/packages/Ryacas/vignettes/getting-started.html
Following the vignette, things seem to work as expected when I run the code:
library(Ryacas)
# initialize equation:
eq <- "-x^3+6*x^2+51*x+44"
# simplify the equation:
library(glue)
yac_str(glue("Simplify({eq})"))
[1] "6*x^2-x^3+51*x+44"
# factor:
yac_str(glue("Factor({eq})"))
[1] "(-1)*(x-11)*(x+4)*(x+1)"
You can evaluate the expression like this, plugging in whatever values you like for x:
# evaluate
evaluate(eq,list(x=c(0,1,10,100,-100)))
[[1]]
$src
[1] "-x^3+6*x^2+51*x+44"
attr(,"class")
[1] "source"
[[2]]
[1] "[1] 44 100 154 -934856 1054944\n"
Here you can see the results where x=0 produced an answer of 44, x=1 produced an answer of 100, etc...
If you evaluate the simplified or factored versions instead, you of course end up with exactly the same results:
evaluate(yac_str(glue("Simplify({eq})")),list(x=c(0,1,10,100,-100)))
[[1]]
$src
[1] "6*x^2-x^3+51*x+44"
attr(,"class")
[1] "source"
[[2]]
[1] "[1] 44 100 154 -934856 1054944\n"
Notice the formula changed in the $src output, but we get the same results.
Here's the factored one too:
evaluate(yac_str(glue("Factor({eq})")),list(x=c(0,1,10,100,-100)))
[[1]]
$src
[1] "(-1)*(x-11)*(x+4)*(x+1)"
attr(,"class")
[1] "source"
[[2]]
[1] "[1] 44 100 154 -934856 1054944\n"
The only real difference between what I outlined here and what's outlined in the vignette is the actual formula, and the fact that I used library(glue) instead of paste0, which is also a fair option.
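For reference, the paste0 equivalent of the glue calls above would look like this, and gives the same output as the Factor call shown earlier:
yac_str(paste0("Factor(", eq, ")"))
# [1] "(-1)*(x-11)*(x+4)*(x+1)"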

Related

integrate() gives totally wrong number

integrate() gives horribly wrong answer:
integrate(function (x) dnorm(x, -5, 0.07), -Inf, Inf, subdivisions = 10000L)
# 2.127372e-23 with absolute error < 3.8e-23
The return value should obviously be 1 (a normal distribution integrates to 1), but integrate() returns a ridiculously small number, with wrong error reporting and no warning...
Any ideas?
It seems the default integrate() is horribly buggy... and I just found this by chance! Is there any reliable R package for numerical integration?
EDIT: I tried the package pracma and I see the same problem:
require(pracma)
integral(function (x) dnorm(x, -5, 0.07), -Inf, Inf)
# For infinite domains Gauss integration is applied!
# [1] 0
EDIT: Hmm... digging deeper, it seems that it has trouble finding the very narrow region where the function is numerically > 0. When I set the limits to quantiles very close to 0 and 1, it starts to work:
integral(function (x) dnorm(x, -5, 0.07), qnorm(1e-10, -5, 0.07), qnorm(1 - 1e-10, -5, 0.07))
But anyway, this is quite a horrible gotcha... I wonder if there is any remedy for it.
From the online documentation: "Like all numerical integration routines, these evaluate the function on a finite set of points. If the function is approximately constant (in particular, zero) over nearly all its range it is possible that the result and error estimate may be seriously wrong."
I take this to mean "caveat emptor". I notice that in your example the absolute error is greater than the value of the integral. Given that you know f(x) > 0 for all x, at least it's giving you the chance to spot that something has gone wrong. It's down to you to take the opportunity.
integrate( function(x) dnorm(x, -5, 0.07), -20, 10, subdivisions=1000L)
Gives
1 with absolute error < 9.8e-07
The warning in the online doc says to me that, given your apparent definition of buggy, the answer to your question is "no, there is no reliable numerical integration method. Not in R or any other language". No numerical integration technique should be used blindly. The user needs to check that their inputs are sensible and that the output is reasonable. It's no good believing an answer just because the computer gave it to you.
See also this post.
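A related mitigation (a sketch, not from the answers above): keep the infinite limits but split the range at a point where you know the integrand is large, so that the peak sits on the boundary of each sub-interval instead of being missed in the middle:
f <- function(x) dnorm(x, -5, 0.07)
integrate(f, -Inf, -5)$value + integrate(f, -5, Inf)$value
# expected to be very close to 1 (each half should contribute ~0.5)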
Expanding a little further on @r2evans's and @Limey's comments:
@Limey: for very general problems like this, there is simply no way to guarantee a generic solution.
One way to solve such a problem is to use more knowledge of the properties of the integrand (@r2evans's answer); the answer referenced by @Limey goes into detail for a different problem.
One "gotcha" that you may not have thought of is that trying out a bunch of generic methods, tuning settings, etc. may mislead you into concluding that some settings/methods are generically better than the first one you tried that failed to get the right answer. (Methods that work may work better because they're generically better, but trying them on one example doesn't prove it!)
As an example, the description of pcubature() (in ?cubature::pcubature) says:
This algorithm is often superior to h-adaptive integration for
smooth integrands in a few (<=3) dimensions, but is a poor choice
in higher dimensions or for non-smooth integrands.
However, recall that pcubature() happens to fail for your example, which is a smooth low-dimensional case, exactly where pcubature() is supposed to perform better. This suggests that it may be just luck that hcubature() works and pcubature() doesn't in this case.
An illustration of how sensitive the results can be to parameters (lower/upper limits in this case):
library(emdbook)
cc <- curve3d(integrate(dnorm, mean = -5, sd = 0.07,
                        lower = x, upper = y, subdivisions = 1000L)$value,
              xlim = c(-30, -10), ylim = c(0, 30), n = c(61, 61),
              sys3d = "image", col = c("black", "white"),
              xlab = "lower", ylab = "upper")
White squares are successful (integral=1), black squares are bad (integral=0).
Try the cubature package.
library(cubature)
hcubature(function (x) dnorm(x, -5, 0.07), -Inf, Inf)
#$integral
#[1] 1
#
#$error
#[1] 9.963875e-06
#
#$functionEvaluations
#[1] 405
#
#$returnCode
#[1] 0
Note that function pcubature in the same package also returns 0.
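For completeness, the failing call looks like this (a sketch, assuming the same calling convention as hcubature above):
pcubature(function (x) dnorm(x, -5, 0.07), -Inf, Inf)$integral
# returns 0 here, as noted above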
From vignette("cubature"), section Introduction. My emphasis.
This R cubature package exposes both the hcubature and pcubature
routines of the underlying C cubature library, including the
vectorized interfaces.
Per the documentation, use of pcubature is advisable only for smooth
integrands in dimensions up to three at most. In fact, the pcubature
routines perform significantly worse than the vectorized hcubature
in inappropriate cases. So when in doubt, you are better off using
hcubature.
Since in this case the integrand is the normal density, a smooth and 1-dimensional function, there would be reasons to prefer pcubature. But it doesn't give the right result. The vignette concludes the following.
Vectorized hcubature seems to be a good starting point.
For smooth integrands in low dimensions (≤3), pcubature might be worth trying out. Experiment before using in a production package.
Interesting workaround: not too surprisingly, integrate does well when the values sampled (on (-Inf,Inf), no less) are closer to the "center" of the data. You can mitigate the problem by using your function but hinting at a center:
Without adjustment:
t(sapply(-10:10, function(i) integrate(function (x) dnorm(x, i, 0.07), -Inf, Inf, subdivisions = 10000L)))
# value abs.error subdivisions message call
# [1,] 0 0 1 "OK" Expression
# [2,] 1 4.611403e-05 10 "OK" Expression
# [3,] 6.619713e-19 1.212066e-18 2 "OK" Expression
# [4,] 7.344551e-71 0 2 "OK" Expression
# [5,] 3.389557e-06 6.086176e-06 3 "OK" Expression
# [6,] 2.127372e-23 3.849798e-23 2 "OK" Expression
# [7,] 1 3.483439e-05 8 "OK" Expression
# [8,] 1 6.338078e-07 11 "OK" Expression
# [9,] 1 3.408389e-06 7 "OK" Expression
# [10,] 1 6.414833e-07 8 "OK" Expression
# [11,] 1 7.578907e-06 3 "OK" Expression
# [12,] 1 6.414833e-07 8 "OK" Expression
# [13,] 1 3.408389e-06 7 "OK" Expression
# [14,] 1 6.338078e-07 11 "OK" Expression
# [15,] 1 3.483439e-05 8 "OK" Expression
# [16,] 2.127372e-23 3.849798e-23 2 "OK" Expression
# [17,] 3.389557e-06 6.086176e-06 3 "OK" Expression
# [18,] 7.344551e-71 0 2 "OK" Expression
# [19,] 6.619713e-19 1.212066e-18 2 "OK" Expression
# [20,] 1 4.611403e-05 10 "OK" Expression
# [21,] 0 0 1 "OK" Expression
If we add a "centering" hint, though, we get more consistent results:
t(sapply(-10:10, function(i) integrate(function (x, offset) dnorm(x + offset, i, 0.07), -Inf, Inf, subdivisions = 10000L, offset = i)))
# value abs.error subdivisions message call
# [1,] 1 7.578907e-06 3 "OK" Expression
# [2,] 1 7.578907e-06 3 "OK" Expression
# [3,] 1 7.578907e-06 3 "OK" Expression
# [4,] 1 7.578907e-06 3 "OK" Expression
# [5,] 1 7.578907e-06 3 "OK" Expression
# [6,] 1 7.578907e-06 3 "OK" Expression
# [7,] 1 7.578907e-06 3 "OK" Expression
# [8,] 1 7.578907e-06 3 "OK" Expression
# [9,] 1 7.578907e-06 3 "OK" Expression
# [10,] 1 7.578907e-06 3 "OK" Expression
# [11,] 1 7.578907e-06 3 "OK" Expression
# [12,] 1 7.578907e-06 3 "OK" Expression
# [13,] 1 7.578907e-06 3 "OK" Expression
# [14,] 1 7.578907e-06 3 "OK" Expression
# [15,] 1 7.578907e-06 3 "OK" Expression
# [16,] 1 7.578907e-06 3 "OK" Expression
# [17,] 1 7.578907e-06 3 "OK" Expression
# [18,] 1 7.578907e-06 3 "OK" Expression
# [19,] 1 7.578907e-06 3 "OK" Expression
# [20,] 1 7.578907e-06 3 "OK" Expression
# [21,] 1 7.578907e-06 3 "OK" Expression
I recognize this mitigation relies on heuristics, presumes you know something about your distribution before integrating, and is not a perfect "generic" solution. Just offering another perspective.

Solving a non-linear equation in R

I need to solve the following function for P and K:
I would like to find all (or a handful) of P's and K's that satisfy the equation.
I have tried using R's nleqslv package, but something is going wrong.
library(nleqslv)
MPK <- function(X){
  Y = numeric(2)
  Y[1] = 4.34783*((.3*(X[1]^.23)+.7*(X[2]^.23))^3.34783)*0.069*(X[1]^-.77)
  Y[2] = 0.3*((1-X[2])/(1-X[1]))^.7
  Y
}
# solve for K, P
xstart = c(.5,.5)
nleqslv(x = xstart, fn = MPK)
What I get is the following:
$x
[1] 1.214578e+10 1.006411e+00
$fvec
[1] 5.531138e-03 7.636165e-10
$termcd
[1] 5
$message
[1] "Jacobian is too ill-conditioned (1/condition=8.9e-013) (see
allowSingular option)"
$scalex
[1] 1 1
$nfcnt
[1] 142
$njcnt
[1] 7
$iter
[1] 70
How do I specify that I need solutions where Y[1] is equal to Y[2]?

Computing the pseudo inverse of a matrix in R

I am trying to compute the pseudo inverse of a matrix, call it M, which might look the following way:
M=matrix(c(-1,-1,1,0,0,1),nrow=2,ncol=3)
What I need is the left inverse of this matrix, such that:
M_inv_l M = I
Using the MASS package, I am able to find the right inverse:
M_inv_r=ginv(M)
Where M M_inv_r = I.
Is there a way to compute the left inverse instead of the right inverse? I haven't been able to find an answer on the forum.
Thanks
A matrix of full row rank, like M here, has a right inverse, which ginv supplies:
> M %*% ginv(M)
[,1] [,2]
[1,] 1.000000e+00 -2.220446e-16
[2,] 1.110223e-16 1.000000e+00
A matrix of full column rank, like t(M), has a left inverse:
> ginv(t(M)) %*% t(M)
[,1] [,2]
[1,] 1.000000e+00 0
[2,] -5.551115e-17 1
See the Wikipedia article on generalized inverses.
I don't think that this is possible in general - you're trying to solve 9 linear equations with only 6 unknowns. Specifically, look at the top row of the would-be inverse:
-1*Minv[1,1] + -1*Minv[1,2] = 1                       [1]
 1*Minv[1,1] +  0*Minv[1,2] = 0   =>  Minv[1,1] = 0   [2]
 0*Minv[1,1] +  1*Minv[1,2] = 0   =>  Minv[1,2] = 0   [3]
It should be clear that substituting [2] and [3] into [1] produces a contradiction.
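A quick numeric confirmation of this argument (a sketch using MASS::ginv): ginv(M) %*% M is only a rank-2 projection onto the row space of M, not the 3x3 identity, so no 3x2 matrix can do better.
library(MASS)
M <- matrix(c(-1,-1,1,0,0,1), nrow=2, ncol=3)
round(ginv(M) %*% M, 10)
# a symmetric rank-2 matrix (entries 2/3 and -1/3), visibly different from diag(3)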

Which results should I trust between command “Hessian” and “numericHessian”?

I am trying to get the Hessian matrix from my own data, and I have two results -
using the function hessian from library(numDeriv)
using the function numericHessian from library(maxLik)
The result from hessian is very, very small relative to the result from numericHessian.
In this case, which results should I trust?
Specifically, the data I used ranged from 350000 to 1100000, arranged in a 9x2 matrix, for a total of 18 data values.
I used a sort of standard-deviation formula; the result from numericHessian was a 2x2 matrix with entries ranging from 230 to 466, whereas the result from hessian ranged from -3.42e-18 to 1.34e-17, which is much smaller.
Which one do you think is the correct calculation for this sort of standard deviation?
The code is as follows:
data=read.table("C:/file.txt", header=T);
data <- as.matrix(data);
library(plyr)
library(MASS)
w1 = tail(data/(rowSums(data)),1)
w2 = t(w1)
f <- function(x){
  w1 = tail(x/(rowSums(x)), 1)
  w2 = t(w1)
  r = ((w1 %*% cov(cbind(x)) %*% w2)^(1/2))
  return(r)
}
library(maxLik);
numericHessian(f, t0=rbind(data[1,1], data[1,2]))
library(numDeriv);
hessian(f, rbind(data[1,1], data[1,2]), method="Richardson")
The file.txt is the following:
1 2
137 201
122 342
142 111
171 126
134 123
823 876
634 135
541 214
423 142
The result from the "numericHessian" is:
[,1] [,2]
[1,] 0.007105427 0.007105427
[2,] 0.007105427 0.000000000
Then, the result from the "Hessian" is:
[,1] [,2]
[1,] -3.217880e-15 -1.957243e-16
[2,] -1.957243e-16 1.334057e-16
Thank you very much in advance.
You have not given a reproducible example, but I'll try anyway.
library(bbmle)
x <- 0:10
y <- c(26, 17, 13, 12, 20, 5, 9, 8, 5, 4, 8)
d <- data.frame(x,y)
LL <- function(ymax=15, xhalf=6)
-sum(stats::dpois(y, lambda=ymax/(1+x/xhalf), log=TRUE))
fit <- mle2(LL)
cc <- coef(fit)
Here are the finite-difference estimates of the Hessians (matrices of second derivatives) of the negative log-likelihood function at the MLE: inverting these matrices gives an estimate of the variance-covariance matrices of the parameters.
library(numDeriv)
hessian(LL,cc)
## [,1] [,2]
## [1,] 1.296717e-01 -1.185789e-15
## [2,] -1.185789e-15 4.922087e+00
library(maxLik)
numericHessian(LL, t0=cc)
## [,1] [,2]
## [1,] 0.1278977 0.000000
## [2,] 0.0000000 4.916956
So for this relatively trivial example, numDeriv::hessian and maxLik::numericHessian give very similar results. So there must be something you haven't shown us, or something special about the numerics of your problem. In order to proceed further, we need a reproducible example please ...
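(Side note, a sketch that is not part of the original answer: to connect this back to the variance-covariance point above, you can invert the finite-difference Hessian and compare it with bbmle's own estimate; the two should be close.)
solve(numDeriv::hessian(LL, cc))  # invert the finite-difference Hessian
vcov(fit)                         # bbmle's estimate for the mle2 fit above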
dat <- matrix(c(137,201,122,342,142,111,
                171,126,134,123,823,876,
                634,135,541,214,423,142),
              byrow=TRUE, ncol=2)
f <- function(x){
  w1 <- tail(x/(rowSums(x)), 1)
  sqrt(w1 %*% cov(cbind(x)) %*% t(w1))
}
p <- t(dat[1,1:2,drop=FALSE])
f(p) ## 45.25483
numDeriv::hessian(f,p)
## [,1] [,2]
## [1,] -3.217880e-15 -1.957243e-16
## [2,] -1.957243e-16 1.334057e-16
maxLik::numericHessian(f,t0=p)
## [,1] [,2]
## [1,] 0.007105427 0.007105427
## [2,] 0.007105427 0.000000000
OK, these clearly disagree. I'm not sure why, but in this particular case we can analyze what you're doing and see which one is right:
since your input matrix is a single column, x/rowSums(x) is a vector of ones, so the last element (w1 <- tail(..., 1)) is just 1.
so your expression reduces to sqrt(cov(cbind(x))). Again, since x is a one-column matrix, cov() is just the variance, and sqrt(cov(.)) is just the standard deviation (up to scaling, the norm of the centered vector).
the variance is a quadratic function of an element's deviation from the mean, so the standard deviation is more or less linear in the deviation from the mean (except at zero), and we would expect the second derivatives to be zero. So it looks like numDeriv::hessian is giving the right answer.
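A quick numeric check of that reduction (a sketch): for the single column p = c(137, 201), f(p) is just the sample standard deviation of those two numbers.
sd(c(137, 201)) ## 45.25483, same as f(p) above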
We can also confirm this by increasing eps for numericHessian:
maxLik::numericHessian(f,t0=p,eps=1e-3)
## [,1] [,2]
## [1,] 0 0.000000e+00
## [2,] 0 -7.105427e-09
The bottom line is that numDeriv uses a more accurate (but slower) method, but you can get reasonable answers from numericHessian if you're careful.

Can't get a positive definite variance matrix when eigenvalues are very small

To run a canonical correspondence analysis (cca, package ade4) I need a positive definite variance matrix (which in theory is always the case), but:
a <- matrix(c(2,59,4,7,10,0,7,0,0,0,475,18714,4070,97,298,0,1,0,17,7,4,1,4,18,36), nrow=5)
> a
[,1] [,2] [,3] [,4] [,5]
[1,] 2 0 475 0 4
[2,] 59 7 18714 1 1
[3,] 4 0 4070 0 4
[4,] 7 0 97 17 18
[5,] 10 0 298 7 36
> eigen(var(a))
$values
[1] 6.380066e+07 1.973658e+02 3.551492e+01 1.033096e+01
[5] -1.377693e-09
The last eigenvalue is -1.377693e-09, which is < 0, but the theoretical value is > 0.
I can't run the function if one of the eigenvalues is < 0.
I really don't know how to fix this without changing the code of the function cca().
Thanks for help
You can change the input, just a little bit, to make the matrix positive definite.
If you have the variance matrix, you can truncate the eigenvalues:
correct_variance <- function(V, minimum_eigenvalue = 0) {
  V <- ( V + t(V) ) / 2
  e <- eigen(V)
  e$vectors %*% diag(pmax(minimum_eigenvalue, e$values)) %*% t(e$vectors)
}
v <- correct_variance( var(a) )
eigen(v)$values
# [1] 6.380066e+07 1.973658e+02 3.551492e+01 1.033096e+01 1.326768e-08
Using the singular value decomposition, you can do the same thing directly with a.
truncate_singular_values <- function(a, minimum = 0) {
  s <- svd(a)
  s$u %*% diag( ifelse( s$d > minimum, s$d, minimum ) ) %*% t(s$v)
}
svd(a)$d
# [1] 1.916001e+04 4.435562e+01 1.196984e+01 8.822299e+00 1.035624e-01
eigen(var( truncate_singular_values(a,.2) ))$values
# [1] 6.380066e+07 1.973680e+02 3.551494e+01 1.033452e+01 6.079487e-09
However, this changes your matrix a by up to 0.1, which is a lot.
(I suspect the change is that large because the matrix a is square: as a result,
one of the eigenvalues of var(a) is exactly 0.)
b <- truncate_singular_values(a,.2)
max( abs(b-a) )
# [1] 0.09410187
We can actually do better simply by adding some noise.
b <- a + 1e-6*runif(length(a),-1,1) # Repeat if needed
eigen(var(b))$values
# [1] 6.380066e+07 1.973658e+02 3.551492e+01 1.033096e+01 2.492604e-09
Here are two approaches:
V <- var(a)
# 1: replace V with the nearest positive definite matrix
library(Matrix)
nearPD(V)$mat
# 2: perturb the diagonal
eps <- 0.01
V + eps * diag(ncol(V))
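Either way, a quick sanity check before handing the matrix to cca() (a sketch using the objects defined above): confirm that the smallest eigenvalue is no longer negative.
min(eigen(as.matrix(nearPD(V)$mat))$values)  # should be >= 0 (up to nearPD's tolerance)
min(eigen(V + eps * diag(ncol(V)))$values)   # strictly positive after the perturbation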
