How to apply a formula to a vector in R?

Not for the first time, I guess the answer is quite simple. But searching for R solutions is regularly hard work, and after two hours it's probably time to ask someone...
I am working with a non-linear formula (this is only the first pass at it; it will actually become non-linear soon) and, to test my initial values, I would like to simply calculate the values over a series of x values.
Here is some code:
x <- c(1,2,3,4,5,6,7,8,9,10,11,12) #etc
y <- c(NA,332,248,234,84,56,26,24,27,33,37,25) #etc
# This is my formula I shall soon expand
fEst <- y ~ 1 / (x / a + 1) * b
# Initial value
a <- 800
# Initial value based on inverted formula and second measure
b <- y[2] * (x[2] / a + 1)
# Can I use my formula fEst to do this step?
p <- 1 / (x / a + 1) * b
The point is that I am working on the formula, and it seems strange to make each change twice...
What I found was a package nls2 where something like this was possible, and a function apply.a.formula which seems to be an element from another package. But as this is a very basic use of a function, I guess that the R base package already has the appropriate functions. Just ... where?
Thanks!

I came across this thread whilst looking up the avenues you'd tried and the solution posted by Gabor. Note that apply.a.formula() is a made-up function name that the OP in that thread was hoping to find a real function for.
Using the example that Gabor provided in the thread, this is a solution using the nls2 package:
## your data
x <- c(1,2,3,4,5,6,7,8,9,10,11,12) #etc
y <- c(NA,332,248,234,84,56,26,24,27,33,37,25) #etc
# This is my formula I shall soon expand
fEst <- y ~ 1 / (x / a + 1) * b
# Initial value
a <- 800
# Initial value based on inverted formula and second measure
b <- y[2] * (x[2] / a + 1)
## install.packages("nls2", dependencies = TRUE) if not installed
require(nls2)
fitted(nls2(fEst, start = c(a = a, b = b), alg = "brute"))
The last line gives:
R> fitted(nls2(fEst, start = c(a = a, b = b), alg = "brute"))
[1] 332.4145 332.0000 331.5866 331.1741 330.7627 330.3524 329.9430 329.5347
[9] 329.1273 328.7210 328.3157 327.9113
attr(,"label")
[1] "Fitted values"
which is essentially the same as what 1 / (x / a + 1) * b would give:
R> 1 / (x / a + 1) * b
[1] 332.4145 332.0000 331.5866 331.1741 330.7627 330.3524 329.9430 329.5347
[9] 329.1273 328.7210 328.3157 327.9113
From the comments, Carl Witthoft notes that if you want to generalise equations like 1 / (x / a + 1) * b, then a function might be a useful way of encapsulating the operation without typing out 1 / (x / a + 1) * b every time. For example:
myeqn <- function(a, b, x) { 1 / (x / a + 1) * b }
R> myeqn(a, b, x)
[1] 332.4145 332.0000 331.5866 331.1741 330.7627 330.3524 329.9430 329.5347
[9] 329.1273 328.7210 328.3157 327.9113
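For completeness, this is also possible in base R without any package: a formula is a language object whose right-hand side is its third element, so eval() can evaluate that expression against the data. A minimal sketch using the objects defined above:
# Evaluate the right-hand side of fEst with x, a and b supplied as a list
p <- eval(fEst[[3]], list(x = x, a = a, b = b))
p   # same values as 1 / (x / a + 1) * b above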

Related

Adespatial and betapart return different values for what seems to be the same task

I just found a question on ResearchGate asking why adespatial's beta.div.comp(..., coef="BJ", quant=FALSE) and betapart's beta.multi(..., index.family="jaccard") return different output, although one would assume that the functions run similar calculations, given that the index family is set to "jaccard" and the method of Baselga et al. is used.
So I tested the following:
require("ade4")
data(doubs)
A <- doubs$fish[-8, ]
A <- ifelse(A > 0, 1, 0)
require("adespatial")
beta.div.comp(A, coef="BJ", quant=FALSE)$part
Resulting in:
BDtotal Repl Nes Repl/BDtotal Nes/BDtotal
0.3258676 0.1674413 0.1584263 0.5138323 0.4861677
Whereas when I now ran
require("betapart")
beta.multi(A, index.family="jaccard")
it returns
$beta.JTU
[1] 0.7885784
$beta.JNE
[1] 0.1470249
$beta.JAC
[1] 0.9356033
Obviously the values are not in the same order, but beyond that, they are completely different.
So I went to GitHub and copied the code from the relevant functions of both packages. I changed variable names where a variable had a counterpart in the other package, to indicate which values or parts of the script are basically the same in both packages.
This is the result:
require("adespatial")
require("betapart")
require("ade4")
# load some example data
data(doubs)
A <- doubs$fish[-8, ]
A <- ifelse(A > 0, 1, 0)
#-----------------------------------------------------------------------------
#### adespatial
### beta.div.comp
# first, run the function
beta.div.comp(A, coef="BJ", quant=FALSE)
## what it does:
n <- nrow(A)
a <- A %*% t(A)
b <- A %*% (1 - t(A))
c <- (1 - A) %*% t(A)
min.bc <- pmin(b, c)
D <- (b + c) / (a + b + c) # Jaccard dissimilarity
repl <- 2 * min.bc / (a + 2 * min.bc) # replacement, turnover
rich <- D - repl
D <- as.dist(D)
repl <- as.dist(repl)
rich <- as.dist(rich)
## output values:
# total
total.div <- sum(D) / (n * (n - 1)) # == mean(D) / 2
# turnover/replacement
repl.div <- sum(repl) / (n * (n - 1)) # == mean(repl) / 2
# nestedness (richness difference)
rich.div <- sum(rich) / (n * (n - 1)) # == mean(rich) / 2
## the following produces the same values using betapart:
mean(beta.pair(A, index.family = "jaccard")$beta.jac) / 2
mean(beta.pair(A, index.family = "jaccard")$beta.jtu) / 2
mean(beta.pair(A, index.family = "jaccard")$beta.jne) / 2
#-----------------------------------------------------------------------------
#### betapart
### beta.multi
# first, run the function
beta.multi(A, index.family="jaccard")
## what it does:
a <- A %*% t(A)
c <- abs(sweep(a, 2, diag(a)))
sumSi <- sum(diag(a)) # species by site richness
St <- sum(colSums(A) > 0) # regional species richness; or ncol(A), if all columns contain values > 0
ms.a <- sumSi - St # multi site shared species term
max.bc <- pmax(c, t(c))
min.bc <- pmin(c, t(c))
sum.max.bc <- sum(max.bc[lower.tri(max.bc)]) # == sum(as.dist(max.bc))
sum.min.bc <- sum(min.bc[lower.tri(min.bc)]) # == sum(as.dist(min.bc))
## output values:
# turnover/replacement
beta.jtu <- (2 * sum.min.bc) / (ms.a + (2 * sum.min.bc))
# nestedness
beta.jne <- (ms.a / (ms.a + (2 * sum.min.bc))) * ((sum.max.bc - sum.min.bc) / ((ms.a) + sum.max.bc + sum.min.bc))
# total
beta.jac <- (sum.min.bc + sum.max.bc) / (ms.a + sum.min.bc + sum.max.bc)
As you can see, there are some basic equations (the ones described in the relevant papers on the partitioning of beta diversity) which are similar in both approaches. However, the adespatial function first calculates pairwise dissimilarity matrices and then sums them up, while the betapart approach first summarises the input matrix into single values and then applies the equations for beta-diversity decomposition.
Now, my question would be: Why are there different outputs? Are there errors in the code, or are the functions supposed to behave differently?
I found in Baselga (2012; doi:10.1111/j.1466-8238.2011.00756.x) that beta.multi uses a multiple-site version of the dissimilarity measures, listed in Table 2 of his paper. beta.div.comp, on the other hand, averages the pairwise dissimilarities. Those are different approaches that lead to different results.
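A quick numerical check of that distinction, reusing A from the example above: averaging betapart's pairwise Jaccard dissimilarities reproduces adespatial's BDtotal, while the multi-site value from beta.multi does not.
beta.div.comp(A, coef = "BJ", quant = FALSE)$part["BDtotal"]
mean(beta.pair(A, index.family = "jaccard")$beta.jac) / 2  # matches BDtotal
beta.multi(A, index.family = "jaccard")$beta.JAC           # multi-site value, differs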

How to solve an equation for a given variable in R?

This is the equation: a <- x * t - 2 * x. I want to solve it for t.
So basically, set a = 0 and solve for t. I am new to the R packages for solving equations. I need a package that handles complex roots, since the original equations I am working with have real and imaginary roots. I am looking for an algebraic solution only, not a numerical one.
I tried:
a <- x * t - 2 * x
solve(a,t)
I run into an error:
Error in solve.default(a, t) : 'a' (1000 x 1) must be square
You can use Ryacas to get the solution as an expression of x:
library(Ryacas)
x <- Sym("x")
t <- Sym("t")
Solve(x*t-2*x == 0, t)
# Yacas vector:
# [1] t == 2 * x/x
As you can see, the solution is t=2 (assuming x is not zero).
Let's try a less trivial example:
Solve(x*t-2*x == 1, t)
# Yacas vector:
# [1] t == (2 * x + 1)/x
If you want to get a function which provides the solution as a function of x, you can do:
solution <- Solve(x*t-2*x == 1, t)
f <- function(x){}
body(f) <- yacas(paste0("t Where ", solution))$text
f
# function (x)
# (2 * x + 1)/x
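With that definition, f can be called like any ordinary function; f(2), for example, should return (2*2 + 1)/2 = 2.5:
f(2)
# [1] 2.5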
You might be looking for optimize:
a <- function(x, t) x * t - 2 * x
optimize(a, lower = -100, upper = 100, t = 10)
optimize(a, lower = -100, upper = 100, x = 2)
If you need more help, I need a reproducible example.

Are binary operators / infix functions in R generic? And how to make use of?

From http://adv-r.had.co.nz/Functions.html or R: What are operators like %in% called and how can I learn about them? I learned that it is possible to write your own "binary operators" or "infix functions" using the %-sign.
One example would be
'%+%' <- function(a, b) a*b
x <- 2
y <- 3
x %+% y # gives 6
But is it possible to use them in a generic way when the operands are from a pre-defined class (so that in some cases I don't have to use the %-sign)? For example, x + y shall give 6 if both are from the class prod.
Yes, this is possible: define an S3 method with '+.<class name>' <- function(a, b) ....
Examples
'+.product' <- function(a, b) a * b
'+.expo' <- function(a, b) a ^ b
m <- 2; class(m) <- "product"
n <- 3; class(n) <- "product"
r <- 2; class(r) <- "expo"
s <- 3; class(s) <- "expo"
m + n # gives 6
r + s # gives 8
Safety notes
The newly defined function will be called if at least one of the arguments is from the corresponding class: m + 4 gives you 2 * 4 = 8 and not 2 + 4 = 6. If the classes of the two operands don't match (as for r + m), R warns about incompatible methods and falls back to the default +. So, all in all, be sure that you really want to put new behaviour behind such basic functions as +.
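A small sketch of those mixed-class cases (behaviour as documented for S3 Ops group dispatch):
m + 4  # dispatches on m's "product" class: 2 * 4 = 8
r + m  # classes differ: R warns "Incompatible methods" and falls back
       # to the internal "+", giving 2 + 2 = 4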

How to solve cubic equation analytically (exact solution) in R?

I'm trying to get solutions of cubic equations analytically in R, not numerically.
I looked up the formula for cubic roots on the internet and wrote the following code.
The link is: http://www.math.vanderbilt.edu/~schectex/courses/cubic/
cub <- function(a, b, c, d) {
  p <- -b / 3 / a
  q <- p^3 + (b * c - 3 * a * d) / (6 * a^2)
  r <- c / 3 / a
  x <- (q + (q^2 + (r - p^2)^3)^0.5)^(1/3) + (q - (q^2 + (r - p^2)^3)^0.5)^(1/3) + p
  x
}
However, this function doesn't work in most cases, and I guess that's because of the powers of negative numbers inside the formula: for example, I noticed R cannot get the real root of (-8)^(1/3), which is -2. But I'm not sure how I could fix my code so that it can be used to solve for exact cubic solutions in general.
Thanks.
I'd use polyroot(). See here for more details.
polyroot(z = c(8,0,0,1))
# [1] 1+1.732051i -2+0.000000i 1-1.732051i
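Note that polyroot() takes coefficients in increasing order of powers, so a general cubic a*x^3 + b*x^2 + c*x + d corresponds to z = c(d, c, b, a). For example:
# roots of x^3 - 6*x^2 + 11*x - 6 = (x - 1) * (x - 2) * (x - 3)
polyroot(z = c(-6, 11, -6, 1))  # returns 1, 2 and 3 (as complex numbers, in some order)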
Try this:
# calculate -8 as a complex number
z <- as.complex(-8) # or z <- -8 + 0i
# find all three cube roots
zroot3 <- z^(1/3) * exp(2*c(0:2)*1i*pi/3)
zroot3
## [1] 1+1.732051i -2+0.000000i 1-1.732051i
# check that all three cube roots cube to original
zroot3^3
## [1] -8+0i -8+0i -8-0i
If you only want the real root then here is another option:
> x <- c( -8,8 )
> sign(x) * abs(x)^(1/3)
[1] -2 2
Or you may be interested in the Ryacas package or the polynom package for other options.
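For instance, a sketch with the polynom package, assuming its polynomial() constructor (which also expects coefficients in increasing order) for the same equation x^3 + 8 = 0:
library(polynom)
pol <- polynomial(c(8, 0, 0, 1))  # 8 + x^3
solve(pol)  # should give the same three roots as polyroot() above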
Here is a function, cubsol, that computes all the analytical solutions. Any comments would be most welcome. One question: at the moment the code searches rather inefficiently for the real solution amongst the three complex candidates produced by ... s2 = cuberoot(q - s0^0.5); xtemp[1:3] <- s1 + s2 + p. Is there a more efficient way of knowing which one it is before calculating it?
# - - - - - - - - - - - - - - - - - - - -
# Return all the complex cube roots of a number
cuberoot <- function(x) {
  as.complex(x)^(1/3) * exp(c(0, 2, 4) * 1i * pi / 3)
}
# - - - - - - - - - - - - - - - - - - - -
# cubsol solves the cubic equation analytically and
# returns a list whose first element is the real roots and whose
# second element is the complex roots.
# test with:
# a = -1; b = -10; c = 0; d = 50; x = 0.01 * (-1000:1500)
# plot(x, a*x^3 + b*x^2 + c*x + d, t = 'l'); abline(h = 0)
# coeffs = c(a, b, c, d)
cubsol <- function(coeffs) {
  if (length(coeffs) != 4) {
    stop('Please provide cubsol with a 4-vector of coefficients')
  }
  a <- coeffs[1]; b <- coeffs[2]; c <- coeffs[3]; d <- coeffs[4]
  rts <- list()
  p <- -b / 3 / a
  q <- p^3 + (b * c - 3 * a * d) / (6 * a^2)
  r <- c / 3 / a
  s0 <- q^2 + (r - p^2)^3
  xtemp <- as.complex(rep(0, 9))
  if (s0 >= 0) { nReRts <- 1 } else { nReRts <- 3 }
  # Now find all the roots in complex space:
  s0 <- as.complex(s0)
  s1 <- cuberoot(q + s0^0.5)
  s2 <- cuberoot(q - s0^0.5)
  xtemp[1:3] <- s1 + s2 + p  # I think this is meant to always contain
                             # the sure real solution.
  # Second and third solutions:
  iSqr3 <- sqrt(3) * 1i
  xtemp[4:6] <- p - 0.5 * (s1 + s2 + iSqr3 * (s1 - s2))
  xtemp[7:9] <- p - 0.5 * (s1 + s2 - iSqr3 * (s1 - s2))
  ind1 <-     which.min(abs(a * xtemp[1:3]^3 + b * xtemp[1:3]^2 + c * xtemp[1:3] + d))
  ind2 <- 3 + which.min(abs(a * xtemp[4:6]^3 + b * xtemp[4:6]^2 + c * xtemp[4:6] + d))
  ind3 <- 6 + which.min(abs(a * xtemp[7:9]^3 + b * xtemp[7:9]^2 + c * xtemp[7:9] + d))
  if (nReRts == 1) {
    rts[[1]] <- Re(xtemp[ind1])
    rts[[2]] <- xtemp[c(ind2, ind3)]
  } else { # three real roots
    rts[[1]] <- Re(xtemp[c(ind1, ind2, ind3)])
    rts[[2]] <- numeric()
  }
  return(rts)
} # end of function cubsol
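As a usage sketch, following the test case in the function's own comments (that particular cubic has three real roots, so they should land in the first list element):
cubsol(c(-1, -10, 0, 50))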

Nonlinear optimization in R

I tried to minimize the following function:
func <- function(qq) {
  x <- qq[1]
  y <- qq[2]
  output <- 1 - 2 * x + x^2 - 2 * y + 2 * x * y + y^2
  return(output)
}
when x + y = 1 and 0 <= x, y <= 1. To use gosolnp in the Rsolnp package, I first defined cons for use in the eqfun argument:
cons <- function(qq) sum(qq)
Then I applied gosolnp function:
install.packages("Rsolnp")
require(Rsolnp)
res <- gosolnp(fun = func, LB = c(0, 0), UB = c(1, 1), eqfun = cons, eqB = 1)
res$pars
[1] 0.8028775 0.1971225
res$value
[1] 2.606528e-09 -5.551115e-17
The answer should be x = 0 and y = 1, but, as you can verify, every run of gosolnp yields new points at which func is approximately (and not exactly) 0.
Mathematica and Maple optimize this function very fast and give the true answer, x = 0 and y = 1, while every run in R gives a new solution which is not correct.
I also tried other optimization functions such as spg() in alabama or DEoptim, but the problem remained unsolved.
So my questions are:
1- Is there any way I can minimize func in R?
2- Is there any difference in precision between R and Mathematica, and why can Mathematica give me the exact answer but R cannot?
Thank you in advance
If you have two variables x and y with y = 1 - x, then you really have a problem in just one variable x. Noting that, you can reparametrise your function as
1 - 2 * x + x^2 - 2 * (1 - x) + 2 * x * (1 - x) + (1 - x)^2
and going through the algebra (the constant, x and x^2 terms all cancel) shows that this is identically zero as a function of x; indeed the original objective is just (1 - x - y)^2, which vanishes everywhere on the constraint x + y = 1. Thus any value of x in (0, 1) is a solution, and which one your algorithm converges to will basically be random: it depends on numerical roundoff and your choice of starting point.
The fact that gosolnp's returned value is zero to within the limits of numerical precision should have been a tipoff, or even just plotting the curve.
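A quick sanity check of that algebra in base R, reusing func from the question:
xs <- seq(0, 1, by = 0.1)
sapply(xs, function(x) func(c(x, 1 - x)))  # all (numerically) zero along x + y = 1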
I can't speak to these particular packages, but nloptr(...) in package nloptr seems to work well:
# Non-Linear Optimization (package::nloptr)
F <- function(v) {
  x <- v[1]
  y <- v[2]
  1 - 2 * x + x^2 - 2 * y + 2 * x * y + y^2
}
Hc <- function(v) 1 - sum(v)
library(nloptr)
opt <- nloptr(x0 = c(1/2, 1/2), eval_f = F, lb = c(0, 0), ub = c(1, 1),
              eval_g_eq = Hc,
              opts = list(algorithm = "NLOPT_GN_ISRES", maxeval = 1e6))
opt$solution
# [1] 0.0005506997 0.9994492982
Your function, after substituting y = 1 - x, is identically equal to 0, so there is no point in trying to minimize it.
library(Ryacas)
x <- Sym("x")
y <- 1-x
Simplify( 1 - 2 * x + x^2 - 2 * y + 2 * x * y + y^2)
which gives:
expression(0)
