I need to fit a dataset from a two-dimensional diffusion (D2) process with a Sestak-Berggren model (derived from the logistic model), so I need to understand how to use nlsLM
in the presence of an elbow/knee, because the following "easy way" did not work:
x=c(1.000000e-05, 1.070144e-05, 1.208082e-05, 1.456624e-05, 1.861581e-05, 2.490437e-05, 3.407681e-05, 4.696710e-05,
6.474653e-05, 8.870800e-05, 1.206194e-04, 1.624442e-04, 2.172716e-04, 2.882747e-04, 3.794489e-04, 4.956619e-04,
6.427156e-04, 8.275095e-04, 1.058201e-03, 1.344372e-03, 1.697222e-03, 2.129762e-03, 2.657035e-03, 3.296215e-03,
4.067301e-03, 4.992831e-03, 6.098367e-03, 7.412836e-03, 8.968747e-03, 1.080251e-02, 1.295471e-02, 1.547045e-02,
1.839960e-02, 2.179713e-02, 2.572334e-02, 3.024414e-02, 3.543131e-02, 4.136262e-02, 4.812205e-02, 5.579985e-02,
6.449256e-02, 7.430297e-02, 8.533991e-02, 9.771803e-02, 1.115573e-01, 1.269824e-01, 1.441219e-01, 1.631074e-01,
1.840718e-01, 2.071477e-01, 2.324656e-01, 2.601509e-01, 2.903210e-01, 3.230812e-01, 3.585200e-01, 3.967033e-01,
4.376671e-01, 4.814084e-01, 5.278744e-01, 5.769469e-01, 6.284244e-01, 6.819947e-01, 7.371982e-01, 7.933704e-01,
8.495444e-01, 9.042616e-01)
ynorm=c(
1.000000e+00, 8.350558e-01, 6.531870e-01, 4.910995e-01, 3.581158e-01, 2.553070e-01, 1.814526e-01, 1.290639e-01,
9.219591e-02, 6.623776e-02, 4.817180e-02, 3.543117e-02, 2.624901e-02, 1.961542e-02, 1.478284e-02, 1.123060e-02,
8.597996e-03, 6.631400e-03, 5.151026e-03, 4.028428e-03, 3.171096e-03, 2.511600e-03, 2.001394e-03, 1.604211e-03,
1.292900e-03, 1.047529e-03, 8.530624e-04, 6.981015e-04, 5.739778e-04, 4.740553e-04, 3.932255e-04, 3.275345e-04,
2.739059e-04, 2.299339e-04, 1.937278e-04, 1.637946e-04, 1.389500e-04, 1.182504e-04, 1.009406e-04, 8.641380e-05,
7.418032e-05, 6.384353e-05, 5.508090e-05, 4.762920e-05, 4.127282e-05, 3.583451e-05, 3.116813e-05, 2.715264e-05,
2.368759e-05, 2.068935e-05, 1.808802e-05, 1.582499e-05, 1.385102e-05, 1.212452e-05, 1.061032e-05, 9.278534e-06,
8.103650e-06, 7.063789e-06, 6.140038e-06, 5.315870e-06, 4.576585e-06, 3.908678e-06, 3.298963e-06, 2.732866e-06,
2.189810e-06, 1.614149e-06)
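library(minpack.lm)  # provides nlsLM
# name the columns explicitly so that dfxy$x and dfxy$y do not rely on partial matching
dfxy <- data.frame(x = x[1:length(ynorm)], y = ynorm)
fn=funSel <-"co*((1-x)^m)*(x^n)"
mod_fit <- nlsLM(y~eval(parse(text=fn)),start=c(co=0.5,m=-1,n=0.5),data=dfxy)
plot(dfxy$x,dfxy$y,xlim=c(0,0.001))
plot(dfxy$x,(fitted(mod_fit))[1:length(dfxy$x)],xlim=c(0,0.001))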
The only solution I've found is based on https://stackoverflow.com/a/54286595/6483091: first find the "elbow", then apply the regression only to the reduced dataset. This works, but I was wondering whether there are other solutions, for example tweaking the parameters of the regression instead of doing it in two steps, or somehow letting nlsLM "recognize" the curve using a Dynamic First Derivative Threshold while still forcing fn as the regression function.
The biggest problem is that I already know the "range" for the parameters: the "ground truth" is roughly ynorm <- 0.973*(1-x)^(0.425)*x^(-1.008). Even if I supply coefficients near these values as the starting point, the fit never returns anything similar.
The "ground truth":
plot(x,ynorm)
yt <- 0.973*(1-x)^(0.425)*x^(-1.008)
lines(x,yt/max(yt))
Here is a solution using nls and a hyperbolic fit:
# x and ynorm are the vectors from the question
dfxy <- data.frame(x = x[1:length(ynorm)], y = ynorm)
plot(y ~ x, data = dfxy)
mod <- nls(y ~ a/x + b, data = dfxy, start = list(a = 1, b = 0))
# predict() expects a data frame for newdata, not a bare vector
lines(x = dfxy$x, y = predict(mod, newdata = dfxy))
The fit isn't perfect, though. I guess there is no continuous function to fit a right angle...
Depending on what you want to use the regression for, you could also use a loess regression:
dfxy <- data.frame(x[1:length(ynorm)],ynorm)
names(dfxy) <- c("x", "y")
plot(y ~ x, data = dfxy)
mod <- loess(y ~ x, data = dfxy, span = 0.1)
lines(x = dfxy$x, y = predict(mod, newdata = dfxy$x), col = "red")
Resulting in: [figure: the data points with the loess fit drawn in red]
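Because ynorm spans roughly six orders of magnitude, an unweighted least-squares fit is dominated by the first few points near the elbow. One possible alternative to trimming the data, offered here as an assumption rather than something proposed in the thread, is to fit on the log scale, where co*(1-x)^m*x^n becomes linear in log(co), m and n. The sketch uses synthetic, noise-free data generated from the "ground truth" coefficients quoted in the question, so it only demonstrates the mechanics; x_demo, y_demo and fit_log are illustrative names.

```r
# Synthetic data from the coefficients quoted in the question (no noise added)
x_demo <- 10^seq(-5, log10(0.9), length.out = 66)
y_demo <- 0.973 * (1 - x_demo)^0.425 * x_demo^(-1.008)

# log(y) = log(co) + m*log(1 - x) + n*log(x) is linear in the parameters,
# so an ordinary linear model is enough
fit_log <- lm(log(y_demo) ~ log1p(-x_demo) + log(x_demo))

est <- c(co = exp(coef(fit_log)[[1]]),
         m  = coef(fit_log)[[2]],
         n  = coef(fit_log)[[3]])
print(est)
```

With real data the log transform changes the error model (it treats the noise as multiplicative), so the result may be more useful as a set of starting values for nlsLM than as a final answer.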
I don't know why I get an error message when I'm trying to use optim in R.
I have the following data:
x <- c(6.0401209, 7.2888217, 0.4868070,
1.1699703, 51.5998419, 11.8766734,
2.3873264, 16.9583702, 21.6142835,
0.3133089, 3.4178360, 4.4367427,
2.0205100, 10.5798884, 0.4890031,
1.6734176, 10.2809820, 6.4705424,
5.6801965, 0.9438700)
And the following log-likelihood function:
log.lik.bs <- function(gamma, betha, z){
n <- length(z)
- n * log(gamma) - n * log(2*sqrt(2*2*pi)) - sum(log(z)) + sum(log(sqrt(z/betha)) + sqrt(betha/z)) - (1/2*gamma^2) * sum((sqrt(z/betha) - sqrt(betha/z))^2)
}
What I'm trying to do is the following:
optim(c(2, 6), log.lik.bs, control=list(fnscale=-1), x=x, method="BFGS")$par
But I get an error message:
Error in fn(par, ...) :
unused argument (x = c(6.04012089885228, 7.28882174812723, 0.486806990614708, 1.1699703323488, 51.5998418613029, 11.8766733963947, 2.38732637900487, 16.9583701851951, 21.6142834611592, 0.313308870127425, 3.41783600439905, 4.43674270859797, 2.02051001746263, 10.5798883747597, 0.489003100259996, 1.67341757119939, 10.2809820486722, 6.4705423816332, 5.68019649178721, 0.943869996033357))
It is not quite clear which parameters you are trying to optimize. I assume you want to optimize log.lik.bs with respect to gamma and betha for given z with initial values 2 and 6. In that case you have two errors in your code:
1. log.lik.bs expects an argument named z, but you are passing an argument named x. That is the error you are getting. Fix: z = x
2. When using optim, your target function must take the parameters as a single vector argument. From ?optim:
   "A function to be minimized (or maximized), with first argument the vector of parameters over which minimization is to take place. It should return a scalar result."
Combining this I get:
x <- c(6.0401209, 7.2888217, 0.4868070,
1.1699703, 51.5998419, 11.8766734,
2.3873264, 16.9583702, 21.6142835,
0.3133089, 3.4178360, 4.4367427,
2.0205100, 10.5798884, 0.4890031,
1.6734176, 10.2809820, 6.4705424,
5.6801965, 0.9438700)
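log.lik.bs <- function(x, z){
  # x is the parameter vector c(gamma, betha); z is the data
  gamma <- x[1]
  betha <- x[2]
  n <- length(z)
  # each line ends with an operator so that R parses this as one expression;
  # with the operators at the start of a line only the last term would be returned
  - n * log(gamma) - n * log(2*sqrt(2*2*pi)) - sum(log(z)) +
    sum(log(sqrt(z/betha)) + sqrt(betha/z)) -
    (1/2*gamma^2) * sum((sqrt(z/betha) - sqrt(betha/z))^2)
}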
optim(c(2, 6), log.lik.bs, control=list(fnscale=-1), z=x, method="BFGS")$par
Unfortunately this still throws an error:
Error in optim(c(2, 6), log.lik.bs, control = list(fnscale = -1), z = x, :
non-finite finite-difference value [1]
In addition, there are several warnings that NaNs were produced by sqrt and log. So my interpretation of your question might be wrong. After all, the function goes to infinity as gamma goes to zero.
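A possible explanation, assuming the intended model is the Birnbaum-Saunders distribution (the name log.lik.bs suggests this, but the question does not say so): in that log-likelihood the last term is divided by 2*gamma^2, the logarithm is taken of the sum of the two square roots, and the constant is log(2*sqrt(2*pi)). As written, (1/2*gamma^2) evaluates to gamma^2/2, so nothing penalizes small gamma. The sketch below applies those assumed corrections and optimizes over log(gamma) and log(betha) so that BFGS cannot step to negative values; log.lik.bs2 is a hypothetical name.

```r
x <- c(6.0401209, 7.2888217, 0.4868070, 1.1699703, 51.5998419,
       11.8766734, 2.3873264, 16.9583702, 21.6142835, 0.3133089,
       3.4178360, 4.4367427, 2.0205100, 10.5798884, 0.4890031,
       1.6734176, 10.2809820, 6.4705424, 5.6801965, 0.9438700)

# Assumed Birnbaum-Saunders log-likelihood.
# par = c(log(gamma), log(betha)): the log scale keeps both parameters positive.
log.lik.bs2 <- function(par, z) {
  gamma <- exp(par[1])
  betha <- exp(par[2])
  n <- length(z)
  a <- sqrt(z / betha)
  b <- sqrt(betha / z)
  -n * log(gamma) - n * log(2 * sqrt(2 * pi)) - sum(log(z)) +
    sum(log(a + b)) - sum((a - b)^2) / (2 * gamma^2)
}

# fnscale = -1 turns optim's minimization into maximization
fit <- optim(log(c(2, 6)), log.lik.bs2, z = x, method = "BFGS",
             control = list(fnscale = -1))
est <- exp(fit$par)  # back-transform to c(gamma, betha)
print(est)
```

If the intended likelihood is something else, the reparameterization on the log scale may still be worth keeping, since it removes the NaN warnings from sqrt and log.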
I've got the following code:
theta=0.05
n=1000
m=200
r=rnorm(2000)
#ER check function
nu=Vectorize(function(a,tau){return(abs(tau-(a<0))*a^2)})
#Selecting 10 lowest sum values (lowest10 function returns indices)
lowest10=function(x){
values=sort(x)[1:min(10,length(x))]
indices=match(values,x)
return(indices)
}
sym.expectile=function(beta,e,abs.r){return(beta[1]+beta[2]*e+beta[3]*abs.r)}
ERsum=function(beta,tau,start,end){
y=r[(start+1):end]
X1=rep(1,n-1)
X3=abs(r[start:(end-1)])
X2=c()
X2[1]=e.sym.optimal[start-m]
for (i in 2:(n-1)){
X2[i]=sym.expectile(beta,X2[i-1],X3[i-1])
}
X=matrix(c(X1,X2,X3),ncol=3)
res=y-X%*%beta
sum.nu=mean(nu(res,tau))
return(sum.nu)
}
ERsum.gr=function(beta,tau,start,end){
y=r[(start+1):end]
X1=rep(1,n-1)
X3=abs(r[start:(end-1)])
X2=c()
X2[1]=e.sym.optimal[start-m]
for (i in 2:(n-1)){
X2[i]=sym.expectile(beta,X2[i-1],X3[i-1])
}
X=matrix(c(X1,X2,X3),ncol=3)
partial.beta0=c()
for (i in 1:(n-1)){partial.beta0[i]=-(1-beta[2]^(i))/(1-beta[2])}
gr.beta0=2/T*sum(abs(tau-(y<X%*%beta))*(y-X%*%beta)*partial.beta0)/1000
partial.beta1=c()
partial.beta1[1]=-X2[1]
for (i in 2:(n-1)){partial.beta1[i]=partial.beta1[i-1]*beta[2]-X2[i]}
gr.beta1=2/T*sum(abs(tau-(y<X%*%beta))*(y-X%*%beta)*partial.beta1)/1000
partial.beta2=c()
partial.beta2[1]=-X3[1]
for (i in 2:(n-1)){partial.beta2[i]=partial.beta2[i-1]*beta[2]-X3[i]}
gr.beta2=2/T*sum(abs(tau-(y<X%*%beta))*(y-X%*%beta)*partial.beta2)/1000
c(gr.beta0,gr.beta1,gr.beta2)
}
beta=matrix(nrow=1e4,ncol=3)
beta[,1]=runif(1e4,-1,0)#beta0
beta[,2]=runif(1e4,0,1)#beta1
beta[,3]=runif(1e4,-1,0)#beta2
e.sym.optimal=c()
tau.found.sym.optim=0.02234724
library('expectreg')
e.sym.optimal[1]=expectile(r[1:m],tau.found.sym.optim)
ERsums.sym=c()
for (i in 1:nrow(beta)){
ERsums.sym[i]=ERsum(beta[i,],tau.found.sym.optim,m+1,m+n)
}
initialbeta.esym=beta[lowest10(ERsums.sym),]
intermedietebeta.esym=matrix(ncol=3,nrow=10)
for (i in 1:10){
intermedietebeta.esym[i,]=optim(initialbeta.esym[i,],ERsum,
gr=ERsum.gr,tau=tau.found.sym.optim,
start=m+1,end=m+n,
method="BFGS")$par
}
I tried to replace the optim function with optimx, but got the following error:
Error: Gradient function might be wrong - check it!
To check whether my gradient is correct, I evaluated it both numerically, using grad from numDeriv, and by calling my ERsum.gr function directly. For the sample vector
beta
[1] -0.8256490 0.7146256 -0.4945032
I obtained the following results:
> grad(function(beta) ERsum(c(beta[1],beta[2],beta[3]),tau.found.sym.optim,m+1,m+n),beta)
[1] -0.6703170 2.8812666 -0.5573101
> ERsum.gr(beta,tau.found.sym.optim,m+1,m+n)
[1] -0.6696467 2.8783853 -0.5567527
So here is my question: could these differences simply be numerical errors caused by rounding in partial.beta0, partial.beta1 and partial.beta2, the components of the sum that makes up the gradient? I think so, because if my analytical formula were missing something, the discrepancies would probably be much larger. But how can I be sure? And if this is the case, is there another way to obtain more accurate gradient values?
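One quick check on the two vectors quoted above is their element-wise ratio. If the difference were finite-difference noise, the ratio would be expected to vary from component to component. Here it is essentially the same constant for all three, which is what a mismatched normalization would produce. This reading is an inference from the posted numbers: ERsum averages over n-1 = 999 residuals through mean(), while ERsum.gr divides the sum by 1000.

```r
# Values copied from the question
g_numeric  <- c(-0.6703170, 2.8812666, -0.5573101)  # numDeriv::grad
g_analytic <- c(-0.6696467, 2.8783853, -0.5567527)  # ERsum.gr

# Element-wise ratio, compared with 1000/999
ratio <- g_numeric / g_analytic
print(ratio)
print(1000 / 999)

# Rescaling the analytic gradient by 1000/999 and comparing with the numerical one
rescaled <- g_analytic * 1000 / 999
print(max(abs(rescaled - g_numeric)))
```

If that is indeed the cause, writing the gradient terms as 2*mean(...) instead of 2/T*sum(...)/1000 would be a candidate fix. Note also that 2/T equals 2 only because T is TRUE unless it has been reassigned.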
You've got further problems down the line even if you solve the question of whether that is really a proper gradient, which I see as too complex to tackle. If you take out the gr argument and try to run with only optimx instead of optim, you get:
Error in intermedietebeta.esym[i, ] <- optimx(initialbeta.esym[i, ], ERsum, :
number of items to replace is not a multiple of replacement length
This probably relates to the fact that optimx does not return the same structure as is returned by optim:
> optimx(initialbeta.esym[i,],ERsum,
+ tau=tau.found.sym.optim,
+ start=m+1,end=m+n,
+ method="BFGS")$par
NULL
> optimx(initialbeta.esym[i,],ERsum,
+ tau=tau.found.sym.optim,
+ start=m+1,end=m+n,
+ method="BFGS") # leave out `$par`
p1 p2 p3 value fevals gevals niter convcode kkt1 kkt2 xtimes
BFGS -1.0325 0.2978319 0.04921863 0.09326904 102 100 NA 1 TRUE FALSE 3.366
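The printed result above is a single-row table with columns p1, p2, p3 rather than a list with a par element. A minimal sketch of pulling the parameter vector out of such a table, using a plain data.frame as a stand-in for the optimx result (the stand-in and the names res, par_cols and out are assumptions; the real object may offer its own accessors):

```r
# Stand-in that mimics the printed optimx output (values copied from above)
res <- data.frame(p1 = -1.0325, p2 = 0.2978319, p3 = 0.04921863,
                  value = 0.09326904, row.names = "BFGS")

# There is no 'par' column, so $par yields NULL
print(is.null(res$par))

# Select the parameter columns by name and flatten them to a numeric vector
par_cols <- c("p1", "p2", "p3")
beta_hat <- unlist(res[1, par_cols])

# A length-3 vector fits into one row of a 10 x 3 matrix
out <- matrix(nrow = 10, ncol = 3)
out[1, ] <- beta_hat
print(out[1, ])
```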
If you disagree with the decision to allow a default gradient estimate, then you need to narrow down your debugging to the function that throws the error:
Error: Gradient function might be wrong - check it!
> traceback()
3: stop("Gradient function might be wrong - check it! \n", call. = FALSE)
2: optimx.check(par, optcfg$ufn, optcfg$ugr, optcfg$uhess, lower,
upper, hessian, optcfg$ctrl, have.bounds = optcfg$have.bounds,
usenumDeriv = optcfg$usenumDeriv, ...)
1: optimx(initialbeta.esym[i, ], ERsum, gr = ERsum.gr, tau = tau.found.sym.optim,
start = m + 1, end = m + n, method = "BFGS")
Then look at the documentation (there was no help page) and the code for optimx:::optimx.check. This is the section of code that does the checking:
if (!is.null(ugr) && !usenumDeriv) {
gname <- deparse(substitute(ugr))
if (ctrl$trace > 0)
cat("Analytic gradient from function ", gname,
"\n\n")
fval <- ufn(par, ...)
gn <- grad(func = ufn, x = par, ...)
ga <- ugr(par, ...)
teps <- (.Machine$double.eps)^(1/3)
if (max(abs(gn - ga))/(1 + abs(fval)) >= teps) {
stop("Gradient function might be wrong - check it! \n",
call. = FALSE)
optchk$grbad <- TRUE
}
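To see how strict this check is, the threshold and the gradients quoted in the question can be plugged in directly. teps is the cube root of machine epsilon. The value of fval at the sample beta is not given in the question, so the sketch only works out how large 1 + |fval| would have to be for the check to pass; max_diff and needed are illustrative names.

```r
# Threshold used by the check: cube root of machine epsilon
teps <- (.Machine$double.eps)^(1/3)
print(teps)

# Gradients quoted in the question
gn <- c(-0.6703170, 2.8812666, -0.5573101)  # numerical (numDeriv::grad)
ga <- c(-0.6696467, 2.8783853, -0.5567527)  # analytic (ERsum.gr)

max_diff <- max(abs(gn - ga))
print(max_diff)

# The check stops when max_diff / (1 + |fval|) >= teps,
# so it passes only if 1 + |fval| exceeds this bound
needed <- max_diff / teps
print(needed)
```

The objective value printed in the optimx output earlier is about 0.09, so unless fval at the sample beta is several hundred, a discrepancy of this size would be expected to trip the check.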