Exponential fitting with R

I have a dataset like this:
df
         x           y
 7.3006667 -0.14383333
-0.8983333  0.02133333
 2.7953333 -0.07466667
and I would like to fit an exponential function of the form y = a*exp(b*x).
This is what I tried and the errors I get:
f <- function(x,a,b) {a * exp(b * x)}
st <- coef(nls(log(y) ~ log(f(x, a, b)), df, start = c(a = 1, b = -1)))
Error in qr.qty(QR, resid) : NA/NaN/Inf in foreign function call (arg 5)
In addition: Warning messages:
1: In log(y) : NaNs produced
2: In log(y) : NaNs produced
fit <- nls(y ~ f(x, a, b), data = df, start = list(a = st[1], b = st[2]))
Error in nls(y ~ exp(a + b * x), data = df, start = list(a = st[1], :
singular gradient
I believe it has to do with the fact that the log is not defined for negative numbers but I don't know how to solve this.

I'm having trouble seeing the problem here.
f <- function(x,a,b) {a * exp(b * x)}
fit <- nls(y~f(x,a,b),df,start=c(a=1,b=1))
summary(fit)$coefficients
# Estimate Std. Error t value Pr(>|t|)
# a -0.02285668 0.03155189 -0.7244157 0.6008871
# b 0.25568987 0.19818736 1.2901422 0.4197729
plot(y~x, df)
curve(predict(fit,newdata=data.frame(x)), add=TRUE)
The coefficients are very poorly estimated, but that's not surprising: you have two parameters and three data points.
As to why your code fails: the first call to nls(...) generates an error, so st is never set to anything (although it may have a value from some earlier code). Then you try to use that in the second call to nls(...).
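If you do want data-driven starting values instead of guessing a = 1, b = 1, one rough approach (a sketch, not part of the original answer) is to log-linearize |y|, since log(y) itself fails for negative y; the sign heuristic for a below is only an assumption for this tiny mixed-sign dataset:
# Sketch: starting values from a linear fit to log(|y|), then refine with nls
df <- data.frame(x = c(7.3006667, -0.8983333, 2.7953333),
                 y = c(-0.14383333, 0.02133333, -0.07466667))
co <- coef(lm(log(abs(y)) ~ x, data = df))              # intercept ~ log|a|, slope ~ b
st <- list(a = sign(mean(df$y)) * exp(unname(co[1])),   # heuristic sign for a
           b = unname(co[2]))
fit <- nls(y ~ a * exp(b * x), data = df, start = st)
coef(fit)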

Why I got non-numeric argument to binary operator with nls?

I have two vectors (example):
x=c(100,98,60,30,28,30,20,10)
y=c(10,9.8,5,3,2,3.4,2.8,1)
I would like to fit them using this function (the four-parameter model in the nls call below) and get the fitting parameters a, b, c and d.
I used this:
m<-nls(x~a/1+e^(-b*(y-c)) + d)
but I got this error:
Error in y - c : non-numeric argument to binary operator
x and y are reversed and e^(...) should be exp(...). Also I found that setting d to 0 helped.
d <- 0 # fix d at 0
st <- list(a = mean(y), b = 1/sd(x), c = mean(x))
fm <- nls(y ~ a/(1+exp(-b*(x-c))) + d, start = st)
fm
giving:
Nonlinear regression model
model: y ~ a/(1 + exp(-b * (x - c)))
data: parent.frame()
a b c
19.96517 0.02623 99.73842
residual sum-of-squares: 1.82
Number of iterations to convergence: 9
Achieved convergence tolerance: 9.023e-06
Plotting this it seems to be a good fit visually:
plot(y ~ x)
lines(fitted(fm) ~ x, col = "red")
I think the reason is that c is being interpreted as the base combine function c(). Change it to another symbol (c1, for instance). Of course you would also need to specify meaningful starting parameters, but I guess that was not your question.
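For completeness, a sketch of that renaming (the parameter is called c1 here purely to avoid clashing with base::c; the starting values are the same ones used in the answer above):
x <- c(100, 98, 60, 30, 28, 30, 20, 10)
y <- c(10, 9.8, 5, 3, 2, 3.4, 2.8, 1)
# same model as above (d fixed at 0), with c renamed to c1
fm2 <- nls(y ~ a / (1 + exp(-b * (x - c1))),
           start = list(a = mean(y), b = 1/sd(x), c1 = mean(x)))
coef(fm2)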

Correct usage of stats4::mle

I want to use stats4::mle function to estimate the best parameters (2) of a distribution.
I would like to be sure my usage is correct and get guidance on avoiding the error
"Error in optim(start, f, method = method, hessian = TRUE, ...) :
initial value in 'vmmin' is not finite
In addition: Warning message:
In log(mu) : NaNs produced"
The function I would like to fit is exp(beta0*a + beta1*b), and I would like to estimate the betas.
Sample code:
a <- mydata$a # first variable
b <- mydata$b # second variable
y <- mydata$y # observed result
nll <- function(beta0, beta1) {
mu = y - exp(beta0 * a + beta1 * b)
- sum(log(mu))
}
est <- stats4::mle(minuslog = nll, start = list(beta0 = 0.0001, beta1 = 0.0001))
est
So:
Is this the correct way of doing things?
For the error, I understand it is due to values of mu reaching 0 or going negative (log(mu) is then NaN), but I don't know what I can do about it.
Thanks for your help.
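For reference, here is a minimal sketch of stats4::mle usage with a fully specified likelihood. It assumes, for illustration only, that y is normally distributed around mu = exp(beta0*a + beta1*b); the data below are simulated stand-ins, not the poster's mydata.
set.seed(1)
a <- runif(100)                                       # stand-in for mydata$a
b <- runif(100)                                       # stand-in for mydata$b
y <- exp(0.5 * a + 1.5 * b) + rnorm(100, sd = 0.1)    # stand-in for mydata$y
# negative log-likelihood under the assumed Normal model; sigma is estimated on
# the log scale so it stays positive
nll <- function(beta0, beta1, logsigma) {
  mu <- exp(beta0 * a + beta1 * b)
  -sum(dnorm(y, mean = mu, sd = exp(logsigma), log = TRUE))
}
est <- stats4::mle(minuslogl = nll,
                   start = list(beta0 = 0, beta1 = 0, logsigma = 0))
summary(est)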

What I'm doing wrong here when trying to convert an nlmrt object to an nls object

I am trying to convert an "nlmrt" object to an "nls" object using nls2. However, I can only manage to do it if I write the names of the parameters explicitly in the call. Can't I define the parameter names programmatically? See the reproducible example:
library(nlmrt)
scale_vector <- function(vector, ranges_in, ranges_out){
  # linearly rescale each component of vector from ranges_in to ranges_out
  t <- (vector - ranges_in[1, ]) / (ranges_in[2, ] - ranges_in[1, ])
  vector <- (1 - t) * ranges_out[1, ] + t * ranges_out[2, ]
}
shobbs.res <- function(x) {
  # UNSCALED Hobbs weeds problem -- coefficients are rescaled internally using
  # scale_vector; returns the residuals of the scaled logistic growth model
  ranges_in <- rbind(c(0, 0, 0), c(100, 10, 0.1))
  ranges_out <- rbind(c(0, 0, 0), c(1, 1, 1))
  x <- scale_vector(x, ranges_in, ranges_out)
  tt <- 1:12
  res <- 100 * x[1] / (1 + 10 * x[2] * exp(-0.1 * x[3] * tt)) - y
}
y <- c(5.308, 7.24, 9.638, 12.866, 17.069, 23.192, 31.443,
38.558, 50.156, 62.948, 75.995, 91.972)
st <- c(b1=100, b2=10, b3=0.1)
ans1n <- nlfb(st, shobbs.res)
print(coef(ans1n))
This works:
library(nls2)
ans_nls2 <- nls2(y ~ shobbs.res(c(b1, b2, b3)) + y, start = coef(ans1n), alg = "brute")
However, this forces me to hard-code the parameters names in the call to nls2. For reasons related to my actual code, I would like to be able to do something like
ans_nls2 <- nls2(y ~ shobbs.res(names(st)) + y, start = coef(ans1n), alg = "brute")
But this returns an error:
Error in vector - ranges_in[1, ] :
non-numeric argument to binary operator
Is it possible to fix this, without having to hard-code explicitly the names of parameters in the call to nls2?
nls2 will accept a string as a formula:
co <- coef(ans1n)
fo_str <- sprintf("y ~ shobbs.res(c(%s)) + y", toString(names(co)))
nls2(fo_str, start = co, alg = "brute")
giving:
Nonlinear regression model
model: y ~ shobbs.res(c(b1, b2, b3)) + y
data: NULL
b1 b2 b3
196.1863 49.0916 0.3136
residual sum-of-squares: 2.587
Number of iterations to convergence: 3
Achieved convergence tolerance: NA
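A small variation (not from the original answer) builds the formula object explicitly instead of passing a string; nls2 accepts either form:
# same idea, but convert the string to a formula first
fo <- as.formula(sprintf("y ~ shobbs.res(c(%s)) + y", toString(names(co))))
ans_nls2 <- nls2(fo, start = co, alg = "brute")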

R nls Different Errors occur

I'm new to R programming and I can't find a solution to an error that occurs when I use the nls function.
I am trying to fit the data from an ecdf (the values are extracted and saved in y) to this model function with four parameters:
fitsim <- nls(y ~ exp(-(((a-Abfluss)/(c*(Abfluss-b)))^d)),
start = list( a=max(Abfluss), b=min(Abfluss),
c=3, d=1))
When I run the nls function, this error occurs:
Error in numericDeriv(form[[3L]], names(ind), env) :
Fehlender Wert oder etwas Unendliches durch das Modell erzeugt
which means that a missing value or an infinite value was produced by the model.
My vectors Abfluss and y both have the same length. The aim is to get the parameter estimates.
Maybe the problem is that the model is only defined under these conditions:
c > 0, d > 0, b <= Abfluss <= a.
I already tried na.rm = TRUE. Then another error appears:
Error in model.frame.default(formula = ~y + Abfluss, na.rm = TRUE) :
Variablenlängen sind unterschiedlich (gefunden für '(na.rm)')
which means the variable lengths are different.
I would appreciate any kind of help and advice.
For a better understanding, I attach my whole code with all the data:
time<-c(1851:2013)
Abfluss<- c(4853,4214,5803,3430,4645,4485,3100,4797,4030,3590,5396,9864,3683,4485,4064,3420,5396,
4895,3931,4238,3790,3520,4263,5474,3790,4700,5109,4525,4007,6340,4993,6903,8160,3600,3480,3540,
3540,4565,3333,7764,
4755,7940,3112,3169,4435,5365,9422,3150,10500,4512,3790,4618,6126,3769,3704,
5938,5669,4552,5458,5854,4867,6057,4783,5753,5736,4618,6091,5820,5007,7984, 4435,
4645,7465,5820,5988,6022,4300,6062,3302,4877,4586,5275,4410,3174,4966,4939,4638,
5541,5760,6495,5435,4952,4912,6092,5182,5820,5129,6436,6648,3063,5550,5160,4400,
9600,6400,6380,6300,6180,6899,4360,5550,4580,3894,5277,7520,6780,5100,5430,4550,
6620,4050,4560,5290,6610,8560,4943,6940,4744,6650,5700,7440,6200,4597,3697,7300,
4644,5456,6302,3741,5398,9500,6296,5279,5923,6412,6559,6559,5891,5737,5010,5790,
10300,4150,4870,6740,7560,8010,5120,8170,7430, 7330,5900, 11150)
#EV4-Distribution
dEV4 <- function(x, a, b, c, d) {
  m <- exp(-(((a - x) / (c * (x - b)))^d))
  return(m)
}
#Simulation example
Sim<-dEV4(Abfluss,a=max(Abfluss),b=min(Abfluss), c=3, d=1)
dEV4cdf<-cbind(Abfluss,Sim)
#Empirical cdf
p = ecdf(Abfluss)
y<- p(Abfluss) #Extracting of cumulated probabilities
m<-cbind(Abfluss,y)
#plot EV4 and ecdf
plot(dEV4cdf, type="p",main="EV4")
plot(ecdf(Abfluss), add=T)
#Fitting EV4 nls
fitsim <- nls(y ~ exp(-(((a-Abfluss)/(c*(Abfluss-b)))^d)),
start = list( a=max(Abfluss), b=min(Abfluss),
c=3, d=1), na.rm=TRUE)
Do not use starting values that are on the boundary of the feasible region, and try nlxb from the nlmrt package instead (it can be used with the same arguments, except that data = is not optional):
library(nlmrt)
fitsim <- nlxb(y ~ exp(-(((a - Abfluss) / (c * (Abfluss - b))) ^ d)),
data = data.frame(y, Abfluss),
start = list(a = 2 * max(Abfluss), b = min(Abfluss) / 2, c = 3, d = 1))
plot(y ~ Abfluss, pch = 20)
o <- order(Abfluss)
fit <- y - fitsim$resid
lines(fit[o] ~ Abfluss[o], col = "red")
giving:
nlmrt class object: x
residual sumsquares = 0.02908 on 163 observations
after 5001 Jacobian and 6060 function evaluations
name coeff SE tstat pval gradient JSingval
a 20047.7 NA NA NA 1.119e-07 3251
b -1175384 NA NA NA 1.432e-09 0.1775
c 0.0129414 NA NA NA -0.1296 5.808e-06
d 12.146 NA NA NA -2.097e-06 6.798e-11
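Since the model is only defined for c > 0, d > 0 and b <= Abfluss <= a, another option (a sketch, not part of the answer above; if I recall correctly nlxb also accepts lower and upper bounds, and the bound values below are purely illustrative) is to pass box constraints:
library(nlmrt)
# bounds are given in the same order as start: a, b, c, d
fitsim_b <- nlxb(y ~ exp(-(((a - Abfluss) / (c * (Abfluss - b))) ^ d)),
                 data  = data.frame(y, Abfluss),
                 start = list(a = 2 * max(Abfluss), b = min(Abfluss) / 2, c = 3, d = 1),
                 lower = c(max(Abfluss), -Inf, 1e-8, 1e-8),
                 upper = c(Inf, min(Abfluss), Inf, Inf))
fitsim_b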

maximum likelihood estimation

I am a new user of R and hope you will bear with me if my question is silly. I want to estimate the following model using maximum likelihood estimation in R:
y = a + b*(ln(x) - α)
where a, b, and α are parameters to be estimated and x and y are my data. I tried the following code, which I got from the web:
library(foreign)
maindata <- read.csv("C:/Users/NUNU/Desktop/maindata/output2.csv")
h <- subset(maindata, cropid==10)
library(likelihood)
modelfun <- function (a, b, x) { b *(x-a)}
par <- list(a = 0, b = 0)
var<-list(x = "x")
par_lo <- list(a = 0, b = 0)
par_hi <- list(a = 50, b = 50)
var$y <- "y"
var$mean <- "predicted"
var$sd <- 0.815585
var$log <- TRUE
results <- anneal(model = modelfun, par = par, var = var,
source_data = h, par_lo = par_lo, par_hi = par_hi,
pdf = dnorm, dep_var = "y", max_iter = 20000)
The result I get is similar even though the data are different, i.e., even when I change the cropid. Also, the predicted values generated are for x rather than y.
I do not know what I missed or where I went wrong. Your help is highly appreciated.
I am not sure whether your model formula will lead to a unique solution, but in general you can find the MLE with the optim function.
Here is a simple example for linear regression with optim:
fn <- function(beta, x, y) {
  a <- beta[1]
  b <- beta[2]
  sum((y - (a + b * log(x)))^2)   # residual sum of squares
}
# generate some data for testing
x <- 1:100
# true values: a = 10, b = 3.5
y <- 10 + 3.5 * log(x)
optim(c(0, 0), fn, x = x, y = y, method = "BFGS")
You can change the function fn to reflect your model formula, e.g.
sum( (y - (YOUR MODEL FORMULA) )^2 )
EDIT
I am just giving a simple example of using optim in case you have a custom model formula to optimize. I did not mean to suggest using it for simple linear regression, since lm is sufficient.
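For the model in the question, that plug-in would look like the sketch below; note that a and α are not separately identifiable (only a - b*α is), which is presumably why a unique solution may not exist.
# the question's model plugged into the residual-sum-of-squares objective;
# only the combination a - b*alpha is identified, so expect many equally good fits
fn2 <- function(beta, x, y) {
  a <- beta[1]; b <- beta[2]; alpha <- beta[3]
  sum((y - (a + b * (log(x) - alpha)))^2)
}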
I was a bit surprised that iTech used optim for a problem that is linear in its parameters. With his data for x and y:
> lm(y ~ log(x) )
Call:
lm(formula = y ~ log(x))
Coefficients:
(Intercept) log(x)
10.0 3.5
For linear problems with i.i.d. Gaussian errors, the least squares solution is the ML solution.
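As a quick check (a sketch, not from either answer; the noisy data here are simulated), maximizing the full Gaussian log-likelihood with optim recovers essentially the same coefficients as lm:
set.seed(42)
x <- 1:100
y <- 10 + 3.5 * log(x) + rnorm(length(x), sd = 0.5)
negll <- function(p) {                     # p = (a, b, log(sigma))
  mu <- p[1] + p[2] * log(x)
  -sum(dnorm(y, mean = mu, sd = exp(p[3]), log = TRUE))
}
opt <- optim(c(0, 0, 0), negll, method = "BFGS")
rbind(ml = opt$par[1:2], ls = coef(lm(y ~ log(x))))   # nearly identical estimates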
