I am trying to do an exponential regression in R but I keep getting this error
Error in nls(y ~ a * exp(b * x), data = DF, start = list(a = -10, b = -10)) : singular gradient
The data and code I'm using are:
x <-c(0.00, 6.40, 8.61, 15.20, 28.10, 42.60, 66.70, 73.00, 73.00, 85.00, 88.00, 88.00, 88.00, 88.00, 88.00, 88.00, 94.00, 94.00, 94.00, 94.00, 94.00, 94.00, 94.00, 94.00, 94.00, 94.00, 94.00, 102.00, 102.00, 102.00, 102.00, 102.00, 160.00, 160.00, 169.00, 320.00, 320.00, 320.00, 432.00, 432.00)
y <- c(6.52, 1.95, 1.51, 1.94, 3.04, 1.81, 2.07, 0.88, 1.59, 1.18, 0.47, 0.69, 0.90, 1.27, 0.94, 1.84, 0.71, 1.30, 0.50, 1.09, 0.69, 4.07, 0.68, 0.91, 0.64, 0.97, 0.99, 1.34, 0.82, 0.34, 0.39, 1.14, 0.90, 0.36, 0.86, 0.59, 0.36, 1.14, 1.09, 1.81)
DF <- data.frame(x,y)
m <- nls(y ~ a*exp(b*x), data = DF, start=list(a=-10, b=-10))
It's probably an easy fix but I have been stuck with this for days, thank you very much, any help highly appreciated!
First, look at the data. You often need to supply starting values that are at least somewhat plausible.
plot(y~x)
It seemed clear that a would be positive and b would be negative. Furthermore, the long "time scale" (I tend to think in terms of survival analysis) suggested a fairly small magnitude for b:
m <- nls(y ~ a*exp(b*x), data = DF, start=list(a=-1, b=-.1) )
> m
Nonlinear regression model
model: y ~ a * exp(b * x)
data: DF
a b
3.5092 -0.0128
residual sum-of-squares: 33.85
Number of iterations to convergence: 18
Achieved convergence tolerance: 5.09e-06
Add a curve to the plot with:
curve( 3.5*exp(-0.0128*x),add=TRUE, col="blue")
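If you would rather not eyeball the starting values, a quick log-linear fit can supply them automatically (a sketch; it assumes all y are positive, which they are here):
# derive starting values from log(y) ~ x, since log(a*exp(b*x)) = log(a) + b*x
loglin <- lm(log(y) ~ x, data = DF)
start0 <- list(a = exp(coef(loglin)[[1]]), b = coef(loglin)[[2]])
m2 <- nls(y ~ a*exp(b*x), data = DF, start = start0)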
I cannot place an image in a comment, so I place it here. When I fit your data with an offset, a * exp(b * x) + offset, as suggested to me by a scatterplot of the data, I get what appears to be a better fit, with R-squared = 0.584 and RMSE = 0.714 from fitted parameters a = 5.3702, b = -0.27910, and offset = 1.13637.
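For reference, a minimal nls version of that offset fit might look like this (a sketch; the values reported above serve as starting values, and offs is just the name I use for the offset coefficient):
# exponential decay plus a constant offset: y ~ a*exp(b*x) + offs
m_off <- nls(y ~ a*exp(b*x) + offs, data = DF,
             start = list(a = 5.37, b = -0.279, offs = 1.14))
coef(m_off)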
I'm trying to understand how anova() calculates the F-value for a Gamma glm.
I have some weird skewed data:
y <- c(0, 0.88, 0.94, 0, 0.95, 0.77, 3.22, 3.52, 1.22, 1.52, 1.23,
0.92, 1.11, 1.18, 1.47, 1.53, 0, 0, 1.09, 0.83, 0.8, 1.56, 6,
0.74, 1.18, 1.01, 0.82, 3.83, 1.75, 1.27, 1.54, 1.05, 1.08, 0.9,
0.77, 1.44, 4.55, 0, 1.44, 2.91, 0.71, 12.93, 0.77, 0, 1.14,
1.06, 3.96, 1.57, 1.63)
x <- c(6.9469287, 6.290469147, 6.1918829, 6.104770097, 5.939523496,
5.942857082, 6.163662277, 6.399218779, 5.783065061, 5.638420345,
5.552741687, 5.683432022, 5.857426116, 6.162680044, 5.957396843,
6.571818964, 5.446848271, 5.712962062, 5.653265224, 6.349141363,
5.46503105, 6.049651518, 7.380125424, 5.722479551, 5.950585693,
5.808206582, 6.096318404, 5.913429847, 5.997807119, 6.206943676,
6.550982371, 6.543636484, 6.822385253, 6.507588297, 5.940914702,
6.439753879, 6.899586949, 6.156580921, 7.116019293, 6.355315455,
6.538796291, 6.498027706, 6.196593891, 6.339028678, 6.23909998,
6.551869452, 6.688031206, 6.492259138, 5.997315277)
y <- y + 0.001
I added 0.001 to y to avoid zeros. For a simple regression I could reproduce the F test run by anova:
lm0 <- lm(y ~ 1)
lm1 <- lm(y ~ x)
#
y.p <- lm1$fitted.values # predicted/fitted values
SSE <- sum((y - y.p)^2)
SSR <- sum((y.p - mean(y))^2)
SST <- sum((y - mean(y))^2)
round(SST - (SSE + SSR), 4) #check
# [1] 0
#
SS1 <- sum(residuals(lm0, "deviance")^2) #=SST
SS2 <- sum(residuals(lm1, "deviance")^2) #=SSE
df1 <- lm0$df.residual
df2 <- lm1$df.residual
MSE <- SS2/df2
MSR <- ((SS1 - SS2)/(df1 - df2))
MSR/MSE # F-value
# [1] 5.927608
anova(lm0, lm1, test="F")$F[2]
# [1] 5.927608
However, I could not reproduce F for a Gamma-version of the regression:
lm0 <- glm(y ~ 1, family=Gamma(link="log"))
lm1 <- glm(y ~ x, family=Gamma(link="log"))
#
oo <- Gamma(link="log") # family info
eta <- lm1$linear.predictors # linear predictor of the Gamma model
y.p <- oo$mu.eta(eta) # fitted values on the original scale (for the log link this is exp(eta))
# ... then the same SSE/SSR/MSE/MSR calculations as in the lm example above
MSR/MSE # F-value
# [1] 3.862559
anova(lm0, lm1, test="F")$F[2]
# [1] 7.356901
However, if I take the MSR (which is evidently what the anova output calls "Deviance") and divide it by the dispersion parameter from the model summary (which, strangely to me, can be computed from the working residuals), I get the correct F:
# correct F for the gamma regression:
disp <- summary(lm1)$dispersion
mdisp <- sum(residuals(lm1, "working")^2)/df2 # MSE-variant with working residuals
disp - mdisp # check
# [1] 0
Dev <- anova(lm1)$Deviance[2]
MSR-Dev # check
# [1] 0
MSR/mdisp # correct F as in anova
# [1] 7.356901
For me (without a deeper mathematical education) these last manipulations, which I found through trial and error, look like magic. Could somebody help me understand how the link function of the Gamma glm is "interwoven" in the MSR/MSE calculation? I need this understanding to be able to calculate F for a Gamma regression performed with the R fastglm package, which is not compatible with anova.
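For reference, here is the compact version of what I stumbled on (a sketch using only base glm accessors; it reproduces the anova F, I just don't understand why):
dev0 <- sum(residuals(lm0, "deviance")^2)  # null deviance
dev1 <- sum(residuals(lm1, "deviance")^2)  # residual deviance of the full model
((dev0 - dev1) / (lm0$df.residual - lm1$df.residual)) / summary(lm1)$dispersion
# matches anova(lm0, lm1, test = "F")$F[2]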
I have a set of observations from a Cauchy(theta, 1) distribution, and I have a plot of the log-likelihood against a grid of test values x (candidate values of theta).
obs=c(1.77, -0.23, 2.76, 3.80, 3.47, 56.75, -1.34, 4.24, -2.44, 3.29, 3.71, -2.40, 4.53, -0.07, -1.05, -13.87, -2.53, -1.75, 0.27, 43.21)
ll_c=function(theta, obs){ #define Loglikelihood function for Cauchy(θ,1) distribution
logl= sum(dcauchy(obs, location = theta, scale = 1, log = T))
return(logl)
}
x = seq(from=-10,to=10,by=0.1) #create test values
ll = NULL
for (i in x){
ll = c(ll, ll_c(i, obs)) #perform ll_c for all test values and store
}
plot(x, ll)
I also need to make a plot of the first derivative of the log-likelihood function against the same x values, but I cannot figure out how to do so. My attempt:
fdll_c=function(theta,obs){
Dlogl=D(sum(dcauchy(obs,location=theta,scale=1,log=T)),'theta')
return(Dlogl)
}
fdll = NULL
for (j in x){
fdll = c(fdll, fdll_c(j,obs))
}
plot(x,fdll)
I have tried different variations on this code, but every time it has come back with an error or with a derivative of 0 at all points.
Maybe the following answers the question.
It uses an explicit function for the partial derivative of the log-likelihood with respect to theta and then evaluates it on a grid of values around 0.
obs <- c(1.77, -0.23, 2.76, 3.80, 3.47, 56.75, -1.34, 4.24, -2.44, 3.29, 3.71, -2.40, 4.53, -0.07, -1.05, -13.87, -2.53, -1.75, 0.27, 43.21)
dll_theta <- function(x, theta, scale){
  # score function: derivative of the Cauchy(theta, scale) log-likelihood w.r.t. theta
  cc <- (x - theta)/scale
  2*sum(cc/(1 + cc^2))/scale
}
x <- seq(from = -10, to = 10, by = 0.001)
y <- sapply(x, function(.x) dll_theta(obs, theta = .x, scale = 1))
plot(x, y, pch = ".")
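As a quick sanity check (a sketch, assuming the ll_c() function from the question is still in scope), central finite differences of the log-likelihood should agree with the explicit derivative:
h <- 1e-6
y_num <- sapply(x, function(.x) (ll_c(.x + h, obs) - ll_c(.x - h, obs)) / (2*h))
max(abs(y - y_num))  # should be close to zero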
I would like to plot a threshold model with smooth 95% confidence interval lines between line segments. You would think this would be on the simple side but I have not been able to find an answer!
My threshold/breakpoints are known; it would be great if there were a way to visualize this data. I have tried the segmented package, which produces the following plot:
The plot shows a threshold model with a breakpoint at 5.4. However, the confidence intervals are not smooth between regression lines.
If anyone knows of any way to produce smooth (i.e. without the jump between line segments) CI lines between segmented regression lines (ideally in ggplot) that would be amazing. Thank you so much.
I have included sample data and the code I have tried below:
x <- c(2.26, 1.95, 1.59, 1.81, 2.01, 1.63, 1.62, 1.19, 1.41, 1.35, 1.32, 1.52, 1.10, 1.12, 1.11, 1.14, 1.23, 1.05, 0.95, 1.30, 0.79,
0.81, 1.15, 1.10, 1.29, 0.97, 1.05, 1.05, 0.84, 0.64, 0.80, 0.81, 0.61, 0.71, 0.75, 0.30, 0.30, 0.49, 1.13, 0.55, 0.77, 0.51,
0.67, 0.43, 1.11, 0.29, 0.36, 0.57, 0.02, 0.22, 3.18, 3.79, 2.49, 2.44, 2.12, 2.45, 3.22, 3.44, 3.86, 3.53, 3.13)
y <- c(22.37, 18.93, 16.99, 15.65, 14.62, 13.79, 13.09, 12.49, 11.95, 11.48, 11.05, 10.66, 10.30, 9.96, 9.65, 9.35, 9.07, 8.81,
8.56, 8.32, 8.09, 7.87, 7.65, 7.45, 7.25, 7.05, 6.86, 6.68, 6.50, 6.32, 6.15, 5.97, 5.80, 5.63, 5.47, 5.30,
5.13, 4.96, 4.80, 4.63, 4.45, 4.28, 4.09, 3.90, 3.71, 3.50, 3.27, 3.01, 2.70, 2.28, 22.37, 16.99, 11.05, 8.81,
8.56, 8.32, 7.25, 7.05, 6.50, 6.15, 5.63)
library(segmented)
lin.mod <- lm(y ~ x)
segmented.mod <- segmented(lin.mod, seg.Z = ~x, psi = 2)
plot(x, y)
plot(segmented.mod, add=TRUE, conf.level = 0.95)
which produces the following plot (and associated jumps in 95% confidence intervals):
segmented plot
Background: The non-smoothness in existing change point packages is due to the fact that frequentist packages operate with a fixed change point estimate. But as with all inferred parameters, this is misleading, because there is real uncertainty about the location of the change.
Solution: AFAIK, only Bayesian methods can quantify that uncertainty, and the mcp package fills this space.
library(mcp)
model = list(
y ~ 1 + x, # Segment 1: Intercept and slope
~ 0 + x # Segment 2: Joined slope (no intercept change)
)
fit = mcp(model, data = data.frame(x, y))
Default plot (plot.mcpfit() returns a ggplot object):
plot(fit) + ggtitle("Default plot")
Each line represents a possible model that generated the data. The posterior for the change point is shown as a blue density. You can add a credible interval on top using plot(fit, q_fit = TRUE) or plot it alone:
plot(fit, lines = 0, q_fit = c(0.025, 0.975), cp_dens = FALSE) + ggtitle("Credible interval only")
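A numeric summary of the posterior, including the change point (cp_1), should also be available via summary():
summary(fit)  # posterior means and intervals, including the change point cp_1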
If your change point is indeed known and if you want to model different residual scales for each segment (i.e., quasi-emulate segmented), you can do:
model2 = list(
y ~ 1 + x,
~ 0 + x + sigma(1) # Add intercept change in residual scale
)
fit = mcp(model2, data = data.frame(x, y), prior = list(cp_1 = 1.9))  # Note: the prior here is a fixed value, not a distribution.
plot(fit, q_fit = TRUE, cp_dens = FALSE)
Notice that the CI does not "jump" around the change point as in segmented. I believe that this is the correct behavior. Disclosure: I am the author of mcp.
I am trying to fit a polynomial to my dataset, which looks like this (the full dataset is at the end of the post):
The theory predicts that the formulation of the curve is:
which looks like this (for x between 0 and 1):
When I try to make a linear model in R by doing:
mod <- lm(y ~ poly(x, 2, raw=TRUE)/poly(x, 2))
I get the following curve:
This is much different from what I would expect. Do you have any idea how to fit a curve to this data so that it would be similar to the one that theory predicts? Also, it should have only one minimum.
Full dataset:
Vector of x values:
x <- c(0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.11, 0.12,
0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.20, 0.21, 0.22, 0.23, 0.24, 0.25,
0.26, 0.27, 0.28, 0.29, 0.30, 0.31, 0.32, 0.33, 0.34, 0.35, 0.36, 0.37, 0.38,
0.39, 0.40, 0.41, 0.42, 0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.50, 0.51,
0.52, 0.53, 0.54, 0.55, 0.56, 0.57, 0.58, 0.59, 0.60, 0.61, 0.62, 0.63, 0.64,
0.65, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77,
0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90,
0.91, 0.92, 0.93, 0.94, 0.95)
Vector of y values:
y <- c(4.104, 4.444, 4.432, 4.334, 4.285, 4.058, 3.901, 4.382,
4.258, 4.158, 3.688, 3.826, 3.724, 3.867, 3.811, 3.550, 3.736, 3.591,
3.566, 3.566, 3.518, 3.581, 3.505, 3.454, 3.529, 3.444, 3.501, 3.493,
3.362, 3.504, 3.365, 3.348, 3.371, 3.389, 3.506, 3.310, 3.578, 3.497,
3.302, 3.530, 3.593, 3.630, 3.420, 3.467, 3.656, 3.644, 3.715, 3.698,
3.807, 3.836, 3.826, 4.017, 3.942, 4.208, 3.959, 3.856, 4.157, 4.312,
4.349, 4.286, 4.483, 4.599, 4.395, 4.811, 4.887, 4.885, 5.286, 5.422,
5.527, 5.467, 5.749, 5.980, 6.242, 6.314, 6.587, 6.790, 7.183, 7.450,
7.487, 8.566, 7.946, 9.078, 9.308, 10.267, 10.738, 11.922, 12.178, 13.243,
15.627, 16.308, 19.246, 22.022, 25.223, 29.752)
Use nls to fit a nonlinear model. Note that the model formula as displayed in the question is not uniquely defined, since multiplying all the coefficients by the same number gives identical predictions; to avoid this we need to fix one coefficient. A first try used the coefficients shown in the question as starting values (with one of them fixed), but that failed, so C was dropped and the resulting coefficients were fed into a second fit with C = 1 as a starting value.
st <- list(a = 43, b = -14, c = 25, B = 18)
fm <- nls(y ~ (a + b * x + c * x^2) / (9 + B * x), start = st)
fm2 <- nls(y ~ (a + b * x + c * x^2) / (9 + B * x + C * x^2), start = c(coef(fm), C = 1))
plot(y ~ x)
lines(fitted(fm2) ~ x, col = "red")
(continued after chart)
Note: Here is an example of using nls2 to get starting values with random search. We assume that the coefficients each lie between -50 and 50.
library(nls2)
set.seed(123) # for reproducibility
v <- c(a = 50, b = 50, c = 50, B = 50, C = 50)
st0 <- as.data.frame(rbind(-v, v))
fm0 <- nls2(y ~ (a + b * x + c * x^2) / (9 + B * x + C * x^2), start = st0,
alg = "random", control = list(maxiter = 1000))
fm3 <- nls(y ~ (a + b * x + c * x^2) / (9 + B * x + C * x^2), start = coef(fm0))
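As a rough comparison of the two fits (a quick sketch), the residual sums of squares can be inspected:
c(fm2 = deviance(fm2), fm3 = deviance(fm3))  # residual sum of squares of each fit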
Since you already have a theoretic prediction, you don't seem to need a new model; it's really only a plotting task:
# `mod` here is assumed to hold the theoretical predictions evaluated at x
# (the theoretical curve values, not the lm object from the question)
png(); plot(y ~ x)
lines(x, mod, col = "blue")
dev.off()
You cannot expect lm to produce a good approximation to a non-linear problem. The denominator involving x in that theoretic expression makes this inherently nonlinear.
I am trying to fit a negative exponential curve to some data in R, but the fitted line looks too high compared to the data, whereas the fit I get using Excel's built-in power fit looks more believable. Can someone tell me why? I've tried the nls() function and also optim(), and I get similar parameters from both methods, but the fits from both look too high.
x <- c(5.96, 12.86, 8.40, 2.03, 12.84, 21.44, 21.45, 19.97, 8.92, 25.00, 19.90, 20.00, 20.70, 16.68, 14.90, 26.00, 22.00, 22.00, 10.00, 5.70, 5.40, 3.20, 7.60, 0.59, 0.14, 0.85, 9.20, 0.79, 1.40, 2.68, 1.91)
y <- c(5.35, 2.38, 1.77, 1.87, 1.47, 3.27, 2.01, 0.52, 2.72, 0.85, 1.60, 1.37, 1.48, 0.39, 2.39, 1.83, 0.71, 1.24, 3.14, 2.16, 2.22, 11.50, 8.32, 38.98, 16.78, 32.66, 3.89, 1.89, 8.71, 9.74, 23.14)
xy.frame <- data.frame(x,y)
nl.fit <- nls(formula=(y ~ a * x^b), data=xy.frame, start = c(a=10, b=-0.7))
a.est <- coef(nl.fit)[1]
b.est <- coef(nl.fit)[2]
plot(x=xy.frame$x,y=xy.frame$y)
# curve looks too high
curve(a.est * x^b.est , add=T)
# these parameters from Excel seem to fit better
curve(10.495 * x^-0.655, add=T)
# alternatively use optim()
theta.init <- c(1000,-0.5, 50)
exp.nll <- function(theta, data){
a <- theta[1]
b <- theta[2]
sigma <- theta[3]
obs.y <- data$y
x <- data$x
pred.y <- a*x^b
nll <- -sum(dnorm(x=obs.y, mean=pred.y , sd=sigma, log=T))
nll
}
fit.optim <- optim(par=theta.init,fn=exp.nll,method="BFGS",data=xy.frame )
plot(x=xy.frame$x,y=xy.frame$y)
# still looks too high
curve(fit.optim$par[1] * x^fit.optim$par[2], add=T)
The reason you're seeing the unexpected behavior is that the curves that look "too high" actually have much lower sums of squared errors than the curves from Excel:
# Fit from nls
sum((y - a.est*x^b.est)^2)
# [1] 1588.313
# Fit from Excel
sum((y - 10.495*x^ -0.655)^2)
# [1] 1981.561
The reason nls favors the higher curve is that it is working to avoid huge errors at small x values at the cost of slightly larger errors with large x values. One way to address this might be to apply a log-log transformation:
mod <- lm(log(y)~log(x))
(a.est2 <- exp(coef(mod)["(Intercept)"]))
# (Intercept)
# 10.45614
(b.est2 <- coef(mod)["log(x)"])
# log(x)
# -0.6529741
These are quite close to the coefficients from Excel, and they yield a more visually appealing fit (despite the worse performance on the sum-of-squared-errors metric).
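A sketch of that overlay (assuming a.est, b.est, a.est2, and b.est2 from the code above):
plot(x = xy.frame$x, y = xy.frame$y)
curve(a.est * x^b.est, add = TRUE, col = "red")     # nls fit (lower SSE)
curve(a.est2 * x^b.est2, add = TRUE, col = "blue")  # log-log fit, close to Excel's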