Singular gradient in nls model - how can I make it work?

I have data on which I want to fit a non-linear regression model. The model is a physical model for computing the chloride diffusion coefficient. In my case the model looks like
Cx = Ci + (Cs - Ci) * erfc(x / sqrt(4 * D * t))
with Ci = 0.020664, t = 28/365, x and Cx given in the data, and Cs and D the coefficients to be computed. erfc is the complementary error function.
I have data in form of
data = data.frame(x=c(2.13, 4.38, 6.38, 8.13, 10.38, 13.88, 17.38),
Cx=c(0.085, 0.017, 0.011, 0.010, 0.009, 0.010, 0.009))
So what I coded in R was
erf <- function(x) 2 * pnorm(x * sqrt(2)) - 1 # error function
erfc <- function(x) 1 - erf(x) # complementary error function
m1 <- nls(formula = Cx ~ 0.020664 + (Cs - 0.020664) *
erfc(x / (sqrt(4 * D * (28/365)))),
data = data,
start = list(Cs = 0.5, D = 50))
This gives me the error message "singular gradient". Since the data are given and I can't really change the model either, does somebody have an idea how to solve this?
(I have often seen the recommendation to use a different library than nls when this error occurs, but the ones I tried (e.g. nlsr) couldn't differentiate the erfc function.)

Since erfc > 0, if Cs - 0.020664 is positive then the second term is positive, so the entire right-hand side stays above 0.020664, whereas all the points except the first lie below it. Conversely, if Cs - 0.020664 is negative then all fitted values lie below 0.020664, in which case the fit is far from the first point.
Thus it is not a matter of finding a different algorithm -- your model simply does not fit the data.
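A quick numeric check makes this concrete (a minimal sketch using the data from the question, evaluated at the starting values Cs = 0.5 and D = 50):
erfc <- function(x) 2 * pnorm(-x * sqrt(2))   # same complementary error function, written directly
Ci  <- 0.020664
rhs <- Ci + (0.5 - Ci) * erfc(data$x / sqrt(4 * 50 * (28/365)))
range(rhs)          # every model value stays above Ci = 0.020664
sum(data$Cx < Ci)   # yet 6 of the 7 observed Cx values lie below Ci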
Also as a general comment simply looking for a different algorithm when nls fails is often a poor strategy since the situation we have here is often the case. A better strategy is to understand the model better and improve it.
If we modify the model slightly as a linear combination of 1 and erfc then we can get a good fit. The linear combination coefficients are .lin.A and .lin.B and do not need starting values when using the plinear algorithm. The algorithm converges in only 3 iterations which together with the plot below shows it fits the data well. Note that this revised model still has parameter D so if that is the main parameter of interest you can still use this model.
In the plot below the circles are the data points and the line is the fitted values at them.
fm <- nls(formula = Cx ~ cbind(A = 1, B = erfc(x / sqrt(4 * D * 28/365))),
data = data, start = list(D = 25), algorithm = "plinear")
fm
## Nonlinear regression model
## model: Cx ~ cbind(A = 1, B = erfc(x/sqrt(4 * D * 28/365)))
## data: data
## D .lin.A .lin.B
## 25.932426 0.009704 0.263631
## residual sum-of-squares: 2.041e-06
##
## Number of iterations to convergence: 3
## Achieved convergence tolerance: 6.247e-06
plot(Cx ~ x, data)
lines(fitted(fm) ~ x, data)

Related

Using the MLE and NLS functions in R for a nonlinear model

I'm having some issues implementing MLE and NLS models for an estimation that I am trying to perform. The model is as follows:
Y = A*(K^b_1)*(L^b_2) + e
where e is just the error term and K and L are input variables. The goal is to estimate b_1 and b_2. My first question is, how would I put this into the nls function? When I try, I get the following error:
prod.nls<-nls(Y~A*(K^Beta_1)*(L^Beta_2), start =list(A = 2, Beta_1= 2, Beta_2=2))
Error in numericDeriv(form[[3L]], names(ind), env) :
Missing value or an infinity produced when evaluating the model
My other question is that the model above can be rewritten in terms of logs,
log(Y) = log(A) + b_1*log(K) + b_2*log(L)
I omit the error term as it becomes irrelevant, and A is just a scalar so I leave it out as well. However, when I place that model into R using the mle function I get the following error:
prod.mle<-mle(log(Y)~log(K)+log(L))
Error in minuslogl() : could not find function "minuslogl"
In addition: Warning messages:
1: In formals(fun) : argument is not a function
2: In formals(fun) : argument is not a function
A small table of values from the dataset is provided below to be able to reproduce these errors.
Thank you for the help ahead of time.
Y           K          L
26971.71    32.46371   3013256.014
330252.5    28.42238   135261574.9
127345.3    5.199048   39168414.92
3626843     327.807    1118363069
37192.73    16.01538   9621912.503
1) Try removing A and just optimizing with the other 2 parameters. Then use the result as starting values to reoptimize. In the second application of nls we use the plinear algorithm which does not require starting values for parameters that enter linearly. When plinear is used the right hand side should be a matrix such that a linear parameter multiplies each column. In this case we only have one column.
fo <- Y ~ cbind(A = (K^Beta_1) * (L^Beta_2))
st <- list(Beta_1 = 2, Beta_2 = 2)
fm0 <- nls(fo, DF, start = st)
fm1 <- nls(fo, DF, start = coef(fm0), alg = "plinear"); fm1
giving:
Nonlinear regression model
model: Y ~ cbind(A = (K^Beta_1) * (L^Beta_2))
data: DF
Beta_1 Beta_2 .lin.A
0.25399 0.81422 0.03572
residual sum-of-squares: 2.468e+09
Number of iterations to convergence: 7
Achieved convergence tolerance: 6.56e-06
2) If we take logs of both sides then the formula is linear in all parameters, provided we use log(A) rather than A as one parameter, so we can use lm. Note that this is not an exactly equivalent problem to the original one, although it may be close enough for you. Below we use it as an alternative way to get starting values.
fm2 <- lm(log(Y) ~ log(K) + log(L), DF)
co2 <- coef(fm2)
st2 <- list(Beta_1 = co2[[2]], Beta_2 = co2[[3]])
fm3 <- nls(fo, DF, start = st2, alg = "plinear"); fm3
giving:
Nonlinear regression model
model: Y ~ cbind(A = (K^Beta_1) * (L^Beta_2))
data: DF
Beta_1 Beta_2 .lin.A
0.2540 0.8143 0.0357
residual sum-of-squares: 2.468e+09
Number of iterations to convergence: 6
Achieved convergence tolerance: 3.744e-06
Note
The input DF in reproducible form is:
DF <- structure(list(
  Y = c(26971.71, 330252.5, 127345.3, 3626843, 37192.73),
  K = c(32.46371, 28.42238, 5.199048, 327.807, 16.01538),
  L = c(3013256.014, 135261574.9, 39168414.92, 1118363069, 9621912.503)),
  class = "data.frame", row.names = c(NA, -5L))
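As a quick sanity check (a sketch using the fm1 fit and DF from above), one can compare observed and fitted values; points near the y = x line indicate a good fit:
plot(DF$Y, fitted(fm1), xlab = "observed Y", ylab = "fitted Y")
abline(0, 1)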

Manual Perceptron example in R - are the results acceptable?

I am trying to get a perceptron algorithm for classification working but I think something is missing. This is the decision boundary achieved with logistic regression:
The red dots got into college, after performing better on tests 1 and 2.
This is the data, and this is the code for the logistic regression in R:
dat = read.csv("perceptron.txt", header=F)
colnames(dat) = c("test1","test2","y")
plot(test2 ~ test1, col = as.factor(y), pch = 20, data=dat)
fit = glm(y ~ test1 + test2, family = "binomial", data = dat)
coefs = coef(fit)
(x = c(min(dat[,1])-2, max(dat[,1])+2))
(y = c((-1/coefs[3]) * (coefs[2] * x + coefs[1])))
lines(x, y)
The code for the "manual" implementation of the perceptron is as follows:
# DATA PRE-PROCESSING:
dat = read.csv("perceptron.txt", header=F)
dat[,1:2] = apply(dat[,1:2], MARGIN = 2, FUN = function(x) scale(x)) # scaling the data
data = data.frame(rep(1,nrow(dat)), dat) # introducing the "bias" column
colnames(data) = c("bias","test1","test2","y")
data$y[data$y==0] = -1 # Turning 0/1 dependent variable into -1/1.
data = as.matrix(data) # Turning data.frame into matrix to avoid mmult problems.
# PERCEPTRON:
set.seed(62416)
no.iter = 1000 # Number of loops
theta = rnorm(ncol(data) - 1) # Starting a random vector of coefficients.
theta = theta/sqrt(sum(theta^2)) # Normalizing the vector.
h = theta %*% t(data[,1:3]) # Performing the first f(theta^T X)
for (i in 1:no.iter){                      # We will recalculate 1,000 times.
  for (j in 1:nrow(data)){                 # Each time we go through each example.
    if (h[j] * data[j, 4] < 0){            # If the hypothesis disagrees with the sign of y,
      theta = theta + (sign(data[j, 4]) * data[j, 1:3])  # add or subtract the example from theta.
    } else {
      theta = theta                        # Else we let it be.
    }
  }
  h = theta %*% t(data[,1:3])              # Recalculate h() after each full pass.
}
theta # Final coefficients
mean(sign(h) == data[,4]) # Accuracy
With this, I get the following coefficients:
bias test1 test2
9.131054 19.095881 20.736352
and an accuracy of 88%, consistent with the 89% calculated for the glm() logistic regression via mean(sign(predict(fit)) == data[,4]). Logically, there is no way to linearly classify all of the points, as is obvious from the plot above. In fact, iterating only 10 times and plotting the accuracy, ~90% is reached after just 1 iteration:
Since this is in line with the training classification performance of logistic regression, the code is probably not conceptually wrong.
QUESTIONS: Is it OK to get coefficients so different from the logistic regression:
(Intercept) test1 test2
1.718449 4.012903 3.743903
This is really more of a CrossValidated question than a StackOverflow question, but I'll go ahead and answer.
Yes, it's normal and expected to get very different coefficients because you can't directly compare the magnitude of the coefficients between these 2 techniques.
With the logit (logistic) model you're using a binomial distribution and logit-link based on a sigmoid cost function. The coefficients are only meaningful in this context. You've also got an intercept term in the logit.
None of this is true for the perceptron model. The interpretation of the coefficients is thus totally different.
Now, that's not saying anything about which model is better. There aren't comparable performance metrics in your question that would allow us to determine that. To determine that you should do cross-validation or at least use a holdout sample.
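For example, a minimal holdout sketch for the logistic model (assuming the dat data frame read in above; the 70/30 split is arbitrary):
set.seed(1)
idx   <- sample(nrow(dat), round(0.7 * nrow(dat)))
train <- dat[idx, ]
test  <- dat[-idx, ]
fit_h <- glm(y ~ test1 + test2, family = "binomial", data = train)
mean((predict(fit_h, newdata = test, type = "response") > 0.5) == test$y)  # holdout accuracy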

Fitting non-linear Langmuir Isotherm in R

I want to fit isotherm models to the following data in R. The simplest isotherm model is the Langmuir model given here (the model is at the bottom of the page). My MWE is given below, which throws the error shown. I wonder if there is any R package for isotherm models.
X <- c(10, 30, 50, 70, 100, 125)
Y <- c(155, 250, 270, 330, 320, 323)
Data <- data.frame(X, Y)
LangIMfm2 <- nls(formula = Y ~ Q*b*X/(1+b*X), data = Data, start = list(Q = 1, b = 0.5), algorithm = "port")
Error in nls(formula = Y ~ Q * b * X/(1 + b * X), data = Data, start = list(Q = 1, :
Convergence failure: singular convergence (7)
Edited
Some nonlinear models can be transformed to linear models. My understanding is that there might be a one-to-one relationship between the estimates of a nonlinear model and its linearized form, but their corresponding standard errors are not related to each other. Is this assertion true? Are there any pitfalls in fitting nonlinear models by transforming them to linearity?
I am not aware of such packages, and personally I don't think you need one, as the problem can be solved using base R.
nls is sensitive to the starting parameters, so you should begin with a good starting guess. You can easily evaluate Q because it corresponds to the asymptotic limit of the isotherm at x-->Inf, so it is reasonable to begin with Q=323 (which is the last value of Y in your sample data set).
Next, you could do plot(Data) and add a line with an isotherm that corresponds to your starting parameters Q and b and tweak b to come up with a reasonable guess.
The plot below shows your data set (points) and a probe isotherm with Q = 323 and b = 0.5, generated by with(Data,lines(X,323*0.5*X/(1+0.5*X),col='red')) (red line). It seemed a reasonable starting guess to me, and I gave it a try with nls:
LangIMfm2 <- nls(formula = Y ~ Q*b*X/(1+b*X), data = Data, start = list(Q = 300, b = 1), algorithm = "port")
# Nonlinear regression model
# model: Y ~ Q * b * X/(1 + b * X)
# data: Data
# Q b
# 366.2778 0.0721
# residual sum-of-squares: 920.6
#
# Algorithm "port", convergence message: relative convergence (4)
and plotted predicted line to make sure that nls found the right solution:
lines(Data$X,predict(LangIMfm2),col='green')
Having said that, I would suggest using a more effective strategy, based on linearization of the model by rewriting the isotherm equation in reciprocal coordinates:
z <- 1/Data
plot(Y~X,z)
abline(lm(Y~X,z))
M <- lm(Y~X,z)
Q <- 1/coef(M)[1]
# 363.2488
b <- coef(M)[1]/coef(M)[2]
# 0.0741759
As you can see, both approaches produce essentially the same result, but the linear model is more robust and doesn't require starting parameters (and, as far as I remember, it is the standard way of isotherm analysis in experimental physical chemistry).
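To double-check the linearized estimates on the original scale, one can overlay the implied Langmuir curve on the data (a short sketch using the Q and b computed above):
plot(Data)
curve(Q * b * x / (1 + b * x), add = TRUE, col = "blue")   # curve implied by the reciprocal fit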
You can use the SSmicmen self-starting model (see Ritz and Streibig, 2008, Nonlinear Regression with R), shipped with base R's stats package, which calculates initial parameters from a fit of the linearized form of the Michaelis-Menten (MM) equation. Fortunately, the MM equation possesses a form that can be adapted to the Langmuir equation, S = Smax*x/(KL + x). I've found the nlshelper and tidyverse packages useful for modeling and for exporting the results of the nls command into tables and plots, particularly when modeling sample groups. Here's my code for modeling a single set of sorption data:
library(tidyverse)
library(nlme)
library(nlshelper)
lang.fit <- nls(Y ~ SSmicmen(X,Smax,InvKL), data=Data)
fit.summary <- tidy(lang.fit)
fit.coefs <- coef(lang.fit)
For simplicity, the Langmuir affinity constant is modeled here as 1/KL. Applying this code, I get the same parameter estimates as @Marat obtained above.
The simple code below allows for wrangling the data in order to create a ggplot object, containing the original points and fitted line (i.e., geom_point would represent the original X and Y data, geom_line would represent the original X plus YHat).
FitY <- tibble(predict(lang.fit))
YHat <- FitY[,1]
Data2 <- cbind(Data, YHat)
If you want to model multiple groups of data (say, based on a "Sample_name" column), then the lang.fit variable would be calculated as below, this time using the nlsList command:
lang.fit <- nlsList(Y ~ SSmicmen(X,Smax,InvKL) | Sample_name, data=Data)
The problem is the starting values. We show two approaches to this as well as an alternative that converges even using the starting values in the question.
1) plinear The right-hand side is linear in Q*b, so it is convenient to absorb b into the linearly entering parameter, which makes the problem easier to solve. With the plinear algorithm no starting values are needed for the linear parameter, so only a starting value for b need be specified. With plinear the right-hand side of the nls formula should be specified as the vector that multiplies the linear parameter. The result of running nls, giving fm0 below, has coefficients named b and .lin, where Q = .lin / b.
We already have our answer from fm0, but if we want a clean run in terms of b and Q rather than b and .lin, we can rerun the original formula from the question using the starting values implied by the coefficients returned by fm0, as shown.
fm0 <- nls(Y ~ X/(1+b*X), Data, start = list(b = 0.5), alg = "plinear")
st <- with(as.list(coef(fm0)), list(b = b, Q = .lin/b))
fm <- nls(Y ~ Q*b*X/(1+b*X), Data, start = st)
fm
giving
Nonlinear regression model
model: Y ~ Q * b * X/(1 + b * X)
data: Data
b Q
0.0721 366.2778
residual sum-of-squares: 920.6
Number of iterations to convergence: 0
Achieved convergence tolerance: 9.611e-07
We can display the result. The points are the data and the red line is the fitted curve.
plot(Data)
lines(fitted(fm) ~ X, Data, col = "red")
(continued after plot)
2) mean Alternatively, using a starting value of mean(Data$Y) for Q seems to work well.
nls(Y ~ Q*b*X/(1+b*X), Data, start = list(b = 0.5, Q = mean(Data$Y)))
giving:
Nonlinear regression model
model: Y ~ Q * b * X/(1 + b * X)
data: Data
b Q
0.0721 366.2779
residual sum-of-squares: 920.6
Number of iterations to convergence: 6
Achieved convergence tolerance: 5.818e-06
The question already had a reasonable starting value for b, which we used; but if one were needed, one could set Y to Q*b (so that those factors cancel) and X to mean(Data$X), and solve for b to give b = 1 - 1/mean(Data$X) as a possible starting value. Using this starting value for b with mean(Data$Y) as the starting value for Q also results in convergence, as sketched below.
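A minimal sketch of that heuristic, using the data from the question:
b0 <- 1 - 1/mean(Data$X)   # heuristic starting value for b described above
nls(Y ~ Q*b*X/(1+b*X), Data, start = list(b = b0, Q = mean(Data$Y)))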
3) optim If we use optim the algorithm converges even with the initial values used in the question. We form the residual sum of squares and minimize that:
rss <- function(p) {
  Q <- p[1]
  b <- p[2]
  with(Data, sum((Y - b*Q*X/(1+b*X))^2))
}
optim(c(1, 0.5), rss)
giving:
$par
[1] 366.27028219 0.07213613
$value
[1] 920.62
$counts
function gradient
249 NA
$convergence
[1] 0
$message
NULL

Calculating R^2 for a nonlinear least squares fit

Suppose I have x values, y values, and expected y values f (from some nonlinear best fit curve).
How can I compute R^2 in R? Note that this function is not a linear model, but a nonlinear least squares (nls) fit, so not an lm fit.
You just use the lm function to fit a linear model:
x = runif(100)
y = runif(100)
spam = summary(lm(x~y))
> spam$r.squared
[1] 0.0008532386
Note that R-squared is not defined for non-linear models, or is at least very tricky. To quote from R-help:
There is a good reason that an nls model fit in R does not provide
r-squared - r-squared doesn't make sense for a general nls model.
One way of thinking of r-squared is as a comparison of the residual
sum of squares for the fitted model to the residual sum of squares for
a trivial model that consists of a constant only. You cannot
guarantee that this is a comparison of nested models when dealing with
an nls model. If the models aren't nested this comparison is not
terribly meaningful.
So the answer is that you probably don't want to do this in the first
place.
If you want peer-reviewed evidence, see this article for example; it's not that you can't compute the R^2 value, it's just that it may not mean the same thing/have the same desirable properties as in the linear-model case.
Sounds like f are your predicted values. Then take one minus the sum of squared distances from the actual values, divided by n times the variance of y;
so something like
1-sum((y-f)^2)/(length(y)*var(y))
should give you a quasi R-squared value, so long as your model is reasonably close to a linear model and n is pretty big.
As a direct answer to the question asked (rather than arguing that R2/pseudo-R2 isn't useful), the nagelkerke function in the rcompanion package will report various pseudo-R2 values for nonlinear least squares (nls) models, as proposed by McFadden, Cox and Snell, and Nagelkerke, e.g.
library(rcompanion)   # provides nagelkerke() and the BrendonSmall data set
data(BrendonSmall)
quadplat = function(x, a, b, clx) {
  ifelse(x < clx,
         a + b * x + (-0.5*b/clx) * x * x,
         a + b * clx + (-0.5*b/clx) * clx * clx)}
model = nls(Sodium ~ quadplat(Calories, a, b, clx),
            data = BrendonSmall,
            start = list(a = 519, b = 0.359, clx = 2304))
nullfunct = function(x, m){m}
null.model = nls(Sodium ~ nullfunct(Calories, m),
                 data = BrendonSmall,
                 start = list(m = 1346))
nagelkerke(model, null = null.model)
The soilphysics package also reports Efron's pseudo R2 and adjusted pseudo R2 value for nls models as 1 - RSS/TSS:
pred <- predict(model)
n <- length(pred)
res <- resid(model)
w <- weights(model)
if (is.null(w)) w <- rep(1, n)
rss <- sum(w * res ^ 2)
resp <- pred + res
center <- weighted.mean(resp, w)
r.df <- summary(model)$df[2]
int.df <- 1
tss <- sum(w * (resp - center)^2)
r.sq <- 1 - rss/tss
adj.r.sq <- 1 - (1 - r.sq) * (n - int.df) / r.df
out <- list(pseudo.R.squared = r.sq,
adj.R.squared = adj.r.sq)
which is also the pseudo-R2 as calculated by the accuracy function in the rcompanion package. Basically, this R2 measures how much better your fit is than simply drawing a flat horizontal line through the data. This can make sense for nls models if your null model is an intercept-only model, and it can also make sense for particular other nonlinear models. E.g. for a scam model that uses strictly increasing splines (bs="mpi" in the spline term), the fitted model for the worst possible scenario (e.g. data that are strictly decreasing) would be a flat line, and hence would result in an R2 of zero. The adjusted R2 additionally penalizes models with higher numbers of fitted parameters. Using the adjusted R2 value would already address a lot of the criticisms of the paper linked above, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2892436/ (besides, if one swears by using information criteria for model selection, the question becomes which one to use: AIC, BIC, EBIC, AICc, QIC, etc.).
Just using
r.sq <- max(cor(y,yfitted),0)^2
adj.r.sq <- 1 - (1 - r.sq) * (n - int.df) / r.df
would, I think, also make sense if you have normal Gaussian errors: i.e. the correlation between the observed and fitted y (clipped at zero, so that a negative relationship implies zero predictive power), squared, and then adjusted for the number of fitted parameters in the adjusted version. If y and yfitted go in the same direction this would be the R2 and adjusted R2 value as reported for a regular linear model. To me this makes perfect sense, so I don't agree with outright rejecting the usefulness of pseudo-R2 values for nls models, as the answer above seems to imply.
For non-normal error structures (e.g. if you were using a GAM with non-normal errors) the McFadden pseudo R2 is defined analogously as
1-residual deviance/null deviance
See here and here for some useful discussion.
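For deviance-based fits this ratio is easy to compute by hand; a minimal sketch with a hypothetical Poisson GLM on a built-in data set:
fit <- glm(count ~ spray, family = poisson, data = InsectSprays)
1 - fit$deviance / fit$null.deviance   # McFadden-style pseudo-R2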
Another quasi-R-squared for non-linear models is to square the correlation between the actual y-values and the predicted y-values. For linear models this is the regular R-squared.
As an alternative, I have used the following procedure several times:
1) compute a fit on the data with the nls function
2) using the resulting model, make predictions
3) plot the data against the values predicted by the model (if the model is good, the points should lie near the bisector, i.e. the y = x line)
4) compute the R2 of that linear regression, as sketched below
Best wishes to all. Patrick.
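A short sketch of that procedure, borrowing the mtcars nls example used in the answer below (the starting values are only illustrative):
fit  <- nls(mpg ~ a / wt + b, data = mtcars, start = list(a = 40, b = 4))
pred <- predict(fit)
plot(pred, mtcars$mpg)
abline(0, 1)                                   # points should hug the y = x line
summary(lm(mtcars$mpg ~ pred))$r.squared       # R2 of this linear regression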
With the modelr package
modelr::rsquare(nls_model, data)
nls_model <- nls(mpg ~ a / wt + b, data = mtcars, start = list(a = 40, b = 4))
modelr::rsquare(nls_model, mtcars)
# 0.794
This gives essentially the same result as the longer way described by Tom from the rcompanion resource.
Longer way with nagelkerke function
nullfunct <- function(x, m){m}
null_model <- nls(mpg ~ nullfunct(wt, m),
data = mtcars,
start = list(m = mean(mtcars$mpg)))
nagelkerke(nls_model, null_model)[2]
# 0.794 or 0.796
Lastly, using predicted values
lm(mpg ~ predict(nls_model), data = mtcars) %>% broom::glance()
# 0.795
Like they say, it's only an approximation.

How to put a complicated equation into a R formula?

We have the diameter of trees as the predictor and tree height as the dependent variable. A number of different equations exist for this kind of data, and we are trying to fit some of them and compare the results.
However, we can't figure out how to correctly put one equation into the corresponding R formula format.
The trees data set in R can be used as an example.
data(trees)
df <- trees
df$h <- df$Height * 0.3048 #transform to metric system
df$dbh <- (trees$Girth * 0.3048) / pi #transform tree girth to diameter
First, the example of an equation that seems to work well:
form1 <- h ~ I(dbh ^ -1) + I( dbh ^ 2)
m1 <- lm(form1, data = df)
m1
Call:
lm(formula = form1, data = df)
Coefficients:
(Intercept) I(dbh^-1) I(dbh^2)
27.1147 -5.0553 0.1124
Coefficients a, b and c are estimated, which is what we are interested in.
Now the problematic equation:
h = 1.3 + dbh^2 / (a + b * dbh + c * dbh^2)
Trying to fit it like this:
form2 <- h ~ I(dbh ^ 2) / dbh + I(dbh ^ 2) + 1.3
gives an error:
m1 <- lm(form2, data = df)
Error in terms.formula(formula, data = data)
invalid model formula in ExtractVars
I guess this is because / is interpreted as a nested model and not an arithmetic operator?
This doesn't give an error:
form2 <- h ~ I(I(dbh ^ 2) / dbh + I(dbh ^ 2) + 1.3)
m1 <- lm(form2, data = df)
But the result is not the one we want:
m1
Call:
lm(formula = form2, data = df)
Coefficients:
(Intercept) I(I(dbh^2)/dbh + I(dbh^2) + 1.3)
19.3883 0.8727
Only one coefficient is given for the whole term within the outer I(), which seems logical.
How can we fit the second equation to our data?
Assuming you are using nls, the R formula can use an ordinary R function, H(a, b, c, D), so the formula can be just h ~ H(a, b, c, dbh), and this works:
# use lm to get starting values
lm1 <- lm(1/(h - 1.3) ~ I(1/dbh) + I(1/dbh^2), df)
start <- rev(setNames(coef(lm1), c("c", "b", "a")))
# run nls
H <- function(a, b, c, D) 1.3 + D^2 / (a + b * D + c * D^2)
nls1 <- nls(h ~ H(a, b, c, dbh), df, start = start)
nls1 # display result
Graphing the output:
plot(h ~ dbh, df)
lines(fitted(nls1) ~ dbh, df)
You've got a couple of problems. (1) You're missing parentheses for the denominator of form2 (and R has no way to know that you want to add a constant a in the denominator, or where to put any of the parameters, really); and, much more problematic, (2) your 2nd model isn't linear, so lm won't work.
Fixing (1) is easy:
form2 <- h ~ 1.3 + I(dbh^2) / (a + b * dbh + c * I(dbh^2))
Fixing (2), though there are many ways to estimate parameters for a nonlinear model, the nls (nonlinear least squares) is a good place to start:
m2 <- nls(form2, data = df, start = list(a = 1, b = 1, c = 1))
You need to provide starting guesses for the parameters in nls. I just picked 1's, but you should use better guesses that ballpark what the parameters might be.
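One way to ballpark the starting guesses is to fit the linearized (reciprocal) form of the model first, as other answers here do, and feed its coefficients to nls (a sketch; lm0 and start2 are just illustrative names):
lm0 <- lm(1/(h - 1.3) ~ I(1/dbh) + I(1/dbh^2), data = df)
# 1/(h - 1.3) = c + b*(1/dbh) + a*(1/dbh^2), so map the lm coefficients back:
start2 <- list(a = coef(lm0)[[3]], b = coef(lm0)[[2]], c = coef(lm0)[[1]])
m2 <- nls(form2, data = df, start = start2)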
edit: fixed, no longer incorrectly using offset ...
An answer that complements @shujaa's:
You can transform your problem from
H = 1.3 + D^2/(a+b*D+c*D^2)
to
1/(H-1.3) = a/D^2+b/D+c
This would normally mess up the assumptions of the model (i.e., if H were normally distributed with constant variance, then 1/(H-1.3) wouldn't be). However, let's try it anyway:
data(trees)
df <- transform(trees,
h=Height * 0.3048, #transform to metric system
dbh=Girth * 0.3048 / pi #transform tree girth to diameter
)
lm(1/(h-1.3) ~ poly(I(1/dbh),2,raw=TRUE),data=df)
## Coefficients:
## (Intercept) poly(I(1/dbh), 2, raw = TRUE)1
## 0.043502 -0.006136
## poly(I(1/dbh), 2, raw = TRUE)2
## 0.010792
These results would normally be good enough to get good starting values for the nls fit. However, you can do better than that via glm, which uses a link function to allow for some forms of non-linearity. Specifically,
(fit2 <- glm(h-1.3 ~ poly(I(1/dbh),2,raw=TRUE),
family=gaussian(link="inverse"),data=df))
## Coefficients:
## (Intercept) poly(I(1/dbh), 2, raw = TRUE)1
## 0.041795 -0.002119
## poly(I(1/dbh), 2, raw = TRUE)2
## 0.008175
##
## Degrees of Freedom: 30 Total (i.e. Null); 28 Residual
## Null Deviance: 113.2
## Residual Deviance: 80.05 AIC: 125.4
##
You can see that the results are approximately the same as the linear fit, but not quite.
pframe <- data.frame(dbh=seq(0.8,2,length=51))
We use predict, but need to correct the prediction to account for the fact that we subtracted a constant from the LHS:
pframe$h <- predict(fit2,newdata=pframe,type="response")+1.3
p2 <- predict(fit2,newdata=pframe,se.fit=TRUE) ## predict on link scale
pframe$h_lwr <- with(p2,1/(fit+1.96*se.fit))+1.3
pframe$h_upr <- with(p2,1/(fit-1.96*se.fit))+1.3
png("dbh_tmp1.png",height=4,width=6,units="in",res=150)
par(las=1,bty="l")
plot(h~dbh,data=df)
with(pframe,lines(dbh,h,col=2))
with(pframe,polygon(c(dbh,rev(dbh)),c(h_lwr,rev(h_upr)),
border=NA,col=adjustcolor("black",alpha=0.3)))
dev.off()
Because we have used the constant on the LHS (this almost, but doesn't quite, fit into the framework of using an offset -- we could only use an offset if our formula were 1/H - 1.3 = a/D^2 + ..., i.e. if the constant adjustment were on the link (inverse) scale rather than the original scale), this doesn't fit perfectly into ggplot's geom_smooth framework:
library("ggplot2")
ggplot(df,aes(dbh,h))+geom_point()+theme_bw()+
geom_line(data=pframe,colour="red")+
geom_ribbon(data=pframe,colour=NA,alpha=0.3,
aes(ymin=h_lwr,ymax=h_upr))
ggsave("dbh_tmp2.png",height=4,width=6)
