R expression from `nls` fit?

I've fit a parametric function using nls, and now I want to print out an expression of the function with the learned parameters substituted back in. For example:
x <- runif(100, 0, 100)
m <- 13 * exp(-0.05 * x^2) + 0.1 + runif(100,0,0.1)
mod <- nls(m ~ a*exp(-b*x^2)+c, start=list(a=10,b=0.1,c=0.1))
I can extract the formula and coefficients like so:
formula(mod)
# m ~ a * exp(-b * x^2) + c
coef(mod)
# a b c
# 13.00029360 0.04975388 0.14457936
But I don't see a way to substitute them back directly. The only thing I can seem to do involves writing out the formula again:
substitute(m ~ a * exp(-b * x^2) + c, as.list(round(coef(mod), 4)))
# m ~ 13.0003 * exp(-0.0498 * x^2) + 0.1446
My ultimate goal here is to read a fitted nls object from an RDS file on disk and show its functional expression in an org-mode document.

Is this what you're looking for?
do.call(substitute, args=list(formula(mod), as.list(round(coef(mod),4))))
# m ~ 13.0097 * exp(-0.0501 * x^2) + 0.1536
It works because do.call() first evaluates both of the arguments in args, and only then calls substitute() to substitute the coefficients into the formula expression. That is, the call that do.call() ultimately evaluates looks like this one, as desired:
as.call(list(substitute, formula(mod), as.list(round(coef(mod),4))))
# .Primitive("substitute")(m ~ a * exp(-b * x^2) + c, list(a = 13.0097,
# b = 0.0501, c = 0.1536))
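Given your end goal, a minimal sketch of the full round trip (the file name mod.rds is a hypothetical placeholder): read the fitted object back from disk, substitute the coefficients, and deparse the resulting call into a string that can be embedded in an org-mode document.
mod <- readRDS("mod.rds")  # hypothetical path to the saved nls object
expr <- do.call(substitute, args = list(formula(mod), as.list(round(coef(mod), 4))))
deparse(expr)
# [1] "m ~ 13.0097 * exp(-0.0501 * x^2) + 0.1536"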

Related

How to interpret coefficients of logistic regression

I'm trying to figure out how the coefficients of logistic regression with a polynomial term relate to predictions. Specifically, I'm interested in the location on the x-axis where the prediction is highest. Example below:
set.seed(42)
# Setup some dummy data
x <- 1:200
y <- rep(0, length(x))
y[51:150] <- rbinom(100, 1, 0.5)
# Fit a model
family <- binomial()
model <- glm(y ~ poly(x, 2), family = family)
# Illustrate model
plot(x, y)
lines(x, family$linkinv(predict(model)), col = 2)
The model above gives me these coefficients:
coef(model)
#> (Intercept) poly(x, 2)1 poly(x, 2)2
#> -1.990317 -3.867855 -33.299893
The manual page for poly() states the following:
The orthogonal polynomial is summarized by the coefficients, which can be used to evaluate it via the three-term recursion given in Kennedy & Gentle (1980, pp. 343–4), and used in the predict part of the code.
However, I don't have access to the book, nor am I able to discern from the predict.glm S3 method how these coefficients are handled. Is there a way to reconstruct the location of the summit (around 100 in the example) from the coefficients alone (i.e. without using predict() to find the maximum)?
Derivation of the location of the predicted maximum from the theoretical expressions of the orthogonal polynomials
I got a copy of the "Statistical Computing" book by Kennedy and Gentle (1980) referenced in the documentation of poly, and now share my findings about the calculation of the orthogonal polynomials, and how we can use them to find the location x of the maximum predicted value.
The orthogonal polynomials presented in the book (pp. 343-4) are monic (i.e. the highest order coefficient is always 1) and are obtained by the following recurrence procedure:
p(-1, x) = 0
p(0, x) = 1
p(j+1, x) = (x - rho(j+1)) * p(j, x) - gamma(j) * p(j-1, x),   j = 0, 1, ..., q-1
where q is the number of orthogonal polynomials considered.
Note the following relationship of the above terminology with the documentation of poly:
The "three-term recursion" appearing in the excerpt included in your question is the RHS of the third expression which has precisely three terms.
The rho(j+1) coefficients in the third expression are called "centering constants".
The gamma(j) coefficients in the third expression do not have a name in the documentation but they are directly related to the "normalization constants", as seen below.
For reference, here I paste the relevant excerpt of the "Value" section of the poly documentation:
A matrix with rows corresponding to points in x and columns corresponding to the degree, with attributes "degree" specifying the degrees of the columns and (unless raw = TRUE) "coefs" which contains the centering and normalization constants used in constructing the orthogonal polynomials
Going back to the recurrence, we can derive the values of parameters rho(j+1) and gamma(j) from the third expression by imposing the orthogonality condition on p(j+1) w.r.t. p(j) and p(j-1).
(It's important to note that the orthogonality condition is not an integral, but a summation over the n observed x points, so the polynomial coefficients depend on the data! --which is not the case, for instance, for the Chebyshev orthogonal polynomials.)
The expressions for the parameters become:
rho(j+1) = sum_over_i{ x_i * p(j, x_i)^2 } / sum_over_i{ p(j, x_i)^2 }
gamma(j) = sum_over_i{ p(j, x_i)^2 } / sum_over_i{ p(j-1, x_i)^2 }
For the polynomials of orders 1 and 2 used in your regression, we get the following expressions, already written in R code:
# First we define the number of observations in the data
n = length(x)
# For p1(x):
# p1(x) = (x - rho1) p0(x) (since p_{-1}(x) = 0)
rho1 = mean(x)
# For p2(x)
# p2(x) = (x - rho2) p1(x) - gamma1
gamma1 = var(x) * (n-1)/n
rho2 = sum( x * (x - mean(x))^2 ) / (n*gamma1)
for which we get:
> c(rho1, rho2, gamma1)
[1] 100.50 100.50 3333.25
Note that the coefs attribute of poly(x,2) is:
> attr(poly(x,2), "coefs")
$alpha
[1] 100.5 100.5
$norm2
[1] 1 200 666650 1777555560
where $alpha contains the centering constants, i.e. the rho values (which coincide with ours; incidentally, all centering constants are equal to the average of x when the distribution of x is symmetric, for any q! --observed and proved). $norm2 contains the normalization constants, in this case for p(-1,x), p(0,x), p(1,x), and p(2,x). These are the constants c(j) that normalize the polynomials in the recurrence formula (by dividing them by sqrt(c(j))), making the resulting polynomials r(j,x) satisfy sum_over_i{ r(j,x_i)^2 } = 1. Note that r(j,x) are the polynomials stored in the object returned by poly().
From the expression already given above, we observe that gamma(j) is precisely the ratio of two consecutive normalization constants, namely: gamma(j) = c(j) / c(j-1).
We can check that our gamma1 value coincides with this ratio by computing:
gamma1 == attr(poly(x,2), "coefs")$norm2[3] / attr(poly(x,2), "coefs")$norm2[2]
which returns TRUE.
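As a broader sanity check (a quick sketch, reusing the x from your example), the columns of poly(x, 2) are orthonormal by construction, so their cross-product should be the identity matrix up to floating point; this confirms both the orthogonality condition and the sum_over_i{ r(j,x_i)^2 } = 1 normalization at once.
round(crossprod(poly(x, 2)), 12)
#   1 2
# 1 1 0
# 2 0 1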
Going back to your problem of finding the maximum of the values predicted by your model, we can:
Express the predicted value as a function of r(1,x) and r(2,x) and the coefficients from the logistic regression, namely:
pred(x) = beta0 + beta1 * r(1,x) + beta2 * r(2,x)
Differentiate the expression w.r.t. x, set the derivative to 0, and solve for x.
In R code:
# Get the normalization constants alpha(j) to obtain r(j,x) from p(j,x) as
# r(j,x) = p(j,x) / sqrt( norm(j) ) = p(j,x) / alpha(j)
alpha1 = sqrt( attr(poly(x,2), "coefs")$norm2[3] )
alpha2 = sqrt( attr(poly(x,2), "coefs")$norm2[4] )
# Get the logistic regression coefficients (beta1 and beta2)
coef1 = as.numeric( model$coeff["poly(x, 2)1"] )
coef2 = as.numeric( model$coeff["poly(x, 2)2"] )
# Compute the x at which the maximum occurs from the expression that is obtained
# by differentiating the predicted expression pred(x) = beta0 + beta1*r(1,x) + beta2*r(2,x)
# w.r.t. x and setting the derivative to 0.
xmax = ( alpha2^-1 * coef2 * (rho1 + rho2) - alpha1^-1 * coef1 ) / (2 * alpha2^-1 * coef2)
which gives:
> xmax
[1] 97.501114
i.e. the same value obtained with the "empirical" method described in my other answer below.
The full code to obtain the location x of the maximum of the predicted values, starting off from the code you provided, is:
# First we define the number of observations in the data
n = length(x)
# Parameters for p1(x):
# p1(x) = (x - rho1) p0(x) (since p_{-1}(x) = 0)
rho1 = mean(x)
# Parameters for p2(x)
# p2(x) = (x - rho2) p1(x) - gamma1
gamma1 = var(x) * (n-1)/n
rho2 = mean( x * (x - mean(x))^2 ) / gamma1
# Get the normalization constants alpha(j) to obtain r(j,x) from p(j,x) as
# r(j,x) = p(j,x) / sqrt( norm(j) ) = p(j,x) / alpha(j)
alpha1 = sqrt( attr(poly(x,2), "coefs")$norm2[3] )
alpha2 = sqrt( attr(poly(x,2), "coefs")$norm2[4] )
# Get the logistic regression coefficients (beta1 and beta2)
coef1 = as.numeric( model$coeff["poly(x, 2)1"] )
coef2 = as.numeric( model$coeff["poly(x, 2)2"] )
# Compute the x at which the maximum occurs from the expression that is obtained
# by differentiating the predicted expression pred(x) = beta0 + beta1*r(1,x) + beta2*r(2,x)
# w.r.t. x and setting the derivative to 0.
( xmax = ( alpha2^-1 * coef2 * (rho1 + rho2) - alpha1^-1 * coef1 ) / (2 * alpha2^-1 * coef2) )
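As a final sanity check (a sketch reusing x and model from your code), we can compare the analytical xmax with a brute-force grid search over the predicted values; the two should agree to within the grid resolution.
xs <- seq(min(x), max(x), by = 0.01)
# predict() evaluates the orthogonal polynomials at the new points internally
pred <- predict(model, newdata = data.frame(x = xs))
xs[which.max(pred)]
# [1] 97.5 (approximately, matching xmax above)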
Assuming you want to find the maximum of the prediction analytically for this particular case where the orthogonal polynomials are of order 1 and 2, I propose the following approach:
SUMMARY
1) Infer the polynomial coefficients
This can easily be done by fitting a linear model to the respective polynomial values contained in the model matrix.
2) Differentiate the prediction expression w.r.t. x and set the derivative to 0
Solve for x in the prediction expression inferred from the polynomial fit in (1) and obtain the value of x at which the prediction's maximum occurs.
DETAILS
1) Polynomial coefficients
Following from the line where you fit the GLM model, we estimate the coefficients for the polynomial of order 1, p1(x) = a0 + a1*x, and the coefficients for the polynomial of order 2, p2(x) = b0 + b1*x + b2*x^2:
X = model.matrix(model)
p1 = X[, "poly(x, 2)1"]
p2 = X[, "poly(x, 2)2"]
p1.lm = lm(p1 ~ x)
a0 = p1.lm$coeff["(Intercept)"]
a1 = p1.lm$coeff["x"]
p2.lm = lm(p2 ~ x + I(x^2))
b0 = p2.lm$coeff["(Intercept)"]
b1 = p2.lm$coeff["x"]
b2 = p2.lm$coeff["I(x^2)"]
This gives:
> c(a0, a1, b0, b1, b2)
(Intercept) x (Intercept) x I(x^2)
-1.2308840e-01 1.2247602e-03 1.6050353e-01 -4.7674315e-03 2.3718565e-05
2) Derivative of the prediction to find the maximum
The expression for the prediction z (before applying the inverse link function) is:
z = Intercept + coef1 * p1(x) + coef2 * p2(x)
We differentiate this expression w.r.t. x and set the derivative to 0 to obtain:
coef1 * a1 + coef2 * (b1 + 2 * b2 * xmax) = 0
Solving for xmax we get:
xmax = - (coef1 * a1 + coef2 * b1) / (2 * coef2 * b2)
In R code, this is computed as:
coef1 = as.numeric( model$coeff["poly(x, 2)1"] )
coef2 = as.numeric( model$coeff["poly(x, 2)2"] )
(xmax = - ( coef1 * a1 + coef2 * b1 ) / (2 * coef2 * b2))
which gives:
x
97.501114
CHECK
We can verify the maximum by adding it to the prediction's curve as a green cross:
# Prediction curve computed analytically
Intercept = model$coeff["(Intercept)"]
pred.analytical = family$linkinv( Intercept + coef1 * p1 + coef2 * p2 )
# Find the prediction's maximum analytically
pred.max = family$linkinv( Intercept + coef1 * (a0 + a1 * xmax) +
                           coef2 * (b0 + b1 * xmax + b2 * xmax^2) )
# Plot
plot(x, y)
# The following two lines should coincide!
lines(x, pred.analytical, col = 3)
lines(x, family$linkinv(predict(model)), col = 2)
# Location of the maximum!
points(xmax, pred.max, pch="x", col="green")
which gives the following plot: the two prediction curves coincide, and the green cross marks the maximum at x ≈ 97.5.

Adding self starting values to an nls regression in R

I have existing code for fitting a sigmoid curve to data in R. How can I use SelfStart (or another method) to automatically find starting values for the regression?
sigmoid = function(params, x) {
  params[1] / (1 + exp(-params[2] * (x - params[3])))
}
dataset = data.frame("x" = 1:53, "y" =c(0,0,0,0,0,0,0,0,0,0,0,0,0,0.1,0.18,0.18,0.18,0.33,0.33,0.33,0.33,0.41,0.41,0.41,0.41,0.41,0.41,0.5,0.5,0.5,0.5,0.68,0.58,0.58,0.68,0.83,0.83,0.83,0.74,0.74,0.74,0.83,0.83,0.9,0.9,0.9,1,1,1,1,1,1,1) )
x = dataset$x
y = dataset$y
# fitting code
fitmodel <- nls(y~a/(1 + exp(-b * (x-c))), start=list(a=1,b=.5,c=25))
# visualization code
# get the coefficients using the coef function
params=coef(fitmodel)
y2 <- sigmoid(params,x)
plot(y2,type="l")
points(y)
This is a common (and interesting) problem in non-linear curve fitting.
Background
We can find sensible starting values if we take a closer look at the function sigmoid. We first note that
f(x) = a / (1 + exp(-b * (x - c))) -> a   as x -> infinity
So for large values of x, the function approaches a. In other words, as a starting value for a we may choose the value of y for the largest value of x.
In R language, this translates to y[which.max(x)].
Now that we have a starting value for a, we need to decide on starting values for b and c. To do that, we can make use of the geometric series
1 / (1 + z) = 1 - z + z^2 - z^3 + ...   (for |z| < 1)
and expand f(x) = y by keeping only the first two terms:
y ≈ a * (1 - exp(-b * (x - c)))
We now set a = 1 (our starting value for a), re-arrange the equation and take the logarithm on both sides:
log(1 - y) = -b * x + b * c
We can now fit a linear model of the form log(1 - y) ~ x to obtain estimates for the slope and offset, which in turn provide the starting values for b and c.
R implementation
Let's define a function that takes as arguments the values x and y and returns a list of parameter starting values:
start_val_sigmoid <- function(x, y) {
  # a is estimated by y at the largest x; the small offset 1e-6 avoids log(0)
  fit <- lm(log(y[which.max(x)] - y + 1e-6) ~ x)
  list(
    a = y[which.max(x)],
    b = unname(-coef(fit)[2]),
    c = unname(-coef(fit)[1] / coef(fit)[2]))
}
Based on the data for x and y you give, we obtain the following starting values
start_val_sigmoid(x, y)
#$a
#[1] 1
#
#$b
#[1] 0.2027444
#
#$c
#[1] 15.01613
Since start_val_sigmoid returns a list, we can use its output directly as the start argument in nls:
nls(y ~ a / ( 1 + exp(-b * (x - c))), start = start_val_sigmoid(x, y))
#Nonlinear regression model
# model: y ~ a/(1 + exp(-b * (x - c)))
# data: parent.frame()
# a b c
# 1.0395 0.1254 29.1725
# residual sum-of-squares: 0.2119
#
#Number of iterations to convergence: 9
#Achieved convergence tolerance: 9.373e-06
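As a side note on the SelfStart part of your question: base R also ships self-starting models for exactly this situation. SSlogis implements the logistic curve in the parameterization Asym / (1 + exp((xmid - x)/scal)) and computes its own starting values, so no start argument is needed (Asym plays the role of a, xmid of c, and scal of 1/b):
fitmodel2 <- nls(y ~ SSlogis(x, Asym, xmid, scal), data = dataset)
coef(fitmodel2)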
Sample data
dataset = data.frame("x" = 1:53, "y" =c(0,0,0,0,0,0,0,0,0,0,0,0,0,0.1,0.18,0.18,0.18,0.33,0.33,0.33,0.33,0.41,0.41,0.41,0.41,0.41,0.41,0.5,0.5,0.5,0.5,0.68,0.58,0.58,0.68,0.83,0.83,0.83,0.74,0.74,0.74,0.83,0.83,0.9,0.9,0.9,1,1,1,1,1,1,1) )
x = dataset$x
y = dataset$y

What is the intercept in coef of smooth.basis with a Fourier basis?

Suppose I have data like y below and I fit a smooth function to this data with a Fourier basis:
y<- c(1,2,5,8,9,2,5)
x <- seq_along(y)
Fo <- create.fourier.basis(c(0, 7), 4)
precfd = smooth.basis(x,y,Fo)
plotfit.fd(y, x, precfd$fd)
precfd <- smooth.basis(x, y, Fo);coef(precfd)
The output of the last line gives me this:
const 411.1060285
sin1 -30.5584033
cos1 6.5740933
sin2 26.2855849
cos2 -26.0153965
I know what the coefficients are, but what is const? In the original formula there is no constant part, as this link shows:
http://lampx.tugraz.at/~hadley/num/ch3/3.3a.php
The first basis function in create.fourier.basis is a constant function to allow for a non-zero mean (intercept) in the data. From the documentation of the create.fourier.basis function:
The first basis function is the unit function with the value one everywhere. The next two are the sine/cosine pair with period defined in the argument period. The fourth and fifth are the sin/cosine series with period one half of period. And so forth. The number of basis functions is usually odd.
You can drop the first (unit) basis function in create.fourier.basis with the argument dropind = 1. Below some example code that illustrates which basis functions are used in create.fourier.basis. Note: the scaling of the basis functions depends on the period argument in create.fourier.basis.
Example 1: non-zero mean
library(fda)
## time sequence
tt <- seq(from = 0, to = 1, length = 100)
## basis functions
phi_0 <- 1
phi_1 <- function(t) sin(2 * pi * t) / sqrt(1 / 2)
phi_2 <- function(t) cos(2 * pi * t) / sqrt(1 / 2)
## signal
f1 <- 10 * phi_0 + 5 * phi_1(tt) - 5 * phi_2(tt)
## noise
eps <- rnorm(100)
## data
X1 <- f1 + eps
## create Fourier basis with intercept
four.basis1 <- create.fourier.basis(rangeval = range(tt), nbasis = 3)
## evaluate values basis functions
## eval.basis(tt, four.basis1)
## fit Fourier basis to data
four.fit1 <- smooth.basis(tt, X1, four.basis1)
coef(four.fit1)
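Since the phi functions above were defined to match fda's scaling for this range (period = 1), the fitted coefficients should be close to the amplitudes used in the signal (up to the noise):
round(coef(four.fit1))
# const ~ 10, sin1 ~ 5, cos1 ~ -5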
Example 2: zero mean
## signal
f2 <- 5 * phi_1(tt) - 5 * phi_2(tt)
## data
X2 <- f2 + eps
## create Fourier basis without intercept
four.basis2 <- create.fourier.basis(rangeval = range(tt), nbasis = 3, dropind = 1)
## evaluate values basis functions
## eval.basis(tt, four.basis2)
## fit Fourier basis to data
four.fit2 <- smooth.basis(tt, X2, four.basis2)
coef(four.fit2)
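To visualize the basis functions themselves (a quick illustration building on the commented eval.basis lines above):
matplot(tt, eval.basis(tt, four.basis1), type = "l", lty = 1,
        xlab = "t", ylab = "basis function value")
The constant first curve is exactly the one removed by dropind = 1 in Example 2.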

Why do I get "non-numeric argument to binary operator" with nls?

I have two vectors (example):
x=c(100,98,60,30,28,30,20,10)
y=c(10,9.8,5,3,2,3.4,2.8,1)
I would like to fit them using this function:
y = a / (1 + exp(-b * (x - c))) + d
and get the fitting parameters a, b, c, d.
I used this:
m<-nls(x~a/1+e^(-b*(y-c)) + d)
but I got this error:
Error in y - c : non-numeric argument to binary operator
x and y are reversed, and e^(...) should be exp(...). Also, I found that fixing d at 0 helped.
d <- 0 # fix d at 0
st <- list(a = mean(y), b = 1/sd(x), c = mean(x))
fm <- nls(y ~ a/(1+exp(-b*(x-c))) + d, start = st)
fm
giving:
Nonlinear regression model
model: y ~ a/(1 + exp(-b * (x - c)))
data: parent.frame()
a b c
19.96517 0.02623 99.73842
residual sum-of-squares: 1.82
Number of iterations to convergence: 9
Achieved convergence tolerance: 9.023e-06
Plotting this it seems to be a good fit visually:
plot(y ~ x)
lines(fitted(fm) ~ x, col = "red")
I think the reason is that c is treated as the combine function c(). Change it to another symbol (c1, for instance). Of course, you would also need to specify meaningful starting parameters, but I guess that was not your question.
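A quick demonstration of that diagnosis: nothing in the formula's data provides a numeric c, so R falls back to the base combine function, and subtracting a function from a numeric vector produces exactly the reported error.
is.function(c)
# [1] TRUE
# so inside the formula, y - c becomes <numeric> - <function>:
# "non-numeric argument to binary operator"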

profile confidence intervals in R: mle2

I am trying to use the command mle2 in the package bbmle. I am looking at p. 2 of "Maximum likelihood estimation and analysis with the bbmle package" by Bolker. Somehow I fail to enter the right start values. Here's the reproducible code:
l.lik.probit <- function(par, ivs, dv){
  Y <- as.matrix(dv)
  X <- as.matrix(ivs)
  K <- ncol(X)
  b <- as.matrix(par[1:K])
  phi <- pnorm(X %*% b)
  sum(Y * log(phi) + (1 - Y) * log(1 - phi))
}
n=200
set.seed(1000)
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
x4 <- rnorm(n)
latentz<- 1 + 2.0 * x1 + 3.0 * x2 + 5.0 * x3 + 8.0 * x4 + rnorm(n,0,5)
y <- latentz
y[latentz < 1] <- 0
y[latentz >=1] <- 1
x <- cbind(1,x1,x2,x3,x4)
values.start <-c(1,1,1,1,1)
foo2<-mle2(l.lik.probit, start=list(dv=0,ivs=values.start),method="BFGS",optimizer="optim", data=list(Y=y,X=x))
And this is the error I get:
Error in mle2(l.lik.probit, start = list(Y = 0, X = values.start), method = "BFGS", :
some named arguments in 'start' are not arguments to the specified log-likelihood function
Any idea why? Thanks for your help!
You've missed a couple of things, but the most important is that by default mle2 takes a list of parameters; you can make it take a parameter vector instead, but you have to work a little bit harder.
I have tweaked the code slightly in places. (I changed the log-likelihood function to a negative log-likelihood function, without which this would never work!)
l.lik.probit <- function(par, ivs, dv){
  K <- ncol(ivs)
  b <- as.matrix(par[1:K])
  phi <- pnorm(ivs %*% b)
  -sum(dv * log(phi) + (1 - dv) * log(1 - phi))
}
n <- 200
set.seed(1000)
dat <- data.frame(x1=rnorm(n),
                  x2=rnorm(n),
                  x3=rnorm(n),
                  x4=rnorm(n))
beta <- c(1,2,3,5,8)
mm <- model.matrix(~x1+x2+x3+x4,data=dat)
latentz<- rnorm(n,mean=mm%*%beta,sd=5)
y <- latentz
y[latentz < 1] <- 0
y[latentz >=1] <- 1
x <- mm
values.start <- rep(1,5)
Now we do the fit. The main thing is to specify vecpar=TRUE and to use parnames to let mle2 know the names of the elements in the parameter vector ...
library("bbmle")
names(values.start) <- parnames(l.lik.probit) <- paste0("b",0:4)
m1 <- mle2(l.lik.probit, start=values.start,
           vecpar=TRUE,
           method="BFGS", optimizer="optim",
           data=list(dv=y, ivs=x))
As pointed out above, for this particular example you have just re-implemented probit regression (although I understand that you now want to extend this to allow for heteroscedasticity in some way ...)
dat2 <- data.frame(dat,y)
m2 <- glm(y~x1+x2+x3+x4, family=binomial(link="probit"),
          data=dat2)
As a final note, I would say that you should check out the parameters argument, which allows you to specify a linear sub-model for any one of the parameters, and the formula interface:
m3 <- mle2(y~dbinom(prob=pnorm(eta), size=1),
           parameters=list(eta~x1+x2+x3+x4),
           start=list(eta=0),
           data=dat2)
PS confint(m1) appears to work fine (giving profile CIs as requested) with this set-up.
ae <- function(x,y) all.equal(unname(coef(x)),unname(coef(y)),tol=5e-5)
ae(m1,m2) && ae(m2,m3)
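For completeness, the profile confidence intervals asked about in the title come straight from bbmle's confint method on the fitted object (this profiles the likelihood, so it may take a moment):
confint(m1)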
