Equal and opposite slopes in segmented package - r

Hi, I am trying to use the segmented package in R to fit a piecewise linear regression model and estimate the breakpoint in my data. I have used the following code to produce the graph.
library(segmented)
set.seed(5)
x <- c(1:10, 13:22)
y <- numeric(20)
## Create first segment
y[1:10] <- 20:11 + rnorm(10, 0, 1.5)
## Create second segment
y[11:20] <- seq(11, 15, len=10) + rnorm(10, 0, 1.5)
## fitting a linear model
lin.mod <- lm(y~x)
segmented.mod <- segmented(lin.mod, seg.Z = ~x, psi=15)
summary(segmented.mod)
plot(x,y, pch=".",cex=4,xlab="x",ylab="y")
plot(segmented.mod, add=T, lwd = 3,col = "red")
My theoretical calculations suggest that the slopes of the two lines on either side of the breakpoint should be equal in magnitude but opposite in sign. I am a beginner with lm and glm. Is there a way to estimate the breakpoint with the slopes constrained by the relation slope1 = -slope2?

This is not supported in the segmented package. However, nls2 with the "plinear-brute" algorithm can be used: it tries each value in the range of x as a possible breakpoint bp, fitting a linear regression for each. In the output, .lin1 is the constant term and .lin2 is the common slope magnitude, so the slope to the left of the breakpoint is -.lin2 and the slope to the right is +.lin2, which satisfies slope1 = -slope2 by construction.
library(nls2)
st <- data.frame(bp = seq(min(x), max(x)))
nls2(y ~ cbind(1, abs(x - bp)), start = st, alg = "plinear-brute")
giving:
Nonlinear regression model
model: y ~ cbind(1, abs(x - bp))
data: parent.frame()
bp .lin1 .lin2
14.000000 9.500457 0.709624
residual sum-of-squares: 45.84213
Number of iterations to convergence: 22
Achieved convergence tolerance: NA
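To see the constrained fit, you can overlay the fitted V-shaped line on the data and read off the two slopes, which by construction are -.lin2 to the left of the breakpoint and +.lin2 to the right. A minimal sketch (fm0 is just the fit above saved under a name):
fm0 <- nls2(y ~ cbind(1, abs(x - bp)), start = st, alg = "plinear-brute")
co <- coef(fm0)  # named: bp, .lin1, .lin2
plot(x, y)
lines(x, co[[".lin1"]] + co[[".lin2"]] * abs(x - co[["bp"]]), col = "red")
## slopes on either side of the breakpoint
c(slope1 = -co[[".lin2"]], slope2 = co[[".lin2"]])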
Here is another example, which may clarify things since it generates the data from the same model that is fitted:
library(nls2)
set.seed(123)
n <- 100
bp <- 25
x <- 1:n
y <- rnorm(n, 10 + 2 * abs(x - bp))
st <- data.frame(bp = seq(min(x), max(x)))
fm <- nls2(y ~ cbind(1, abs(x - bp)), start = st, alg = "plinear-brute")
giving:
> fm
Nonlinear regression model
model: y ~ cbind(1, abs(x - bp))
data: parent.frame()
bp .lin1 .lin2
25.000 9.935 2.005
residual sum-of-squares: 81.29
Number of iterations to convergence: 100
Achieved convergence tolerance: NA
Note: In the above we assumed that bp is an integer in the range of x, but that condition can be relaxed by using the result of nls2 as the starting value for an nls optimization, i.e. nls(y ~ cbind(1, abs(x - bp)), start = coef(fm)[1], alg = "plinear").
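A sketch of that refinement, continuing from the second example (fm is the nls2 fit above; the refit lets bp vary continuously):
fm.cont <- nls(y ~ cbind(1, abs(x - bp)), start = coef(fm)[1], alg = "plinear")
fm.cont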

Related

R - Singular gradient matrix at initial parameter estimates

I'm trying to fit a harmonic equation to my data, but when I'm applying the nls function, R gives me the following error:
Error in nlsModel(formula, mf, start, wts) : singular gradient matrix at initial parameter estimates.
All the posts I've seen related to this error involve exponential functions, where linearization is used to fix it, but in this case I'm not able to solve it that way. I tried other starting points but it still does not work.
CODE:
y <- c(20.91676, 20.65219, 20.39272, 20.58692, 21.64712, 23.30965, 23.35657, 24.22724, 24.83439, 24.34865, 23.13173, 21.96117)
t <- c(1, 2, 3, 4 , 5 , 6, 7, 8, 9, 10, 11, 12)
# Fitting function
fit <- function(x, a, b, c) {a+b*sin(2*pi*x)+c*cos(2*pi*x)}
res <- nls(y ~ fit(t, a, b, c), data=data.frame(t,y), start = list(a=1,b=0, c=1))
Can you help me? Thanks!
There are several problems:
cos(2*pi*t) is a vector of all ones for the t given in the question, so the model is not identifiable given that there is already an intercept
the model is linear in the parameters, so lm can be used rather than nls and no starting values are needed
the model does not work well even if we address those points, as seen by the large second coefficient below; the model itself needs to be improved.
lm(y ~ sin(2*pi*t))
giving:
Call:
lm(formula = y ~ sin(2 * pi * t))
Coefficients:
(Intercept) sin(2 * pi * t)
2.195e+01 -2.262e+14
Instead, try the following model using the plinear algorithm, which does not require starting values for parameters that enter linearly. It implements the model .lin1 + .lin2 * cos(a * t + b), where .lin1 and .lin2 are implicit parameters that enter linearly, so only a and b need starting values.
fm <- nls(y ~ cbind(1, cos(a * t + b)), start = list(a = 1, b = 1), alg = "plinear")
plot(y ~ t)
lines(fitted(fm) ~ t, col = "red")
fm
giving:
Nonlinear regression model
model: y ~ cbind(1, cos(a * t + b))
data: parent.frame()
a b .lin1 .lin2
0.5226 4.8814 22.4454 -2.1530
residual sum-of-squares: 0.7947
Number of iterations to convergence: 9
Achieved convergence tolerance: 8.865e-06
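If you want the coefficients in the original a + b*sin + c*cos form (at the fitted frequency a rather than 2*pi), the phase form can be expanded using the identity A*cos(a*t + b) = A*cos(b)*cos(a*t) - A*sin(b)*sin(a*t). A minimal sketch based on the fit above:
co <- coef(fm)  # named: a, b, .lin1, .lin2
c(intercept = co[[".lin1"]],
  sin.coef  = -co[[".lin2"]] * sin(co[["b"]]),
  cos.coef  =  co[[".lin2"]] * cos(co[["b"]]))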

Syntax for three-piece segmented regression using NLS in R when concave

My goal is to fit a three-piece (i.e., two break-point) regression model to make predictions using propagate's predictNLS function, making sure to define knots as parameters, but my model formula seems off.
I've used the segmented package to estimate the breakpoint locations (used as starting values in NLS), but would like to keep my models in the NLS format, specifically nlsLM from minpack.lm, because I am fitting other types of curves to my data using NLS, want to allow NLS to optimize the knot values, am sometimes using variable weights, and need to be able to easily calculate the Monte Carlo confidence intervals from propagate. Though I'm very close to having the right syntax for the formula, I'm not getting the expected/required behaviour near the breakpoints. The segments SHOULD meet directly at the breakpoints (without any jumps), but at least on this data I'm getting a weird local minimum at the breakpoint (see plots below).
Below is an example of my data and general process. I believe my issue to be in the NLS formula.
library(minpack.lm)
library(segmented)
y <- c(-3.99448113, -3.82447011, -3.65447803, -3.48447030, -3.31447855, -3.14448753, -2.97447972, -2.80448401, -2.63448380, -2.46448069, -2.29448796, -2.12448912, -1.95448783, -1.78448797, -1.61448563, -1.44448719, -1.27448469, -1.10448651, -0.93448525, -0.76448637, -0.59448626, -0.42448586, -0.25448588, -0.08448548, 0.08551417, 0.25551393, 0.42551411, 0.59551395, 0.76551389, 0.93551398)
x <- c(61586.1711, 60330.5550, 54219.9925, 50927.5381, 48402.8700, 45661.9175, 37375.6023, 33249.1248, 30808.6131, 28378.6508, 22533.3782, 13901.0882, 11716.5669, 11004.7305, 10340.3429, 9587.7994, 8736.3200, 8372.1482, 8074.3709, 7788.1847, 7499.6721, 7204.3168, 6870.8192, 6413.0828, 5523.8097, 3961.6114, 3460.0913, 2907.8614, 2016.1158, 452.8841)
df<- data.frame(x,y)
#Use Segmented to get estimates for parameters with 2 breakpoints
my.seg2 <- segmented(lm(y ~ x, data = df), seg.Z = ~ x, npsi = 2)
#extract knot, intercept, and coefficient values to use as NLS start points
my.knot1 <- my.seg2$psi[1,2]
my.knot2 <- my.seg2$psi[2,2]
my.m_2 <- slope(my.seg2)$x[1,1]
my.b1 <- my.seg2$coefficients[[1]]
my.b2 <- my.seg2$coefficients[[2]]
my.b3 <- my.seg2$coefficients[[3]]
#Fit a NLS model to ~replicate segmented model. Presumably my model formula is where the problem lies
my.model <- nlsLM(y ~ m*x + b + (b2*(ifelse(x >= knot1 & x <= knot2, 1, 0)*(x - knot1)) +
                                 (b3*ifelse(x > knot2, 1, 0)*(x - knot2 - knot1))),
                  data = df,
                  start = c(m = my.m_2, b = my.b1, b2 = my.b2, b3 = my.b3,
                            knot1 = my.knot1, knot2 = my.knot2))
How it should look
plot(my.seg2)
How it does look
plot(x, y)
lines(x=x, y=predict(my.model), col='black', lty = 1, lwd = 1)
I was pretty sure I had it "right", but when the 95% confidence intervals are plotted with the line and the prediction resolution (i.e., the density of x points) is increased, things look dramatically incorrect.
Thank you all for your help.
Define g to be a grouping vector, having the same length as x, which takes on the values 1, 2, 3 for the three sections of the x axis, and create an nls model from these. The resulting plot looks OK.
my.knots <- c(my.knot1, my.knot2)
g <- cut(x, c(-Inf, my.knots, Inf), label = FALSE)
fm <- nls(y ~ a[g] + b[g] * x, df, start = list(a = c(1, 1, 1), b = c(1, 1, 1)))
plot(y ~ x, df)
lines(fitted(fm) ~ x, df, col = "red")
Constraints
Although the above looks OK and may be sufficient, it does not guarantee that the segments intersect at the knots. To do that we must impose the constraints that the two sides are equal at each knot:
a[2] + b[2] * my.knots[1] = a[1] + b[1] * my.knots[1]
a[3] + b[3] * my.knots[2] = a[2] + b[2] * my.knots[2]
so
a[2] = a[1] + (b[1] - b[2]) * my.knots[1]
a[3] = a[2] + (b[2] - b[3]) * my.knots[2]
= a[1] + (b[1] - b[2]) * my.knots[1] + (b[2] - b[3]) * my.knots[2]
giving:
# returns a vector of the three a values
avals <- function(a1, b) unname(cumsum(c(a1, -diff(b) * my.knots)))
fm2 <- nls(y ~ avals(a1, b)[g] + b[g] * x, df, start = list(a1 = 1, b = c(1, 1, 1)))
To get the three a values we can use:
co <- coef(fm2)
avals(co[1], co[-1])
To get the residual sum of squares:
deviance(fm2)
## [1] 0.193077
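To check visually that the constrained segments now meet at the knots, the fit can be overlaid in the same way as before (a sketch):
plot(y ~ x, df)
lines(fitted(fm2) ~ x, df, col = "red")
abline(v = my.knots, lty = 2)  # knot locations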
Polynomial
Although it involves a large number of parameters, a polynomial fit could be used in place of the segmented linear regression. A 12th degree polynomial involves 13 parameters but has a lower residual sum of squares than the segmented linear regression. A lower degree could be used with corresponding increase in residual sum of squares. A 7th degree polynomial involves 8 parameters and visually looks not too bad although it has a higher residual sum of squares.
fm12 <- nls(y ~ cbind(1, poly(x, 12)) %*% b, df, start = list(b = rep(1, 13)))
deviance(fm12)
## [1] 0.1899218
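For comparison, here is a sketch of the 7th degree polynomial mentioned above; it is fitted the same way and its residual sum of squares will be somewhat higher than for the 12th degree fit:
fm7 <- nls(y ~ cbind(1, poly(x, 7)) %*% b, df, start = list(b = rep(1, 8)))
deviance(fm7)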
This may, in part, reflect a limitation in segmented: segmented returns a single change point estimate without quantifying the associated uncertainty. Redoing the analysis using mcp, which returns Bayesian posteriors, we see that the second change point is bimodally distributed:
library(mcp)
model = list(
  y ~ 1 + x,  # Intercept + slope in first segment
  ~ 0 + x,    # Only slope changes in the next segments
  ~ 0 + x
)
# Fit it with a large number of samples and plot the change point posteriors
fit = mcp(model, data = data.frame(x, y), iter = 50000, adapt = 10000)
plot_pars(fit, regex_pars = "^cp*", type = "dens_overlay")
FYI, mcp can plot credible intervals as well (the red dashed lines):
plot(fit, q_fit = TRUE)
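Numerical posterior summaries of the change points and the other parameters can be obtained as well; a minimal sketch, assuming the usual summary method provided by mcp:
summary(fit)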

Fit an exponential decay model in R

I am very new to R and I appreciate the help.
I have some data that looks like this.
Y is negatively correlated with X in a nonlinear way. It seems to be well approximated by a formula of the form y = 1 + a^x, where a < 1.
If I wanted to fit that data in R to find a, what function would I use? nls?
Next time please provide test data; we have generated some for you this time. Then we use nls as shown below.
set.seed(123)
# generate test data
n <- 35
x <- 1:n
a <- 0.5
y <- 1 + a^x + rnorm(n, 0, .01)
fm <- nls(y ~ 1+a^x, start = list(a = mean((y-1)^(1/x), na.rm = TRUE)))
fm
giving:
Nonlinear regression model
model: y ~ 1 + a^x
data: parent.frame()
a
0.5025
residual sum-of-squares: 0.003031
Number of iterations to convergence: 5
Achieved convergence tolerance: 1.346e-06
Plot
plot(y ~ x)
lines(fitted(fm) ~ x, col = "red")

How to model a mixture of finite components from different parametric families with JAGS?

Imagine an underlying process that draws a number from a normal distribution with probability $\alpha$ and from a uniform distribution with probability $1 - \alpha$.
The observed sequence of numbers generated by this process therefore follows a distribution $f$ that is a mixture of 2 components and mixing weights of $\alpha$ and $1 - \alpha$.
How would you model this kind of mixture with JAGS when the observed sequence is the only input, but the parametric families are known?
Example (in R):
set.seed(8361299)
N <- 100
alpha <- 0.3
mu <- 5
max <- 50
# Which component to choose from?
latent_class <- rbinom(N, 1, alpha)
Y <- ifelse(latent_class, runif(N, min=mu, max=max), rnorm(N, mean=mu))
The generated (observed) Y looks like:
With JAGS, should it be possible to obtain the mixing weights as well as the parameters of the known components?
Mixture models of the same parametric distribution are pretty straightforward in JAGS/BUGS, but mixture models with varying parametric responses (like yours) are a little more tricky. One method is to use the 'ones trick' whereby we manually calculate the likelihood of the response (selecting one of the two distributions as specified by the latent part of the model) and fit this to the (fake) response of a Bernoulli trial for each data point. For example:
# Your data generation:
set.seed(8361299)
N <- 100
alpha <- 0.3
mu <- 5
max <- 50
# Which component to choose from?
latent_class <- rbinom(N, 1, alpha)
Y <- ifelse(latent_class, runif(N, min=mu, max=max), rnorm(N, mean=mu))
# The model:
model <- "model{
  for(i in 1:N){
    # Log density for the normal part:
    ld_norm[i] <- logdensity.norm(Y[i], mu, tau)
    # Log density for the uniform part:
    ld_unif[i] <- logdensity.unif(Y[i], lower, upper)
    # Select one of these two densities:
    density[i] <- exp(ld_norm[i]*norm_chosen[i] + ld_unif[i]*(1-norm_chosen[i]))
    # Generate a likelihood for the MCMC sampler:
    Ones[i] ~ dbern(density[i])
    # The latent class part as usual:
    norm_chosen[i] ~ dbern(prob)
  }
  # Priors:
  lower ~ dnorm(0, 10^-6)
  upper ~ dnorm(0, 10^-6)
  prob ~ dbeta(1,1)
  mu ~ dnorm(0, 10^-6)
  tau ~ dgamma(0.01, 0.01)
  # Specify monitors, data and initial values using runjags:
  #monitor# lower, upper, prob, mu, tau
  #data# N, Y, Ones
  #inits# lower, upper
}"
# Run the model using runjags (or use rjags if you prefer!)
library('runjags')
lower <- min(Y)-10
upper <- max(Y)+10
Ones <- rep(1,N)
results <- run.jags(model, sample=20000, thin=1)
results
plot(results)
This seems to recover your parameters pretty well (your alpha is 1-prob), but watch out for autocorrelation (and convergence).
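One way to keep an eye on that is sketched below; summary() and extend.jags() are runjags utilities (check their help pages for the exact output columns):
summary(results)  # includes effective sample size and psrf per monitored parameter
results <- extend.jags(results, sample = 20000)  # draw more samples if mixing looks poor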
Matt
EDIT: Since you asked about generalising to more than 2 distributions, here is equivalent (but more generalisable) code:
# The model:
model <- "model{
  for(i in 1:N){
    # Log density for the normal part:
    ld_comp[i, 1] <- logdensity.norm(Y[i], mu, tau)
    # Log density for the uniform part:
    ld_comp[i, 2] <- logdensity.unif(Y[i], lower, upper)
    # Select one of these two densities and normalise with a Constant:
    density[i] <- exp(ld_comp[i, component_chosen[i]] - Constant)
    # Generate a likelihood for the MCMC sampler:
    Ones[i] ~ dbern(density[i])
    # The latent class part using dcat:
    component_chosen[i] ~ dcat(probs)
  }
  # Priors for 2 parameters using a dirichlet distribution:
  probs ~ ddirch(c(1,1))
  lower ~ dnorm(0, 10^-6)
  upper ~ dnorm(0, 10^-6)
  mu ~ dnorm(0, 10^-6)
  tau ~ dgamma(0.01, 0.01)
  # Specify monitors, data and initial values using runjags:
  #monitor# lower, upper, probs, mu, tau
  #data# N, Y, Ones, Constant
  #inits# lower, upper, mu, tau
}"
library('runjags')
# Initial values to get the chains started:
lower <- min(Y)-10
upper <- max(Y)+10
mu <- 0
tau <- 0.01
Ones <- rep(1,N)
# The constant needs to be big enough to avoid any densities >1 but also small enough to calculate probabilities for observations of 1:
Constant <- 10
results <- run.jags(model, sample=10000, thin=1)
results
This code will work for as many distributions as you need, but expect exponentially worse autocorrelation with more distributions.

Fit 'nls': singular gradient matrix at initial parameter estimates

I'm new to using nls and I'm having trouble finding the starting parameters. I've read several posts and tried various parameters and formula constructions, but I keep getting errors.
This is a small example of what I'm doing and I'd very much appreciate if anyone could give me some tips!
# Data to which I want to fit a non-linear function
x <- c(0, 4, 13, 30, 63, 92)
y <- c(0.00000000, 0.00508822, 0.01103990, 0.02115466, 0.04036655, 0.05865331)
z <- 0.98
# STEPS:
# 1 pool, z fixed. This works.
fit <- nls(y ~ z * ((1 - exp(-k1*x))),
           start = list(k1 = 0))
# 2 pool model, z fixed
fit2 <- nls(y ~ z * (1 - exp(-k1*x)) + (1 - exp(-k2*x)),
            start = list(k1 = 0, k2 = 0))  # Error: singular gradient matrix at initial parameter estimates
# My goal: 2 pool model, z free
fit3 <- nls(y ~ z * (1 - exp(-k1*x)) + (1 - exp(-k2*x)),
            start = list(z = 0.5, k1 = 0, k2 = 0))
It has been a while since you asked the question but maybe you are still interested in some comments:
At least your fit2 works fine when one varies the starting parameters (see code and plots below). I guess that fit3 is then just a "too complicated" model given these data, which basically follow a linear trend. That implies that two parameters are usually sufficient to describe the data reasonably well (see the second plot).
So, as a general hint: when you obtain
singular gradient matrix at initial parameter estimates
you can
1) vary the starting values/your initial parameter estimates
and/or
2) try to simplify your model by looking for redundant parameters, which usually cause trouble.
I also highly recommend always plotting the data together with your initial guesses first (check also this question).
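For example, for the one-pool model with z fixed at 0.98 as in the question, one could overlay the curve implied by a (hypothetical) starting value k1_start before calling nls; a minimal sketch:
plot(x, y)
k1_start <- 0.01  # hypothetical initial guess for k1
curve(0.98 * (1 - exp(-k1_start * x)), add = TRUE, lty = 2)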
Here is a plot showing the outcome for your fit, fit2, and a third function defined by me, which is given in the code below:
As you can see, there is almost no difference between your fit2 and the function that has a variable z and a single exponential term. Two parameters seem to be pretty much enough to describe the system reasonably well (even the one-parameter fit, the black line in the plot above, is already quite good). If you then want to fit a line through a certain data point, you can also check out this answer.
So how does it look when one uses a linear function with two free parameters, and a function with a variable z, one exponential term and a variable offset? That is shown in the following plot; again, there is not much of a difference:
How do the residuals compare?
> fit
Nonlinear regression model
model: y ~ zfix * ((1 - exp(-k1 * x)))
data: parent.frame()
k1
0.0006775
residual sum-of-squares: 1.464e-05
> fit2
Nonlinear regression model
model: y ~ zfix * (1 - exp(-k1 * x)) + (1 - exp(-k2 * x))
data: parent.frame()
k1 k2
-0.0006767 0.0014014
residual sum-of-squares: 9.881e-06
> fit3
Nonlinear regression model
model: y ~ Z * (1 - exp(-k1 * x))
data: parent.frame()
Z k1
0.196195 0.003806
residual sum-of-squares: 9.59e-06
> fit4
Nonlinear regression model
model: y ~ a * x + b
data: parent.frame()
a b
0.0006176 0.0019234
residual sum-of-squares: 6.084e-06
> fit5
Nonlinear regression model
model: y ~ z * (1 - exp(-k1 * x)) + k2
data: parent.frame()
z k1 k2
0.395106 0.001685 0.001519
residual sum-of-squares: 5.143e-06
As one could guess, the fit with only one free parameter gives the worst result, while the one with three free parameters gives the best; however, there is not much of a difference (in my opinion).
Here is the code I used:
x <- c(0, 4, 13, 30, 63, 92)
y <- c(0.00000000, 0.00508822, 0.01103990, 0.02115466, 0.04036655, 0.05865331)
zfix <- 0.98
plot(x,y)
# STEPS:
# 1 pool, z fixed. This works.
fit <- nls(y ~ zfix * ((1 - exp(-k1*x))), start=list(k1=0))
xr = data.frame(x = seq(min(x),max(x),len=200))
lines(xr$x,predict(fit,newdata=xr))
# 2 pool model, z fixed
fit2 <- nls(y ~ zfix * (1 - exp(-k1*x)) + (1 - exp(-k2*x)), start=list(k1=0, k2=0.5))
lines(xr$x,predict(fit2,newdata=xr), col='red')
# 3 z variable
fit3 <- nls(y ~ Z * (1 - exp(-k1*x)), start=list(Z=zfix, k1=0.2))
lines(xr$x,predict(fit3,newdata=xr), col='blue')
legend('topleft', c('fixed z, single exp', 'fixed z, two exp', 'variable z, single exp'),
       lty = c(1, 1, 1), lwd = c(2.5, 2.5, 2.5),
       col = c('black', 'red', 'blue'))
#dev.new()
plot(x,y)
# 4 fit linear function a*x + b
fit4 <- nls(y ~ a *x + b, start=list(a=1, b=0.))
lines(xr$x,predict(fit4,newdata=xr), col='blue')
fit5 <- nls(y ~ z * (1 - exp(-k1*x)) + k2, start=list(z=zfix, k1=0.1, k2=0.5))
lines(xr$x,predict(fit5,newdata=xr), col='red')
legend('topleft', c('linear approach', 'variable z, single exp, offset'),
       lty = c(1, 1), lwd = c(2.5, 2.5),
       col = c('blue', 'red'))
