I am new to simulation especially when it comes to time series and so I apologize if this question seems too naive. I am trying to understand why simulating this ar(2) model generates an error:
arima.sim(list(order = c(2, 0, 0), ar = c(0.7, 0.3)), n = time_n, sd=0.2)
Error in arima.sim(list(order = c(2, 0, 0), ar = c(0.7, 0.3)), n = time_n, :
'ar' part of model is not stationary
Any pointer will be appreciated!
According to theory (e.g. see here), in order for an autoregressive model to be stationary, if r are the roots of the autoregressive polynomial
1 - phi_1 x - phi_2 x ...
then
The linear AR(p) process is strictly stationary and ergodic
if and only if |rj|>1 for all j, where |rj| is the modulus of the
complex number rj.
In your case
polyroot(c(1, -0.7, -0.3))
gives (1,-3.333)
In fact, this is the actual code within arima.sim:
minroots <- min(Mod(polyroot(c(1, -model$ar))))
if (minroots <= 1)
stop("'ar' part of model is not stationary")
Looking at the patterns and being lazy about the math, I suspect that the criterion for AR2 translates to (ph1 + phi2 < 1).
Related
I'm trying to implement a "change point" analysis, or a multiphase regression using nls() in R.
Here's some fake data I've made. The formula I want to use to fit the data is:
$y = \beta_0 + \beta_1x + \beta_2\max(0,x-\delta)$
What this is supposed to do is fit the data up to a certain point with a certain intercept and slope ($\beta_0$ and $\beta_1$), then, after a certain x value ($\delta$), augment the slope by $\beta_2$. That's what the whole max thing is about. Before the $\delta$ point, it'll equal 0, and $\beta_2$ will be zeroed out.
So, here's my function to do this:
changePoint <- function(x, b0, slope1, slope2, delta){
b0 + (x*slope1) + (max(0, x-delta) * slope2)
}
And I try to fit the model this way
nls(y ~ changePoint(x, b0, slope1, slope2, delta),
data = data,
start = c(b0 = 50, slope1 = 0, slope2 = 2, delta = 48))
I chose those starting parameters, because I know those are the starting parameters, because I made the data up.
However, I get this error:
Error in nlsModel(formula, mf, start, wts) :
singular gradient matrix at initial parameter estimates
Have I just made unfortunate data? I tried fitting this on real data first, and was getting the same error, and I just figured that my initial starting parameters weren't good enough.
(At first I thought it could be a problem resulting from the fact that max is not vectorized, but that's not true. It does make it a pain to work with changePoint, wherefore the following modification:
changePoint <- function(x, b0, slope1, slope2, delta) {
b0 + (x*slope1) + (sapply(x-delta, function (t) max(0, t)) * slope2)
}
This R-help mailing list post describes one way in which this error may result: the rhs of the formula is overparameterized, such that changing two parameters in tandem gives the same fit to the data. I can't see how that is true of your model, but maybe it is.
In any case, you can write your own objective function and minimize it. The following function gives the squared error for data points (x,y) and a certain value of the parameters (the weird argument structure of the function is to account for how optim works):
sqerror <- function (par, x, y) {
sum((y - changePoint(x, par[1], par[2], par[3], par[4]))^2)
}
Then we say:
optim(par = c(50, 0, 2, 48), fn = sqerror, x = x, y = data)
And see:
$par
[1] 54.53436800 -0.09283594 2.07356459 48.00000006
Note that for my fake data (x <- 40:60; data <- changePoint(x, 50, 0, 2, 48) + rnorm(21, 0, 0.5)) there are lots of local maxima depending on the initial parameter values you give. I suppose if you wanted to take this seriously you'd call the optimizer many times with random initial parameters and examine the distribution of results.
Just wanted to add that you can do this with many other packages. If you want to get an estimate of uncertainty around the change point (something nls cannot do), try the mcp package.
# Simulate the data
df = data.frame(x = 1:100)
df$y = c(rnorm(20, 50, 5), rnorm(80, 50 + 1.5*(df$x[21:100] - 20), 5))
# Fit the model
model = list(
y ~ 1, # Intercept
~ 0 + x # Joined slope
)
library(mcp)
fit = mcp(model, df)
Let's plot it with a prediction interval (green line). The blue density is the posterior distribution for the change point location:
# Plot it
plot(fit, q_predict = T)
You can inspect individual parameters in more detail using plot_pars(fit) and summary(fit).
I have a little problem: I'm using the R library ConsReg to fit a linear model with: i) upper and lower bound on the parameters(they neet to belong to [0, 1] and ii) add a constraint on the parameters, such that intercept + x1 <= 1.
I have read all CosReg documentation and explore internet, but I can't find how to include the intercept in the constraint. I know it sounds a little bit odd to constraint the intercept, but our model absolutely requires it.
According to ConsReg documentation, I have completed this:
""ConsReg(y~ x, family = 'gaussian', optimizer = 'mcmc', LOWER = 0, UPPER = 1,
constraints = my_constraints, ini.pars.coef = c(0.5, 0.4), data = df))""
So, I need help to set 'my_constraints' with intercept + x1 <= 1.
Does anybody knows how to do that?
I have somewhat a complex physical model with five unknown parameters to fit, but no success so far.
I used nls2 first to get some estimates for the start values, but then nls, nlxb, and nlsLM all threw the famous "singular gradient error at initial parameter estimates" error.
For the start values for nls2, I extracted them from the literature, so I think that I have good starting values at least for nls2. The parameter estimates extracted from nls2 make quite sense physically as well; however, don't resolve the issue with the singular gradient matrix error.
Since it's a physical model, every coefficient has a physical meaning, and I prefer not to fix any of them.
I should also mention that all five unknown parameters in the model equation are positive and the shape parameter m can go up to 2.
Reading through many posts and trying different solution suggestions, I have come to conclusion that I have either over-parameterization or unidentifiable parameters problem.
My question is that should I stop trying to use nls with this specific model (with this many unknown parameters) or is there any way out?
I am quite new to topic, so any help, mathematically or code-wise, is greatly appreciated.
Here is my MWE:
# Data
x <- c(0, 1000, 2000, 2500, 2750, 3000, 3250, 3500, 3750, 4000, 5000)
y <- c(1.0, 0.99, 0.98, 0.95, 0.795, 0.59, 0.35, 0.295, 0.175, 0.14, 0.095)
# Start values for nls2
bounds <- data.frame(a = c(0.8, 1.5), b = c(1e+5, 1e+7), c = c(0.4, 1.4), n = c(0.1, 2), m = c(0.1, 2))
# Model equation function
mod <- function(x, a, b, c, n, m){
t <- b*85^n*exp(-c/0.0309)
(1 - exp(-(a/(t*x))^m))
}
# # Model equation
# mod <- y ~ (1 - exp(-(a/(b*85^n*exp(-c/0.0309)*x))^m))
# Model fit with nls2
fit2 <- nls2(y ~ mod(x, a, b, c, n, m), data = data.frame(x, y), start = bounds, algorithm = "brute-force")
# Model fit with nls
fit <- nls(y ~ mod(x, a, b, c, n, m), data = data.frame(x, y), start = coef(fit2))
The more I look at this the more confused I get, but I'm going to try again.
Looking again at your expression, we have the expression inside the exponential
-(a/(b*85^n*exp(-c/0.0309)*x))^m
We can rewrite this as
-( [a/(b*85^n*exp(-c/0.0309))] * 1/x )^m
(please check my algebra!)
If this is correct, then that whole bold-faced blob doesn't affect the functional form of x — it all collapses to a single constant in the equation. (In other words, {a,b,c,n} are all jointly unidentifiable.) Lumping that stuff into a single parameter phi :
1 - exp(-(phi/x)^m)
phi is a shape parameter (has the same units as x, should be roughly the same magnitude as a typical value of x): let's try a starting value of 2500 (the mean value of x)
m is a shape parameter; we can't go too badly wrong starting from m==1
Now nls works fine without any extra help:
n1 <- nls(y~1 - exp(-(phi/x)^m), start=list(phi=2500,m=1), data=data.frame(x,y))
and gets phi=2935, m=6.49.
Plot predictions:
plot(x,y, ylim=c(0,1))
xvec <- seq(0, 5000, length=101)
lines(xvec, predict(n1, newdata=data.frame(x=xvec)))
Another way to think about what this curve is doing: we can transform the equation to -log(1-y) = phi^m*(1/x)^m: that is, -log(1-y) should follow a power-law curve with respect to 1/x.
Here's what that looks like:
plot(1/x, -log(1-y))
## curve() uses "x" as the current x-axis variable, i.e.
## read "x" as "1/x" below.
with(as.list(coef(n1)), curve(phi^m*x^m, add=TRUE))
In this format, it appears to fit the central data well but fails for large values of 1/x (the x=0 point is missing here because it goes to infinity).
I am trying to set up a simple numerical MLE estimation of a multinomial distribution.
The multinomial has one constraint - all the cell probabilities need to add up to one.
Usually the way to have this constraint is to re-express one of the probabilities as (1 - sum of the others)
When I run this however, I have a problem as during the optimization procedure, I might have logarithm of a negative value.
Any thoughts of how to fix this? I tried using another optimization package (Rsolnp) and it worked, but I am trying to make it work with the simple default R optim in order to avoid constrained/nonlinear optimization.
Here is my code (I know that I can get the result in this particular case analytically, but this is a toy example, my actual problem is bigger than this here).
set.seed(1234)
test_data <- rmultinom(n = 1, size = 1000, prob = rep(1/4, 4))
N <- test_data
loglik_function <- function(theta){
output <- -1*(N[1]*log(theta[1]) + N[2]*log(theta[2]) + N[3]*log(theta[3]) + N[4]*log(1- sum(theta)))
return(output)
}
startval <- rep(0.1, 3)
my_optim <- optim(startval, loglik_function, lower = 0.0001, upper = 0.9999, method = "L-BFGS-B")
Any thoughts or help would be very much appreciated. Thanks
Full heads-up: I know you asked about (constrained) ML estimation, but how about doing this the Bayesian way à la Stan/rstan. I will remove this if it's not useful/missing the point.
The model is only a few lines of code.
library(rstan)
model_code <- "
data {
int<lower=1> K; // number of choices
int<lower=0> y[K]; // observed choices
}
parameters {
simplex[K] theta; // simplex of probabilities, one for every choice
}
model {
// Priors
theta ~ cauchy(0, 2.5); // weakly informative
// Likelihood
y ~ multinomial(theta);
}
generated quantities {
real ratio;
ratio = theta[1] / theta[2];
}
"
You can see how easy it is to implement the simplex constraint on the thetas using the Stan data type simplex. In the Stan language, simplex allows you to easily implement a probability (unit) simplex
where K denotes the number of parameters (here: choices).
Also note how we use the generated quantities code block, to calculate derived quantities (here ratio) based on the parameters (here theta[1] and theta[2]). Since we have access to the posterior distributions of all parameters, calculating the distribution of derived quantities is trivial.
We then fit the model to your test_data
fit <- stan(model_code = model_code, data = list(K = 4, y = test_data[, 1]))
and show a summary of the parameter estimates
summary(fit)$summary
# mean se_mean sd 2.5% 25%
#theta[1] 0.2379866 0.0002066858 0.01352791 0.2116417 0.2288498
#theta[2] 0.26 20013 0.0002208638 0.01365478 0.2358731 0.2526111
#theta[3] 0.2452539 0.0002101333 0.01344665 0.2196868 0.2361817
#theta[4] 0.2547582 0.0002110441 0.01375618 0.2277589 0.2458899
#ratio 0.9116350 0.0012555320 0.08050852 0.7639551 0.8545142
#lp__ -1392.6941655 0.0261794859 1.19050097 -1395.8297494 -1393.2406198
# 50% 75% 97.5% n_eff Rhat
#theta[1] 0.2381541 0.2472830 0.2645305 4283.904 0.9999816
#theta[2] 0.2615782 0.2710044 0.2898404 3822.257 1.0001742
#theta[3] 0.2448304 0.2543389 0.2722152 4094.852 1.0007501
#theta[4] 0.2545946 0.2638733 0.2822803 4248.632 0.9994449
#ratio 0.9078901 0.9648312 1.0764747 4111.764 0.9998184
#lp__ -1392.3914998 -1391.8199477 -1391.3274885 2067.937 1.0013440
as well as a plot showing point estimates and CIs for the theta parameters
plot(fit, pars = "theta")
Update: Constrained ML estimation using maxLik
You can in fact implement constrained ML estimation using methods provided by the maxLik library. I found it a bit "fiddly", because convergence seems to be quite sensitive to changes in the starting values and the optimisation method used.
For what it's worth, here is a reproducible example:
library(maxLik)
x <- test_data[, 1]
Define the log-likelihood function for a multinomial distribution; I've included an if statement here to prevent theta < 0 cases from throwing an error.
loglik <- function(theta, x)
if (all(theta > 0)) sum(dmultinom(x, prob = theta, log = TRUE)) else 0
I use the Nelder-Mead optimisation method here to find the maximum of the log-likelihood function. The important bit here is the constraints argument that implements a constraint in the form of the equality A theta + B = 0, see ?maxNM for details and examples.
res <- maxNM(
loglik,
start = rep(0.25, length(x)),
constraints = list(
eqA = matrix(rep(1, length(x)), ncol = length(x)),
eqB = -1),
x = x)
We can inspect the results
summary(res)
--------------------------------------------
Nelder-Mead maximization
Number of iterations: 111
Return code: 0
successful convergence
Function value: -10.34576
Estimates:
estimate gradient
[1,] 0.2380216 -0.014219040
[2,] 0.2620168 0.012664714
[3,] 0.2450181 0.002736670
[4,] 0.2550201 -0.002369234
Constrained optimization based on SUMT
Return code: 1
penalty close to zero
1 outer iterations, barrier value 5.868967e-09
--------------------------------------------
and confirm that indeed the sum of the estimates equals 1 (within accuracy)
sum(res$estimate)
#[1] 1.000077
Sample data
set.seed(1234)
test_data <- rmultinom(n = 1, size = 1000, prob = rep(1/4, 4))
I'm trying to do Bayesian Occupancy Analysis with site covariates. My first step is making a function. I keep on getting the + in my R console indicating it thinks my code is incomplete. Having ran lines individually I am pretty certain the issue lies in the first line of code. However, I can't work out where exactly I've missed something out and hence where the problem originally lies.
data.fn <- function(R = 39, T = 14, xmin= 0, xmax= 1, alpha.psi = 0.4567,
beta.psi = 0.0338, alpha.p = 0.4, beta.p = 0.4) {
y <- array(dim = c(R,T)) #This creates an array for counts
#Ecological Process
#Covariate values
X <- sort(runif(n=R, min = xmin, max = xmax))
#Expected occurence-covariate relationship
psi <- plogis(alpha.psi + beta.psi *X) #this applies the inverse logit
#Add Bernoulli Noise - drawing indicator of occurence (z) from bernoulli psi
z<- rbinom(n = R, size = 1, prob = psi)
occ.fs <- sum(z) #"Finite Sample Occupancy"
"Make a census"
p.eff <- z*p
for (i in 1:T) {
y[,i] <- rbinom(n=R, size = 1, prob = p.eff)
}
}
There's more code - i.e. the {} function is complete but the issue started before that was ran and I keep having issues uploading the code into Stack.
The error message is simply + all down the left hand side of the R console
EDIT
Could there be something wrong with how R is sensing stuff? For instance with the following code
naive.pred <- plogis(predict(glm(apply(y, 1, max) ~ X + I (X^2),
family = binomial)))
I got the error message - unexpected symbol (the bracket) in the family = binomial yet each bracket is paired correctly- there's no extra unnecessary brackets?
While I did not see a + issue when I looked over your code, you are not simulating the observed data correctly and there was a p object inside your function that did not have an argument passed to it. You did create a logit linear predictor for psi using alpha.psi and beta.psi, however you are lacking a logit linear predictor for the probability of detecting a species given that they are there using alpha.p and beta.p. Assuming that the covariate X is used for both the latent occupancy state and the observation model the code should become.
data.fn <- function(R = 39, T = 14, xmin= 0, xmax= 1, alpha.psi = 0.4567,
beta.psi = 0.0338, alpha.p = 0.4, beta.p = 0.4) {
y <- array(dim = c(R,T)) #This creates an array for counts
#Ecological Process
#Covariate values
X <- sort(runif(n=R, min = xmin, max = xmax))
#Expected occurence-covariate relationship
psi <- plogis(alpha.psi + beta.psi *X) #this applies the inverse logit
#Add Bernoulli Noise - drawing indicator of occurence (z) from bernoulli psi
z<- rbinom(n = R, size = 1, prob = psi)
occ.fs <- sum(z) #"Finite Sample Occupancy"
# Linear predictor for detection,
# assuming the same covariate is used for detection
p.eff <- plogis(alpha.p + beta.p * X)
for (i in 1:T) {
y[,i] <- rbinom(n=R, size = 1, prob = p.eff * z)
}
return(list(y = y, z = z, X = X, occ.fs = occ.fs))
}
This code assumes that you are passing logit scale parameters to the data, so if you are trying to simulate data such that X has a very marginal and positive influence on occupancy then you are off to the races, so to speak. If you are looking for a more pronounced effect, than you should increase the effect size. Finally, 39 site is very few for an occupancy analysis given that binary deteciton / non-detection data is quite information poor. Don't be surprised if you posterior estimates you get from analyzing the dataset do not actually return the parameters used to simulate the data.