directed cycle error in JAGS when making multiple predictions - r

I fit a JAGS model and then use it to make predictions, propagating parameter uncertainty into those predictions. I currently have a zero-inflated Poisson (ZIP) model that runs fine when making a single prediction. It looks like this:
j.model ="
model{
for (i in 1:N){
#this fits the blended model to your data.
y[i] ~ dpois(m[i]*t[i])
#This blends the poisson and zero inflation models
m[i] <- mu[i]*x[i] + 0.00001
#this is the bernoulli outcome of the zero inflation
x[i] ~ dbern(pro[i])
#this logit transforms the theta model for 0-1 probability of zero inflation
logit(pro[i]) <- theta[i]
#mu[i] is predictors for poisson model- log link function.
log(mu[i]) <- int.p + x1[i]*b1.p + x2[i]*b2.p + x3[i]*b3.p + x4[i]*b4.p + x5[i]*b5.p
#theta[i] is predictors for zero inflation
theta[i] <- int.z + x1[i]*b1.z + x2[i]*b2.z + x3[i]*b3.z + x4[i]*b4.z + x5[i]*b5.z
}
#predictions
#This gets the range of pred.m blended model values, before they get sent through the poisson distribution, and corrected for time.
#x5 prediction
for (i in 1:N.pred){
x5.pred[i] <- pred5.mu[i]*pred5.x[i] + 0.00001
pred5.x[i] ~ dbern(pred5.pro[i])
logit(pred5.pro[i]) <- pred5.theta[i]
log(pred5.mu[i]) <- int.p + x1.m*b1.p + x2.m*b2.p + x3.m*b3.p + x4.m*b4.p + x5.r[i]*b5.p
pred5.theta[i] <- int.z + x1.m*b1.z + x2.m*b2.z + x3.m*b3.z + x4.m*b4.z + x5.r[i]*b5.z
}
#priors
b1.p ~ dnorm(0, .0001)
b2.p ~ dnorm(0, .0001)
b3.p ~ dnorm(0, .0001)
b4.p ~ dnorm(0, .0001)
b5.p ~ dnorm(0, .0001)
b1.z ~ dnorm(0, .0001)
b2.z ~ dnorm(0, .0001)
b3.z ~ dnorm(0, .0001)
b4.z ~ dnorm(0, .0001)
b5.z ~ dnorm(0, .0001)
int.p ~ dnorm(0, .0001)
int.z ~ dnorm(0, .0001)
}
"
However, if I insert a second prediction loop into this model, for predictor x4, I get an error. The second prediction loop looks like this, and is inserted between the model loop and the x5 prediction loop:
# x4 prediction
for (i in 1:N.pred){
  map.pred[i] <- pred4.mu[i]*pred4.x[i] + 0.00001
  pred4.x[i] ~ dbern(pred4.pro[i])
  logit(pred4.pro[i]) <- pred4.theta[i]
  log(pred4.mu[i]) <- int.p + x1.m*b1.p + x2.m*b2.p + x3.m*b3.p + x4.r[i]*b4.p + x5.m*b5.p
  pred4.theta[i] <- int.z + x1.m*b1.z + x2.m*b2.z + x3.m*b3.z + x4.r[i]*b4.z + x5.m*b5.z
}
It returns this error:
RUNTIME ERROR:
Compilation error on line 20.
Unable to resolve node pred4.mu[1]
This may be due to an undefined ancestor node or a directed cycle in the graph
This surprises me: the new loop is formatted exactly like the x5 prediction loop, which worked fine on its own, so everything in it should be defined. Is there something about having two of these loops that causes the problem?
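For completeness, here is a sketch of how I call the model. The data list must supply every constant the model references (y, t, N, the covariates, the *.m means, x4.r, x5.r, N.pred); JAGS reports exactly this "Unable to resolve node" error when one of them has an undefined ancestor. The mean() calls for the held-constant covariates are just illustrative placeholders:
library(rjags)

# Sketch of the call; if x4.r (or any of the *.m means) is missing from this
# list, pred4.mu[1] has an undefined ancestor and compilation fails.
j.data <- list(y = y, t = t, N = length(y),
               x1 = x1, x2 = x2, x3 = x3, x4 = x4, x5 = x5,
               x1.m = mean(x1), x2.m = mean(x2), x3.m = mean(x3),
               x4.m = mean(x4), x5.m = mean(x5),
               x4.r = x4.r, x5.r = x5.r, N.pred = length(x5.r))
j.fit <- jags.model(textConnection(j.model), data = j.data, n.chains = 3)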

Related

"this initial value does not correspond to a stochastic node" in WinBugs

I gave initial values for all the stochastic nodes, but WinBUGS still gives me the message
this initial value does not correspond to a stochastic node
Which node am I missing here?
#Model
model
{
  for (i in 1:N) {
    Y[i] ~ dnorm(mu[i], tau)
    Z[i] ~ dnorm(mu[i], tau)
    mu[i] <- theta[1] + theta[2]*exp(-exp(-theta[3]*(x[i]-theta[4])))
  }
  theta[1] <- 1.8
  theta[2] ~ dnorm(0, 0.01)
  theta[3] ~ dnorm(0, 0.01)
  theta[4] ~ dnorm(0, 0.01)
  tau ~ dgamma(0.001, 0.001)
  sigma <- 1/sqrt(tau)
}
#Data
list(x=c(1971,1976,1981,1986,1991,1996,2001,2006,2011,2016,2021,2026,2031,2036,2041,2046,2051,2056,2061),
Y=c(5.2,4.7,4.5,4.2,3.6,3.4,3.1,2.8,2.4,2.3,NA,NA,NA,NA,NA,NA,NA,NA,NA), N=19)
#Initial values
list(theta=c(0,0,0), tau=1)
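(A detail worth noting when reading this: theta[1] <- 1.8 makes theta[1] a logical, not a stochastic, node, so initial values may only be supplied for theta[2:4] and tau. A sketch of an inits list consistent with that, using NA to skip the logical element:)
#Initial values (sketch; theta[1] is a logical node and must be NA here)
list(theta=c(NA, 0, 0, 0), tau=1)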

JAGS multiple linear regression with y[i] GAMMA (Bayesian)

I have a question about this model in JAGS: I want to fit a Bayesian linear regression where y[i] follows a gamma distribution rather than a normal one.
The model is this:
"model {
Priors:
a ~ dnorm(0, 0.0001) # mean, precision = N(0, 10^4)
b ~ dnorm(0, 0.0001)
shape ~ dunif(0, 100)
# Likelihood data model:
for (i in 1:N) {
linear_predictor[i] <- a + b * x[i]
# dgamma(shape, rate) in JAGS:
y[i] ~ dgamma(shape, shape / exp(linear_predictor[i]))
}
}
"
What should I change to make this code usable for a multiple linear regression with the following data?
dataListGamma = list(
  x = x,
  y = y,
  Nx = dim(x)[2],
  Ntotal = dim(x)[1]
)
I'm receiving this error:
Error in node (shape/(exp(linear_predictor[1331])))
How can this be possible? I can't understand it.
If I run it again, the index that triggers the problem changes.
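(Worth ruling out first, since dgamma has support only on y > 0: a zero or negative response makes the likelihood at that node invalid. A quick check:)
any(y <= 0)  # TRUE means some responses fall outside the gamma support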
Something like this (making b a vector with identical, independent priors for each element, and building the linear predictor with inprod(), since JAGS does not allow a node such as linear_predictor[i] to be assigned more than once) should work:
model {
  # Priors:
  a ~ dnorm(0, 0.0001) # mean, precision = N(0, 10^4)
  for (j in 1:Nx) {
    b[j] ~ dnorm(0, 0.0001)
  }
  shape ~ dunif(0, 100)
  # Likelihood data model:
  for (i in 1:Ntotal) {
    linear_predictor[i] <- a + inprod(b[], x[i, ])
    y[i] ~ dgamma(shape, shape / exp(linear_predictor[i]))
  }
}
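Under this parameterisation the rate is shape / mean, so exp(linear_predictor[i]) is the mean of y[i]. A sketch of running it with rjags, assuming the corrected model above is stored in model_string and that x in dataListGamma is a numeric Ntotal-by-Nx matrix:
library(rjags)

jm <- jags.model(textConnection(model_string), data = dataListGamma,
                 n.chains = 3, n.adapt = 1000)
update(jm, 1000)  # burn-in
samp <- coda.samples(jm, variable.names = c("a", "b", "shape"), n.iter = 5000)
summary(samp)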

Fitting a polynomial regression model selected by `leaps::regsubsets`

I have performed best subset selection for a linear regression model using leaps::regsubsets. I then chose the model with 14 predictors, and coef(model, 14) gave the following output:
structure(c(16.1303774392893, -0.0787496652705482, -0.104929454314886,
-1.22322411065346, 1.14718778105312, 0.75468065020279, 0.455617836039703,
0.521951041899427, 0.0124590834643436, -0.0002293804247409,
1.26667965342874e-07, 1.4002805624594e-06, -9.90560347112683e-07,
1.8809273394337e-06, 5.48249071436573e-07), .Names = c("(Intercept)", "X1",
"X2", "poly(X4, 2)1", "poly(X5, 2)1", "poly(X6, 2)2", "poly(X7, 2)2",
"poly(X9, 2)1", "X10", "X12", "X13", "X14", "X16", "X17", "X18"))
To get this model, I need to fit it with lm. As poly(X, 2)1 is linear and poly(X, 2)2 is quadratic, I did:
lm(X20 ~ X1 + X2 + X4 + X5 + I(X6 ^ 2) + I(X7 ^ 2) +
X9 + X10 + X12 + X13 + X14 + X16 + X17 + X18, df)
I think I know why the coefficients are different (see poly() in lm(): difference between raw vs. orthogonal), but why don't they give the same fitted values and adjusted R2?
Of course, using poly(X, 2)[,2] in the formula gives complete consistency with the regsubsets output. But is it valid to use only the second term of an orthogonal polynomial and specify the model as follows?
lm(X20 ~ X1 + X2 + X4 + X5 + poly(X6, 2)[,2] + poly(X7, 2)[,2] +
X9 + X10 + X12 + X13 + X14 + X16 + X17 + X18, df)
Is there a more direct way to retrieve a single model from regsubsets output than specifying the model by hand?
but why don't they give the same fitted values and adjusted R2?
Fitted values won't necessarily be the same if you don't use all columns from poly.
set.seed(0)
y <- runif(100)
x <- runif(100)
X <- poly(x, 3)
all.equal(lm(y ~ X)$fitted, lm(y ~ x + I(x ^ 2) + I(x ^ 3))$fitted)
#[1] TRUE
all.equal(lm(y ~ X[, 1:2])$fitted, lm(y ~ x + I(x ^ 2))$fitted)
#[1] TRUE
all.equal(lm(y ~ X - 1)$fitted, lm(y ~ x + I(x ^ 2) + I(x ^ 3) - 1)$fitted) ## no intercept
#[1] "Mean relative difference: 33.023"
all.equal(lm(y ~ X[, c(1, 3)])$fitted, lm(y ~ x + I(x ^ 3))$fitted)
#[1] "Mean relative difference: 0.03008166"
all.equal(lm(y ~ X[, c(2, 3)])$fitted, lm(y ~ I(x ^ 2) + I(x ^ 3))$fitted)
#[1] "Mean relative difference: 0.03297488"
We only have ~ 1 + poly(x, degree)[, 1:k] equivalent to ~ 1 + x + I(x ^ 2) + ... + I(x ^ k), for any k <= degree. (I explicitly write out the intercept to emphasize that we have to start from the polynomial of degree 0.)
(The reason is related to how an orthogonal polynomial is generated. See How `poly()` generates orthogonal polynomials? How to understand the "coefs" returned? for great detail. Note that in a QR factorization X = QR, since R is an upper triangular matrix (not a diagonal matrix), Q[, ind] will not have the same column space as X[, ind] for an arbitrary subset ind, unless ind = 1:k.)
So I(x ^ 2) is not equivalent to poly(x, 2)[, 2], and you will get different fitted values and hence a different (adjusted) R2.
is it valid to use only second term orthogonal polynomial and specify the model as follows?
It is really a bad idea for leaps (or, generally, any modeler) to drop columns from an orthogonal polynomial. An orthogonal polynomial is a factor-like term, whose significance is determined by an F-statistic (i.e., treating all columns as a whole), rather than by t-statistics for individual columns.
In fact, even for raw polynomials it is not a good idea to omit a low-order term. For example, y ~ 1 + I(x ^ 2), which omits the linear term, is not a good idea. A basic problem is that it is not invariant to a linear shift. For example, if we shift x to get x1:
shift <- runif(1) ## an arbitrary value; can be `mean(x)`
x1 <- x - shift
then y ~ 1 + I(x ^ 2) is not equivalent to y ~ 1 + I(x1 ^ 2), but y ~ 1 + x + I(x ^ 2) is still equivalent to y ~ 1 + x1 + I(x1 ^ 2).
all.equal(lm(y ~ 1 + I(x ^ 2))$fitted, lm(y ~ 1 + I(x1 ^ 2))$fitted)
#[1] "Mean relative difference: 0.02020984"
all.equal(lm(y ~ 1 + x + I(x ^ 2))$fitted, lm(y ~ 1 + x1 + I(x1 ^ 2))$fitted)
#[1] TRUE
I briefly mentioned the issue of dropping columns at R: How to or should I drop an insignificant orthogonal polynomial basis in a linear model?, but my examples here give you more insight.
Is there more direct way to retrieve single model from regsubsets output than specifying the model by hand?
I don't know; at least I did not figure one out when I answered this thread almost two years ago: Get all models from leaps regsubsets.
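That said, here is a sketch of one way to automate the by-hand step (with all the caveats above about dropping columns from an orthogonal polynomial): rebuild the formula from the names of the selected coefficients, rewriting names like poly(X6, 2)2 into the column-subset form poly(X6, 2)[, 2]. Here model, df and the response X20 are the objects from the question:
cf <- coef(model, 14)                      # named coefficients from regsubsets
vars <- setdiff(names(cf), "(Intercept)")  # keep predictors only
# "poly(X6, 2)2" names a single column of an orthogonal polynomial, so it has
# to be rewritten as "poly(X6, 2)[, 2]" to be a valid term in a formula.
vars <- gsub("(poly\\([^)]*\\))([0-9]+)$", "\\1[, \\2]", vars)
fit <- lm(reformulate(vars, response = "X20"), data = df)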
One remaining question, though. Assuming that leaps returns poly(X, 2)2, I should definitely retain poly(X, 2)1 in my model. But what if only poly(X, 2)1 is returned by leaps? Can the higher-order term be dropped then?
There is no problem dropping higher-order terms (in this case, where you originally fitted a quadratic polynomial). As I said, we have equivalence for ind = 1:k with k <= degree. But make sure you understand this; take the following two examples.
If leaps drops poly(x, 5)3 and poly(x, 5)5, you can safely remove poly(x, 5)5 but are still advised to retain poly(x, 5)3. That is, instead of fitting a 5th-order polynomial, you fit a 4th-order one.
If leaps drops poly(x, 6)3 and poly(x, 6)5, then since poly(x, 6)6 is not dropped, you are advised not to drop any terms at all.

Text string in Knitr (Rstudio): Error in parse

I am attempting to use knitr through RStudio to document a model that saves a text string to a *.txt file.
When doing so, I get this R Markdown error message:
Error in parse(text = x, srcfile = src) : <text>:2:24: unexpected INCOMPLETE_STRING
14: var.m <- 1/tau.m # between-trial variance
15:
Calls: <Anonymous> ... <Anonymous> -> parse_all -> parse_all.character -> parse
Does anyone know how to fix this?
This string works fine:
Modelstring.baseline = " Text goes here "
This longer string also works fine:
Modelstring.baseline = "
# Binomial likelihood, logit link, MTC
# Fixed effect model
#CV mortality
model{ # *** PROGRAM STARTS
for(i in 1:ns){ # LOOP THROUGH STUDIES
mu[i] ~ dnorm(0,.0001) # vague priors for all trial baselines
for (k in 1:na[i]) { # LOOP THROUGH ARMS
r[i,k] ~ dbin(p[i,k],n[i,k]) # binomial likelihood
logit(p[i,k]) <- mu[i] + d[t[i,k]]-d[t[i,1]] # model for linear predictor
rhat[i,k] <- p[i,k] * n[i,k] # expected value of the numerators
dev[i,k] <- 2 * (r[i,k] * (log(r[i,k])-log(rhat[i,k])) # Deviance contribution
+ (n[i,k]-r[i,k]) * (log(n[i,k]-r[i,k]) - log(n[i,k]-rhat[i,k])))
}
resdev[i] <- sum(dev[i,1:na[i]]) # summed residual deviance contribution for this trial
}
totresdev <- sum(resdev[]) # Total Residual Deviance
d[1]<- 0 # treatment effect is zero for reference treatment
for (k in 2:nt) { d[k] ~ dnorm(0,.0001) } # vague priors for treatment effects
"
While this string generates a parse error:
Modelstring.baseline = "
model{ # *** PROGRAM STARTS
for (i in 1:ns)
{ # LOOP THROUGH STUDIES
r[i] ~ dbin(p[i],n[i]) # Likelihood
logit(p[i]) <- mu[i] # Log-odds of response
mu[i] ~ dnorm(m,tau.m) # Random effects model
}
mu.new ~ dnorm(m,tau.m) # predictive dist. (log-odds)
m ~ dnorm(0,.0001) # vague prior for mean
var.m <- 1/tau.m # between-trial variance
#---Non-informative prior
#tau.m <- pow(sd.m,-2)
#sd.m ~ dunif(0,5)
#---Vaguely informative prior
#tau.m ~ dgamma(0.001,.001)
#sd.m ~ pow(tau.m,-0.5)
#---Informative prior R.M Turner et al LN(-3.95, 1.79)
tau.m <- 1/tausq
tausq ~ dlnorm(-3.95, 0.31) #0.31 = 1/(1.79*1.79)
}
"

Bayesian ANCOVA in R via jags

I'm trying to implement a Bayesian ANCOVA that accounts for heteroscedasticity in R using JAGS. However, despite going through several tutorials on Bayesian simple regression and ANOVA, I can't work out how to prepare the model file for JAGS. Here is my code so far:
y1 = rexp(57, rate=0.8)          # dependent variable
x1 = rbeta(57, 6, 2)             # continuous covariate
x2 = rep(c(1, 2), length.out=57) # categorical factor
groups = 2
n = 57
# list of variables
lddados <- list(g=groups, n=n, y=y1, x1=x1, x2=x2)
sink('reglin.txt') # file name goes here
cat('
# model
{
for(i in 1:n){
mu[i] = a0 + a[i]
y[i] = a0 + x1*a[ x2[i] ] + ε[i]
}
priors
y ~ dgamma(0.001,0.01)
for(i in 1:n){
inter[i] ~ dgamma(0.001,0.001)
coef[i] ~ dnorm(0.0,1.0E-
likelihood
got stuck...
}
}#------end of model
')
sink()
sink()
I'm currently trying out ANCOVA using rjags myself...
From my understanding, I would test something like this (untested):
require(rjags)
require(coda)
model_string <- "
model {
for ( i in 1:n ){
mu[i] <- a0 + a[x2[i]] + a3 * x1[i] # linear predictor
y[i] ~ dnorm(mu[i], prec) # y is norm. dist.
}
# priors
a0 ~ dnorm(0, 1.0E-6) # intercept
a[1] ~ dnorm(0, 1.0E-6) # effect of x1 at x2 level 1
a[2] ~ dnorm(0, 1.0E-6) # effect of x1 at x2 level 2
a3 ~ dnorm(0, 1.0E-6) # regression coefficient for x1 (covariate)
prec ~ dgamma(0.001, 0.001) # precision (inverse of variance)
}
"
# initial values for the MCMC; the names must match the model's stochastic nodes
inits_list <- list(a0=0, a=c(0,0), a3=0, prec=100)
# model, initial values and data in the right format
jags_model <- jags.model(textConnection(model_string), data=data, inits=inits_list, n.adapt=500, n.chains=3, quiet=T)
# burn-in
update(jags_model, 10000)
# run the MCMC chains using the coda package
mcmc_samples <- coda.samples(jags_model, c("mu", "a0", "a", "a3", "prec"), n.iter=100000)
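The data object passed to jags.model is left implicit above; a minimal sketch matching the names the model uses, with y1, x1, x2 and n taken from the question's simulation code:
# hypothetical data list; the model only references y, x1, x2 and n
data <- list(y = y1, x1 = x1, x2 = x2, n = n)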
Tell me if it works...
Recommended books: McCarthy M., Bayesian Methods for Ecology, and Kruschke J.K., Doing Bayesian Data Analysis.
