Intro to JAGS analysis - r

I am a student studying Bayesian statistics and have just begun to use JAGS with an intro script written by my lecturer; we (the students) only have to enter the data and the number of iterations. The following is the script with my data added:
setwd("C:\\Users\\JohnSmith\\Downloads")
rawdata = read.table("bwt.txt",header=TRUE)
Birthweight = rawdata$Birthweight
Age = rawdata$Age
model = "model
{
beta0 ~ dnorm(0, 1/1000^2)
beta1 ~ dnorm(0, 1/1000^2)
log_sigma ~ dunif(-10, 10)
sigma <- exp(log_sigma)
for(i in 1:N)
{
mu[i] <- beta0 + beta1 * Age[i]
Birthweight[i] ~ dnorm(mu[i], 1/sigma^2)
}
}
"
data = list(x=Birthweight, y=Age, N=24)
# Variables to monitor
variable_names = c('beta0','beta1')
# How many burn-in steps?
burn_in = 1000
# How many proper steps?
steps = 100000
# Thinning?
thin = 10
# Random number seed
seed = 2693795
# NO NEED TO EDIT PAST HERE!!!
# Just run it all and use the results list.
library('rjags')
# Write model out to file
fileConn=file("model.temp")
writeLines(model, fileConn)
close(fileConn)
if(all(is.na(data)))
{
m = jags.model(file="model.temp", inits=list(.RNG.seed=seed, .RNG.name="base::Mersenne-Twister"))
} else
{
m = jags.model(file="model.temp", data=data, inits=list(.RNG.seed=seed, .RNG.name="base::Mersenne-Twister"))
}
update(m, burn_in)
draw = jags.samples(m, steps, thin=thin, variable.names = variable_names)
# Convert to a list
make_list <- function(draw)
{
results = list()
for(name in names(draw))
{
# Extract "chain 1"
results[[name]] = as.array(draw[[name]][,,1])
# Transpose 2D arrays
if(length(dim(results[[name]])) == 2)
results[[name]] = t(results[[name]])
}
return(results)
}
results = make_list(draw)
However, when I run the code above, I get the following error message:
Error in jags.model(file = "model.temp", data = data, inits = list(.RNG.seed = seed, :
RUNTIME ERROR:
Compilation error on line 11.
Unknown parameter Age
In addition: Warning messages:
1: In jags.model(file = "model.temp", data = data, inits = list(.RNG.seed = seed, :
Unused variable "x" in data
2: In jags.model(file = "model.temp", data = data, inits = list(.RNG.seed = seed, :
Unused variable "y" in data
But as far as I can see, line 11 is blank, which leaves me stumped as to where the error is coming from. If anyone can give me some tips on how to solve this, it would be greatly appreciated.

The names of the elements of your list of data (data) should match the names of the variables in your model.
You have:
data = list(x=Birthweight, y=Age, N=24)
so JAGS is looking for variables called x and y in your model. However, in your model, you have:
mu[i] <- beta0 + beta1 * Age[i]
Birthweight[i] ~ dnorm(mu[i], 1/sigma^2)
That is, your variables are called Age and Birthweight.
So, either change your list to:
data <- list(Birthweight=Birthweight, Age=Age, N=24)
or change your model to:
mu[i] <- beta0 + beta1 * y[i]
x[i] ~ dnorm(mu[i], 1/sigma^2)
If you run readLines('model.temp') (or open model.temp in a text editor), you will see that line 11 of that file is the line containing mu[i] <- beta0 + beta1 * Age[i]. That is the first error JAGS encountered: the model refers to Age, for which neither data nor a prior was provided.
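A quick way to catch this kind of mismatch before calling jags.model is to compare the names in the data list with the variable names the model uses, and to print the model file with line numbers. The following is only a minimal sketch: the vector model_vars is typed by hand for illustration, the data values are placeholders, and the small model file is a stand-in for model.temp.

```r
# Variable names the model expects as data (hand-written for this sketch)
model_vars <- c("Birthweight", "Age", "N")

# Data list named as in the question, with made-up placeholder values
data <- list(x = c(3.1, 2.9, 3.4), y = c(25, 31, 28), N = 3)

# Names the model needs but the list does not supply
missing_from_data <- setdiff(model_vars, names(data))
# Names the list supplies but the model never uses
unused_in_model <- setdiff(names(data), model_vars)
print(missing_from_data)
print(unused_in_model)

# Print a model file with line numbers, to locate "error on line N" messages
model_file <- tempfile(fileext = ".temp")
writeLines(c("model", "{", "  beta0 ~ dnorm(0, 1/1000^2)", "}"), model_file)
model_lines <- readLines(model_file)
cat(sprintf("%2d: %s", seq_along(model_lines), model_lines), sep = "\n")
```

The two setdiff results correspond to the "Unknown parameter" error and the "Unused variable" warnings respectively.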

Related

Error: Attempt to redefine node in linear regression

I have fitted the following simple Bayesian linear regression model using rjags.
I was able to run the model by specifying all the predictors separately (as for an lm object). Now I want to learn how to specify the predictors by passing them as a matrix instead.
So I ran the following code, but it gave an error.
I used the tobacco data set in the rrr package to provide a reproducible example.
library(rrr)
require(dplyr)
library(rjags)
tobacco <- as_data_frame(tobacco)
N1 = length(tobacco$Y1.BurnRate)
x1 = model.matrix(Y1.BurnRate~X2.PercentChlorine+X3.PercentPotassium ,data = tobacco)
bayes_model_mul1=
"model {
for(i in 1:N1){
Y1.BurnRate[i]~dnorm(mu1[i],tau1)
for(j in 1:3){
mu1[i]=beta1[j]*x1[i,j]
}
}
for (l in 1:3) { beta1[l] ~dnorm(0, 0.001) }
tau1 ~ dgamma(.01,.01)
sigma_tau1 = 1/tau1
}"
model3 <- jags.model(textConnection(bayes_model_mul1),
data = list(Y1.BurnRate=tobacco$Y1.BurnRate, x1=x1, N1=N1),
n.chains=1)
After running model3, I got the following error:
Error in jags.model(textConnection(bayes_model_mul1), data = list(Y1.BurnRate = tobacco$Y1.BurnRate, :
RUNTIME ERROR:
Compilation error on line 6.
Attempt to redefine node mu1[1]
Can anyone help me figure this out?
Is this due to introducing the predictors as a matrix?
There are a few ways to do this, here are two:
Use matrix multiplication outside of the likelihood loop
m1 =
"model {
mu1 = x1 %*% beta1 # ---> this
for(i in 1:N1){
Y1.BurnRate[i] ~ dnorm(mu1[i], tau1)
}
for (l in 1:3) { beta1[l] ~ dnorm(0, 0.001) }
tau1 ~ dgamma(.01,.01)
sigma_tau1 = 1/tau1
}"
Use inprod to multiply the parameters with the design matrix
m2 =
"model {
for(i in 1:N1){
mu1[i] = inprod(beta1, x1[i,]) #----> this
Y1.BurnRate[i] ~ dnorm(mu1[i], tau1)
}
for (l in 1:3) { beta1[l] ~ dnorm(0, 0.001) }
tau1 ~ dgamma(.01,.01)
sigma_tau1 = 1/tau1
}"
You received an error with for(j in 1:3){ mu1[i] = beta1[j]* x1[i,j] } because every pass through the parameter index j defines mu1[i] again, and JAGS allows each node to be defined only once. The loop also doesn't sum the individual terms. You may be able to index mu1 with j as well and then sum, but that is untested ...
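To see outside of JAGS what the two suggestions compute, here is a small plain-R sketch. It assumes a made-up 5 x 3 design matrix and made-up coefficients (not the tobacco data), and uses sum(beta1 * x1[i, ]) as the R analogue of inprod.

```r
set.seed(42)
n <- 5
# Made-up design matrix: intercept column plus two random predictors
x1 <- cbind(1, matrix(rnorm(n * 2), nrow = n))
beta1 <- c(0.5, -1, 2)

# Option 1: matrix multiplication gives the whole linear predictor at once
mu_matrix <- as.numeric(x1 %*% beta1)

# Option 2: row-wise inner product, the R analogue of inprod(beta1, x1[i,])
mu_inprod <- vapply(seq_len(n), function(i) sum(beta1 * x1[i, ]), numeric(1))

# The loop from the question: R permits the reassignment, but each pass
# overwrites mu_last[i], so only the j = 3 term survives and nothing is summed
mu_last <- numeric(n)
for (i in seq_len(n)) {
  for (j in 1:3) {
    mu_last[i] <- beta1[j] * x1[i, j]
  }
}

print(all.equal(mu_matrix, mu_inprod))
print(all.equal(mu_last, beta1[3] * x1[, 3]))
```

In R the inner loop silently keeps the last term; in JAGS the same repeated definition is rejected at compile time, which is the "Attempt to redefine node" message.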

Constrain order of parameters in R JAGS

I am puzzled by a simple question in R JAGS. I have for example, 10 parameters: d[1], d[2], ..., d[10]. It is intuitive from the data that they should be increasing. So I want to put a constraint on them.
Here is what I tried to do, but it gives an error message saying "Node inconsistent with parents":
model{
...
for (j in 1:10){
d.star[j]~dnorm(0,0.0001)
}
d=sort(d.star)
}
Then I tried this:
d[1]~dnorm(0,0.0001)
for (j in 2:10){
d[j]~dnorm(0,0.0001)I(d[j-1],)
}
This worked, but I don't know if this is the correct way to do it. Could you share your thoughts?
Thanks!
If you are ever uncertain about something like this, it is best to just simulate some data to determine if the model structure you suggest works (spoiler alert: it does).
Here is the model that I used:
cat('model{
d[1] ~ dnorm(0, 0.0001) # intercept
d[2] ~ dnorm(0, 0.0001)
for(j in 3:11){
d[j] ~ dnorm(0, 0.0001) I(d[j-1],)
}
for(i in 1:200){
y[i] ~ dnorm(mu[i], tau)
mu[i] <- inprod(d, x[i,])
}
tau ~ dgamma(0.01,0.01)
}',
file = "model_example.R")
And here are the data I simulated to use with this model.
library(run.jags)
library(mcmcplots)
# intercept with sorted betas
set.seed(161)
betas <- c(1,sort(runif(10, -5,5)))
# make covariates, 1 for intercept
x <- cbind(1,matrix(rnorm(2000), nrow = 200, ncol = 10))
# deterministic part of model
y_det <- x %*% betas
# add noise
y <- rnorm(length(y_det), y_det, 1)
data_list <- list(y = as.numeric(y), x = x)
# fit the model
mout <- run.jags('model_example.R',monitor = c("d", "tau"), data = data_list)
Following this, we can plot out the estimates and overlay the true parameter values
caterplot(mout, "d", reorder = FALSE)
points(rev(c(1:11)) ~ betas, pch = 18,cex = 0.9)
The black points are the true parameter values; the blue points and lines are the estimates. It looks like this setup does fine so long as there are enough data to estimate all of those parameters.
It looks like there is a syntax error in the first implementation. Just try:
model{
...
for (j in 1:10){
d.star[j]~dnorm(0,0.0001)
}
d[1:10] <- sort(d.star) # notice d is indexed.
}
and compare the results with those of the second implementation. According to the documentation, these are both correct, but it is advised to use the function sort.
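As a rough illustration of what the sort-based prior does, the following plain-R sketch draws ten unconstrained values and sorts them. It assumes the same prior as above; since JAGS parameterizes dnorm by precision, a precision of 0.0001 corresponds to a standard deviation of 100. The seed is arbitrary.

```r
set.seed(7)

# Ten unconstrained prior draws; precision 0.0001 means sd = 1 / sqrt(0.0001) = 100
d_star <- rnorm(10, mean = 0, sd = 1 / sqrt(0.0001))

# Sorting yields an increasing vector, mirroring d[1:10] <- sort(d.star)
d <- sort(d_star)

print(d)
print(all(diff(d) >= 0))
```

The sorted vector is just a reordering of the same draws, so the ordering constraint holds by construction for every sample.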

Function lipsitz.test {generalhoslem} is not working for object clm{ordinal}

I am now trying to test the goodness of fit of an ordinal model using lipsitz.test {generalhoslem}. According to the documentation, the function can deal with both polr and clm. However, when I try to use clm in the lipsitz.test function, an error occurs. Here is an example:
library("ordinal")
library(generalhoslem)
data("wine")
fm1 <- clm(rating ~ temp * contact, data = wine)
lipsitz.test(fm1)
Error in names(LRstat) <- "LR statistic" :
'names' attribute [1] must be the same length as the vector [0]
In addition: Warning message:
In lipsitz.test(fm1) :
n/5c < 6. Running this test when n/5c < 6 is not recommended.
Is there any solution to fix this? Thanks a lot.
I'm not sure if this is off-topic and should be on CrossValidated. It's partly a problem with the coding of the test and partly about the statistics of the test itself.
There are two problems. I've just spotted a bug in the code when using clm and will push a fix to CRAN (corrected code below).
There does however appear to be a more fundamental problem with the example data. Basically, the Lipsitz test requires fitting a new model with dummy variables of the groupings. When fitting the new model with this example, the model fails and so some of the coefficients are not calculated. If using polr, the new model gets the warning that it is rank-deficient; if using clm, the new model gets a message that two coefficients are not fitted due to singularities. I think this example data set is just unsuitable for this kind of analysis.
The corrected code is below and I have used a larger example dataset on which the test runs.
lipsitz.test <- function (model, g = NULL) {
oldmodel <- model
if (class(oldmodel) == "polr") {
yhat <- as.data.frame(fitted(oldmodel))
} else if (class(oldmodel) == "clm") {
predprob <- oldmodel$model[, 2:ncol(oldmodel$model)]
yhat <- predict(oldmodel, newdata = predprob, type = "prob")$fit
} else warning("Model is not of class polr or clm. Test may fail.")
formula <- formula(oldmodel$terms)
DNAME <- paste("formula: ", deparse(formula))
METHOD <- "Lipsitz goodness of fit test for ordinal response models"
obs <- oldmodel$model[1]
if (is.null(g)) {
g <- round(nrow(obs)/(5 * ncol(yhat)))
if (g < 6)
warning("n/5c < 6. Running this test when n/5c < 6 is not recommended.")
}
qq <- unique(quantile(1 - yhat[, 1], probs = seq(0, 1, 1/g)))
cutyhats <- cut(1 - yhat[, 1], breaks = qq, include.lowest = TRUE)
dfobs <- data.frame(obs, cutyhats)
dfobsmelt <- melt(dfobs, id.vars = 2)
observed <- cast(dfobsmelt, cutyhats ~ value, length)
if (g != nrow(observed)) {
warning(paste("Not possible to compute", g, "rows. There might be too few observations."))
}
oldmodel$model <- cbind(oldmodel$model, cutyhats = dfobs$cutyhats)
oldmodel$model$grp <- as.factor(vapply(oldmodel$model$cutyhats,
function(x) which(observed[, 1] == x), 1))
newmodel <- update(oldmodel, . ~ . + grp, data = oldmodel$model)
if (class(oldmodel) == "polr") {
LRstat <- oldmodel$deviance - newmodel$deviance
} else if (class(oldmodel) == "clm") {
LRstat <- abs(-2 * (newmodel$logLik - oldmodel$logLik))
}
PARAMETER <- g - 1
PVAL <- 1 - pchisq(LRstat, PARAMETER)
names(LRstat) <- "LR statistic"
names(PARAMETER) <- "df"
structure(list(statistic = LRstat, parameter = PARAMETER,
p.value = PVAL, method = METHOD, data.name = DNAME, newmoddata = oldmodel$model,
predictedprobs = yhat), class = "htest")
}
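library(reshape) # provides melt() and cast(), used inside lipsitz.test above
library(MASS) # provides polr()
library(foreign) # provides read.dta()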
dt <- read.dta("http://www.ats.ucla.edu/stat/data/hsbdemo.dta")
fm3 <- clm(ses ~ female + read + write, data = dt)
lipsitz.test(fm3)
fm4 <- polr(ses ~ female + read + write, data = dt)
lipsitz.test(fm4)
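The final step of the function above is a likelihood-ratio comparison between the original model and the model with the extra grouping factor, referred to a chi-squared distribution with g - 1 degrees of freedom. A minimal sketch of just that arithmetic, using made-up log-likelihoods and a made-up number of groups (these values do not come from any fitted model):

```r
# Hypothetical log-likelihoods, for illustration only
loglik_old <- -100   # original model
loglik_new <- -95    # model with the added grouping factor
g <- 6               # hypothetical number of groups

# Likelihood-ratio statistic, as in the clm branch of lipsitz.test
lr_stat <- abs(-2 * (loglik_new - loglik_old))

# Degrees of freedom and upper-tail chi-squared p-value
df <- g - 1
p_value <- pchisq(lr_stat, df, lower.tail = FALSE)

print(c(statistic = lr_stat, df = df, p.value = p_value))
```

If the refitted model fails to estimate some of the grouping coefficients, as described for the wine example, this comparison is not meaningful even when a number comes out.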

R2Winbugs - error in inits specification?

While attempting to adapt a working WinBUGS model and migrate it to R using R2WinBUGS, I got several error messages.
I believe they are related to the specification of the inits, but I have been unable to resolve the issue.
The first error message came from using inits1:
list(A=1, d=c(NA,0,0,0,0), mu=c(0,0,0,0))
Error in bugs(mydata, inits = inits1, model.file = "mtcfe.txt", parameters = c("or"), :
Number of initialized chains (length(inits)) != n.chains
After reading the fix suggested by Uwe Ligges ("List containing lists solution"), I modified inits1 to inits2:
inits2 <- list(list(A=1, d=c(NA,0,0,0,0), mu=c(0,0,0,0)))
And received the error:
Error in bugs.run(n.burnin, bugs.directory, WINE = WINE, useWINE = useWINE, : Look at the log file and try again with 'debug=TRUE' to figure out what went wrong within Bugs.
I have also attempted the fix suggested by AndyC in the post "getting-winbugs-leuk-example-to-work-from-r-using-r2winbugs", changing the inits to:
inits4 <- function(){list(list(A=1, d=c(NA,0,0,0,0), mu=c(0,0,0,0)))}
And received the error:
Error in bugs.run(n.burnin, bugs.directory, WINE = WINE, useWINE = useWINE, : Look at the log file and try again with 'debug=TRUE' to figure out what went wrong within Bugs.
This is my attempt with R2WinBUGS.
Note that I have included several inits in the code that did not work:
work.dir <- "removed from example"
setwd(work.dir)
getwd() # check working directory
# Load Package
library(R2WinBUGS)
# Read data
mydata <- list(nt=5,
ns=4,
r=structure(
.Data = c(2506,7834,6729,2139,
2548,7860,6710,4418),
.Dim = c(4,2)),
n=structure(
.Data = c(2697, 8212, 7266, 2333,
2701,8280,7257,4687),
.Dim= c(4,2)),
t=structure(
.Data = c( 1,1,1,1,
2,3,4,5),
.Dim = c(4,2)),
na=structure(
.Data = c(2,2,2,2))
)
bugs.data(mydata)
# Set initial values
inits <- function(){list(A=1, d=c(NA,0,0,0,0), mu=c(0,0,0,0))}
#inits2 <- list(A=1, d=c(NA,0,0,0,0), mu=c(0,0,0,0))
#inits3 <- function(){list(inits2,inits2)}
#inits4 <- function(){list(inits,inits)}
#inits5 <- list(inits,inits)
# CALL WinBUGS AND SAVE RESULTS IN VARIABLE out.re
out.re <- bugs(mydata,inits=inits1, # load data and initial values
model.file="mtcfe.txt", # file with model to run
parameters=c("or"),
n.thin=1,
n.chains=1, n.iter=1500, n.burnin=500,
bugs.directory=bd,
working.directory=work.dir,
debug=TRUE)
print(out.re,digits=4) #lists all results as in WinBUGS stats(*)
And this is the working WinBUGS code, adapted from a WinBUGS course held by the universities of Bristol and Leicester:
# Binomial likelihood, logit link, MTC
# Fixed effect model
model{ # *** PROGRAM STARTS
for(i in 1:ns){ # LOOP THROUGH STUDIES
mu[i] ~ dnorm(0,.0001) # vague priors for all trial baselines
for (k in 1:na[i]) { # LOOP THROUGH ARMS
r[i,k] ~ dbin(p[i,k],n[i,k]) # binomial likelihood
logit(p[i,k]) <- mu[i] + d[t[i,k]]-d[t[i,1]] # model for linear predictor
rhat[i,k] <- p[i,k] * n[i,k] # expected value of the numerators
dev[i,k] <- 2 * (r[i,k] * (log(r[i,k])-log(rhat[i,k])) #Deviance contribution
+ (n[i,k]-r[i,k]) * (log(n[i,k]-r[i,k]) - log(n[i,k]-rhat[i,k])))
}
resdev[i] <- sum(dev[i,1:na[i]]) # summed residual deviance contribution for this trial
}
totresdev <- sum(resdev[]) #Total Residual Deviance
d[1]<- 0 # treatment effect is zero for reference treatment
for (k in 2:nt) { d[k] ~ dnorm(0,.0001) } # vague priors for treatment effects
# pairwise ORs and LORs for all possible pair-wise comparisons
for (c in 1:(nt-1)) { for (k in (c+1):nt) {
or[c,k] <- exp(d[k] - d[c])
lor[c,k] <- (d[k]-d[c])
}
}
# ranking
for (k in 1:nt) {
rk[k] <- nt+1-rank(d[],k) # assumes events are “good”
# rk[k] <- rank(d[],k) # assumes events are “bad”
best[k] <- equals(rk[k],1) #calculate probability that treat k is best
}
# Absolute effects
A ~ dnorm(-2.6,precA)
precA <- pow(0.38,-2) # prior precision for Treatment A, sd=0.38 on logit scale
for (k in 1:nt) { logit(T[k]) <- A + d[k] }
} # *** PROGRAM ENDS
#Inits
list(A=1, d=c(NA,0,0,0,0), mu=c(0,0,0,0))
#Data
list(nt=5.00000E+00, ns=4.00000E+00, r= structure(.Data= c(2.50600E+03, 2.54800E+03, 7.83400E+03, 7.86000E+03, 6.72900E+03, 6.71000E+03, 2.13900E+03, 4.41800E+03), .Dim=c(4, 2)), n= structure(.Data= c(2.69700E+03, 2.70100E+03, 8.21200E+03, 8.28000E+03, 7.26600E+03, 7.25700E+03, 2.33300E+03, 4.68700E+03), .Dim=c(4, 2)), t= structure(.Data= c(1.00000E+00, 2.00000E+00, 1.00000E+00, 3.00000E+00, 1.00000E+00, 4.00000E+00, 1.00000E+00, 5.00000E+00), .Dim=c(4, 2)), na=c(2.00000E+00, 2.00000E+00, 2.00000E+00, 2.00000E+00))
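For reference, the R2WinBUGS documentation describes inits as either a list with n.chains elements, each itself a list of starting values, or a function that returns one such list of starting values for a single chain. The sketch below only builds the two shapes in plain R and does not call WinBUGS; the names one_chain, inits_list and inits_fun are illustrative, and the random starting values in the function are an assumption, not part of the original model.

```r
n_chains <- 2

# Starting values for a single chain (d[1] is NA because the model fixes it at 0)
one_chain <- list(A = 1, d = c(NA, 0, 0, 0, 0), mu = c(0, 0, 0, 0))

# Shape 1: a list with one element per chain, each element a list of values
inits_list <- rep(list(one_chain), n_chains)

# Shape 2: a function returning the starting values for ONE chain
# (a single list, not a list of lists)
inits_fun <- function() {
  list(A = 1, d = c(NA, rnorm(4)), mu = rnorm(4))
}

# The check behind "length(inits) != n.chains"
print(length(inits_list) == n_chains)
str(inits_fun())
```

Under this reading, a bare list of starting values has length 3 here (A, d, mu) rather than n.chains, and a function that wraps its result in an extra list() returns the wrong shape for a single chain. Whether either point explains the bugs.run failure above cannot be told from the messages shown; the WinBUGS log would be needed.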

error message JAGS subset out of range

I am attempting to call the following jags model in R:
model{
# Main model level 1
for (i in 1:N){
ficon[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha[country[i]]
}
# Priors level 1
tau ~ dgamma(.1,.1)
# Main model level 2
for (j in 1:J){
alpha[j] ~ dnorm(mu.alpha, tau.alpha)
}
# Priors level 2
mu.alpha ~ dnorm(0,.01)
tau.alpha ~ dgamma(.1,.1)
sigma.1 <- 1/(tau)
sigma.2 <- 1/(tau.alpha)
ICC <- sigma.2 / (sigma.1+sigma.2)
}
This is a hierarchical model, where ficon is a continuous variable 0-60, that may have a different mean or distribution by country. N = number of total observations (2244) and J = number of countries (34). When I run this model, I keep getting the following error message:
Compilation error on line 5.
Subset out of range: alpha[35]
This code worked earlier, but it's not working now. I assume the problem is that there are only 34 countries, and that's why it's getting stuck at i=35, but I'm not sure how to solve the problem. Any advice you have is welcome!
The R code that I use to call the model:
### input files JAGS ###
data <- list(ficon = X$ficon, country = X$country, J = 34, N = 2244)
inits1 <- list(alpha = rep(0, 34), mu.alpha = 0, tau = 1, tau.alpha = 1)
inits2 <- list(alpha = rep(1, 34), mu.alpha = 1, tau = .5, tau.alpha = .5)
inits <- list(inits1, inits2)
# call empty model
eqlsempty <- jags(data, inits, model.file = "eqls_emptymodel.R",
parameters = c("mu.alpha", "sigma.1", "sigma.2", "ICC"),
n.chains = 2, n.iter = itt, n.burnin = bi, n.thin = 10)
To solve the problem, you need to renumber your countries so that they take only the values 1 to 34. If you have only 34 countries and yet get the error message you quote, then one of the countries must be coded with the value 35. One way to fix this is to run the following R code before bundling the data:
X$country <- factor(X$country)
X$country <- droplevels(X$country)
X$country <- as.integer(X$country)
Hope this helps
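A tiny illustration of the renumbering, using a made-up vector of country codes with gaps (not the real data):

```r
# Made-up country codes with gaps: four distinct countries, largest code 35
country <- c(1, 2, 35, 35, 7)

# factor() orders the distinct codes; as.integer() maps them to 1..J
country_id <- as.integer(factor(country))
print(country_id)  # 1 2 4 4 3

# After renumbering, the largest index equals the number of countries
J <- length(unique(country_id))
print(max(country_id) == J)
```

With indices running from 1 to J, alpha[country[i]] can never ask for an element beyond alpha[J].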

Resources