Data file seemingly too large with wine and openBUGS - r

I have recently started trying to do some bayesian modelling with OpenBUGS and R using R2OpenBUGS.
I followed a great link for installing wine and OpenBUGS by David Eagles and the schools example posted there worked fine.
I then tried to run some code I had and kept getting an error after the the model is syntactically correct telling me the data was not loading properly.
After a few days of troubleshooting this it seems that when the datafile is above a certain length the top bit (where you traditionally highlight list in OpenBUGS to load data) is lost and the data cannot be loaded by OpenBUGS.
A screenshot of my OpenBUGS program with the loaded data where the list section has been cut off
This picture shows that at the top of my data.txt file there is no longer a list{ which is basically stopping all of my attempts to run the model.
To confirm this I have ran the example of a simple linear regression where I set the number of points to 1000 and then to 100:
# Load in paths for running OpenBUGS through wine
source("/Users/dp323/Desktop/R/scripts/Where_is_my_OpenBUGS.R")
# set up model
linemodel <- function() {
for (j in 1:N) {
Y[j] ~ dnorm(mu[j], tau) ## Response values Y are Normally distributed
mu[j] <- alpha + beta * (x[j] - xbar) ## linear model with x values centred
}
## Priors
alpha ~ dnorm(0, 0.001)
beta ~ dnorm(0, 0.001)
tau ~ dgamma(0.001, 0.001)
sigma <- 1/sqrt(tau)
}
# set up data ####
linedata <- list(Y = c(1:100), x = c(1:100), N = 100, xbar = 3)
# set initial values ####
lineinits <- function() {
list(alpha = 1, beta = 1, tau = 1)
}
# run bugs ####
lineout <- bugs(data = linedata, inits = lineinits, parameters.to.save = c("alpha", "beta", "sigma"), model.file = linemodel, n.chains = 1, n.iter = 10000, OpenBUGS.pgm = OpenBUGS.pgm, WINE = WINE, WINEPATH = WINEPATH, useWINE = T, debug = T)
The model runs great when linedata <- list(Y = c(1:100), x = c(1:100), N = 100, xbar = 3) but falls apart at the loading in data stage when linedata <- list(Y = c(1:1000), x = c(1:1000), N = 1000, xbar = 3).
I have exhausted a google search of limits of data file size on wine and OpenBUGS and not found anything helpful.
Anyone have any suggestions of what to try/where to start/experience of this before?

Related

Winbugs to Rjags beta binomial model translation

I am working through the textbook "Bayesian Ideas and Data Analysis" by Christensen et al.
There is a simple exercise in the book that involves cutting and pasting the following code to run in Winbugs:
model{ y ~ dbin(theta, n) # Model the data
ytilde ~ dbin(theta, m) # Prediction of future binomial
theta ~ dbeta(a, b) # The prior
prob <- step(ytilde - 20) # Pred prob that ytilde >= 20 }
list(n=100, m=100, y=10, a=1, b=1) # The data
list(theta=0.5, ytilde=10) # Starting/initial values
I am trying to translate the following into R2jags code and am running into some trouble. I thought I could fairly directly write my R2Jags code in this fashion:
model {
#Likelihoods
y ~ dbin(theta,n)
yt ~ dbin(theta,m)
#Priors
theta ~ dbeta(a,b)
prob <- step(yt - 20)
}
with the R code:
library(R2jags)
n <- 100
m <- 100
y <- 10
a <- 1
b <- 1
jags.data <- list(n = n,
m = m,
y = y,
a = a,
b = b)
jags.init <- list(
list(theta = 0.5, yt = 10), #Chain 1 init
list(theta = 0.5, yt = 10), #Chain 2 init
list(theta = 0.5, yt = 10) #Chain 3 init
)
jags.param <- c("theta", "yt")
jags.fit <- jags.model(data = jags.data,
inits = jags.inits,
parameters.to.save = jags.param,
model.file = "hw21.bug",
n.chains = 3,
n.iter = 5000,
n.burnin = 100)
print(jags.fit)
However, calling the R code brings about the following error:
Error in jags.model(data = jags.data, inits = jags.inits, parameters.to.save = jags.param, :
unused arguments (parameters.to.save = jags.param, model.file = "hw21.bug", n.iter = 5000, n.burnin = 100)
Is it because I am missing a necessary for loop in my R2Jags model code?
The error is coming from the R function jags.model (not from JAGS) - you are trying to use arguments parameters.to.save etc to the wrong function.
If you want to keep the model as similar to WinBUGS as possible, there is an easier way than specifying the data and initial values in R. Put the following into a text file called 'model.txt' in your working directory:
model{
y ~ dbin(theta, n) # Model the data
ytilde ~ dbin(theta, m) # Prediction of future binomial
theta ~ dbeta(a, b) # The prior
prob <- step(ytilde - 20) # Pred prob that ytilde >= 20
}
data{
list(n=100, m=100, y=10, a=1, b=1) # The data
}
inits{
list(theta=0.5, ytilde=10) # Starting/initial values
}
And then run this in R:
library('runjags')
results <- run.jags('model.txt', monitor='theta')
results
plot(results)
For more information on this method of translating WinBUGS models to JAGS see:
http://runjags.sourceforge.net/quickjags.html
Matt
This old blog post has an extensive example of converting BUGS to JAGS accessed via package rjags not R2jags. (I like the package runjags even better.) I know we're supposed to present self-contained answers here, not just links, but the post is rather long. It goes through each logical step of a script, including:
loading the package
specifying the model
assembling the data
initializing the chains
running the chains
examining the results

Intro to JAGS analysis

I am a student studying bayesian statistics and have just begun to use JAGS using a intro script written by my lecturer, with us (the students) having to only enter the data and the number of iterations. The following is the script with my data added into it:
setwd("C:\\Users\\JohnSmith\\Downloads")
rawdata = read.table("bwt.txt",header=TRUE)
Birthweight = rawdata$Birthweight
Age = rawdata$Age
model = "model
{
beta0 ~ dnorm(0, 1/1000^2)
beta1 ~ dnorm(0, 1/1000^2)
log_sigma ~ dunif(-10, 10)
sigma <- exp(log_sigma)
for(i in 1:N)
{
mu[i] <- beta0 + beta1 * Age[i]
Birthweight[i] ~ dnorm(mu[i], 1/sigma^2)
}
}
"
data = list(x=Birthweight, y=Age, N=24)
# Variables to monitor
variable_names = c('beta0','beta1')
# How many burn-in steps?
burn_in = 1000
# How many proper steps?
steps = 100000
# Thinning?
thin = 10
# Random number seed
seed = 2693795
# NO NEED TO EDIT PAST HERE!!!
# Just run it all and use the results list.
library('rjags')
# Write model out to file
fileConn=file("model.temp")
writeLines(model, fileConn)
close(fileConn)
if(all(is.na(data)))
{
m = jags.model(file="model.temp", inits=list(.RNG.seed=seed, .RNG.name="base::Mersenne-Twister"))
} else
{
m = jags.model(file="model.temp", data=data, inits=list(.RNG.seed=seed, .RNG.name="base::Mersenne-Twister"))
}
update(m, burn_in)
draw = jags.samples(m, steps, thin=thin, variable.names = variable_names)
# Convert to a list
make_list <- function(draw)
{
results = list()
for(name in names(draw))
{
# Extract "chain 1"
results[[name]] = as.array(draw[[name]][,,1])
# Transpose 2D arrays
if(length(dim(results[[name]])) == 2)
results[[name]] = t(results[[name]])
}
return(results)
}
results = make_list(draw)
However, when I run the following code I get the following error message:
Error in jags.model(file = "model.temp", data = data, inits = list(.RNG.seed = seed, :
RUNTIME ERROR:
Compilation error on line 11.
Unknown parameter Age
In addition: Warning messages:
1: In jags.model(file = "model.temp", data = data, inits = list(.RNG.seed = seed, :
Unused variable "x" in data
2: In jags.model(file = "model.temp", data = data, inits = list(.RNG.seed = seed, :
Unused variable "y" in data
But as far as I can see, line 11 is blank, which leaves me stumped as to where the error is coming from. If anyone can give me some tips as to solve this, it will be greatly appreciated.
The names of the elements of your list of data (data) should match the names of the variables in your model.
You have:
data = list(x=Birthweight, y=Age, N=24)
so JAGS is looking for variables called x and y in your model. However, in your model, you have:
mu[i] <- beta0 + beta1 * Age[i]
Birthweight[i] ~ dnorm(mu[i], 1/sigma^2)
That is, your variables are called Age and Birthweight.
So, either change your list to:
data <- list(Birthweight=Birthweight, Age=Age, N=24)
or change your model to:
mu[i] <- beta0 + beta1 * y[i]
x[i] ~ dnorm(mu[i], 1/sigma^2)
Had you done readLines('model.temp') (or opened model.temp in a text editor), you would have seen that line 11 of that file refers to the line that contains mu[i] <- beta0 + beta1 * Age[i], which is the first error that JAGS encountered due to the reference to Age, for which neither data nor a prior was provided.

How can I save a JAGS model object in R?

I am using the package rjags to do MCMC in R and I would like to save the output of the function jags.model for later use in another R session.
Here is a simple example for the mean of a Normal distribution:
library(rjags)
N <- 1000
x <- rnorm(N, 0, 5)
model.str <- 'model {for (i in 1:N) {
x[i] ~ dnorm(mu, 5)}
mu ~ dnorm(0, .0001)}'
jags <- jags.model(textConnection(model.str), data = list(x = x, N = N))
update(jags, 1000)
I can generate samples of mu like this:
coda.samples(model=jags,n.iter=1,variable.names="mu")
# [[1]]
# Markov Chain Monte Carlo (MCMC) output:
# Start = 2001
# End = 2001
# Thinning interval = 1
# mu
# [1,] 0.2312028
#
# attr(,"class")
# [1] "mcmc.list"
Now I would like to save the model object jags for later use in a new R session, so that I don't have to initialize and burn in the Markov Chain again:
save(file="/tmp/jags.Rdata", list="jags")
quit()
However, after starting a new R session and reloading the model I get an error message that the JAGS model must be recompiled:
load("/tmp/jags.Rdata")
coda.samples(model=jags,n.iter=1,variable.names="mu")
# Error in model$iter() : JAGS model must be recompiled
Why is that? How can I save the object jags in R for later use?
Note: The question has been asked before, but the OP was not very specific about the problem.
Maybe I am totally off track regarding what you really want to do, but I would set up a jags model like this, using R2jags instead of rjags (just something like a different wrapper):
library(R2jags)
N <- 1000
x <- rnorm(N, 0, 5)
sink("test.txt")
cat("
model{
for (i in 1:N) {
x[i] ~ dnorm(mu, 5)
}
mu ~ dnorm(0, .0001)
}
",fill = TRUE)
sink()
inits <- function() {
list(
mu = dnorm(1, 0, 0.01))
}
params <- c("mu")
chains <- 3
iter <- 1000
jags1 <- jags(model.file = "test.txt", data = list(x = x, N = N),
parameters.to.save = params, inits = inits,
n.chains = chains, n.iter = iter, n.burnin=floor(iter/2),
n.thin = ifelse(floor(iter/100) < 1, 1, floor(iter/100)))
jags2 <- update(jags1, 10000)
jags2
plot(jags2)
traceplot(jags2)
jags2.mcmc <- as.mcmc(jags2)
There is no difference in the results and I like this procedure because it's much more the way I used winbugs, so...
The last line of code converts the jags2-object to an mcmc-list which can be treated by package coda.
Good luck!
P.S. Here's a second answer:
After looking again on your code, the only thing after loading the jags-object that is missing to get the behavior you want, is:
jags$recompile()
coda.samples(model=jags,n.iter=1,variable.names="mu")
But if you really just want to use the already obtained posterior samples or maybe just want to update the chains for more iterations, you can also use the R2jags-procedure.

error message JAGS subset out of range

I am attempting to call the following jags model in R:
model{
# Main model level 1
for (i in 1:N){
ficon[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha[country[i]]
}
# Priors level 1
tau ~ dgamma(.1,.1)
# Main model level 2
for (j in 1:J){
alpha[j] ~ dnorm(mu.alpha, tau.alpha)
}
# Priors level 2
mu.alpha ~ dnorm(0,.01)
tau.alpha ~ dgamma(.1,.1)
sigma.1 <- 1/(tau)
sigma.2 <- 1/(tau.alpha)
ICC <- sigma.2 / (sigma.1+sigma.2)
}
This is a hierarchical model, where ficon is a continuous variable 0-60, that may have a different mean or distribution by country. N = number of total observations (2244) and J = number of countries (34). When I run this model, I keep getting the following error message:
Compilation error on line 5.
Subset out of range: alpha[35]
This code worked earlier, but it's not working now. I assume the problem is that there are only 34 countries, and that's why it's getting stuck at i=35, but I'm not sure how to solve the problem. Any advice you have is welcome!
The R code that I use to call the model:
### input files JAGS ###
data <- list(ficon = X$ficon, country = X$country, J = 34, N = 2244)
inits1 <- list(alpha = rep(0, 34), mu.alpha = 0, tau = 1, tau.alpha = 1)
inits2 <- list(alpha = rep(1, 34), mu.alpha = 1, tau = .5, tau.alpha = .5)
inits <- list(inits1, inits2)
# call empty model
eqlsempty <- jags(data, inits, model.file = "eqls_emptymodel.R",
parameters = c("mu.alpha", "sigma.1", "sigma.2", "ICC"),
n.chains = 2, n.iter = itt, n.burnin = bi, n.thin = 10)
To solve the problem you need to renumber your countries so they only have the values 1 to 34. If you only have 34 countries and yet you are getting the error message you state then one of the countries must have the value 35. To solve this one could call the following R code before bundling the data:
x$country <- factor(x$country)
x$country <- droplevels(x$country)
x$country <- as.integer(x$country)
Hope this helps

OpenBUGS error undefined variable

I'm working on a binomial mixture model using OpenBUGS and R package R2OpenBUGS. I've successfully built simpler models, but once I add another level for imperfect detection, I consistently receive the error variable X is not defined in model or in data set. I've tried a number of different things, including changing the structure of my data and entering my data directly into OpenBUGS. I'm posting this in the hope that someone else has experience with this error, and perhaps knows why OpenBUGS is not recognizing variable X even though it is clearly defined as far as I can tell.
I've also gotten the error expected the collection operator c error pos 8 - this is not an error I've been getting previously, but I am similarly stumped.
Both the model and the data-simulation function come from Kery's Introduction to WinBUGS for Ecologists (2010). I will note that the data set here is in lieu of my own data, which is similar.
I am including the function to build the dataset as well as the model. Apologies for the length.
# Simulate data: 200 sites, 3 sampling rounds, 3 factors of the level 'trt',
# and continuous covariate 'X'
data.fn <- function(nsite = 180, nrep = 3, xmin = -1, xmax = 1, alpha.vec = c(0.01,0.2,0.4,1.1,0.01,0.2), beta0 = 1, beta1 = -1, ntrt = 3){
y <- array(dim = c(nsite, nrep)) # Array for counts
X <- sort(runif(n = nsite, min = xmin, max = xmax)) # covariate values, sorted
# Relationship expected abundance - covariate
x2 <- rep(1:ntrt, rep(60, ntrt)) # Indicator for population
trt <- factor(x2, labels = c("CT", "CM", "CC"))
Xmat <- model.matrix(~ trt*X)
lin.pred <- Xmat[,] %*% alpha.vec # Value of lin.predictor
lam <- exp(lin.pred)
# Add Poisson noise: draw N from Poisson(lambda)
N <- rpois(n = nsite, lambda = lam)
table(N) # Distribution of abundances across sites
sum(N > 0) / nsite # Empirical occupancy
totalN <- sum(N) ; totalN
# Observation process
# Relationship detection prob - covariate
p <- plogis(beta0 + beta1 * X)
# Make a 'census' (i.e., go out and count things)
for (i in 1:nrep){
y[,i] <- rbinom(n = nsite, size = N, prob = p)
}
# Return stuff
return(list(nsite = nsite, nrep = nrep, ntrt = ntrt, X = X, alpha.vec = alpha.vec, beta0 = beta0, beta1 = beta1, lam = lam, N = N, totalN = totalN, p = p, y = y, trt = trt))
}
data <- data.fn()
And here is the model:
sink("nmix1.txt")
cat("
model {
# Priors
for (i in 1:3){ # 3 treatment levels (factor)
alpha0[i] ~ dnorm(0, 0.01)
alpha1[i] ~ dnorm(0, 0.01)
}
beta0 ~ dnorm(0, 0.01)
beta1 ~ dnorm(0, 0.01)
# Likelihood
for (i in 1:180) { # 180 sites
C[i] ~ dpois(lambda[i])
log(lambda[i]) <- log.lambda[i]
log.lambda[i] <- alpha0[trt[i]] + alpha1[trt[i]]*X[i]
for (j in 1:3){ # each site sampled 3 times
y[i,j] ~ dbin(p[i,j], C[i])
lp[i,j] <- beta0 + beta1*X[i]
p[i,j] <- exp(lp[i,j])/(1+exp(lp[i,j]))
}
}
# Derived quantities
}
",fill=TRUE)
sink()
# Bundle data
trt <- data$trt
y <- data$y
X <- data$X
ntrt <- 3
# Standardise covariates
s.X <- (X - mean(X))/sd(X)
win.data <- list(C = y, trt = as.numeric(trt), X = s.X)
# Inits function
inits <- function(){ list(alpha0 = rnorm(ntrt, 0, 2),
alpha1 = rnorm(ntrt, 0, 2),
beta0 = rnorm(1,0,2), beta1 = rnorm(1,0,2))}
# Parameters to estimate
parameters <- c("alpha0", "alpha1", "beta0", "beta1")
# MCMC settings
ni <- 1200
nb <- 200
nt <- 2
nc <- 3
# Start Markov chains
out <- bugs(data = win.data, inits, parameters, "nmix1.txt", n.thin=nt,
n.chains=nc, n.burnin=nb, n.iter=ni, debug = TRUE)
Note: This answer has gone through a major revision, after I noticed another problem with the code.
If I understand your model correctly, you are mixing up the y and N from the simulated data, and what is passed as C to Bugs. You are passing the y variable (a matrix) to the C variable in the Bugs model, but this is accessed as a vector. From what I can see C is representing the number of "trials" in your binomial draw (actual abundances), i.e. N in your data set. The variable y (a matrix) is called the same thing in both the simulated data and in the Bugs model.
This is a reformulation of your model, as I understand it, and this runs ok:
sink("nmix1.txt")
cat("
model {
# Priors
for (i in 1:3){ # 3 treatment levels (factor)
alpha0[i] ~ dnorm(0, 0.01)
alpha1[i] ~ dnorm(0, 0.01)
}
beta0 ~ dnorm(0, 0.01)
beta1 ~ dnorm(0, 0.01)
# Likelihood
for (i in 1:180) { # 180 sites
C[i] ~ dpois(lambda[i])
log(lambda[i]) <- log.lambda[i]
log.lambda[i] <- alpha0[trt[i]] + alpha1[trt[i]]*X[i]
for (j in 1:3){ # each site sampled 3 times
y[i,j] ~ dbin(p[i,j], C[i])
lp[i,j] <- beta0 + beta1*X[i]
p[i,j] <- exp(lp[i,j])/(1+exp(lp[i,j]))
}
}
# Derived quantities
}
",fill=TRUE)
sink()
# Bundle data
trt <- data$trt
y <- data$y
X <- data$X
N<- data$N
ntrt <- 3
# Standardise covariates
s.X <- (X - mean(X))/sd(X)
win.data <- list(y = y, trt = as.numeric(trt), X = s.X, C= N)
# Inits function
inits <- function(){ list(alpha0 = rnorm(ntrt, 0, 2),
alpha1 = rnorm(ntrt, 0, 2),
beta0 = rnorm(1,0,2), beta1 = rnorm(1,0,2))}
# Parameters to estimate
parameters <- c("alpha0", "alpha1", "beta0", "beta1")
# MCMC settings
ni <- 1200
nb <- 200
nt <- 2
nc <- 3
# Start Markov chains
out <- bugs(data = win.data, inits, parameters, "nmix1.txt", n.thin=nt,
n.chains=nc, n.burnin=nb, n.iter=ni, debug = TRUE)
Overall, the results from this model looks ok, but there are long autocorrelation lags for beta0 and beta1. The estimate of beta1 also seems a bit off(~= -0.4), so you might want to recheck the Bugs model specification, so that it is matching the simulation model (i.e. that you are fitting the correct statistical model). At the moment, I'm not sure that it does, but I don't have the time to check further right now.
I got the same message trying to pass a factor to OpenBUGS. Like so,
Ndata <- list(yrs=N$yrs, site=N$site), ... )
The variable "site" was not passed by the "bugs" function. It simply was not in list passed
to OpenBUGS
I solved the problem by passing site as numeric,
Ndata <- list(yrs=N$yrs, site=as.numeric(N$site)), ... )

Resources