I have fitted following simple linear regression Bayesian model using rjags.
I was able to run the model by specifying all the predictors separately(like for a lm object). Now I want to learn how to specify the predictors by introducing them as a matrix instead of specifying them separately.
So I ran the following code, but it gave some errors.
I used tobbaco data set in rrr package to provide a reproducible example.
library(rrr)
require(dplyr)
library(rjags)
tobacco <- as_data_frame(tobacco)
N1 = length(tobacco$Y1.BurnRate)
x1 = model.matrix(Y1.BurnRate~X2.PercentChlorine+X3.PercentPotassium ,data = tobacco)
bayes_model_mul1=
"model {
for(i in 1:N1){
Y1.BurnRate[i]~dnorm(mu1[i],tau1)
for(j in 1:3){
mu1[i]=beta1[j]*x1[i,j]
}
}
for (l in 1:3) { beta1[l] ~dnorm(0, 0.001) }
tau1 ~ dgamma(.01,.01)
sigma_tau1 = 1/tau1
}"
model3 <- jags.model(textConnection(bayes_model_mul1),
data = list(Y1.BurnRate=tobacco$Y1.BurnRate, x1=x1, N1=N1),
n.chains=1)
After I run model3 , I got following error.
Error in jags.model(textConnection(bayes_model_mul1), data = list(Y1.BurnRate = tobacco$Y1.BurnRate, :
RUNTIME ERROR:
Compilation error on line 6.
Attempt to redefine node mu1[1]
Can anyone help me figure this out ?
Does this due to introducing predictors as a matrix ?
There are a few ways to do this, here are two:
Use matrix multiplication outside of the likelihood loop
m1 =
"model {
mu1 = x1 %*% beta1 # ---> this
for(i in 1:N1){
Y1.BurnRate[i] ~ dnorm(mu1[i], tau1)
}
for (l in 1:3) { beta1[l] ~ dnorm(0, 0.001) }
tau1 ~ dgamma(.01,.01)
sigma_tau1 = 1/tau1
}"
Use inprod to multiply the parameters with the design matrix
m2 =
"model {
for(i in 1:N1){
mu1[i] = inprod(beta1, x1[i,]) #----> this
Y1.BurnRate[i] ~ dnorm(mu1[i], tau1)
}
for (l in 1:3) { beta1[l] ~ dnorm(0, 0.001) }
tau1 ~ dgamma(.01,.01)
sigma_tau1 = 1/tau1
}"
You received an error with for(j in 1:3){ mu1[i] = beta1[j]* x1[i,j] } as every time you loop though the parameter index j you overwrite mu1[i]. It also doesn't sum up the individual terms. You may be able to index mu1 with j as well and then sum but untested ...
I am puzzled by a simple question in R JAGS. I have for example, 10 parameters: d[1], d[2], ..., d[10]. It is intuitive from the data that they should be increasing. So I want to put a constraint on them.
Here is what I tried to do but it give error messages saying "Node inconsistent with parents":
model{
...
for (j in 1:10){
d.star[j]~dnorm(0,0.0001)
}
d=sort(d.star)
}
Then I tried this:
d[1]~dnorm(0,0.0001)
for (j in 2:10){
d[j]~dnorm(0,0.0001)I(d[j-1],)
}
This worked, but I don't know if this is the correct way to do it. Could you share your thoughts?
Thanks!
If you are ever uncertain about something like this, it is best to just simulate some data to determine if the model structure you suggest works (spoiler alert: it does).
Here is the model that I used:
cat('model{
d[1] ~ dnorm(0, 0.0001) # intercept
d[2] ~ dnorm(0, 0.0001)
for(j in 3:11){
d[j] ~ dnorm(0, 0.0001) I(d[j-1],)
}
for(i in 1:200){
y[i] ~ dnorm(mu[i], tau)
mu[i] <- inprod(d, x[i,])
}
tau ~ dgamma(0.01,0.01)
}',
file = "model_example.R")```
And here are the data I simulated to use with this model.
library(run.jags)
library(mcmcplots)
# intercept with sorted betas
set.seed(161)
betas <- c(1,sort(runif(10, -5,5)))
# make covariates, 1 for intercept
x <- cbind(1,matrix(rnorm(2000), nrow = 200, ncol = 10))
# deterministic part of model
y_det <- x %*% betas
# add noise
y <- rnorm(length(y_det), y_det, 1)
data_list <- list(y = as.numeric(y), x = x)
# fit the model
mout <- run.jags('model_example.R',monitor = c("d", "tau"), data = data_list)
Following this, we can plot out the estimates and overlay the true parameter values
caterplot(mout, "d", reorder = FALSE)
points(rev(c(1:11)) ~ betas, pch = 18,cex = 0.9)
The black points are the true parameter values, the blue points and lines are the estimates. Looks like this set up does fine so long as there are enough data to estimate all of those parameters.
It looks like there is an syntax error in the first implementation. Just try:
model{
...
for (j in 1:10){
d.star[j]~dnorm(0,0.0001)
}
d[1:10] <- sort(d.star) # notice d is indexed.
}
and compare the results with those of the second implementation. According to the documentation, these are both correct, but it is advised to use the function sort.
I am now tring to test the goodness of fit of an ordianl model using lipsitz.test {generalhoslem}. According to the document, the function can deal with both polr and clm. However, when I try to use clm in the lipsitz.testfunction, an error occurs. Here is an example
library("ordinal")
library(generalhoslem)
data("wine")
fm1 <- clm(rating ~ temp * contact, data = wine)
lipsitz.test(fm1)
Error in names(LRstat) <- "LR statistic" :
'names' attribute [1] must be the same length as the vector [0]
In addition: Warning message:
In lipsitz.test(fm1) :
n/5c < 6. Running this test when n/5c < 6 is not recommended.
Is there any solution to fix this? Thanks a lot.
I'm not sure if this is off-topic and should be on CrossValidated. It's partly a problem with the coding of the test and partly about the statistics of the test itself.
There are two problems. I've just spotted a bug in the code when using clm and will push a fix to CRAN (corrected code below).
There does however appear to be a more fundamental problem with the example data. Basically, the Lipsitz test requires fitting a new model with dummy variables of the groupings. When fitting the new model with this example, the model fails and so some of the coefficients are not calculated. If using polr, the new model gets the warning that it is rank-deficient; if using clm, the new model gets a message that two coefficients are not fitted due to singularities. I think this example data set is just unsuitable for this kind of analysis.
The corrected code is below and I have used a larger example dataset on which the test runs.
lipsitz.test <- function (model, g = NULL) {
oldmodel <- model
if (class(oldmodel) == "polr") {
yhat <- as.data.frame(fitted(oldmodel))
} else if (class(oldmodel) == "clm") {
predprob <- oldmodel$model[, 2:ncol(oldmodel$model)]
yhat <- predict(oldmodel, newdata = predprob, type = "prob")$fit
} else warning("Model is not of class polr or clm. Test may fail.")
formula <- formula(oldmodel$terms)
DNAME <- paste("formula: ", deparse(formula))
METHOD <- "Lipsitz goodness of fit test for ordinal response models"
obs <- oldmodel$model[1]
if (is.null(g)) {
g <- round(nrow(obs)/(5 * ncol(yhat)))
if (g < 6)
warning("n/5c < 6. Running this test when n/5c < 6 is not recommended.")
}
qq <- unique(quantile(1 - yhat[, 1], probs = seq(0, 1, 1/g)))
cutyhats <- cut(1 - yhat[, 1], breaks = qq, include.lowest = TRUE)
dfobs <- data.frame(obs, cutyhats)
dfobsmelt <- melt(dfobs, id.vars = 2)
observed <- cast(dfobsmelt, cutyhats ~ value, length)
if (g != nrow(observed)) {
warning(paste("Not possible to compute", g, "rows. There might be too few observations."))
}
oldmodel$model <- cbind(oldmodel$model, cutyhats = dfobs$cutyhats)
oldmodel$model$grp <- as.factor(vapply(oldmodel$model$cutyhats,
function(x) which(observed[, 1] == x), 1))
newmodel <- update(oldmodel, . ~ . + grp, data = oldmodel$model)
if (class(oldmodel) == "polr") {
LRstat <- oldmodel$deviance - newmodel$deviance
} else if (class(oldmodel) == "clm") {
LRstat <- abs(-2 * (newmodel$logLik - oldmodel$logLik))
}
PARAMETER <- g - 1
PVAL <- 1 - pchisq(LRstat, PARAMETER)
names(LRstat) <- "LR statistic"
names(PARAMETER) <- "df"
structure(list(statistic = LRstat, parameter = PARAMETER,
p.value = PVAL, method = METHOD, data.name = DNAME, newmoddata = oldmodel$model,
predictedprobs = yhat), class = "htest")
}
library(foreign)
dt <- read.dta("http://www.ats.ucla.edu/stat/data/hsbdemo.dta")
fm3 <- clm(ses ~ female + read + write, data = dt)
lipsitz.test(fm3)
fm4 <- polr(ses ~ female + read + write, data = dt)
lipsitz.test(fm4)
While attempting to adapt a working WinBUGS model and mitigate it to R using R2WinBUGS, I got several error messages.
I believe is related to the specification of the inits, but have been unable resolve the issue.
The first was error message was using inits1:
list(A=1, d=c(NA,0,0,0,0), mu=c(0,0,0,0))
Error in bugs(mydata, inits = inits1, model.file = "mtcfe.txt", parameters = c("or"), :
Number of initialized chains (length(inits)) != n.chains
After reading the suggested fix by Uwe Liggers "List containing lists solution" I modified the the inits1 to inits2:
inits2 <- list(list(A=1, d=c(NA,0,0,0,0), mu=c(0,0,0,0)))
And received the error:
Error in bugs.run(n.burnin, bugs.directory, WINE = WINE, useWINE = useWINE, : Look at the log file and try again with 'debug=TRUE' to figure out what went wrong within Bugs.
I have also attempted the fix suggested by AndyC at this post "getting-winbugs-leuk-example-to-work-from-r-using-r2winbugs". By changing the inits to:
inits4 <- function(){list(list(A=1, d=c(NA,0,0,0,0), mu=c(0,0,0,0)))}
And received the error:
Error in bugs.run(n.burnin, bugs.directory, WINE = WINE, useWINE = useWINE, : Look at the log file and try again with 'debug=TRUE' to figure out what went wrong within Bugs.
This is my attempt on R2WinBUGS.
Note, I have included several inits in the code that did not work:
work.dir <- "removed from example"
setwd(work.dir)
getwd() # check working directory
# Load Package
library(R2WinBUGS)
# Read data
mydata <- list(nt=5,
ns=4,
r=structure(
.Data = c(2506,7834,6729,2139,
2548,7860,6710,4418),
.Dim = c(4,2)),
n=structure(
.Data = c(2697, 8212, 7266, 2333,
2701,8280,7257,4687),
.Dim= c(4,2)),
t=structure(
.Data = c( 1,1,1,1,
2,3,4,5),
.Dim = c(4,2)),
na=structure(
.Data = c(2,2,2,2))
)
bugs.data(mydata)
# Set initial values
inits <- function(){list(A=1, d=c(NA,0,0,0,0), mu=c(0,0,0,0))}
#inits2 <- list(A=1, d=c(NA,0,0,0,0), mu=c(0,0,0,0))
#inits3 <- function(){list(inits2,inits2)}
#inits4 <- function(){list(inits,inits)}
#inits5 <- list(inits,inits)
# CALL WinBUGS AND SAVE RESULTS IN VARIABLE out.re
out.re <- bugs(mydata,inits=inits1, # load data and initial values
model.file="mtcfe.txt", # file with model to run
parameters=c("or"),
n.thin=1,
n.chains=1, n.iter=1500, n.burnin=500,
bugs.directory=bd,
working.directory=work.dir,
debug=TRUE)
print(out.re,digits=4) #lists all results as in WinBUGS stats(*)
And this is the working WinBUGS code, adapted from a WinBUGS course held by the universities of Bristol and Leicester:
# Binomial likelihood, logit link, MTC
# Fixed effect model
model{ # *** PROGRAM STARTS
for(i in 1:ns){ # LOOP THROUGH STUDIES
mu[i] ~ dnorm(0,.0001) # vague priors for all trial baselines
for (k in 1:na[i]) { # LOOP THROUGH ARMS
r[i,k] ~ dbin(p[i,k],n[i,k]) # binomial likelihood
logit(p[i,k]) <- mu[i] + d[t[i,k]]-d[t[i,1]] # model for linear predictor
rhat[i,k] <- p[i,k] * n[i,k] # expected value of the numerators
dev[i,k] <- 2 * (r[i,k] * (log(r[i,k])-log(rhat[i,k])) #Deviance contribution
+ (n[i,k]-r[i,k]) * (log(n[i,k]-r[i,k]) - log(n[i,k]-rhat[i,k])))
}
resdev[i] <- sum(dev[i,1:na[i]]) # summed residual deviance contribution for this trial
}
totresdev <- sum(resdev[]) #Total Residual Deviance
d[1]<- 0 # treatment effect is zero for reference treatment
for (k in 2:nt) { d[k] ~ dnorm(0,.0001) } # vague priors for treatment effects
# pairwise ORs and LORs for all possible pair-wise comparisons
for (c in 1:(nt-1)) { for (k in (c+1):nt) {
or[c,k] <- exp(d[k] - d[c])
lor[c,k] <- (d[k]-d[c])
}
}
# ranking
for (k in 1:nt) {
rk[k] <- nt+1-rank(d[],k) # assumes events are “good”
# rk[k] <- rank(d[],k) # assumes events are “bad”
best[k] <- equals(rk[k],1) #calculate probability that treat k is best
}
# Absolute effects
A ~ dnorm(-2.6,precA)
precA <- pow(0.38,-2) # prior precision for Treatment A, sd=0.38 on logit scale
for (k in 1:nt) { logit(T[k]) <- A + d[k] }
} # *** PROGRAM ENDS
#Inits
list(A=1, d=c(NA,0,0,0,0), mu=c(0,0,0,0))
#Data
list(nt=5.00000E+00, ns=4.00000E+00, r= structure(.Data= c(2.50600E+03, 2.54800E+03, 7.83400E+03, 7.86000E+03, 6.72900E+03, 6.71000E+03, 2.13900E+03, 4.41800E+03), .Dim=c(4, 2)), n= structure(.Data= c(2.69700E+03, 2.70100E+03, 8.21200E+03, 8.28000E+03, 7.26600E+03, 7.25700E+03, 2.33300E+03, 4.68700E+03), .Dim=c(4, 2)), t= structure(.Data= c(1.00000E+00, 2.00000E+00, 1.00000E+00, 3.00000E+00, 1.00000E+00, 4.00000E+00, 1.00000E+00, 5.00000E+00), .Dim=c(4, 2)), na=c(2.00000E+00, 2.00000E+00, 2.00000E+00, 2.00000E+00))
I am attempting to call the following jags model in R:
model{
# Main model level 1
for (i in 1:N){
ficon[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha[country[i]]
}
# Priors level 1
tau ~ dgamma(.1,.1)
# Main model level 2
for (j in 1:J){
alpha[j] ~ dnorm(mu.alpha, tau.alpha)
}
# Priors level 2
mu.alpha ~ dnorm(0,.01)
tau.alpha ~ dgamma(.1,.1)
sigma.1 <- 1/(tau)
sigma.2 <- 1/(tau.alpha)
ICC <- sigma.2 / (sigma.1+sigma.2)
}
This is a hierarchical model, where ficon is a continuous variable 0-60, that may have a different mean or distribution by country. N = number of total observations (2244) and J = number of countries (34). When I run this model, I keep getting the following error message:
Compilation error on line 5.
Subset out of range: alpha[35]
This code worked earlier, but it's not working now. I assume the problem is that there are only 34 countries, and that's why it's getting stuck at i=35, but I'm not sure how to solve the problem. Any advice you have is welcome!
The R code that I use to call the model:
### input files JAGS ###
data <- list(ficon = X$ficon, country = X$country, J = 34, N = 2244)
inits1 <- list(alpha = rep(0, 34), mu.alpha = 0, tau = 1, tau.alpha = 1)
inits2 <- list(alpha = rep(1, 34), mu.alpha = 1, tau = .5, tau.alpha = .5)
inits <- list(inits1, inits2)
# call empty model
eqlsempty <- jags(data, inits, model.file = "eqls_emptymodel.R",
parameters = c("mu.alpha", "sigma.1", "sigma.2", "ICC"),
n.chains = 2, n.iter = itt, n.burnin = bi, n.thin = 10)
To solve the problem you need to renumber your countries so they only have the values 1 to 34. If you only have 34 countries and yet you are getting the error message you state then one of the countries must have the value 35. To solve this one could call the following R code before bundling the data:
x$country <- factor(x$country)
x$country <- droplevels(x$country)
x$country <- as.integer(x$country)
Hope this helps