"non-conforming parameters in function :" in simple linear regression using JAGS - r

I am super new to JAGS and Bayesian statistics, and have simply been trying to follow Chapter 22 on Bayesian statistics in Crawley's The R Book (2nd Edition). I copied the code down exactly as it appears in the book for the simple linear model growth = a + b*tannin, where there are 9 rows of two continuous variables: growth and tannin. The data and packages are this:
install.packages("R2jags")
library(R2jags)
growth <- c(12,10,8,11,6,7,2,3,3)
tannin <- c(0,1,2,3,4,5,6,7,8)
N <- c(1,2,3,4,5,6,7,8,9)
bay.df <- data.frame(growth,tannin,N)
The ASCII file looks like this:
model{
for(i in 1:N) {
growth[i] ~ dnorm(mu[i],tau)
mu[i] <- a+b*tannin[i]
}
a ~ dnorm(0.0, 1.0E-4)
b ~ dnorm(0.0, 1.0E-4)
sigma <- 1.0/sqrt(tau)
tau ~ dgamma(1.0E-3, 1.0E-3)
}
But then, when I use this code:
> practicemodel <- jags(data=data.jags,parameters.to.save = c("a","b","tau"),
+ n.iter=100000, model.file="regression.bugs.txt", n.chains=3)
I get an error message that says:
module glm loaded
Compiling model graph
Resolving undeclared variables
Deleting model
Error in jags.model(model.file, data = data, inits = init.values, n.chains = n.chains, :
RUNTIME ERROR:
Non-conforming parameters in function :

The problem has been solved!
Basically the change is from N <- c(1,2,...) to N <- 9, but there is one other solution as well, where no N is defined beforehand: you can specify N inside the data.jags list as the number of rows in the data frame, i.e. data.jags <- list(growth=bay.df$growth, tannin=bay.df$tannin, N=nrow(bay.df)).
Here is the new code:
# Make the data frame
growth <- c(12,10,8,11,6,7,2,3,3)
tannin <- c(0,1,2,3,4,5,6,7,8)
# CHANGED : This is for the JAGS code to know there are 9 rows of data
N <- 9
bay.df <- data.frame(growth,tannin)
library(R2jags)
# Now, write the Bugs model and save it in a text file
sink("regression.bugs.txt") #tell R to put the following into this file
cat("
model{
for(i in 1:N) {
growth[i] ~ dnorm(mu[i],tau)
mu[i] <- a+b*tannin[i]
}
a ~ dnorm(0.0, 1.0E-4)
b ~ dnorm(0.0, 1.0E-4)
sigma <- 1.0/sqrt(tau)
tau ~ dgamma(1.0E-3, 1.0E-3)
}
", fill=TRUE)
sink() #tells R to stop putting things into this file.
#tell jags the names of the variables containing the data
data.jags <- list("growth","tannin","N")
# run the jags() function to fit the model:
practicemodel <- jags(data=data.jags,parameters.to.save = c("a","b","tau"),
n.iter=100000, model.file="regression.bugs.txt", n.chains=3)
# inspect the model output. Important to note that the output will
# be different every time because there's a stochastic element to the model
practicemodel
# plots the information nicely, can visualize the error
# margin for each parameter and deviance
plot(practicemodel)
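If you also want a posterior summary for sigma, or trace plots to check that the three chains have mixed, R2jags makes both easy. A sketch (sigma is already defined as a deterministic node in the model, so it only needs to be added to parameters.to.save; traceplot() is an R2jags helper):

```r
# Monitor sigma as well, then inspect convergence diagnostics.
practicemodel <- jags(data = data.jags,
                      parameters.to.save = c("a", "b", "tau", "sigma"),
                      n.iter = 100000, model.file = "regression.bugs.txt",
                      n.chains = 3)
print(practicemodel)     # posterior mean, sd, quantiles, Rhat, n.eff per parameter
traceplot(practicemodel) # per-parameter trace plots for the 3 chains
```

Rhat values close to 1 indicate the chains agree; values much above 1.1 suggest running longer.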
Thanks for the help! I hope this helps others.

Related

Why is this JAGS error appearing in my R output?

So I'm trying to do a Bayesian data modeling project, and I went back to my notes from my Bayesian stats class on rjags, where my professor went over a really similar model. I was able to run that one without any issues, but when I adapt it to my model I get an error as my output.
Here is my code:
library(rjags)
set.seed(12196)
# Number of bear reports
Y <- black_bears$Y # Number of surveys saying there was a reported bear sighting
N <- black_bears$N # Number of surveys submitted on whether a bear was seen or not
q <- Y/N # proportion of surveys reporting a bear sighting
n <- length(Y)
X <- log(q)-log(1-q) # X = logit(q)
data <- list(Y=Y,N=N,X=X)
params <- c("beta")
model_string <- textConnection("model{
# Likelihood
for (i in 1:n){
Y[i] ~ dbinom(p[i], N[i])
logit(p[i] <- beta[1] + beta[2]*X[i])
}
# Priors
beta[1] ~ dnorm(0, 0.01)
beta[2] ~ dnorm(0, 0.01)
}")
model <- jags.model(model_string,data = data, n.chains=2,quiet=TRUE)
update(model, 10000, progress.bar="none")
samples1 <- coda.samples(model, variable.names=params, thin=5, n.iter=20000, progress.bar="none")
plot(samples1)
I'm getting presented with this error:
Any and all help is appreciated. Thank you!
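For readers hitting the same wall: the original error message is not shown, but two likely culprits in the code above (my reading, not confirmed by the poster) are the misplaced parenthesis in the logit line, which should close before the assignment arrow, and the loop bound n, which is used in the model but never included in the data list. A corrected sketch of those two pieces:

```r
# Sketch of the likely fixes: close logit() before <-, and pass n as data
# so the loop bound for (i in 1:n) is defined inside JAGS.
data <- list(Y = Y, N = N, X = X, n = n)
model_string <- textConnection("model{
  # Likelihood
  for (i in 1:n){
    Y[i] ~ dbinom(p[i], N[i])
    logit(p[i]) <- beta[1] + beta[2]*X[i]
  }
  # Priors
  beta[1] ~ dnorm(0, 0.01)
  beta[2] ~ dnorm(0, 0.01)
}")
```

As written in the question, logit(p[i] <- ...) is a syntax error in the JAGS language, and an undefined n would fail at the "Resolving undeclared variables" stage.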

How to deal with "Non-conforming parameters with inprod function" in JAGS model

I am trying to model the variance in overall species richness with the habitat covariates of a camera trapping station using R2jags. However, I keep getting the error:
"Error in jags.model(model.file, data = data, inits = init.values, n.chains = n.chains, :
RUNTIME ERROR:
Non-conforming parameters in function inprod"
I used a very similar function in my previous JAGS model (to find the species richness) so I am not sure why it is not working now...
I have already tried formatting the covariates within the inprod function in different ways, as a data frame and a matrix, to no avail.
Variable specification:
J=length(ustations) #number of camera stations
NSite=Global.Model$BUGSoutput$sims.list$Nsite
NS=apply(NSite,2,function(x)c(mean(x)))
###What I think is causing the problem:
COV <- data.frame(as.numeric(station.cov$NDVI), as.numeric(station.cov$TRI), as.numeric(station.cov$dist2edge), as.numeric(station.cov$dogs), as.numeric(station.cov$Leopard_captures))
###but I have also tried:
COV <- cbind(station.cov$NDVI, station.cov$TRI, station.cov$dist2edge, station.cov$dogs, station.cov$Leopard_captures)
JAGS model:
sink("Variance_model.txt")
cat("model {
# Priors
Y ~ dnorm(0,0.001) #Mean richness
X ~ dnorm(0,0.001) #Mean variance
for (a in 1:length(COV)){
U[a] ~ dnorm(0,0.001)} #Variance covariates
# Likelihood
for (i in 1:J) {
mu[i] <- Y #Hyper-parameter for station-specific all richness
NS[i] ~ dnorm(mu[i], tau[i]) #Likelihood
tau[i] <- (1/sigma2[i])
log(sigma2[i]) <- X + inprod(U,COV[i,])
}
}
", fill=TRUE)
sink()
var.data <- list(NS = NS,
COV = COV,
J=J)
Bundle data:
# Inits function
var.inits <- function(){list(
Y =rnorm(1),
X =rnorm(1),
U =rnorm(length(COV)))}
# Parameters to estimate
var.params <- c("Y","X","U")
# MCMC settings
nc <- 3
ni <-20000
nb <- 10000
nthin <- 10
Start Gibbs sampler:
jags(data=var.data,
inits=var.inits,
parameters.to.save=var.params,
model.file="Variance_model.txt",
n.chains=nc,n.iter=ni,n.burnin=nb,n.thin=nthin)
Ultimately, I get the error:
Compiling model graph
Resolving undeclared variables
Allocating nodes
Deleting model
Error in jags.model(model.file, data = data, inits = init.values, n.chains = n.chains, :
RUNTIME ERROR:
Non-conforming parameters in function inprod
In the end, I would like to calculate the mean and 95% credible interval (BCI) estimates of the habitat covariates hypothesized to influence the variance in station-specific (point-level) species richness.
Any help would be greatly appreciated!
It looks like you are using length to generate the priors for U. In JAGS this function will return the number of elements in a node array. In this case, that would be the number of rows in COV multiplied by the number of columns.
Instead, I would supply a scalar to your data list that you supply to jags.model.
var.data <- list(NS = NS,
COV = COV,
J=J,
ncov = ncol(COV)
)
Following this, you can modify your JAGS code where you are generating your priors for U. The model would then become:
sink("Variance_model.txt")
cat("model {
# Priors
Y ~ dnorm(0,0.001) #Mean richness
X ~ dnorm(0,0.001) #Mean variance
for (a in 1:ncov){ # THIS IS THE ONLY LINE OF CODE THAT I MODIFIED
U[a] ~ dnorm(0,0.001)} #Variance covariates
# Likelihood
for (i in 1:J) {
mu[i] <- Y #Hyper-parameter for station-specific all richness
NS[i] ~ dnorm(mu[i], tau[i]) #Likelihood
tau[i] <- (1/sigma2[i])
log(sigma2[i]) <- X + inprod(U,COV[i,])
}
}
", fill=TRUE)
sink()
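One more point worth checking (an assumption on my part, not something the error message confirms): JAGS expects the data list to contain numeric vectors, matrices, or arrays, so COV is safer supplied as a numeric matrix than as a data frame, and the inits for U should be sized to the number of columns rather than length(COV):

```r
# Supply COV as a numeric matrix and size the U inits to the number of columns.
COV <- as.matrix(station.cov[, c("NDVI", "TRI", "dist2edge",
                                 "dogs", "Leopard_captures")])
var.data <- list(NS = NS, COV = COV, J = J, ncov = ncol(COV))
var.inits <- function(){list(
  Y = rnorm(1),
  X = rnorm(1),
  U = rnorm(ncol(COV)))}  # one initial value per covariate, not rows*columns
```

With COV a J x ncov matrix, inprod(U, COV[i,]) multiplies two vectors of length ncov, which is exactly what inprod requires.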

Outcome prediction using JAGS from R

[Code is updated and does not correspond to error messages anymore]
I am trying to understand how JAGS predicts outcome values (for a mixed Markov model). I've trained the model on a dataset which includes outcome m and covariates x1, x2 and x3.
Predicting the outcome without fixing parameter values works in R, but the output seems completely random:
preds <- run.jags("model.txt",
data=list(x1=x1, x2=x2, x3=x3, m=m,
statealpha=rep(1,times=M), M=M, T=T, N=N), monitor=c("m_pred"),
n.chains=1, inits = NA, sample=1)
Compiling rjags model...
Calling the simulation using the rjags method...
Note: the model did not require adaptation
Burning in the model for 4000 iterations...
|**************************************************| 100%
Running the model for 1 iterations...
Simulation complete
Finished running the simulation
However, as soon as I try to fix parameters (i.e. use model estimates to predict outcome m), I get errors:
preds <- run.jags("model.txt",
data=list(x1=x1, x2=x2, x3=x3,
statealpha=rep(1,times=M), M=M, T=T, N=N, beta1=beta1), monitor=c("m"),
n.chains=1, inits = NA, sample=1)
Compiling rjags model...
Error: The following error occured when compiling and adapting the model using rjags:
Error in rjags::jags.model(model, data = dataenv, n.chains = length(runjags.object$end.state), :
RUNTIME ERROR:
Compilation error on line 39.
beta1[2,1] is a logical node and cannot be observed
beta1 in this case is a 2x2 matrix of coefficient estimates.
How is JAGS predicting m in the first example (no fixed parameters)? Is it just completely randomly choosing m?
How can I include earlier acquired model estimates to simulate new outcome values?
The model is:
model{
for (i in 1:N)
{
for (t in 1:T)
{
m[t,i] ~ dcat(ps[i,t,])
}
for (state in 1:M)
{
ps[i,1,state] <- probs1[state]
for (t in 2:T)
{
ps[i,t,state] <- probs[m[(t-1),i], state, i,t]
}
for (prev in 1:M){
for (t in 1:T) {
probs[prev,state,i,t] <- odds[prev,state,i,t]/totalodds[prev,i,t]
odds[prev,state,i,t] <- exp(alpha[prev,state,i] +
beta1[prev,state]*x1[t,i]
+ beta2[prev,state]*x2[t,i]
+ beta3[prev,state]*x3[t,i])
}}
alpha[state,state,i] <- 0
for (t in 1:T) {
totalodds[state,i,t] <- odds[state,1,i,t] + odds[state,2,i,t]
}
}
alpha[1,2,i] <- raneffs[i,1]
alpha[2,1,i] <- raneffs[i,2]
raneffs[i,1:2] ~ dmnorm(alpha.means[1:2],alpha.prec[1:2, 1:2])
}
for (state in 1:M)
{
beta1[state,state] <- 0
beta2[state,state] <- 0
beta3[state,state] <- 0
}
beta1[1,2] <- rcoeff[1]
beta1[2,1] <- rcoeff[2]
beta2[1,2] <- rcoeff[3]
beta2[2,1] <- rcoeff[4]
beta3[1,2] <- rcoeff[5]
beta3[2,1] <- rcoeff[6]
alpha.Sigma[1:2,1:2] <- inverse(alpha.prec[1:2,1:2])
probs1[1:M] ~ ddirich(statealpha[1:M])
for (par in 1:6)
{
alpha.means[par] ~ dt(T.constant.mu,T.constant.tau,T.constant.k)
rcoeff[par] ~ dt(T.mu, T.tau, T.k)
}
T.constant.mu <- 0
T.mu <- 0
T.constant.tau <- 1/T.constant.scale.squared
T.tau <- 1/T.scale.squared
T.constant.scale.squared <- T.constant.scale*T.constant.scale
T.scale.squared <- T.scale*T.scale
T.scale <- 2.5
T.constant.scale <- 10
T.constant.k <- 1
T.k <- 1
alpha.prec[1:2,1:2] ~ dwish(Om[1:2,1:2],2)
Om[1,1] <- 1
Om[1,2] <- 0
Om[2,1] <- 0
Om[2,2] <- 1
## Prediction
for (i in 1:N)
{
m_pred[1,i] <- m[1,i]
for (t in 2:T)
{
m_pred[t,i] ~ dcat(ps_pred[i,t,])
}
for (state in 1:M)
{
ps_pred[i,1,state] <- probs1[state]
for (t in 2:T)
{
ps_pred[i,t,state] <- probs_pred[m_pred[(t-1),i], state, i,t]
}
for (prev in 1:M)
{
for (t in 1:T)
{
probs_pred[prev,state,i,t] <- odds_pred[prev,state,i,t]/totalodds_pred[prev,i,t]
odds_pred[prev,state,i,t] <- exp(alpha[prev,state,i] +
beta1[prev,state]*x1[t,i]
+ beta2[prev,state]*x2[t,i]
+ beta3[prev,state]*x3[t,i])
}}
for (t in 1:T) {
totalodds_pred[state,i,t] <- odds_pred[state,1,i,t] + odds_pred[state,2,i,t]
}
}
}
TL;DR: I think you're just missing a likelihood.
Your model is complex, so perhaps I'm missing something, but as far as I can tell there is no likelihood. You are supplying the predictors x1, x2, and x3 as data, but you aren't giving any observed m. So in what sense can JAGS be "fitting" the model?
To answer your questions:
Yes, it appears that m is drawn as random from a categorical distribution conditioned on the rest of the model. Since there are no m supplied as data, none of the parameter distributions have cause for update, so your result for m is no different than you'd get if you just did random draws from all the priors and propagated them through the model in R or whatever.
Though it still wouldn't constitute fitting the model in any sense, you would be free to supply values for beta1 if they weren't already defined completely in the model. JAGS is complaining because currently beta1[i] = rcoeff[i] ~ dt(T.mu, T.tau, T.k), and the parameters to the T distribution are all fixed. If any of (T.mu, T.tau, T.k) were instead given priors (identifying them as random), then beta1 could be supplied as data and JAGS would treat rcoeff[i] ~ dt(T.mu, T.tau, T.k) as a likelihood. But in the model's current form, as far as JAGS is concerned if you supply beta1 as data, that's in conflict with the fixed definition already in the model.
I'm stretching here, but my guess is if you're using JAGS you have (or would like to) fit the model in JAGS too. It's a common pattern to include both an observed response and a desired predicted response in a jags model, e.g. something like this:
model {
b ~ dnorm(0, 1) # prior on b
for(i in 1:N) {
y[i] ~ dnorm(b * x[i], 1) # Likelihood of y | b (and fixed precision = 1 for the example)
}
for(i in 1:N_pred) {
pred_y[i] ~ dnorm(b * pred_x[i], 1) # Prediction
}
}
In this example model, x, y, and pred_x are supplied as data, the unknown parameter b is to be estimated, and we desire the posterior predictions pred_y at each value of pred_x. JAGS knows that the distribution in the first for loop is a likelihood, because y is supplied as data. Posterior samples of b will be constrained by this likelihood. The second for loop looks similar, but since pred_y is not supplied as data, it can do nothing to constrain b. Instead, JAGS knows to simply draw pred_y samples conditioned on b and the supplied pred_x. The values of pred_x are commonly defined to be the same as observed x, giving a predictive interval for each observed data point, or as a regular sequence of values along the x axis to generate a smooth predictive interval.
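To make the pattern concrete, here is a minimal, self-contained run of that toy model with rjags (a sketch assuming JAGS is installed; the data are simulated, and pred_x is a regular grid as described above):

```r
library(rjags)
set.seed(1)
# Simulated data: y = 2*x + noise, plus a grid of new x values to predict at
x <- rnorm(20)
y <- 2 * x + rnorm(20)
pred_x <- seq(-2, 2, length.out = 10)

toy <- textConnection("model {
  b ~ dnorm(0, 1)                                        # prior on b
  for(i in 1:N)      { y[i] ~ dnorm(b * x[i], 1) }       # likelihood (y observed)
  for(i in 1:N_pred) { pred_y[i] ~ dnorm(b * pred_x[i], 1) } # posterior prediction
}")
m <- jags.model(toy,
                data = list(x = x, y = y, N = length(x),
                            pred_x = pred_x, N_pred = length(pred_x)),
                n.chains = 2, quiet = TRUE)
update(m, 1000)                                 # burn-in
s <- coda.samples(m, c("b", "pred_y"), n.iter = 5000)
summary(s)  # posterior of b plus a predictive interval at each pred_x
```

Because y is supplied as data, the first loop acts as a likelihood and constrains b; pred_y is never observed, so its samples are pure posterior predictions.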

five-fold cross-validation with the use of linear regression

I would like to perform a five-fold cross validation for a regression model of degree 1
lm(y ~ poly(x, degree=1), data).
I generated 100 observations with the following code
set.seed(1)
GenData <- function(n){
x <- seq(-2,2,length.out=n)
y <- -4 - 3*x + 1.5*x^2 + 2*x^3 + rnorm(n,0,0.5)
return(cbind(x,y))
}
GenData(100)
D<-GenData(100)
and my code for this goal is
ind<-sample(1:100)
re<-NULL
k<-20
teams<- 5
t<-NULL
for (i in 1:teams) {
te<- ind[ ((i-1)*k+1):(i*k)]
train <- D[-te,1:2]
test <- D[te,1:2]
cl <- D[-te,2]
lm1 <- lm(cl ~train[,1] , data=train)
pred <- predict(lm1,test)
t<- c(t, sum(D[te,2] == pred) /dim(test)[1])
}
re<-c(re,mean(t))
where I split my data into training and test. With the training data I run a regression, with the aim of making a prediction and comparing it with my test data. But I get the following error:
"Error in predict(mult, test)$class :
$ operator is invalid for atomic vectors
In addition: Warning message:
'newdata' had 20 rows but variables found have 80 rows "
So I understand that I have to change something on the line
pred<-predict(lm1,test)
but I don't know what.
Thanks in advance!
lm requires a data frame as input data. Also, trying to validate the model by checking whether predictions exactly match the observed values will not work, because the simulated data include normally distributed irreducible error.
Here is the updated code:
ind<-sample(1:100)
re<-NULL
k<-20
teams<- 5
t<-NULL
for (i in 1:teams) {
te<- ind[ ((i-1)*k+1):(i*k)]
train <- data.frame(D[-te,1:2])
test <- data.frame(D[te,1:2])
lm1 <- lm(y~x , data=train)
pred <- predict(lm1,test)
t<- c(t, sum(abs(D[te,2] - pred)) /dim(test)[1])
}
re<-c(re,mean(t))
In the lm() function, your y variable is cl, a vector not included in the data = argument:
cl <- D[-te,2]
lm1 <- lm(cl ~train[,1] , data=train)
No need to include the cl at all. Rather, simply specify x and y by their names in the dataset train, in this case the names are x and y:
names(train)
[1] "x" "y"
So your for loop would then look like:
for (i in 1:teams) {
te<- ind[ ((i-1)*k+1):(i*k)]
train <- D[-te,1:2]
test <- D[te,1:2]
lm1 <- lm(y ~x , data=train)
pred <- predict(lm1,test)
t[i]<- sum(D[te,2] == pred)/dim(test)[1]
}
Also, note that I have added the for loop index i so that values can be added to the object. Lastly, I had to make the D object a dataframe in order for the code to work:
D<-as.data.frame(GenData(100))
Your re object ends up being 0 because the == comparison requires predictions to match the observed values exactly, which essentially never happens with continuous data. I would suggest using RMSE as a performance measure instead.
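A sketch of the RMSE version of the loop (same fold construction as above; as.data.frame makes it work whether D is a matrix or a data frame):

```r
# Five-fold CV scored by root-mean-squared error instead of an exact-match rate.
rmse <- NULL
for (i in 1:teams) {
  te    <- ind[((i - 1) * k + 1):(i * k)]
  train <- as.data.frame(D[-te, 1:2])
  test  <- as.data.frame(D[te, 1:2])
  lm1   <- lm(y ~ x, data = train)
  pred  <- predict(lm1, newdata = test)
  rmse  <- c(rmse, sqrt(mean((test$y - pred)^2)))  # out-of-fold RMSE
}
mean(rmse)  # average RMSE across the five folds
```

Lower mean RMSE indicates better out-of-sample fit, and unlike the exact-match rate it degrades gracefully as predictions drift from the observed values.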

Passing variable to WinBugs model in R

I am using the R2WinBUGS package. I would like to pass two parameters that are calculated earlier in the R script to the model function:
c0yy <- 0.1
syy <- 0.0001
#Model
model <- function(c0yy,syy){
#Likelihood
for(i in 1:n){
y[i] ~ dnorm(mu[i],cyy)
}
#Regression formula
for(i in 1:n){
mu[i] <- alpha + gamma * x[i]
}
#Priors for the regression parameters
alpha ~ dnorm(0,0.000001)
gamma ~ dnorm(0,0.000001)
#Priors for the precision parameter
cyy ~ dnorm(c0yy,syy)
#Monitored variables
beta <- gamma/(alpha-1)
}
filename <- file.path(tempdir(), "Olm.txt")
write.model(model, filename)
but I get this error
made use of undefined node c0yy
while if I substitute the values for c0yy and syy inside the model function, it works. Any help?
Thanks
The values you are trying to pass to the model are data. In BUGS (and R2WinBUGS), data are passed to the program separately from the model definition. To include the data, you can put them into a list, something like:
my.mcmc <- bugs(data = list(c0yy = 0.1, syy = 0.0001), inits = NULL, parameters.to.save = "beta", model.file = "Olm.txt", n.iter = 10000)
You will also need to change model <- function(c0yy,syy){ to model <- function(){ in your model script, since c0yy and syy now arrive as data (write.model still needs the model to be an argument-free R function).
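Putting both changes together, a sketch of a complete script (the question's y and x are not shown, so example data are simulated here; inits = NULL lets WinBUGS generate its own initial values; note that n, y, and x used inside the model must also be in the data list):

```r
library(R2WinBUGS)
# Example data (the question's y and x are not shown, so these are simulated)
n <- 20
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n)

# Model with no function arguments; c0yy and syy now arrive as data.
model <- function(){
  for(i in 1:n){
    y[i] ~ dnorm(mu[i], cyy)
    mu[i] <- alpha + gamma * x[i]
  }
  alpha ~ dnorm(0, 0.000001)
  gamma ~ dnorm(0, 0.000001)
  cyy ~ dnorm(c0yy, syy)
  beta <- gamma/(alpha - 1)
}
filename <- file.path(tempdir(), "Olm.txt")
write.model(model, filename)

my.mcmc <- bugs(data = list(c0yy = 0.1, syy = 0.0001, n = n, y = y, x = x),
                inits = NULL, parameters.to.save = "beta",
                model.file = filename, n.iter = 10000)
```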
