JAGS - apply function to all parameter nodes - r

I'm new to JAGS and I'm running a model in R via the R2jags package.
The model code is based on code taken from Kéry & Schaub 2012 ("Bayesian Population Analysis using WinBUGS"), p. 399.
A chi-square discrepancy measure is computed:
model {
....
for(g in 1:G) {
for (t in 1:T) {
...
E[g,t] <- pow((y[g,t] - eval[g,t]),2) / eval[g,t]
...
}#t
}#g
fit <- sum(E[,])
}#model
where g and t are the site and time indices, and G and T are the number of sites and the number of years.
However, I get an error:
Error in jags.model(model.file, data = data, inits = init.values, n.chains = n.chains, :
RUNTIME ERROR:
Compilation error on line 140.
Cannot evaluate subset expression for fit
Is it caused by different syntax used by JAGS relative to WinBUGS? The code is the same as in the book, except that I have two dimensions instead of the three in the book example.

To answer the last part of your question: no, that error isn't caused by different syntax in JAGS (although the error message might look different in BUGS).
In fact I can't see anything wrong with the code snippet that you have posted, and the following reproducible example shows that it works at least when y and eval are given in data:
m <- 'model {
for(g in 1:G) {
for (t in 1:T) {
E[g,t] <- pow((y[g,t] - eval[g,t]),2) / eval[g,t]
}#t
}#g
fit <- sum(E[,])
#data# G, T, y, eval
#monitor# fit
}#model
'
library('runjags')
G=T <- 10
y <- matrix(rnorm(100), nrow=G, ncol=T)
eval <- matrix(rnorm(100), nrow=G, ncol=T)
results <- run.jags(m)
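As a cross-check outside JAGS, the same discrepancy can be computed directly in R. This is only a sketch: the names y_obs and expected and the Poisson test data are assumptions made for illustration, not part of the original model.

```r
# Hypothetical stand-in data: counts for 10 sites x 10 years
set.seed(42)
n_sites <- 10
n_years <- 10
y_obs <- matrix(rpois(n_sites * n_years, lambda = 5), nrow = n_sites)
expected <- matrix(5, nrow = n_sites, ncol = n_years)  # strictly positive expected values

# Element-wise chi-square discrepancy, mirroring E[g,t] in the model
E <- (y_obs - expected)^2 / expected

# Overall fit statistic, mirroring fit <- sum(E[,])
fit <- sum(E)
```

If E has the same two dimensions everywhere it is defined, sum(E[,]) in JAGS should reduce to the same scalar as sum(E) here.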
Have you verified what line 140 refers to? Either line 140 is something that you haven't shown, or maybe you have specified either fit or E somewhere else in the model with a different number of dimensions?
If this isn't the case and you still get an error then please add a minimal reproducible example to your question that shows the problem (preferably underneath an ---EDIT--- line below what you have already written) and we can try to help with that.
Matt

Related

slicing a vector shows error out of range

I have this part of my jags code. I really can't see where the code gets out of the range. Can anyone please see any error that I can't recognize? These are the data sizes.
N = 96
L = c(4,4,4,4,4)
length(media1) = 96
length(weights1) = 4
for(t in 1:N){
current_window_x <- ifelse(t <= L[1], media1[1:t], media1[(t - L[1] + 1):t])
t_in_window <- length(current_window_x)
new_media1[t] <- ifelse(t <= L[1], inprod(current_window_x, weights1[1:t_in_window]),
inprod(current_window_x, weights1))
}
The error is (where line 41 corresponds to the first line in the loop):
Error in jags.model(model.file, data = data, inits = init.values, n.chains = n.chains, :
RUNTIME ERROR:
Compilation error on line 41.
Index out of range taking subset of media1
I actually just happened upon the answer earlier today for something I was working on. The answer is in this post. The gist is that ifelse() in JAGS is not a control flow statement; it is a function, and both the TRUE and FALSE branches are evaluated. So, even though you are saying to use media1[1:t] if t <= L[1], the FALSE branch is also evaluated, and for small t its index (t - L[1] + 1) falls below 1, which produces the error.
The other problem once you're able to fix that is that you're re-defining the parameter current_window_x, which will throw an error. I think the easiest way to deal with the variable window width is just to hard code the first few observations of new_media and then calculate the remaining ones in the loop, like this:
new_media[1] <- media1[1]*weights1[1]
new_media[2] <- inprod(media1[1:2], weights1[1:2])
new_media[3] <- inprod(media1[1:3], weights1[1:3])
for(t in 4:N){
new_media[t] <- inprod(media1[(t - L[1] + 1):t], weights1)
}
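To check what the JAGS code above should produce, the same windowed inner products can be computed in plain R, where if/else is real control flow. This is a sketch under assumptions: the random media1 values and the weights are made up for illustration, and L is taken as the single value 4.

```r
# Hypothetical stand-in data with the sizes given in the question
set.seed(1)
N <- 96
L <- 4
media1 <- runif(N)
weights1 <- c(0.1, 0.2, 0.3, 0.4)

new_media <- numeric(N)
for (t in seq_len(N)) {
  if (t < L) {
    # Short window at the start: use only the first t weights
    new_media[t] <- sum(media1[1:t] * weights1[1:t])
  } else {
    # Full window of length L ending at t
    new_media[t] <- sum(media1[(t - L + 1):t] * weights1)
  }
}
```

Only one branch runs for each t here, so the out-of-range index is never formed; the hard-coded first three elements in the JAGS version achieve the same thing.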

JAGS: variable number of clusters

I am trying to run a Bayesian clustering model where the number of clusters is random with binomial distribution.
This is my Jags model:
model{
for(i in 1:n){
y[ i ,1:M] ~ dmnorm( mu[z[i] , 1:M] , I[1:M, 1:M])
z[i] ~ dcat(omega[1:M])
}
for(j in 1:M){
mu[j,1:M] ~ dmnorm( mu_input[j,1:M] , I[1:M, 1:M] )
}
M ~ dbin(p, Mmax)
omega ~ ddirich(rep(1,Mmax))
}
To run it, we need to define the parameters and the initial values for the variables, which is done in this R script:
Mmax=10
y = matrix(0,100,Mmax)
I = diag(Mmax)
y[1:50,] = mvrnorm(50, rep(0,Mmax), I)
y[51:100,] = mvrnorm(50, rep(5,Mmax), I)
plot(y[,1:2])
z = 1*((1:100)>50) + 1
n = dim(y)[1]
M=2
mu=matrix(rnorm(Mmax^2),nrow=Mmax)
mu_input=matrix(2.5,Mmax,Mmax) ### prior mean
p=0.5
omega=rep(1,Mmax)/Mmax
data = list(y = y, I = I, n = n, mu_input=mu_input, Mmax = Mmax, p = p)
inits = function() {list(mu=mu,
M=M,
omega = omega) }
require(rjags)
modelRegress=jags.model("cluster_variabile.txt",data=data,inits=inits,n.adapt=1000,n.chains=1)
However, running the last command gives:
Error in jags.model("cluster_variabile.txt", data = data, inits = inits,
: RUNTIME ERROR: Compilation error on line 6.
Unknown variable M Either supply values
for this variable with the data or define it on the left hand side of a relation.
which for me makes no sense, since the error is at line 6 even if M already appears at line 4 of the model! What is the actual problem in running this script?
JAGS is not like R or other procedural programming languages in that it doesn't run line by line. It is a declarative language, so the order of statements doesn't matter, at least in terms of where errors are reported. Just because it didn't throw an error on line 4 doesn't mean nothing is wrong there. I'm not positive, but I believe the error occurs because JAGS tries to build the arrays before any values are filled in, so M is not yet defined at that stage, and there is nothing you can do about that on your end.
With that aside, there should be a fairly easy workaround, although it is less efficient. Instead of looping over 1:M, make the loop iterate over 1:Mmax so that the dimensions never change: mu is always Mmax x Mmax. Then line 7 just assigns values to 1:M of those positions. The downside is that it requires some processing after the model is fit: on each iteration you will need to pull the sampled M and filter the matrix mu down to M x M, but that shouldn't be too tough. Let me know if you need more help.
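The post-processing step described above might look roughly like this. It is a sketch with simulated stand-ins rather than real JAGS output: M_draws, mu_draws and the [iteration, row, column] layout are assumptions, and real samples would first have to be reshaped into that form.

```r
# Simulated stand-ins for posterior draws (not real JAGS output)
set.seed(2)
Mmax <- 10
n_iter <- 5
M_draws <- sample(1:Mmax, n_iter, replace = TRUE)            # sampled M per iteration
mu_draws <- array(rnorm(n_iter * Mmax * Mmax),
                  dim = c(n_iter, Mmax, Mmax))               # [iteration, row, col]

# For each iteration keep only the leading M x M block of mu
mu_active <- lapply(seq_len(n_iter), function(s) {
  m <- M_draws[s]
  # matrix() keeps the result two-dimensional even when m is 1
  matrix(mu_draws[s, 1:m, 1:m], nrow = m, ncol = m)
})
```

The result is a list with one matrix per iteration, each sized by the M sampled in that iteration.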
So, I think the main problem is that you can't change the dimensionality of the stochastic node you're updating. This seems like a problem for reversible jump MCMC, though I don't think you can do this in JAGS.

Error related to randomisation test within lapply() function in R

I have 30 datasets that are combined in a data list. I want to analyze spatial point patterns using the L function along with a randomisation test. The code follows.
The first code works well for a single dataset (data1) but once it is applied to a list of dataset with lapply() function as shown in 2nd code, it gives me a very long error like so,
"Error in Kcross(X, i, j, ...) : No points have mark i = Acoraceae
Error in envelopeEngine(X = X, fun = fun, simul = simrecipe, nsim =
nsim, : Exceeded maximum number of errors"
Can anybody tell me what is wrong with 2nd code?
grp <- factor(data1$species)
window <- ripras(data1$utmX, data1$utmY)
pp.grp <- ppp(data1$utmX, data1$utmY, window=window, marks=grp)
L.grp <- alltypes(pp.grp, Lest, correlation = "Ripley")
LE.grp <- alltypes(pp.grp, Lcross, nsim = 100, envelope = TRUE)
plot(L.grp)
plot(LE.grp)
L.LE.sp <- lapply(data.list, function(x) {
grp <- factor(x$species)
window <- ripras(x$utmX, x$utmY)
pp.grp <- ppp(x$utmX, x$utmY, window = window, marks = grp)
L.grp <- alltypes(pp.grp, Lest, correlation = "Ripley")
LE.grp <- alltypes(pp.grp, Lcross, envelope = TRUE)
result <- list(L.grp=L.grp, LE.grp=LE.grp)
return(result)
})
plot(L.LE.sp$LE.grp[1])
This question is about the R package spatstat.
It would help if you could add a minimal working example including data which demonstrate this problem.
If that is not available, please generate the error on your computer, then type traceback() and capture the output and post it here. This will trace the location of the error.
Without this information, my best guess is the following:
The error message says No points have mark i=Acoraceae. That means that the code is expecting a point pattern to include points of type Acoraceae but found that there were none. This can happen because in alltypes(... envelope=TRUE) the code generates random point patterns according to complete spatial randomness. In the simulated patterns, the number of points of type Acoraceae (say) will be random according to a Poisson distribution with a mean equal to the number of points of type Acoraceae in the observed data. If the number of Acoraceae in the actual data is small then there is a reasonable chance that the simulated pattern will contain no Acoraceae at all. This is probably what is causing the error message No points have mark i=Acoraceae.
If this interpretation is correct then you should be able to suppress the error by including the argument fix.marks=TRUE, that is,
alltypes(pp.grp, Lcross, envelope=TRUE, fix.marks=TRUE, nsim=99)
I'm not suggesting this is necessarily appropriate for your application, but this should remove the error message if my guess is correct.
In the latest development version of spatstat, available on github, the code for envelope has been tweaked to detect this error.
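To get a feel for how likely an empty type is under the Poisson description given above, the probability can be computed in base R. This is a sketch: the counts in n_observed are made-up examples, and treating the simulations as independent is an assumption.

```r
# Hypothetical numbers of observed points of one species
n_observed <- c(1, 2, 5, 10)

# Probability that a Poisson count with that mean is zero, i.e. exp(-n)
p_empty <- dpois(0, lambda = n_observed)

# Chance that at least one of nsim independent simulations has no such points
nsim <- 99
p_any_empty <- 1 - (1 - p_empty)^nsim

print(data.frame(n_observed, p_empty, p_any_empty))
```

For a species observed only once or twice, an empty simulated pattern is close to certain somewhere among 99 simulations, which fits a rare species such as Acoraceae appearing in the error message.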

'Invalid parent values' error when running JAGS from R

I am running a simple generalized linear model, calling JAGS from R. The response is modelled with a negative binomial distribution. The model is being fitted to data on counts of fish, with the majority of individual counts ('C' in the data set below) being zeros.
I initially ran the model with one covariate, temperature ('Temp'). About half of the time the model ran and the other half of the time the model gave me the error, 'Error in node C[###] Invalid parent values.' The value for C[###] in the error message changes with each successive attempt to run the model.
Since my success at running the model was inconsistent, I tried adding another covariate, salinity ('Salt'). Then the model would not run at all, with the same error message as above.
Any ideas or suggestions on the source of the error are greatly appreciated.
I suspect that the initial values for the dispersion parameter, r, may be the issue. Ideally I would add several more covariates to the model once this error is resolved.
The data set and code are immediately below. For sake of getting the data to load properly on this website, I have omitted 662 of the 672 total values; even with the reduced data set (n = 10 instead of n = 672) the problem remains.
Thank you.
setwd("C:/Users/John/Desktop")
library('coda')
library('rjags')
library('R2jags')
set.seed(1000000000)
#data
n=10
C=c(0,0,0,0,0,1,0,0,0,1)
Temp=c(0,29.3,25.3,28.7,28.7,24.4,25.1,25.1,24.2,23.3)
Salt=c(6,6,0,6,6,0,12,12,6,12)
sink("My Model.txt")
cat("
model {
r~dunif(0,10)
beta0~dunif (-20,20)
beta1~dunif (-20,20)
beta2~dunif (-20,20)
for (i in 1:n) {
C[i] ~ dnegbin(p[i], r)
p[i] <- r/(r+lambda[i])
log(lambda[i]) <- mu[i]
mu[i] <- beta0 + beta1*Temp[i] + beta2*Salt[i]
}
}
", fill=TRUE)
sink()
n=n
C=C
Temp=Temp
Salt=Salt
#bundle data
bugs.data = list(
"n",
"C",
"Temp",
"Salt")
#parameters to monitor
params<-c(
"r",
"beta0",
"beta1",
"beta2")
#initial values
inits <- function(){list(
r=floor(runif(1,0,5)),
beta0=runif(1,-5,5),
beta1=runif(1,-5,5),
beta2=runif(1,-5,5))}
model.file <- 'My Model.txt'
jagsfit <- jags(data=bugs.data, inits=inits, params, n.iter=1000, n.thin=10, n.burnin=100, model.file)
print(jagsfit, digits=5)
This works fine for me most of the time, but it would fail with the error you describe if the inits function samples a value of 0 for r, which you have made more likely by using floor() in the inits function (not sure why you did that: r is not restricted to integers, but it is strictly positive). Also, every time you run the model you will get different initial values (unless you set a random seed in R), which is making your life more complicated than it needs to be. I generally recommend picking fixed (and probably overdispersed) initial values, such as r=0.01 and r=10 for the two chains in your example.
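A small sketch of both points, using base R only. The beta starting values and the name fixed_inits are illustrative assumptions; only r=0.01 and r=10 come from the recommendation above.

```r
# How often does the original inits expression produce r = 0?
set.seed(1)
r_draws <- floor(runif(1e5, 0, 5))
share_zero <- mean(r_draws == 0)  # roughly one draw in five

# Fixed, overdispersed starting values: one list per chain
fixed_inits <- list(
  list(r = 0.01, beta0 = -1, beta1 = 0, beta2 = 0),
  list(r = 10,   beta0 =  1, beta1 = 0, beta2 = 0)
)
```

With R2jags, such a list could be passed as inits = fixed_inits together with n.chains = 2, so that every run starts from the same strictly positive values of r.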
However, JAGS picks usable initial values for this model as you can see by not providing your own inits e.g.:
library('runjags')
listdata <- lapply(bugs.data, get)
names(listdata) <- unlist(bugs.data)
run.jags(model.file, params, listdata)
I would also have a think about the prior you are using for r - it could well be that this will have a bigger effect on your posterior than intended. Another (not necessarily better) option is something like a gamma prior.
Matt

How to get confidence intervals by bootstrapping for quantile regressions by default

In my statistics class we use Stata and since I'm an R user I want to do the same things in R. I've gotten the right results but it seems like a somewhat awkward way of getting something as simple as confidence intervals.
Here's my crude solution:
library(quantreg)
na = round(runif(100, min=127, max=144))
f <- rq(na~1, tau=.5)
s <- summary.rq(f, se="boot", R=1000)
coef(s)[1]
coef(s)[1]+ c(-1,1)*1.96*coef(s)[2]
I've also experimented a little with the boot package but I haven't gotten it to work:
library(boot)
b <- boot(na, function(w, i){
rand_bootstrap_sample = w[i]
f <- rq(rand_bootstrap_sample~1, tau=.5)
return(coef(f))
}, R=100)
boot.ci(b)
Gives an error:
Error in bca.ci(boot.out, conf, index[1L], L = L, t = t.o, t0 = t0.o, :
estimated adjustment 'a' is NA
My questions:
Is there another, better way of getting the confidence interval?
Why is the bootstrap code complaining?
Your example does not give an error message for me (Windows 7/64, R 2.14.2), so it could be a problem of random seeds. If you post an example that uses random numbers, it is better to add a set.seed line; see the example below.
Note that the error message refers to the bca type of boot.ci; since this one often complains, deselect it by giving type explicitly.
I do not know exactly why you use the rather complex rq in the bootstrap. If you really wanted to profile rq, forget the simple example below, but please give some more details.
library(boot)
set.seed(4711)
na = round(runif(100, min=127, max=144))
b <- boot(na, function(w, i) median(w[i]), R=1000)
boot.ci(b,type=c("norm","basic","perc"))
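For intuition about what the "perc" interval reports, a percentile interval can also be computed by hand in base R. This is a sketch, not a replacement for boot.ci: the names x, boot_medians and ci_perc are made up here, and quantile() may interpolate slightly differently from boot.ci, so the endpoints need not match exactly.

```r
set.seed(4711)
x <- round(runif(100, min = 127, max = 144))

# Resample with replacement and take the median of each bootstrap sample
boot_medians <- replicate(1000, median(sample(x, replace = TRUE)))

# 95% percentile interval: the 2.5% and 97.5% quantiles of the bootstrap medians
ci_perc <- quantile(boot_medians, probs = c(0.025, 0.975))
print(ci_perc)
```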
