Slicing a vector shows error out of range - R

I have this part of my jags code. I really can't see where the code gets out of the range. Can anyone please see any error that I can't recognize? These are the data sizes.
N = 96
L = c(4,4,4,4,4)
length(media1) = 96
length(weights1) = 4
for(t in 1:N){
current_window_x <- ifelse(t <= L[1], media1[1:t], media1[(t - L[1] + 1):t])
t_in_window <- length(current_window_x)
new_media1[t] <- ifelse(t <= L[1], inprod(current_window_x, weights1[1:t_in_window]),
inprod(current_window_x, weights1))
}
The error is (where line 41 corresponds to the first line in the loop):
Error in jags.model(model.file, data = data, inits = init.values, n.chains = n.chains, :
RUNTIME ERROR:
Compilation error on line 41.
Index out of range taking subset of media1

I happened onto the answer to this earlier today while working on something else. The answer is in this post. The gist is that ifelse() in JAGS is not a control-flow statement; it is a function, and both the TRUE and FALSE branches are evaluated. So even though you say to use media1[1:t] when t <= L[1], the FALSE branch media1[(t - L[1] + 1):t] is evaluated as well, and for t < L[1] its lower index is zero or negative, which produces the error.
The other problem, once you fix that, is that you're re-defining the node current_window_x on every pass through the loop, which will throw an error. I think the easiest way to deal with the variable window width is to hard-code the first few observations of new_media1 and then calculate the remaining ones in the loop, like this:
new_media1[1] <- media1[1]*weights1[1]
new_media1[2] <- inprod(media1[1:2], weights1[1:2])
new_media1[3] <- inprod(media1[1:3], weights1[1:3])
for(t in 4:N){
new_media1[t] <- inprod(media1[(t - L[1] + 1):t], weights1)
}
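As a sanity check outside JAGS, the same windowed weighted sum can be sketched in plain R. This is only an illustration: the media1 values and the weights are made up, and the loop is ordinary R, not JAGS model code.

```r
# Plain-R check of the windowed weighted sum (illustration only, not JAGS code).
set.seed(1)
N <- 96
L <- c(4, 4, 4, 4, 4)
media1 <- runif(N)                  # made-up data for illustration
weights1 <- c(0.4, 0.3, 0.2, 0.1)   # made-up weights, length L[1]

new_media1 <- numeric(N)
for (t in seq_len(N)) {
  # Clamp the lower index at 1 so the window never reaches before the series starts.
  start <- max(1, t - L[1] + 1)
  window <- media1[start:t]
  # Use only as many weights as there are observations in the window.
  new_media1[t] <- sum(window * weights1[seq_along(window)])
}
```

The first three values use a shortened window, which matches the hard-coded lines for t = 1, 2, 3; from t = 4 onward the full set of weights is used.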

Related

JAGS: variable number of clusters

I am trying to run a Bayesian clustering model where the number of clusters is random with binomial distribution.
This is my Jags model:
model{
for(i in 1:n){
y[ i ,1:M] ~ dmnorm( mu[z[i] , 1:M] , I[1:M, 1:M])
z[i] ~ dcat(omega[1:M])
}
for(j in 1:M){
mu[j,1:M] ~ dmnorm( mu_input[j,1:M] , I[1:M, 1:M] )
}
M ~ dbin(p, Mmax)
omega ~ ddirich(rep(1,Mmax))
}
To run it, we need to define the parameters and the initial values for the variables, which is done in this R script:
library(MASS) # provides mvrnorm()
Mmax=10
y = matrix(0,100,Mmax)
I = diag(Mmax)
y[1:50,] = mvrnorm(50, rep(0,Mmax), I)
y[51:100,] = mvrnorm(50, rep(5,Mmax), I)
plot(y[,1:2])
z = 1*((1:100)>50) + 1
n = dim(y)[1]
M=2
mu=matrix(rnorm(Mmax^2),nrow=Mmax)
mu_input=matrix(2.5,Mmax,Mmax) ### prior mean
p=0.5
omega=rep(1,Mmax)/Mmax
data = list(y = y, I = I, n = n, mu_input=mu_input, Mmax = Mmax, p = p)
inits = function() {list(mu=mu,
M=M,
omega = omega) }
require(rjags)
modelRegress=jags.model("cluster_variabile.txt",data=data,inits=inits,n.adapt=1000,n.chains=1)
However, running the last command, one gets
Error in jags.model("cluster_variabile.txt", data = data, inits = inits,
: RUNTIME ERROR: Compilation error on line 6.
Unknown variable M Either supply values
for this variable with the data or define it on the left hand side of a relation.
which makes no sense to me, since the error is reported at line 6 even though M already appears at line 4 of the model! What is the actual problem in running this script?
JAGS is not like R or other procedural languages in that it doesn't actually run line by line. It is a declarative language, meaning the order of the statements doesn't matter, at least in terms of where the errors pop up. So just because it didn't throw an error on line 4 doesn't mean something isn't also wrong there. I'm not positive, but I believe the error occurs because JAGS tries to build the arrays before filling in values, so M is not yet defined at that stage, and there is nothing you can do about that on your end.
With that aside, there should be a fairly easy workaround, though it is less efficient. Instead of looping over 1:M, make the loop iterate over 1:Mmax so that the dimensions never change: mu is always Mmax x Mmax. Then line 7 just assigns 1:M of those positions to a value. The downside is that you will need some post-processing after the model is fit: for each iteration, pull the sampled M and filter the matrix mu down to M x M, but that shouldn't be too tough. Let me know if you need more help.
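The post-processing step could look roughly like this in R. The arrays M_draws and mu_draws are hypothetical stand-ins for whatever your sampler output looks like (here they are filled with random numbers), so treat the layout as an assumption.

```r
# Sketch of the post-processing: keep only the first M rows/columns of mu
# for each saved iteration. M_draws and mu_draws are hypothetical stand-ins.
set.seed(42)
Mmax <- 10
n_iter <- 5
M_draws <- sample(1:Mmax, n_iter, replace = TRUE)      # stand-in for sampled M
mu_draws <- array(rnorm(n_iter * Mmax * Mmax),
                  dim = c(n_iter, Mmax, Mmax))         # stand-in for sampled mu

mu_active <- lapply(seq_len(n_iter), function(s) {
  m <- M_draws[s]
  # matrix() keeps the result two-dimensional even when m == 1.
  matrix(mu_draws[s, seq_len(m), seq_len(m)], nrow = m, ncol = m)
})
```

Each element of mu_active is then an M x M matrix for that iteration.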
So, I think the main problem is that you can't change the dimensionality of the stochastic node you're updating. This seems like a problem for reversible jump MCMC, though I don't think you can do this in JAGS.

Error when running PerformanceAnalytics function in R

I am getting an "Error in 1:T : argument of length 0" when running the PerformanceAnalytics package in R. Am I missing a package? Below is my code with the error.
#clean z, all features, alpha = .01, run below
setwd("D:/LocalData/casaler/Documents/R/RESULTS/PLOTS_PCA/CLN_01")
PGFZ_ALL <- read.csv("D:/LocalData/casaler/Documents/R/PG_DEUX_Z.csv", header=TRUE)
options(max.print = 100000) #Sets ability to view all dealer records
pgfzc_all <- PGFZ_ALL
#head(pgfzc_all,10)
library("PerformanceAnalytics")
library("RGraphics")
Loading required package: grid
pgfzc_elev <- pgfzc_all$ELEV
#head(pgfzc_elev,5)
#View(pgfzc_elev)
set.seed(123) #for replication purposes; always use same seed value
cln_elev <- clean.boudt(pgfzc_elev, alpha = 0.01) #set alpha .001 to give the most extreme outliers
Error in 1:T : argument of length 0
It's hard to answer your question without knowing what your data looks like. But I can tell you what throws that error. Looking into the source code of the clean.boudt function I find the following cause of your error:
T = dim(R)[1]
...
for (t in c(1:T)) {
d2t = as.matrix(R[t, ] - mu) %*% invSigma %*% t(as.matrix(R[t,
] - mu))
vd2t = c(vd2t, d2t)
}
...
dim(R)[1] extracts the number of rows in the data supplied to the R argument of the function. It appears that your data has no rows as far as dim() is concerned, so check the data type of pgfzc_elev.
The cause of the error is likely your use of $ to subset pgfzc_all.
pgfzc_elev <- pgfzc_all$ELEV
I reckon the result is a plain vector (probably of class integer or numeric). A vector has no dim attribute, so dim(R) is NULL and dim(R)[1] does not work in the function.
Rather, subset your object like this:
pgfzc_elev <- pgfzc_all[, "ELEV", drop = FALSE]
Try that and see if it works.
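A small sketch of the difference, using a made-up data frame (the values and the OTHER column are invented for illustration; only the ELEV name comes from the question):

```r
# Made-up data frame; only the column name ELEV comes from the question.
df <- data.frame(ELEV = c(10.5, 12.1, 9.8), OTHER = 1:3)

v <- df$ELEV                           # plain vector: no dim attribute
one_col <- df[, "ELEV", drop = FALSE]  # still a data frame with one column

print(dim(v))         # NULL
print(dim(one_col))   # rows and columns of the one-column data frame

# 1:dim(v)[1] is 1:NULL, which is what fails inside the function.
err <- tryCatch(1:dim(v)[1], error = function(e) conditionMessage(e))
print(err)
```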

R: Profile-likelihood based confidence intervals

I am using the function plkhci from library Bhat to construct Profile-likelihood based confidence intervals and I got this warning:
Warning message: In dqstep(list(label = x$label, est = btrf(xt, x$low,
x$upp), low = x$low, : oops: unable to find stepsize, use default
when I run
r <- dfp(x,f=nlogf)
Can I ignore this warning, since I can still get the output?
Following is the complete code:
library(Bhat)
beta0<--8
beta1<-0.03
gamma<-0.0105
alpha<-0.05
n<-100
u<-runif(n)
u
x<-rnorm(n)
x
c<-rexp(100,1/1515)
c
t1<-(1/gamma)*log(1-((gamma/(exp(beta0+beta1*x)))*(log(1-u))))
t1
t<-pmin(t1,c)
t
delta<-1*(t1>c)
delta
length(delta)
cp<-length(delta[delta==1])/n
cp
delta[delta==1]<-ifelse(rbinom(length(delta[delta==1]),1,0.5),1,2)
delta
deltae<-ifelse(delta==0, 1,0)
deltar<-ifelse(delta==1, 1,0)
deltai<-ifelse(delta==2, 1,0)
dat=data.frame(t,delta, deltae,deltar,deltai,x)
dat$interval[delta==2] <- as.character(cut(dat$t[delta==2], breaks=seq(0, 600, 100)))
labs <- cut(dat$t[delta==2], breaks=seq(0, 600, 100))
dat$lower[delta==2]<-as.numeric( sub("\\((.+),.*", "\\1", labs) )
dat$upper[delta==2]<-as.numeric( sub("[^,]*,([^]]*)\\]", "\\1", labs) )
data0<-dat[which(dat$delta==0),]#uncensored data
data1<-dat[which(dat$delta==1),]#right censored data
data2<-dat[which(dat$delta==2),]#interval censored data
nlogf<-function(para)
{
b0<-para[1]
b1<-para[2]
g<-para[3]
e<-sum((b0+b1*data0$x)+g*data0$t+(1/g)*exp(b0+b1*data0$x)*(1-exp(g*data0$t)))
r<-sum((1/g)*exp(b0+b1*data1$x)*(1-exp(g*data1$t)))
i<-sum(log(exp((1/g)*exp(b0+b1*data2$x)*(1-exp(g*data2$lower)))-exp((1/g)*exp(b0+b1*data2$x)*(1-exp(g*data2$upper)))))
l<-e+r+i
return(-l)
}
x <- list(label=c("beta0","beta1","gamma"),est=c(-8,0.03,0.0105),low=c(-10,0,0),upp=c(10,1,1))
r <- dfp(x,f=nlogf)
x$est <- r$est
plkhci(x,nlogf,"beta0")
plkhci(x,nlogf,"beta1")
plkhci(x,nlogf,"gamma")
I am giving you a super long answer, but it will help you see that you can chase down your own error messages (most of the time; sometimes this way of looking at functions will not work). It is good to see what is happening inside a method when it throws a warning, because sometimes it is fine and sometimes you need to fix your data.
This function is REALLY involved! You can look at it by typing dfp at the R command line (NO TRAILING PARENTHESES) and it will print out the whole function.
Seventeen lines from the end, you will see an assignment:
del <- dqstep(x, f, sens = 0.01)
You can see that this calls the function dqstep, which is reflected in your warning.
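The same inspection can be done programmatically. The sketch below uses stats::sd purely as a stand-in (so it runs without Bhat installed); with Bhat loaded you would look at dfp the same way.

```r
# Typing a function's name without parentheses prints its source, e.g. `sd`.
# deparse() returns that same source as a character vector that can be searched.
src <- deparse(stats::sd)

# Find the lines where the function calls another function of interest.
calls_var <- grep("var(", src, fixed = TRUE, value = TRUE)
print(calls_var)
```

Searching the deparsed source for "dqstep(" in dfp would locate the assignment quoted above in the same manner.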
You can see this function by typing dqstep at the R command line again. Reading through this function (also long, but not so tedious), there is this section of boolean logic:
if (r < 0 | is.na(r) | b == 0) {
warning("oops: unable to find stepsize, use default")
cat("problem with ", x$label[i], "\n")
break
}
This is the culprit: it returns the message you are getting. The line right above it spells out how r is calculated. You are feeding this function your default x from the prior function plus a sensitivity value (which I assume dfp generates; it is huge and ugly, so I did not untangle all of it). When the previous nested function returns an r value lower than zero, an r value of NA, or a b value of zero, that message is displayed.
The second error tells you that it was likely b == 0: b is in the denominator, so the result is infinite, and NO STEP SIZE IS RETURNED FROM THIS NESTED FUNCTION to the variable del in dfp.
The step is fed into THIS equation:
h <- logit.hessian(x, f, del, dapprox = FALSE, nfcn)
which you can look into by typing logit.hessian at the R command line.
When you do, you see that del is a step size on a logit scale, with a default value of del=rep(0.002, length(x$est)), which the function set for you because running dqstep returned no value.
So, you now get to decide if using that step size in the calculation of your confidence interval seems right or if there is a problem with your data which needs resolving to make this work better for you.
When I ran it, line by line, I got this message:
Error in if (denom <= 0) { : missing value where TRUE/FALSE needed
at this line of code:
r <- dfp(x,f=nlogf(x))
Which makes me think I was correct.
That is how I chase down issues I have with messages from packages when I get a message like yours.
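If you want to record such warnings while still keeping the result, base R's withCallingHandlers() can do that. In the sketch below, noisy_step is a hypothetical toy function that only mimics the warn-and-fall-back-to-default behaviour; it is not Bhat's code.

```r
# noisy_step is a hypothetical stand-in that warns and falls back to a default.
noisy_step <- function(b) {
  if (b == 0) {
    warning("oops: unable to find stepsize, use default")
    return(0.002)
  }
  1 / b
}

caught <- character(0)
value <- withCallingHandlers(
  noisy_step(0),
  warning = function(w) {
    caught <<- c(caught, conditionMessage(w))  # remember the message
    invokeRestart("muffleWarning")             # let the computation continue
  }
)
print(value)
print(caught)
```

This way the fallback value and the warning text are both available afterwards, so you can decide case by case whether the default is acceptable.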

JAGS - apply function to all parameter nodes

I'm new to JAGS and I'm running a model in R via the R2jags package.
The model code is based on code taken from Kéry & Schaub 2012 ("Bayesian Population Analysis using WinBUGS"), pg 399.
A chi-square discrepancy measure is computed:
model {
....
for(g in 1:G) {
for (t in 1:T) {
...
E[g,t] <- pow((y[g,t] - eval[g,t]),2) / eval[g,t]
...
}#t
}#g
fit <- sum(E[,])
}#model
where g and t are site and time indices, and G and T are the number of sites and the number of years.
I get an error, though:
Error in jags.model(model.file, data = data, inits = init.values, n.chains = n.chains, :
RUNTIME ERROR:
Compilation error on line 140.
Cannot evaluate subset expression for fit
Is it caused by different syntax used by JAGS relative to WinBUGS? The code is the same as in the book, except that I have two dimensions instead of three as in the book example.
To answer the last part of your question, no that error isn't caused by different syntax in JAGS (although the error message might look different in BUGS).
In fact I can't see anything wrong with the code snippet that you have posted, and the following reproducible example shows that it works at least when y and eval are given in data:
m <- 'model {
for(g in 1:G) {
for (t in 1:T) {
E[g,t] <- pow((y[g,t] - eval[g,t]),2) / eval[g,t]
}#t
}#g
fit <- sum(E[,])
#data# G, T, y, eval
#monitor# fit
}#model
'
library('runjags')
G=T <- 10
y <- matrix(rnorm(100), nrow=G, ncol=T)
eval <- matrix(rnorm(100), nrow=G, ncol=T)
results <- run.jags(m)
Have you verified what line 140 refers to? Either line 140 is something that you haven't shown, or maybe you have specified either fit or E somewhere else in the model with a different number of dimensions?
If this isn't the case and you still get an error then please add a minimal reproducible example to your question that shows the problem (preferably underneath an ---EDIT--- line below what you have already written) and we can try to help with that.
Matt
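One way to cross-check the discrepancy measure outside JAGS is to compute it directly in R and compare dimensions. The numbers below are invented for illustration, and T_years is a hypothetical name used instead of T to avoid masking R's shorthand for TRUE.

```r
# Chi-square discrepancy computed in plain R (invented numbers, illustration only).
G <- 3
T_years <- 4   # hypothetical name; avoids masking R's T (shorthand for TRUE)
y <- matrix(c(2, 4, 6, 8, 1, 3, 5, 7, 9, 2, 4, 6), nrow = G, ncol = T_years)
expected <- matrix(5, nrow = G, ncol = T_years)   # plays the role of eval[g,t]

# Vectorised version of E[g,t] <- pow((y[g,t] - eval[g,t]),2) / eval[g,t]
E <- (y - expected)^2 / expected
fit <- sum(E)

# Same thing written as the double loop from the model, for comparison.
E_loop <- matrix(NA_real_, nrow = G, ncol = T_years)
for (g in seq_len(G)) {
  for (t in seq_len(T_years)) {
    E_loop[g, t] <- (y[g, t] - expected[g, t])^2 / expected[g, t]
  }
}
```

If E has a different number of dimensions anywhere else in the model, the sum over E[,] cannot be resolved, which is the kind of mismatch worth looking for.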

Invalid length error for beginner

I have checked previous questions but couldn't find an answer.
columnmean<-function(y){
n<-ncol(y)
means<-numeric(n)
for(i in 1:n){
means[i]<-mean(y[,i])
}
means}
I simply cannot understand the error, even though the code seems right. Also, I get a dimension error if I put the value of n directly into this line:
means[i]<-mean(y[,i])
Here is a reproduction of the error:
columnmean<-function(y){
n <- ncol(y)
means <- numeric(n)
for(i in 1:n) {
means[i] <- mean(y[,i])
}
means
}
columnmean(1:10)
If y is a vector, the result of ncol(y) is NULL. The next statement in your function, means <- numeric(n), then raises the error, because NULL is not a valid length.
colMeans(1:10) will also cause an error (a different one, because of better internal checking of the argument).
So your code is correct for two-dimensional data, e.g.:
columnmean(BOD)
# [1] 3.666667 14.833333
The error depends on y: if y has only one dimension, i.e. it is a vector, you get the error.
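If you want the function to accept vectors as well, one possible sketch is to promote a vector to a one-column matrix first. Treating a vector as a single column is an assumption about what you want, not the only reasonable choice.

```r
# Variant that also accepts a plain vector by treating it as one column
# (that interpretation is an assumption, not the only sensible one).
columnmean <- function(y) {
  if (is.null(dim(y))) {
    y <- matrix(y, ncol = 1)
  }
  n <- ncol(y)
  means <- numeric(n)
  for (i in seq_len(n)) {   # seq_len() is also safe when n is 0
    means[i] <- mean(y[, i])
  }
  means
}

print(columnmean(1:10))
print(columnmean(BOD))
```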
