I am trying to calculate Sobol' indices using the sensobol package in R.
First I create the Sobol' matrices:
input.df <- sobol_matrices(N = 50, params = c('x1','x2','x3','x4','x5'))
Then I calculate 'y' from this matrix:
y <- output.df$MY
Then I try to calculate the Sobol' indices as follows:
MY.sobol <- sobol_indices(matrices = input.df, Y = y, N = 50, params = c('x1','x2','x3','x4','x5'))
But RStudio returned the following error:
Error in sobol_boot(d = d, N = N, params = params, first = first, total = total, :
object 'Y_A' not found
I am wondering what is causing this error.
Thank you.
I am currently using the R package ParBayesianOptimization to tune parameters for ML methods. While searching for an optimal cost parameter for the svmLinear2 model (contained in caret), the optimization stopped with a sudden error after successfully completing 15 iterations.
Here is the error traceback:
Error in rbindlist(l, use.names, fill, idcol) :
Item 2 has 9 columns, inconsistent with item 1 which has 10 columns. To fill missing columns use fill=TRUE.
7. rbindlist(l, use.names, fill, idcol)
6. rbind(deparse.level, ...)
5. rbind(scoreSummary, data.table(Epoch = rep(Epoch, nrow(NewResults)),
       Iteration = 1:nrow(NewResults) + nrow(scoreSummary), inBounds = rep(TRUE,
       nrow(NewResults)), NewResults))
4. addIterations(optObj, otherHalting = otherHalting, iters.n = iters.n,
       iters.k = iters.k, parallel = parallel, plotProgress = plotProgress,
       errorHandling = errorHandling, saveFile = saveFile, verbose = verbose,
       ...)
3. ParBayesianOptimization::bayesOpt(FUN = ...
So somehow the data tables storing the summary information each iteration suddenly differ in the number of columns present. Is this a common bug with the ParBayesianOptimization package? Has anyone else encountered a similar problem? Did you find a fix - other than rewriting the addIterations function to fill the missing columns?
EDIT: I don't have an explanation for why the error suddenly occurs after a number of successful iterations. However, the issue has recurred with svmLinear and svmRadial. I was able to reproduce the same error on the iris dataset:
library(data.table)
library(caret)
library(ParBayesianOptimization)
set.seed(1234)
bayes.opt.bounds = list()
bayes.opt.bounds[["svmRadial"]] = list(C = c(0,1000),
                                       sigma = c(0,500))
svmRadScore = function(...){
  grid = data.frame(...)
  mod = caret::train(Species~., data=iris, method = "svmRadial",
                     trControl = trainControl(method = "repeatedcv",
                                              number = 7, repeats = 5),
                     tuneGrid = grid)
  return(list(Score = caret::getTrainPerf(mod)[, "TrainAccuracy"], Pred = 0))
}
bayes.create.grid.par = function(bounds, n = 10){
  grid = data.table()
  params = names(bounds)
  grid[, c(params) := lapply(bounds, FUN = function(minMax){
    return(runif(n, minMax[1], minMax[2]))}
  )]
  return(grid)
}
prior.grid.rad = bayes.create.grid.par(bayes.opt.bounds[["svmRadial"]])
svmRadOpt = ParBayesianOptimization::bayesOpt(FUN = svmRadScore,
                                              bounds = bayes.opt.bounds[["svmRadial"]],
                                              initGrid = prior.grid.rad,
                                              iters.n = 100,
                                              acq = "ucb", kappa = 1, parallel = FALSE, plotProgress = TRUE)
Using this example, the error occurred on the 9th epoch.
Thanks!
It appears that the scoring function returned NAs in place of accuracy measures, which led to the error further downstream. The library's creator has described this at
https://github.com/AnotherSamWilson/ParBayesianOptimization/issues/33.
It looks like the SVM is trying a cost of 0 during the 9th iteration. Given the problem statement the SVM is solving, the cost parameter should probably be positive.
According to AnotherSamWilson, this error may commonly occur when the scoring function "returns something unexpected".
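One way to guard against this is to wrap the scoring function so that it always hands back a single finite Score. The following is only a minimal sketch in base R: `safe_score`, `fallback` and `toy_score` are hypothetical names that are not part of ParBayesianOptimization, and the toy scorer merely imitates a model fit that yields NA when the cost is 0.

```r
# Wrap a scoring function so that it always returns one finite Score.
# `safe_score` and `fallback` are hypothetical names used for illustration.
safe_score <- function(score_fun, fallback = 0) {
  function(...) {
    # Turn a hard failure of the scorer into an NA score.
    result <- tryCatch(
      score_fun(...),
      error = function(e) list(Score = NA_real_)
    )
    score <- result$Score
    # Replace missing, non-finite or malformed scores with the fallback value.
    if (length(score) != 1 || !is.finite(score)) {
      result$Score <- fallback
    }
    result
  }
}

# Toy stand-in for svmRadScore: returns NA when C is 0, as a failed fit might.
toy_score <- function(C, sigma) {
  acc <- if (C <= 0) NA_real_ else 1 / (1 + abs(C - 10) + sigma)
  list(Score = acc, Pred = 0)
}

guarded <- safe_score(toy_score)
bad  <- guarded(C = 0, sigma = 1)    # Score replaced by the fallback
good <- guarded(C = 10, sigma = 0)   # Score passed through unchanged
```

In the iris example one could then try `FUN = safe_score(svmRadScore)`, and probably also raise the lower bounds slightly above zero (for instance `C = c(0.01, 1000)`; the exact value is an assumption) so that a cost of 0 is never proposed in the first place.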
I'm trying to specify a cluster variable after plm using vcovCR() from the clubSandwich package for my simulated data (which I use for power simulation), but I get the following error message:
"Error in `[.data.frame`(eval(mf$data, envir), , index_names) : undefined columns selected"
I'm not sure if this is specific to vcovCR() or something general about R, but could anyone tell me what's wrong with my code? (I saw a related post, "How to cluster standard errors of plm at different level rather than id or time?", but it didn't solve my problem.)
My code:
N <- 100;id <- 1:N;id <- c(id,id);gid <- 1:(N/2);
gid <- c(gid,gid,gid,gid);T <- rep(0,N);T = c(T,T+1)
a <- qnorm(runif(N),mean=0,sd=0.005)
gp <- qnorm(runif(N/2),mean=0,sd=0.0005)
u <- qnorm(runif(N*2),mean=0,sd=0.05)
a <- c(a,a);gp = c(gp,gp,gp,gp)
Ylatent <- -0.05*T + a + u
Data <- data.frame(
Y = ifelse(Ylatent > 0, 1, 0),
id = id,gid = gid,T = T
)
library(clubSandwich)
library(plm)
fe.fit <- plm(formula = Y ~ T, data = Data, model = "within", index = "id",effect = "individual", singular.ok = FALSE)
vcovCR(fe.fit,cluster=Data$id,type = "CR2") # doesn't work, but I can run this by not specifying cluster as in the next line
vcovCR(fe.fit,type = "CR2")
vcovCR(fe.fit,cluster=Data$gid,type = "CR2") # I ultimately want to run this
Make your data a pdata.frame first. This is safer, especially if you want to have the time index created automatically (seems to be the case looking at your code).
Continuing what you have:
pData <- pdata.frame(Data, index = "id") # time index is created automatically
fe.fit2 <- plm(formula = Y ~ T, data = pData, model = "within", effect = "individual")
vcovCR(fe.fit2, cluster=Data$id,type = "CR2")
vcovCR(fe.fit2, type = "CR2")
vcovCR(fe.fit2,cluster=Data$gid,type = "CR2")
Your example does not work due to a bug in clubSandwich's data extraction function get_index_order (from version 0.3.3) for plm objects. It assumes both index variables are in the original data but this is not the case in your example where the time index is created automatically by only specifying the individual dimension by the index argument.
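If the problem is indeed that the time index is missing from the original data, another route that should sidestep it is to create that column yourself before fitting. This is a minimal sketch on a small made-up panel; the column name `time`, the object name `fe.fit3` and the data values are assumptions, and the plm/vcovCR calls are shown only as comments because they were not run here.

```r
# Small made-up panel: 4 individuals observed twice, stacked like Data in the question.
Data <- data.frame(
  id = rep(1:4, times = 2),
  T  = rep(c(0, 1), each = 4),
  Y  = c(0, 1, 0, 1, 1, 0, 1, 1)
)

# Number the observations within each individual to get an explicit time index.
Data$time <- ave(seq_along(Data$id), Data$id, FUN = seq_along)

# Both index columns now exist in Data, so a fit could name them explicitly, e.g.
#   fe.fit3 <- plm(Y ~ T, data = Data, model = "within", index = c("id", "time"))
#   vcovCR(fe.fit3, cluster = Data$id, type = "CR2")
# (not run here; assumes plm and clubSandwich are loaded)
```

Converting to a pdata.frame as shown above remains the simpler option when you do not need the time column for anything else.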
I am using the max-min Markov blanket (MMMB) algorithm from the MXM package for variable selection in R. Here is my code:
library(MXM)
dataset = read.table('data.txt', na.string = c("", "NA"), sep = '\t', header = FALSE)
dataset = dataset[, colSums(is.na(dataset)) == 0]
D = as.matrix(as.data.frame(lapply(dataset, as.numeric)))
target = read.table('class_num.txt')
target = c(target)
aa = mmmb(target, D, max_k = 3, threshold = 0.05, test = "testIndFisher", user_test = NULL, robust = FALSE, ncores = 2)
I am getting the following error:
Error in unique(as.numeric(target)) :
(list) object cannot be coerced to type 'double'
According to the mmmb manual page, my dataset D should be a matrix of continuous values (here of dimension 95933 x 85) and my target a vector of 0/1 values of length 95933.
Can someone help me understand the error?
Got the solution:
The target is a list instead of an array. The following line solved the issue:
target = array(as.numeric(unlist(target)))
Thanks!
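For anyone hitting the same message, the behaviour can be reproduced in base R without MXM. This is a small sketch with made-up values; `target_df` stands in for the result of `read.table('class_num.txt')`.

```r
# Stand-in for read.table('class_num.txt'): a one-column data frame (values are made up).
target_df <- data.frame(V1 = c(0, 1, 1, 0))

# c() on a data frame drops the data.frame class and leaves a plain list of columns.
target_list <- c(target_df)
is.list(target_list)      # TRUE
is.numeric(target_list)   # FALSE

# as.numeric() cannot coerce a list whose element holds more than one value.
coercion <- tryCatch(as.numeric(target_list), error = function(e) "error")

# Two equivalent ways to get a plain numeric vector instead.
target_vec <- as.numeric(unlist(target_list))
target_col <- target_df[[1]]
```

Extracting the column directly with `[[1]]` avoids creating the list in the first place.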
I am trying to make predictions using knn.reg() from the FNN package, but I'm encountering an unusual error. When y gets passed as a data frame to knn.reg() and I try to predict using the 2 nearest neighbors, I get the following error message:
Error in as.matrix(x)[i] : subscript out of bounds
However, when y is a data frame and k is any number other than 2, the function works. I've figured out that passing y as a vector works for k = 2 (and also produces the same predictions as when y is a data frame), but I'm not sure why the error keeps popping up when y is a data frame.
Code sample here:
x = 1:10
y = 10:1
df = data.frame(x, y)
k1vec = FNN::knn.reg(train = df['x'], test = df['x'], y = df$y, k = 1)$pred
k1df = FNN::knn.reg(train = df['x'], test = df['x'], y = df['y'], k = 1)$pred
identical(k1vec, k1df)
[1] TRUE
k2vec = FNN::knn.reg(train = df['x'], test = df['x'], y = df$y, k = 2)$pred
k2df = FNN::knn.reg(train = df['x'], test = df['x'], y = df['y'], k = 2)$pred
Error in as.matrix(x)[i] : subscript out of bounds
k3vec = FNN::knn.reg(train = df['x'], test = df['x'], y = df$y, k = 3)$pred
k3df = FNN::knn.reg(train = df['x'], test = df['x'], y = df['y'], k = 3)$pred
identical(k3vec, k3df)
[1] TRUE
@rossdrucker9 I ran into the same error, but only when I put knn.reg inside a for loop. The error appeared on the second iteration. I spent half a day on FNN::knn.reg to find the optimal k because it gave me better results than caret's knn. However, knn from the caret package gave no out-of-bounds error.
I know this isn't the exact same problem, but I ended up here when searching and trying to resolve. I was using FNN::knn.reg and I got the following error message (this is only the start of the error message, not the entire message):
Error: Subscript `Z$nn.index` is a matrix, it must be of type logical.
Then I ran:
train <- as.data.frame(train)
y <- as.data.frame(y)
and then I once again ran knn.reg:
knn1 <- knn.reg(train, test, y, k = 12)
and it worked!
(I tried to comment instead of answer but the system won't allow it).
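A possible explanation for why only k = 2 fails: both error messages above suggest that knn.reg indexes y with the matrix of neighbour indices (one column per neighbour). That is an assumption about FNN's internals, but the indexing behaviour itself can be shown in base R. The index matrices below are made up for illustration.

```r
y_df  <- data.frame(y = 10:1)   # same shape as df['y'] in the question
y_vec <- 10:1                   # same as df$y

# Hypothetical neighbour-index matrices: one row per test point, one column per neighbour.
idx_k2 <- cbind(1:3, 2:4)        # k = 2
idx_k3 <- cbind(1:3, 2:4, 3:5)   # k = 3

# Two columns: each row is read as a (row, column) pair, and y_df has only one column.
k2_df <- tryCatch(y_df[idx_k2], error = function(e) conditionMessage(e))

# Three columns: the matrix is used as plain positions, so the lookup succeeds.
k3_df <- y_df[idx_k3]

# A plain vector never reads the matrix as (row, column) pairs.
k2_vec <- y_vec[idx_k2]
```

Under that assumption, passing `df$y` or `df[['y']]` rather than `df['y']` avoids the problem for every k.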
I am attempting to fit a truncated normal distribution to a dataset of 5000 claim sizes using maximum likelihood:
l1 = function(theta)
{
-sum(dtruncnorm(x=size, a=0, b=Inf, mean = theta[1], sd=theta[2]))
}
mle1=optim(par=c(4,4), fn=l1)
When I run the optim line, however, I get the error:
Error in dtruncnorm(x = size, a = 0, b = Inf, mean = theta[1], sd = theta[2]) :
Argument 's_x' is not a real vector.
I know it has something to do with the size variable but as far as I can tell it is a vector of integers since when I run typeof(size) I get "integer" as the output.
Any help is appreciated!
For some reason the function does not accept the vector as it is passed. Converting each value with as.numeric worked for me:
-sum(sapply(size, function(v){
dtruncnorm(x=as.numeric(v), a=0, b=Inf, mean = theta[1], sd=theta[2])
}))
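The message "not a real vector" is probably about storage type: `size` is stored as integer, while the function seems to expect doubles, so converting the whole vector once with `as.numeric(size)` should work without the `sapply` loop. The sketch below illustrates this with base R only. `log_dtrunc0` is a hypothetical stand-in for a log-scale `dtruncnorm` with a = 0 and b = Inf, the data are simulated, and it sums log densities (the usual negative log-likelihood), whereas the `l1` in the question sums raw densities.

```r
set.seed(1)

# Made-up integer claim sizes from a normal(5, 2) truncated at 0, standing in for `size`.
draws <- rnorm(6000, mean = 5, sd = 2)
size  <- as.integer(round(draws[draws > 0][1:5000]))
typeof(size)  # "integer"

# Log density of a normal truncated to [0, Inf), written with base R only.
log_dtrunc0 <- function(x, mean, sd) {
  dnorm(x, mean, sd, log = TRUE) -
    pnorm(0, mean, sd, lower.tail = FALSE, log.p = TRUE)
}

# Negative log-likelihood; the data are converted to double once, as a whole vector.
nll <- function(theta, x = as.numeric(size)) {
  if (theta[2] <= 0) return(1e10)  # keep the optimiser away from an invalid sd
  -sum(log_dtrunc0(x, mean = theta[1], sd = theta[2]))
}

fit <- optim(par = c(4, 4), fn = nll)
fit$par  # estimates of mean and sd
```

With the truncnorm package the analogous change would be `dtruncnorm(x = as.numeric(size), ...)` inside the sum.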