Imputation using robCompositions gives error message - r

I have a dataset with a rather large proportion of missing values. I'm trying to impute them using robCompositions, but I keep getting the error message: "Error in quantile.default(d, k/length(d)) : missing values and NaN's not allowed if 'na.rm' is FALSE". This does not make sense to me. Why would missing values not be allowed if I'm trying to impute missing values? Here's a small subset of the data and code to reproduce the error:
library(robCompositions)
p <- c(1.000000,2.083333,1.333333,1.166667,4.250000,1.083333,2.083333,1.166667,1.000000,1.000000)
i <- c(1101.25,1675.00,2500.00,1612.50,NA,1750.0,600.00,0.00,1530.00,3158.50)
s <- c(34000,1550,NA,2750,375,1750,30000,20000,NA,NA)
x <- data.frame(p,i,s)
imp <- impCoda(x)

After contacting one of the authors of robCompositions, it became apparent that I need to use irmi() from the package VIM to impute data that are not compositional.
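The fix described above could look roughly like the sketch below. It assumes the VIM package is installed and calls irmi() with its default settings; the guard and the names n_missing and x_imp are illustrative and not from the original post.

```r
# Same data as in the question
p <- c(1.000000, 2.083333, 1.333333, 1.166667, 4.250000,
       1.083333, 2.083333, 1.166667, 1.000000, 1.000000)
i <- c(1101.25, 1675.00, 2500.00, 1612.50, NA, 1750.0, 600.00, 0.00, 1530.00, 3158.50)
s <- c(34000, 1550, NA, 2750, 375, 1750, 30000, 20000, NA, NA)
x <- data.frame(p, i, s)

# Count the missing values per column before imputing
n_missing <- colSums(is.na(x))
print(n_missing)

# irmi() lives in VIM, not in robCompositions.
# Assumption: VIM is installed; irmi() is run with its defaults.
if (requireNamespace("VIM", quietly = TRUE)) {
  x_imp <- VIM::irmi(x)
  # Missing values remaining in the original columns after imputation
  print(colSums(is.na(x_imp[, names(x)])))
}
```

With only ten rows the imputed values should be treated as a smoke test rather than a meaningful result.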

Related

prcomp() function is giving me error "infinite or missing values in 'x'"

Eset is an expression matrix and I'm trying to run a PCA on it, but I keep getting this error. I thought it might be because "exprs" is not numeric, but I checked and it is double. How can I solve this?
Eset<-ExpressionSet(as.matrix(exp))
pData(Eset)<-meta
featureData(Eset) <- as(feat,"AnnotatedDataFrame")
exprs<- Biobase::exprs(Eset)
exprs<-t(exprs)
exprs<- as.numeric(exprs)
type(exprs)
PCA <- prcomp(exprs, scale = FALSE)
I was not expecting this error because I made sure that "exprs" was numeric, but it's still not working.
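One way to investigate this kind of error is sketched below on a small made-up matrix (expr_mat, flat, and keep_cols are hypothetical names, and the data are not from the question). It shows two things: as.numeric() drops the matrix dimensions, and a matrix of type double can still contain NA, NaN, or Inf, which is what the error message refers to. Dropping the affected columns is only one possible remedy, and whether it is appropriate depends on the data.

```r
# Hypothetical 4 samples x 3 features matrix with one missing value
expr_mat <- matrix(
  c(1.2, 3.4, NA,  2.2,
    0.5, 1.1, 4.0, 3.3,
    2.9, 1.7, 0.8, 2.4),
  nrow = 4,
  dimnames = list(paste0("sample", 1:4), paste0("gene", 1:3))
)

# as.numeric() flattens the matrix into a plain vector (dim is lost)
flat <- as.numeric(expr_mat)
print(is.null(dim(flat)))

# typeof() is "double" even though a value is missing
print(typeof(expr_mat))

# Count non-finite entries (NA, NaN, Inf, -Inf) per feature
print(colSums(!is.finite(expr_mat)))

# Keep only the features that are finite for every sample, then run the PCA
keep_cols <- colSums(!is.finite(expr_mat)) == 0
pca <- prcomp(expr_mat[, keep_cols, drop = FALSE], scale. = FALSE)
print(dim(pca$x))
```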

Error when trying to make confusion matrix

cm = table(obs = test[,14], pred)
Error in if (xi > xj) 1L else -1L: missing value where TRUE/FALSE needed
I am trying to output the confusion matrix of my random forest model on the testing data, but I'm getting this error. Any ideas what the issue might be?
Thank you in advance!
The error message tells us that one of the items in test[,14] or pred is missing (NA), and the table() function you are using cannot handle missing values. I expect you can get a confusion matrix by first eliminating the elements of both vectors where either vector is NA.
Note that the table() function you are using does not seem to be the base R table() function. I expect it is part of a package you have loaded.
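A minimal sketch of that suggestion, using made-up vectors (obs, pred, and keep are illustrative names standing in for test[,14] and the model predictions):

```r
# Hypothetical observed and predicted classes, each with one missing value
obs  <- factor(c("a", "b", NA,  "a", "b"), levels = c("a", "b"))
pred <- factor(c("a", "a", "b", NA,  "b"), levels = c("a", "b"))

# Keep only the positions where both vectors are non-missing
keep <- !is.na(obs) & !is.na(pred)

# Build the confusion matrix from the complete pairs
cm <- table(obs = obs[keep], pred = pred[keep])
print(cm)
```

Calling base::table() explicitly would also rule out a masked table() from another package.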

rankFD() returning Error in svd(X) : infinite or missing values in 'x'

I'm conducting an experiment with the following data:
I'm trying to run a non-parametric Anova-Type Statistic test. However, I keep getting this error:
rankFD(Tardiness~Jobs*Stages*ShopCondition*PD*CV, data = Datos)
Error in svd(X) : infinite or missing values in 'x'
rankFD(StdDev~Jobs*Stages*ShopCondition*PD*CV, data = Datos)
Error in svd(X) : infinite or missing values in 'x'
I've already addressed the obvious concern of checking whether I have any missing or infinite values in my data, but everything appears to be in order according to is.na() and is.infinite() (both return FALSE, meaning no values in the data are NA or infinite).
If needed, here is the data set used in the experiment:
Data sample
Can someone help me find where the mistake is?
Any help is appreciated!
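The check described in the question can be made column by column, as in the sketch below. It uses a made-up data frame (Datos_demo) and a hypothetical helper (count_bad), not the asker's data. It only confirms whether the input columns are clean; it does not by itself explain an error raised inside svd().

```r
# Hypothetical stand-in for the experiment data
Datos_demo <- data.frame(
  Tardiness = c(1.5, NaN, 3.2, Inf),
  Jobs      = factor(c("10", "20", NA, "20")),
  StdDev    = c(0.1, 0.2, 0.3, 0.4)
)

# Count problematic entries per column:
# non-finite values for numeric columns, NA for everything else
count_bad <- function(df) {
  vapply(df, function(col) {
    if (is.numeric(col)) sum(!is.finite(col)) else sum(is.na(col))
  }, integer(1))
}

bad <- count_bad(Datos_demo)
print(bad)
```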

K-Means clustering in R error NA/NaN/Inf in foreign function call

I have a dataset that I have created in R. It is structured as follows:
I am trying to cluster the observations using k-means. However, I get the following error message:
> cl <- kmeans(sample, 3)
Error in do_one(nmeth) : NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning message:
In storage.mode(x) <- "double" : NAs introduced by coercion
What does this mean? Am I preprocessing the data incorrectly? What can I do to fix it?
The documentation of kmeans (run ?kmeans in the console to see it) stipulates that the argument x has to be:
numeric matrix of data, or an object that can be coerced to such a matrix (such as a numeric vector or a data frame with all numeric columns).
Here, the first row is what prevents the data from being used by kmeans. I believe your first row is supposed to hold your column names.
Moreover, you can't cluster on your second column, genre, because it is character, and I believe the first column should not be used either. Am I right?
So, if your dataset is called samples, try:
colnames(samples) <- samples[1,]
samples_cluster <- samples[-1,3:ncol(samples)]
cl <- kmeans(samples_cluster,3)
Does that answer your question?
If not, can you provide a reproducible example of your dataset so that we can check the data frame used for kmeans clustering? To do this, please see: How to make a great R reproducible example
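The steps in this answer can be tried end to end on a small made-up data frame, as sketched below. The layout of samples (a header stored in the first row, an id column, a character genre column) is an assumption based on the answer's description, since the original data is not shown. The explicit conversion to numeric is an addition that makes any coercion problem visible before kmeans runs.

```r
# Hypothetical data frame whose first row holds the real column names
samples <- data.frame(
  V1 = c("id", "1", "2", "3", "4", "5", "6"),
  V2 = c("genre", "rock", "pop", "rock", "jazz", "pop", "jazz"),
  V3 = c("x", "1.0", "1.2", "5.0", "5.1", "9.0", "9.3"),
  V4 = c("y", "2.0", "2.1", "6.0", "6.2", "1.0", "1.1"),
  stringsAsFactors = FALSE
)

# Promote the first row to column names
colnames(samples) <- as.character(unlist(samples[1, ]))

# Drop the header row and the two non-numeric leading columns
samples_cluster <- samples[-1, 3:ncol(samples)]

# Convert the remaining columns from character to numeric
samples_cluster[] <- lapply(samples_cluster, as.numeric)

# Confirm nothing turned into NA during the conversion
print(anyNA(samples_cluster))

set.seed(1)  # kmeans picks random starting centers
cl <- kmeans(samples_cluster, centers = 3)
print(cl$cluster)
```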

"Error in 1:ncol(x) : argument of length 0" when using Amelia in R

I am working with panel data. I have well over 6,000 country-year observations, and have specified my Amelia imputation as follows:
(CountDependentVariable, m=5, ts="year", cs="cowcode",
sqrts=c("OtherCountVariable2", "OtherCount3", "OtherCount4"),
ords=c("OrdinalVar1", "Ordinal Variable 2"),
lgstc=c("ProportionVariale"),
noms=c("NominalVar1"),p2s = 0, idvars = c("country"))
When I run those lines of code, I continue to receive the following error:
Error in 1:ncol(x) : argument of length 0
I've seen people get a similar error, but in different contexts. Importantly, there are several continuous independent variables I left out of the Amelia code, because I am under the impression that they get imputed without my having to list them. Does anyone know:
1) What this error means?
2) How to correct this error?
Update #1: Provided more context, in terms of the types of variables in my count panel data, in the above sample code.
Update #2: I did some research, and ran into an R file containing a function that diagnoses possible errors for Amelia code. After running the code, I got the following error message first (and many more thereafter):
AMn<-nrow(x)
Error in nrow(x) : object 'x' not found
AMp<-ncol(x)
Error in ncol(x) : object 'x' not found
subbedout<-c(idvars,cs,ts)
Error: object 'idvars' not found
Error Code: 4
if (any(colSums(!is.na(x)) <= 1)) {
  all.miss <- colnames(x)[colSums(!is.na(x)) <= 1]
  if (is.null(all.miss)) {
    all.miss <- which(colSums(!is.na(x)) <= 1)
  }
  all.miss <- paste(all.miss, collapse = ", ")
  error.code<-4
  error.mess<-paste("The data has a column that is completely missing or only has one,observation. Remove these columns:", all.miss)
  return(list(code=error.code,mess=error.mess))
}
Error in is.data.frame(x) : object 'x' not found
Error codes: 5-6
Errors in one of the list variables
idout<-listcheck(idvars,"One of the 'idvars'")
Error in identical(vars, NULL) : object 'idvars' not found
Currently, there are no missing values for the country variable I place in the idvars argument. However, the very first "chunk" of errors wants me to believe that this is so.
Am I not properly specifying the Amelia code I have above?
I had forgotten to specify the data frame in the original Amelia code (slaps hand on forehead). So now, after resolving the wacky issue above, I am getting the following error from Amelia:
Amelia Error Code: 44
One of the variable names in the options list does not match a variable name in the data.
I've checked the variable names, and they match, verbatim, to what I named them in the dataframe.
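One way to double-check the names programmatically is sketched below. The data frame panel and its columns are made up and this does not call Amelia. It compares the names passed in the options against names() of the data. It also illustrates one possible source of a mismatch: data.frame() (like read.csv()) rewrites names containing spaces unless check.names = FALSE is set. Whether that applies to the asker's data is an assumption.

```r
# Hypothetical panel data; note the column name containing spaces
panel <- data.frame(
  year = c(2000, 2001, 2002),
  cowcode = c(2, 2, 2),
  OrdinalVar1 = c(1, 2, 3),
  `Ordinal Variable 2` = c(3, 2, 1)
)

# With the default check.names = TRUE the spaces were replaced by dots
print(names(panel))

# Names as they would be passed in the ts, cs, and ords options
option_vars <- c("year", "cowcode", "OrdinalVar1", "Ordinal Variable 2")

# Any name listed here does not exist in the data as written
unmatched <- setdiff(option_vars, names(panel))
print(unmatched)
```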
