factors levels and zero variances - r

I'm encountering this error when running Naive Bayes with the klaR package.
I would like to share data so the problem can be replicated, but I have constraints on doing so, and since I'm unsure what is going on I am unable to create a dataset that recreates it myself. I'm hoping someone who reads this has encountered and overcome this error before.
Here is the error:
Error in if (any(temp)) stop("Zero variances for at least one class in variables: ", :
missing value where TRUE/FALSE needed
I found some posts online on this already:
here and here
From what I can gather I have some levels that have 1 or zero instances within my data.
The trouble is I cannot find any. I tried this:
sapply(df, function(x) table(x))
to see whether any of the returned tables showed a level with zero or one instances, but with nearly 400 dummy variables I cannot see any. As far as I can tell, every variable has at least several instances of both the 0 and the 1 level.
Is it possible to tell R to highlight which levels are causing the problem? I'm not sure of my next course of action since I cannot find any levels that might be culprits.

The problem is in the condition being tested: any(temp) evaluates to NA, and if() cannot branch on NA. You can reproduce the error with:
if (NA) {
  print("ERROR")
}
You could correct it to anyNA(temp) or any(is.na(temp)).
If the error really is about the variance message, you could test for single-valued columns with sapply(df, function(x) length(table(x)) == 1).
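To locate the offending variables among several hundred columns, one option is to compute the standard deviation of each numeric predictor within each class and flag the columns where it is zero or NA for some class. The sketch below uses base R only and assumes that this is what the klaR check guards against; find_problem_columns, toy, and the column names are hypothetical.

```r
# Hypothetical helper: flag numeric columns whose within-class standard
# deviation is zero or NA for at least one class.
find_problem_columns <- function(df, class_col) {
  grouping <- df[[class_col]]
  predictors <- df[setdiff(names(df), class_col)]
  numeric_cols <- names(predictors)[vapply(predictors, is.numeric, logical(1))]
  flagged <- vapply(numeric_cols, function(col) {
    # One standard deviation per class; NA when a class has < 2 usable values.
    sds <- tapply(predictors[[col]], grouping, sd, na.rm = TRUE)
    any(is.na(sds) | sds == 0)
  }, logical(1))
  numeric_cols[flagged]
}

# Toy data (made up for illustration).
toy <- data.frame(
  class = factor(rep(c("a", "b"), each = 3)),
  x1 = c(1, 2, 3, 4, 5, 6),    # varies within both classes
  x2 = c(0, 0, 0, 1, 0, 1),    # constant within class "a"
  x3 = c(1, 2, 3, NA, NA, 5)   # only one usable value in class "b"
)

find_problem_columns(toy, "class")
```

Columns stored as factors are skipped here, since a standard deviation is only defined for numeric predictors; for those, table(df$column, df$class) shows any empty cells.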

Related

Error-The variable must be numeric in cocor

I ran the cocor function to test the difference between two correlations, but I got an error message saying that one of the variables (temporality) must be numeric. I checked the data type of the variable and it is double/numeric. I do not have this issue when calculating partial correlations or confidence intervals on the same data.
cocor(~temporality+expectability|temporality+positive,data =data2)
is.numeric(data2$temporality) # TRUE
data2 is a data set with 5 variables (gender and 4 numeric measures).
So what is the real reason behind the issue? Thank you.
I had the same "The variable 'x' must be numeric." problem with the cocor function. I read somewhere that cocor does not seem to work with tibbles, but it works once the data is converted to a data.frame.
Your call would then look like this:
cocor(~temporality+expectability|temporality+positive, data = as.data.frame(data2))
Finally, I used cocor.indep.groups and cocor.dep.groups.overlap to deal with numeric issues.
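If the same data set is passed to several functions, the conversion can be done once up front. This is a minimal sketch, assuming the tibble class is indeed what trips cocor; as_plain_data_frame and fake_tibble are hypothetical names, and the tibble is simulated by class only so that the example runs without the tibble package.

```r
# Hypothetical helper: return a plain data.frame regardless of the input class.
as_plain_data_frame <- function(data) {
  if (inherits(data, "tbl_df")) {
    data <- as.data.frame(data)
  }
  data
}

# Simulated tibble (class attribute only); values are made up.
fake_tibble <- structure(
  data.frame(temporality = c(1.5, 2.0, 3.5), positive = c(4, 5, 6)),
  class = c("tbl_df", "tbl", "data.frame")
)

plain <- as_plain_data_frame(fake_tibble)
class(plain)                    # no longer carries the tibble classes
is.numeric(plain$temporality)   # the column itself is unchanged
```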

R: pwfdtest (plm package) produces error message

I estimated a fixed effects model and a first-difference model. To decide which is more efficient, Wooldridge proposes a specific test, which the plm package implements in the following function:
pwfdtest(Y~X1+X2+..., data=...)
However, running this results in an error message for me stating:
> pwfdtest(DepVar~ExplVar1+ExplVar2, data = data)
Error in `$<-.data.frame`(`*tmp*`, "FDres", value = c(-1.18517291896221, :
replacement has 521293 rows, data has 621829
In addition: Warning message:
In id - c(NA, id[1:(N - 1)]) :
longer object length is not a multiple of shorter object length
Before posting, I tried to find out whether anyone had experienced this error before, but I couldn't find an answer.
I have often seen people ask for a minimal working example, but mine is not working at all. The example from the plm package, however, does work.
Please note that this is my first research project as well as the first time I have used R, so bear with me.
Best wishes
Alex
EDIT:
I read that the traceback() function might be useful. However, it mostly just printed a long run of numbers, and I cannot even scroll back to the top of it. Anyway,
the last lines of that output are:
-1.65868856541809, 2.89084861854684, -1.68650260853188, 0.655681663187397,
-0.677329685017227, 0.993684102310348, 1.33441048058398, -2.0526651614649,
-1.64392358708552, 2.58673448155514, 0.952616064091869, -0.909754051474562,
0.815593306056627, -0.0542364686765445, 0.0184515528912868))
2: pwfdtest.panelmodel(fd1)
1: pwfdtest(fd1)
EDIT2:
My first guess was that NA values might be causing trouble, so I reduced my panel to only the dependent variable and one explanatory variable. Beforehand, I checked whether there were any NA values, and there were none. Yet I got a similar error message:
Error in `$<-.data.frame`(`*tmp*`, FDres, value = c(-1.18517291896221, :
replacement has 521293 rows, data has 621829
In addition: Warning message:
In id - c(NA, id[1:(N - 1)]) :
longer object length is not a multiple of shorter object length
EDIT3:
I think I might have found the problem: an unbalanced panel. That makes some sense, I guess. Yet there does not seem to be a solution in the usual sense; the test simply does not work.
In case anyone is interested, here is what I did:
I further reduced my panel to only 300 individuals and fewer years. I named the individuals 1-300 and, drumroll, it worked. However, after changing some of the individuals' names to, for example, 555 or 556, it gave me the same error as before.
I am not very proficient with these things, but my uneducated guess is that the test simply does not work on unbalanced panels.
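Whether a panel is balanced can be checked before running the test. The following is a minimal sketch in base R, assuming the individual and time columns are called id and year (both names, and the toy data, are hypothetical). plm also provides is.pbalanced() for the same question.

```r
# Toy panel (made up): individual 2 is missing the year 2002.
panel <- data.frame(
  id   = c(1, 1, 1, 2, 2, 555, 555, 555),
  year = c(2001, 2002, 2003, 2001, 2003, 2001, 2002, 2003)
)

# Cross-tabulate individuals against periods: a balanced panel has
# exactly one observation in every cell.
cells <- table(panel$id, panel$year)
is_balanced <- all(cells == 1)

# Individuals observed in fewer periods than the panel contains.
obs_per_id <- rowSums(cells)
incomplete_ids <- names(obs_per_id)[obs_per_id < ncol(cells)]

is_balanced
incomplete_ids
```

Dropping or completing the individuals listed in incomplete_ids gives a balanced subset on which the test can be tried again.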

Error in asMethod(object), Discretize columns first error with Apriori

The only clue I had for fixing this error was to convert the 3 features to factors. I tried that, but it still doesn't work.
Does the part of the error saying my columns are not logical have anything to do with it? What does "not logical" mean in this case?
So an image of the error and what the data looks like for those columns is included here:
Solved
Found my problem! The code for discretizing my columns created new variables and didn't change the columns in my data set. That is why I kept getting the error.
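The difference between creating a new variable and updating the column in place can be sketched as follows. The example uses base R's cut() rather than any particular discretization function from the question, and the data frame, breaks, and labels are all hypothetical.

```r
# Toy data (made up for illustration).
df <- data.frame(age = c(21, 35, 48, 62), income = c(20, 45, 80, 30))
age_breaks <- c(0, 30, 50, Inf)
age_labels <- c("young", "middle", "older")

# This creates a separate object; df$age itself is left untouched.
age_binned <- cut(df$age, breaks = age_breaks, labels = age_labels)
still_numeric <- is.numeric(df$age)

# Assigning back replaces the columns inside the data frame.
df$age <- cut(df$age, breaks = age_breaks, labels = age_labels)
df$income <- cut(df$income, breaks = c(0, 40, Inf), labels = c("low", "high"))

# Every column is now a factor, which is what the error message asks for.
all_factors <- all(vapply(df, is.factor, logical(1)))
```

Checking str(df) right before the call that fails is a quick way to confirm that the data set itself, and not a stray copy, holds the discretized columns.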

ImpulseDE2, matrix counts contains non-integer elements

This is possibly a stupid question (but be patient, I'm a beginner in R's world)... I'm working with ImpulseDE2, a package designed for RNA-seq data analysis across time points (see the article for more information).
The main function (runImpulseDE2) requires a count matrix and an annotation data frame. I've created both, but I get this error message:
Error in checkCounts(matCountData, "matCountData"): ERROR: matCountData contains non-integer elements. Requires count data.
I have tried some solutions and nothing seems to work (and I have not found any solution on the Internet):
as.matrix(data)
data + 1 (there are no NAs or zero values that could cause this error: which(is.na(data)) and which(data < 1) both return integer(0))
as.numeric(data), which produces another error: ERROR: [Rownames of matCountData] was not given as input.
I think there is something I'm not noticing, but I'm completely stuck. Any tip is welcome!
And here is the (silly) solution! This function does not seem to accept floating-point numbers, so applying a simple round() is enough to solve this error.
Thanks for your help!
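The rounding fix can be sketched in base R as below. The matrix and its gene and sample names are made up; only the round() step comes from the solution above, and the conversion to integer storage is an extra assumption that may not be required. The last line also shows why as.numeric() led to the row-name error: it drops the matrix dimensions and names.

```r
# Toy count matrix with fractional values (made up for illustration).
counts <- matrix(
  c(10.0, 3.7, 0.2, 25.0),
  nrow = 2,
  dimnames = list(c("geneA", "geneB"), c("s1", "s2"))
)

# Detect values that are not whole numbers.
has_non_integer <- any(counts != round(counts))

# Round, then switch to integer storage; dimnames are preserved.
counts_int <- round(counts)
storage.mode(counts_int) <- "integer"

# as.numeric() flattens the matrix, losing row and column names.
flattened <- as.numeric(counts)
is.null(dim(flattened))
```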

R Error in `row.names<-.data.frame`(`*tmp*`, value = value) while using tell of the sensitivity package

I am conducting a sensitivity study using the sensitivity package. When trying to calculate the sensitivity indices from the output data of the external model, I get the error specified in the title.
The output is a three column table stored in a csv file which I read in as follows:
day1 <- read.csv("day_1_outputs.csv",header=FALSE)
Now when I try to calculate sensitivity indices with the output in the first column:
tell(sob.pars,day1[,1])
I get:
Error in `row.names<-.data.frame`(`*tmp*`, value = value) :
invalid 'row.names' length
At first I thought I should use a matrix-like object, because in another study I generated the output from a raster image read in as a matrix, which worked fine, but that didn't help.
The help page for tell says to use a vector for the model results, but even if I store the data frame column in a vector before calling tell, the problem persists.
I guess my main problem is that I don't understand the error message in the context of the tell function. sob.pars is a list returned by one of the sensitivity analysis object constructors from the same package, so I don't know which row names of that object the message is referring to.
Any hint is appreciated.
I finally found out what the problem was. The error is rather misleading.
The problem was not the row names, since these were identical; that is what confused me in the first place. There was obviously nothing wrong with them.
The actual problem was the column names in sob.pars. These were missing. Once I added them, everything worked fine. Thanks anyway, rawr (I only just noticed that someone had commented on the question; I thought I would be notified when this happens, but I guess not).
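A minimal sketch of the fix, assuming the column names were missing because the input samples were built as bare matrices before being handed to the constructor. X1, X2, and the parameter names are hypothetical; the answer above only states that the column names in sob.pars were missing.

```r
# Hypothetical design samples: 5 runs of 3 parameters, built without names.
set.seed(1)
X1 <- matrix(runif(15), nrow = 5)
X2 <- matrix(runif(15), nrow = 5)

# A bare matrix has no column names.
names_missing <- is.null(colnames(X1))

# Name the parameters before building the sensitivity object.
param_names <- c("p1", "p2", "p3")
colnames(X1) <- param_names
colnames(X2) <- param_names
```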
