R: pwfdtest (plm package) produces error message

I used a fixed-effects and a first-difference estimation. To decide which is more efficient, Wooldridge proposes a specific test that is implemented in the plm package via the following function:
pwfdtest(Y~X1+X2+..., data=...)
However, running this results in an error message for me stating:
> pwfdtest(DepVar~ExplVar1+ExplVar2, data = data)
Error in `$<-.data.frame`(`*tmp*`, "FDres", value = c(-1.18517291896221, :
replacement has 521293 rows, data has 621829
In addition: Warning message:
In id - c(NA, id[1:(N - 1)]) :
longer object length is not a multiple of shorter object length
I tried to look up whether anyone had experienced this error before posting, but I couldn't find an answer.
Sometimes I came across people asking for a minimal working example, but mine is not working at all. However, the example from the plm package does work.
Please note that this is my first research project as well as the first time I have used R, so bear with me.
Best wishes
Alex
EDIT:
I read that the traceback() function might be somewhat useful. However, it mostly just spat out a long series of numbers, of which I cannot even reach the top (?). Anyway, the last lines of these numbers are:
-1.65868856541809, 2.89084861854684, -1.68650260853188, 0.655681663187397,
-0.677329685017227, 0.993684102310348, 1.33441048058398, -2.0526651614649,
-1.64392358708552, 2.58673448155514, 0.952616064091869, -0.909754051474562,
0.815593306056627, -0.0542364686765445, 0.0184515528912868))
2: pwfdtest.panelmodel(fd1)
1: pwfdtest(fd1)
EDIT2:
My first guess was that NAs might be the trouble, so I reduced my panel to only the dependent variable and one explanatory variable. Beforehand, I checked whether there were any NAs, and there were none. Yet I get a similar error message:
Error in `$<-.data.frame`(`*tmp*`, FDres, value = c(-1.18517291896221, :
replacement has 521293 rows, data has 621829
In addition: Warning message:
In id - c(NA, id[1:(N - 1)]) :
longer object length is not a multiple of shorter object length
EDIT3:
I think I might have found the problem: an unbalanced panel. And it makes some sense, I guess... Yet there does not seem to be a solution for it in the traditional sense; it simply does not work.
So if anyone is interested what I did:
I further reduced my panel to only 300 individuals and fewer years. I named the individuals 1-300 and, drumroll, it worked. However, after changing some of the individuals' names to, for example, 555 or 556, I got the same error as before.
I am not very proficient with these things, but my uneducated guess is that the test simply does not work on unbalanced panels.
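That guess can at least be checked directly. Here is a minimal base-R sketch of a balance check before calling pwfdtest(); the id and year column names are made up for illustration (if I remember the plm API correctly, it also ships helpers such as is.pbalanced() and make.pbalanced() for this):

```r
# Sketch: check whether a panel is balanced before running pwfdtest().
# 'id' and 'year' are hypothetical column names; adjust to your data.
panel <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2, 3, 3),  # individual 3 has only 2 periods
  year = c(2000, 2001, 2002, 2000, 2001, 2002, 2000, 2001)
)

obs_per_id  <- table(panel$id)                 # observations per individual
is_balanced <- length(unique(obs_per_id)) == 1 # balanced iff all counts equal
is_balanced                                    # FALSE: the panel is unbalanced
```

If this returns FALSE, dropping incomplete individuals (or using make.pbalanced(), assuming your plm version has it) would make the panel balanced before testing.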

Related

SuperLearner Error in R - Object 'All' not found

I am trying to fit a model with the SuperLearner package. However, I can't even get past the stage of playing with the package to get comfortable with it...
I use the following code:
superlearner<-SuperLearner::SuperLearner(Y=y, X=as.data.frame(data_train[1:30]), family =binomial(), SL.library = list("SL.glmnet"), obsWeights = weights)
y is a numeric vector with as many entries as my data frame "data_train" has rows, containing the correct labels with 9 different classes. The data frame "data_train" contains 30 columns of numeric data.
When I run this, I get the error:
Error in get(library$screenAlgorithm[s], envir = env) :
Objekt 'All' not found
I don't really know what the problem could be, and I can't really wrap my head around the source code. Please note that the obsWeights argument contains a numeric vector of the same length as my data, with weights I calculated for the model. This shouldn't be the problem, as it doesn't work either way.
Unfortunately I can't really share my data on here, but maybe someone has had this error before...
Thanks!
This seems to happen if you do not attach SuperLearner; you can fix it via library(SuperLearner).
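The failure mode itself can be reproduced in base R without SuperLearner: the error comes from a get() lookup of a name (here the default screening algorithm "All") in an environment where that name does not exist, which is what happens when the package is not on the search path. A minimal sketch of that mechanism:

```r
# Minimal repro of the lookup mechanism behind the error: get() searches
# the given environment for a name and errors if it is absent.
env <- new.env(parent = emptyenv())  # an environment with nothing in it
res <- tryCatch(
  get("All", envir = env),
  error = function(e) conditionMessage(e)
)
res  # "object 'All' not found"
```

Attaching the package with library(SuperLearner) puts its screening functions (including All) somewhere get() can find them, which is presumably why the one-line fix works.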

ImpulseDE2, matrix counts contains non-integer elements

Possibly it's a stupid question (but be patient, I'm a beginner in R's world)... I'm working with ImpulseDE2, a package designed for RNA-seq data analysis across different time points (see the article for more information).
The main function (runImpulseDE2) requires a count matrix and an annotation data frame. I've created both, but this error message appears:
Error in checkCounts(matCountData, "matCountData"): ERROR: matCountData contains non-integer elements. Requires count data.
I have tried several things and nothing seems to work (and I've not found any solution on the Internet):
as.matrix(data)
(data + 1), and there are no NAs or zero values that could cause this error (both which(is.na(data)) and which(data < 1) return integer(0))
as.numeric(data), which produces another error: ERROR: [Rownames of matCountData] was not given as input.
I think there's something I'm not seeing, but I'm totally stuck. Every tip is welcome!
And here is the (silly) solution! The function does not seem to accept floating-point numbers, so applying a simple round() is enough to resolve the error.
Thanks for your help!
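A small sketch of what that fix does: judging by the error message, the package checks that every element is integer-valued (not necessarily of integer type), and round() makes near-integer floats pass such a check. The matrix values here are made up for illustration:

```r
# Sketch: a "count" matrix that fails an integer-valuedness check,
# fixed with round(). Values are made up for illustration.
mat <- matrix(c(3.0000001, 5, 7.9999999, 2), nrow = 2)

all(mat == round(mat))   # FALSE: non-integer elements present

mat_int <- round(mat)
all(mat_int == round(mat_int))  # TRUE: all elements are whole numbers now
```

Note that rounding is only appropriate if the non-integer values are numerical noise on genuine counts; normalized or transformed expression values should not simply be rounded back into "counts".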

Factor levels and zero variances

I'm encountering this error when running Naive Bayes with the klaR package.
I would share data in order to make this reproducible, but I have some constraints on doing so, and since I'm unsure of what's going on, I'm unable to create a dataset that recreates the problem myself. I'm hoping someone who reads this may have encountered and overcome this error before.
Here is the error:
Error in if (any(temp)) stop("Zero variances for at least one class in variables: ", :
missing value where TRUE/FALSE needed
I found some posts online on this already:
here and here
From what I can gather, I have some levels with one or zero instances in my data.
The trouble is I cannot find any. I tried this:
sapply(df, function(x) table(x))
to see if any of the returned tables showed a level with zero or one instances, but with nearly 400 dummy variables I cannot spot any; every level I can see has at least several instances.
Is it possible to tell R to highlight which levels are causing the problem? I'm not sure of my next course of action, since I cannot find any levels that might be the culprits.
The problem is in the condition being tested, you can reproduce the error with:
if (NA) {
print("ERROR")
}
You could correct it to anyNA(temp) or any(is.na(temp)).
If the error really concerns the variance message, you could test for it with sapply(df, function(x) length(table(x)) == 1).
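Putting both ideas together, here is a small sketch: first the error pattern itself (an NA reaching if()), then a hypothetical scan for columns that could plausibly cause it, i.e. constant (single-level) variables or columns containing NA. The data frame and column names are made up:

```r
# The error pattern: if() requires a single TRUE/FALSE, so an NA aborts.
bad <- tryCatch(if (NA) "x", error = function(e) conditionMessage(e))
bad  # "missing value where TRUE/FALSE needed"

# Hypothetical scan for problem columns: single-level (zero-variance)
# variables and columns containing NA.
df <- data.frame(
  a = factor(c("x", "x", "x")),  # single level: zero variance in any class
  b = factor(c("x", "y", NA)),   # contains NA
  c = factor(c("x", "y", "x"))   # fine
)
single_level <- sapply(df, function(col) length(unique(na.omit(col))) < 2)
has_na       <- sapply(df, anyNA)
names(df)[single_level | has_na]  # "a" "b"
```

With ~400 dummies, running a scan like this per training fold (not just on the full data) matters, since a level can vanish inside a resampled subset even when the full data looks fine.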

Correctly setting up Shannon's Entropy Calculation in R

I was trying to run some entropy() calculations on force-platform data and I get a warning message:
> library(entropy)
> d2 <- read.csv("c:/users/SLA9DI/Documents/data2.csv")
> entropy(d2$CoPy, method="MM")
[1] 10.98084
> entropy(d2$CoPx, method="MM")
[1] 391.2395
Warning message:
In log(freqs) : NaNs produced
I am sure it is because entropy() is trying to take the log of a negative number. I also know R can handle complex numbers using complex(); however, I have not been successful in getting it to work with my data. I did not get this warning on my CoPy data, only the CoPx data (a force platform records center-of-pressure data in two dimensions). Does anyone have suggestions on getting complex() to work on my data set, or is there another function that would work better for a proper entropy calculation? Entropy shouldn't be that much greater for CoPx than for CoPy. I also tried some more data sets from other subjects, and the same thing happened: the CoPx entropy measures gave me warning messages and the CoPy measurements did not. I am attaching a link to a data set so anyone can try it out and see if they can figure it out, as the data is a little too long to post here.
Data
Edit: correct answer
As suggested, I tried the table(...) function and received no warning/error message, and the entropy output was in the expected range as well. However, I had apparently overlooked a function in the package, discretize(), which is what you are supposed to use to correctly set up the data for an entropy calculation.
I think there's no point in applying the entropy function on your data. According to ?entropy, it
estimates the Shannon entropy H of the random variable Y from the corresponding observed counts y
(emphasis mine). This means that you need to convert your data (which seems to be continuous) to count data first, for instance by binning it.
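That binning step can be sketched in base R with cut() and table() (the entropy package's discretize(), which the asker ultimately used, does essentially this), followed by the Shannon entropy formula applied by hand. The data here is simulated:

```r
# Sketch: bin continuous data into counts, then compute Shannon entropy
# by hand. Negative values are no problem once the data is binned.
set.seed(1)
x <- rnorm(1000)                       # continuous data, including negatives

counts <- table(cut(x, breaks = 10))   # discretize into 10 bins
p <- counts / sum(counts)              # relative frequencies
p <- p[p > 0]                          # drop empty bins to avoid log(0)
H <- -sum(p * log(p))                  # Shannon entropy in nats
H
```

With 10 bins, H is bounded above by log(10), so the result lands in a sensible range; the original warning came from feeding raw (negative) measurements into log() as if they were counts.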

R Error in `row.names<-.data.frame`(`*tmp*`, value = value) while using tell of the sensitivity package

I am conducting a sensitivity study using the sensitivity package. When trying to calculate the sensitivity indices with the output data of the external model, I get the error specified in the title.
The output is a three column table stored in a csv file which I read in as follows:
day1 <- read.csv("day_1_outputs.csv",header=FALSE)
Now, when I try to calculate sensitivity indices with the output of the first column:
tell(sob.pars,day1[,1])
I get:
Error in `row.names<-.data.frame`(`*tmp*`, value = value) :
invalid 'row.names' length
At first I thought I should use a matrix-like object, because in another study I conducted I generated the output from a raster image read in as a matrix, which worked fine; but that didn't help.
The help page for tell says to use a vector for the model results, but even if I store the column of the data frame in a vector before calling tell, the problem persists.
I guess my main problem is that I don't understand the error message in conjunction with the tell function; sob.pars is a list returned by the sensitivity analysis object constructors from the same package, so I don't know which row names of that object the message is referring to.
Any hint is appreciated.
Finally found out what the problem was. The error is kind of misleading.
The problem was not the row names, since these were identical; that's what confused me in the first place. There was obviously nothing wrong with them.
The actual problem was the column names in sob.pars: these were missing. Once I added them, everything worked fine. Thanks rawr anyway (I only just noticed someone had commented on the question; I thought I would be notified when this happens, but apparently not).
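For what it's worth, the raw error text itself can be reproduced in base R, which shows it is simply a length mismatch when `row.names<-` is called on a data frame; the asker's real fix was the missing column names in sob.pars, which presumably caused such a mismatch internally:

```r
# Minimal repro of the raw error: assigning row names whose length
# does not match the number of rows of the data frame.
df <- data.frame(a = 1:3)
msg <- tryCatch(
  rownames(df) <- c("only_one"),   # 1 name for 3 rows
  error = function(e) conditionMessage(e)
)
msg  # "invalid 'row.names' length"
```

Errors raised deep inside replacement functions like this often surface with a `*tmp*` in the call, which is why they can look unrelated to the code you actually wrote.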
