"Undefined columns selected" error message but head(data) shows names? - r

I am trying to work through some example code from a published paper and its uploaded dataset. The data and code are from the package drcSeedGerm.
I am trying to run this sample code:
library(devtools)
install_github("OnofriAndreaPG/drcSeedGerm")
library(drc)
library(drcSeedGerm)
library(lmtest)
library(sandwich)
data(rape)
head(rape)
That runs without issue. The next example is this:
modHTE<-drm(nSeeds~timeBef+timeAf+Psi,data=rape,fct=HTE1(),type="event")
colnames(rape)
I get the error message:
Error in `[.data.frame`(temp, , 3) : undefined columns selected
In the paper, they say:
We can see that the data need to be grouped by assessment interval: ’timeBef’ and ’timeAf’ are respectively the beginning and ending of the scoring intervals (in days), ’nSeeds’ is the number of germinated seeds. The ’propCum’ columns contains the cumulative proportions of germinated seeds and it is not necessary for time-to-event models. The HTE model is fit by using the function HTE1(). No starting values are necessary, as a self-starting routine has been built into the model definition.
Am I missing something needed to run the drm code? I am confused about why head(rape) and colnames(rape) show me the names of the columns in the data, yet drm() doesn't recognize them.
colnames(rape)
[1] "Psi" "Dish" "timeBef" "timeAf" "nSeeds" "nCum" "propCum"
Thank you!

I hit the same error when using drm() from the drc package together with drcSeedGerm, installed via
install_github("OnofriAndreaPG/drcSeedGerm")
After reading through:
https://www.statforbiology.com/2022/stat_drcte_6-ht1step/
the solution is to use drmte() from the drcte package, installed with
install_github("OnofriAndreaPG/drcte")
instead of drm() as given in the appendix of the Onofri et al. 2018 paper.
This resolved the error and the analysis ran (as of 13/10/2022).
all the best
Arne
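A minimal sketch of the switch described in this answer, assuming that drcte and drcSeedGerm are already installed from GitHub and that drmte() accepts the same formula, data and fct arguments as the original drm() call (the linked post uses this form, but it may change between package versions). The requireNamespace() guard and the have_pkgs name are only there for illustration.

```r
# Check whether both GitHub packages are available before trying the fit.
have_pkgs <- requireNamespace("drcte", quietly = TRUE) &&
  requireNamespace("drcSeedGerm", quietly = TRUE)

if (have_pkgs) {
  library(drcte)        # provides drmte()
  library(drcSeedGerm)  # provides the rape data set and HTE1()
  data(rape)

  # Same formula as the failing drm() call; no type = "event" argument,
  # on the assumption that drmte() is dedicated to time-to-event data.
  modHTE <- drmte(nSeeds ~ timeBef + timeAf + Psi, data = rape, fct = HTE1())
  print(summary(modHTE))
} else {
  message("Install drcte and drcSeedGerm from GitHub first.")
}
```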

Related

How to block bootstrap in R?

I'm trying to run a block bootstrapping function on some time series data (monthly interest rates for ~15 years).
My data is in a CSV file with no header: a single column with one value per row.
I installed the package bootstrap because tsboot wouldn't work for me.
Here is my code:
testFile = read.csv("\\Users\\unori/sample_data.csv")
theta <- function(x){mean(x)}
results = bootstrap(testFile,100,theta)
It tells me there are at least 50 errors. All of them say "In mean.default(x) : argument is not numeric or logical: returning NA"
What to do? It runs when I use the example in the documentation. I think it must be how my data is stored/imported?
Thanks in advance.
Try to supply a working, minimal example that reproduces your problem! Check here to see how to make a minimal reproducible example.
The error message tells you that the thing you want to calculate the mean of is not a number, so R just returns NA.
Suggestions for debugging:
Does the object 'testFile' exist?
What is the output of
str(testFile)
This works for me:
library(bootstrap)
testFile <- cars[,1]
theta <- function(x){mean(x)}
results = bootstrap(testFile,100,theta)
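One likely cause, given that the file has no header, is that read.csv() returns a data frame rather than a numeric vector, and by default treats the first row as column names. The sketch below uses a temporary file with made-up values (not the asker's data) to show the difference. The resampling line at the end is an ordinary bootstrap of the mean in base R, not a block bootstrap.

```r
# Write a small headerless, single-column CSV with made-up rates.
tmp <- tempfile(fileext = ".csv")
writeLines(c("1.5", "1.7", "2.1", "1.9"), tmp)

# header = FALSE keeps the first value as data instead of a column name.
rates_df <- read.csv(tmp, header = FALSE)
str(rates_df)                          # a data.frame with one numeric column

# mean() on the data frame itself is what triggers the warning and NA.
bad <- suppressWarnings(mean(rates_df))

# Extract the column as a plain numeric vector before resampling.
rates <- rates_df[[1]]
good <- mean(rates)

# Ordinary (non-block) bootstrap of the mean, 100 resamples.
set.seed(1)
boot_means <- replicate(100, mean(sample(rates, replace = TRUE)))
```

Passing `rates` (the vector) instead of the data frame to the resampling function is the key change. Whether the values should be resampled in blocks is a separate question.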

Estimation to plot person-item map not feasible because items "have no 0-responses" in data matrix

I am trying to create a person-item map that orders the questions from a dataset by difficulty. I am using the eRm package and the output should look like the following:
[person-item map] (https://hansjoerg.me/post/2018-04-23-rasch-in-r-tutorial_files/figure-html/unnamed-chunk-3-1.png)
One of the steps before running the function that outputs the map is fitting a model to the data set; the fitted object is what the plotting function uses to create the actual map. I get an error when creating that object.
I have already reviewed some documentation, which may be useful if you want extra information:
[Tutorial] https://hansjoerg.me/2018/04/23/rasch-in-r-tutorial/#plots
[Plotting function] https://rdrr.io/rforge/eRm/man/plotPImap.html
[Documentation] https://eeecon.uibk.ac.at/psychoco/2010/slides/Hatzinger.pdf
Now, this is the code that I am using. First, I install and load the respective libraries and the data:
> library(eRm)
> library(ltm)
Loading required package: MASS
Loading required package: msm
Loading required package: polycor
> library(difR)
Then I fit the PCM and generate the object of class Rm and here is the error:
*The PCM function here is specific to polytomous data; if I use a different one, the output says that I am not using a dichotomous dataset.
> res <- PCM(my.data)
Warning:
The following items have no 0-responses:
AUT_10_04 AUN_07_01 AUN_07_02 AUN_09_01 AUN_10_01 AUT_11_01 AUT_17_01
AUT_20_03 CRE_05_02 CRE_07_04 CRE_10_01 CRE_16_02 EFEC_03_07 EFEC_05
EFEC_09_02 EFEC_16_03 EVA_02_01 EVA_07_01 EVA_12_02 EVA_15_06 FLX_04_01
... [rest of items]
Responses are shifted such that lowest
category is 0.
Warning:
The following items do not have responses on
each category:
EFEC_03_07 LC_07_03 LC_11_05
Estimation may not be feasible. Please check
data matrix
I must clarify that all items in the dataset range from 1 to 5. It is a polytomous Likert dataset.
Finally, I try to use the plot function and it produces no output; the system just keeps loading indefinitely:
>plotPImap(res, sorted=TRUE)
I would like to add the description of that particular function and the arguments:
>PCM(X, W, se = TRUE, sum0 = TRUE, etaStart)
#X
Input data matrix or data frame with item responses (starting from 0);
rows represent individuals, columns represent items. Missing values are
inserted as NA.
#W
Design matrix for the PCM. If omitted, the function will compute W
automatically.
#se
If TRUE, the standard errors are computed.
#sum0
If TRUE, the parameters are normed to sum-0 by specifying an appropriate
W.
If FALSE, the first parameter is restricted to 0.
#etaStart
A vector of starting values for the eta parameters can be specified. If
missing, the 0-vector is used.
I do not understand why it is necessary for the scores to begin from 0. I think that is what the error is trying to say, but I do not quite understand that output.
I would highly appreciate any hint you can provide.
Feel free to ask for any information that could be useful in solving this issue.
The problem is not caused by the items having no 0-responses. The model automatically corrects this by shifting the response categories so that the lowest category is zero. (You'll notice that the PI-map that you linked to is centered on zero. Also, I believe the map you linked to is of dichotomous data; a PI-map of polytomous data should include the scale categories.)
Without being able to see your data, it is impossible to know the exact cause though.
It may be that the model is not converging. That may be what this warning was alluding to: Estimation may not be feasible. Please check data matrix. You could check by entering res at the prompt. If the model was able to converge you should see something like:
Conditional log-likelihood: -2.23709
Number of iterations: 27
Number of parameters: 8
...
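The two warnings can be explored on the data before fitting. The sketch below is base R only and uses a small made-up Likert data set (the item names and values are hypothetical). It applies a shift so that the lowest category is 0 and then lists, per item, the categories that never occur. It illustrates the idea behind the warnings and is not eRm's internal code.

```r
# Made-up 1-5 Likert responses; item_b never uses 1, item_c never uses 3.
likert <- data.frame(
  item_a = c(1, 2, 3, 4, 5, 3),
  item_b = c(2, 3, 3, 5, 5, 4),
  item_c = c(1, 2, 2, 4, 5, 1)
)

# Shift the 1-5 scale to 0-4 so the lowest possible category is 0.
shifted <- likert - 1

# For each item, find the categories in 0:4 with no responses.
missing_cats <- lapply(shifted, function(x) setdiff(0:4, x))
print(missing_cats)

# Per-item frequency table, including empty categories.
print(sapply(shifted, function(x) table(factor(x, levels = 0:4))))
```

Items with empty categories are the ones named in the second warning, so a table like this is a quick way to see which items might make estimation difficult.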
Does your data contain answers with decimal numbers? I got the same error and solved it by using the dplyr::dense_rank() function:
df_ranked <- sapply(df_decimal_data, dense_rank)
That worked.
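For readers without dplyr, the same kind of recoding can be sketched in base R. The helper name dense_rank_base and the data below are made up for illustration. The helper maps each distinct value to consecutive integers starting at 1, leaving NA as NA.

```r
# Base R stand-in for a dense rank: ties share a rank, no gaps between ranks.
dense_rank_base <- function(x) match(x, sort(unique(x)))

scores <- c(1.5, 2.5, 1.5, 4.0, NA)
print(dense_rank_base(scores))

# Applied column by column to a made-up data frame of decimal responses.
df_decimal <- data.frame(q1 = c(1.5, 2.5, 1.5, 4.0),
                         q2 = c(0.5, 0.5, 3.0, 2.0))
df_ranked <- sapply(df_decimal, dense_rank_base)
print(df_ranked)
```

Note that this replaces the original values with their rank order within each item, so it is worth checking that the recoded categories still mean what the analysis assumes.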

Exponential smoothing not recognizing my data as time series

I have a data set that includes t (time), which ranges from 1 to 243, and 5 other variables, each a separate company's stock price with 243 data points. I want to run exponential smoothing on my variable "HD". I am trying to run the following command:
library(smooth)
smoothhd <- es(mydata$HD, h=10, holdout=TRUE, silent=FALSE, cfTYPE=MSE)
However, when I do I receive the following error:
The provided data is not ts object. Only non-seasonal models are available.
Forming the pool of models based on... ANN, AAN, Estimation progress: 100%... Done!
Error in .External.graphics(C_layout, num.rows, num.cols, mat, as.integer(num.figures), :
invalid graphics state.
Does anyone have any insight as to what is wrong with my command or what might need to be changed with my data file in order for this command to give me the smoothed data?
It seems that your mydata$HD is not a time series object.
Try running is.ts(mydata$HD), and if it returns FALSE, coerce the column to a time series with as.ts(mydata$HD).
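A short sketch of that check and coercion, using a made-up stand-in for mydata (243 simulated prices, not the asker's data). as.ts() on a plain numeric vector gives a series with frequency 1, which is an assumption about the data's periodicity. ts() lets you set a different frequency if the data have one.

```r
# Made-up stand-in for the asker's data: 243 time points and one price column.
set.seed(42)
mydata <- data.frame(t = 1:243, HD = 100 + cumsum(rnorm(243)))

# A data frame column is a plain numeric vector, not a ts object.
is.ts(mydata$HD)

# Coerce to a time series (frequency defaults to 1, i.e. non-seasonal).
hd_ts <- as.ts(mydata$HD)
is.ts(hd_ts)
print(frequency(hd_ts))
```

The resulting hd_ts can then be passed to the smoothing function in place of mydata$HD.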

Arguments length error when trying to test model using party package in R

I have divided my data set into two groups:
transactions.train (80% of the data)
transactions.test (20% of the data)
Then I built the decision tree using the ctree method from the party package as follows:
transactions.Tree <- ctree(dt_formula, data=transactions.train)
I can successfully apply the predict method to the training set and use the table function to output the result as follows:
table(predict(transactions.Tree), transactions.train$Satisfaction)
But my problem occurs when I try to output the table based on the testing set as follows:
testPred <- predict(transactions.Tree, newdata=transactions.test)
table(testPred, transactions.test$Satisfaction)
And the error is as follows:
Error in table(predict(pred = svm.pred, transactions.Tree), transactions.test$Satisfaction) :
all arguments must have the same length
I have researched similar cases, which suggested omitting any NA values. I did that, but the error did not change.
Can anyone help me by pointing out what the problem is here?

Metafor measure argument error

I have calculated the effect sizes and pooled SEs the way I wanted. All that remains is to draw a forest plot and let metafor calculate the summary effect size. I have over 30 .csv data files to plot separately. When I do that with the following data (below), it plots and calculates the summary effect smoothly.
DeltaPI Spooled
-75.35224985 7.618629848
-51.85221078 7.513461236
-37.77455275 7.164279414
The line I use is:
meta1<-rma(yi=mydata$DeltaPI, sei=mydata$Spooled)
forest(meta1,slab=paste(mydata$Study,mydata$Genotype..Experimental.),showweight=TRUE,alim=c(-100,25),at=c(-100,-50,0,25),xlab="Percentage Change of PI Score",cex=0.7,cex.lab=1,col="red")
However, when I try to do the same thing with some of my other .csv files, rma gives an error and asks for the 'measure' argument. Since the measure is already the DeltaPI I calculated manually, I don't want to use it.
Weirdly, even if I replace the data in the files that don't work with the data that works properly (the 3 rows above), I still get the same error, although the same data works properly in some other .csv file.
So I'm not clear on why I am getting the error or what the solution is.
Any comment would be appreciated!
My guess is that this has nothing to do with the plotting, but occurs when the rma() command is run. And it sounds to me that there are issues with how variables are named in the data that you are reading in. Now you are reading in data from .csv files, but this is probably what is happening:
> library(metafor)
> dat <- data.frame(DeltaP1 = c(.2,.4), Spooled=c(.1,.1))
> rma(dat$DeltaPI, sei=dat$Spooled)
Error in rma(dat$DeltaPi, sei = dat$s) :
Specify the desired outcome measure via the 'measure' argument.
So, in essence, you should carefully check the variable names.
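A way to check the names before calling rma() is sketched below in base R. It reuses the misnamed data frame from the example above. The `required` vector and the `missing_cols` name are only illustrative. Because the column is called DeltaP1, `dat$DeltaPI` is NULL, so the model function receives no effect sizes at all.

```r
# Same deliberately misnamed column as in the example above (P1, not PI).
dat <- data.frame(DeltaP1 = c(0.2, 0.4), Spooled = c(0.1, 0.1))

# Asking for a column that does not exist silently returns NULL.
is.null(dat$DeltaPI)

# Compare the names the analysis expects with the names actually present.
required <- c("DeltaPI", "Spooled")
missing_cols <- setdiff(required, names(dat))
if (length(missing_cols) > 0) {
  message("Missing columns: ", paste(missing_cols, collapse = ", "))
}
print(names(dat))
```

Running names() on each imported file is worthwhile here because read.csv() rewrites headers that are not valid R names by default (check.names = TRUE), for example turning spaces and parentheses into dots.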
