Trouble increasing sample size for power analysis in simr - r

I want to increase the sample size being considered in a power analysis I'm running using simr. With my pilot data of 5 participants, I am able to run the power analysis, but when I use the extend function to increase the number of subjects to 20, I get: Error in (function (classes, fdef, mtable): unable to find an inherited method for function ‘extend’ for signature ‘"lmerModLmerTest"’. The extend function does not seem to work on my model.
I get the same error using the following code, taken from an example online:
# load in the data
sleep_df = lme4::sleepstudy %>%
  clean_names()
# set up the model
y_var = "reaction"
fixed_effect = "days"
random_effect = "subject"
model_form = as.formula(paste0(y_var, " ~ ", fixed_effect, " + ", "(1|", random_effect, ")"))
print(model_form)
# run simulation
set.seed(1)
sleep_fit = lmer(model_form,
                 data = sleep_df)
model_form2 <- extend(sleep_fit, along="subject", n=20)
model_form2
Any insight would be appreciated!

Off the top of my head, I can think of two possible causes:
1. Your subject variable is not specified as an integer, but as a factor. extend() only works on linear variables. However, since you reproduce the error with an example known to work, I think we can disregard this.
2. The problem is not with the data, but with your R session. For example, if you load another package after simr, and that package also has a function named extend(), then simr::extend() will be masked by the second one. This shows up when you load the package: a message like The following object is masked from 'package:simr' is printed in the console. To solve this, either call simr::extend() explicitly when you want to use this function, or change the order in which you load your packages.
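One way to check the masking explanation is to ask R which attached packages define a function with a given name. The sketch below uses only base R. `who_defines` is a hypothetical helper name, and `filter` is used as the example only because it is always available from stats; it has nothing to do with simr.

```r
# Hypothetical helper: list every attached package or environment that
# defines a function with the given name, in search-path order.
who_defines <- function(name) {
  find(name, mode = "function")
}

# The first entry returned is the definition R uses for an unqualified call.
who_defines("filter")

# conflicts(detail = TRUE) lists all names that are defined in more than
# one place on the search path.
masked <- conflicts(detail = TRUE)
```

After loading your packages, a call to `who_defines("extend")` that returns more than one entry, with something other than "package:simr" first, would be consistent with masking. Writing `simr::extend(...)` bypasses the search order entirely.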
Hope that helps somehow.

Related

runMI function error: 'no slot of name "internalList" for this object of class "lavaanList"'

I have run a multiple imputation (m=45, 10 iterations) using MICE and am attempting to fit a series of confirmatory factor analysis and structural equation models on the imputed datasets using the runMI function from semTools. Nearly all of my variables are Likert scales, coded as ordered/ordinal. Here is my code for the first CFA, where mi.res.train is the mice-generated mids object:
ipc_c_model <- '
IPC_C =~ t2IPC6_1 + t2IPC6_2 + t2IPC6_3 + t2IPC6_4 + t2IPC6_5 + t2IPC6_6 + t2IPC6_7'
ipc_c_fit <- runMI(ipc_c_model, mi.res.train, fun = "cfa", ordered = TRUE)
The model does not fit and returns the following error:
Error in slot(value, what) :
no slot of name "internalList" for this object of class "lavaanList"
As far as I can see, the lavaan.mi object that this is supposed to create is a special type of lavaanList object. Any ideas as to what may be causing this error?
Thanks!
Hi all: thanks for this feedback. Unfortunately I am using a restricted-use dataset, so I could not share much data without some extra steps. Fortunately, I updated a few packages and the code now appears to be working. I'd tried that previously but apparently missed the lavaan package itself.

omega function (Psych package R) not working with plot=TRUE

I have to first say that I am not an R user, but I want to apply a certain function I could only find in R.
My purpose is to get a bifactor factor-analysis model using the omega function from the Psych package. I have a data frame with 33 columns and about 100,000 observations and when I call the function (omega(df))
I get the following error:
Error in nchar(tv[1, 21]) : 'nchar()' requires a character vector
I have no idea what it means. If I follow the example in this doc with their data (named "bifact") it works fine, but the example uses a correlation matrix, while I want to use the entire data to be able to extract the factor scores. When I try to call the function with omega(cor(df))
I still get the same error.
Attached is a randomly generated data set that produces the same error.
Any help would be highly appreciated.
A clue to the solution could be the fact that with set.seed(0) I get a different error than with set.seed(100):
set.seed(100)
s_df = as.data.frame(cbind(matrix(seq_len(10000), ncol=1), matrix(rnorm(n=6*10000, mean = 20, sd = 10), ncol=6)))[2:7]
omega(s_df)
Error in nchar(tv[1, 21]) : 'nchar()' requires a character vector
while:
set.seed(0)
s_df = as.data.frame(cbind(matrix(seq_len(10000), ncol=1), matrix(rnorm(n=6*10000, mean = 20, sd = 10), ncol=6)))[2:7]
omega(s_df)
Error in omega.diagram(omega, main = title, sl = sl, labels = labels, :
object 'd.arrow' not found
EDIT: everything works when I call the function with plot=FALSE.
However, I would still like the plot to work. Calling the function with plot=TRUE throws the error.
Ok, so it turns out I just didn't have the 'Rgraphviz' package installed.
After installing it everything worked great.
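A minimal sketch of checking for the package before asking for the plot, assuming (as reported above) that the diagram needs Rgraphviz. The variable name `has_rgraphviz` is made up for illustration, and the omega() call is left commented out because it depends on psych and on the `s_df` data from the question.

```r
# Check for Rgraphviz without attaching it; system.file() returns ""
# when a package is not installed.
has_rgraphviz <- nzchar(system.file(package = "Rgraphviz"))

if (!has_rgraphviz) {
  # Rgraphviz is distributed through Bioconductor rather than CRAN.
  message('Rgraphviz not found; it can be installed with BiocManager::install("Rgraphviz")')
}

# Assumed usage, with s_df as defined in the question:
# om <- psych::omega(s_df, plot = has_rgraphviz)
```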

Error in eval(parse()) - r unable to find argument input

I am very new to R, and this is my first time encountering the eval() function. I am trying to use the med and boot.med functions from the mma package to conduct mediation analysis. med and boot.med take in models such as linear models, together with data frames that specify mediators and predictors, and then estimate the mediation effect of each mediator.
The author of the package gives the flexible option of specifying one's own custom.function. From the source code of med, it can be seen that the custom.function is passed to eval(). So I tried to insert the gbmt function as the custom function. However, R kept giving me the error message: Error during wrapup: Number of trees to be used in prediction must be provided. I have been searching online for days and have tried many ways of specifying the number of trees parameter n.trees, but nothing works (I believe others have raised similar issues: post 1, post 2).
The following code is part of the source code of the med function:
cf1 = gsub("responseY", "y[,j]", custom.function[j])
cf1 = gsub("dataset123", "x2", cf1)
cf1 = gsub("weights123", "w", cf1)
full.model[[j]] <- eval(parse(text = cf1))
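To see what these substitutions do, here is a self-contained base-R sketch that replays the four lines on toy data. The objects `x2`, `y`, `w` and `j` are stand-ins named after the excerpt, filled with made-up values, and `lm` is used only because it ships with R. This is not how med() builds its inputs.

```r
# Toy stand-ins for the objects that the excerpt refers to.
set.seed(1)
x2 <- data.frame(a = rnorm(20), b = rnorm(20))  # predictors ("dataset123")
y  <- matrix(rnorm(20), ncol = 1)               # responses, one column per j
w  <- rep(1, 20)                                # weights ("weights123")
j  <- 1

custom.function <- 'lm(responseY ~ ., data = dataset123, weights = weights123)'

# The placeholders are replaced by plain text substitution...
cf1 <- gsub("responseY", "y[,j]", custom.function[j])
cf1 <- gsub("dataset123", "x2", cf1)
cf1 <- gsub("weights123", "w", cf1)
print(cf1)

# ...and the resulting string is parsed and evaluated as R code.
fit <- eval(parse(text = cf1))
```

Because the string is evaluated as ordinary code, any other name that appears in it is looked up at the point where eval() runs, not where the string was written.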
One custom function example the author gives in the package documentation is as follows:
temp1<-med(data=data.bin,n=2,custom.function = 'glm(responseY~.,data=dataset123,family="quasibinomial",
weights=weights123)')
Here glm is the custom function. This example code works and you can replicate it easily (if you have mma installed and loaded). However, when I try to use the gbmt function on a survival object, I get errors. Here is what my code looks like:
temp1 <- med(data = data.surv,n=2,type = "link",
custom.function = 'gbmt(responseY ~.,
data = dataset123,
distribution = dist,
train_params = start_stop,
cv_folds=10,
keep_gbm_data = TRUE,
)')
Does anyone have any idea how the number of trees argument n.trees can be added somewhere in the above code?
Many thanks in advance!
Update: in order to replicate the example code, please install mma and try the following:
library("mma")
data("weight_behavior") ##binary x #binary y
x=weight_behavior[,c(2,4:14)]
pred=weight_behavior[,3]
y=weight_behavior[,15]
data.bin<-data.org(x,y,pred=pred,contmed=c(7:9,11:12),binmed=c(6,10), binref=c(1,1),catmed=5,catref=1,predref="M",alpha=0.4,alpha2=0.4)
temp1<-med(data=data.bin,n=2) #or use self-defined final function
temp1<-med(data=data.bin,n=2, custom.function = 'glm(responseY~.,data=dataset123,family="quasibinomial",
weights=weights123)')
I changed the custom.function to gbmt and used a survival object as responseY and the error occurs. When I use the gbmt function on my data outside the med function, there is no error.

Error related to randomisation test within lapply() function in R

I have 30 datasets that are combined in a data list. I want to analyze the spatial point pattern with the L function along with a randomisation test. The code follows.
The first block works well for a single dataset (data1), but once it is applied to a list of datasets with lapply(), as shown in the second block, it gives me a very long error like so:
"Error in Kcross(X, i, j, ...) : No points have mark i = Acoraceae
Error in envelopeEngine(X = X, fun = fun, simul = simrecipe, nsim =
nsim, : Exceeded maximum number of errors"
Can anybody tell me what is wrong with 2nd code?
grp <- factor(data1$species)
window <- ripras(data1$utmX, data1$utmY)
pp.grp <- ppp(data1$utmX, data1$utmY, window=window, marks=grp)
L.grp <- alltypes(pp.grp, Lest, correlation = "Ripley")
LE.grp <- alltypes(pp.grp, Lcross, nsim = 100, envelope = TRUE)
plot(L.grp)
plot(LE.grp)
L.LE.sp <- lapply(data.list, function(x) {
  grp <- factor(x$species)
  window <- ripras(x$utmX, x$utmY)
  pp.grp <- ppp(x$utmX, x$utmY, window = window, marks = grp)
  L.grp <- alltypes(pp.grp, Lest, correlation = "Ripley")
  LE.grp <- alltypes(pp.grp, Lcross, envelope = TRUE)
  result <- list(L.grp=L.grp, LE.grp=LE.grp)
  return(result)
})
# each element of L.LE.sp holds both results, so pick the dataset first
plot(L.LE.sp[[1]]$LE.grp)
This question is about the R package spatstat.
It would help if you could add a minimal working example including data which demonstrate this problem.
If that is not available, please generate the error on your computer, then type traceback() and capture the output and post it here. This will trace the location of the error.
Without this information, my best guess is the following:
The error message says No points have mark i=Acoraceae. That means that the code is expecting a point pattern to include points of type Acoraceae but found that there were none. This can happen because in alltypes(... envelope=TRUE) the code generates random point patterns according to complete spatial randomness. In the simulated patterns, the number of points of type Acoraceae (say) will be random according to a Poisson distribution with a mean equal to the number of points of type Acoraceae in the observed data. If the number of Acoraceae in the actual data is small then there is a reasonable chance that the simulated pattern will contain no Acoraceae at all. This is probably what is causing the error message No points have mark i=Acoraceae.
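To put rough numbers on this, assume, as described above, that the simulated count of a type is Poisson with mean equal to its observed count n. The chance of zero points of that type in one simulation is then exp(-n). The sketch below also treats the simulations as independent. The counts in `n_obs` are illustrative, not taken from the asker's data.

```r
# Illustrative observed counts for a rare type.
n_obs <- c(1, 2, 3, 5, 10)

# Probability that one simulated pattern has no points of that type.
p_empty_once <- dpois(0, lambda = n_obs)  # same as exp(-n_obs)

# Probability that at least one of nsim simulations has none.
nsim <- 99
p_empty_any <- 1 - (1 - p_empty_once)^nsim

data.frame(n_obs,
           p_empty_once = round(p_empty_once, 4),
           p_empty_any  = round(p_empty_any, 4))
```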
If this interpretation is correct then you should be able to suppress the error by including the argument fix.marks=TRUE, that is,
alltypes(pp.grp, Lcross, envelope=TRUE, fix.marks=TRUE, nsim=99)
I'm not suggesting this is necessarily appropriate for your application, but this should remove the error message if my guess is correct.
In the latest development version of spatstat, available on GitHub, the code for envelope has been tweaked to detect this error.

R implementation of kohonen SOMs: prediction error due to data type.

I have been trying to run an example code for supervised kohonen SOMs from https://clarkdatalabs.github.io/soms/SOM_NBA . When I tried to predict test set data I got the following error:
pos.prediction <- predict(NBA.SOM3, newdata = NBA.testing)
Error in FUN(X[[i]], ...) :
Data type not allowed: should be a matrix or a factor
I tried newdata = as.matrix(NBA.testing) but it did not help. Neither did as.factor().
Why does it happen? And how can I fix that?
You should pass one more argument to the predict function, namely whatmap, and set its value to 1.
The code would be like:
pos.prediction <- predict(NBA.SOM3, newdata = NBA.testing, whatmap = 1)
To verify the prediction result, you can check using:
table(NBA$Pos[-training_indices], pos.prediction$predictions[[2]], useNA = 'always')
The result may differ from that of the tutorial, since the tutorial does not show a call to set.seed().
I suggest calling set.seed() with an arbitrary number somewhere before the training phase.
For simplicity, put it once at the very top of your script, e.g.
set.seed(12345)
This will guarantee a reproducible result of your model next time you re-run your script.
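A small base-R illustration of the effect, using sample() as a stand-in for any random step such as choosing training indices. The variable names are made up for the example, and no kohonen code is involved.

```r
# Same seed, same random draw.
set.seed(12345)
first_run <- sample(100, 5)

set.seed(12345)
second_run <- sample(100, 5)

identical(first_run, second_run)  # TRUE
```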
Hope that will help.
