I'm trying to conduct a three-way ANOVA in R using the WRS2 package.
My data are heteroskedastic, so I need a robust version, e.g. one based on trimmed means.
I have my data arranged in long form (a CSV with 4 columns: 3 factors and 1 numeric variable). My call looks like this:
t3way(happiness ~ money*job*relationship, data = Dataset)
I get the following error: "Incomplete design! It needs to be full factorial!"
Thank you in advance!
I solved the same problem by re-factoring the variable with empty levels.
Dataset$myvar <- factor(Dataset$myvar)
Thus only the levels having data remain. Then the t3way worked fine.
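A minimal sketch of why re-factoring helps, using made-up toy data (the column names happiness and job are taken from the question, the values and the empty level "c" are assumptions for illustration). A factor can keep a level that no row uses, and that unused level creates empty cells in the design:

```r
# Toy data (made up for illustration): 'job' declares a level "c" that no row uses.
Dataset <- data.frame(
  happiness = c(5.1, 6.2, 4.8, 5.9),
  job = factor(c("a", "b", "a", "b"), levels = c("a", "b", "c"))
)

levels(Dataset$job)   # "c" is still listed although no row uses it
table(Dataset$job)    # the empty level shows up with a count of 0

# Re-create the factor so that only the observed levels remain
Dataset$job <- factor(Dataset$job)
levels(Dataset$job)

# droplevels() does the same for every factor column of a data frame at once
Dataset <- droplevels(Dataset)
```

This only helps when the empty cells come from unused levels. If a combination of the three factors truly has no observations, a cross-tabulation such as table(money, job, relationship) will still show a zero cell and the design remains incomplete.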
I am trying to use R on my laptop to fit an HLM (multilevel) regression to a large dataset of about 2 GB (500,000 rows) stored in SPSS format (.sav). I'm sorry that I cannot share the data, as my professor requires, but I will try to provide as many details as possible. Here is some of my code:
library(Hmisc)  # provides spss.get()
library(lme4)   # provides lmer()
data <- spss.get("Stanford Dataset .sav")
result1 <- lmer(SCIENCE ~ GDP + Individualism + Gender + Gender*GDP +
                Individualism*Gender + (1 + Gender | Country/School), data = data)
summary(result1)
The problem is that it takes about 5 minutes to fit the regression and print the summary. Is there a faster way to fit a model to a dataset this large?
I have already tried the following:
1) Using data.table from the data.table package, i.e. data <- data.table(data) before running the regression. However, I then waited even longer for the results than before.
2) Using as.big.matrix from the bigmemory package, which gives the error:
Error in list2env(data) : first argument must be a named list
It seems that the matrix does not work with the lmer function.
So I am really out of ideas now; any related suggestion would be helpful.
Thanks a lot !
I am trying to do an ANOVA in R on a data set with one within-subjects factor and one between-subjects factor. The data are from an experiment testing the similarity of two testing methods. Each subject was tested with Method 1 and Method 2 (the within factor) and belonged to one of 4 different groups (the between factor). I have tried aov, Anova (in the car package), and ezANOVA. I get wrong values with every method I try. I am not sure where my mistake is, or whether it comes from a lack of understanding of R or of ANOVA itself. I have included the code that I feel should work. I have tried a ton of variations of it hoping to stumble on the answer. This data set is balanced, but I have a lot of similar data sets and many are unbalanced. Thanks for any help you can provide.
library(car)
library(ez)
#set up data
sample_data <- data.frame(Subject=rep(1:20,2),Method=rep(c('Method1','Method2'),each=20),Level=rep(rep(c('Level1','Level2','Level3','Level4'),each=5),2))
sample_data$Result <- c(4.76,5.03,4.97,4.70,5.03,6.43,6.44,6.43,6.39,6.40,5.31,4.54,5.07,4.99,4.79,4.93,5.36,4.81,4.71,5.06,4.72,5.10,4.99,4.61,5.10,6.45,6.62,6.37,6.42,6.43,5.22,4.72,5.03,4.98,4.59,5.06,5.29,4.87,4.81,5.07)
sample_data[, 'Subject'] <- as.factor(sample_data[, 'Subject'])
#Set the contrasts if needed to get type 3 sums of squares for unbalanced data
#options(contrasts=c("contr.sum","contr.poly"))
#With the aov method, as I understand it, this 'should' work
anova_aov <- aov(Result ~ Method*Level + Error(Subject/Method),data=sample_data)
print(summary(anova_aov))
#ezANOVA method
anova_ez = ezANOVA(data=sample_data, wid=Subject, dv = Result, within = Method, between=Level, detailed = TRUE, type=3)
print(anova_ez)
Also, these are the values I should be getting, as output by SAS:
SAS Anova
Actually, your R code is correct in both cases. Running these data through SPSS yielded the same result. SAS, like SPSS, seems to require that the levels of the within factor appear in separate columns, so you will end up with 20 rows instead of 40. An arrangement like the one below might give you the desired result in SAS:
Subject Level Method1 Method2
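One way to get from the long data in the question to that wide layout is base R's reshape(). This is a sketch, not necessarily how the answerer prepared the data; the name wide_data is made up here, and the values are copied from the question:

```r
# Data as given in the question: long form, one row per Subject x Method (40 rows)
sample_data <- data.frame(
  Subject = factor(rep(1:20, 2)),
  Method  = rep(c("Method1", "Method2"), each = 20),
  Level   = rep(rep(c("Level1", "Level2", "Level3", "Level4"), each = 5), 2)
)
sample_data$Result <- c(4.76,5.03,4.97,4.70,5.03,6.43,6.44,6.43,6.39,6.40,
                        5.31,4.54,5.07,4.99,4.79,4.93,5.36,4.81,4.71,5.06,
                        4.72,5.10,4.99,4.61,5.10,6.45,6.62,6.37,6.42,6.43,
                        5.22,4.72,5.03,4.98,4.59,5.06,5.29,4.87,4.81,5.07)

# Spread the within factor (Method) into one column per level
wide_data <- reshape(sample_data,
                     idvar     = c("Subject", "Level"),
                     timevar   = "Method",
                     direction = "wide")

# reshape() names the new columns Result.Method1 and Result.Method2;
# strip the prefix to match the layout shown above
names(wide_data) <- sub("^Result\\.", "", names(wide_data))
head(wide_data)
```

The result has one row per subject (20 rows), which can then be exported, for example with write.csv(), and read into SAS.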
I'm using the lme4 package to fit a mixed model. I want to extract the fixed-effect and random-effect results into separate datasets so that they can be used for further analysis, but so far I have not managed to.
E.g.
library(lme4)
mixed_result <- lmer(Reaction ~ Days + (1|Subject), data = sleepstudy)
I tried to extract fixed effect and random effect using the following method:
fixEffect<-fixef(mixed_result)
randEffect<-ranef(mixed_result)
View(fixEffect)
I used fixef and ranef for the fixed and random effects respectively and tried to create a dataset from their results, but I got the following error:
Error in View : cannot coerce class "ranef.mer" to a data.frame
What I actually want is output like SAS's solutionF and solutionR. If that is not possible, the fixed and random coefficients will do.
I'd be grateful if someone could help me.
Thanks and Regards,
Use str to see the structure of an object.
str(fixEffect)
# named vector, can probably be coerced to data.frame
View(as.data.frame(fixEffect))
# works just fine
str(randEffect)
# list of data frames (well, list of one data frame in this case)
View(randEffect$Subject)
If you had, say, slopes that also varied by Subject, they would go in the same Subject data frame as the Subject-level intercepts. However, if intercepts also varied by some other grouping variable with a different number of levels than Subject, they obviously couldn't go in the same data frame. This is why a list of data frames is used: the same structure generalizes to more complex models.
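Putting the pieces together, a sketch of how both sets of estimates could be turned into ordinary data frames for further analysis. The names fixed_df and random_df are made up here, and using coef(summary(...)) to get standard errors alongside the estimates is an addition to the answer above, intended as a rough analogue of SAS's solutionF:

```r
library(lme4)

mixed_result <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Fixed effects: estimate, standard error and t value, one row per coefficient
fixed_df <- as.data.frame(coef(summary(mixed_result)))
fixed_df$term <- rownames(fixed_df)

# Random effects: ranef() returns one data frame per grouping factor,
# so pick the Subject element and keep the subject IDs as a column
random_df <- ranef(mixed_result)$Subject
random_df$Subject <- rownames(random_df)

head(fixed_df)
head(random_df)
```

Both objects are plain data frames, so View(), write.csv() or merge() work on them directly.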
I'm trying to run a series of GLMMs on a large dataset to explore relationships between plant traits and environmental factors for each of several plant species at different research sites, using plots and years as random factors in my models. I'm using plyr and I keep getting the following error message:
Error in eval.quoted(.variables, data) :
envir must be either NULL, a list, or an environment.
My data set is in the following format:
Site Plot Species FlowerDate Year Factor FactorValue
1 AD ADC01 CTETB 179 1999 numJulSF 160
And here is the code I am using:
data.list <- dlply(data,c("Species","Site","FlowerDate","Year", "Factor"),
function(df){lmer(FlowerDate~FactorValue+(1|Plot)+(1|Year),
data=df)})
I have seen that others have this issue, but I'm still having difficulty resolving it.
It seems to me that the main problem is that you are splitting the data on some of the variables that are actually included in the model ('FlowerDate' and 'Year'), which does not make sense in principle: there is no point in including an input variable that does not vary within a subset, or in modeling an output variable that is constant.
Other than that, the combination of dlply + lmer should work; in fact, I use it quite often without problems...
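A sketch of that combination on simulated data, splitting only on variables that are not part of the model formula. Everything about the toy data frame is an assumption for illustration (values, number of plots and years, the choice of Species and Site as the only splitting variables); only the column names and the lmer formula come from the question:

```r
library(plyr)
library(lme4)

set.seed(1)

# Simulated stand-in for the real data (made up): 2 species x 2 sites,
# each observed in 6 plots over 6 years
toy <- expand.grid(Plot = paste0("P", 1:6), Year = 1999:2004,
                   Species = c("A", "B"), Site = c("S1", "S2"),
                   stringsAsFactors = FALSE)
toy$FactorValue <- rnorm(nrow(toy), mean = 160, sd = 10)
plot_effect <- setNames(rnorm(6, sd = 3), paste0("P", 1:6))
toy$FlowerDate <- 180 - 0.1 * toy$FactorValue +
  unname(plot_effect[toy$Plot]) + rnorm(nrow(toy), sd = 2)

# Split on Species and Site only; FlowerDate and Year stay inside each subset
# so that the response and the Year random effect can still vary
fits <- dlply(toy, c("Species", "Site"), function(df) {
  lmer(FlowerDate ~ FactorValue + (1 | Plot) + (1 | Year), data = df)
})

names(fits)   # one fitted model per Species/Site combination
```

Because the simulated data contain no real year effect, lmer may report a singular fit for some subsets; that is a message about the toy data, not an error in the splitting.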
I have a dataset with about 11,500 rows and 15 factors. I only need to impute values for 3 of the factors, with only 2 of the factors having any significant number of missing values. I have been trying to use mice to create imputed datasets, and I am using the following code:
dataset<-read.csv("filename.csv",header=TRUE)
model<-success~1+course+medium+ethnicity+gender+age+enrollment+HSGPA+GPA+Pell+ethnicity*medium
library(mice)
vempty<-c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)
v12<-c(0,0,0,0,0,0,0,1,1,1,1,0,1,1,1)
v13<-c(0,0,0,0,0,0,0,1,1,1,1,1,0,1,1)
v14<-c(0,0,0,0,0,0,0,1,1,1,1,1,1,0,1)
list<-list(vempty,vempty,vempty,vempty,vempty,vempty,vempty,vempty,vempty,vempty,vempty,v12,v13,v14,vempty)
predmatrix<-do.call(rbind,list)
MIdataset<-mice(dataset,m=2,predictorMatrix=predmatrix)
MIoutput<- pool(glm(model, data=MIdataset, family=binomial))
After this code, I get the error message:
Error in as.data.frame.default(data) :
cannot coerce class '"mids"' into a data.frame
I'm totally at a loss as to what this means. I had no trouble doing this same analysis when I simply deleted the missing data and used regular glm. I'd also like to fit a multilevel logistic model to the imputed datasets using lmer (that's the next step after I get this to work with glm), so if I am doing anything wrong that will also affect that next step, that would be good to know, too. I've tried searching for this error on the internet, and I'm not getting anywhere. I'm only just learning R, so I'm also not that familiar with the environment yet.
Thanks for your time!
You need to apply the with.mids function. I think the last line in your code should look like this:
pool(with(MIdataset, glm(formula(model), family = binomial)))
You could also try this:
expr <- 'glm(success ~ course, family = binomial)'
pool(with(MIdataset, parse(text = expr)))
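For reference, a minimal sketch of the with() plus pool() pattern on the nhanes example data that ships with mice. The dataset and the linear model are stand-ins chosen so the example runs on its own (lm is swapped in for the logistic glm of the question), and the object names imp, fit and pooled are made up:

```r
library(mice)

# nhanes is a small example dataset with missing values bundled with mice
imp <- mice(nhanes, m = 2, seed = 1, printFlag = FALSE)
class(imp)   # "mids": a set of imputed datasets, not a data frame

# with() fits the model separately on each completed dataset
fit <- with(imp, lm(bmi ~ age + chl))

# pool() combines the per-dataset estimates into one set of results
pooled <- pool(fit)
summary(pooled)
```

The error in the question arises because glm(model, data = MIdataset) hands the "mids" object directly to glm, which expects a data frame; with() is what unpacks it into completed datasets.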