I've noticed that sometimes when I use ezANOVA from package ez I get columns stating what the Greenhouse-Geisser corrected df values are, and other times the tables with the sphericity corrections do not include the new df values, even though there are violations. For example, I just ran a two-way repeated-measures ANOVA, and my table output looks like this:
I wish I could give reproducible data, but I genuinely don't know why it sometimes does this and sometimes doesn't. Does anyone else know? I'll show my code below in case there's something I'm missing regarding the actual ezANOVA function. I could compute the df values by hand, but I haven't found a good resource online showing how to correct them using the epsilon value, and I unfortunately was never taught that.
ez::ezANOVA(data = IntA2, wid = rat, dv = numReinforcers, within = .(component, minute))
Edit: A very nice person on the internet has explained to me how to calculate the new df values by hand (multiplying the GG epsilon by the old dfs, in case anyone else was wondering!), but I'm still unclear on why sometimes the function does it for you and other times it does not.
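For anyone else wondering, here is that hand correction as a quick sketch (all the numbers below are made up for illustration, not from my data):

```r
# Made-up example values from an ezANOVA-style table
F_val  <- 5.2    # F statistic (unchanged by the correction)
df_num <- 2      # uncorrected numerator df
df_den <- 22     # uncorrected denominator df
eps    <- 0.75   # Greenhouse-Geisser epsilon (GGe)

# Corrected dfs: multiply both by epsilon
df_num_gg <- eps * df_num   # 1.5
df_den_gg <- eps * df_den   # 16.5

# The corrected p-value uses the same F with the corrected dfs
p_gg <- pf(F_val, df_num_gg, df_den_gg, lower.tail = FALSE)
```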
I am new to processing RNA-seq data and am now practicing by trying to reproduce a published figure related to RNA-seq. This is the paper, and Fig 2A is what I'm trying to achieve.
In brief, I downloaded the count data with recount3 and subset the samples for the groups I want (control vs condition 1, control vs condition 2, etc.). Then I ran the following code:
library(DESeq2)

dds_4uM_30min <- DESeqDataSetFromMatrix(countData = ha_4uM_30min_data,
                                        colData = ha_4uM_30min_meta,
                                        design = ~ type)
dds2_4uM_30min <- DESeq(dds_4uM_30min)
res_4uM_30min <- results(dds2_4uM_30min, tidy = FALSE)
(type is the column that I made to contain the information of whether it's control or condition 1)
This is the figure I get, which confuses me since it is nowhere near the original figure.
I thought they might have done additional processing of the data, but I have no idea what the common or reasonable approaches are.
Furthermore, there seem to be data points that form lines (as can be seen in the figure above), which do not appear in the original figure. I am wondering what causes this kind of pattern and how to get rid of it.
Thanks in advance for any opinion or suggestion.
I have been trying to use the function lfcShrink, but the figure still shows these weird lines.
Any suggestions on how to further process RNA-seq data?
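For reference, the extra processing I have been experimenting with looks like this (a sketch: the coefficient name below is hypothetical and would come from resultsNames(), and the filtering threshold of 10 is just a common rule of thumb I found, not something from the paper):

```r
library(DESeq2)

# Pre-filter genes with very low total counts before running DESeq();
# the threshold of 10 is a common rule of thumb, not from the paper
dds_4uM_30min <- dds_4uM_30min[rowSums(counts(dds_4uM_30min)) >= 10, ]
dds2_4uM_30min <- DESeq(dds_4uM_30min)

# Shrink the log2 fold changes; the coefficient name is whatever
# resultsNames(dds2_4uM_30min) reports for the 'type' contrast
resultsNames(dds2_4uM_30min)
res_shrunk <- lfcShrink(dds2_4uM_30min,
                        coef = "type_condition1_vs_control",  # hypothetical name
                        type = "apeglm")
```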
I'm trying to run a generalised linear mixed effects (binomial-normal) meta-analysis for 7 randomised studies, where each study records the presence of an adverse event within the treatment and placebo populations (exposure and control).
To do this, I'm hoping to use the metabin function (meta package). However, I'm getting an error and I'm not sure why. E.g. running this code:
install.packages("meta")
library(meta)
# Data
data <- data.frame(
  exposure.events     = c(11, 34, 152, 4, 60, 3, 25),
  exposure.population = c(184, 152, 9500, 77, 2012, 15, 60),
  control.events      = c(3, 33, 4729, 133, 1441, 1, 25),
  control.population  = c(184, 375, 613978, 15865, 480485, 105, 238),
  Study               = c("1", "2", "3", "4", "5", "6", "7")
)
# Calling metabin
metabin(event.e = exposure.events, n.e = exposure.population,
        event.c = control.events, n.c = control.population,
        studlab = Study, data = data,
        method = "GLMM", model.glmm = "CM.AL", method.tau = "ML")
I get this output:
Error in metafor::rma.glmm(ai = event.e[!exclude], n1i = n.e[!exclude], :
Cannot fit ML model.
I've also tried calling the rma.glmm function directly (instead of doing this via metabin), but get the same error message. I've also tried reading the source code for rma.glmm but I'm not sure I understand what's going on. However, I think the issue is related to the third study (the largest), and in particular the size of the control population, as both of the following run smoothly:
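For completeness, the direct call I tried looked roughly like this (a sketch based on my reading of the rma.glmm documentation; switching the optimizer via control is just something I experimented with, and "bobyqa" requires the minqa package to be installed):

```r
library(metafor)

# Direct call mirroring what metabin() does internally for method = "GLMM"
# with the conditional model (CM.AL); 'data' as defined above
res <- rma.glmm(measure = "OR", model = "CM.AL", method = "ML",
                ai  = data$exposure.events,
                n1i = data$exposure.population,
                ci  = data$control.events,
                n2i = data$control.population,
                slab = data$Study,
                # trying a different optimizer in case it is a convergence issue
                control = list(optimizer = "bobyqa"))
```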
# Modifying 3rd row's control population (613978 -> 61378)
data <- data.frame(
  exposure.events     = c(11, 34, 152, 4, 60, 3, 25),
  exposure.population = c(184, 152, 9500, 77, 2012, 15, 60),
  control.events      = c(3, 33, 4729, 133, 1441, 1, 25),
  control.population  = c(184, 375, 61378, 15865, 480485, 105, 238),
  Study               = c("1", "2", "3", "4", "5", "6", "7")
)
metabin(event.e = exposure.events, n.e = exposure.population,
        event.c = control.events, n.c = control.population,
        studlab = Study, data = data,
        method = "GLMM", model.glmm = "CM.AL", method.tau = "ML")
# Deleting 3rd row
data <- data.frame(
  exposure.events     = c(11, 34, 4, 60, 3, 25),
  exposure.population = c(184, 152, 77, 2012, 15, 60),
  control.events      = c(3, 33, 133, 1441, 1, 25),
  control.population  = c(184, 375, 15865, 480485, 105, 238),
  Study               = c("1", "2", "3", "4", "5", "6")
)
metabin(event.e = exposure.events, n.e = exposure.population,
        event.c = control.events, n.c = control.population,
        studlab = Study, data = data,
        method = "GLMM", model.glmm = "CM.AL", method.tau = "ML")
Is this a convergence problem, and does anyone know if there is a way around it? The only other thing I can find about this error message describes a problem (and solution) that does not apply to me.
Any help would be really appreciated :)
I am new to R. I am trying to run arules on the Titanic data. I am using the following code:
library(arules)
library(plyr)
rules<-apriori(titanic.raw)
inspect(rules)
rules <- apriori(titanic.raw,
                 parameter = list(minlen = 2, supp = 0.005, conf = 0.8),
                 appearance = list(rhs = c("Survived=No", "Survived=Yes"),
                                   default = "lhs"),
                 control = list(verbose = FALSE))
inspect(rules)
quality(rules)$rec<- support(rules,titanic.raw)/count(rhs)
I am able to get the result. However, I need to introduce one more measure besides support, confidence & lift, namely rec = Support / (total number of target instances), which I attempted in the last line of code.
I think I am going wrong in writing it. Could anyone help and guide me through it? I would really appreciate it.
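In case it clarifies what I am after: since rec should equal support(rule) divided by the support of the rule's target (rhs), I think something along these lines might work (a sketch, not code I am sure about):

```r
library(arules)

# Sketch: rec = support(rule) / support(rhs of the rule),
# i.e. the fraction of target instances covered by the rule.
# titanic.raw is the transaction data used above.
rhs_support <- support(rhs(rules), transactions = titanic.raw)
quality(rules)$rec <- quality(rules)$support / rhs_support
inspect(head(rules, 3))
```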
Yesterday I found some good R code for emotion classification and took part of it:
happy = readLines("./happy.txt")
sad = readLines("./sad.txt")
happy_test = readLines("./happy_test.txt")
sad_test = readLines("./sad_test.txt")
tweet = c(happy, sad)
tweet_test= c(happy_test, sad_test)
tweet_all = c(tweet, tweet_test)
sentiment = c(rep("happy", length(happy)),
              rep("sad", length(sad)))
sentiment_test = c(rep("happy", length(happy_test)),
                   rep("sad", length(sad_test)))
sentiment_all = as.factor(c(sentiment, sentiment_test))
library(RTextTools)
mat = create_matrix(tweet_all, language = "english",
                    removeStopwords = FALSE, removeNumbers = TRUE,
                    stemWords = FALSE, weighting = tm::weightTfIdf)
container = create_container(mat, as.numeric(sentiment_all),
                             trainSize = 1:160, testSize = 161:180,
                             virgin = FALSE)
models = train_models(container, algorithms = c("MAXENT",
                                                "SVM",
                                                # "GLMNET", "BOOSTING",
                                                "SLDA", "BAGGING",
                                                "RF", # "NNET",
                                                "TREE"))
# test the model
results = classify_models(container, models)
table(as.numeric(sentiment_all[161:180]), results[, "FORESTS_LABEL"])
All is good, but I have one question. Here we work with data where we ourselves tell the machine which text is sad and which is happy. If I have new documents without any indication of what is sad and what is happy, or what is positive and what is negative (suppose one such document is loaded with n=read.csv("C:/1/ttt.csv")), how can the model I built decide which phrases are negative and which are positive?
Well, what was the purpose of building a model to detect what is sad and what is happy? What do you want to achieve? Also, this does not really look like an SO-style question/answer.
So you are using supervised learning on labeled data (you already know which texts are sad or happy) to learn what defines those classes, so that later on you can use the models you built to predict labels for new content that does not have them.
So, any transformations done to the training data must also be applied to the new data coming in; you then ask your model to predict (evaluate) based on this new input and use the prediction as the result. This does not change your model; it just evaluates it on new data.
Another scenario is that new labeled data arrives and you want to update your model: you can retrain it on the new data, and the retrained models might pick up new features.
In your case you should look at the classify_model or classify_models functions in that package.
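A sketch of that prediction workflow (the file path and the text column name are hypothetical; create_matrix's originalMatrix argument aligns the new documents with the training vocabulary):

```r
library(RTextTools)

# New, unlabeled documents; the path and the 'text' column are hypothetical
new_docs <- read.csv("C:/1/ttt.csv", stringsAsFactors = FALSE)$text

# Build a document-term matrix aligned to the training matrix 'mat' above
new_mat <- create_matrix(new_docs, originalMatrix = mat,
                         language = "english",
                         removeStopwords = FALSE, removeNumbers = TRUE,
                         stemWords = FALSE, weighting = tm::weightTfIdf)

# No real labels exist yet, so pass placeholders and set virgin = TRUE
n <- length(new_docs)
new_container <- create_container(new_mat, labels = rep(0, n),
                                  testSize = 1:n, virgin = TRUE)

# Predict with the models trained above
new_results <- classify_models(new_container, models)
head(new_results)
```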