I am conducting a meta-analysis across 196 studies. Each study reports one or more of 6 outcomes, each measured using one or more of 3 factors.
Now I am unsure how to code the random effects. I was thinking about two alternatives:
A multivariate model such as:
res <- rma.mv(yi, V, slab = study, random = ~ Outcome_Factor | study, data = dat, method = "ML")
A three-level model such as:
res <- rma.mv(yi, vi, slab = study, random = ~ 1 | study/Outcome_Factor, data = dat, method = "ML")
What would be more appropriate?
Thank you very much in advance.
Kind regards
Lisa
These are different 'parameterizations' of the same model. See here for an extensive discussion of this:
https://www.metafor-project.org/doku.php/analyses:konstantopoulos2011
So in the end, the two models are identical, the results are just expressed in different ways.
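You can verify the equivalence directly by fitting both parameterizations to the same data and comparing the fits. A minimal sketch using the dat.konstantopoulos2011 dataset that ships with metafor (here district plays the role of study and school the role of Outcome_Factor; struct = "CS" is the default for the '~ inner | outer' form and is what makes the two models match):

```r
library(metafor)

dat <- dat.konstantopoulos2011

# multivariate parameterization: compound-symmetric structure
res1 <- rma.mv(yi, vi, random = ~ factor(school) | district,
               struct = "CS", data = dat, method = "ML")

# three-level parameterization: random intercepts for district and school-within-district
res2 <- rma.mv(yi, vi, random = ~ 1 | district/school,
               data = dat, method = "ML")

# identical fits, just expressed differently:
# tau^2 = sigma^2.1 + sigma^2.2 and rho = sigma^2.1 / (sigma^2.1 + sigma^2.2)
logLik(res1)
logLik(res2)
```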
I'm trying to understand why my code has taken several days to run and how I can speed up the next iteration. I'm on my third day and the outputs show only marginal improvements in AIC. The last few AICs have been 18135.38, 18187.43, and 18243.13. I currently have 33 covariates in the model. The "none" option is 12th from the bottom, so there are still many covariates left to evaluate.
The data is ~610K observations with ~1600 variables. The outcome variables and covariates are mostly binary. My covariates were chosen after doing univariate logistic regression and p-value adjustment using the Holm procedure (alpha = 0.05). No interaction terms are included.
The code I've written is here:
intercept_only <- glm(outcome ~ 1, data=data, family="binomial")
full.model <- glm(outcome ~ 157 covariates, data=data, family = "binomial")
forward_step_model <- step(intercept_only, direction = "forward", scope = formula(full.model))
I'm hoping to run the same code on a different outcome variable with double the number of covariates, identified in the same way as above, but am worried it will take even longer to process. I see there are both the step and stepAIC functions to perform stepwise regression. Is there an appreciable difference between these functions? Are there other ways of doing this? Is there any way to speed up the processing?
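On step() vs stepAIC(): they implement essentially the same AIC-based algorithm (step lives in stats, stepAIC in MASS and supports a few additional model classes), so for a glm both should select the same model, and neither will be appreciably faster. A minimal sketch on built-in data to check this, using mtcars as a stand-in dataset (trace = 0 suppresses the per-step printing, which itself saves some time on long runs):

```r
library(MASS)

dat <- mtcars
dat$am <- factor(dat$am)

# intercept-only starting model and the upper scope for forward selection
null_model <- glm(am ~ 1, data = dat, family = "binomial")
full_scope <- am ~ mpg + wt + hp

fit_step    <- step(null_model, direction = "forward", scope = full_scope, trace = 0)
fit_stepAIC <- stepAIC(null_model, direction = "forward", scope = full_scope, trace = 0)

# both functions should arrive at the same selected formula and AIC
formula(fit_step)
AIC(fit_step, fit_stepAIC)
```

The real bottleneck with ~610K rows is that every candidate step refits a full glm, so the practical speedups are reducing the candidate set before stepping or fitting on a subsample first.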
I am analysing a 2-way repeated-measures dataset, modelled as follows:
model= lmer(result ~ treatment * time + (1|subject), data=df)
... where every subject receives every treatment and is tested at every time. However, when analysing the contrasts, there appears to be no correction for sphericity. Here I am using emmeans to test for a difference between each treatment and the control treatment, at each level of "time"...
emm <- emmeans(model, c("treatment", "time"))
contrast(emm, "trt.vs.ctrl", ref = "Control", by = "time")
Comparing the output from contrast() with GraphPad Prism's output for the same dataset confirmed that no G-G correction is applied.
Is there a simple way of achieving a sphericity-corrected analysis of the contrasts?
Thanks to the commenters for identifying this solution. The afex package is specifically designed for repeated-measures factorial designs and allows the appropriate corrections.
aov_ez (in the afex package) automatically applies corrections for non-sphericity
emmeans should specify the multivariate model
contrast (in the emmeans package) should specify the appropriate p-value adjustment
The solution therefore would look like this...
library(afex)
library(emmeans)
model <- aov_ez("subject", "result", df, within = c("treatment", "time"), type = "III")
emm <- emmeans(model, c("treatment", "time"), model = "multivariate")
contrast(emm, "trt.vs.ctrl", ref = "Control", by = "time", adjust = "dunnettx")
Hope this is helpful for anyone else with a similar question!
I am trying to compare differences between coefficients in different regression equations.
Specifically, I have 2 regressions looking at the effects of Guilt, Feeling, and Responsibility on Importance to Donate
aov_I <- aov(newdata_I$AV_importance_to_donate ~ newdata_I$AV_guilty + newdata_I$AV_percieved_resp + feeling_I)
summary(aov_I)
aov_S <- aov(newdata_S$AV_importance_to_donate ~ newdata_S$AV_guilty + newdata_S$AV_percieved_resp + feeling_S)
summary(aov_S)
I would like to compare the differences between the coefficients in these two different regression equations.
How can I do this??
Thank you so much in advance!
You can extract just the coefficients with aov_I$coefficients[2] and aov_S$coefficients[2], combine them into a data frame using cbind, and then view them in a bar graph if you don't need a formal statistical comparison.
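A minimal sketch of that idea on simulated data (the data frames and variable names here are hypothetical stand-ins for the real ones). The last step adds an informal z-test for the difference between two coefficients from independent samples, which is a standard approach if a rough statistical comparison is wanted after all:

```r
set.seed(1)

# two hypothetical independent samples with the same predictors
samp_I <- data.frame(importance = rnorm(50), guilt = rnorm(50), resp = rnorm(50))
samp_S <- data.frame(importance = rnorm(50), guilt = rnorm(50), resp = rnorm(50))

fit_I <- lm(importance ~ guilt + resp, data = samp_I)
fit_S <- lm(importance ~ guilt + resp, data = samp_S)

# side-by-side table of coefficients, then a grouped bar graph
coefs <- cbind(I = coef(fit_I), S = coef(fit_S))
coefs
barplot(t(coefs), beside = TRUE, legend.text = c("I", "S"))

# informal z-test for the difference between the two 'guilt' coefficients:
# difference divided by the pooled standard error
se_I <- summary(fit_I)$coefficients["guilt", "Std. Error"]
se_S <- summary(fit_S)$coefficients["guilt", "Std. Error"]
z <- (coef(fit_I)["guilt"] - coef(fit_S)["guilt"]) / sqrt(se_I^2 + se_S^2)
2 * pnorm(-abs(z))  # two-sided p-value
```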
(I am using R and the lqmm package)
I was wondering how to consider autocorrelation in a Linear Quantile mixed models (LQMM).
I have a data frame that looks like this:
df1 <- data.frame(
  Time = seq(as.POSIXct("2017-11-13 00:00:00", tz = "UTC"),
             as.POSIXct("2017-11-13 00:01:59", tz = "UTC"), "sec"),
  HeartRate = rnorm(120, mean = 60, sd = 10),
  Treatment = rep("TreatmentA", 120),
  AnimalID = rep("ID01", 120),
  Experiment = rep("Exp01", 120))
df2 <- data.frame(
  Time = seq(as.POSIXct("2017-08-11 00:00:00", tz = "UTC"),
             as.POSIXct("2017-08-11 00:01:59", tz = "UTC"), "sec"),
  HeartRate = rnorm(120, mean = 62, sd = 14),
  Treatment = rep("TreatmentB", 120),
  AnimalID = rep("ID02", 120),
  Experiment = rep("Exp02", 120))
df<-rbind(df1,df2)
head(df)
With:
Heart rate (HeartRate) is measured every second on several animals (AnimalID). The measurements are taken during experiments (Experiment), each with a possible treatment (Treatment). Each animal was observed in multiple experiments under different treatments. I wish to look at the effect of Treatment on the 90th percentile of heart rate, including Experiment as a random effect, while accounting for autocorrelation (as heart rates are taken every second). (If there is a way to include AnimalID as a random effect as well, it would be even better.)
Model for now:
library(lqmm)
model <- lqmm(fixed = HeartRate ~ Treatment, random = ~ 1, group = Experiment, data = df, tau = 0.9)
Thank you very much in advance for your help.
Let me know if you need more information.
For resources on thinking about this type of problem you might look at chapters 17 and 19 of Koenker et al. 2018 Handbook of Quantile Regression from CRC Press. Neither chapter has nice R code to go from, but they discuss different approaches to the kind of data you're working with. lqmm does use nlme machinery, so there may be a way to customize the covariance matrices for the random effects, but I suspect it would be easiest to either ask for help from the package author or to do a deep dive into the package code to figure out how to do that.
Another resource is the quantile regression model for mixed effects accounting for autocorrelation in 'Quantile regression for mixed models with an application to examine blood pressure trends in China' by Smith et al. (2015). They model a bivariate response with a copula, but you could do a simplified version with a univariate response. I think their model at this point only incorporates a lag-1 correlation structure within subjects/clusters. The code for that model does not seem to be available online either, though.
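Whichever modelling route you end up taking, it helps to first quantify how strong the within-experiment autocorrelation actually is. A minimal base-R sketch on simulated data resembling the question's (a rough diagnostic on demeaned values standing in for model residuals, not a correction):

```r
set.seed(1)

# simulate an AR(1) heart-rate series for one experiment (120 one-second readings)
hr <- 60 + as.numeric(arima.sim(list(ar = 0.8), n = 120, sd = 5))

# crude stand-in for model residuals: remove the experiment mean
res <- hr - mean(hr)

# autocorrelation at lags 1..20; estimates well outside the default
# confidence band of plot(a) indicate correlation the model should address
a <- acf(res, lag.max = 20, plot = FALSE)
round(a$acf[2], 2)  # lag-1 autocorrelation estimate
```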
I am working on Telecom Churn problem and here is my dataset.
http://www.sgi.com/tech/mlc/db/churn.data
Names - http://www.sgi.com/tech/mlc/db/churn.names
I'm new to survival analysis. Given the training data, my idea is to build a survival model to estimate survival time, along with predicting churn/non-churn on the test data based on the independent factors. Could anyone help me with code or pointers on how to go about this problem?
To be precise, say my training data has customer call usage details, plan details, tenure of the account, etc., and whether or not the customer churned. Using general classification models, I can predict churn or not on the test data. Now, using survival analysis, I want to predict the tenure of survival in the test data.
Thanks,
Maddy
If you're still interested (or for the benefit of those coming later), I've written a few guides specifically for conducting survival analysis on customer churn data using R. They cover a bunch of different analytical techniques, all with sample data and R code.
Basic survival analysis: http://daynebatten.com/2015/02/customer-churn-survival-analysis/
Basic cox regression: http://daynebatten.com/2015/02/customer-churn-cox-regression/
Time-dependent covariates in cox regression: http://daynebatten.com/2015/12/survival-analysis-customer-churn-time-varying-covariates/
Time-dependent coefficients in cox regression: http://daynebatten.com/2016/01/customer-churn-time-dependent-coefficients/
Restricted mean survival time (quantify the impact of churn in dollar terms): http://daynebatten.com/2015/03/customer-churn-restricted-mean-survival-time/
Pseudo-observations (quantify dollar gain/loss associated with the churn effects of variables): http://daynebatten.com/2015/03/customer-churn-pseudo-observations/
Please forgive the goofy images.
Here is some code to get you started:
First, read the data
nm <- read.csv("http://www.sgi.com/tech/mlc/db/churn.names",
               skip = 4, colClasses = c("character", "NULL"), header = FALSE, sep = ":")[[1]]
dat <- read.csv("http://www.sgi.com/tech/mlc/db/churn.data", header=FALSE, col.names=c(nm, "Churn"))
Use Surv() to set up a survival object for modeling
library(survival)
s <- with(dat, Surv(account.length, as.numeric(Churn)))
Fit a cox proportional hazards model and plot the result
model <- coxph(s ~ total.day.charge + number.customer.service.calls, data=dat[, -4])
summary(model)
plot(survfit(model))
Add a stratum:
model <- coxph(s ~ total.day.charge + strata(number.customer.service.calls <= 3), data=dat[, -4])
summary(model)
plot(survfit(model), col=c("blue", "red"))
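To get at the original question of predicting tenure itself: once a Cox model is fitted, survfit(model, newdata = ...) produces a predicted survival curve per customer, from which you can read off, e.g., the median survival time. A self-contained sketch using the lung dataset that ships with the survival package as a stand-in for the churn data:

```r
library(survival)

# Cox model on built-in data (stand-in for the churn model above)
fit <- coxph(Surv(time, status) ~ age + ph.ecog, data = lung)

# predicted survival curves for two hypothetical new subjects
newpat <- data.frame(age = c(50, 70), ph.ecog = c(0, 2))
sf <- survfit(fit, newdata = newpat)

# median predicted survival time per subject (NA if the predicted curve
# never drops below 0.5 within the observed follow-up)
summary(sf)$table[, "median"]
```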