Create funnel plot with residuals using metafor

Using metafor in R:
How can you plot residuals on the x-axis of a funnel plot for a model that does not contain any moderators?

Although this is not intended, you could override the logical inside rma objects that indicates that you have fitted an 'intercept-only' model. To illustrate:
library(metafor)
dat <- escalc(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg)
res <- rma(yi, vi, data=dat)
funnel(res)
So this gives you a standard funnel plot. Now force int.only to be FALSE. Then you get a funnel plot of the residuals:
res$int.only <- FALSE
funnel(res)
Another way to do this is to extract the residuals and corresponding variances manually and pass them to the funnel() function directly:
ei <- resid(res)
vei <- diag(vcov(res, type="resid"))
funnel(ei, vei, xlab="Residual Value")
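For an intercept-only model the residuals are the observed outcomes minus the pooled estimate, and their variances are a little smaller than the marginal variances because the pooled estimate is computed from the same data. A minimal base R sketch of that computation follows. The values of yi, vi and tau2 are made up for illustration (not taken from dat.bcg), and this is not a claim about metafor's internals:

```r
# Hypothetical effect sizes and sampling variances (illustration only)
yi   <- c(-0.89, -1.59, -1.35, -1.44, -0.22, -0.79)
vi   <- c(0.33, 0.19, 0.42, 0.02, 0.05, 0.01)
tau2 <- 0.30  # assumed heterogeneity estimate, treated as known here

# Inverse-variance weights and pooled estimate
mi <- vi + tau2           # marginal variance of each outcome
wi <- 1 / mi
mu <- sum(wi * yi) / sum(wi)

# Residuals and their variances: Var(y_i - mu) = m_i - 1 / sum(w)
ei  <- yi - mu
vei <- mi - 1 / sum(wi)

# Funnel-style plot: residuals on the x-axis, standard errors on an inverted y-axis
if (interactive()) {
  plot(ei, sqrt(vei), ylim = rev(range(c(0, sqrt(vei)))),
       xlab = "Residual Value", ylab = "Standard Error", pch = 19)
  abline(v = 0, lty = 2)
}
```

The weighted residuals sum to zero by construction, so the points scatter around a vertical line at 0 rather than around the pooled estimate.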

Related

Plotting the predictions of a mixed model as a line in R

I'm trying to plot the predictions (predict()) of my mixed model below such that I can obtain my conceptually desired plot as a line below.
I have tried to plot my model's predictions, but I don't achieve my desired plot. Is there a better way to define predict() so I can achieve my desired plot?
library(lme4)
dat3 <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/dat3.csv')
m4 <- lmer(math~pc1+pc2+discon+(pc1+pc2+discon|id), data=dat3)
newdata <- with(dat3, expand.grid(pc1=unique(pc1), pc2=unique(pc2), discon=unique(discon)))
y <- predict(m4, newdata=newdata, re.form=NA)
plot(newdata$pc1+newdata$pc2, y)
Another option is sjPlot. See the grid parameter to wrap several predictors in one plot.
library(lme4)
library(sjPlot)
library(patchwork)
dat3 <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/dat3.csv')
m4 <- lmer(math~pc1+pc2+discon+(pc1+pc2+discon|id), data=dat3) # Does not converge
m4 <- lmer(math~pc1+pc2+discon+(1|id), data=dat3) # Converges
# To remove discon
a <- plot_model(m4,type = 'pred')[[1]]
b <- plot_model(m4,type = 'pred',title = '')[[2]]
a + b
Edit 1: I had some trouble removing the discon term within the sjPlot framework. I gave up and fell back on patchwork. I'm sure Daniel knows the correct way.
As Magnus Nordmo suggests, this is very simple with sjPlot, which has some predefined functions for these types of plots.
library(lme4)
library(sjPlot) # provides plot_model()
dat3 <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/dat3.csv')
m4 <- lmer(math~pc1+pc2+discon+(pc1+pc2+discon|id), data=dat3)
plot_model(m4, type = 'pred', terms = c('pc1', 'pc2'),
ci.lvl = 0)
which gives the following result.
This plot shows the predicted values against pc1, with separate lines for different quantiles of the second term given in terms (here pc2). You could split up these plots and combine them using patchwork. The range of a term can be changed by using square brackets after the term in terms (e.g. pc1 [-10:1] for the interval between -10 and 1).
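One likely reason the plot(newdata$pc1+newdata$pc2, y) attempt in the question gives a cloud rather than a line is that expand.grid() crosses every observed value of every predictor. A common alternative is to vary one predictor over a sequence and hold the others at fixed values. The sketch below shows that pattern with lm() on simulated data so that it runs without lme4. The data and coefficients are made up, and with the mixed model the same kind of grid would be passed to predict(m4, newdata=grid, re.form=NA):

```r
set.seed(1)
# Simulated stand-in for dat3 (hypothetical values, not the real data)
n   <- 200
dat <- data.frame(pc1 = rnorm(n), pc2 = rnorm(n), discon = rbinom(n, 1, 0.5))
dat$math <- 50 + 3 * dat$pc1 - 2 * dat$pc2 + 1.5 * dat$discon + rnorm(n)

fit <- lm(math ~ pc1 + pc2 + discon, data = dat)

# Vary pc1 over its observed range; hold pc2 and discon at fixed values
grid <- data.frame(
  pc1    = seq(min(dat$pc1), max(dat$pc1), length.out = 50),
  pc2    = mean(dat$pc2),
  discon = 0
)
grid$pred <- unname(predict(fit, newdata = grid))

# One x value per prediction, so the result can be drawn as a line
if (interactive()) {
  plot(grid$pc1, grid$pred, type = "l", xlab = "pc1", ylab = "Predicted math")
}
```

Repeating this with a different fixed value of pc2 or discon and adding the result with lines() gives one line per scenario.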

Add raw data points to sjp.int (sjPlot)

For my manuscript, I plotted a lme with an interaction of two continuous variables:
Create data
mydata <- data.frame( SID=sample(1:150,400,replace=TRUE),age=sample(50:70,400,replace=TRUE), sex=sample(c("Male","Female"),200, replace=TRUE),time= seq(0.7, 6.2, length.out=400), Vol =rnorm(400),HCD =rnorm(400))
mydata$time <- as.numeric(mydata$time)
Run the model:
library(nlme)
model <- lme(HCD ~ age*time+sex*time+Vol*time, random=~time|SID, data=mydata)
Make plot:
library(sjPlot)
sjp.int(model, swap.pred=TRUE, show.ci=TRUE, mdrt.values="meansd")
The reviewer now wants me to add the raw data points to this plot. How can I do this? I tried adding geom_point() referring to mydata, but that is not possible.
Any ideas?
Update:
I thought that maybe I could extract the random slope of HCD and then residuals HCD for the covariates and also residuals Vol for the covariates and plot those two to make things easier (then I could plot the points in a 2D plot).
So, I tried to extract the slopes and use these to fit a linear regression, but the results are different: in the reproducible example the effect is less significant, and in my own data the interaction became non-significant (it was significant in the lme). Not sure what that means, or whether this just shows that I should not try to plot it this way.
get the slopes:
model <- lme(HCD ~ time, random=~time|SID, data=mydata)
slopes <- rbind(row.names(model$coefficients$random$SID), model$coef$random$SID[,2])
slopes2 <- data.frame(matrix(unlist(slopes), nrow=144, byrow=T))
names(slopes2)[1] <- "SID"
names(slopes2)[2] <- "slopes"
(save the slopes2 and reopen, because somehow R sees it as a factor)
Then create a cross-sectional dataframe and merge the slopes:
mydata$time2 <- round(mydata$time)
new <- reshape(mydata,idvar = "SID", timevar="time2", direction="wide")
newdata <- dplyr::left_join(new, slopes2, by="SID")
The lm:
modelw <- lm(slopes ~ age.1+sex.1+Vol.1, data=newdata)
Vol now has a p-value of 0.8 (previously this was 0.14)

What is the method for pooling when Paule-Mandel estimator is used in package metafor?

Consider the code below, which fits a random-effects model with the Paule-Mandel estimator for heterogeneity:
library(metafor)
res = rma(measure = "RD", ai = Ai, bi = Bi, ci = Ci, di = Di, data = data1, method="PM")
The metafor manual describes the pooling method for the case where the Hunter-Schmidt or DerSimonian-Laird estimators are used, but not for the Paule-Mandel estimator. Any hints?
The Paule-Mandel (PM) estimator is a method for estimating the amount of heterogeneity (usually denoted tau^2 in the meta-analytic literature). Once this variance component has been estimated, nothing different happens than with any of the other methods: We just compute the weighted average of the estimates, using 1/(sampling variance + tau^2) as the weights. To illustrate:
library(metafor)
dat <- escalc(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg)
res <- rma(yi, vi, data = dat, method="PM")
res
coef(res)
weighted.mean(dat$yi, 1/(dat$vi + res$tau2))
The last two lines give you the same value: -0.7149682.
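To make the two steps explicit, here is a base R sketch that first solves the Paule-Mandel estimating equation (the generalized Q-statistic set equal to k - 1) for tau^2 and then pools with the resulting weights. The data are hypothetical, the uniroot() search interval is an assumption, and this is not a claim about how metafor implements the estimator internally:

```r
# Hypothetical effect sizes and sampling variances (illustration only)
yi <- c(-0.9, -1.6, -1.3, -1.4, -0.2, -0.8, 0.0, 0.4)
vi <- c(0.33, 0.19, 0.41, 0.02, 0.05, 0.007, 0.07, 0.22)
k  <- length(yi)

# Generalized Q-statistic for a candidate value of tau^2
gen_q <- function(tau2) {
  wi <- 1 / (vi + tau2)
  mu <- sum(wi * yi) / sum(wi)
  sum(wi * (yi - mu)^2)
}

# Paule-Mandel: find tau^2 >= 0 with Q(tau^2) = k - 1
# (set to 0 if Q(0) is already at or below k - 1)
if (gen_q(0) <= k - 1) {
  tau2 <- 0
} else {
  tau2 <- uniroot(function(t) gen_q(t) - (k - 1),
                  interval = c(0, 100), tol = 1e-12)$root
}

# Pooling step: the same inverse-variance weighted average as for any other tau^2 estimator
wi     <- 1 / (vi + tau2)
pooled <- weighted.mean(yi, wi)
se     <- sqrt(1 / sum(wi))
c(tau2 = tau2, estimate = pooled, se = se)
```

Only the first half is specific to Paule-Mandel; swapping in a different tau^2 estimate leaves the pooling lines unchanged.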
Edit: The Mantel-Haenszel method also computes a weighted average. In the example above, escalc() computes the log risk ratios (and corresponding sampling variances) and we then compute the weighted mean based on the log risk ratios. The MH method works a bit differently in that it computes a weighted average based on the risk ratio values directly. To illustrate:
res <- rma.mh(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg)
res
exp(coef(res))
weighted.mean(exp(dat$yi), weights(res))
The last two lines both give the same value: 0.6352672.
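For the risk ratio, the Mantel-Haenszel weights can be written down directly. With a_i and c_i events among n1_i treated and n2_i control participants and N_i = n1_i + n2_i, weighting each study's risk ratio by c_i * n1_i / N_i reproduces the usual ratio-of-sums form of the MH estimator. A small base R check with made-up counts (hypothetical, not the BCG data):

```r
# Hypothetical 2x2 table counts per study (illustration only)
ai  <- c(5, 12, 8, 30)        # events, treatment group
n1i <- c(100, 250, 150, 400)  # group size, treatment
ci  <- c(10, 20, 9, 45)       # events, control group
n2i <- c(100, 240, 160, 410)  # group size, control
Ni  <- n1i + n2i

# Study-level risk ratios and Mantel-Haenszel weights
rr_i <- (ai / n1i) / (ci / n2i)
w_mh <- ci * n1i / Ni

# Weighted average of the risk ratios ...
rr_weighted <- weighted.mean(rr_i, w_mh)

# ... equals the ratio-of-sums form of the MH estimator
rr_mh <- sum(ai * n2i / Ni) / sum(ci * n1i / Ni)

c(weighted_mean = rr_weighted, ratio_of_sums = rr_mh)
```

The two numbers agree because the weight cancels the denominator of each study's risk ratio, leaving a_i * n2_i / N_i in the numerator sum.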

Predict out of sample using flexsurvreg in R

I have the following model in R
library(flexsurv)
data(ovarian)
model = flexsurvreg(Surv(futime, fustat) ~ ecog.ps + rx, data = ovarian, dist='weibull')
model
predict(model,data = ovarian, type = 'response')
The model summary looks like this: [image: flexsurvreg model output]
I am trying to predict the survival time using the predict function in R and get the following error: [image: error while trying to predict]
How can I predict expected lifetime using this flexsurvreg model?
I understand that the documentation mentions a totlos.fs function, but this data does not seem to have a trans variable that totlos.fs requires to provide an output.
If there is no other alternative to totlos.fs how can I create a trans variable in this data and handle it along with existing covariates?
Please advise.
Section 3 of the supplementary examples doc for the flexsurv documentation has an example in which the predicted values are calculated directly using the model equation. As you are using the Weibull distribution (with n=2 parameters) I believe this should work:
n <- 2 # number of Weibull distribution parameters (shape and scale)
pred.model <- model.matrix(model) %*% model$res[-(1:n),"est"]
Cheers
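The matrix product above gives a linear predictor rather than a survival time. Assuming flexsurv's Weibull uses the same (shape, scale) parameterisation as base R's dweibull(), with covariates acting on log(scale), the mean is scale * gamma(1 + 1/shape) and the median is scale * log(2)^(1/shape). The sketch below uses made-up parameter values (not estimates from the ovarian fit) and only base R:

```r
# Hypothetical estimates (illustration only; the real ones would come from model$res)
shape     <- 1.3
log_scale <- 7.0                       # assumed baseline log(scale)
beta      <- c(ecog.ps = -0.4, rx = 0.6)

# Two hypothetical patients; columns are in the same order as beta
x <- rbind(c(ecog.ps = 1, rx = 1),
           c(ecog.ps = 2, rx = 2))

# Linear predictor on the log(scale) axis, then back-transform
scale_i <- exp(log_scale + drop(x %*% beta))

# Expected and median survival time under a Weibull(shape, scale)
mean_time   <- scale_i * gamma(1 + 1 / shape)
median_time <- qweibull(0.5, shape = shape, scale = scale_i)

data.frame(scale = scale_i, mean_time, median_time)
```

Other quantiles follow the same pattern via qweibull(), so no trans variable or totlos.fs call is needed for this kind of summary.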
Nik,
I know your question is an old one, but see below how I hacked a way to do it. It involves retrieving the shape and rate parameters from your fitted model; then, instead of predict(), you use qgompertz() from flexsurv. Please excuse the use of my own encapsulated example code, but you should be able to follow along.
# generate the training data "lung1" from data(lung) in survival package
# hacked way for truncating the lung data to 2 years of follow up
require(survival)
lung$yrs <- lung$time/365
lung1 <- lung[c("status", "yrs")]
lung1$status[ lung1$yrs >2] <- 1
lung1$yrs[ lung1$yrs >2] <- 2
# from the training data build KM to obtain survival %s
s <- Surv(time=lung1$yrs, event=lung1$status)
km.lung <- survfit(s ~ 1, data=lung1)
plot(km.lung)
# generate dataframe to use later for plotting
cut.length <- sum((km.lung$time <= 2)) # so I can create example test data
test.data <- data.frame(yrs = km.lung$time[1:cut.length] , surv=round(km.lung$surv[1:cut.length], 3))
##
## doing the same as above with gompertz
##
require(flexsurv) #needed to run gompertz model
s <- Surv(time=lung1$yrs, event=lung1$status)
gomp <- flexsurvreg(s ~ 1, data=lung1, dist="gompertz") # run this to get shape and rate estimates for gompertz
gomp # notice the shape and rate values
# create variables for these values
g.shape <- 0.5866
g.rate <- 0.5816
##
## plot data and visualize the Gompertz fit
##
# vars for plotting
df1 <- test.data
xvar <- "yrs"
yvar <- "surv"
extendedtime <- 3 #
ylim1 <- c(0,1)
xlim1 <- c(0, extendedtime)
# plot the survival % for training data
plot(df1[,yvar]~df1[,xvar], type="S", ylab="", xlab="", lwd=3, xlim=xlim1, ylim=ylim1)
# Nik--here is where the magic happens... pay special attention to: qgompertz(seq(.01,.99,by=.01), shape=g.shape, rate=g.rate)
lines(qgompertz(seq(.01,.99,by=.01), shape=g.shape, rate=g.rate), seq(.99,.01,by=-.01), col="red", lwd=2, lty=2)
# generate a km curve from the testing data
s <- Surv(time=lung$yrs, event=lung$status)
km.lung <- survfit(s ~ 1, data=lung)
par(new=T)
# now draw remaining survival curve from the testing section
plot(km.lung$surv[(cut.length+1):length(km.lung$time)]~km.lung$time[(cut.length+1):length(km.lung$time)], type="S", col="blue", ylab="", xlab="", lwd=3, xlim=xlim1, ylim=ylim1)
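For reference, the quantile step can be reproduced without flexsurv. This assumes the parameterisation in which the hazard is rate * exp(shape * t), which is worth checking against ?qgompertz. Under that assumption the survival function and its inverse have closed forms. The function names below are hypothetical helpers, and the shape and rate are the values quoted above:

```r
# Gompertz survival function, assuming hazard h(t) = rate * exp(shape * t)
gompertz_surv <- function(t, shape, rate) {
  exp(-(rate / shape) * (exp(shape * t) - 1))
}

# Inverse of the CDF: time by which a proportion p has had the event (shape > 0)
gompertz_quantile <- function(p, shape, rate) {
  log(1 - (shape / rate) * log(1 - p)) / shape
}

g.shape <- 0.5866
g.rate  <- 0.5816

p     <- seq(0.01, 0.99, by = 0.01)
times <- gompertz_quantile(p, g.shape, g.rate)

# Pairing each time with 1 - p gives the points of the fitted survival curve
curve_df <- data.frame(yrs = times, surv = 1 - p)
head(curve_df, 3)
```

This is the same pairing used in the lines() call above: quantile times on the x-axis and the matching survival proportions, running from .99 down to .01, on the y-axis.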

Meta-analysis: Forest plot of summary estimates using metafor package

I am meta-analysing data from ~90 studies. This presents some challenges in how to display the data in an accessible format for publication. I would like to display only the overall effect size estimates of the different meta-analyses and exclude the study-specific estimates. I am able to do this in Stata using the metan package and adding the summaryonly command. Is it possible to suppress the study-level effect sizes in the forest plot outputs using the metafor package (or any other meta-analysis R package)?
I've been using the addpoly command to add the effect size estimates for sub-samples as described in the package documentation, e.g.:
res.a <- rma(n1i = Intervention_n, n2i = Control_n, m1i = intervention_d, m2i = control_d, sd1i = intervention_d_sd,
sd2i = control_d_sd, measure="MD", intercept=TRUE, data = Dataset.a, vtype="LS", method="DL", level=95,
digits=4, subset = (exclude==0 & child=="No"), slab=paste(Dataset.a$Label, Dataset.a$Year, sep=", "))
addpoly(res.a, row=7.5, cex=.75, font=3, mlab="Random effects model for subgroup")
If I understand you correctly, you are conducting several analyses with these ~90 studies (e.g., based on different subsets) and your goal is to show only the summary estimates (as based on these analyses) in a forest plot. Then the easiest approach would be to just collect the estimates and corresponding variances of the various analyses in a vector and then pass that to the forest() function. Let me give a simple example:
### load metafor package
library(metafor)
### load BCG vaccine dataset
data(dat.bcg)
### calculate log relative risks and corresponding sampling variances
dat <- escalc(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg)
### fit random-effects models to some subsets
res.r <- rma(yi, vi, data=dat, subset=alloc=="random")
res.s <- rma(yi, vi, data=dat, subset=alloc=="systematic")
res.a <- rma(yi, vi, data=dat, subset=alloc=="alternate")
### collect model estimates and corresponding variances
estimates <- c(coef(res.r), coef(res.s), coef(res.a))
variances <- c(vcov(res.r), vcov(res.s), vcov(res.a))
### create vector with labels
labels <- c("Random Allocation", "Systematic Allocation", "Alternate Allocation")
### forest plot
forest(estimates, variances, slab=labels)
If you don't like that the point sizes differ (by default, they are drawn inversely proportional to the variances), you could use:
forest(estimates, variances, slab=labels, psize=1)
A couple of other improvements:
forest(estimates, variances, slab=labels, psize=1, atransf=exp, xlab="Relative Risk (log scale)", at=log(c(.2, .5, 1, 2)))
ADDENDUM
In case you prefer polygon shapes for the estimates, you could do the following. First draw the plot as above, but use efac=0 to hide the vertical lines on the CIs. Then just draw over the summary polygons with addpoly():
forest(estimates, variances, slab=labels, psize=1, atransf=exp, xlab="Relative Risk (log scale)", at=log(c(.2, .5, 1, 2)), efac=0)
addpoly(estimates, variances, atransf=exp, rows=3:1, col="white", annotate=FALSE)
You can also use efac=1.5 in addpoly() to stretch the polygons vertically. Adjust the factor to your taste.
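The numbers behind each row of such a plot can also be computed by hand from an estimate and its variance, which is handy for checking the annotations or tabulating them alongside the figure. A base R sketch with made-up log risk ratios and variances (hypothetical values, not the BCG subgroup results), assuming normal-theory 95% intervals:

```r
# Hypothetical summary estimates (log risk ratios) and their variances
estimates <- c(-0.97, -0.43, -0.52)
variances <- c(0.071, 0.118, 0.196)
labels    <- c("Random Allocation", "Systematic Allocation", "Alternate Allocation")

# Wald-type 95% confidence limits on the log scale
z     <- qnorm(0.975)
se    <- sqrt(variances)
ci_lb <- estimates - z * se
ci_ub <- estimates + z * se

# Back-transform to the risk ratio scale (the transformation that atransf=exp
# applies to the axis labels and annotations)
summary_table <- data.frame(
  subgroup = labels,
  rr       = round(exp(estimates), 2),
  ci_lb    = round(exp(ci_lb), 2),
  ci_ub    = round(exp(ci_ub), 2)
)
summary_table
```

Because the intervals are built on the log scale and then exponentiated, they are not symmetric around the risk ratio itself.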
