I'm trying to plot the predictions (predict()) of my mixed model below so that I get my conceptually desired plot, shown as the line below.
I have tried plotting the model's predictions, but I don't get the plot I want. Is there a better way to set up predict() so that I can achieve it?
library(lme4)
dat3 <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/dat3.csv')
m4 <- lmer(math~pc1+pc2+discon+(pc1+pc2+discon|id), data=dat3)
newdata <- with(dat3, expand.grid(pc1=unique(pc1), pc2=unique(pc2), discon=unique(discon)))
y <- predict(m4, newdata=newdata, re.form=NA)
plot(newdata$pc1+newdata$pc2, y)
Another sjPlot option: see the grid argument, which wraps several predictors into one plot.
library(lme4)
library(sjPlot)
library(patchwork)
dat3 <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/dat3.csv')
m4 <- lmer(math~pc1+pc2+discon+(pc1+pc2+discon|id), data=dat3) # Does not converge
m4 <- lmer(math~pc1+pc2+discon+(1|id), data=dat3) # Converges
# To remove discon
a <- plot_model(m4,type = 'pred')[[1]]
b <- plot_model(m4,type = 'pred',title = '')[[2]]
a + b
Edit 1: I had some trouble removing the discon term within the sjPlot framework. I gave up and fell back on patchwork. I'm sure Daniel knows the correct way.
As Magnus Nordmo suggests, this is very simple with sjPlot, which has predefined functions for these types of plots.
library(lme4)
library(sjPlot) # provides plot_model()
dat3 <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/dat3.csv')
m4 <- lmer(math~pc1+pc2+discon+(pc1+pc2+discon|id), data=dat3)
plot_model(m4, type = 'pred', terms = c('pc1', 'pc2'),
ci.lvl = 0)
which gives the following result.
This plot shows the predictions against pc1, with separate lines for different quantiles of the second term in terms (here pc2). You could instead split these into separate plots and combine them with patchwork. The range of a term can be changed by adding square brackets after its name in terms (e.g. pc1 [-10:1] for the interval from -10 to 1).
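If you would rather stay with predict(), one common approach is to vary a single predictor over its range while holding the others fixed, instead of crossing all unique values with expand.grid(). The sketch below is a minimal illustration under stated assumptions: it uses simulated data as a stand-in for dat3 (the data, effect sizes, and the random-intercept-only model are made up for the example), and it holds pc2 and discon at their means.

```r
library(lme4)

# Simulated stand-in for dat3 (hypothetical data, not the original file)
set.seed(1)
n_id <- 30
n_obs <- 10
sim <- data.frame(
  id     = factor(rep(seq_len(n_id), each = n_obs)),
  pc1    = rnorm(n_id * n_obs),
  pc2    = rnorm(n_id * n_obs),
  discon = rbinom(n_id * n_obs, 1, 0.5)
)
u <- rnorm(n_id, sd = 2)  # one random intercept per id
sim$math <- 50 + 3 * sim$pc1 - 2 * sim$pc2 + 4 * sim$discon +
  u[as.integer(sim$id)] + rnorm(nrow(sim))

m <- lmer(math ~ pc1 + pc2 + discon + (1 | id), data = sim)

# Vary pc1 over its observed range; hold the other predictors at their means
nd <- data.frame(
  pc1    = seq(min(sim$pc1), max(sim$pc1), length.out = 50),
  pc2    = mean(sim$pc2),
  discon = mean(sim$discon)
)
nd$fit <- predict(m, newdata = nd, re.form = NA)  # fixed effects only

# One line: predicted math as a function of pc1
plot(nd$pc1, nd$fit, type = "l", xlab = "pc1", ylab = "predicted math")
```

Repeating the same construction with pc2 varying and pc1 held fixed gives the second panel.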
I'm trying to make a predicted probability plot for a logit model, using clustered robust standard errors. Supposedly the margins package should let you do this, using cplot(), but there seems to be a bug, such that cplot() doesn't recognize the optional vcov input. Below is a minimum working example. Does anyone know how to fix the bug or do this another way?
require("margins")
require("sandwich")
##Generating random numbers
set.seed(10)
y<-factor(rbinom(n=1000,size=1,prob=.5))
x <- rnorm(n=1000, mean=100,sd=1)
z<- rbinom(n=1000,size=3,prob=.5)
#creating a "dataset"
dta<-data.frame(x,y,z)
##Basic logit model
model <-glm(y~x,family="binomial"(link="logit"),data=dta)
##Creating variance-covariance matrix, clustered by z
vcov <- vcovCL(model, cluster=z)
##Making a plot
cplot(model,"x",vcov=vcov,what="prediction")
#can see below that vcov has no effect (if not obvious from plot)
print(cplot(model,"x",vcov=vcov,what="prediction",draw=FALSE))
print(cplot(model,"x",what="prediction",draw=FALSE))
You could use the following code:
# Predict values
pred.dta <- ggeffects::ggpredict(
model=model,
terms="x [all]",
vcov.fun="vcovCL",
vcov.type="HC1",
vcov.args=list(cluster=z)
)
# Plot predictions
ggplot2::ggplot(data=pred.dta,
ggplot2::aes(x=x, y=predicted))+
ggplot2::geom_line()+
ggplot2::geom_errorbar(ggplot2::aes(ymin=conf.low, ymax=conf.high), width=.1)
For comparison, this is the same code but without the clustered errors:
# Predict values
pred.dta <- ggeffects::ggpredict(
model=model,
terms="x [all]" )
# Plot predictions
ggplot2::ggplot(data=pred.dta,
ggplot2::aes(x=x, y=predicted))+
ggplot2::geom_line()+
ggplot2::geom_errorbar(ggplot2::aes(ymin=conf.low, ymax=conf.high), width=.1)
I want to find the correlation between the fitted curve plotted below and the data. Does anyone know how to do that?
library(drc)
S.alba.m1 <- drm(DryMatter~Dose, Herbicide, data = S.alba, fct = LL.4())
plot(S.alba.m1)
Use fitted:
cor(S.alba$DryMatter, fitted(S.alba.m1))
Also check out the modelFit function: ?modelFit for various tests.
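For intuition about what this correlation measures, here is a minimal base-R sketch. It deliberately uses a simulated linear fit with lm() rather than drc, so the data and model are assumptions made for illustration. For a linear model with an intercept, the squared correlation between observed and fitted values equals R-squared; for a nonlinear fit such as drm() it is only a rough, pseudo-R-squared style summary.

```r
# Simulated data (hypothetical), fitted with a plain linear model
set.seed(42)
x <- runif(50, 0, 10)
y <- 2 + 1.5 * x + rnorm(50)
fit <- lm(y ~ x)

# Correlation between the observed response and the fitted values
r <- cor(y, fitted(fit))

# For lm() with an intercept these two numbers agree
print(c(r_squared_from_cor = r^2, r_squared_from_lm = summary(fit)$r.squared))
```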
I have these dose response data:
df <- data.frame(viability=c(14,81,58,78,71,83,64,16,32,100,100,81,86,83,100,90,15,100,38,100,91,84,92,100),
dose=c(10,0.62,2.5,0.16,0.039,0.0024,0.0098,0.00061,10,0.62,2.5,0.16,0.039,0.0024,0.0098,0.00061,10,0.62,2.5,0.16,0.039,0.0024,0.0098,0.00061),
stringsAsFactors=F)
I then use the drc package's drm function to fit a log-logistic curve to these data:
library(drc)
fit <- drm(viability~dose,data=df,fct=LL.4(names=c("slope","low","high","ED50")),type="continuous")
> summary(fit)
Model fitted: Log-logistic (ED50 as parameter) (4 parms)
Parameter estimates:
                   Estimate Std. Error  t-value p-value
slope:(Intercept)   5.15328   18.07742  0.28507  0.7785
low:(Intercept)    20.19430   12.61122  1.60130  0.1250
high:(Intercept)   83.33181    4.96736 16.77586  0.0000
ED50:(Intercept)    2.98733    1.99685  1.49602  0.1503
Residual standard error:
 21.0743 (20 degrees of freedom)
I then generate predictions so I'll be able to plot the curve:
pred.df <- expand.grid(dose=exp(seq(log(max(df$dose)),log(min(df$dose)),length=100)))
pred <- predict(fit,newdata=pred.df,interval="confidence")
pred.df$viability <- pmax(pred[,1],0)
pred.df$viability <- pmin(pred.df$viability,100)
pred.df$viability.low <- pmax(pred[,2],0)
pred.df$viability.low <- pmin(pred.df$viability.low,100)
pred.df$viability.high <- pmax(pred[,3],0)
pred.df$viability.high <- pmin(pred.df$viability.high,100)
I also use the PharmacoGx Bioconductor package to compute AUC and IC50 for both the curve and its high and low bounds:
library(PharmacoGx)
auc.mid <- computeAUC(rev(pred.df$dose),rev(pred.df$viability))/((max(pred.df$viability)-min(pred.df$viability))*(max(pred.df$dose)-min(pred.df$dose)))
auc.low <- computeAUC(rev(pred.df$dose),rev(pred.df$viability.low))/((max(pred.df$viability.low)-min(pred.df$viability.low))*(max(pred.df$dose)-min(pred.df$dose)))
auc.high <- computeAUC(rev(pred.df$dose),rev(pred.df$viability.high))/((max(pred.df$viability.high)-min(pred.df$viability.high))*(max(pred.df$dose)-min(pred.df$dose)))
ic50.mid <- computeIC50(rev(pred.df$dose),rev(pred.df$viability))
ic50.low <- computeIC50(rev(pred.df$dose),rev(pred.df$viability.low))
ic50.high <- computeIC50(rev(pred.df$dose),rev(pred.df$viability.high))
Creating a table with all the parameters so I can plot everything together:
ann.df <- data.frame(param=c("slope","low","high","ED50","auc.mid","auc.high","auc.low","ic50.mid","ic50.high","ic50.low"),value=signif(c(summary(fit)$coefficient[,1],auc.mid,auc.high,auc.low,ic50.mid,ic50.high,ic50.low),2),stringsAsFactors=F)
And finally plotting it all:
library(ggplot2)
library(grid)
library(gridExtra)
library(cowplot) # provides ggdraw() and draw_grob()
pl <- ggplot(df,aes(x=dose,y=viability))+geom_point()+geom_ribbon(data=pred.df,aes(x=dose,y=viability,ymin=viability.low,ymax=viability.high),alpha=0.2)+labs(y="viability")+
geom_line(data=pred.df,aes(x=dose,y=viability))+coord_trans(x="log")+theme_bw()+scale_x_continuous(name="dose",breaks=sort(unique(df$dose)),labels=format(signif(sort(unique(df$dose)),3),scientific=T))
ggdraw(pl)+draw_grob(tableGrob(ann.df,rows=NULL),x=0.1,y=0.175,width=0.3,height=0.4)
Which gives:
My questions are:
I thought that slope should be negative. How come it's 5.2?
The auc.mid, auc.high, and auc.low computed as:
auc.mid <- computeAUC(rev(pred.df$dose),rev(pred.df$viability))
auc.low <- computeAUC(rev(pred.df$dose),rev(pred.df$viability.low))
auc.high <- computeAUC(rev(pred.df$dose),rev(pred.df$viability.high))
give 21.47818, 37.52389, and 2.678228, respectively.
Since these are not in the [0,1] range, I thought that dividing them by the area under the highest corresponding viability would give what I'm looking for, i.e., relative AUC, but these values seem too low relative to what the figure shows. What are these AUCs then?
Also, how come auc.mid > auc.low > auc.high? I would think that it should be auc.high > auc.mid > auc.low
The IC50 values also seem a little low. Do they make sense?
Bonus question: how do I avoid the trailing zeros in slope, low, high, ED50, ic50.mid, and ic50.high in the figure?
The parameter you are pulling out is the hill slope parameter, or the coefficient in front of the concentration variable in the exponential, not the actual slope of the curve.
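To make this concrete, here is a small base-R sketch of the four-parameter log-logistic curve, assuming the parameterization documented for drc's LL.4(): f(x) = c + (d - c) / (1 + exp(b * (log(x) - log(e)))). The helper name ll4 is made up for this example, and the parameter values are the rounded estimates from the summary above. With a positive b the curve still decreases with dose, which is why a Hill slope of about 5.2 is compatible with the falling curve in the plot.

```r
# Four-parameter log-logistic curve (hypothetical helper, not part of drc):
# b = Hill slope, c = lower limit, d = upper limit, e = ED50
ll4 <- function(x, b, c, d, e) {
  c + (d - c) / (1 + exp(b * (log(x) - log(e))))
}

dose <- c(0.01, 0.1, 1, 10, 100)
resp <- ll4(dose, b = 5.15, c = 20.19, d = 83.33, e = 2.99)

# Response falls as dose rises, even though b is positive
print(data.frame(dose = dose, response = resp))

# At dose == e the curve sits halfway between the lower and upper limits
print(ll4(2.99, b = 5.15, c = 20.19, d = 83.33, e = 2.99))
```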
The AUC provided is in the [0, 100] range and is the area above the viability curve. I ran the code and got the order auc.low > auc.mid > auc.high. Traditionally, the area under the response curve (that is, 1 - viability) was reported.
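The ordering follows directly from measuring the area above the curve: a lower viability curve leaves more area above it. The base-R sketch below illustrates this with a simple trapezoidal rule. It is not PharmacoGx's implementation; the helper names, the log10 dose grid, and the three made-up curves are all assumptions chosen for the example.

```r
# Trapezoidal area under y over x (hypothetical helper)
trapz <- function(x, y) sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

log_dose <- seq(-3, 1, length.out = 101)      # log10 dose grid (made up)
mid  <- 20 + 60 / (1 + exp(3 * log_dose))     # viability in percent
low  <- pmax(mid - 10, 0)                     # lower band, clamped to [0, 100]
high <- pmin(mid + 10, 100)                   # upper band, clamped to [0, 100]

# Area above the viability curve, normalized to the [0, 100] range
area_above <- function(v) 100 - trapz(log_dose, v) / diff(range(log_dose))

aac <- sapply(list(low = low, mid = mid, high = high), area_above)
print(aac)  # the lowest curve has the largest area above it
```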
It is important to note that the PharmacoGx package uses a 3 parameter hill slope model, similar to LL.3 in drc. Therefore, the plot will not correspond to the function fit by PharmacoGx to calculate the IC50 or AUC.
Source: PharmacoGx dev.
Does anyone know a good way to visualize the smooth component of gam models in R? I would really like to get something like the output of the visreg function. The code below illustrates my problem:
library(gam)
f=function(v){exp(v)}
n=100
x=runif(n)
t=runif(n)
y=x+f(t)+rnorm(n, sd=0.1)
fit=gam(y~x+s(t))
plot(t,y)
lines(t,as.numeric(fit$smooth))
#want something more like
library(visreg)
visreg(fit)
You could use the plotting method for gam objects, but you'd have to use the data parameter of gam:
library(gam)
f <- function(v){exp(v)}
n <- 100
x <- runif(n)
t <- runif(n)
y <- x+f(t)+rnorm(n, sd=0.1)
DF <- data.frame(y, x, t)
fit <- gam(y~x+s(t), data = DF)
layout(t(1:2))
plot(fit, se=TRUE)
See help("plot.gam") for other options.