I have a multinomial logistic regression model built using multinom() function from nnet package in R. I have a 7 class target variable and I want to plot the coefficients that the variables included in the model have for each class of my dependent variable.
For a binary logistic regression I used coefplot() function from arm package, but I don't know how to do this for a multiclass problem.
I want my plots to look like this:
I couldn't easily find a sensible multinom() example: the one below gives ridiculous values, but the structure of the code should work anyway. The basic idea is to use broom::tidy() to extract coefficients and ggplot/ggstance to plot them. ggstance is specifically for plotting horizontal point-ranges and displacing them from each other an appropriate amount; this can also be done via coord_flip(), but coord_flip() induces a certain lack of flexibility (e.g. it can't easily be combined with faceting).
library(nnet)
library(broom)
library(ggplot2); theme_set(theme_bw())
library(ggstance)
Create example multinom() fit:
nvars <- c("mpg","disp","hp")
mtcars_sc <- mtcars
mtcars_sc[nvars] <- scale(mtcars_sc[nvars])
m <- multinom(cyl~mpg+hp+disp,mtcars_sc,
maxit=1e4)
Extract coefficients and drop intercept terms:
tt <- broom::tidy(m,conf.int=TRUE)
tt <- dplyr::filter(tt, term!="(Intercept)")
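For readers who would rather not depend on broom, roughly the same table can be assembled by hand. This is only a sketch: it assumes Wald intervals (estimate plus or minus about 1.96 standard errors) built from summary(m), and the name tt_manual is made up for illustration.

```r
library(nnet)

# Example fit on scaled mtcars predictors
nvars <- c("mpg", "disp", "hp")
mtcars_sc <- mtcars
mtcars_sc[nvars] <- scale(mtcars_sc[nvars])
m <- multinom(cyl ~ mpg + hp + disp, data = mtcars_sc,
              maxit = 1e4, trace = FALSE)

# summary() returns coefficient and standard-error matrices:
# one row per non-reference class, one column per term
s   <- summary(m)
est <- s$coefficients
se  <- s$standard.errors

# as.vector() unrolls a matrix column by column, so class labels cycle fastest
tt_manual <- data.frame(
  y.level   = rep(rownames(est), times = ncol(est)),
  term      = rep(colnames(est), each = nrow(est)),
  estimate  = as.vector(est),
  std.error = as.vector(se)
)

# Wald confidence limits (an assumption of this sketch)
z <- qnorm(0.975)
tt_manual$conf.low  <- tt_manual$estimate - z * tt_manual$std.error
tt_manual$conf.high <- tt_manual$estimate + z * tt_manual$std.error

# Drop intercept terms
tt_manual <- subset(tt_manual, term != "(Intercept)")
```

The resulting data frame has the same column names that the plotting code expects (y.level, term, estimate, conf.low, conf.high).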
Plot:
ggplot(tt, aes(x=estimate,y=term,colour=y.level))+
geom_pointrangeh(aes(xmin=conf.low,
xmax=conf.high),
position=position_dodgev(height=0.75))
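With seven classes, dodged point-ranges can get crowded. As a possible alternative, the sketch below puts each class in its own facet. It assumes ggplot2 3.3.0 or later, where geom_pointrange() accepts xmin/xmax directly, so ggstance is not needed here.

```r
library(nnet)
library(broom)
library(ggplot2)

# Example fit on scaled mtcars predictors
nvars <- c("mpg", "disp", "hp")
mtcars_sc <- mtcars
mtcars_sc[nvars] <- scale(mtcars_sc[nvars])
m <- multinom(cyl ~ mpg + hp + disp, data = mtcars_sc,
              maxit = 1e4, trace = FALSE)

# Coefficients with confidence intervals, intercepts dropped
tt <- tidy(m, conf.int = TRUE)
tt <- tt[tt$term != "(Intercept)", ]

# One panel per class of the response instead of dodging
p <- ggplot(tt, aes(x = estimate, y = term)) +
  geom_vline(xintercept = 0, linetype = 2) +
  geom_pointrange(aes(xmin = conf.low, xmax = conf.high)) +
  facet_wrap(~ y.level)
p
```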
Given that you're able to get your data like this:
coeff <- factor(1:7,labels=c("inc", "lwg", "hcyes", "wcyes","age", "k618", "k5"))
values <- c(-0.1,0.6,0.15,0.8,-0.05,-0.05,-1.5)
upper <- c(-0.1,1,.6,1.3,-.05,.1,-1)
lower <- c(-0.1,.2,-.2,.3,-.05,-.2,-2)
df <- data.frame(coeff,values,upper,lower)
Then all you have to do is run:
library(ggplot2)
ggplot(df, aes(x=coeff, y=values, ymin=lower, ymax=upper)) +
geom_pointrange() +
geom_hline(yintercept=0, linetype=2)+
coord_flip()
The result should look like this:
You can experiment with the plotting options to get it to look identical to your example.
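One way to obtain a data frame of that shape from a fitted model is sketched below. It assumes a binary logistic regression on the built-in mtcars data (a stand-in, not the asker's model) and Wald intervals from confint.default().

```r
library(ggplot2)

# Stand-in model: binary logistic regression on built-in data
fit <- glm(am ~ hp + wt, data = mtcars, family = binomial)

# Wald intervals: a matrix with one row per term (lower, upper)
ci <- confint.default(fit)

df <- data.frame(
  coeff  = names(coef(fit)),
  values = unname(coef(fit)),
  lower  = ci[, 1],
  upper  = ci[, 2]
)

# Drop the intercept so it does not dominate the axis
df <- df[df$coeff != "(Intercept)", ]

p <- ggplot(df, aes(x = coeff, y = values, ymin = lower, ymax = upper)) +
  geom_pointrange() +
  geom_hline(yintercept = 0, linetype = 2) +
  coord_flip()
p
```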
I'm trying to plot the predictions (predict()) of my mixed model below so that I obtain my conceptually desired plot, shown as a line below.
I have tried to plot my model's predictions, but I don't achieve my desired plot. Is there a better way to define predict() so I can achieve my desired plot?
library(lme4)
dat3 <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/dat3.csv')
m4 <- lmer(math~pc1+pc2+discon+(pc1+pc2+discon|id), data=dat3)
newdata <- with(dat3, expand.grid(pc1=unique(pc1), pc2=unique(pc2), discon=unique(discon)))
y <- predict(m4, newdata=newdata, re.form=NA)
plot(newdata$pc1+newdata$pc2, y)
Another sjPlot option: see the grid argument to wrap several predictors in one plot.
library(lme4)
library(sjPlot)
library(patchwork)
dat3 <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/dat3.csv')
m4 <- lmer(math~pc1+pc2+discon+(pc1+pc2+discon|id), data=dat3) # Does not converge
m4 <- lmer(math~pc1+pc2+discon+(1|id), data=dat3) # Converges
# To remove discon
a <- plot_model(m4,type = 'pred')[[1]]
b <- plot_model(m4,type = 'pred',title = '')[[2]]
a + b
Edit 1: I had some trouble removing the discon term within the sjPlot framework, so I gave up and fell back on patchwork. I'm sure Daniel knows the correct way.
As Magnus Nordmo suggests, this is very simple with sjPlot, which has predefined functions for these types of plots.
library(lme4)
library(sjPlot)  # provides plot_model()
dat3 <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/dat3.csv')
m4 <- lmer(math~pc1+pc2+discon+(pc1+pc2+discon|id), data=dat3)
plot_model(m4, type = 'pred', terms = c('pc1', 'pc2'),
ci.lvl = 0)
which gives the following result.
This plot shows the predictions along pc1 for different quantiles of the second term in terms (here pc2). You could split up these plots and combine them using patchwork, and the plotted interval can be changed by adding square brackets after the term in terms (e.g. pc1 [-10:1] for the interval between -10 and 1).
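The population-level predictions behind such a plot can also be computed by hand with predict(..., re.form = NA). The sketch below uses the sleepstudy data that ships with lme4 as a stand-in for dat3, so the variable names differ from the question.

```r
library(lme4)
library(ggplot2)

# Stand-in data and model: sleepstudy ships with lme4
m <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Grid over the predictor of interest
newdata <- data.frame(Days = 0:9)

# re.form = NA drops the random effects, giving fixed-effect predictions
newdata$pred <- as.numeric(predict(m, newdata = newdata, re.form = NA))

# One line for the population-level trend
p <- ggplot(newdata, aes(Days, pred)) +
  geom_line()
p
```

With several predictors, the grid would need one row per combination of values, with the predictors not shown on the x axis held at chosen reference values.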
I am using the function ggpredict to display a lmer model's result.
The model has a continuous response (RT), one continuous predictor (RC1) and 4 discrete factors (2x2x2x14).
Model:
SailorJupiter <- lmer(RT~RC1*m2*m3*m5*m4 + (1|Trial:sonTrial) + (1|Subject) + (1|Trial) + (1|sonleft) + (1|sonright), data=audiostim, REML=FALSE)
library(see)
library(ggeffects)
a <- ggpredict(SailorJupiter, c("RC1","m2","m3","m4","m5"), dependencies=TRUE)
plot(a)
Example of plot without the 14-levels factor because it's too big
Question 1:
I'd like to have results with groups being a combination of m3 and m4 in order to simplify the graphs. I tried:
a <- ggpredict(SailorJupiter, c("RC1","m2","m3:m4","m5"), dependencies=TRUE)
plot(a)
But it doesn't work.
Question 2: Is there a way to use only one level of a factor in order to simplify the plot? I know some other plotting packages allow it, but I can't find it in ggpredict().
I am attempting to create an adjusted survival curve (from a Cox model) and would like to display this information as cumulative events.
I have attempted this:
library(survival)
data("ovarian")
library(survminer)
model<-coxph(Surv(futime, fustat) ~ age + strata(rx), data=ovarian)
gplot<-ggadjustedcurves(model) ## Expected plot of adjusted survival curve
Because "fun=" has still not been implemented in ggadjustedcurves, I took the advice of a user on this page and extracted the elements into plotdata, then created a new column as shown below.
library(dplyr)     # mutate()
library(magrittr)  # %<>% assignment pipe
plotdata <- gplot$data
plotdata %<>%
mutate(new = 1 - surv) ## 1-survival probability
I am new to R and ggplot, so how can I plot the adjusted curve from the newly created column while keeping the theme of the original plot (contained in gplot)?
Thanks!
Edit:
My current solution is as follows.
library(rms)
model<-coxph(Surv(futime, fustat) ~ age+ strata(rx), data=ovarian)
survfit(model, conf.type = "plain", conf.int = 1)
plot(survfit(model), conf.int = T,col = c(1,2), fun='event')
This gives the curve I wanted; however, I am not sure whether the confidence bars really are the standard errors (+/-1). I supplied 1 to the conf.int argument and believe this produces the standard errors, since conf.type is specified as plain.
How can I further customize this plot? The base graph looks rather bland. How do I get a display as close as possible to the survminer curves?
You can use the adjustedCurves package instead, which allows both plotting confidence intervals and naturally includes an option to display cumulative incidence functions. First, install it using:
devtools::install_github("https://github.com/RobinDenz1/adjustedCurves")
Now you can use:
library(adjustedCurves)
library(survival)
library(riskRegression)
# needs to be a factor
ovarian$rx <- factor(ovarian$rx)
# needs to include x=TRUE
model <- coxph(Surv(futime, fustat) ~ age + strata(rx), data=ovarian, x=TRUE)
adj <- adjustedsurv(data=ovarian,
event="fustat",
ev_time="futime",
variable="rx",
method="direct",
outcome_model=model,
conf_int=TRUE)
plot(adj, cif=TRUE, conf_int=TRUE)
Which produces:
I would probably not use this method here, though. Simulation studies have shown that the Cox-regression-based method performs badly with small sample sizes. You might want to take a look at method="iptw" or method="aiptw" inside the adjustedCurves package instead.
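For comparison, the 1 - survival transformation can also be computed directly from survfit() with the survival package alone. This is a rough sketch, not the adjustedCurves method: the curves are per stratum at the mean of age, and the names cum and event are made up for illustration.

```r
library(survival)
library(ggplot2)

model <- coxph(Surv(futime, fustat) ~ age + strata(rx), data = ovarian)

# One survival curve per stratum, evaluated at the mean of age
sf <- survfit(model)

# sf$strata holds the number of time points in each stratum
cum <- data.frame(
  time   = sf$time,
  strata = rep(names(sf$strata), times = sf$strata),
  event  = 1 - sf$surv  # cumulative event probability
)

# Step plot of cumulative events, one colour per stratum
p <- ggplot(cum, aes(time, event, colour = strata)) +
  geom_step() +
  labs(y = "Cumulative event probability")
p
```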
Let's say that I have some data and I have created a linear model to fit the data. Then I plot the data using ggplot2 and I want to add the linear model to the plot. As far as I know, this is the standard way of doing it (using the built-in cars dataset):
library(ggplot2)
fit <- lm(dist ~ speed, data = cars)
summary(fit)
p <- ggplot(cars, aes(speed, dist))
p <- p + geom_point()
p <- p + geom_smooth(method='lm')
p
However, the above violates the DRY principle ('don't repeat yourself'): it involves creating the linear model in the call to lm and then recreating it in the call to geom_smooth. This seems inelegant to me, and it also introduces a space for bugs. For example, if I change the model that is created with lm but forget to change the model that is created with geom_smooth, then the summary and the plot won't be of the same model.
Is there a way of using ggplot2 to plot an already existing linear model, e.g. by passing the lm object itself to the geom_smooth function?
What one needs to do is to create a new data frame with the observations from the old one plus the predicted values from the model, then plot that dataframe using ggplot2.
library(ggplot2)
# create and summarise model
cars.model <- lm(dist ~ speed, data = cars)
summary(cars.model)
# add 'fit', 'lwr', and 'upr' columns to dataframe (generated by predict)
cars.predict <- cbind(cars, predict(cars.model, interval = 'confidence'))
# plot the points (actual observations), regression line, and confidence interval
p <- ggplot(cars.predict, aes(speed,dist))
p <- p + geom_point()
p <- p + geom_line(aes(speed, fit))
p <- p + geom_ribbon(aes(ymin=lwr,ymax=upr), alpha=0.3)
p
The great advantage of doing this is that if one changes the model (e.g. cars.model <- lm(dist ~ poly(speed, 2), data = cars)) then the plot and the summary will both change.
Thanks to Plamen Petrov for making me realise what was needed here. As he points out, this approach will only work if predict is defined for the model in question; if not, one has to define it oneself.
I believe you want to do something along the lines of:
library(ggplot2)
# install.packages('dplyr')
library(dplyr)
fit <- lm(dist ~ speed, data = cars)
cars %>%
mutate( my_model = predict(fit) ) %>%
ggplot() +
geom_point( aes(speed, dist) ) +
geom_line( aes(speed, my_model) )
This will also work for more complex models as long as the corresponding predict method is defined. Otherwise you will need to define it yourself.
In the case of a linear model, you can add the confidence/prediction bands with slightly more work and reproduce your plot.
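A minimal sketch of the prediction-band variant, assuming the same cars model. The names cars_fit and bands are made up for illustration, and newdata is passed explicitly so that predict() does not warn about predicting on the current data.

```r
library(ggplot2)

cars_fit <- lm(dist ~ speed, data = cars)

# predict() with interval = "prediction" returns fit, lwr and upr columns
bands <- cbind(cars,
               predict(cars_fit, newdata = cars, interval = "prediction"))

p <- ggplot(bands, aes(speed, dist)) +
  geom_ribbon(aes(ymin = lwr, ymax = upr), alpha = 0.2) +  # prediction band
  geom_point() +                                           # observations
  geom_line(aes(y = fit))                                  # fitted line
p
```

Switching to interval = "confidence" gives the narrower band for the mean response instead.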
I am trying to show that there is a weird "bump" in some data I am analysing (it is to do with market share). My code is here:
qplot(Share, Rate, data = Dataset3, geom=c("point", "smooth"))
(I appreciate that this is not very useful code without the dataset).
Is there any way that I can get the numeric vector used to generate the smoothed line out of R? I just need that layer to try to fit a model to the smoothed data.
Any help gratefully received.
Yes, there is. For data sets with fewer than 1,000 observations, ggplot uses the function loess as the default smoother in geom_smooth. This means you can use loess directly to estimate your smoothing parameters.
Here is an example, adapted from ?loess :
qplot(speed, dist, data=cars, geom="smooth")
Use loess to estimate the smoothed data, and predict for the estimated values:
cars.lo <- loess(dist ~ speed, cars)
pc <- predict(cars.lo, data.frame(speed = seq(4, 25, 1)), se = TRUE)
The estimates are now in pc$fit and the standard errors in pc$se.fit. The following bit of code extracts the fitted values into a data.frame and then plots them using ggplot:
pc_df <- data.frame(
x=4:25,
fit=pc$fit)
ggplot(pc_df, aes(x=x, y=fit)) + geom_line()
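Another option, sketched here as an alternative to refitting, is to pull the values that ggplot2 itself computed for the smooth layer with layer_data(). The sketch assumes the smoother is the second layer of the plot and requests loess explicitly.

```r
library(ggplot2)

p <- ggplot(cars, aes(speed, dist)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x)

# Layer 2 is the smooth; x and y are the points of the fitted curve,
# ymin and ymax the limits of the confidence band
smooth_df <- layer_data(p, 2)
head(smooth_df[, c("x", "y", "ymin", "ymax")])
```

The x values form an evenly spaced grid over the range of the data, so they need not coincide with the observed speeds.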