How to add model name to the regression plot - r

I'm trying to plot regression coefficients, and I get a plot like this.
For example, I would like to add the model name at the top of the plot, such as
`PC2 ~ Index + Lane + Gen`
How can I do that? I would like to add each model's name to its respective plot.
The code I'm using:
plot_list = list()
for (i in seq(length(bb))) {
p = ggcoefstats(bb[[i]])
plot_list[[i]] = p
}
pdf("plots1.pdf",height = 10,width = 6)
for (i in seq(length(bb))) {
print(plot_list[[i]])
}
dev.off()
My data bb, which is my model output:
> bb
$`PC2 ~ Sex + Index + Lane`
Call:
lm(formula = x, data = mrna.pcs)
Coefficients:
(Intercept)        SexM  IndexAR002  IndexAR003  IndexAR004  IndexAR005  IndexAR006  IndexAR007  IndexAR008  IndexAR009  IndexAR010
     0.8055    -11.3695      2.6964     -7.9438     -1.7453    -10.5309    -10.7135     -9.8775      4.4912      0.7830     -4.8674
 IndexAR011  IndexAR012  IndexAR013  IndexAR014  IndexAR015  IndexAR016  IndexAR018  IndexAR019  IndexAR020  IndexAR021  IndexAR022
    -8.1402    -10.6590     -8.1678      1.0441     -0.4174      7.2952     12.9489     -7.4206     -6.6895    -10.6862      4.9863
 IndexAR023  IndexAR025  IndexAR027       Lane2       Lane3       Lane4       Lane5       Lane6       Lane7       Lane8
     1.5614    -15.7488     -1.5925     12.3677    -10.3617    -55.5894     25.6420     34.1251     42.4888     16.1013
Any suggestions or help would be really appreciated.

I was not previously familiar with the package you're using to create this visual, ggstatsplot, which builds on ggplot2 rather than being plain ggplot2. According to this page, ggcoefstats has a title parameter that you should use within the function call: https://indrajeetpatil.github.io/ggstatsplot/reference/ggcoefstats.html
You might still benefit from building the plot directly in ggplot2, however. The first thing I thought when I saw your visual was "this needs ggrepel", which automatically moves text labels around so that they do not overlap. Being able to read those coefficient values would seem to be very important. AFAIK, ggrepel only works with ggplot2, not with other graphing libraries.
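As a rough sketch of the title suggestion above: if bb is a named list whose names are the model formulas (as the printed output in the question suggests), each name can be passed to the title argument inside the loop. The two mtcars models below are hypothetical stand-ins for the asker's models, and the sketch assumes ggstatsplot is installed.

```r
library(ggstatsplot)

# Hypothetical stand-in for `bb`: a named list of fitted models,
# where each name is the model formula
bb <- list(
  "mpg ~ wt + hp"   = lm(mpg ~ wt + hp, data = mtcars),
  "mpg ~ wt + qsec" = lm(mpg ~ wt + qsec, data = mtcars)
)

plot_list <- vector("list", length(bb))
for (i in seq_along(bb)) {
  # Use the list name as the plot title
  plot_list[[i]] <- ggcoefstats(bb[[i]], title = names(bb)[i])
}

# Write one plot per page, as in the question
pdf("plots1.pdf", height = 10, width = 6)
for (p in plot_list) print(p)
dev.off()
```

Because ggcoefstats returns a ggplot object, adding ggplot2::ggtitle(names(bb)[i]) to the plot should have a similar effect.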

Related

How do I use the group argument for the plot_summs() function from the jtools package?

I am plotting my coefficient estimates using the function plot_summs() and would like to divide my coefficients into two separate groups.
The function plot_summs() has an argument groups. However, when I try to use it as explained in the documentation, I get neither a result nor an error. Can someone give me an example of how to use this argument, please?
This is the code I currently have:
plot_summs(model.c, scale = TRUE,
           groups = list(pane_1 = c("AQI_average", "temp_yearly"),
                         pane_2 = c("rain_1h_yearly", "snow_1h_yearly")),
           coefs = c("AQI Average" = "AQI_average",
                     "Temperature (in Farenheit)" = "temp_yearly",
                     "Rain volume in mm" = "rain_1h_yearly",
                     "Snow volume in mm" = "snow_1h_yearly"))
The image below is what I get as a result. What I would like is two separate panes: one that includes "AQI_average" and "temp_yearly", and another that has "rain_1h_yearly" and "snow_1h_yearly". Even though I use the groups argument, I do not get this.
Output of my code
By minimal reproducible example, markus is referring to a piece of code that enables others to reproduce exactly the issue you are referring to on their own computers, as described in the link that they provided.
To me, it seems the problem is that the groups argument does not work in plot_summs; it seems someone here also pointed it out.
If plot_summs is replaced by plot_coefs, the groups argument works for me. However, the scale argument does not seem to be available. A workaround might be:
library(jtools)
library(ggplot2) # for theme_linedraw()
r <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = iris)
y <- plot_summs(r, scale = TRUE) # Plot for the scaled version
t <- plot_coefs(r, # Plot for the unscaled version, but with faceting
                groups = list(
                  pane_1 = c("Sepal.Width", "Petal.Length"),
                  pane_2 = c("Petal.Width"))) + theme_linedraw()
y$data$group <- t$data$group # Add the faceting column to the scaled plot's data
t$data <- y$data # Replace the data with the scaled version
t
I hope this is what you meant!

Is there a ggplot2 analogue to the avPlots function in R?

When undertaking regression modelling it is useful to produce added variable plots for the explanatory variables in the model, to check whether the posited relationships to the response variable are appropriate to the data. The avPlots function in the car package in R takes a model input, and produces a grid of added-variable plots using the base graphics system. This function is extremely user-friendly, insofar as all you need to do is put in the model object as an argument, and it automatically produces all the added variable plots for each explanatory variable. This matrix of plots contains all the desired information, but unfortunately the plots look poor, owing to the fact that it uses the base graphics system rather than the ggplot2 package. For example, using data found here (downloaded as the file Trucking.csv) here is the output of the avPlots function.
#Load required libraries
library(car);
#Import data, fit model, and show AV plots
DATA <- read.csv('Trucking.csv');
MODEL <- lm(log(PRICPTM) ~ DISTANCE + PCTLOAD + ORIGIN + MARKET + DEREG + PRODUCT,
data = DATA);
avPlots(MODEL);
Question: Is there an equivalent function in ggplot2 that produces a matrix of each of the added-variable plots for a model, but with "prettier" plots? Is it possible to produce these plots, but then customise them using standard ggplot syntax?
I am not aware of any automated function that produces the added variable plots using ggplot. However, as well as giving a plot output as a side-effect of the function call, the avPlots function produces an object that is a list containing the data values used in each of the added variable plots. It is relatively simple to extract data frames of these variables and use these to generate added variable plots using ggplot. This can be done for a general model object using the following functions.
library(ggplot2)
# Run avPlots() against a throwaway png device so that only its returned data are kept
avPlots.invis <- function(MODEL, ...) {
  ff <- tempfile()
  png(filename = ff)
  OUT <- car::avPlots(MODEL, ...)
  dev.off()
  unlink(ff)
  OUT
}
ggAVPLOTS <- function(MODEL, YLAB = NULL) {
  # Extract the information for AV plots
  AVPLOTS <- avPlots.invis(MODEL)
  K <- length(AVPLOTS)
  # Create the added variable plots using ggplot
  GGPLOTS <- vector('list', K)
  for (i in seq_len(K)) {
    DATA <- data.frame(AVPLOTS[[i]])
    GGPLOTS[[i]] <- ggplot(aes_string(x = colnames(DATA)[1],
                                      y = colnames(DATA)[2]),
                           data = DATA) +
      geom_point(colour = 'blue') +
      geom_smooth(method = 'lm', se = FALSE,
                  color = 'red', formula = y ~ x, linetype = 'dashed') +
      xlab(paste0('Predictor Residual \n (',
                  names(DATA)[1], ' | others)')) +
      ylab(paste0('Response Residual \n (',
                  ifelse(is.null(YLAB),
                         paste0(names(DATA)[2], ' | others'), YLAB), ')'))
  }
  # Return output object
  GGPLOTS
}
The function ggAVPLOTS will take an input model and produce a list of ggplot objects for each of the added variable plots. These have been constructed to give "pretty" plots with blue points and a dashed red regression line through each plot. If you want all the added variable plots to show up in a single plot, it is relatively simple to do this using the grid.arrange function in the gridExtra package. Below we apply this to your model and show the resulting plot.
#Produce matrix of added variable plots
library(gridExtra)
PLOTS <- ggAVPLOTS(MODEL)
K <- length(PLOTS)
NCOL <- ceiling(sqrt(K))
AVPLOTS <- do.call("arrangeGrob", c(PLOTS, ncol = NCOL, top = 'Added Variable Plots'))
ggsave('AV Plots - Trucking.jpg', plot = AVPLOTS, width = 10, height = 10)
It is possible to make whatever alterations you want to these plots in the ggplot code above, so if a user prefers to change the colours, font sizes, etc., this is done using standard syntax in ggplot. This method works by importing the data for the added variable plots from the avPlots function, but once you have done that, you can use this data to produce any kind of plot.
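To illustrate that the plotted data are just two sets of residuals, here is a minimal sketch that builds one added-variable plot by hand instead of going through avPlots. It uses the built-in mtcars data and a hypothetical model (mpg ~ wt + hp) as stand-ins for the Trucking data, and it relies on the standard fact that the slope in an added-variable plot equals the coefficient of that predictor in the full model.

```r
library(ggplot2)

# Hypothetical full model on built-in data
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Added-variable data for `wt`: residuals of the response and of `wt`,
# each regressed on the remaining predictor
av <- data.frame(
  x = resid(lm(wt ~ hp, data = mtcars)),
  y = resid(lm(mpg ~ hp, data = mtcars))
)

# Any ggplot customisation can be applied to this data frame
p <- ggplot(av, aes(x = x, y = y)) +
  geom_point(colour = "blue") +
  geom_smooth(method = "lm", se = FALSE, formula = y ~ x,
              colour = "red", linetype = "dashed") +
  labs(x = "wt | others", y = "mpg | others",
       title = "Added-variable plot for wt") +
  theme_minimal()

# Slope of the residual-on-residual regression
av_slope <- unname(coef(lm(y ~ x, data = av))["x"])
```

Comparing av_slope with coef(fit)["wt"] is a quick sanity check that the residuals were computed against the right set of predictors.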

Save plots as R objects and displaying in grid

In the following reproducible example I try to create a function that builds a ggplot distribution plot and saves it as an R object, with the intention of displaying two plots in a grid.
ggplothist<- function(dat,var1)
{
if (is.character(var1)) {
var1 <- which(names(dat) == var1)
}
distribution <- ggplot(data=dat, aes(dat[,var1]))
distribution <- distribution + geom_histogram(aes(y=..density..),binwidth=0.1,colour="black", fill="white")
output<-list(distribution,var1,dat)
return(output)
}
Call to function:
set.seed(100)
df <- data.frame(x = rnorm(100, mean=10),y =rep(1,100))
output1 <- ggplothist(dat=df,var1='x')
output1[1]
All fine until now.
Then I want to make a second plot (note mean=100 instead of the previous 10):
df2 <- data.frame(x = rep(1,1000),y = rnorm(1000, mean=100))
output2 <- ggplothist(dat=df2,var1='y')
output2[1]
Then I try to replot the first distribution, with mean 10.
output1[1]
I get the same distribution as in the previous plot (the one with mean=100)?
If, however, I take the information returned by the function and reset it as global variables, it works.
var1=as.numeric(output1[2]);dat=as.data.frame(output1[3]);p1 <- output1[1]
p1
If anyone can explain why this happens, I would like to know. It seems that in order to draw the intended distribution I have to reset the data.frame and variable to what was used to draw the plot. Is there a way to save the plot as an object without having to do this? Luckily, I can replot the first distribution,
but I can't plot them both at the same time:
var1=as.numeric(output2[2]);dat=as.data.frame(output2[3]);p2 <- output2[1]
grid.arrange(p1,p2)
ERROR: Error in gList(list(list(data = list(x = c(9.66707664902549, 11.3631137069225, :
only 'grobs' allowed in "gList"
The answer to "Grid of multiple ggplot2 plots which have been made in a for loop" suggests using a list to hold the plots:
ggplothist<- function(dat,var1)
{
if (is.character(var1)) {
var1 <- which(names(dat) == var1)
}
distribution <- ggplot(data=dat, aes(dat[,var1]))
distribution <- distribution + geom_histogram(aes(y=..density..),binwidth=0.1,colour="black", fill="white")
plot(distribution)
pltlist <- list()
pltlist[["plot"]] <- distribution
output<-list(pltlist,var1,dat)
return(output)
}
output1 <- ggplothist(dat=df,var1='x')
p1<-output1[1]
output2 <- ggplothist(dat=df2,var1='y')
p2<-output2[1]
output1[1]
This produces the distribution with mean=100 again instead of mean=10,
and:
grid.arrange(p1,p2)
produces the same error:
Error in gList(list(list(plot = list(data = list(x = c(9.66707664902549, :
only 'grobs' allowed in "gList"
As a last attempt I try to use recordPlot() to record everything about the plot into an object. The following is now inside the function.
ggplothist<- function(dat,var1)
{
if (is.character(var1)) {
var1 <- which(names(dat) == var1)
}
distribution <- ggplot(data=dat, aes(dat[,var1]))
distribution <- distribution + geom_histogram(aes(y=..density..),binwidth=0.1,colour="black", fill="white")
plot(distribution)
distribution<-recordPlot()
output<-list(distribution,var1,dat)
return(output)
}
This function produces the same errors as before: redrawing depends on resetting the dat and var1 variables to what is needed for drawing the distribution, and the result similarly can't be put inside a grid.
I've tried similar things, like arrangeGrob() from the question "R saving multiple ggplot2 plots as R-object in list and re-displaying in grid", but with no luck.
I would really like a solution that creates an R object containing the plot, one that can be redrawn by itself and used inside a grid without having to reset the variables used to draw the plot each time. I would also like to understand why this is happening, as I don't consider it intuitive at all.
The only solution I can think of is to draw the plot to a png file saved somewhere, and have the function return the path so that it can be reused. Is that what other people are doing?
Thanks for reading, and sorry for the long question.
I found a solution in the question
"How can I reference the local environment within a function, in R?"
Inserting
localenv <- environment()
and referencing it in the ggplot call
distribution <- ggplot(data=dat, aes(dat[,var1]),environment = localenv)
made it all work, even with grid.arrange!
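For readers on a recent ggplot2, here is a hedged alternative sketch rather than the fix found above. It maps the column by name with the .data pronoun, so the plot does not depend on global variables, and it returns the plot itself instead of a list. Note that output1[1] is a one-element list while output1[[1]] is the plot, which is a likely reason for the "only 'grobs' allowed" error. The sketch assumes ggplot2 3.3.0 or later (for after_stat) and that gridExtra is installed.

```r
library(ggplot2)
library(gridExtra)

ggplothist <- function(dat, var1) {
  # Accept a column index as well as a column name
  if (is.numeric(var1)) var1 <- names(dat)[var1]
  # .data[[var1]] looks the column up in the plot's own data
  ggplot(dat, aes(x = .data[[var1]])) +
    geom_histogram(aes(y = after_stat(density)), binwidth = 0.1,
                   colour = "black", fill = "white")
}

set.seed(100)
df  <- data.frame(x = rnorm(100, mean = 10), y = rep(1, 100))
df2 <- data.frame(x = rep(1, 1000), y = rnorm(1000, mean = 100))

p1 <- ggplothist(df, "x")   # distribution centred near 10
p2 <- ggplothist(df2, "y")  # distribution centred near 100

# Both objects can be redrawn on their own or combined in a grid
grid.arrange(p1, p2)
```

Each plot object carries its own data, so printing p1 after creating p2 still draws the first distribution.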

How to control plot layout for lmerTest output results?

I am using lme4 and lmerTest to run a mixed model and then use backward variable elimination (step) for my model. This seems to work well. After running the 'step' function in lmerTest, I plot the final model. The 'plot' results appear similar to ggplot2 output.
I would like to change the layout of the plot. The obvious answer is to do it manually by creating the plot(s) myself with ggplot2. If possible, I would like to simply change the layout of the output so that each plot (i.e. each plotted dependent variable in the final model) is in its own row.
See the code and plot below for my results. Note that the plot has three columns, and I would like three rows. I have not provided sample data (let me know if I need to!).
library(lme4)
library(lmerTest)
# Full model
Female.Survival.model.1 <- lmer(Survival.Female ~ Location + Substrate + Location:Substrate + (1|Replicate), data = Transplant.Survival, REML = TRUE)
# lmerTest - backward stepwise elimination of dependent variables
Female.Survival.model.ST <- step(Female.Survival.model.1, reduce.fixed = TRUE, reduce.random = FALSE, ddf = "Kenward-Roger" )
Female.Survival.model.ST
plot(Female.Survival.model.ST)
The function that creates these plots is called plotLSMEANS. You can look at the code for the function via lmerTest:::plotLSMEANS. The reason to look at the code is 1) to verify that, indeed, the plots are based on ggplot2 code and 2) to see if you can figure out what needs to be changed to get what you want.
In this case, it sounds like you'd want facet_wrap to have one column instead of three. I tested with the example from the help page of the lmerTest function step, and it looks like you can simply add a new facet_wrap layer to the plot.
library(ggplot2)
plot(Female.Survival.model.ST) +
facet_wrap(~namesforplots, scales = "free", ncol = 1)
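The same idea can be tried on any ggplot object: adding a second facet_wrap replaces the faceting specification the plot already has. The sketch below uses the built-in mtcars data as a hypothetical stand-in for the lmerTest plot.

```r
library(ggplot2)

# A plot that already has a three-column facet layout
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  facet_wrap(~cyl, scales = "free", ncol = 3)

# Adding a new facet_wrap layer overrides the existing one
p_rows <- p + facet_wrap(~cyl, scales = "free", ncol = 1)

# Inspect the computed panel layout: one column, three rows
layout_rows <- ggplot_build(p_rows)$layout$layout
```

This only works when the object returned by plot() is a ggplot, which is why it is worth checking the plotting function's source first.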
Try this: plot(difflsmeans(Female.Survival.model.ST$model, test.effs = "Location "))

Looping over attributes vector to produce combined graphs

Here is some code that tries to compute the marginal effects of each of the predictors in a model (using the effects package) and then plot the results. To do this, I am looping over the "term.labels" attribute of the glm terms object.
library(DAAG)
library(effects)
formula = pres.abs ~ altitude + distance + NoOfPools + NoOfSites + avrain + meanmin + meanmax
summary(logitFrogs <- glm(formula = formula, data = frogs, family = binomial(link = "logit")))
par(mfrow = c(4, 2))
for (predictorName in attr(logitFrogs$terms, "term.labels")) {
print(predictorName)
effLogitFrogs <- effect(predictorName, logitFrogs)
plot(effLogitFrogs)
}
This produces no picture at all. On the other hand, explicitly stating the predictor names does work:
effLogitFrogs <- effect("distance", logitFrogs)
plot(effLogitFrogs)
What am I doing wrong?
Although you call plot(), it actually dispatches to plot.eff(), which produces a lattice plot, so the par() settings are ignored. One solution is to use allEffects() and then plot(), which calls plot.efflist(). With this function you do not need a for loop, because all the plots are made automatically.
effLogitFrogs <- allEffects(logitFrogs)
plot(effLogitFrogs)
EDIT - solution with a for loop
There is an "ugly" solution that works with a for() loop. It also needs the grid package. First, store the number of rows and columns as variables (for now this works only with 1 or 2 columns). Then grid.newpage() and pushViewport() set up the graphical window.
The predictor names are stored in a vector outside the loop. Using pushViewport() and popViewport(), all the plots are placed in the same graphical window.
library(lattice)
library(grid)
n.col=2
n.row= 4
grid.newpage()
pushViewport(viewport(layout = grid.layout(n.row,n.col)))
predictorName <- attr(logitFrogs$terms, "term.labels")
for (i in 1:length(predictorName)) {
print(predictorName[i])
effLogitFrogs <- effect(predictorName[i], logitFrogs)
pushViewport(viewport(layout.pos.col=ceiling(i/n.row), layout.pos.row=ifelse(i-n.row<=0,i,i-n.row)))
p<-plot(effLogitFrogs)
print(p,newpage=FALSE)
popViewport(1)
}
Adding print to your loop resolves the problem.
print(plot(effLogitFrogs))
plot calls plot.eff, which creates the plot without printing it.
allEffects generates an object of class efflist. When we plot this object, plot.efflist is called, which prints the plots itself, so there is no need to call print as with plot.eff.
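The behaviour described above is general to lattice: a plot object created inside a loop is only drawn when it is printed explicitly. As a small sketch using only the lattice package and the built-in mtcars data (both stand-ins for the effects example), the split and more arguments of print can also place several plots on one page without viewports.

```r
library(lattice)

vars <- c("wt", "hp", "disp", "qsec")
plots <- vector("list", length(vars))

for (i in seq_along(vars)) {
  # Build the formula mpg ~ <variable>
  f <- reformulate(vars[i], response = "mpg")
  plots[[i]] <- xyplot(f, data = mtcars, main = vars[i])
  # Inside a loop the plot must be printed explicitly.
  # split = c(column, row, number of columns, number of rows)
  print(plots[[i]],
        split = c((i - 1) %% 2 + 1, (i - 1) %/% 2 + 1, 2, 2),
        more = i < length(vars))
}
```

Setting more = TRUE for all but the last plot keeps the page open so that later plots are added to it instead of starting a new page.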
