Code below plots random effects from a mixed effects model:
mtcarsSub <- mtcars[,c("wt", "drat", "cyl")]
library(lme4)
mtcarsME <- lmer(drat ~ (1|cyl) + wt, data=mtcarsSub)
mtcarsSub$fixed.effect <- predict(mtcarsME, re.form = NA)  # fixed effects only; the default also adds the random intercepts
library(plyr)
l_ply(list(4, 6, 8), function(x) mtcarsSub[[ paste0("random.effect.cyl", x) ]] <<- mtcarsSub$fixed.effect + ranef(mtcarsME)$cyl[as.character(x),])
library(ggplot2)
ggplot(mtcarsSub, aes(wt, drat, color=factor(cyl))) +
geom_point() +
geom_line(aes(wt, fixed.effect), color="black", size=2) +
geom_line(aes(wt, random.effect.cyl4), size=2) +
geom_line(aes(wt, random.effect.cyl6), size=2) +
geom_line(aes(wt, random.effect.cyl8), size=2)
How can I programmatically make each random effect line the same colour as the colour displayed for the corresponding level of cyl? That is, the random effect line for level 4 of cyl should be red, level 6 green, and level 8 blue. I don't want to specify color="red" etc. in geom_line().
I would suggest making a new data frame for the random effects. For this I use ldply() together with the function you wrote to calculate the random effects for each level. The new data frame also gets the columns wt and cyl: wt contains all wt values from the mtcarsSub data frame, repeated for each level, and cyl contains the values 4, 6 and 8.
mt.rand<-ldply(list(4,6,8), function(x) data.frame(
wt=mtcarsSub$wt,
cyl=x,
rand=mtcarsSub$fixed.effect + ranef(mtcarsME)$cyl[as.character(x),]))
For the plot, use the new data frame in a single geom_line() call. Because the new data frame also has a cyl column, its lines are assigned the same colours as the points.
ggplot(mtcarsSub, aes(wt, drat, color=factor(cyl))) +
geom_point() +
geom_line(aes(wt, fixed.effect), color="black", size=2)+
geom_line(data=mt.rand,aes(wt,rand),size=2)
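An alternative sketch of the same idea: instead of adding ranef() to a prediction by hand, predict() for lme4 models can return either kind of line directly. With the default re.form = NULL the random intercepts are included, and with re.form = NA only the fixed effects are used. The data frame name grid and its columns rand and fixed are made up for this illustration.

```r
library(lme4)
library(ggplot2)

mtcarsSub <- mtcars[, c("wt", "drat", "cyl")]
mtcarsME <- lmer(drat ~ (1 | cyl) + wt, data = mtcarsSub)

# One row per combination of observed wt and cyl level
grid <- expand.grid(wt = sort(unique(mtcarsSub$wt)), cyl = c(4, 6, 8))

# Default re.form = NULL: fixed effects plus the random intercept of each cyl
grid$rand <- predict(mtcarsME, newdata = grid)
# re.form = NA: fixed effects only (population-level line)
grid$fixed <- predict(mtcarsME, newdata = grid, re.form = NA)

p <- ggplot(mtcarsSub, aes(wt, drat, color = factor(cyl))) +
  geom_point() +
  geom_line(data = grid, aes(wt, fixed), color = "black") +
  geom_line(data = grid, aes(wt, rand))  # coloured by cyl, like the points
p
```

Because grid carries its own cyl column, the colour mapping inherited from ggplot() applies to the random effect lines without naming any colour explicitly.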
I am working with the R programming language. I made the following graph using the built-in "mtcars" dataset:
library(ggplot2)
a = ggplot(data=mtcars, aes(x=wt, y=mpg)) + geom_point() + ggtitle("mtcars: wt vs mpg")
Question: Now, I am trying to customize the title, but I want the title to "contain a variable reference", for example:
b = ggplot(data=mtcars, aes(x=wt, y=mpg)) + geom_point() + ggtitle("mtcars: wt vs mpg - average mpg = mean(mtcars$mpg)")
But this command literally just prints the code I wrote:
I know that the "mean" command runs by itself:
mean(mtcars$mpg)
[1] 20.09062
But can someone please help me change this code:
ggplot(data=mtcars, aes(x=wt, y=mpg)) + geom_point() + ggtitle("mtcars: wt vs mpg - average mpg = mean(mtcars$mpg)")
So that it produces something like this (note: here, I manually wrote the "mean" by hand):
Thanks
You can do this using paste0().
ggplot(data=mtcars, aes(x=wt, y=mpg)) + geom_point() + ggtitle(paste0("mtcars: wt vs mpg - average mpg = ", mean(mtcars$mpg)))
If you need to, you can add more text and variables, separated by commas, like so:
ggtitle(paste0('text ', var1, 'text2 etc ', var2, var3, 'text3'))
Note that paste0 is a variant of paste that concatenates things with no space or separator in between.
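As a small extension, the unrounded mean shows up in the title with many decimals. A minimal sketch, assuming two decimals are wanted, formats the number with sprintf() before it goes into the title (the variable name title_text is made up for this example):

```r
library(ggplot2)

# Format the mean with two decimals while building the title string
title_text <- sprintf("mtcars: wt vs mpg - average mpg = %.2f", mean(mtcars$mpg))

p <- ggplot(data = mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  ggtitle(title_text)
p
```

round(mean(mtcars$mpg), 2) inside paste0() would work as well; sprintf() just keeps the number of decimals fixed.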
An option with glue
library(ggplot2)
library(glue)
ggplot(mtcars, aes(x = wt, y = mpg)) +
geom_point() +
ggtitle(glue("mtcars: wt vs mpg - average mpg = {mean(mtcars$mpg)}"))
We could also use str_c from the stringr package, which is part of the tidyverse. For this purpose str_c behaves like paste0:
library(tidyverse)
ggplot(data=mtcars, aes(x=wt, y=mpg)) + geom_point() + ggtitle(str_c("mtcars: wt vs mpg - average mpg = ", mean(mtcars$mpg)))
I am using lm() to analyse data and simplify models by backwards selection. It is really easy to plot the outcome in base R via predict(), but I struggle to do so in ggplot2.
To be more specific, my model has a metric dependent variable, with a metric variable and a factor (22 levels) as explanatory variables, plus their interaction. During simplification I find that the interaction is non-significant, so it can be dropped from the minimum adequate model. For simplicity, I demonstrate the issue using the mtcars dataset with cyl converted to a factor:
mtcars$cyl <- as.factor(mtcars$cyl)
model <- lm(mpg ~ cyl + wt, data = mtcars)
summary(model)
So my simplified model lacks the interaction. I now want to plot the data in ggplot2:
cyl <- ggplot(mtcars, aes(x = wt, y = mpg, col = cyl, fill = cyl, group = cyl))
cyl + geom_point(size = 3, alpha = 0.3) + stat_smooth(method = "lm", se = F)
If I plot the data using stat_smooth, I get a separate regression line for each factor level, differing in both intercept and slope, which is not what the simplified model suggests. How do I plot the simplified model in ggplot2? Thank you very much in advance!
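One way the simplified model could be drawn, sketched here as an assumption rather than a definitive answer: compute the fitted values with predict() and plot them with geom_line() instead of letting stat_smooth() refit a separate model per group. The copy dat and the column fit are names made up for this example.

```r
library(ggplot2)

dat <- mtcars
dat$cyl <- as.factor(dat$cyl)

# Additive model: one common slope for wt, one intercept per cyl level
model <- lm(mpg ~ cyl + wt, data = dat)

# Fitted values from the model itself, so the lines are parallel
dat$fit <- predict(model)

p <- ggplot(dat, aes(x = wt, y = mpg, col = cyl, group = cyl)) +
  geom_point(size = 3, alpha = 0.3) +
  geom_line(aes(y = fit))
p
```

Since the lines come from the fitted model, they share the slope of wt and differ only in intercept.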
Consider the simple example:
library(ggplot2)
head(mtcars)
# create the plot
ggplot(mtcars, aes(factor(cyl))) + geom_bar() + theme_bw() +
theme(strip.text.x = element_text(size = 20, face="bold"))+
xlab("number of cyl") + ylab("Count")
Now we can obtain the average mpg per cyl with:
aggregate(mpg ~ cyl, data = mtcars, FUN=mean)
How can I put these average values on the x-axis so that they appear below the corresponding cyl? Can one draw a table and somehow indicate that this is the average mpg per cyl?
Here is a simple way to do it by rewriting the factor level names:
(Note that this is safe only as long as aggregate generates its table in the same order as the factor level names and without any gaps. That seems like it should be the case, but one would have to investigate to make sure. It might be safer to code it as a loop and check the level names to make sure they match up correctly.)
library(ggplot2)
head(mtcars)
adf <- aggregate(mpg ~ cyl, data = mtcars, FUN=mean)
mtcars$fcyl <- factor(mtcars$cyl)
levels(mtcars$fcyl) <- sprintf("%d \n %.1f",adf$cyl,adf$mpg)
# create the plot
ggplot(mtcars, aes(fcyl)) + geom_bar() + theme_bw() +
theme(strip.text.x = element_text(size = 20, face="bold"))+
xlab("number of cyl") + ylab("Count")
yielding:
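A sketch of the safer variant hinted at in the note above: build a named vector of labels and let the names do the matching, so the result does not depend on the row order of the aggregate() output. The names lab and dat are made up for this example, and passing the vector to scale_x_discrete(labels = ...) relies on ggplot2 matching named labels to the axis breaks.

```r
library(ggplot2)

dat <- mtcars
adf <- aggregate(mpg ~ cyl, data = dat, FUN = mean)

# Labels named by cyl value, e.g. "4" -> "4\n26.7"
lab <- setNames(sprintf("%s\n%.1f", adf$cyl, adf$mpg), adf$cyl)

p <- ggplot(dat, aes(factor(cyl))) +
  geom_bar() +
  theme_bw() +
  scale_x_discrete(labels = lab) +  # matched to the breaks by name, not position
  xlab("number of cyl\naverage mpg") +
  ylab("Count")
p
```

The factor levels themselves stay untouched here; only the axis labels change.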
I am a novice at R and have a ggplot related question. Below is a dummy data frame with one column containing the predictor (xvar) and multiple columns of dichotomous outcomes (yvar1, yvar2, yvar3).
df <- data.frame("xvar"=c(0,100,200,300,400,500,600,1000),"yvar1"= c(0,0,0,0,0,0,1,1),"yvar2"=c(0,0,1,1,1,1,1,1),"yvar3"=c(0,0,1,1,0,1,1,1))
I have created a for loop to run a logistic regression for each yvar against the predictor xvar. I am able to successfully plot the regression for each yvar. Please ignore the regression warnings (this is a dummy dataset)
for (i in 2:4) {
logr.yvar <- glm(df[,names(df[i])] ~ xvar, data=df, family=binomial(link="logit"))
print(logr.yvar)
plot(df$xvar, df[,i])
curve(predict(logr.yvar, data.frame(xvar=x), type="response"), add=TRUE)
}
Instead of using the base plot function, I would like to switch to ggplot2. I am currently able to generate ggplots for individual regressions:
ggplot(df, aes(x=xvar, y=yvar1)) + geom_point() +
stat_smooth(method="glm", method.args=list(family="binomial"), se=TRUE)
How can I set up looping using ggplot2?
If you really want to loop, you could use lapply.
p <- lapply(names(df)[-1], function(nm){
ggplot(df, aes_string(x="xvar", y=nm)) + geom_point() +
# current ggplot2 versions take model arguments via method.args
stat_smooth(method="glm", method.args=list(family="binomial"), se=TRUE)
})
print(p)
However, I suspect that reshaping your data and displaying all the graphs together may be better.
# reshaping data
library(reshape2)
df.melt <- melt(df, id.vars='xvar')
# first variation, using facets
ggplot(df.melt, aes(xvar, value)) +
geom_point() +
stat_smooth(method="glm", method.args=list(family="binomial"), se=TRUE) +
facet_grid(variable~.)
# second variation using colors
ggplot(df.melt, aes(xvar, value)) +
geom_point() +
stat_smooth(aes(color = variable, fill = variable),
method="glm", method.args=list(family="binomial"), se=TRUE, size = 1.2)
How do I show 2 regression lines on the same plot?
Here are both models:
data(mtcars)
a <- lm(mpg~wt+hp, data = mtcars)
b <- lm(mpg~wt+hp+wt*hp, data = mtcars)
I plot wt on the x-axis, mpg on the y-axis and hp as the colour.
Here it is in base R:
cr <- colorRamp(c("yellow", "red"))
with(mtcars, {
plot(wt, mpg, col = rgb(cr(hp / max(hp)), max=255),
xlab="Weight", ylab="Miles per Gallon", pch=20)
})
Also, please show how to accomplish this in ggplot2.
Here's the plot:
library(ggplot2)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point(aes(col = hp))
p + scale_colour_gradientn(colours=c("green","black"))
Thanks in advance!
The documentation for geom_smooth practically tells you how to do this.
One can use the regression models to predict new values for y and then plot these on the same graph using geom_smooth().
Below is code for ggplot2 that produces what I think you want. The two lines overlap so much that it looks like only one line is plotted, so I've set one line to a dot-dash pattern (linetype = 4) to show that both are there.
I don't know how to achieve this in base R though.
data(mtcars)
library(ggplot2)
a <- lm(mpg~wt+hp, data = mtcars)
b <- lm(mpg~wt+hp+wt*hp, data = mtcars)
mtcars$pred.a <- predict(a)
mtcars$pred.b <- predict(b)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point(aes(col = hp)) +
scale_colour_gradientn(colours=c("green","black")) +
geom_smooth(aes(x = wt, y = pred.a), method = "lm", colour = "black", fill = NA) +
geom_smooth(aes(x = wt, y = pred.b), method = "lm", colour = "red", fill = NA, linetype = 4)
p
A base R solution:
a <- lm(mpg~wt+hp, data=mtcars)
b <- lm(mpg~wt+hp+wt*hp, data=mtcars)
wt <- mtcars[, "wt"]
idx <- sort(wt, index.return=TRUE)$ix
plot(mpg~wt, data=mtcars)
lines(wt[idx], predict(a)[idx], col="red")
lines(wt[idx], predict(b)[idx], col="blue")
However, it is not the best visualisation conceivable.
You are asking how to add a regression line, but your regression models produce a regression plane and a regression surface, both higher dimensional than a line. You can find a regression line by conditioning on a chosen value of hp, or show multiple lines for different values of hp.
Using base graphics, you can use the `Predict.Plot` function in the TeachingDemos package to add prediction lines/curves to a plot for a fitted model (or two). The interactive `TkPredict` function in the same package lets you interact with the plot to choose conditioning values, and then produces the call to `Predict.Plot` that creates the current line. You can then combine the generated commands to include the lines on the same plot.
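The idea of conditioning on chosen values of hp can also be sketched in ggplot2 with predict() and a small grid of new data. This is only an illustration under assumptions: the quartiles of hp are an arbitrary choice of conditioning values, and the names hp_vals and grid are made up for this example.

```r
library(ggplot2)

# Interaction model from the question
b <- lm(mpg ~ wt * hp, data = mtcars)

# Conditioning values for hp (arbitrary choice: the three quartiles)
hp_vals <- unname(quantile(mtcars$hp, c(0.25, 0.5, 0.75)))

# Grid of wt values crossed with the chosen hp values
grid <- expand.grid(
  wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 50),
  hp = hp_vals
)
grid$mpg <- predict(b, newdata = grid)

# One regression line per conditioning value of hp
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(colour = hp)) +
  geom_line(data = grid, aes(group = hp, colour = hp))
p
```

Each line is a slice through the fitted surface at a fixed hp, so the lines fan out when the wt:hp interaction matters and stay parallel for the additive model.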