ggplot2: How to add linebreak to horizontal legend - r

Please consider the following R script (taken and slightly modified from here):
require(ggplot2)
x <- 1:10
y <- jitter(x^2)
DF <- data.frame(x, y)
p <- ggplot(DF, aes(x = x, y = y)) + geom_point() +
stat_smooth(method = 'lm', aes(colour = 'linear')) +
stat_smooth(method = 'lm', formula = y ~ poly(x,2),
aes(colour = 'polynomial')) +
stat_smooth(method = 'nls', formula = y ~ a * log(x) +b,
aes(colour = 'logarithmic')) +
stat_smooth(method = 'nls', formula = y ~ a*exp(b *x),
aes(colour = 'Exponential')) +
theme(legend.position = "top")
p <- p + guides(guide_legend(ncol=2,nrow=2,byrow=TRUE))
p
The legend is displayed at the top of the plot. I want to break this legend into two lines, with two keys in each line. Is this possible?
Please note that, as you may see, I already tried
p+guides(guide_legend(ncol=2,nrow=2,byrow=TRUE))
as suggested here and here, but it did not work for me. This suggestion basically displays the data and the legends of the linear and polynomial models and completely hides the logarithmic and exponential models.

As explained by eipi10,
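You need to specify which legend to modify, in this case the colour legend: guides(colour=guide_legend(ncol=2,nrow=2,byrow=TRUE)). An unnamed guide_legend() passed to guides() is not attached to any aesthetic.
To clarify, the colour aesthetic defines the colour of each line, so that is the legend to target. If the fill aesthetic were used instead, the call would be guides(fill=guide_legend(ncol=2,nrow=2,byrow=TRUE)).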
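For illustration, here is a minimal sketch of the named-guide fix on a simplified version of the plot. It is an assumption-laden stand-in rather than the asker's exact code: all four curves use method = "lm" so that the example runs without nls starting values, and the "exponential" curve uses a fixed rate (exp(x / 4)) chosen only for this demo.

```r
library(ggplot2)

set.seed(42)
x <- 1:10
DF <- data.frame(x = x, y = jitter(x^2))

p <- ggplot(DF, aes(x = x, y = y)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE,
              aes(colour = "linear")) +
  stat_smooth(method = "lm", formula = y ~ poly(x, 2), se = FALSE,
              aes(colour = "polynomial")) +
  # y = a * log(x) + b is linear in its parameters, so lm can fit it
  stat_smooth(method = "lm", formula = y ~ log(x), se = FALSE,
              aes(colour = "logarithmic")) +
  # stand-in for the nls exponential fit: the rate 1/4 is an arbitrary assumption
  stat_smooth(method = "lm", formula = y ~ exp(x / 4), se = FALSE,
              aes(colour = "exponential")) +
  theme(legend.position = "top") +
  # name the aesthetic so the guide applies to the colour legend
  guides(colour = guide_legend(nrow = 2, byrow = TRUE))

p
```

With four keys and nrow = 2, the legend should wrap into two rows of two keys each.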

Related

R - Adding legend to ggplot graph for regression lines

I do a Multiple Linear Regression in R, where I want to add a simple legend to a graph (ggplot). The legend should show the points and fitted lines with their corresponding colors. So far it works fine (without legend):
ggplot() +
geom_point(aes(x = training_set$R.D.Spend, y = training_set$Profit),
col = 'red') +
geom_line(aes(x = training_set$R.D.Spend, y = predict(regressor, newdata = training_set)),
col = 'blue') +
geom_line(aes(x = training_set$R.D.Spend, y = predict(regressor_sig, newdata = training_set)),
col = 'green') +
ggtitle('Multiple Linear Regression (Training set)') +
xlab('R.D.Spend [k$]') +
ylab('Profit of Venture [k$]')
How can I add a legend here most easily?
I tried the solutions from similar question, but did not succeed (add legend to ggplot2 | Add legend for multiple regression lines from different datasets to ggplot)
So, I appended my original model like this:
ggplot() +
geom_point(aes(x = training_set$R.D.Spend, y = training_set$Profit),
col = 'p1') +
geom_line(aes(x = training_set$R.D.Spend, y = predict(regressor, newdata = training_set)),
col = 'p2') +
geom_line(aes(x = training_set$R.D.Spend, y = predict(regressor_sig, newdata = training_set)),
col = 'p3') +
scale_color_manual(
name='My lines',
values=c('blue', 'orangered', 'green')) +
ggtitle('Multiple Linear Regression (Training set)') +
xlab('R.D.Spend [k$]') +
ylab('Profit of Venture [k$]')
But this gives the error "Unknown colour name: p1", which makes some sense, as I do not define p1 anywhere. How can I make ggplot recognise my intended legend?
Move col inside aes() so that it is mapped to a legend entry instead of being read as a literal colour name; you can then set the colours using scale_color_manual:
library(ggplot2)
set.seed(1)
x <- 1:30
y <- rnorm(30) + x
fit <- lm(y ~ x)
ggplot2::ggplot(data.frame(x, y)) +
geom_point(aes(x = x, y = y)) +
geom_line(aes(x = x, y = predict(fit), col = "Regression")) +
scale_color_manual(name = "My Lines",
values = c("blue"))
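The same idea can be extended to the three layers in the question. The following is a sketch on made-up data (the question's training_set and regressor objects are not available, so d, fit_lin and fit_quad are hypothetical stand-ins). The keys p1, p2 and p3 are mapped inside aes() and then given colours by name.

```r
library(ggplot2)

set.seed(1)
d <- data.frame(x = 1:30)
d$y <- d$x + rnorm(30)

# two hypothetical models standing in for regressor and regressor_sig
fit_lin  <- lm(y ~ x, data = d)
fit_quad <- lm(y ~ poly(x, 2), data = d)
d$pred_lin  <- unname(predict(fit_lin))
d$pred_quad <- unname(predict(fit_quad))

p <- ggplot(d, aes(x = x)) +
  geom_point(aes(y = y, colour = "p1")) +
  geom_line(aes(y = pred_lin, colour = "p2")) +
  geom_line(aes(y = pred_quad, colour = "p3")) +
  # a named values vector ties each key to its colour explicitly
  scale_colour_manual(
    name   = "My lines",
    values = c(p1 = "blue", p2 = "orangered", p3 = "green"),
    breaks = c("p1", "p2", "p3"),
    labels = c("Data", "Linear fit", "Quadratic fit")
  )

p
```

Naming the entries of values avoids relying on the alphabetical order of the keys.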

How to plot raw data but use predicted values for line fit in ggplot2 R?

I have a data set (dat), with raw data (raw_x and raw_y). I have predicted a model and the predictions from the model are stored in dat$predict.
I wish to plot the raw data but overlay the data with a geom_smooth (here a quadratic function) but using the predicted data. This is my attempt at the basic code. I am not sure how to use predicted values in the geom_smooth yet.
ggplot(dat, aes(x = raw_x, y = raw_y, colours = "red")) +
geom_point() +
theme_bw() +
geom_smooth(method = "lm", formula = y ~ x + I(x^2))
The following plots the original points, the fitted curve from geom_smooth and the fitted points. I use made-up data since you have posted none.
set.seed(1234)
x <- cumsum(rnorm(100))
y <- x + x^2 + rnorm(100, sd = 50)
dat <- data.frame(raw_x = x, raw_y = y)
fit <- lm(y ~ x + I(x^2), dat)
dat$predict <- predict(fit)
ggplot(dat, aes(x = raw_x, y = raw_y)) +
geom_point(colour = "blue") +
theme_bw() +
geom_smooth(method = "lm", formula = y ~ x + I(x^2), colour = "red") +
geom_point(aes(y = predict), colour = "black")
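If the goal is to draw the line from the stored predictions themselves rather than have geom_smooth refit the model, one possible variation is to pass the predict column to geom_line. This is a sketch on the same kind of made-up data, with the model fitted on the raw_x and raw_y columns of dat.

```r
library(ggplot2)

set.seed(1234)
dat <- data.frame(raw_x = cumsum(rnorm(100)))
dat$raw_y <- dat$raw_x + dat$raw_x^2 + rnorm(100, sd = 50)

# quadratic model fitted on the data frame columns
fit <- lm(raw_y ~ raw_x + I(raw_x^2), data = dat)
dat$predict <- unname(predict(fit))

p <- ggplot(dat, aes(x = raw_x, y = raw_y)) +
  geom_point(colour = "blue") +
  # geom_line connects the stored predictions in order of x
  geom_line(aes(y = predict), colour = "red") +
  theme_bw()

p
```

Unlike geom_smooth, this draws no confidence band, and the curve is only as smooth as the spacing of the x values allows.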

Scatter plot with horizontal lines representing averages with R and ggplot

The below code produces a scatter plot with regression lines for each group. Instead of the sloped regression lines is it possible to plot horizontal lines that represent the average of each group's y values? I tried modifying the formula parameter to "y ~ 0 *x" but can't think of anything else that's obvious to use.
Thanks
ggplot(data = iris, aes(y = Sepal.Length, x = Sepal.Width, colour = Species)) + geom_point() +
geom_smooth(method = 'lm', formula = y ~ x , se = F)
We can specify the formula as y ~ 1.
library(ggplot2)
ggplot(data = iris, aes(y = Sepal.Length, x = Sepal.Width, colour = Species)) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ 1)
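The formula y ~ 1 fits an intercept-only model, and the least-squares intercept is the group mean. As a cross-check, here is a sketch of an alternative that computes the means explicitly and draws them with geom_hline (the means data frame is a helper introduced for this example). Note that these lines span the full panel width instead of each group's x range.

```r
library(ggplot2)

# per-species mean of Sepal.Length
means <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)

p <- ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length, colour = Species)) +
  geom_point() +
  geom_hline(data = means,
             aes(yintercept = Sepal.Length, colour = Species))

p

# the intercept-only lm gives the same value as the mean for a group
setosa <- subset(iris, Species == "setosa")
coef(lm(Sepal.Length ~ 1, data = setosa))
```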

Limiting the x-axis range of geom_line (defined by slope and intercept)

library(ggplot2)
##
df <- as.data.frame(matrix(rnorm(60*2, mean=3,sd=1), 60, 2))
colnames(df) <- c("A", "B")
cf1 <- coef(lm(B~A, data=df))
##
ggplot(df, aes(A,B)) +
geom_point() +
stat_smooth(method = "lm", color="red", fill="red", alpha=0.1, fullrange=TRUE) +
#xlim(0,6)+
geom_abline(intercept = cf1[1], slope = cf1[2], lty="dashed", col="green")
I want to limit geom_line to the same range as stat_smooth (which seems to be defined by xmax/xmin).
The xlim argument did not help (this was proposed here). In the real life application, the geom_line slope and intercept will be extracted from model updates, so they will be slightly different. Thank you.
I think this is one way to get what you are looking for:
min_x <- min(df$A)
# y at each endpoint is intercept + slope * x
min_y <- unname(cf1[1]) + unname(cf1[2]) * min_x
max_x <- max(df$A)
max_y <- unname(cf1[1]) + unname(cf1[2]) * max_x
##
p <- ggplot(df, aes(A,B)) +
geom_point() +
stat_smooth(
method = "lm", color = "red",
fill = "red", alpha = 0.1,
fullrange = TRUE)
##
p + geom_segment(
aes(x = min_x, y = min_y,
xend = max_x, yend = max_y),
linetype = "dashed",
color = "green")
This requires a little extra effort as you are calculating the endpoint coordinates by hand, rather than just passing the slope and intercept values to the function, but it does not seem like geom_abline allows you to set its domain.
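A variation on the same idea, sketched here with made-up data, is to put the two endpoints into a small data frame and draw them with geom_line. The seg data frame is a helper introduced for this example; the y values are computed as intercept + slope * x at both ends of the observed range.

```r
library(ggplot2)

set.seed(1)
df <- as.data.frame(matrix(rnorm(60 * 2, mean = 3, sd = 1), 60, 2))
colnames(df) <- c("A", "B")
fit <- lm(B ~ A, data = df)
cf1 <- coef(fit)

# endpoints of the line, restricted to the observed range of A
seg <- data.frame(A = range(df$A))
seg$B <- unname(cf1[1] + cf1[2] * seg$A)

p <- ggplot(df, aes(A, B)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x,
              colour = "red", fill = "red", alpha = 0.1) +
  geom_line(data = seg, linetype = "dashed", colour = "green")

p
```

Because the endpoints live in their own data frame, only one segment is drawn, and the slope and intercept can come from any model update.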

Change alpha level of geom_point in legend on top of stat_smooth

I'm running into trouble changing the alpha of my (coloured) points in the legend when I add stat_smooth.
require(ggplot2)
set.seed(1052)
dx <- runif(2000,0,10)
dy <- dx * rep(c(1,-1), each = 1000) + rnorm(2000,0,1)
dcol <- rep(c(TRUE, FALSE), each = 1000)
dd <- data.frame(x = dx, y = dy, col = dcol)
gg <- ggplot(dd) + aes(x = x, y = y, colour = col) + geom_point(alpha = 1/5)
gg
The alpha of the points carries over to the legend (making the colours hard to view), but this question shows that you can override legend details with guides:
magic <- guides(colour = guide_legend(override.aes = list(alpha = 1)))
gg + magic
Cool. But when I throw in stat_smooth, the magic stops working.
gg + stat_smooth(method = "lm")
gg + stat_smooth(method = "lm") + magic
How can I fix this? I would rather have the result below for the legend (white background, line and point with alpha = 1). The issue seems to go away if you use geom_line instead of stat_smooth:
gg + geom_line(alpha = 1/10) + magic
If you want a legend key with just the line and point and no background, add fill=NA inside override.aes=. This removes the grey fill of the legend key, which comes from the confidence interval drawn by stat_smooth() (se=TRUE). Then use theme() with legend.key= to change the key background to white.
ggplot(dd, aes(x = x, y = y, colour = col)) + geom_point(alpha = 1/5)+
stat_smooth(method = "lm")+
guides(colour = guide_legend(override.aes = list(alpha = 1,fill=NA))) +
theme(legend.key=element_rect(fill="white"))
