Remove "group" legend in ggplot - r

So a handful of posts already address how to remove unwanted legends in ggplot.
The wonderful answer posted to "Remove extra legends in ggplot2"
suggests:
For any mapped variable you can supress the appearance of a legend by using guide = 'none' in the appropriate scale_...
However, I'm having problems with unwanted legends being created by adding the group aesthetic.
I tried the scale approach, but it doesn't seem to work with the group argument:
could not find function "scale_group"
A search here didn't provide any insight on the proper function call to modify group aesthetics either.
User #joran provided the following insight in the linked post above:
That's because the group aesthetic doesn't generate any scales or guides on its own. It's always sort of modifying something else. You'll never get a legend for the group aesthetic.
Example
So I could just add show.legend = FALSE to my function call containing group to remove any legend for that function, but this doesn't work out if I want some other portion (i.e., aesthetic) of that call to be included in the legend.
#Set Up Example:
library(lme4)
library(ggplot2)
mod <- lmer(mpg ~ hp + (1 |cyl), data = mtcars)
pred <- predict(mod,re.form = NA)
pdat <- data.frame(mtcars[,c('hp','cyl')], mpg = pred, up = pred+1, low = pred-1)
Adding show.legend = F to function calls work as expected:
gp <-
ggplot(data = mtcars, aes(x = hp, y = mpg, color = cyl, group = cyl), show.legend = F) +
geom_point(aes(group = cyl),show.legend = F) +
facet_wrap(~cyl) +
geom_line(data = pdat, aes(group = cyl),show.legend = F, color = 'orange')
But when I want to add a legend for a geom_ribbon fill based on the same group (and therefore cannot use the show.legend = F argument), I get a legend for my group again...
gp + geom_ribbon(data = pdat, aes(ymin = low, ymax = up, group = cyl, fill = 'mod'), alpha = 0.3) +
scale_fill_manual(values=c("orange"), name="model")
The outputs:

The last geom cover the first geom, you can try this
ggplot(data = mtcars, aes(x = hp, y = mpg, color = cyl, group = cyl), show.legend = F) +
geom_point(aes(group = cyl)) +
facet_wrap(~cyl) +
geom_line(data = pdat, aes(group = cyl),color = 'orange')
gp + geom_ribbon(data = pdat, aes(ymin = low, ymax = up, group = cyl, fill = 'mod'), alpha = 0.3) +
scale_fill_manual(values=c("orange"), name="model")+
scale_color_continuous(guide ='none')
By the way, my idea comes from the link you posted at the top.

Related

How to assign colors to multicolor scatter plot with multicolor fitted lines in ggplot2

Problem
I have some data points stored in data.frame with three variables, x, y, and gender. My goal is to draw several generally fitted lines and also lines specifically fitted for male/female over the scatter plot, with points coloured by gender. It sounds easy but some issues just persist.
What I currently do is to use a new set of x's and predict y's for every model, combine the fitted lines together in a data.frame, and then convert wide to long, with their model name as the third var (from this post: ggplot2: how to add the legend for a line added to a scatter plot? and this: Add legend to ggplot2 line plot I learnt that mapping should be used instead of setting colours/legends separately). However, while I can get a multicolor line plot, the points come without specific colour for gender (already a factor) as I expected from the posts I referenced.
I also know it might be possible to use aes=(y=predict(model)), but I met other problems for this. I also tried to colour the points directly in aes, and assign colours separately for each line, but the legend cannot be generated unless I use lty, which makes legend in the same colour.
Would appreciate any idea, and also welcome to change the whole method.
Code
Note that two pairs of lines overlap. So it only appeared to be two lines. I guess adding some jitter in the data might make it look differently.
slrmen<-lm(tc~x+I(x^2),data=data[data['gender']==0,])
slrwomen<-lm(tc~x+I(x^2),data=data[data['gender']==1,])
prdf <- data.frame(x = seq(from = range(data$x)[1],
to = range(data$x)[2], length.out = 100),
gender = as.factor(rep(1,100)))
prdm <- data.frame(x = seq(from = range(data$x)[1],
to = range(data$x)[2], length.out = 100),
gender = as.factor(rep(0,100)))
prdf$fit <- predict(fullmodel, newdata = prdf)
prdm$fit <- predict(fullmodel, newdata = prdm)
rawplotdata<-data.frame(x=prdf$x, fullf=prdf$fit, fullm=prdm$fit,
linf=predict(slrwomen, newdata = prdf),
linm=predict(slrmen, newdata = prdm))
plotdata<-reshape2::melt(rawplotdata,id.vars="x",
measure.vars=c("fullf","fullm","linf","linm"),
variable.name="fitmethod", value.name="y")
plotdata$fitmethod<-as.factor(plotdata$fitmethod)
plt <- ggplot() +
geom_line(data = plotdata, aes(x = x, y = y, group = fitmethod,
colour=fitmethod)) +
scale_colour_manual(name = "Fit Methods",
values = c("fullf" = "lightskyblue",
"linf" = "cornflowerblue",
"fullm"="darkseagreen", "linm" = "olivedrab")) +
geom_point(data = data, aes(x = x, y = y, fill = gender)) +
scale_fill_manual(values=c("blue","green")) ## This does not work as I expected...
show(plt)
Code for another method (omitted two lines), which generates same-colour legend and multi-color plot:
ggplot(data = prdf, aes(x = x, y = fit)) + # prdf and prdm are just data frames containing the x's and fitted values for different models
geom_line(aes(lty="Female"),colour = "chocolate") +
geom_line(data = prdm, aes(x = x, y = fit, lty="Male"), colour = "darkblue") +
geom_point(data = data, aes(x = x, y = y, colour = gender)) +
scale_colour_discrete(name="Gender", breaks=c(0,1),
labels=c("Male","Female"))
This is related to using the colour aesthetic for lines and the fill aesthetics for points in your own (first) example. In the second example, it works because the colour aesthetic is used for lines and points.
By default, geom_point can not map a variable to fill, because the default point shape (19) doesn't have a fill.
For fill to work on points, you have to specify shape = 21:25 in geom_point(), outside of aes().
Perhaps this small reproducible example helps to illustrate the point:
Simulate data
set.seed(4821)
x1 <- rnorm(100, mean = 5)
set.seed(4821)
x2 <- rnorm(100, mean = 6)
data <- data.frame(x = rep(seq(20,80,length.out = 100),2),
tc = c(x1, x2),
gender = factor(c(rep("Female", 100), rep("Male", 100))))
Fit models
slrmen <-lm(tc~x+I(x^2), data = data[data["gender"]=="Male",])
slrwomen <-lm(tc~x+I(x^2),data = data[data["gender"]=="Female",])
newdat <- data.frame(x = seq(20,80,length.out = 200))
fitted.male <- data.frame(x = newdat,
gender = "Male",
tc = predict(object = slrmen, newdata = newdat))
fitted.female <- data.frame(x = newdat,
gender = "Female",
tc = predict(object = slrwomen, newdata = newdat))
Plot using colour aesthetics
Use the colour aesthetics for both points and lines (specify in ggplot such that it gets inherited throughout). By default, geom_point can map a variable to colour.
library(ggplot2)
ggplot(data, aes(x = x, y = tc, colour = gender)) +
geom_point() +
geom_line(data = fitted.male) +
geom_line(data = fitted.female) +
scale_colour_manual(values = c("tomato","blue")) +
theme_bw()
Plot using colour and fill aesthetics
Use the fill aesthetics for points and the colour aesthetics for lines (specify aesthetics in geom_* to prevent them being inherited). This will reproduce the problem.
ggplot(data, aes(x = x, y = tc)) +
geom_point(aes(fill = gender)) +
geom_line(data = fitted.male, aes(colour = gender)) +
geom_line(data = fitted.female, aes(colour = gender)) +
scale_colour_manual(values = c("tomato","blue")) +
scale_fill_manual(values = c("tomato","blue")) +
theme_bw()
To fix this, change the shape argument in geom_point to a point shape that can be filled (21:25).
ggplot(data, aes(x = x, y = tc)) +
geom_point(aes(fill = gender), shape = 21) +
geom_line(data = fitted.male, aes(colour = gender)) +
geom_line(data = fitted.female, aes(colour = gender)) +
scale_colour_manual(values = c("tomato","blue")) +
scale_fill_manual(values = c("tomato","blue")) +
theme_bw()
Created on 2021-09-19 by the reprex package (v2.0.1)
Note that the scales for colour and fill get merged automatically if the same variable is mapped to both aesthetics.
It seems to me that what you really want to do is use ggplot2::stat_smooth instead of trying to predict yourself.
Borrowing the data from #scrameri:
ggplot(data, aes(x = x, y = tc, color = gender)) +
geom_point() +
stat_smooth(aes(linetype = "X^2"), method = 'lm',formula = y~x + I(x^2)) +
stat_smooth(aes(linetype = "X^3"), method = 'lm',formula = y~x + I(x^2) + I(x^3)) +
scale_color_manual(values = c("darkseagreen","lightskyblue"))

Set the legend of a ggplotly() plot to have only the color and not the shape index

I have the dataframe below:
etf_id<-c("a","b","c","d","e","a","b","c","d","e","a","b","c","d","e")
factor<-c("A","A","A","A","A","B","B","B","B","B","C","C","C","C","C")
normalized<-c(-0.048436801,2.850578601,1.551666490,0.928625186,-0.638111793,
-0.540615895,-0.501691539,-1.099239823,-0.040736139,-0.192048665,
0.198915407,-0.092525810,0.214317734,0.550478998,0.024613778)
df<-data.frame(etf_id,factor,normalized)
and I create a ggplotly() boxplot with:
library(ggplot2)
library(plotly)
ggplotly(ggplot(data = df, aes(x = factor, y = normalized)) +
geom_boxplot(aes(fill = as.factor(factor)),outlier.colour = 'black') +
geom_point(data = df, position = position_dodge(0.75))+geom_point(data = df,
aes(x = factor, y = normalized, shape = etf_id, color = etf_id),
size = 2))
I take as a result a boxplot with this legend:
but I want my legend to have only the color distinction like below. Note that the factors wont be 3 every time but may vary from 1 to 8.
The recommended way to alter plotly elements is to use the style() function. You can identify the elements and traces by inspecting plotly_json().
I'm not sure if there's a more compact way, but you can achieve the desired result using:
p <- ggplotly(ggplot(data = df, aes(x = factor, y = normalized)) +
geom_boxplot(aes(fill = as.factor(factor)),outlier.colour = 'black') +
geom_point(data = df, position = position_dodge(0.75))+geom_point(data = df,
aes(x = factor, y = normalized, shape = etf_id, color = etf_id),
size = 2))
p <- style(p, showlegend = FALSE, traces = 5:9)
for (i in seq_along(levels(df$factor))) {
p <- style(p, name = levels(df$factor)[i], traces = i)
}
p
Note that in this case the factor levels and traces align but that won't always be the case so you may need to adjust this (i.e. i + x).
One quick way would be to add show.legend = FALSE to supress the legend from showing.
library(ggplot2)
ggplot(data = df, aes(x = factor, y = normalized)) +
geom_boxplot(aes(fill = as.factor(factor)),outlier.colour = 'black') +
geom_point(position = position_dodge(0.75)) +
geom_point(aes(x = factor, y = normalized, shape = etf_id, color = etf_id),
size = 2, show.legend=FALSE)
Unfortunately, this does not work when this is passed to ggplotly. You can use theme(legend.position='none') which works but suppresses all the legends instead of specific ones. One dirty hack is to disable specific legend manually
temp_plot <- ggplotly(ggplot(data = df, aes(x = factor, y = normalized)) +
geom_boxplot(aes(fill = as.factor(factor)),outlier.colour = 'black') +
geom_point(position = position_dodge(0.75)) +
geom_point(aes(x = factor, y = normalized, shape = etf_id, color = etf_id),size = 2))
temp_plot[[1]][[1]][4:9] <- lapply(temp_plot[[1]][[1]][4:9], function(x) {x$showlegend <- FALSE;x})
temp_plot

ggplot legend: geom_abline interference

Objective: to have a color legend for geom_point showing a colored dot for each group together with a legend for geom_abline showing one colored line for each line.
Am I doing something wrong? Is there a solution?
# data: mtcars + some made-up straight lines
library(ggplot2)
df = data.frame(Intercept = c(2,3,4), Slope = c(0,0,0), Group = factor(c(1, 2, 3)))
Comment about #1: There's nothing special about the basic plot, but I have grouped the data inside the aes() and made color an aes(). I think it is standard to have both "group" and "color" inside the aes to achieve grouping and coloring.
# 1. my basic plot
ggplot(data = mtcars, aes(x = mpg, y = wt, group = vs, color = factor(vs))) +
geom_point() -> p
Comment about #2: Clearly I did not set up ggplot right to handle the legend properly. I also tried to add group = Group inside the aes. But there is a somewhat more serious problem: geom_point forms 2 groups, geom_abline forms 3 groups, but the legend is showing only 4 color/line combinations. One of them has been merged (the green one). What have I done wrong here?
# 2. my naive attempt to add ablines of 3 different colours
p + geom_abline(data = df, aes(intercept = Intercept, slope = Slope,
colour = Group))
Comment about #3: The ablines have been removed in the legend, but the points are still not right. From here on it gets more and more desperate.
# 3. Suppress the ab_line legend?
p + geom_abline(data = df, aes(intercept = Intercept, slope = Slope,
colour = Group), show.legend = FALSE)
Comment about #4: This is what I'm going for at the moment. Better no legend than a wrong legend. Shame about losing the colors though.
# 4. Remove the geom_abline legend AND colors
p + geom_abline(data = df, aes(intercept = Intercept, slope = Slope))
Comment #5: I don't know what I was hoping here... that if I defined the data and aes inside the call to geom_point() rather than the ggplot(), somehow geom_abline()) would not hijack the colors and legend, but no, it does not appear to make a difference.
# 5. An attempt to set the aes inside geom_point() instead of ggplot()
ggplot() +
geom_point(data = mtcars, aes(x = mpg, y = wt, group = vs, color = factor(vs))) +
geom_abline(data = df, aes(intercept = Intercept, slope = Slope, color = "groups")) +
scale_color_manual(values = c("red", "blue", "black"))
One option would be to use a filled shape for the mtcars data, then you can have a fill scale and a colour scale, rather than two colour scales. You could add an option such as colour="white" to the geom_point statement in order to change the colour of the edges of the points, if you don't want the black outlines.
library(ggplot2)
df = data.frame(Intercept = c(2,3,4), Slope = c(0,0,0), Group = factor(c(1, 2, 3)))
ggplot(data = mtcars, aes(x = mpg, y = wt, group = vs, fill = factor(vs))) +
geom_point(shape=21, size=2) +
geom_abline(data = df, aes(intercept = Intercept, slope = Slope,
colour = Group))
if you need or want a horizontal line in the legend you might consider using this code:
library(ggplot2)
df = data.frame(Intercept = c(2,3,4), Slope = c(0,0,0),
Group = factor(c(1, 2, 3)))
ggplot(data = mtcars,
aes(x = mpg, y = wt, group = vs, fill = factor(vs))) +
geom_point(shape=21, size=2) +
geom_hline(data = df,
aes(yintercept = Intercept,colour = Group))
plot with geom_hline

Have separate legends for a set of point-line plots, and a vertical line plot

Example data frame (if there's a better/more idiomatic way to do this, let me know):
n <- 10
group <- rep(c("A","B","C"),each = n)
x <- rep(seq(0,1,length = n),3)
y <- ifelse(group == "A",1+x,ifelse(group == "B",2+2*x,3+3*x))
df <- data.frame(group,x,y)
xd <- 0.5
des <- data.frame(xd)
I want to plot create point-line plots for the data in df, add a vertical curve at the x location indicated by xd, and get readable legends for both. I tried the following:
p <- ggplot(data = df, aes(x = x, y = y, color = group)) + geom_point() + geom_line(aes(linetype=group))
p <- p + geom_vline(data = des, aes(xintercept = xd), color = "blue")
p
Not quite what I had in mind, there's no legend for the vertical line.
A small modification (I don't understand why geom_vline is one of the few geometries with a show.legend parameter, which moreover defaults to FALSE!):
p <- ggplot(data = df, aes(x = x, y = y, color = group)) + geom_point() + geom_line(aes(linetype=group))
p <- p + geom_vline(data = des, aes(xintercept = xd), color = "blue", show.legend = TRUE)
p
At least now the vertical bar is showing in the legend, but I don't want it to go in the same "category" (?) as group. I would like another legend entry, titled Design, and containing only the vertical line. How can I achieve this?
A possible approach is to add an extra dummy aesthetic like fill =, which we'll subsequently use to create the second legend in combination with scale_fill_manual() :
ggplot(data = df, aes(x = x, y = y, color = group)) +
geom_point() +
geom_line(aes(linetype=group), show.legend = TRUE) +
geom_vline(data = des,
aes(xintercept = xd, fill = "Vertical Line"), # add dummy fill
colour = "blue") +
scale_fill_manual(values = 1, "Design", # customize second legend
guide = guide_legend(override.aes = list(colour = c("blue"))))

Adding legend to a single plot of multiple linear regression plot

I plotted two ggplots from two different datasets in one single plot. plots are simple linear regression. I want to add legend both for lines and dots in the plot with different colours. How can I do that? The code I used for plot is as below. But, I failed to add a desirable legend to that.
ggplot() +
geom_point(aes(x = Time_1, y = value1)) +
geom_point(aes(x = Time_2, y = value2)) +
geom_line(aes(x = Time_1, y = predict(reg, newdata = dataset)))+
geom_line(aes(x = Time_Month.x, y = predict(regressor, newdata = training_set)))+
ggtitle('Two plots in a single plot')
ggplot2 adds legends automatically if it has groups within the data. Your original code provides the minimum amount of information to ggplot(), basically enough for it to work but not enough to create a legend.
Since your data comes from two different objects due to the two different regressions, then it looks like all you need in this case is to add the 'color = "INSERT COLOR NAME"' argument to each geom_point() and each geom_line(). Using R's built-in mtcars data set for example, what you have is similar to
ggplot(mtcars) + geom_point(aes(x = cyl, y = mpg)) + geom_point(aes(x = cyl, y = wt)) + ggtitle("Example Graph")
Graph without Legend
And what you want can be obtained by using something similar to,
ggplot(mtcars) + geom_point(aes(x = cyl, y = mpg, color = "blue")) + geom_point(aes(x = cyl, y = wt, color = "green")) + ggtitle("Example Graph")
Graph with Legend
Which would seem to translate to
ggplot() +
geom_point(aes(x = Time_1, y = value1, color = "blue")) +
geom_point(aes(x = Time_2, y = value2, color = "green")) +
geom_line(aes(x = Time_1, y = predict(reg, newdata = dataset), color = "red"))+
geom_line(aes(x = Time_Month.x, y = predict(regressor, newdata = training_set), color = "yellow"))+
ggtitle('Two plots in a single plot')
You could also use the size, shape, or alpha arguments inside of aes() to differentiate the different series.

Resources