My grouped Cross bars DON'T dodge, while my Box plots DO - r

I want my crossbars to dodge like my boxplots do, but in my example it doesn't work. Can anyone explain what I'm doing wrong or fix my code? I used mtcars as an example and included the result as a picture, in which my crossbars DON'T dodge.
library(ggplot2)
mtcars$am = factor(mtcars$am)
mtcars$vs = factor(mtcars$vs)
cleanup = theme(
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
axis.line = element_line(colour = "black"),
legend.key = element_rect(fill = "white"),
text = element_text(size = 10))
p = ggplot(data = mtcars, aes(x = am , y = mpg, colour = vs)) +
geom_boxplot(aes(colour = vs)) +
stat_summary(aes(colour = vs),
fun.data = "mean_cl_normal",
geom = "crossbar",
position = position_dodge(width = 0.90),
width = .2,
col = "red")
p +
cleanup +
xlab("AM") +
ylab("Miles per Gallon") +
scale_colour_manual(name = "VS",
values = c("Light Gray",
"Dark Grey"))
Which gave me this Graph:

The reason is simple: Specifying col = "red" overwrites the aes mapping to color. There is actually only one group for the crossbars and thus nothing to dodge.
You can fix this by mapping to group:
ggplot(mtcars, aes(x = am , y = mpg, colour = vs)) +
#geom_boxplot() +
stat_summary(aes(group = vs),
fun.data = "mean_cl_normal",
geom = "crossbar",
position = position_dodge(width = 0.9),
width = .2,
col = "red")
However, discarding a color scale only for the crossbars obviously doesn't result in a good plot.
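If the boxplots should stay in the plot, the same group mapping can be combined with them. The following is a minimal sketch rather than the answerer's code. It assumes a dodge width of 0.75, chosen to line up with the default boxplot spacing, and it swaps in ggplot2's built-in mean_se for mean_cl_normal so that the Hmisc package is not required. The names `cars` and `crossbars` are illustrative only.

```r
library(ggplot2)

# Work on a copy of mtcars with the grouping variables as factors
cars <- transform(mtcars, am = factor(am), vs = factor(vs))

p <- ggplot(cars, aes(x = am, y = mpg, colour = vs)) +
  geom_boxplot() +
  stat_summary(aes(group = vs),          # keeps one group per vs level
               fun.data = mean_se,       # assumption: swapped in for mean_cl_normal
               geom = "crossbar",
               position = position_dodge(width = 0.75),  # assumed width
               width = 0.2,
               colour = "red")           # constant colour no longer removes the grouping

# Inspect the computed crossbar layer: one x position per am/vs combination
crossbars <- layer_data(p, 2)
length(unique(crossbars$x))

p
```

Because the grouping now comes from `group` rather than from the colour mapping, the crossbars can be given any constant colour without collapsing into a single group.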

Related

How to set a conditional size scale based on name in ggplot?

Below is a simple bubble plot for three character traits (Lg_chr, Mid_chr, and Sm_chr) across three locations.
All good, except that because the range of Lg_chr is several orders of magnitude larger than the ranges for the other two traits, it swamps out the area differences between the smaller states, making them very difficult to see. For example, the areas of the points for Location_3's Mid_chr (70) and Sm_chr (5) look almost the same.
Is there a way to set a conditional size scale based on name in ggplot2 without having to facet wrap them? Maybe a conditional statement for scale_size_continuous(range = c(<>, <>)) separately for Lg_chr, Mid_chr, and Sm_chr?
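library(ggplot2)
library(dplyr)    # provides %>%
library(tidyr)    # provides pivot_longer()
library(stringr)  # provides str_to_title()
test_df = data.frame(lg_chr = c(100000, 150000, 190000),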
mid_chr = c(50, 90, 70),
sm_chr = c(15, 10, 5),
names = c("location_1", "location_2", "location_3"))
#reformat for graphing
test_df_long<- test_df %>% pivot_longer(!names,
names_to = c("category"),
values_to = "value")
#plot
ggplot(test_df_long,
aes(x = str_to_title(category),
y = str_to_title(names),
colour = str_to_title(names),
size = value)) +
geom_point() +
geom_text(aes(label = value),
colour = "white",
size = 3) +
scale_x_discrete(position = "top") +
scale_size_continuous(range = c(10, 50)) +
scale_color_manual(values = c("blue", "red",
"orange")) +
labs(x = NULL, y = NULL) +
theme(legend.position = "none",
panel.background = element_blank(),
panel.grid = element_blank(),
axis.ticks = element_blank())
Edit:
You could use ggplot_build to manually modify the point layer [[1]] to specify the sizes of your points like this:
#plot
p <- ggplot(test_df_long,
aes(x = str_to_title(category),
y = str_to_title(names),
colour = str_to_title(names),
size = value)) +
geom_point() +
geom_text(aes(label = value),
colour = "white",
size = 3) +
scale_x_discrete(position = "top") +
scale_color_manual(values = c("blue", "red",
"orange")) +
labs(x = NULL, y = NULL) +
theme(legend.position = "none",
panel.background = element_blank(),
panel.grid = element_blank(),
axis.ticks = element_blank())
q <- ggplot_build(p)
q$data[[1]]$size <- c(7,4,1,8,5,2,9,6,3)*5
q <- ggplot_gtable(q)
plot(q)
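Rather than hard-coding the nine sizes, they could also be derived from the data. The following is a minimal sketch of a different technique from the one above: it rescales `value` to the 0 to 1 range within each category and maps size to that rescaled column, so each trait gets its own effective size range. The column name `rel_size` and the size range are assumptions made for illustration, and the data frame is rebuilt in base R so the block is self-contained.

```r
library(ggplot2)

# Same values as test_df_long above, built directly in long form
test_df_long <- data.frame(
  names    = rep(c("location_1", "location_2", "location_3"), each = 3),
  category = rep(c("lg_chr", "mid_chr", "sm_chr"), times = 3),
  value    = c(100000, 50, 15,
               150000, 90, 10,
               190000, 70, 5)
)

# Min-max rescale within each category (rel_size is a hypothetical helper column)
test_df_long$rel_size <- ave(
  test_df_long$value, test_df_long$category,
  FUN = function(v) (v - min(v)) / (max(v) - min(v))
)

ggplot(test_df_long,
       aes(x = category, y = names, colour = names, size = rel_size)) +
  geom_point() +
  geom_text(aes(label = value), colour = "white", size = 3) +  # labels keep the raw values
  scale_x_discrete(position = "top") +
  scale_size_continuous(range = c(10, 30)) +                  # assumed size range
  scale_color_manual(values = c("blue", "red", "orange")) +
  labs(x = NULL, y = NULL) +
  theme(legend.position = "none",
        panel.background = element_blank(),
        panel.grid = element_blank(),
        axis.ticks = element_blank())
```

With this approach point sizes are only comparable within a column, not across columns, which is why the raw values are still printed as labels.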
You could use scale_size with a log10 scale to make the difference more visible like this:
#plot
ggplot(test_df_long,
aes(x = str_to_title(category),
y = str_to_title(names),
colour = str_to_title(names),
size = value)) +
geom_point() +
geom_text(aes(label = value),
colour = "white",
size = 3) +
scale_size(trans="log10", range = c(10, 50)) +
scale_x_discrete(position = "top") +
scale_color_manual(values = c("blue", "red",
"orange")) +
labs(x = NULL, y = NULL) +
theme(legend.position = "none",
panel.background = element_blank(),
panel.grid = element_blank(),
axis.ticks = element_blank())

Adding 2 vlines to a ggplot, with an additional custom legend for the lines

I'm trying to have a ggplot with two vertical lines on it, with a separate custom legend to explain what the lines represent.
This is my code (using iris):
irate <- as.data.frame(iris)
irate$Species <- as.character(irate$Species)
irritating <- ggplot(irate) +
geom_line(aes(y = Sepal.Length, x = Sepal.Width), color = "blue") +
geom_point(aes(y = Sepal.Length, x = Sepal.Width, color = Species), size = 5) +
theme(legend.position = "right", axis.text.y = element_blank(), axis.title.y = element_blank(), axis.ticks.y = element_blank(), panel.grid.major.y = element_blank())+
labs(title = "The chart", x = "Sepal Width") +
geom_vline(color = "black", linetype = "dashed", aes(xintercept = 3))+
geom_vline(color = "purple", linetype = "dashed", aes(xintercept = 4))
irritating
I've tried using things like scale_color_manual (etc), but for some reason when doing so it will interfere with the main legend and not produce a separate one.
Following answers to questions like "Add legend to geom_vline", I added:
+ scale_color_manual(name = "still problematic", values = c("black", "purple", "red"))
Adding "red" to the vector is the only way to get it to produce a chart (otherwise there's an "Insufficient values in manual scale. 3 needed but only 2 provided." error).
One option to achieve your desired result would be to use a different aesthetic to create the color legend for your vlines. In my code below I map on the linetype aes and use the override.aes argument of guide_legend to assign the right colors:
irate <- as.data.frame(iris)
irate$Species <- as.character(irate$Species)
library(ggplot2)
base <- ggplot(irate) +
geom_line(aes(y = Sepal.Length, x = Sepal.Width), color = "white") +
geom_point(aes(y = Sepal.Length, x = Sepal.Width, color = Species), size = 5) +
theme(legend.position = "right", axis.text.y = element_blank(), axis.title.y = element_blank(), axis.ticks.y = element_blank(), panel.grid.major.y = element_blank())+
labs(title = "The chart", x = "Sepal Width")
base +
geom_vline(color = "black", aes(xintercept = 3, linetype = "Black Line"))+
geom_vline(color = "purple", aes(xintercept = 4, linetype = "Purple line")) +
scale_linetype_manual(name = "still problematic", values = c("dashed", "dashed")) +
guides(linetype = guide_legend(override.aes = list(color = c("black", "purple"))))
A second and perhaps cleaner solution would be to use the ggnewscale package, which allows multiple legends for the same aesthetic:
library(ggnewscale)
base +
new_scale_color() +
geom_vline(linetype = "dashed", aes(xintercept = 3, color = "Black Line"))+
geom_vline(linetype = "dashed", aes(xintercept = 4, color = "Purple line")) +
scale_color_manual(name = "still problematic", values = c("black", "purple"))
Here is a way with package ggnewscale that makes plotting two legends for two color mappings very easy.
The main trick is to create a data.frame with the x intercept values and colors, then assign this data set to the data argument of geom_vline. If this is run after new_scale_color() the colors will be the right ones.
library(ggplot2)
library(ggnewscale)
irate <- iris
irate$Species <- as.character(irate$Species)
happy <- data.frame(xintercept = c(3, 4), color = c("black", "purple"))
delightful <- ggplot(irate) +
geom_line(aes(y = Sepal.Length, x = Sepal.Width), color = "blue") +
geom_point(aes(y = Sepal.Length, x = Sepal.Width, color = Species), size = 5) +
theme(legend.position = "right", axis.text.y = element_blank(), axis.title.y = element_blank(), axis.ticks.y = element_blank(), panel.grid.major.y = element_blank())+
labs(title = "The chart", x = "Sepal Width") +
new_scale_color() +
geom_vline(
data = happy,
mapping = aes(xintercept = xintercept, color = color),
linetype = "dashed"
) +
scale_color_manual(values = c(black = "black", purple = "purple"))
delightful
Created on 2022-11-30 with reprex v2.0.2
By mapping linetype inside aes() to get the lines into the legend, you can then override the colours shown in the guide:
library(ggplot2)
irate <- as.data.frame(iris)
irate$Species <- as.character(irate$Species)
irritating <- ggplot(irate) +
geom_line(aes(y = Sepal.Length, x = Sepal.Width), color = "white") +
geom_point(aes(y = Sepal.Length, x = Sepal.Width, color = Species), size = 5) +
theme(
legend.position = "right",
axis.text.y = element_blank(),
axis.title.y = element_blank(),
axis.ticks.y = element_blank(),
panel.grid.major.y = element_blank()
) +
labs(title = "The chart", x = "Sepal Width") +
geom_vline(linewidth = 1.5,
color = "black",
aes(xintercept = 3, linetype = "Something")) +
geom_vline(linewidth = 1.5,
color = "purple",
aes(xintercept = 4, linetype = "Another thing")) +
scale_linetype_manual(
"Things",
values = c("dashed", "dashed"),
guide = guide_legend(override.aes = list(colour = c("purple", "black")))
)
irritating
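In the linetype-based answers, the colours passed to override.aes have to follow the order of the legend keys, and for character labels that order is alphabetical rather than the order in which the layers were added. A minimal sketch of one way to avoid mismatches is to keep a named lookup of label to colour and sort it the same way. The names `line_cols`, `vlines`, and `key_cols`, and the labels themselves, are hypothetical.

```r
library(ggplot2)

# Hypothetical label -> colour lookup for the two reference lines
line_cols <- c("Purple line" = "purple", "Black line" = "black")
vlines <- data.frame(xintercept = c(4, 3), label = names(line_cols))

# Legend keys for character labels are sorted, so sort the lookup the same way
key_cols <- unname(line_cols[sort(names(line_cols))])

p <- ggplot(iris) +
  geom_point(aes(x = Sepal.Width, y = Sepal.Length, colour = Species), size = 5) +
  geom_vline(data = vlines,
             aes(xintercept = xintercept, linetype = label),
             colour = unname(line_cols[vlines$label])) +  # one colour per row of vlines
  scale_linetype_manual(name = "Lines",
                        values = c("Black line" = "dashed",
                                   "Purple line" = "dashed")) +
  guides(linetype = guide_legend(override.aes = list(colour = key_cols)))
p
```

The points keep their own Species colour legend, because the lines are distinguished through linetype rather than through a second colour scale.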

Remove standard deviation from legend ggplot in R

I would like to remove sd bars and mean from legend while keeping them on the main figure. In my case I have this:
And I want something like this:
This is my code:
data_summary <- function(x) {
m <- mean(x)
ymin <- m-std.error(x)
ymax <- m+std.error(x)
return(c(y=m,ymin=ymin,ymax=ymax))
}
a<-ggplot(esto,aes(x= Group, y=value, colour = Group, fill=fluency_test),
pattern_fill = "black",
colour = 'black') +
geom_boxplot(outlier.shape = NA,lwd=1.5) +
guides(colour = "none")+
geom_point(position=position_jitterdodge(),alpha=0.5)+
xlab("Group")+
labs(y = names(features)[[i_feature]])+
theme(axis.text.x = element_text(angle = 45, hjust = 1))+
stat_summary(fun.data=data_summary, color="black",position=position_dodge(width=0.75), size = 1.3)+
scale_shape_manual("Summary Statistics", values=c("Mean"="+"))+
scale_color_manual(values=c("#7CAE00","#F8766D","#00BFC4","#C77CFF"))+
scale_fill_manual(values=c("white","azure3"))+
theme_gray(base_size = 18)+
theme(legend.key.size = unit(2, "cm"),
legend.key.width = unit(1,"cm"),legend.title=element_blank(),panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"))
You can prevent the point range from being displayed in the legend by adding show.legend=FALSE to stat_summary.
Using a minimal reprex based on mtcars:
library(ggplot2)
ggplot(mtcars, aes(x = cyl, y = mpg, color = factor(cyl))) +
geom_boxplot() +
stat_summary(color = "black", show.legend = FALSE)
#> No summary function supplied, defaulting to `mean_se()`
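The same idea carries over to a grouped plot closer to the one in the question. This is a minimal sketch on mtcars, not the asker's data: it assumes the summary should be the mean with standard error via ggplot2's mean_se, and a dodge width of 0.75 to line the point ranges up with the dodged boxplots. `cars` is an illustrative name.

```r
library(ggplot2)

# Factors for the x axis and for the fill grouping
cars <- transform(mtcars, cyl = factor(cyl), am = factor(am))

p <- ggplot(cars, aes(x = cyl, y = mpg, fill = am)) +
  geom_boxplot(outlier.shape = NA) +
  stat_summary(fun.data = mean_se,                       # mean +/- standard error
               colour = "black",
               position = position_dodge(width = 0.75),  # assumed width
               show.legend = FALSE)                       # keep the point range out of the legend
p
```

Only the boxplot layer contributes to the fill legend, while the point ranges are still drawn in the panel.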

ggplot2: Adjust legend symbols in overlayed plot

I need to create a plot in which a histogram is overlaid by a density. Here is my result so far using some example data:
library("ggplot2")
set.seed(1234)
a <- round(rnorm(10000, 5, 5), 0)
b <- rnorm(10000, 5, 7)
df <- data.frame(a, b)
ggplot(df) +
geom_histogram(aes(x = a, y = ..density.., col = "histogram", linetype = "histogram"), fill = "blue") +
stat_density(aes(x = b, y = ..density.., col = "density", linetype = "density"), geom = "line") +
scale_color_manual(values = c("red", "white"),
breaks = c("density", "histogram")) +
scale_linetype_manual(values = c("solid", "solid")) +
theme(legend.title = element_blank(),
legend.position = c(.75, .75),
legend.text = element_text(size = 15))
Unfortunately I can not figure out how I can change the symbols in the legend properly. The first symbol should be a relatively thick red line and the second symbol should be a blue box without the white line in the middle.
Based on some internet research, I tried to change different things in scale_linetype_manual and further I tried to use override.aes, but I could not figure out how I would have to use it in this specific case.
EDIT - Here is the best solution based on the very helpful answers below.
ggplot(df) +
geom_histogram(aes(x = a, y = ..density.., linetype = "histogram"),
fill = "blue",
# I added the following 2 lines to keep the white colour around the histogram.
col = "white") +
scale_linetype_manual(values = c("solid", "solid")) +
stat_density(aes(x = b, y = ..density.., linetype = "density"),
geom = "line", color = "red") +
theme(legend.title = element_blank(),
legend.position = c(.75, .75),
legend.text = element_text(size = 15),
legend.key = element_blank()) +
guides(linetype = guide_legend(override.aes = list(linetype = c(1, 0),
fill = c("white", "blue"),
size = c(1.5, 1.5))))
As you thought, most of the work can be done via override.aes for linetype.
Note I removed color from the aes of both layers to avoid some trouble I was having with the legend box outline. Doing this also avoids the need for the scale_*_* function calls. To set the color of the density line I used color outside of aes.
In override.aes I set the linetype to be solid or blank, the fill to be either white or blue, and the size to be 2 or 0 for the density box and histogram box, respectively.
ggplot(df) +
geom_histogram(aes(x = a, y = ..density.., linetype = "histogram"), fill = "blue") +
stat_density(aes(x = b, y = ..density.., linetype = "density"), geom = "line", color = "red") +
theme(legend.title = element_blank(),
legend.position = c(.75, .75),
legend.text = element_text(size = 15),
legend.key = element_blank()) +
guides(linetype = guide_legend(override.aes = list(linetype = c(1, 0),
fill = c("white", "blue"),
size = c(2, 0))))
The fill and colour aesthetics are labelled by histogram and density respectively, and their values set using scale_*_manual. Doing so maps directly to the desired legend without needing any overrides.
ggplot(df) +
geom_histogram(aes(x = a, y = ..density.., fill = "histogram")) +
stat_density(aes(x = b, y = ..density.., colour="density"), geom = "line") +
scale_fill_manual(values = c("blue")) +
scale_colour_manual(values = c("red")) +
labs(fill="", colour="") +
theme(legend.title = element_blank(),
legend.position = c(.75, .75),
legend.box.just = "left",
legend.background = element_rect(fill=NULL),
legend.key = element_rect(fill=NULL),
legend.text = element_text(size = 15))
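In recent ggplot2 releases the `..density..` notation is deprecated in favour of after_stat(density). The following is a minimal sketch of the fill/colour approach written that way, assuming ggplot2 3.3.0 or later. The explicit `bins = 30` is an assumption added to silence the default-bins message, and the legend position is left at its default to avoid version-specific theme arguments.

```r
library(ggplot2)

set.seed(1234)
df <- data.frame(a = round(rnorm(10000, 5, 5), 0),
                 b = rnorm(10000, 5, 7))

p <- ggplot(df) +
  # histogram scaled to density, labelled through the fill aesthetic
  geom_histogram(aes(x = a, y = after_stat(density), fill = "histogram"),
                 bins = 30) +
  # density line, labelled through the colour aesthetic
  stat_density(aes(x = b, y = after_stat(density), colour = "density"),
               geom = "line") +
  scale_fill_manual(values = c(histogram = "blue")) +
  scale_colour_manual(values = c(density = "red")) +
  theme(legend.title = element_blank(),
        legend.text = element_text(size = 15))
p
```

Because each layer uses its own aesthetic, the legend shows a blue box for the histogram and a red line for the density without any override.aes adjustments.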

Remove fill around legend key in ggplot

I would like to remove the gray rectangle around the legend. I have tried various methods but none have worked.
ggtheme <-
theme(
axis.text.x = element_text(colour='black'),
axis.text.y = element_text(colour='black'),
panel.background = element_blank(),
panel.grid.minor = element_blank(),
panel.grid.major = element_blank(),
panel.border = element_rect(colour='black', fill=NA),
strip.background = element_blank(),
legend.justification = c(0, 1),
legend.position = c(0, 1),
legend.background = element_rect(colour = NA),
legend.key = element_rect(colour = "white", fill = NA),
legend.title = element_blank()
)
colors <- c("red", "blue")
df <- data.frame(year = c(1:10), value = c(10:19), gender = rep(c("male","female"),each=5))
ggplot(df, aes(x = year, y = value)) + geom_point(aes(colour=gender)) +
stat_smooth(method = "loess", formula = y ~ x, level=0, size = 1,
aes(group = gender, colour=gender)) +
ggtheme + scale_color_manual(values = colors)
You get this grey color inside the legend keys because stat_smooth() by default also draws a confidence interval around the line, with a fill (grey unless fill= is mapped inside aes()).
One solution is to set se=FALSE for stat_smooth() if you don't need the confidence intervals.
+stat_smooth(method = "loess", formula = y ~ x, level=0, size = 1,
aes(group = gender, colour=gender),se=FALSE)
Another solution is to use the function guides() and override.aes= to remove fill from the legend but keep confidence intervals around lines.
+ guides(color=guide_legend(override.aes=list(fill=NA)))
theme_set(theme_gray() + theme(legend.key=element_blank()))
If you also want to remove the grey background:
theme_set(theme_bw() + theme(legend.key=element_blank()))
+ theme(legend.background=element_blank())
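The fragments above can be put together into one runnable plot. This is a minimal sketch under stated assumptions: the data frame is made up, with 20 points per group so that loess has enough observations, and it keeps the confidence band in the panel while removing the grey fill from the legend keys.

```r
library(ggplot2)

# Made-up data: 20 points per group (not the data from the question)
df <- data.frame(
  year   = rep(1:20, times = 2),
  value  = c(10 + sqrt(1:20), 12 + log(1:20)),
  gender = rep(c("male", "female"), each = 20)
)

p <- ggplot(df, aes(x = year, y = value, colour = gender)) +
  geom_point() +
  # the confidence band stays in the panel ...
  stat_smooth(method = "loess", formula = y ~ x) +
  # ... but its fill is dropped from the legend keys
  guides(colour = guide_legend(override.aes = list(fill = NA))) +
  scale_colour_manual(values = c("red", "blue")) +
  theme(legend.key = element_blank(),          # no key background
        legend.background = element_blank())   # no legend box background
p
```

If the confidence band is not needed at all, passing se = FALSE to stat_smooth() makes the guides() call unnecessary.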
