R ggplot2 scale alpha discrete to display in legend

I'm trying to make a plot across two factors (strain and sex) and use the alpha value to communicate sex. Here is my code and the resulting plot:
ggplot(subset(df.zfish.data.overall.long, day=='day_01' & measure=='distance.from.bottom'), aes(x=Fish.name, y=value*100)) +
geom_boxplot(aes(alpha=Sex, fill=Fish.name), outlier.shape=NA) +
scale_alpha_discrete(range=c(0.3,0.9)) +
scale_fill_brewer(palette='Set1') +
coord_cartesian(ylim=c(0,10)) +
ylab('Distance From Bottom (cm)') +
xlab('Strain') +
scale_x_discrete(breaks = c('WT(AB)', 'WT(TL)', 'WT(TU)', 'WT(WIK)'), labels=c('AB', 'TL', 'TU', 'WIK')) +
guides(color=guide_legend('Fish.name'), fill=FALSE) +
theme_classic(base_size=10)
I'd like for the legend to reflect the alpha value in the plot (i.e. alpha value F = 0.3, alpha value M=0.9) as greyscale/black as I think that will be intuitive.
I've tried altering the scale_alpha_discrete, but cannot figure out how to send it a single color for the legend. I've also tried playing with 'guides()' without much luck. I suspect there's a simple solution, but I cannot see it.

One option to achieve your desired result would be to set the fill color for the alpha legend via the override.aes argument of guide_legend.
Making use of mtcars as example data:
library(ggplot2)
ggplot(mtcars, aes(x = cyl, y = mpg)) +
geom_boxplot(aes(fill = factor(cyl), alpha = factor(am))) +
scale_alpha_discrete(range = c(0.3, 0.9), guide = guide_legend(override.aes = list(fill = "black"))) +
scale_fill_brewer(palette='Set1') +
theme_classic(base_size=10) +
guides(fill = "none")
#> Warning: Using alpha for a discrete variable is not advised.
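If the warning is a nuisance, one possible variant is to set the alpha values per level with scale_alpha_manual, which should avoid the message that scale_alpha_discrete emits. This is only a sketch on the same mtcars example; the alpha_values vector and the legend title "am" are illustrative choices, not part of the original answer.

```r
library(ggplot2)

# Explicit alpha for each level of factor(am); the names must match the levels.
alpha_values <- c("0" = 0.3, "1" = 0.9)

p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot(aes(fill = factor(cyl), alpha = factor(am))) +
  scale_alpha_manual(
    name = "am",
    values = alpha_values,
    # Draw the alpha legend keys in black instead of the default key fill.
    guide = guide_legend(override.aes = list(fill = "black"))
  ) +
  scale_fill_brewer(palette = "Set1") +
  theme_classic(base_size = 10) +
  guides(fill = "none")

p
```

The override.aes part is unchanged from the answer above; only the way the alpha values are supplied differs.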

Related

ggplot2 and R - Applying custom colors to a multi group histogram in long format

I've made a histogram graph that shows the distribution of lidar returns per elevation for three lidar scans I have done.
I've converted my data to long format, with:
one column called 'value', describing the z position of each point
one column called 'variable', containing the name of each
scan group
In the attached image you can see the histograms of my three scan groups. I am currently using viridis to color the histogram by scan group (ie. the name of the scan in the variable column). However, I want to match the colours in the graph with colours I already have.
How might I do this?
The hex colours I'd like to use for each of my three histograms are:
lightgreen = "#62FE96"
lightred = "#FE206B"
darkpurple = "#62278E"
A link to my data - 'density2'
My current code:
library(tidyverse)
library(viridisLite)
library(viridis)
# histogram
p <- density2 %>%
ggplot( aes(x=value,color = variable, show.legend = FALSE)) +
geom_histogram(binwidth = 1, alpha = 0.5, position="identity") +
scale_color_viridis(discrete =TRUE) +
scale_fill_viridis(discrete=TRUE) +
theme_bw() +
labs(fill="") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())
p + scale_y_sqrt() + theme(legend.position="none") + labs(y = "data pts", x = "elevation (m)")
Any help would be most appreciated!
Delete the scale_color_viridis and scale_fill_viridis lines - these are applying the Viridis color scale. Replace with scale_fill_manual(values = c(lightgreen, lightred, darkpurple)). And in your aesthetic mapping replace color = variable with fill = variable. For a histogram, color refers to the color of the lines outlining each bar, and fill refers to the color each bar is filled in.
This should leave you with:
p <- density2 %>%
ggplot(aes(x = value, fill = variable)) +
geom_histogram(binwidth = 1, alpha = 0.5, position = "identity") +
scale_fill_manual(values = c(lightgreen, lightred, darkpurple)) +
theme_bw() +
labs(fill = "") +
theme(panel.grid = element_blank())
p + scale_y_sqrt() +
theme(legend.position = "none") +
labs(y = "data pts", x = "elevation (m)")
I've also done some other clean-up. show.legend = FALSE does not belong inside aes() - and your theme(legend.position = "none") should take care of it.
I did not download your data, save it in my working directory, import it into R, and test this code on it. If you need more help, please post a small subset of your data in a copy/pasteable format (e.g., dput(density2[1:20, ]) for the first 20 rows---choose a suitable subset) and I'll be happy to test and adjust.
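Since the real data was not available, here is a minimal sketch on simulated data. The data frame density2_demo and the group names scan_a, scan_b and scan_c are made up for illustration; replace them with the actual values in your variable column. Naming the colour vector ties each colour to a specific group instead of relying on the order of the groups.

```r
library(ggplot2)

set.seed(1)

# Hypothetical stand-in for density2: long format with 'value' and 'variable'.
density2_demo <- data.frame(
  value = c(rnorm(200, 10, 2), rnorm(200, 15, 3), rnorm(200, 20, 2)),
  variable = rep(c("scan_a", "scan_b", "scan_c"), each = 200)
)

# Named vector: each (assumed) group name is paired with one of the hex colours.
scan_cols <- c(scan_a = "#62FE96", scan_b = "#FE206B", scan_c = "#62278E")

p_demo <- ggplot(density2_demo, aes(x = value, fill = variable)) +
  geom_histogram(binwidth = 1, alpha = 0.5, position = "identity") +
  scale_fill_manual(values = scan_cols) +
  theme_bw() +
  theme(panel.grid = element_blank(), legend.position = "none") +
  labs(y = "data pts", x = "elevation (m)")

p_demo
```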

R: ggplot2 density plot shows wrong fill colors

I would like to plot densities of two variables ("red_variable", "green_variable") from two independent dataframes on one density plot, using red and green color for the two variables.
This is my attempt at coding:
library(ggplot2)
### Create dataframes
red_dataframe <- data.frame(red_variable = c(10,11,12,13,14))
green_dataframe <- data.frame(green_variable = c(6,7,8,9,10))
mean(red_dataframe$red_variable) # mean is 12
mean(green_dataframe$green_variable) # mean is 8
### Set colors
red_color= "#FF0000"
green_color= "#008000"
### Trying to plot densities with correct colors and correct legend entries
ggplot() +
geom_density(aes(x=red_variable, fill = red_color, alpha=0.5), data=red_dataframe) +
geom_density(aes(x=green_variable, fill = green_color, alpha=0.5), data=green_dataframe) +
scale_fill_manual(labels = c("Density of red_variable", "Density of green_variable"), values = c(red_color, green_color)) +
xlab("X value") +
ylab("Density") +
labs(fill = "Legend") +
guides(alpha=FALSE)
Result: The legend shows correct colors, but the colors on the plot are wrong: The "red" variable is plotted with green color, the "green" variable with red color. The "green" density (mean=8) should appear left and the "red" density (mean=12) on the right on the x-axis. This behavior of the plot doesn't make any sense to me.
I can in fact get the desired result by switching red and green in the code:
### load ggplot2
library(ggplot2)
### Create dataframes
red_dataframe <- data.frame(red_variable = c(10,11,12,13,14))
green_dataframe <- data.frame(green_variable = c(6,7,8,9,10))
mean(red_dataframe$red_variable) # mean is 12
mean(green_dataframe$green_variable) # mean is 8
### Set colors
red_color= "#FF0000"
green_color= "#008000"
### Trying to plot densities with correct colors and correct legend entries
ggplot() +
geom_density(aes(x=red_variable, fill = green_color, alpha=0.5), data=red_dataframe) +
geom_density(aes(x=green_variable, fill = red_color, alpha=0.5), data=green_dataframe) +
scale_fill_manual(labels = c("Density of red_variable", "Density of green_variable"), values = c(red_color, green_color)) +
xlab("X value") +
ylab("Density") +
labs(fill = "Legend") +
guides(alpha=FALSE)
... While the plot makes sense now, the code doesn't. I cannot really trust code doing the opposite of what I would expect it to do. What's the problem here? Am I color blind?
In your code, to get each color in the right position, you need to specify fill = red_color or fill = green_color (as well as alpha, since it is a constant, as pointed out by @Gregor) outside of aes(), such as:
...+
geom_density(aes(x=red_variable), alpha=0.5, fill = red_color, data=red_dataframe) +
geom_density(aes(x=green_variable), alpha=0.5, fill = green_color, data=green_dataframe) + ...
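For reference, a complete version of the snippet above might look like the following, reusing the data frames and colors from the question. Note that constants set outside aes() do not produce a legend, so this variant is only a sketch of the "right colors in the right place" part.

```r
library(ggplot2)

red_dataframe <- data.frame(red_variable = c(10, 11, 12, 13, 14))
green_dataframe <- data.frame(green_variable = c(6, 7, 8, 9, 10))
red_color <- "#FF0000"
green_color <- "#008000"

# fill and alpha are constants here, so they are set outside aes().
p_const <- ggplot() +
  geom_density(aes(x = red_variable), data = red_dataframe,
               fill = red_color, alpha = 0.5) +
  geom_density(aes(x = green_variable), data = green_dataframe,
               fill = green_color, alpha = 0.5) +
  xlab("X value") +
  ylab("Density")

p_const
```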
Alternatively, you can bind your dataframes together, reshape them into a longer format (much more appropriate to ggplot) and then add color column that you can use with scale_fill_identity function (https://ggplot2.tidyverse.org/reference/scale_identity.html):
df <- cbind(red_dataframe,green_dataframe)
library(tidyr)
library(ggplot2)
library(dplyr)
df <- df %>% pivot_longer(.,cols = c(red_variable,green_variable), names_to = "var",values_to = "val") %>%
mutate(Color = ifelse(grepl("red",var),red_color,green_color))
ggplot(df, aes(val, fill = Color))+
geom_density(alpha = 0.5)+
scale_fill_identity(guide = "legend", name = "Legend", labels = levels(as.factor(df$var)))+
xlab("X value") +
ylab("Density")
Does this answer your question?
You're trying to use ggplot as if it's base graphics... the mindset shift can take a little while to get used to. dc37's answer shows how you should do it. I'll try to explain what goes wrong in your attempt:
When you put fill = green_color inside aes(), ggplot essentially creates a new column of data filled with the green_color value, i.e., "#008000", "#008000", "#008000", .... Ditto for fill = red_color. We can see this if we modify your plot by simply deleting your scale:
ggplot() +
geom_density(aes(x = red_variable, fill = green_color, alpha = 0.5), data = red_dataframe) +
geom_density(aes(x = green_variable, fill = red_color, alpha = 0.5), data = green_dataframe) +
xlab("X value") +
ylab("Density") +
labs(fill = "Legend") +
guides(alpha = FALSE)
We can actually get what you want by adding the identity scale, which is designed for the (common in base, rare in ggplot2) case where you actually put color values in the data. With the identity scale the values are used literally, so each layer gets its own color (fill = red_color for red_variable, fill = green_color for green_variable):
ggplot() +
geom_density(aes(x = red_variable, fill = red_color, alpha = 0.5), data = red_dataframe) +
geom_density(aes(x = green_variable, fill = green_color, alpha = 0.5), data = green_dataframe) +
scale_fill_identity() +
xlab("X value") +
ylab("Density") +
labs(fill = "Legend") +
guides(alpha = FALSE)
When you added your scale_fill_manual, ggplot was like "okay, cool, you want to specify colors and labels". But you were thinking in the order that you added the layers to the plot (much like base graphics), whereas ggplot was thinking of these newly created variables "#FF0000" and "#008000", which it ordered alphabetically by default (just as if they were factor or character columns in a data frame). And since you happened to add the layers in reverse alphabetical order, it was switched.
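To make the ordering concrete, here is a small sketch using the data and colors from the question. The objects fill_levels, manual_values and p_swapped are names introduced here for illustration. It shows how the sorted color strings pair up with the unnamed values passed to scale_fill_manual, and then checks which fill the first layer (the red data) actually receives.

```r
library(ggplot2)

red_dataframe <- data.frame(red_variable = c(10, 11, 12, 13, 14))
green_dataframe <- data.frame(green_variable = c(6, 7, 8, 9, 10))
red_color <- "#FF0000"
green_color <- "#008000"

# Inside aes(), the color strings become data values, ordered like any
# character column: "#008000" sorts before "#FF0000".
fill_levels <- sort(c(red_color, green_color))

# Unnamed manual values are matched to those levels by position.
manual_values <- c(red_color, green_color)
print(setNames(manual_values, fill_levels))

# The first attempt from the question, with alpha moved outside aes().
p_swapped <- ggplot() +
  geom_density(aes(x = red_variable, fill = red_color), alpha = 0.5,
               data = red_dataframe) +
  geom_density(aes(x = green_variable, fill = green_color), alpha = 0.5,
               data = green_dataframe) +
  scale_fill_manual(values = manual_values)

# Fill actually drawn for layer 1 (the red data).
unique(layer_data(p_swapped, 1)$fill)
```

The level "#FF0000" is second in the sorted order, so it picks up the second manual value, which is the green one.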
dc37's answer shows a couple better methods. With ggplot you should (a) work with a single, long-format data frame whenever possible (b) don't put constants inside aes() (constant color, constant alpha, etc.), (c) set colors in a scale_fill_* or scale_color_* function when they're not constant.
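Points (a) to (c) could be combined roughly as follows. This is a sketch, not the only way to do it: long_df and var_colors are names made up here, and the long data frame is built by hand instead of with pivot_longer to keep the example dependency-free.

```r
library(ggplot2)

# (a) One long-format data frame: a grouping column plus a value column.
long_df <- data.frame(
  var = rep(c("red_variable", "green_variable"), each = 5),
  val = c(10, 11, 12, 13, 14, 6, 7, 8, 9, 10)
)

# (c) Colors live in the scale; names tie each color to its group.
var_colors <- c(red_variable = "#FF0000", green_variable = "#008000")

p_long <- ggplot(long_df, aes(x = val, fill = var)) +
  geom_density(alpha = 0.5) +  # (b) constant alpha stays outside aes()
  scale_fill_manual(
    name = "Legend",
    values = var_colors,
    breaks = c("red_variable", "green_variable"),
    labels = c("Density of red_variable", "Density of green_variable")
  ) +
  xlab("X value") +
  ylab("Density")

p_long
```

Because the values are named, the result does not depend on the alphabetical order of the groups.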

ggplot2 custom legend combining shape and fill

I am trying to combine fill and color in a ggplot2 legend. Because there are several colors along the x axis, it seems logical that ggplot2 does not know which color to pick in the legend.
For example:
library(ggplot2)
ggplot(mpg, aes(fl, hwy)) +
geom_point(aes(color = fl, shape = factor(year), fill = fl)) +
scale_shape_manual(values = c("circle filled", "circle open"))
My goal would be to manually edit the factor(year) legend to look like this:
I played around with the guides() function without success.
Edit:
Values for shape can be found by running vignette("ggplot2-specs").
You already had nearly the correct answer with scale_shape_manual, but somehow the "circle filled" argument is invalid. Since I'm not sure where those values can be looked up, I took the values from a table in a similar question (source):
So with values 16 and 79 you can get the desired result.
ggplot(mpg, aes(fl, hwy)) +
geom_point(aes(color = fl, shape = factor(year), fill = fl)) +
scale_shape_manual(values = c(16,79))
Ok, so here is a very roundabout way of making it look like the image above. Maybe someone else can come up with a more intuitive version:
ggplot(mpg, aes(fl, hwy)) +
geom_point(aes(color = fl, shape = factor(year), fill = factor(year))) +
scale_shape_manual(values = c(16,79), guide = FALSE) +
scale_fill_manual("Year", values=c("grey","white"))+
guides(fill = guide_legend(override.aes = list(shape = c(21,21),
color = c("black", "black"))))
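A possibly simpler variant, assuming a solid circle versus an open circle is close enough to the target legend, is to map only shape to year and use the numeric codes 16 (solid circle) and 1 (open circle). The legend title "Year" and the object name p_shape are choices made here for illustration.

```r
library(ggplot2)

p_shape <- ggplot(mpg, aes(fl, hwy)) +
  # Colour still distinguishes fl; shape alone distinguishes the year.
  geom_point(aes(color = fl, shape = factor(year))) +
  # 16 = solid circle, 1 = open circle.
  scale_shape_manual("Year", values = c(16, 1))

p_shape
```

Since the shape legend is not tied to the colour mapping, its keys are drawn in the default point colour and no override.aes is needed.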

How do I add a coloured scatterplot to a boxplot without changing the colours of the boxplot? [duplicate]

I am attempting to overlay two different plots. One is geom_boxplot, the other geom_jitter. I would like each to have its own color scale. But when I add the second color scale, I am given the error
"Scale for 'fill' is already present. Adding another scale for 'fill',
which will replace the existing scale."
I am assuming I am doing something wrong. Any advice would be appreciated.
This is a rough example of my working code:
P <- ggplot(dat) +
geom_boxplot(aes(x=ve, y=metValue, fill=metric), alpha=.35, w=0.6, notch=FALSE, na.rm = TRUE) +
scale_fill_manual(values=cpalette1) +
geom_hline(yintercept=0, colour="#DD4466", linetype = "longdash") +
theme(legend.position="none")
P + geom_jitter(dat2, aes(x=ve, y=metValue, fill=atd),
size=2, shape=4, alpha = 0.4,
position = position_jitter(width = .03, height=0.03), na.rm = TRUE) +
scale_fill_manual(values=cpalette2)
dat and dat2 have the same schema, but different values.
I found several examples addressing overlaying graphs but none that appeared to address this specific concern.
First, I made two sample data frames with the same names as in the example.
dat<-data.frame(ve=rep(c("FF","GG"),times=50),
metValue=rnorm(100),metric=rep(c("A","B","D","C"),each=25),
atd=rep(c("HH","GG"),times=50))
dat2<-data.frame(ve=rep(c("FF","GG"),times=50),
metValue=rnorm(100),metric=rep(c("A","B","D","C"),each=25),
atd=rep(c("HH","GG"),times=50))
I assume that you do not need the fill= argument in geom_jitter(), because the color for shape=4 can also be set with the colour= argument. You can then use scale_colour_manual() to set your values. Instead of cpalette1 and cpalette2, I just used color names.
P <- ggplot(dat) +
geom_boxplot(aes(x=ve, y=metValue, fill=metric), alpha=.35, w=0.6, notch=FALSE, na.rm = TRUE) +
geom_hline(yintercept=0, colour="#DD4466", linetype = "longdash") +
scale_fill_manual(values=c("red","blue","green","yellow"))+
theme(legend.position="none")
P + geom_jitter(data=dat2, aes(x=ve, y=metValue, colour=atd),
size=2, shape=4, alpha = 0.4,
position = position_jitter(width = .03, height=0.03), na.rm = TRUE) +
scale_colour_manual(values=c("red","blue"))
