customize two legends inside one graph in ggplot2 - r

I have a question about the legends of the following plot.
Using this code:
Plot<-data.frame(Age=c(0,0,0,0,0),Density=c(0,0,0,0,0),Sensitivity=c(0,0,0,0,0),inf=c(0,0,0,0,0),sup=c(0,0,0,0,0),tde=c(0,0,0,0,0))
Plot[1,]<-c(1,1,0.857,0.793,0.904,0.00209834)
Plot[2,]<-c(1,2,0.771 ,0.74,0.799,0.00348286)
Plot[3,]<-c(1,3,0.763 ,0.717,0.804,0.00577784)
Plot[4,]<-c(1,4,0.724 ,0.653,0.785,0.00504161)
Plot[5,]<-c(2,1,0.906,0.866,0.934,0.00365742)
Plot[6,]<-c(2,2,0.785 ,0.754,0.813,0.00440399)
Plot[7,]<-c(2,3,0.660,0.593,0.722,0.00542849)
Plot[8,]<-c(2,4,0.544,0.425,0.658,0.00433052)
names(Plot)<-c("Age","Mammographyc density","Sensitivity","inf","sup","tde")
Plot$Age<-c("50-59","50-59","50-59","50-59","60-69","60-69","60-69","60-69")
Plot$Density<-c("Almost entirely fat","Scattered fibroglandular density","Heterogeneously dense","Extremely dense","Almost entirely fat","Scattered fibroglandular density","Heterogeneously dense","Extremely dense")
levels(Plot$Age)<-c("50-59","60-69")
levels(Plot$Density)<-c("Almost entirely fat","Scattered fibroglandular density","Heterogeneously dense","Extremely dense")
library(ggplot2)
pd <- position_dodge(0.2) # dodge width shared by all layers
Plot$Density <- reorder(Plot$Density, 1-Plot$Sensitivity)
ggplot(Plot, aes(x = Density, y = 100*Sensitivity, colour=Age)) +
geom_errorbar(aes(ymin = 100*inf, ymax = 100*sup), width = .1, position = pd) +
geom_line(position = pd, aes(group = Age), linetype = c("dashed")) +
geom_point(position = pd, size = 4)+
scale_y_continuous(expand = c(0, 0),name = 'Sensitivity (%)',sec.axis = sec_axis(~./5, name = 'Breast cancer detection rate (per 1000 mammograms)', breaks = c(0,5,10,15,20),
labels = c('0‰',"5‰", '10‰', '15‰', '20‰')), limits = c(0,100)) +
geom_line(position = pd, aes(x = Density, y = tde * 5000, colour = Age, group = Age), linetype = c("dashed"), data = Plot) +
geom_point(shape=18,aes(x = Density, y = tde * 5000, colour = Age, group = Age), position = pd, size = 4) +
theme_light() +
scale_color_manual(name="Age (years)",values = c("50-59"= "grey55", "60-69" = "grey15")) +
theme(legend.position="bottom") + guides(colour = guide_legend(), size = guide_legend(),
shape = guide_legend())
I have made the following graph,
in which the left axis is the scale for the circles and the right axis is the scale for the diamonds. I would like to have a legend approximately like this:
However, I have not managed to produce it. I have tried suggestions from other threads, such as scale_shape and various arguments to guides, without success. I just want the legend to make clear what shape and colour each represent.
Does anyone know how to help me?
Best regards,

What you should do is make a panel (faceted) plot, which avoids the confusion of double axes:
library(dplyr)
library(tidyr)
Plot %>%
gather(measure, Result, Sensitivity, tde) %>%
ggplot(aes(x = Density, y = Result, colour=Age)) +
geom_errorbar(aes(ymin = inf, ymax = sup), width = .1, position = pd,
data = . %>% filter(measure == "Sensitivity")) +
geom_line(aes(group = Age), position = pd, linetype = "dashed") +
geom_point(position = pd, size = 4)+
# scale_y_continuous(expand = c(0, 0), limits = c(0, 1)) +
scale_y_continuous(labels = scales::percent) +
facet_wrap(~measure, ncol = 1, scales = "free_y") +
theme_light() +
scale_color_manual(name="Age (years)",values = c("50-59"= "grey55", "60-69" = "grey15")) +
theme(legend.position="bottom")
But to do what you asked: your problem is that you have only one non-positional aesthetic mapped, so you cannot get more than one legend. To force a second legend, you need to add a second mapping. It can be a dummy mapping that has no visible effect; below, we map alpha but then manually scale both levels to 100%. This solution is not advisable because, as happened in your example of a desired legend, it is easy to mix up the mappings and have your visualization tell a lie by mislabeling which points are sensitivity and which are detection rate.
ggplot(Plot, aes(x = Density, y = 100*Sensitivity, colour=Age, alpha = Age)) +
geom_errorbar(aes(ymin = 100*inf, ymax = 100*sup), width = .1, position = pd) +
geom_line(position = pd, aes(group = Age), linetype = c("dashed")) +
geom_point(position = pd, size = 4)+
scale_y_continuous(expand = c(0, 0),name = 'Sensitivity (%)',sec.axis = sec_axis(~./5, name = 'Breast cancer detection rate (per 1000 mammograms)', breaks = c(0,5,10,15,20),
labels = c('0‰',"5‰", '10‰', '15‰', '20‰')), limits = c(0,100)) +
geom_line(position = pd, aes(x = Density, y = tde * 5000, colour = Age, group = Age), linetype = c("dashed"), data = Plot) +
geom_point(shape=18,aes(x = Density, y = tde * 5000, colour = Age, group = Age), position = pd, size = 4) +
theme_light() +
scale_color_manual(name="Age (years)",values = c("50-59"= "grey55", "60-69" = "grey15")) +
scale_alpha_manual(values = c(1, 1)) +
guides(alpha = guide_legend("Sensitivity"),
color = guide_legend("Detection Rate", override.aes = list(shape = 18))) +
theme(legend.position="bottom")
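As a hedged alternative to the dummy alpha mapping, a second legend can also come from a real mapping. The sketch below assumes the data is first reshaped to long format, with a hypothetical `measure` column that says which quantity each row holds. The data frame `long` and its column names are illustrative and not part of the original code. The values are the ones from the question, with the detection rate already multiplied by 5000 to sit on the left axis.

```r
library(ggplot2)

# Hypothetical long-format copy of the question's data:
# one row per age group, density class and measure.
long <- data.frame(
  Age     = rep(c("50-59", "60-69"), each = 4, times = 2),
  Density = factor(rep(1:4, times = 4)),
  measure = rep(c("Sensitivity", "Detection rate"), each = 8),
  value   = c(85.7, 77.1, 76.3, 72.4, 90.6, 78.5, 66.0, 54.4,
              5000 * c(0.00209834, 0.00348286, 0.00577784, 0.00504161,
                       0.00365742, 0.00440399, 0.00542849, 0.00433052))
)

pd <- position_dodge(0.2)

# colour encodes Age and shape encodes measure, so ggplot2 builds two legends
p <- ggplot(long, aes(x = Density, y = value, colour = Age, shape = measure,
                      group = interaction(Age, measure))) +
  geom_line(position = pd, linetype = "dashed") +
  geom_point(position = pd, size = 4) +
  scale_colour_manual(name = "Age (years)",
                      values = c("50-59" = "grey55", "60-69" = "grey15")) +
  scale_shape_manual(name = "Measure",
                     values = c("Sensitivity" = 16, "Detection rate" = 18)) +
  scale_y_continuous(name = "Sensitivity (%)", limits = c(0, 100),
                     sec.axis = sec_axis(~ . / 5,
                                         name = "Detection rate (per 1000 mammograms)")) +
  theme_light() +
  theme(legend.position = "bottom")
```

Because the shape legend is generated from the data itself, it cannot drift out of sync with the points the way a hand-labelled legend can.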

Related

geom_ribbon: Fill area between lines - spurious lines connecting groups

I'm trying to build a plot with two lines and fill the area between with geom_ribbon. I've managed to select a fill color (red/blue) depending on the sign of the difference between two lines. First I create two new columns in the dataset for ymax, ymin. It seems to work but some spurious lines appear joining red areas.
Is geom_ribbon appropriate to fill the areas? Is there any problem in the plot code?
This is the code used to create the plot
datos.2022 <- datos.2022 %>% mutate(y1 = SSTm-273.15, y2 = SST.mean.day-273.15)
datos.2022 %>% ggplot(aes(x=fecha)) +
geom_line(aes(y=SSTm-273.15), color = "red") +
geom_line(aes(y=SST.mean.day - 273.15), color = "black") +
geom_ribbon(aes(ymax=y1, ymin = y2, fill = as.factor(sign)), alpha = 0.6) +
scale_fill_manual(guide = "none", values=c("blue","red")) +
scale_y_continuous(limits = c(10,30)) +
scale_x_date(expand = c(0,0), breaks = "1 month", date_labels = "%b" ) +
theme_hc() +
labs(x="",y ="SST",title = "Temperature (2022)") +
theme(text = element_text(size=20,family = "Arial"))
And this is the output
Example data for the plot available at https://www.dropbox.com/s/mkk8w7py2ynuy1t/temperature.dat?dl=0
What if you plotted two separate ribbons? One covers the positive differences and collapses to zero height (ymin equal to ymax) wherever the difference is negative. The other covers the negative differences and works the same way in reverse.
library(dplyr)
library(ggplot2)
datos.2022 <- datos.2022 %>%
mutate(y1 = SSTm-273.15,
y2 = SST.mean.day-273.15) %>%
rowwise() %>%
mutate(high_pos = max(SST.mean.day - 273.15, y1),
low_neg = min(SSTm-273.15, y2))
datos.2022 %>% ggplot(aes(x=fecha)) +
geom_line(aes(y=SSTm-273.15), color = "red") +
geom_line(aes(y=SST.mean.day - 273.15), color = "black") +
geom_ribbon(aes(ymax=high_pos, ymin = SST.mean.day - 273.15, fill = "b"), alpha = 0.6, col="transparent", show.legend = FALSE) +
geom_ribbon(aes(ymax = SST.mean.day - 273.15, ymin = low_neg, fill = "a"), alpha = 0.6, col="transparent", show.legend = FALSE) +
scale_fill_manual(guide = "none", values=c("blue","red")) +
scale_y_continuous(limits = c(10,30)) +
scale_x_date(expand = c(0,0), breaks = "1 month", date_labels = "%b" ) +
#theme_hc() +
labs(x="",y ="SST",title = "Temperature (2022)") +
theme(text = element_text(size=20,family = "Arial"))
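The spurious lines in the question most likely appear because a single geom_ribbon with fill mapped to a factor draws each fill group as one continuous ribbon, bridging the gaps between separate stretches of the same sign. The two-ribbon idea above can also be written without rowwise(), using the vectorised pmax() and pmin(). This is a minimal sketch on made-up data: the data frame `d` and its values are assumptions standing in for datos.2022, which is not reproduced here.

```r
library(ggplot2)

# Hypothetical stand-in for datos.2022
d <- data.frame(fecha = seq(as.Date("2022-01-01"), by = "day", length.out = 60))
d$y1 <- 20 + 3 * sin(seq_along(d$fecha) / 5)  # plays the role of SSTm - 273.15
d$y2 <- 20                                    # plays the role of SST.mean.day - 273.15

# Vectorised equivalents of the row-wise max() and min()
d$high_pos <- pmax(d$y1, d$y2)
d$low_neg  <- pmin(d$y1, d$y2)

p <- ggplot(d, aes(x = fecha)) +
  # red ribbon has zero height wherever y1 is below y2
  geom_ribbon(aes(ymin = y2, ymax = high_pos), fill = "red", alpha = 0.6) +
  # blue ribbon has zero height wherever y1 is above y2
  geom_ribbon(aes(ymin = low_neg, ymax = y2), fill = "blue", alpha = 0.6) +
  geom_line(aes(y = y1), colour = "red") +
  geom_line(aes(y = y2), colour = "black")
```

Setting fill outside aes() also removes the need for scale_fill_manual and show.legend = FALSE.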

How do I remove certain values from my x-axis?

I'm coding a graph for a project and was wondering whether I could get rid of the 2, 3 and 4 that appear on the x-axis, so that the bars sit only at 0, 1 and 5.
This is my code:
strain_colours <- c("dark blue", "light blue")
ggplot(data = bar_sum, aes(x = conc, y = mean, fill = strain)) +
scale_fill_manual(values = strain_colours) +
geom_bar(stat = 'identity',
position = "dodge") +
geom_errorbar(data = bar_sum, aes(x = conc, ymin = mean - se, ymax = mean + se),
width = 0.25,
position = position_dodge(width = 0.9)) +
scale_x_continuous(expand = c(0,0),
name = "Caffeine concentration (mM)",
) +
scale_y_continuous(expand = c(0,0),
name = "Mean distance travelled (mm)") +
theme_classic()
I've also tried setting the x-scale to discrete, and setting the x values to a string of numbers, but neither worked.
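One approach that may work here, assuming conc is a numeric column holding only 0, 1 and 5, is to convert it to a factor inside aes(). A discrete x-axis only has positions for the levels that exist, so 2, 3 and 4 are not drawn. The data frame below is a hypothetical stand-in for bar_sum, since the real data is not shown, and the strain names are invented.

```r
library(ggplot2)

# Hypothetical stand-in for bar_sum
bar_sum <- data.frame(
  conc   = rep(c(0, 1, 5), each = 2),
  strain = rep(c("A", "B"), times = 3),
  mean   = c(10, 12, 8, 9, 5, 7),
  se     = c(1, 1.2, 0.8, 0.9, 0.5, 0.7)
)

dodge <- position_dodge(width = 0.9)

# factor(conc) makes the x-axis discrete, with one slot per observed value
p <- ggplot(bar_sum, aes(x = factor(conc), y = mean, fill = strain)) +
  geom_col(position = dodge) +
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se),
                width = 0.25, position = dodge) +
  scale_fill_manual(values = c("darkblue", "lightblue")) +
  scale_x_discrete(name = "Caffeine concentration (mM)") +
  scale_y_continuous(expand = c(0, 0),
                     name = "Mean distance travelled (mm)") +
  theme_classic()
```

Note that scale_x_continuous has to be replaced by scale_x_discrete once x is a factor; keeping the continuous scale on discrete data raises an error.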

Raincloud plot - histogram?

I would like to create a raincloud plot. I have successfully done it. But I would like to know if instead of the density curve, I can put a histogram (it's better for my dataset).
This is my code, in case it is useful:
ATSC <- ggplot(data = data, aes(y = atsc, x = numlecteur, fill = numlecteur)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0), alpha = .5) +
geom_point(aes(y = atsc, color = numlecteur), position = position_jitter(width = .15), size = .5, alpha = 0.8) +
geom_point(data = sumld, aes(x = numlecteur, y = mean), position = position_nudge(x = 0.25), size = 2.5) +
geom_errorbar(data = sumld, aes(ymin = lower, ymax = upper, y = mean), position = position_nudge(x = 0.25), width = 0) +
guides(fill = FALSE) +
guides(color = FALSE) +
scale_color_brewer(palette = "Spectral") +
scale_y_continuous(breaks=c(0,2,4,6,8,10), labels=c("0","2","4","6","8","10"))+
scale_fill_brewer(palette = "Spectral") +
coord_flip() +
theme_bw() +
expand_limits(y=c(0, 10))+
xlab("Lecteur") + ylab("Age total sans check")+
raincloud_theme
I thought I could add geom_histogram(), but it doesn't work.
Thank you in advance for your help !
(sources : https://peerj.com/preprints/27137v1.pdf
https://neuroconscience.wordpress.com/2018/03/15/introducing-raincloud-plots/)
This is actually not that easy. There are a few challenges:
geom_histogram is "horizontal by nature", whereas the custom geom_flat_violin is vertical, as are boxplots; hence the final call to coord_flip in that tutorial. To combine both, I think it is best to switch x and y, forget about coord_flip, and use ggstance::geom_boxploth instead.
Creating separate histograms for each category is another challenge. My workaround is to create facets and "merge them together".
The histograms are scaled far larger than the width of the points/boxplots. My workaround is to rescale them with the after_stat function.
Nudging the histograms to the right position above the boxplot and points is the last challenge. I convert the discrete scale to a continuous one by mapping a constant numeric value to the global y aesthetic, and then use the facet labels as the discrete labels.
library(tidyverse)
my_data<-read.csv("https://data.bris.ac.uk/datasets/112g2vkxomjoo1l26vjmvnlexj/2016.08.14_AnxietyPaper_Data%20Sheet.csv")
my_datal <-
my_data %>%
pivot_longer(cols = c("AngerUH", "DisgustUH", "FearUH", "HappyUH"), names_to = "EmotionCondition", values_to = "Sensitivity")
# use y = -... to position boxplot and jitterplot below the histogram
ggplot(data = my_datal, aes(x = Sensitivity, y = -.5, fill = EmotionCondition)) +
# after_stat for scaling
geom_histogram(aes(y = after_stat(count/100)), binwidth = .05, alpha = .8) +
# from ggstance
ggstance::geom_boxploth( width = .1, outlier.shape = NA, alpha = 0.5) +
geom_point(aes(color = EmotionCondition), position = position_jitter(width = .15), size = .5, alpha = 0.8) +
# merged those calls to one
guides(fill = "none", color = "none") +
# scale_y_continuous(breaks = 1, labels = unique(my_datal$EmotionCondition))
scale_color_brewer(palette = "Spectral") +
scale_fill_brewer(palette = "Spectral") +
# facetting, because each histogram needs its own y
# strip position = left to fake discrete labels in continuous scale
facet_wrap(~EmotionCondition, nrow = 4, scales = "free_y" , strip.position = "left") +
# remove all continuous labels from the y axis
theme(axis.title.y = element_blank(), axis.text.y = element_blank(),
axis.ticks.y = element_blank())
Created on 2021-04-15 by the reprex package (v1.0.0)

Reordering Groups in Raincloud Plot [duplicate]

Currently, I have a plot that looks like this:
library(ggplot2)
library(dplyr) # provides %>%
df <- ToothGrowth
df %>%
ggplot(aes(x = supp, y = len, fill = supp)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0),
alpha = .8) +
geom_point(aes(shape = supp),
position = position_jitter(width = .05),
size = 2, alpha = 0.8) +
geom_boxplot(width = .1, outlier.shape = NA, alpha = 0.5) +
coord_flip() +
labs(title = "ToothGrowth Length by Supplement",
y = "Length") +
theme_classic() +
raincloud_theme
I'd like to change the order so that OJ appears above VC. I've tried adding scale_x_discrete before coord_flip(), but that seems to mess up my plot as this is a raincloud plot -- I'd have to move not only the violin plot, but also the points and the box plot. I've also tried adding rev(), which also messed up my plot. What is the best way to reorder this?
EDIT
Thank you for the comment! How do I change the orders in an interaction plot?
df %>%
mutate(Supplement = ifelse(supp == "VC",
"VC",
"OJ"),
Dose = ifelse(dose == "0.5",
"0.5",
"1.0"),
Interaction = factor(str_replace(interaction(Supplement, Dose),
'\\.', '\n'),
ordered=TRUE)) %>%
ggplot(aes(x = Interaction, y = len, fill = Interaction)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0),
alpha = .8) +
geom_point(aes(shape = Dose),
position = position_jitter(width = .05),
size = 2, alpha = 0.8) +
geom_boxplot(width = .1, outlier.shape = NA, alpha = 0.5) +
coord_flip() +
labs(title = "Effect of Supplement and Dose on Length",
y = "Growth Length") +
scale_fill_discrete(guide = guide_legend(override.aes = list(shape = c(".", ".")))) +
scale_shape_discrete(guide = guide_legend(override.aes = list(size = 3))) +
theme_classic() +
raincloud_theme
ggplot2 treats supp as a factor, and the order in the plot corresponds to the order of the factor's levels.
You will need to change the order of the levels of the supp factor.
df <- ToothGrowth
df$supp
df$supp <- relevel(ToothGrowth$supp,ref = "VC")
df$supp
df %>%
ggplot(aes(x = supp, y = len, fill = supp)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0),
alpha = .8) +
geom_point(aes(shape = supp),
position = position_jitter(width = .05),
size = 2, alpha = 0.8) +
geom_boxplot(width = .1, outlier.shape = NA, alpha = 0.5) +
coord_flip() +
labs(title = "ToothGrowth Length by Supplement",
y = "Length") +
theme_classic() +
raincloud_theme
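The same idea should carry over to the interaction plot in the question's edit: the plotting order follows the level order of whatever factor is mapped to x. The sketch below is one possible way to control it, using base R's factor() and interaction(). It uses the raw dose column rather than the recoded Dose from the edit, which is a simplification.

```r
df <- ToothGrowth

# Explicit level order; with coord_flip() the first level is drawn at the bottom
df$supp <- factor(df$supp, levels = c("VC", "OJ"))

# lex.order = TRUE orders the combined levels by supp first, then by dose
df$Interaction <- interaction(df$supp, df$dose, sep = "\n", lex.order = TRUE)

levels(df$Interaction)
```

Mapping x = Interaction in the plotting code then places all VC groups below all OJ groups after coord_flip().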

How to scale a Geom_bar to be in line with an overlaid line graph in R ggplot

I am trying to overlay a bar chart with a line graph on a single plot with ggplot in R. My line graph works fine but the data are much larger than the data for the bar chart component.
How could I use an additional scale for this bar chart or do something that will get this to look nice all in one graph.
Here is my plot code thus far:
chart <- data.frame("QuantileName" = 1:5, "AvgLoss" = c(100, 500, 1000, 2500, 3000), "AvgFactor" = c(1.0, 1.1, 1.3, 1.4, 1.5))
Plot <- ggplot(chart, aes(x = 1:5)) +
scale_x_continuous(name = "Quintile", limits = c(0, 5 + .5), breaks = seq(1, 5)) +
geom_line(aes(y = AvgLoss, colour = "AvgLoss")) +
geom_bar(aes(y = AvgFactor, colour = "AvgFactor" ), stat = "identity") +
geom_text(aes(y = AvgLoss, label = round(AvgLoss)), position = position_nudge(x = .3)) +
geom_point(aes(y = AvgLoss)) +
ylab("AvgLoss") +
scale_colour_manual("",breaks = c("AvgLoss","AvgFactor"), values = c("AvgLoss" = "red", "AvgFactor" = "grey")) +
ggtitle("Quintile Plot") +
theme(plot.title = element_text(hjust=0.5))
Plot
Thank you for any help!
Essentially, multiply your AvgFactor variable by a constant so that it lands on the same scale as AvgLoss:
+ geom_bar(aes(y = AvgFactor*1000, colour = "AvgFactor" ), stat = "identity")
and add a secondary axis that divides by the same constant:
+ scale_y_continuous(sec.axis = sec_axis(~ ./1000, name = "AvgFactor"))
so your plot code would look like this:
Plot <- ggplot(chart, aes(x = 1:5)) +
scale_x_continuous(name = "Quintile", limits = c(0, 5 + .5),
breaks = seq(1, 5)) +
geom_bar(aes(y = AvgFactor*1000, colour = "AvgFactor" ),
stat = "identity") +
geom_line(aes(y = AvgLoss, colour = "AvgLoss")) +
geom_text(aes(y = AvgLoss,
label = round(AvgLoss)),
position = position_nudge(x = .3)) +
geom_point(aes(y = AvgLoss)) +
ylab("AvgLoss") +
scale_colour_manual("",breaks = c("AvgLoss","AvgFactor"),
values = c("AvgLoss" = "red", "AvgFactor" = "grey")) +
ggtitle("Quintile Plot") +
theme(plot.title = element_text(hjust=0.5)) +
scale_y_continuous(sec.axis = sec_axis(~ ./1000, name = "AvgFactor"))
However, I think it is probably more elegant to avoid secondary axes whenever possible.
It may be useful to know that geom_col(...) is shorthand for geom_bar(..., stat = 'identity')
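The constant 1000 above is chosen by eye. As a hedged variation, the scaling constant can be derived from the data so that the tallest bar reaches the highest point of the line. The name `k` is an assumption introduced for this sketch, and x is mapped to the QuantileName column rather than to 1:5.

```r
library(ggplot2)

chart <- data.frame(
  QuantileName = 1:5,
  AvgLoss      = c(100, 500, 1000, 2500, 3000),
  AvgFactor    = c(1.0, 1.1, 1.3, 1.4, 1.5)
)

# Ratio that stretches AvgFactor to the same maximum height as AvgLoss
k <- max(chart$AvgLoss) / max(chart$AvgFactor)

p <- ggplot(chart, aes(x = QuantileName)) +
  geom_col(aes(y = AvgFactor * k), fill = "grey") +   # bars on the rescaled axis
  geom_line(aes(y = AvgLoss), colour = "red") +
  geom_point(aes(y = AvgLoss)) +
  scale_x_continuous(name = "Quintile", breaks = 1:5) +
  # the secondary axis undoes the multiplication, so it reads in AvgFactor units
  scale_y_continuous(name = "AvgLoss",
                     sec.axis = sec_axis(~ . / k, name = "AvgFactor"))
```

The transformation in sec_axis must be the exact inverse of the multiplication applied to the bars, otherwise the right-hand axis will mislabel them.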
