R geom_bar with overlay and no distance between bars

I'm trying to have this plot:
library(ggplot2)
testdata <- data.frame(x=c(1:5),a=c(1:5),b=c(10:6))
ggplot(testdata, aes(x = x)) +
geom_bar(aes(y=b), stat = "identity", fill="darkgrey")+
geom_bar(aes(y=a), linetype="solid", colour="black", stat = "identity", fill=NA)
with a legend. Since I can't get a legend going here (this would be a nice workaround if you know how), I tried to approach the 'correct' way to plot this in ggplot, namely with long data such as:
testdata <- data.frame(x = c(1:5,1:5), y = c(1:5,10:6), group = c(rep("a",5), rep("b",5)))
ggplot(testdata, aes(x = x, y = y, group = group, fill = group)) +
geom_bar(stat = "identity", linetype = "solid", colour = "black", position = position_dodge(width = 0))+
scale_fill_manual(values = c(NA, "darkgrey"))
While I do have a legend here, the bars are much further apart. The usual parameter to change this is the width inside position_dodge, but I need that equal to 0 for the 100% overlay. So my question in the ideal world is: can I decrease the distance between the bars in the second plot? If this is not feasible, can I add a legend to the first plot? Any help would be appreciated!

Try changing the width outside position_dodge.
ggplot(testdata, aes(x = x, y = y, group = group, fill = group)) +
geom_bar(width=1.5,stat = "identity", linetype = "solid", colour = "black", position = position_dodge(width = 0))+
scale_fill_manual(values = c(NA, "darkgrey"))
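On the second part of the question (a legend for the first, wide-data plot), one possible workaround is to map fill to a constant label inside aes() in each layer and then assign the actual colours with scale_fill_manual(). This is only a sketch: the labels "a" and "b", the legend title, the use of "transparent" in place of NA, and width = 1 (so that neighbouring bars touch) are my own choices, not something from the question.

```r
library(ggplot2)

testdata <- data.frame(x = 1:5, a = 1:5, b = 10:6)

# Mapping fill to a constant string inside aes() creates a legend entry
# for each layer; scale_fill_manual() then sets the real colours.
p <- ggplot(testdata, aes(x = x)) +
  geom_col(aes(y = b, fill = "b"), width = 1) +
  geom_col(aes(y = a, fill = "a"), colour = "black", width = 1) +
  scale_fill_manual(name = "group",
                    values = c(a = "transparent", b = "darkgrey"))

p
```

Because the two layers are drawn at the same x positions without any dodging, the overlay stays at 100% and the bar spacing is controlled by width alone.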

Related

Manually plotting significance relations between sub-groups on ggplot2 barplot

I've been trying to plot manually labelled significance bars for a subset of groups on a ggplot2 barplot using ggsignif or ggpubr without much luck. The data is something like the following MWE:
set.seed(3)
## create data
df <- data.frame(activity = rep(c("Flying", "Jumping"), 3),
mean = rep(rnorm(6, 50, 25)),
group = c(rep("Ecuador", 2),
rep("Peru", 2),
rep("Brazil", 2)))
## plot it
ggplot(df, aes(x = activity, y = mean, fill = group)) +
geom_bar(position = position_dodge(0.9), stat = "identity",
width = 0.9, colour = "black", size = 0.1) +
xlab("Activity") + ylab("Mean")
Where I'd like to manually specify significance labels, say between Brazil/Ecuador on "Flying" and Ecuador/Peru on "Jumping". Does anyone know how to properly deal with this kind of data, for example with ggsignif? And is there a way to refer to each bar by name, rather than try to work out its x-axis position?
If you know on which barchart you want to add your significance labels, you can do:
library(ggsignif)
library(ggplot2)
ggplot(df, aes(x = activity, y = mean, fill = group)) +
geom_bar(position = position_dodge(0.9), stat = "identity",
width = 0.9, colour = "black", size = 0.1) +
xlab("Activity") + ylab("Mean")+
geom_signif(y_position = c(60,50), xmin = c(0.7,2), xmax = c(1,2.3),
annotation=c("**", "***"), tip_length=0)
Does it answer your question?
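If you would rather look the x positions up by name than work them out by hand, a small helper can do the arithmetic. This is a sketch, not part of ggsignif: bar_x is a hypothetical function, and it assumes the bars are dodged with position_dodge(0.9), that every activity has the same number of groups, and that both variables are ordered alphabetically (the default for character columns).

```r
# Hypothetical helper: centre of a dodged bar on the x axis.
# Discrete x values sit at 1, 2, ...; within each one, the n groups are
# spread evenly over the dodge width.
bar_x <- function(activity, group, activities, groups, dodge = 0.9) {
  n <- length(groups)
  match(activity, activities) +
    (match(group, groups) - (n + 1) / 2) * dodge / n
}

activities <- c("Flying", "Jumping")
groups <- c("Brazil", "Ecuador", "Peru")

# Brazil vs Ecuador on "Flying", Ecuador vs Peru on "Jumping"
xmin <- c(bar_x("Flying", "Brazil", activities, groups),
          bar_x("Jumping", "Ecuador", activities, groups))
xmax <- c(bar_x("Flying", "Ecuador", activities, groups),
          bar_x("Jumping", "Peru", activities, groups))
```

Here xmin and xmax come out as c(0.7, 2) and c(1, 2.3), the same values passed to geom_signif above.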

Add standard error as shaded area instead of errorbars in geom_boxplot

I have my boxplot and I added the mean with stat_summary as a line over the box plot. I want to add the standard error, but I don't want errorbar.
Basically, I want to add the standard error as shaded area, as you can do using geom_ribbon.
I used the PlantGrowth dataset to show you briefly what I've tried.
library(ggplot2)
ggplot(PlantGrowth, aes(group, weight))+
stat_boxplot( geom='errorbar', linetype=1, width=0.5)+
geom_boxplot(fill="yellow4",colour="black",outlier.shape=NA) +
stat_summary(fun.y=mean, colour="black", geom="line", shape=18, size=1,aes(group=1))+
stat_summary(fun.data = mean_se, geom = "errorbar")
I did it using geom_errorbar in stat_summary, and tried to substitute geom_errorbar with geom_ribbon, as I saw in some other examples around the web, but it doesn't work.
Something like this one, but with the error as shaded area instead of error bars (which make it a bit confusing to see)
Layering so many geoms becomes hard to read, but here's a simplified version with a few options. Aside from just paring things down a bit to see what I was editing, I added a tile as a summary geom; tile is similar to rect, except it assumes it will be centered at whatever its x value is, so you don't need to worry about the x-axis placement that geom_rect requires. You might experiment with fill colors and opacity—I made the boxplots white just to illustrate better.
library(ggplot2)
gg <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
stat_boxplot(geom = "errorbar", width = 0.5) +
geom_boxplot(fill = "white", outlier.shape = NA, width = 0.7) +
stat_summary(aes(group = 1), fun.y = mean, geom = "line")
gg +
stat_summary(fun.data = mean_se, geom = "tile", width = 0.7,
fill = "pink", alpha = 0.6)
Based on your comments that you want a ribbon, you could instead use a ribbon with group = 1 the same as for the line.
gg +
stat_summary(aes(group = 1), fun.data = mean_se, geom = "ribbon",
fill = "pink", alpha = 0.6)
The ribbon doesn't make a lot of sense across a discrete variable, but here's an example with some dummy data for a continuous group, where this setup becomes more reasonable (though IMO still hard to read).
pg2 <- PlantGrowth
set.seed(123)
pg2$cont_group <- floor(runif(nrow(pg2), 1, 6))
ggplot(pg2, aes(x = cont_group, y = weight, group = cont_group)) +
stat_boxplot(geom = "errorbar", width = 0.5) +
geom_boxplot(fill = "white", outlier.shape = NA, width = 0.7) +
stat_summary(aes(group = 1), fun.y = mean, geom = "line") +
stat_summary(aes(group = 1), fun.data = mean_se, geom = "ribbon",
fill = "pink", alpha = 0.6)
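To see what the tile or ribbon is actually drawing, you can compute the same summary by hand. mean_se() returns the mean plus and minus one standard error; the sketch below reproduces that per group with base R (se_by_group is just a name I made up for the result).

```r
library(ggplot2)

# Mean and mean +/- one standard error of weight for each group,
# in the same y / ymin / ymax layout that mean_se() returns.
se_by_group <- do.call(rbind, lapply(
  split(PlantGrowth$weight, PlantGrowth$group),
  function(w) {
    se <- sd(w) / sqrt(length(w))
    data.frame(y = mean(w), ymin = mean(w) - se, ymax = mean(w) + se)
  }
))

se_by_group

# The same numbers for one group, straight from ggplot2
mean_se(PlantGrowth$weight[PlantGrowth$group == "ctrl"])
```

The shaded band spans ymin to ymax, so with ten observations per group it will be quite narrow compared to the boxes.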

ggplot2 - using two different color scales for same fill in overlayed plots

A very similar question to the one asked here. However, in that situation the fill parameter is different for the two plots. In my situation the fill parameter is the same for both plots, but I want different color schemes.
I would like to manually change the color in the boxplots and the scatter plots (for example making the boxes white and the points colored).
Example:
require(dplyr)
require(ggplot2)
n<-4*3*10
myvalues<- rexp((n))
days <- ntile(rexp(n),4)
doses <- ntile(rexp(n), 3)
test <- data.frame(values =myvalues,
day = factor(days, levels = unique(days)),
dose = factor(doses, levels = unique(doses)))
p<- ggplot(data = test, aes(x = day, y = values)) +
geom_boxplot( aes(fill = dose))+
geom_point( aes(fill = dose), alpha = 0.4,
position = position_jitterdodge())
produces a plot like this:
Using 'scale_fill_manual()' overwrites the aesthetic on both the boxplot and the scatterplot.
I have found a hack by adding 'colour' to geom_point and then when I use scale_fill_manual() the scatter point colors are not changed:
p<- ggplot(data = test, aes(x = day, y = values)) +
geom_boxplot(aes(fill = dose), outlier.shape = NA)+
geom_point(aes(fill = dose, colour = factor(test$dose)),
position = position_jitterdodge(jitter.width = 0.1))+
scale_fill_manual(values = c('white', 'white', 'white'))
Are there more efficient ways of getting the same result?
You can use group to set the different boxplots. No need to set the fill and then overwrite it:
ggplot(data = test, aes(x = day, y = values)) +
geom_boxplot(aes(group = interaction(day, dose)), outlier.shape = NA)+
geom_point(aes(fill = dose, colour = dose),
position = position_jitterdodge(jitter.width = 0.1))
And you should never use data$column inside aes - just use the bare column. Using data$column will work in simple cases, but will break whenever there are stat layers or facets.
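If you also want to choose the point colours yourself, you can add a manual colour scale on top of the grouped boxplots; since the boxes have no fill mapping, they are not affected. A minimal sketch with made-up data of the same shape as in the question (the colour names are arbitrary picks):

```r
library(ggplot2)

set.seed(1)
test <- data.frame(values = rexp(120),
                   day    = factor(rep(1:4, each = 30)),
                   dose   = factor(rep(1:3, times = 40)))

p <- ggplot(test, aes(x = day, y = values)) +
  # boxes are only grouped, so they keep the default white fill
  geom_boxplot(aes(group = interaction(day, dose)), outlier.shape = NA) +
  # points carry the colour mapping and are dodged to match the boxes
  geom_point(aes(colour = dose, group = interaction(day, dose)),
             position = position_jitterdodge(jitter.width = 0.1)) +
  scale_colour_manual(values = c("1" = "steelblue",
                                 "2" = "darkorange",
                                 "3" = "forestgreen"))

p
```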

How to differentiate groups in a bar plot but still keep the gradient pattern?

How do I change the fill pattern or fill color by group? My current code only changes the outline; I want to change the fill pattern or fill color (not the outline) by Sex while keeping my fill gradient, which is by Eye.
colors <- c("Green", "Blue", "Hazel", "Brown")
data <- data.frame(HairEyeColor)
data$Eye <- as.numeric(factor(data$Eye, labels = 1:4))
data <- data[c(5,6,12,15,17,22,27,28), ]
ggplot(data, aes(x = Hair, y = Freq, fill = Eye, group = Sex)) +
geom_bar(stat = "identity", position = position_dodge(), aes(colour = Sex)) +
scale_fill_continuous(low = "blue", high = "green")
Bar plots in ggplot2 do not support fill patterns at the moment (and as far as I know, other packages do not offer this either).
However, there are a few alternatives you can consider that make the differences in Sex and Eye easy to spot:
1. Using different (lighter) fill colours, thicker bar boundaries and theme_bw():
ggplot(data, aes(x = Hair, y = Freq, fill = Eye, group = Sex)) +
geom_bar(stat = "identity", position = position_dodge(), aes(colour = Sex), size=2) +
scale_fill_continuous(low = "white", high = "grey") + theme_bw()
2. Merging two columns, Sex and Eye, into a new column that is used as the fill aesthetic:
data$Sex_Eye <- paste(data$Sex, data$Eye, sep="_")
ggplot(data, aes(x = Hair, y = Freq, fill = Sex_Eye)) +
geom_bar(stat = "identity", position = position_dodge()) + theme_bw()
3. Using geom_jitter() instead of geom_bar() and mapping shape to Sex:
ggplot(data, aes(x = Hair, y = Freq, colour = Eye, shape = Sex)) +
geom_jitter(size=5) + scale_colour_continuous(low = "blue", high = "green") + theme_bw()

r (ggplot2 line graph): changing linetype for errorbars changes them in legend

I have a line graph like this one:
df <- data.frame(x = c(1,1,2,2,1,1,2,2),
y = c(1.5,1.9,2.1,1.6,1.4,1.8,2.0,1.7),
error = c(0.2),
group = c("g1","g2","g1","g2","g3","g4","g3","g4"))
ggplot(df, aes(x = x, y = y, color = group, linetype = group)) +
geom_point() + geom_line() +
geom_errorbar(aes(ymin = y - error, ymax = y + error),
linetype = 1, width = 0.5,
position = position_dodge(width = 0.2)) +
scale_color_manual(values = c("g1"="Black", "g2"="Grey", "g3"="Black", "g4"="Grey")) +
scale_linetype_manual(values=c("g1"=1,"g2"=1,"g3"=2,"g4"=2))
I need to make it black and white, so with several groups, I used both color and linetype. When I change line type, I want to have error bars solid although the lines are dotted, so I overrode the linetype for error bars. For some reason, this also changes the legend, so it is no longer clear which line is which.
I know this somehow depends on the color = group in aes, because when I just had the linetype, the legend was fine. For some reason I just can't find a way to do linetype, color, and solid errorbars at the same time. Anybody know why this is?
Try this:
ggplot(df, aes(x = x, y = y, colour = group, group = group)) +
geom_line(aes(y=y,linetype = group)) +
geom_point()+
geom_errorbar(aes(ymin = y - error, ymax = y + error),
colour = rep(c("black","grey"),4),
width = 0.1,
position = "dodge") +
scale_color_manual(values = c("g1"="Black", "g2"="Grey", "g3"="Black", "g4"="Grey")) +
scale_linetype_manual(values=c("g1"=1,"g2"=1,"g3"=2,"g4"=2))
You don't need linetype = group inside ggplot(), as the error bars never use it; it only adds complexity. linetype is used only by the lines and the legend. The error bars need to know the colour and the grouping variable, which is why those are set inside ggplot().
Setting the colours by creating a column:
df <- data.frame(x = c(1,1,2,2,1,1,2,2),
y = c(1.5,1.9,2.1,1.6,1.4,1.8,2.0,1.7),
error = c(0.2),
group = c("g1","g2","g1","g2","g3","g4","g3","g4"))
df$group_cols = "black"
df$group_cols[df$group %in% c("g2","g4")] = "grey"
ggplot(df, aes(x = x, y = y, colour = group, group = group)) +
geom_line(aes(y=y,linetype = group)) +
geom_point()+
geom_errorbar(aes(ymin = y - error, ymax = y + error),
colour = df$group_cols,
width = 0.1,
position = "dodge") +
scale_color_manual(values = c("g1"="Black", "g2"="Grey", "g3"="Black", "g4"="Grey")) +
scale_linetype_manual(values=c("g1"=1,"g2"=1,"g3"=2,"g4"=2))
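Another option you could try, if the only problem is the legend, is to keep the original code and leave the error bars out of the legend with show.legend = FALSE. The legend keys are then drawn from the point and line layers only, so the dashed lines should remain visible. This is a sketch based on the data from the question, not something I have checked against every ggplot2 version:

```r
library(ggplot2)

df <- data.frame(x = c(1, 1, 2, 2, 1, 1, 2, 2),
                 y = c(1.5, 1.9, 2.1, 1.6, 1.4, 1.8, 2.0, 1.7),
                 error = 0.2,
                 group = c("g1", "g2", "g1", "g2", "g3", "g4", "g3", "g4"))

p <- ggplot(df, aes(x = x, y = y, color = group, linetype = group)) +
  geom_point() + geom_line() +
  # solid error bars that do not contribute a glyph to the legend
  geom_errorbar(aes(ymin = y - error, ymax = y + error),
                linetype = 1, width = 0.5,
                position = position_dodge(width = 0.2),
                show.legend = FALSE) +
  scale_color_manual(values = c("g1" = "Black", "g2" = "Grey",
                                "g3" = "Black", "g4" = "Grey")) +
  scale_linetype_manual(values = c("g1" = 1, "g2" = 1, "g3" = 2, "g4" = 2))

p
```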
