How to draw mean as a dotted line in boxplot using ggplot? - r

I was wondering whether it is possible to draw a dotted line at the mean value of my data in a box plot.
I know that it is possible to draw shapes with stat_summary(), for example a + at the mean with stat_summary(fun.y=mean, shape="+", size=1, color = "black"). The nearest thing I have found is geom="crossbar", but that line is not dotted.
The idea is to get this graphed

You could achieve your desired result by setting linetype="dotted":
library(ggplot2)
ggplot(mtcars, aes(factor(cyl), mpg)) +
geom_boxplot() +
stat_summary(geom = "crossbar", fun = "mean", linetype = "dotted", width = .75)
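A variant of the same idea, in case the crossbar is not wanted: geom = "errorbar" collapses to a single horizontal line when mean is passed as fun, fun.min, and fun.max. This is only a sketch, assuming ggplot2 3.3.0 or later (where the fun arguments exist). The names expected_means and drawn_means are made up for illustration; layer_data() reads back what the summary layer computed so it can be compared with means calculated by hand.

```r
library(ggplot2)

# Group means computed by hand, for cross-checking
expected_means <- tapply(mtcars$mpg, mtcars$cyl, mean)

p <- ggplot(mtcars, aes(factor(cyl), mpg)) +
  geom_boxplot() +
  # ymin == y == ymax, so the error bar is drawn as one dotted line
  stat_summary(fun = mean, fun.min = mean, fun.max = mean,
               geom = "errorbar", linetype = "dotted", width = 0.75)

# Layer 2 is the stat_summary layer; y holds the value it will draw
drawn_means <- layer_data(p, 2)$y

p
```

If the two vectors agree, the dotted lines sit at the group means.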

Related

Add mean to grouped box plots in R using ggplot2

I have three cultures of algae (A, B, C) at two temperatures (27C and 31C) and their densities. I want to make a box plot with Temperature on the x axis, Density on the y axis, and the three cultures side by side at each temperature (see picture below). I also need to include a dot with the mean density per culture and temperature. My script plots only one mean above each temperature, but what I need is a mean for each combination: A, B, and C at 27C, and A, B, and C at 31C. I tried to adapt some scripts from similar questions but I couldn't get it to work. Any help would be much appreciated.
graph<-ggplot(Algae, aes(x = Temperature,
y = Density,
fill=Culture))+
geom_boxplot()+
stat_summary(fun=mean,
geom="point",
shape=20,
size=2,
color="red",
fill="red",
position = position_dodge2 (width = 0.5, preserve = "single"))
Remove fill from stat_summary() and adjust the width in position_dodge2().
Here is an example with the mtcars data set:
ggplot(mtcars, aes(x = factor(am),
y = mpg,
fill=factor(cyl)))+
geom_boxplot() +
stat_summary(fun=mean,
geom="point",
shape=20,
size=2,
color="red",
position = position_dodge2 (width = 0.7, preserve = "single"))
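The reason removing fill helps is that a constant fill set inside stat_summary() replaces the inherited fill mapping for that layer, so the layer is no longer grouped by the fill variable and computes one mean per x value. The sketch below compares the two variants by counting the rows each summary layer produces. count_means is a hypothetical helper, and position_dodge(width = 0.75) is an assumption, a value commonly used to line points up with default dodged boxplots.

```r
library(ggplot2)

# Hypothetical helper: number of mean points the stat_summary layer yields.
# Extra arguments (e.g. fill = "red") are passed on to stat_summary().
count_means <- function(...) {
  p <- ggplot(mtcars, aes(x = factor(am), y = mpg, fill = factor(cyl))) +
    geom_boxplot() +
    stat_summary(fun = mean, geom = "point", shape = 20, size = 2,
                 color = "red", position = position_dodge(width = 0.75), ...)
  nrow(layer_data(p, 2))
}

n_grouped   <- count_means()              # fill mapping inherited: one mean per box
n_ungrouped <- count_means(fill = "red")  # constant fill: grouping by cyl is lost
```

Comparing n_grouped with n_ungrouped shows whether the summary layer still sees the groups.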
I fixed it by adding a facet. Here is the script:
graph <-ggplot(Algae, aes(x = Temperature,
y = Density,
fill=Culture))+
geom_boxplot()+
stat_summary(fun=mean,
geom="point",
shape=21,
size=2,
color="black",
fill="violet")+
facet_grid(.~Temperature,scales="free")
graph

How can I use different color or linetype aesthetics in same plot with ggplot?

I'm creating a plot with ggplot that uses colored points, vertical lines, and horizontal lines to display the data. Ideally, I'd like to use two different color or linetype scales for the geom_vline and geom_hline layers, but ggplot discourages/disallows multiple variables mapped to the same aesthetic.
# Create example data
library(tidyverse)
library(lubridate)
set.seed(1234)
# tibble() replaces the deprecated data_frame()
example.df <- tibble(dt = seq(ymd("2016-01-01"), ymd("2016-12-31"), by="1 day"),
                     value = rnorm(366),
                     grp = sample(LETTERS[1:3], 366, replace=TRUE))
date.lines <- tibble(dt = ymd(c("2016-04-01", "2016-10-31")),
                     dt.label = c("April Fools'", "Halloween"))
value.lines <- tibble(value = c(-1, 1),
                      value.label = c("Threshold 1", "Threshold 2"))
If I set linetype aesthetics for both geom_hline and geom_vline, they get put in the
linetype legend together, which doesn't necessarily make logical sense:
ggplot(example.df, aes(x=dt, y=value, colour=grp)) +
geom_hline(data=value.lines, aes(yintercept=value, linetype=value.label)) +
geom_vline(data=date.lines, aes(xintercept=as.numeric(dt), linetype=dt.label)) +
geom_point(size=1) +
scale_x_date() +
theme_minimal()
Alternatively, I could set one of the lines to use a colour aesthetic,
but then that again puts the legend lines in an illogical legend
grouping
ggplot(example.df, aes(x=dt, y=value, colour=grp)) +
geom_hline(data=value.lines, aes(yintercept=value, colour=value.label)) +
geom_vline(data=date.lines, aes(xintercept=as.numeric(dt), linetype=dt.label)) +
geom_point(size=1) +
scale_x_date() +
theme_minimal()
The only partial solution I've found is to use a fill aesthetic instead
of colour in geom_point and set shape=21 to use a fillable shape,
but that forces a black border around the points. I can get rid of the
border by manually setting colour="white", but then the white border
covers up points. If I set colour=NA, no points are plotted.
ggplot(example.df, aes(x=dt, y=value, fill=grp)) +
geom_hline(data=value.lines, aes(yintercept=value, colour=value.label)) +
geom_vline(data=date.lines, aes(xintercept=as.numeric(dt), linetype=dt.label)) +
geom_point(shape=21, size=2, colour="white") +
scale_x_date() +
theme_minimal()
This might be a case where ggplot's "you can't have two variables mapped
to the same aesthetic" rule can/should be broken, but I can't figure out a clean way around it. Using fill with geom_point shows the most promise, but I can't find a way to remove the point borders.
Any ideas for plotting two different color or linetype aesthetics here?

ggplot2 2.0.0 coloured boxplots and jitter with borders

I am trying to make a boxplot filled by a binary variable, with a facet grid. I also want to have jitter on top of the boxplots, but without getting them confused with the outliers. In order to fix this, I have added colour to the jitter, but by doing so, they meld in with the already coloured boxplots, as they are the same colour.
I really want to keep the colours the same, so is there a way to add borders to the jitter (or is there a different way to fix the outlier problem)?
Example code:
plot <- ggplot(mpg, aes(class, hwy))+
geom_boxplot(aes(fill = drv))+
geom_jitter(width = .3, aes(colour =drv))
# facet_grid(. ~some_binary_variable, scales="free")
You can use a filled plotting symbol (21:25, cf. ?pch) and then use a white border to differentiate the points:
ggplot(mpg, aes(class, hwy))+
geom_boxplot(aes(fill = drv))+
geom_jitter(width = .3, aes(fill = drv), shape = 21, color = "white")
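For the other half of the question (not confusing jittered points with outliers), one option is to stop the boxplot from drawing its own outlier points, since the jitter layer already shows every observation. This is a minimal sketch, assuming it is acceptable to hide the boxplot's outlier markers; outlier.shape = NA is the geom_boxplot() argument used for that, and height = 0 is an added assumption so the points are only jittered horizontally.

```r
library(ggplot2)

p <- ggplot(mpg, aes(class, hwy)) +
  # outlier.shape = NA hides the boxplot's own outlier points
  geom_boxplot(aes(fill = drv), outlier.shape = NA) +
  # filled symbol with a white border; jitter horizontally only
  geom_jitter(aes(fill = drv), shape = 21, color = "white",
              width = 0.3, height = 0)
p
```

Every observation is then drawn exactly once, by the jitter layer.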

Overlaying jittered points on boxplot conditioned by a factor using ggplot2

I am making a boxplot conditioned by a factor similar to this example:
p <- ggplot(mtcars, aes(factor(cyl), mpg))
p + geom_boxplot(aes(fill = factor(am)))
There are few points in the data set, and I'd like to express this visually by overlaying the data points. I want to overlay the points colored by the same factor "am" which I try to do like this:
p + geom_boxplot(aes(fill = factor(am))) + geom_jitter(aes(colour = factor(am)))
The points are colored by the factor "am" but not spaced to lay only over the box plots they are associated with. Rather they mix and cover both.
Does anyone know how to condition geom_jitter so that the points line up with the boxes for the factor "am"?
Welcome to SO! Here's my attempt. It's a bit clumsy, but does the job. The trick is to map x to a dummy variable with manually constructed offset. I'm adding a fill scale to highlight point positioning.
mtcars$cylpt <- as.numeric(factor(mtcars$cyl)) + ifelse(mtcars$am == 0, -0.2, 0.2)
ggplot(mtcars, aes(factor(cyl), mpg)) +
geom_boxplot(aes(fill = factor(am))) +
geom_point(aes(x = cylpt, colour = factor(am)), position = "jitter") +
scale_fill_manual(values = c("white", "gray"))
I have found this link that solves your problem:
https://datavizpyr.com/how-to-make-grouped-boxplot-with-jittered-data-points-in-ggplot2/
geom_jitter(position = position_jitterdodge())
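Applied to the mtcars example from the question, that suggestion might look like the sketch below. The jitter.width and dodge.width values are assumptions picked for illustration (0.75 is meant to match the default boxplot dodge), and outlier.shape = NA is added only so that outliers are not drawn twice.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(factor(cyl), mpg)) +
  geom_boxplot(aes(fill = factor(am)), outlier.shape = NA) +
  # position_jitterdodge() first dodges the points by the colour group,
  # then jitters them within each dodged position
  geom_point(aes(colour = factor(am)),
             position = position_jitterdodge(jitter.width = 0.2,
                                             dodge.width = 0.75))
p
```

Compared with the manual offset approach, this avoids adding a helper column to the data.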

How to fill boxes in geom_point legend with color of points, not just increasing their size?

I'm having a problem similar to the one described here under "2- After having the two legends...", but instead of increasing the point size (which eventually also enlarges the legend itself), I would like to fill each box in the legend with the corresponding color, like in a bar plot's legend. Data & code examples here.
Looking through several other questions here, the ggplot documentation, etc., I tried variations of code snippets I found, but couldn't figure out a solution. The legend always retained the point symbols.
Therefore: If possible, how to tweak or replace the legend of a point/scatter/bubble plot so that it looks like the legend of a bar plot? Or, more generally, how to replace the legend of a given geom in ggplot2 with that of a different one? Thank you for any hints!
Edit: Example with mtcars data
library(ggplot2)
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point(aes(colour = factor(cyl), size = qsec))
p
Adding what I gathered from other SO-answers...
p <- p + guides(colour = guide_legend(override.aes = list(fill = unique(mtcars$cyl))))
p
...keeps the points, instead of expanding the color to fill the legend box, no matter which arguments and data sources I try for guides() and list().
On the other hand:
ggplot(mtcars, aes(wt, mpg)) + geom_bar(aes(fill = factor(cyl)), stat="identity")
...draws nicely color-filled boxes to the legend. That's what I'm trying to do for a bubble plot.
You won't be able to get a fill-type legend per se, but you can easily emulate it:
ggplot(mtcars, aes(wt, mpg)) +
geom_point(aes(colour = factor(cyl), size = qsec)) +
guides(col = guide_legend(override.aes = list(shape = 15, size = 10)))
