R plot errorbars with outliers - r

I'm trying to get the same aesthetic as below, where the error bars look the same and outliers are shown. geom_errorbar and stat_summary are somewhat similar but don't show outliers. geom_boxplot shows outliers, but the box takes up too much space and I would prefer the slimmed-down appearance below. Does anyone know how to achieve this, with ggplot or without?

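We can set the width of the boxplot to 0, then use stat_boxplot and stat_summary to produce the rest of the plot in the picture you added:
library(ggplot2)
# fun replaces fun.y, which is deprecated as of ggplot2 3.3.0
p1 <- ggplot(data = iris, aes(x = Species, y = Sepal.Length)) +
  # zero-width box collapses to a vertical line; outliers are still drawn
  geom_boxplot(width = 0, outlier.colour = "red") +
  # whisker ends drawn as error bar caps
  stat_boxplot(geom = "errorbar", width = 0.5) +
  # group means as points, joined by a line across species
  stat_summary(fun = mean, geom = "point", size = 2) +
  stat_summary(fun = mean, geom = "line", aes(group = 1)) +
  theme_bw()
p1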
Created on 2018-03-18 by the reprex package (v0.2.0).
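To see which values end up as the red points, the outliers can also be listed directly. This is only a sketch: it assumes geom_boxplot's default rule (values more than 1.5 × IQR beyond the quartiles, with quartiles taken from quantile()), and the helper name find_outliers is made up for illustration.

```r
# Hypothetical helper: values outside 1.5 * IQR of the quartiles,
# assumed here to match geom_boxplot's default (coef = 1.5)
find_outliers <- function(x, coef = 1.5) {
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  x[x < q[1] - coef * iqr | x > q[2] + coef * iqr]
}

# Outlying Sepal.Length values per species
outliers <- tapply(iris$Sepal.Length, iris$Species, find_outliers)
outliers
```

A species with no outliers simply gets an empty numeric vector.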

Related

Reducing size of error bar caps in ggplot2

I am trying to plot a graph with the following code:
p <- ggplot(averagedf, aes(x = Time, y = average, col = Strain)) +
  geom_line() +
  geom_point() +
  geom_errorbar(aes(ymin = average - sem, ymax = average + sem)) +
  theme_classic() +
  theme(legend.position = "none")
And the graph looks like this, which is all fine, except that the caps(?) of the error bars are too wide:
[Image: first plot]
In order to reduce the width of the caps, I set width to 2, but now the caps are not centred around the vertical line of the error bar. Does anyone have any idea how to change the size of the caps without messing up their position?
p <- ggplot(averagedf, aes(x = Time, y = average, col = Strain)) +
  geom_line() +
  geom_point() +
  geom_errorbar(aes(ymin = average - sem, ymax = average + sem, width = 2)) +
  theme_classic() +
  theme(legend.position = "none")
[Image: second plot]
I wonder if that's just an artifact of the resolution of the graphics render. Without your exact data, it's hard to reproduce your exact plot, but with a similar example using an included R dataset, I don't have this issue.
library(tidyverse)
ChickWeight %>%
  ggplot(aes(x = Time, y = weight, color = factor(Diet))) +
  stat_summary(fun = mean, geom = "line") +
  stat_summary(fun = mean, geom = "point") +
  stat_summary(fun.data = mean_se, geom = "errorbar", width = 0.5) +
  theme_classic() +
  theme(legend.position = "none")
Created on 2021-03-15 by the reprex package (v1.0.0)
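If the data are already summarised, as in the question, the same idea can be sketched with geom_errorbar and the cap width passed as a constant outside aes(). The averagedf built here is only a stand-in made from ChickWeight (with Diet playing the role of Strain), and whether this resolves the off-centre caps in the original plot can't be confirmed without the original data.

```r
library(ggplot2)

# Stand-in for the asker's `averagedf`: mean and SEM per time point and diet
avg <- aggregate(weight ~ Time + Diet, data = ChickWeight, FUN = mean)
se  <- aggregate(weight ~ Time + Diet, data = ChickWeight,
                 FUN = function(x) sd(x) / sqrt(length(x)))
names(avg)[3] <- "average"
names(se)[3]  <- "sem"
averagedf <- merge(avg, se)  # joins on the shared Time and Diet columns

p <- ggplot(averagedf, aes(x = Time, y = average, colour = Diet)) +
  geom_line() +
  geom_point() +
  # width is a fixed setting here, in x-axis units, not an aesthetic
  geom_errorbar(aes(ymin = average - sem, ymax = average + sem), width = 0.5) +
  theme_classic() +
  theme(legend.position = "none")
p
```

Because width is measured in units of the x axis, a value that looks right for one time scale may need adjusting for another.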

Add standard error as shaded area instead of errorbars in geom_boxplot

I have a boxplot and I added the mean with stat_summary as a line over it. I want to add the standard error, but I don't want error bars.
Basically, I want to add the standard error as a shaded area, as you can do using geom_ribbon.
I used the PlantGrowth dataset to show you briefly what I've tried.
library(ggplot2)
ggplot(PlantGrowth, aes(group, weight)) +
  stat_boxplot(geom = "errorbar", linetype = 1, width = 0.5) +
  geom_boxplot(fill = "yellow4", colour = "black", outlier.shape = NA) +
  stat_summary(fun = mean, colour = "black", geom = "line", shape = 18, size = 1, aes(group = 1)) +
  stat_summary(fun.data = mean_se, geom = "errorbar")
I did it using geom_errorbar in stat_summary, and tried to substitute geom_errorbar with geom_ribbon, as I saw in some other examples around the web, but it doesn't work.
Something like this one, but with the error as shaded area instead of error bars (which make it a bit confusing to see)
Layering so many geoms becomes hard to read, but here's a simplified version with a few options. Aside from just paring things down a bit to see what I was editing, I added a tile as a summary geom; tile is similar to rect, except it assumes it will be centered at whatever its x value is, so you don't need to worry about the x-axis placement that geom_rect requires. You might experiment with fill colors and opacity—I made the boxplots white just to illustrate better.
library(ggplot2)
gg <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
  stat_boxplot(geom = "errorbar", width = 0.5) +
  geom_boxplot(fill = "white", outlier.shape = NA, width = 0.7) +
  # fun replaces fun.y, which is deprecated as of ggplot2 3.3.0
  stat_summary(aes(group = 1), fun = mean, geom = "line")
gg +
  stat_summary(fun.data = mean_se, geom = "tile", width = 0.7,
               fill = "pink", alpha = 0.6)
Based on your comments that you want a ribbon, you could instead use a ribbon with group = 1 the same as for the line.
gg +
  stat_summary(aes(group = 1), fun.data = mean_se, geom = "ribbon",
               fill = "pink", alpha = 0.6)
The ribbon doesn't make a lot of sense across a discrete variable, but here's an example with some dummy data for a continuous group, where this setup becomes more reasonable (though IMO still hard to read).
pg2 <- PlantGrowth
set.seed(123)
# dummy continuous grouping variable with integer values 1 to 5
pg2$cont_group <- floor(runif(nrow(pg2), 1, 6))
ggplot(pg2, aes(x = cont_group, y = weight, group = cont_group)) +
  stat_boxplot(geom = "errorbar", width = 0.5) +
  geom_boxplot(fill = "white", outlier.shape = NA, width = 0.7) +
  stat_summary(aes(group = 1), fun = mean, geom = "line") +
  stat_summary(aes(group = 1), fun.data = mean_se, geom = "ribbon",
               fill = "pink", alpha = 0.6)
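For orientation, the band that the tile and ribbon draw comes from mean_se, which returns the mean along with the mean minus and plus one standard error. A small sketch for a single group, with the same numbers computed by hand for comparison:

```r
library(ggplot2)

# Summary that stat_summary(fun.data = mean_se) computes for one group
ctrl <- PlantGrowth$weight[PlantGrowth$group == "ctrl"]
band <- mean_se(ctrl)  # data frame with columns y, ymin, ymax

# The same standard error computed by hand
se <- sd(ctrl) / sqrt(length(ctrl))
c(lower = mean(ctrl) - se, mean = mean(ctrl), upper = mean(ctrl) + se)
```

mean_se also takes a mult argument, so a wider band such as two standard errors can be requested through fun.args = list(mult = 2) in stat_summary.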

Extend an annotation line across multiple facets of ggplot

When I facet a plot I often want to point out interesting comparisons between groups. For instance, in the plot produced by this code I'd like to point out that the second and third columns are nearly identical.
library(tidyverse)
ggplot(mtcars, aes(x = as.factor(am), y = mpg)) +
  stat_summary(fun = "mean", geom = "col") +
  stat_summary(fun.data = mean_se, geom = "errorbar", width = .1) +
  facet_grid(~ vs)
Currently I can only make this annotation by exporting my plot to another app like Preview or Powerpoint and manually adding the lines and text across facets.
My efforts to add an annotation across facets result in annotations that do not leave their own facet. See below.
ggplot(mtcars, aes(x = as.factor(am), y = mpg)) +
  stat_summary(fun = "mean", geom = "col") +
  stat_summary(fun.data = mean_se, geom = "errorbar", width = .1) +
  facet_grid(~ vs) +
  annotate("errorbarh", xmin = 2, xmax = 3, y = 25, height = .5,
           color = "red") +
  annotate("text", x = 2.5, y = 27, label = "NS", color = "red")
Any advice about how to extend lines and annotations across facets would be greatly appreciated.

R ggplot2: Add means as horizontal line in a boxplot

I have created a boxplot using ggplot2:
library(ggplot2)
dat <- data.frame(study = c(rep('a', 50), rep('b', 50)),
                  FPKM = c(rnorm(50), rnorm(50)))
ggplot(dat, aes(x = study, y = FPKM)) + geom_boxplot()
The boxplot shows the median as a horizontal line across each box.
How do I add a dashed line to the box representing the mean of that group?
Thanks!
You can add horizontal lines to plots by using stat_summary with geom_errorbar. The line is horizontal because the y minimum and maximum are set to be the same as y.
ggplot(dat, aes(x = study, y = FPKM)) +
  geom_boxplot() +
  # after_stat(y) replaces the older ..y.. notation; fun replaces fun.y
  stat_summary(fun = mean, geom = "errorbar",
               aes(ymax = after_stat(y), ymin = after_stat(y)),
               width = .75, linetype = "dashed")

Dodging boxplots and error bars with ggplot2

library(ggplot2)
library(Hmisc)
data(mtcars)
myplot <- ggplot(mtcars, aes(x = as.factor(cyl), y = qsec)) +
  geom_boxplot() +
  stat_summary(fun = mean, geom = "point", shape = 5, size = 2) +
  stat_summary(fun.data = mean_cl_normal, geom = "errorbar",
               width = 0.2)
produces
I'd like to dodge the mean and error bars a bit to the right, so that the error bars don't obscure the IQR line of the boxplot. Specifying position=position_dodge(.5) doesn't seem to work, because geom_errorbar doesn't know about geom_boxplot.
You can introduce a new variable which you use as the x offset for your errorbars:
library(ggplot2)
library(Hmisc)
data(mtcars)
# numeric x position of each cyl level, shifted right by 0.5
mtcars$cyl.n <- as.numeric(as.factor(mtcars$cyl)) + .5
(myplot <- ggplot(mtcars, aes(x = as.factor(cyl), y = qsec)) +
  geom_boxplot() +
  stat_summary(aes(x = cyl.n), fun = mean, geom = "point", shape = 5, size = 2) +
  stat_summary(aes(x = cyl.n), fun.data = mean_cl_normal, geom = "errorbar",
               width = 0.2))
The as.numeric(as.factor(.)) makes sure that the new error bar is spaced at the same position as the boxplots but shifted by 0.5 units.
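A possible alternative that avoids the extra column is to shift the summary layers with position_nudge. This is only a sketch, not part of the answer above, and mean_se is swapped in for mean_cl_normal so that it runs with ggplot2 alone.

```r
library(ggplot2)

# mean_se stands in for mean_cl_normal so the sketch does not need Hmisc
nudged <- ggplot(mtcars, aes(x = as.factor(cyl), y = qsec)) +
  geom_boxplot() +
  # shift the mean points 0.5 x-units to the right of each box
  stat_summary(fun = mean, geom = "point", shape = 5, size = 2,
               position = position_nudge(x = 0.5)) +
  # shift the error bars by the same amount so they stay aligned with the points
  stat_summary(fun.data = mean_se, geom = "errorbar", width = 0.2,
               position = position_nudge(x = 0.5))
nudged
```

A smaller nudge, such as 0.3, would keep the markers closer to their boxes.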
