I have the following data that I'm trying to plot. I want to change the width of the error bars, but I get a message saying "Width not defined. Set with position_dodge(width = ?)". I tried using position_dodge(), but it didn't help. Any suggestions?
library(ggplot2)
time <- c("t1","t1","t1","t1","t1","t1","t2","t2","t2","t2","t2","t2")
species <- c(1,1,1,2,2,2,1,1,1,2,2,2)
value <- c(1,2,3,11,12,13,4,5,6,11,12,13)
df <- data.frame(time, species,value)
df$time <- as.factor(df$time)
df$species <- as.factor(df$species)
ggplot(df, aes(x = time, y = value, color = species, group = species)) + # Change fill to color
  theme_bw() +
  geom_point() +
  stat_summary(fun.y = mean, position = "dodge") +
  stat_summary(
    geom = "errorbar",
    fun.data = mean_cl_boot,
    width = 0.1, size = 0.2, col = "grey57") +
  # Lines by species using grouping
  stat_summary(aes(group = species), geom = "line", fun.y = mean) +
  ylab("Fitness")
position_dodge() shifts data points sideways so that all of them stay visible when they overlap. I am not sure it is of any use in your example, and if your data do not overlap you may find that simply removing this argument solves the issue. Alternatively, defining a single position_dodge() with a fixed width and passing it to every layer also solves the issue:
pd <- position_dodge(0.5)
ggplot(df, aes(x = time, y = value, color = species, group = species)) + # Change fill to color
  theme_bw() +
  geom_point(position = pd) +
  # `fun` replaces `fun.y`, which is deprecated since ggplot2 3.3.0
  stat_summary(fun = mean, position = pd) +
  stat_summary(
    geom = "errorbar",
    fun.data = mean_cl_boot,
    width = 0.1, size = 0.2, col = "grey57",
    position = pd) +
  # Lines by species using grouping
  stat_summary(aes(group = species), geom = "line", fun = mean, position = pd) +
  ylab("Fitness")
Edit: the same dodge is applied to every layer so that the points, error bars and lines stay aligned instead of drifting apart.
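To check that sharing one position_dodge() object really lines the layers up, the computed x positions of each layer can be inspected with layer_data(). This is only a minimal sketch: it rebuilds the question's data, and it swaps in mean_se for mean_cl_boot so that the Hmisc package is not required (that swap is my assumption, not part of the original answer).

```r
library(ggplot2)

# Toy data mirroring the question: two time points, two species.
df <- data.frame(
  time    = factor(rep(c("t1", "t2"), each = 6)),
  species = factor(rep(rep(1:2, each = 3), times = 2)),
  value   = c(1, 2, 3, 11, 12, 13, 4, 5, 6, 11, 12, 13)
)

# One dodge object with an explicit width, reused by every layer.
pd <- position_dodge(width = 0.5)

p <- ggplot(df, aes(time, value, colour = species, group = species)) +
  geom_point(position = pd) +
  # mean_se is used here instead of mean_cl_boot to avoid needing Hmisc
  stat_summary(fun.data = mean_se, geom = "errorbar",
               width = 0.1, position = pd)

# Distinct x positions actually used by the points and by the error bars.
point_x    <- sort(unique(as.numeric(layer_data(p, 1)$x)))
errorbar_x <- sort(unique(as.numeric(layer_data(p, 2)$x)))
print(point_x)
print(errorbar_x)
```

If the two printed vectors agree, the error bars sit directly over their points. Note that the errorbar's own width (0.1) only controls the size of the caps, while the dodge width (0.5) controls how far the groups are pushed apart.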
I'd like to draw a bar plot like this one, but with dual y-axes:
(https://i.stack.imgur.com/ldMx0.jpg)
The first three indices range from 0 to 1,
so I want the left y-axis (corresponding to NSE, KGE, VE) to range from 0 to 1,
and the right y-axis (corresponding to PBIAS) to range from -15 to 5.
The following is my data and code:
library("ggplot2")
## data
data <- data.frame(
  value = c(0.82, 0.87, 0.65, -3.39, 0.75, 0.82, 0.63, 1.14, 0.85, 0.87, 0.67, -7.03),
  sd = c(0.003, 0.047, 0.006, 4.8, 0.003, 0.028, 0.006, 4.77, 0.004, 0.057, 0.014, 4.85),
  index = c("NSE", "KGE", "VE", "PBIAS", "NSE", "KGE", "VE", "PBIAS", "NSE", "KGE", "VE", "PBIAS"),
  period = c("all", "all", "all", "all", "calibration", "calibration", "calibration", "calibration", "validation", "validation", "validation", "validation")
)
## fix index sequence
data$index <- factor(data$index, levels = c('NSE','KGE','VE',"PBIAS"))
data$period <- factor(data$period, levels = c('all','calibration', 'validation'))
## bar plot
ggplot(data, aes(x = index, y = value, fill = period)) +
  geom_bar(position = "dodge", stat = "identity") +
  geom_errorbar(aes(ymin = value - sd, ymax = value + sd),
                position = position_dodge(0.9), width = 0.2, alpha = 0.5, size = 1) +
  theme_bw()
I tried to scale and shift the second y-axis,
but the PBIAS bars were removed because they fall outside the scale limits, as shown here:
(https://i.stack.imgur.com/n6Jfm.jpg)
The following is my code with the dual y-axis:
## bar plot (scale and shift the second y-axis with slope/intercept in 20/-15)
ggplot(data, aes(x = index, y = value, fill = period)) +
  geom_bar(position = "dodge", stat = "identity") +
  geom_errorbar(aes(ymin = value - sd, ymax = value + sd),
                position = position_dodge(0.9), width = 0.2, alpha = 0.5, size = 1) +
  theme_bw() +
  scale_y_continuous(limits = c(0, 1), name = "value", sec.axis = sec_axis(~ 20 * . - 15, name = "value"))
Any advice on how to move the PBIAS bars onto the second axis, or any other solution?
Taking a different approach: instead of using a dual axis, one option would be to make two separate plots and glue them together using patchwork. IMHO that is much easier than fiddling around with rescaling the data (that's the step you missed, i.e. if you want to have a secondary axis you also have to rescale the data), and it makes it clearer that the indices are measured on different scales:
library(ggplot2)
library(patchwork)
data$facet <- data$index %in% "PBIAS"
# Helper that draws the dodged bars with error bars for any subset of the data
plot_fun <- function(.data) {
  ggplot(.data, aes(x = index, y = value, fill = period)) +
    geom_bar(position = "dodge", stat = "identity") +
    geom_errorbar(aes(ymin = value - sd, ymax = value + sd),
      position = position_dodge(0.9), width = 0.2, alpha = 0.5, size = 1
    ) +
    theme_bw()
}
p1 <- subset(data, !facet) |> plot_fun() + scale_y_continuous(limits = c(0, 1))
p2 <- subset(data, facet) |> plot_fun() + scale_y_continuous(limits = c(-15, 15), position = "right")
# `widths` is the full argument name in plot_layout()
p1 + p2 +
  plot_layout(guides = "collect", widths = c(3, 1))
A second, similar option would be to use ggh4x, which via ggh4x::facetted_pos_scales allows setting the limits for facet panels individually. One drawback: the panels have the same width. (I failed to make this approach work with facet_grid and space = "free".)
library(ggplot2)
library(ggh4x)
data$facet <- data$index %in% "PBIAS"
ggplot(data, aes(x = index, y = value, fill = period)) +
  geom_bar(position = "dodge", stat = "identity") +
  geom_errorbar(aes(ymin = value - sd, ymax = value + sd),
    position = position_dodge(0.9), width = 0.2, alpha = 0.5, size = 1
  ) +
  facet_wrap(~facet, scales = "free") +
  facetted_pos_scales(
    y = list(
      facet ~ scale_y_continuous(limits = c(-15, 15), position = "right"),
      !facet ~ scale_y_continuous(limits = c(0, 1), position = "left")
    )
  ) +
  theme_bw() +
  theme(strip.text.x = element_blank())
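For completeness, here is a rough sketch of the rescaling step that the dual-axis attempt was missing. Since the secondary axis is declared as 20 * primary - 15, the PBIAS values (and their sd) have to be mapped onto the primary axis with the inverse transformation before plotting. The column names value_scaled and sd_scaled and the helper vector is_pbias are names I made up for illustration.

```r
library(ggplot2)

data <- data.frame(
  value  = c(0.82, 0.87, 0.65, -3.39, 0.75, 0.82, 0.63, 1.14, 0.85, 0.87, 0.67, -7.03),
  sd     = c(0.003, 0.047, 0.006, 4.8, 0.003, 0.028, 0.006, 4.77, 0.004, 0.057, 0.014, 4.85),
  index  = factor(rep(c("NSE", "KGE", "VE", "PBIAS"), times = 3),
                  levels = c("NSE", "KGE", "VE", "PBIAS")),
  period = factor(rep(c("all", "calibration", "validation"), each = 4))
)

# The secondary axis is defined as: secondary = 20 * primary - 15.
# The PBIAS data therefore need the inverse: primary = (secondary + 15) / 20.
# A standard deviation is only scaled, not shifted.
is_pbias <- data$index == "PBIAS"                      # hypothetical helper
data$value_scaled <- ifelse(is_pbias, (data$value + 15) / 20, data$value)
data$sd_scaled    <- ifelse(is_pbias, data$sd / 20, data$sd)

p <- ggplot(data, aes(index, value_scaled, fill = period)) +
  geom_col(position = position_dodge(0.9)) +
  geom_errorbar(aes(ymin = value_scaled - sd_scaled,
                    ymax = value_scaled + sd_scaled),
                position = position_dodge(0.9), width = 0.2) +
  scale_y_continuous(limits = c(0, 1), name = "NSE, KGE, VE",
                     sec.axis = sec_axis(~ 20 * . - 15, name = "PBIAS")) +
  theme_bw()
```

One caveat with this sketch: the PBIAS bars still start at 0 on the primary axis, which corresponds to -15 on the secondary axis, so their height no longer reflects the distance from a PBIAS of 0. That is a good part of why the two-plot approaches above are easier to read.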
I made a simple barplot with ggplot2 comparing the mean lifespan (age) of males and females for 2 insect species.
My code looks like this, with "dataset" being, well, my data set...
ggplot(dataset, aes(Species, Age, fill=Sex))+
  stat_summary(fun.y = mean, geom = "bar", position = "dodge")+
  scale_fill_manual(values = c("Grey25", "Grey"))+
  theme(legend.title = element_blank())+
  scale_y_continuous(limits = c(0,15))
I tried using the following code to manually enter the value of the mean±SE to set the limits for the error bar. For the sake of simplicity, let's assume mean=10 and SE=0.5 for males of species1.
geom_errorbar(aes(ymin=9.5, ymax=10.5),width=.2,position=position_dodge(.9))
This code does indeed work, but it sets the same error bars for each bar in my plot.
How can I add error bars equal to the corresponding SE for each bar in my plot?
I am fairly new to ggplot and R in general so any help/advice is welcome.
All you need is to add stat_summary(geom = "errorbar", fun.data = mean_se, position = "dodge") to your plot:
library(ggplot2)
ggplot(diamonds, aes(cut, price, fill = color)) +
  stat_summary(geom = "bar", fun = mean, position = "dodge") +
  stat_summary(geom = "errorbar", fun.data = mean_se, position = "dodge")
If you prefer to calculate the values beforehand, you could do it like this:
library(tidyverse)
pdata <- diamonds %>%
  group_by(cut, color) %>%
  summarise(new = list(mean_se(price))) %>%
  unnest(new)
pdata %>%
  ggplot(aes(cut, y = y, fill = color)) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymin = ymin, ymax = ymax), position = "dodge")
You can add error bars to your barplot with geom_errorbar.
You need to supply ymin and ymax for each bar, so you have to compute them yourself.
From the geom_errorbar help page:
p + geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.2)
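As a sketch of what computing the limits yourself could look like, here is a base R version that builds one row per bar with mean - SE and mean + SE. The question's dataset is not available, so I use the built-in ToothGrowth data as a stand-in (dose and supp playing the roles of Species and Sex). The helper se() and the column names lower and upper are my own choices.

```r
library(ggplot2)

# Standard error of the mean for a numeric vector (hypothetical helper).
se <- function(x) sd(x) / sqrt(length(x))

tg <- ToothGrowth
tg$dose <- factor(tg$dose)

# One row per bar: mean and SE of `len` for each dose/supp combination.
# Both aggregate() calls use the same formula, so their rows line up.
stats <- aggregate(len ~ dose + supp, data = tg, FUN = mean)
names(stats)[names(stats) == "len"] <- "mean"
stats$se    <- aggregate(len ~ dose + supp, data = tg, FUN = se)$len
stats$lower <- stats$mean - stats$se
stats$upper <- stats$mean + stats$se

p <- ggplot(stats, aes(dose, mean, fill = supp)) +
  geom_col(position = position_dodge(0.9)) +
  geom_errorbar(aes(ymin = lower, ymax = upper),
                width = 0.2, position = position_dodge(0.9))
```

Because lower and upper are columns mapped inside aes(), each bar gets its own limits, unlike the constants ymin = 9.5 and ymax = 10.5 in the question, which are recycled for every bar.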
I have a boxplot, and I added the mean as a line over it with stat_summary. I want to add the standard error, but I don't want error bars.
Basically, I want to show the standard error as a shaded area, as you can do using geom_ribbon.
I used the PlantGrowth dataset to show you briefly what I've tried.
library(ggplot2)
ggplot(PlantGrowth, aes(group, weight))+
  stat_boxplot( geom='errorbar', linetype=1, width=0.5)+
  geom_boxplot(fill="yellow4",colour="black",outlier.shape=NA) +
  stat_summary(fun.y=mean, colour="black", geom="line", shape=18, size=1,aes(group=1))+
  stat_summary(fun.data = mean_se, geom = "errorbar")
I did it using geom_errorbar in stat_summary, and tried to substitute geom_errorbar with geom_ribbon, as I saw in some other examples around the web, but it doesn't work.
Something like this one, but with the error as shaded area instead of error bars (which make it a bit confusing to see)
Layering so many geoms becomes hard to read, but here's a simplified version with a few options. Aside from paring things down a bit to see what I was editing, I added a tile as a summary geom. A tile is similar to a rect, except that it is centered at whatever its x value is, so you don't need to worry about the x-axis placement that geom_rect requires. You might experiment with fill colors and opacity; I made the boxplots white just to illustrate better.
library(ggplot2)
gg <- ggplot(PlantGrowth, aes(x = group, y = weight)) +
  stat_boxplot(geom = "errorbar", width = 0.5) +
  geom_boxplot(fill = "white", outlier.shape = NA, width = 0.7) +
  # `fun` replaces `fun.y`, which is deprecated since ggplot2 3.3.0
  stat_summary(aes(group = 1), fun = mean, geom = "line")
gg +
  stat_summary(fun.data = mean_se, geom = "tile", width = 0.7,
               fill = "pink", alpha = 0.6)
Based on your comments that you want a ribbon, you could instead use a ribbon with group = 1 the same as for the line.
gg +
  stat_summary(aes(group = 1), fun.data = mean_se, geom = "ribbon",
               fill = "pink", alpha = 0.6)
The ribbon doesn't make a lot of sense across a discrete variable, but here's an example with some dummy data for a continuous group, where this setup becomes more reasonable (though IMO still hard to read).
pg2 <- PlantGrowth
set.seed(123)
pg2$cont_group <- floor(runif(nrow(pg2), 1, 6))
ggplot(pg2, aes(x = cont_group, y = weight, group = cont_group)) +
  stat_boxplot(geom = "errorbar", width = 0.5) +
  geom_boxplot(fill = "white", outlier.shape = NA, width = 0.7) +
  # `fun` replaces `fun.y`, which is deprecated since ggplot2 3.3.0
  stat_summary(aes(group = 1), fun = mean, geom = "line") +
  stat_summary(aes(group = 1), fun.data = mean_se, geom = "ribbon",
               fill = "pink", alpha = 0.6)
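The ribbon version works because mean_se() returns exactly the columns a ribbon needs: y, ymin and ymax. A quick way to see this, and to confirm that the band is mean plus or minus one standard error, is to call it directly on one group. This is just an illustrative check; the variable names ctrl, out and by_hand are mine.

```r
library(ggplot2)

# Weights of the control group only.
ctrl <- PlantGrowth$weight[PlantGrowth$group == "ctrl"]

# mean_se() returns a one-row data frame: the mean (y) and
# the mean minus/plus one standard error (ymin, ymax).
out <- mean_se(ctrl)
print(out)

# The same limits computed by hand for comparison.
se <- sd(ctrl) / sqrt(length(ctrl))
by_hand <- c(mean(ctrl) - se, mean(ctrl) + se)
print(by_hand)
```

For a wider band, mean_se() also takes a mult argument, which stat_summary can pass along via fun.args = list(mult = 2).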
I'm trying to build a complex figure that overlays individual data points on a boxplot to display both summary statistics and the dispersion of the raw data. I have 2 questions, in order of importance:
1. How do I center the jittered points around the middle of their respective box plot?
2. How can I remove the dark dots from the "drv" legend?
Code:
library(ggplot2)
library(dplyr)
mpg$cyl <- as.factor(mpg$cyl)
mpg %>%
  filter(fl=="p" | fl=="r" & cyl!="5") %>%
  sample_n(100) %>%
  ggplot(aes(cyl, hwy, fill=drv)) +
  stat_boxplot(geom = "errorbar", width=0.5, position = position_dodge(1)) +
  geom_boxplot(position = position_dodge(1), outlier.shape = NA)+
  geom_point(aes(fill=drv, shape=fl), color="black", show.legend=TRUE, alpha=0.5, size=3, position = position_jitterdodge(dodge.width = 1)) +
  scale_shape_manual(values = c(21,23))
It looks like the current dodging for geom_point is based on both fill and shape. Use group to indicate you only want to dodge on drv.
You can use override.aes in guide_legend to remove the points from the fill legend.
mpg %>%
  filter(fl=="p" | fl=="r" & cyl!="5") %>%
  sample_n(100) %>%
  ggplot(aes(cyl, hwy, fill=drv)) +
  stat_boxplot(geom = "errorbar", width=0.5, position = position_dodge(1)) +
  geom_boxplot(position = position_dodge(1), outlier.shape = NA) +
  geom_point(aes(fill = drv, shape = fl, group = drv), color = "black",
             alpha = 0.5, size = 3,
             position = position_jitterdodge(jitter.width = .1, dodge.width = 1)) +
  scale_shape_manual(values = c(21,23)) +
  guides(fill = guide_legend(override.aes = list(shape = NA)))
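To illustrate why the explicit group matters, the sketch below counts the groups ggplot2 creates for the point layer with and without group = drv. As far as I understand it, when no group is given ggplot2 falls back to the interaction of all discrete variables mapped in the layer (here cyl, drv and fl), which is what throws the dodging off. The names implicit, explicit, n_implicit and n_explicit are mine, and the random sample_n() step is left out so the result is reproducible.

```r
library(ggplot2)

d <- subset(mpg, fl %in% c("p", "r"))
d$cyl <- factor(d$cyl)

base <- ggplot(d, aes(cyl, hwy, fill = drv))

# No explicit group: groups come from every discrete mapped variable.
implicit <- base + geom_point(aes(shape = fl))
# Explicit group: only drv defines the groups that get dodged.
explicit <- base + geom_point(aes(shape = fl, group = drv))

n_implicit <- length(unique(layer_data(implicit)$group))
n_explicit <- length(unique(layer_data(explicit)$group))
print(c(implicit = n_implicit, explicit = n_explicit))
```

With group = drv the point layer has one group per drv level, which matches the three boxes drawn per cyl value by geom_boxplot.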