I have a dataset looking like this:
Flowers Sun_Exposition Value Repl
1: Tulipe mid 87.9 Aa1
2: Tulipe mid 92.8 Aa2
3: Tulipe mid 86.4 Aa3
4: Tulipe mid 83.3 Aa4
5: Tulipe mid 91.3 Aa5-1
6: Tulipe mid 91.4 Aa5-2
Flowers has two categories and there are 4 different Sun expositions. Each combination has a different number of replicates.
I would like to draw a barplot (with sd) and overlay the individual points, shaped by replicate.
Here is my code:
# summarize data
dataSum <- data[, .(M = mean(Value, na.rm = T), S = sd(Value, na.rm = T)),
by = .(Flowers, Sun_Exposition)]
# the plot
p <- ggplot(dataSum, aes(x = Sun_Exposition, y = M, fill = Flowers)) +
geom_bar(stat = "identity", color = "black",
position = position_dodge(.9)) +
geom_errorbar(aes(ymin = M, ymax = M + S), width = .2,
position = position_dodge(.9)) +
geom_jitter(data = data,
mapping = aes(x = Sun_Exposition, y = Value,
fill = Flowers,
shape = Repl, color = Flowers),
size = 2,
position = position_dodge(width = 0.9)) +
scale_shape_manual(values = c(19,17,18,15,7,8,6,5,4,2,13,12,3))+
scale_fill_manual(values = c("gray", "lavender")) +
scale_color_manual(values = c("gray30", "mediumpurple"))
which gives me this plot:
My problem is that the "spread" of the points is not as wide as the bar. I have tried many combinations, such as using geom_point and position_jitterdodge, or setting the jitter width to 0 or a negative value, but none gave the result I wanted.
Thank you very much for your help!
You have to add a group mapping to the aesthetics of geom_jitter. On your data, I suppose you would have to add group = Flowers.
Here is an example with mtcars:
library(magrittr)
library(dplyr)
library(ggplot2)
library(forcats)
mt1 <- mtcars %>%
group_by(cyl, am) %>%
summarize(mean = mean(hp, na.rm = T),
sd = sd(hp, na.rm = T))
mt2 <- mtcars %>%
mutate(rep = paste0("rep", rep(1:8, each = 4)))
# the plot
ggplot(mt1, aes(x = as_factor(cyl),
y = mean,
fill = as_factor(am))) +
geom_bar(stat = "identity", color = "black",
position = position_dodge(.9)) +
geom_errorbar(aes(ymin = mean, ymax = mean + sd), width = .2,
position = position_dodge(.9)) +
geom_jitter(data = mt2,
aes(x = as_factor(cyl),
y = hp,
fill = as_factor(am),
shape = as_factor(rep),
color = as_factor(am),
group = as_factor(am)),
size = 1,
position = position_jitterdodge(jitter.width = 0.9)) +
scale_shape_manual(values = c(19,17,18,15,7,8,6,5,4,2,13,12,3))+
scale_fill_manual(values = c("gray", "lavender")) +
scale_color_manual(values = c("red", "blue"))
I have the following plot:
Sample code:
dat = data.frame(grp = rep(c("Group1", "Group2"), 24),
label = rep(c(rep("Yes",2), rep("Rather yes",2), rep("Rather no",2), rep("No",2)), 6),
pct = rep(c(25,25,25,25), 12),
grp2 = c(rep("Total", 8),rep("Age", 24),rep("Gender", 16)),
label2 = c(rep("",8), rep("18-29", 8),rep("30-64", 8),rep("65-80", 8),rep("Male", 8),rep("Female", 8)))
dat$grp2 <- factor(dat$grp2, levels = c("Total", "Gender", "Age"))
# Design for facet_manual (ggh4x-Package)
design <- matrix(1:6,3)
heights <- c(8,16,24)
# Plot
plot <- ggplot2::ggplot(data = dat, ggplot2::aes(x = pct, y = label2, fill = label)) +
ggplot2::geom_bar(stat = 'identity', position = 'stack', width = 0.8, color = 'white') +
ggh4x::facet_manual(grp~grp2, design = design, heights = heights, scales = "free_y", strip.position = "top");plot
I would like Group1 and Group2 each to be written only once, at the top of the plot. One possibility I found is to select what is shown using labeller, but I do not understand the structure of the underlying labeller object (in the example below only the inner label is shown, rather than one outer label at the top plus all inner labels).
plot <- ggplot2::ggplot(data = dat, ggplot2::aes(x = pct, y = label2, fill = label)) +
ggplot2::geom_bar(stat = 'identity', position = 'stack', width = 0.8, color = 'white') +
ggh4x::facet_manual(grp~grp2, design = design, heights = heights, scales = "free_y", strip.position = "top",
labeller = function(df) {list(as.character(df[,2]))});plot
Does anyone have a solution? Of course it would also be possible to create two plots and then attach them next to each other. But I would be interested in a solution that generates a plot directly with facets.
I already tried using different labeller functions as well as using facet_nested_wrap as an alternative.
Using the inputs defined in the question, define a labeller function that accepts a data frame whose column names are grp and grp2 and whose 6 rows are the levels of each of the 6 facets. Replace those with the names that should be shown: in this case, if grp2 is Total then use the grp name for grp, and otherwise use "" for grp.
To be specific this data frame will be passed to the labeller function:
     grp   grp2
1 Group1  Total
2 Group1 Gender
3 Group1    Age
4 Group2  Total
5 Group2 Gender
6 Group2    Age
and the labeller function will return this data frame:
     grp   grp2
1 Group1  Total
2        Gender
3           Age
4 Group2  Total
5        Gender
6           Age
The code follows.
library(ggplot2)
library(ggh4x)
label_fun <- function(data) {
transform(data, grp = ifelse(grp2 == "Total", grp, ""),
grp2 = as.character(grp2))
}
plot <- ggplot(data = dat, aes(x = pct, y = label2, fill = label)) +
geom_bar(stat = 'identity', position = 'stack', width = 0.8, color = 'white') +
facet_manual(~ grp + grp2, design = design, heights = heights, scales = "free_y",
strip.position = "top", labeller = label_fun)
plot
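To check the transformation without drawing anything, the labeller can be called directly on a hand-built data frame. This is only a sketch: `labels_in` is a hypothetical stand-in for the data frame that the facet passes to the labeller, built from the levels listed above, and `label_fun` is repeated so the snippet runs on its own.

```r
# Same labeller as above, repeated so this snippet is self-contained
label_fun <- function(data) {
  transform(data, grp = ifelse(grp2 == "Total", grp, ""),
            grp2 = as.character(grp2))
}

# Hypothetical input: one row per facet, in the order shown above
labels_in <- data.frame(
  grp  = rep(c("Group1", "Group2"), each = 3),
  grp2 = factor(rep(c("Total", "Gender", "Age"), times = 2),
                levels = c("Total", "Gender", "Age")),
  stringsAsFactors = FALSE
)

# grp is kept only on the "Total" rows and blanked elsewhere
labels_out <- label_fun(labels_in)
print(labels_out)
```

Because grp is a character column here, ifelse returns the group names themselves; if it were a factor, wrapping it in as.character() first would be the safer choice.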
Added
Another approach is to collapse grp and grp2 into a single factor grps.
label_fun2 <- function(x) {
transform(x, grps = ifelse(grepl("Total", grps),
sub("\\.", " - ", grps),
sub("\\..*", "", grps)))
}
levs <- c("Total.Group1", "Gender.Group1", "Age.Group1",
"Total.Group2", "Gender.Group2", "Age.Group2")
dat |>
transform(grps = factor(interaction(grp2, grp), levs)) |>
ggplot(aes(x = pct, y = label2, fill = label)) +
geom_bar(stat = 'identity', position = 'stack', width = 0.8, color = 'white') +
facet_manual(~ grps , design = design, heights = heights, scales = "free_y",
strip.position = "top", labeller = label_fun2)
or if we use this labeller function which uses \n in place of - with the same ggplot2 code
label_fun2 <- function(x) {
transform(x, grps = ifelse(grepl("Total", grps),
sub("\\.", " \n ", grps),
sub("\\..*", "", grps)))
}
Note that levs, above, could be computed using:
dat |>
with(expand.grid(grp2 = levels(grp2), grp = levels(factor(grp)))) |>
with(interaction(grp2, grp))
Yet another approach is to use facet_grid2:
ggplot(dat, aes(x = pct, y = label2, fill = label)) +
geom_bar(stat = 'identity', position = 'stack', width = 0.8, color = 'white') +
facet_grid2(grp2 ~ grp, scales = "free_y", switch = "y")
I am using ggplot to plot the following graph (the example attached). What I want to achieve is to:
1-There are grey lines between Missing and power 1-1, power_1-1 and power_1-2 and so on, but no others (see the following graph). How can I have these lines between every bar in the background?
2- How can I change these lines' color (e.g., change to light blue) and line size?
3- Last, is there any way to sort my graph (through coding) based on the mean? (e.g., from -0.2231, then -0.2156, ... to 0.0592)
library(tidyverse)
df <- data.frame(id = c("Missing","power_1-1","power_1-2","power_1-3","power_1-4","power_1-5","power_2","power_3","power_4","power_5"),
mean = c(-0.0823,0.0592,-0.0556,-0.1037,-0.1303,-0.1478,-0.1857,-0.2074,-0.2231,-0.2156),
se = c(0.0609,0.0247,0.0216,0.0206,0.0202,0.0199,0.0194,0.0193,0.0205,0.0242), stringsAsFactors = FALSE)
win.graph(width = 13,height = 6)
df %>%
rowwise() %>%
mutate(CI95 = list(c(mean + 1.96 * se, mean - 1.96 * se)),
CI99 = list(c(mean + 2.58 * se, mean - 2.58 * se))) %>%
unnest(c(CI95, CI99)) %>%
ggplot() +
labs(x = NULL, y = NULL) +
geom_line(aes(x = id, y = CI99, group = id, color = id)) +
geom_line(aes(x = id, y = CI95, group = id, color = id), size = 3) +
geom_point(aes(x = id, y = mean, color = id), fill = "white", shape = 23, size = 3) +
geom_hline(yintercept = 0, linetype = "dashed") +
geom_vline(xintercept=1:3+0.5, colour="grey70") +
theme_classic() +
coord_flip()
Also, is there any way to show a legend as the following:
Here is a potential solution:
library(tidyverse)
df <- data.frame(id = c("Missing","power_1-1","power_1-2","power_1-3","power_1-4","power_1-5","power_2","power_3","power_4","power_5"),
mean = c(-0.0823,0.0592,-0.0556,-0.1037,-0.1303,-0.1478,-0.1857,-0.2074,-0.2231,-0.2156),
se = c(0.0609,0.0247,0.0216,0.0206,0.0202,0.0199,0.0194,0.0193,0.0205,0.0242), stringsAsFactors = FALSE)
colour_scale <- c("red", viridis::mako(9))
p1 <- df %>%
rowwise() %>%
mutate(CI95 = list(c(mean + 1.96 * se, mean - 1.96 * se)),
CI99 = list(c(mean + 2.58 * se, mean - 2.58 * se))) %>%
unnest(c(CI95, CI99)) %>%
mutate(id = factor(reorder(id, -CI99))) %>%
ggplot() +
labs(x = NULL, y = NULL) +
geom_line(aes(x = id, y = CI99, group = id, color = id)) +
geom_line(aes(x = id, y = CI95, group = id, color = id), size = 3) +
geom_point(aes(x = id, y = mean, color = id), fill = "white", shape = 23, size = 3) +
geom_hline(yintercept = 0, linetype = "dashed") +
geom_vline(xintercept=1:9+0.5, colour="lightblue", size = 0.5) +
theme_classic() +
scale_color_manual(values = colour_scale) +
coord_flip()
ggsave(filename = "example_plot.png", plot = p1, width = 18, height = 6, units = "cm")
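The reorder(id, -CI99) step sorts by the mean even though it is written in terms of CI99: reorder averages the values within each id by default, and the two CI99 endpoints (mean + 2.58 * se and mean - 2.58 * se) average back to the mean. A small base R sketch of that, using the first three rows of the data (the names `small`, `long`, `by_ci` and `by_mean` are only illustrative):

```r
# First three rows of the question's data
small <- data.frame(
  id   = c("Missing", "power_1-1", "power_1-2"),
  mean = c(-0.0823, 0.0592, -0.0556),
  se   = c(0.0609, 0.0247, 0.0216),
  stringsAsFactors = FALSE
)

# Long format: two CI99 endpoints per id, as produced by unnest() above
long <- data.frame(
  id   = rep(small$id, each = 2),
  CI99 = as.vector(rbind(small$mean + 2.58 * small$se,
                         small$mean - 2.58 * small$se))
)

# reorder() uses FUN = mean by default, so the score per id is -mean
by_ci   <- levels(reorder(long$id, -long$CI99))
by_mean <- small$id[order(-small$mean)]

print(by_ci)
print(by_mean)
```

Both vectors list the ids from the largest mean to the smallest, so reordering directly by -mean before the unnest step should give the same factor levels.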
I am trying to create a plot in ggplot2 similar to this one:
Here is the code I am using:
Dataset %>%
group_by(Participant, Group, Emotion) %>%
ggplot(aes(y = Score, x = Emotion, fill = Group, colour = Group)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0), alpha = .4) +
geom_point(aes(y = Score, color = Group), position = position_jitter(width = .15), size = 3, alpha = 0.4) +
stat_summary(aes(y = Score, group = Emotion), fun.y = mean, geom="line", size = 2.2, alpha = 1.2, width = 0.25, colour = 'gray48') +
stat_summary(fun = mean, geom = 'pointrange', width = 0.2, size = 2, alpha = 1.2, position=position_dodge(width=0.3)) +
stat_summary(fun.data = mean_se, geom='errorbar', width = 0.25, size = 2.2, alpha = 1.2, linetype = "solid",position=position_dodge(width=0.3)) +
guides(color = FALSE) +
scale_color_brewer(palette = "Dark2") +
scale_fill_brewer(palette = "Dark2") +
ylim(0, 100) +
graph_theme
What I am failing to do is set up the stat_summary(geom = 'line') to connect the green and orange means within each emotion on the x-axis. Could anyone give any pointers on this? I'd also like all the other features to stay the same if possible (e.g., I wouldn't like to use facet_grid or facet_wrap).
Thank you!
When I change the group argument in stat_summary to 'Group' instead of 'Emotion', means for each group are connected across emotions, but I can't figure out how to connect means of different groups within each emotion:
This is a tricky one because your line needs to connect points that have different x values. Even if you jitter or dodge in the point layer, the points still technically have the same x value, so the line doesn't know how to connect them. What others have done is to add the offset manually, forcing the points to have different x positions. For more inspiration check out this, this and this. Here's an example:
library(tidyverse)
set.seed(1)
emotion <- c("anger", "fear", "sadness")
group <- letters[1:2]
participant <- 1:10
dat <- expand_grid(emotion, group, participant) %>%
mutate(across(everything(), as.factor),
score = sample(x = 1:100, size = nrow(.), replace = T))
dat %>%
mutate(new_emot = case_when(
group == "a" ~as.numeric(emotion) - 0.125,
group == "b" ~as.numeric(emotion) + 0.125
)) %>%
ggplot(aes(x = emotion, y = score)) +
stat_summary(aes(color = group), fun = mean, geom = "point", position = position_dodge(width = 0.5)) +
stat_summary(aes(color = group), fun.data = mean_se, geom = "errorbar", width = 0.5, position = position_dodge(width = 0.5)) +
stat_summary(aes(x = new_emot, group = emotion), fun = mean, geom = "line") +
theme_bw()
Created on 2021-03-24 by the reprex package (v1.0.0)
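The ± 0.125 offset in the example is not arbitrary: it matches where position_dodge(width = 0.5) places two groups. As far as I understand the dodging rule, the width is divided into one equal slot per group and each point sits at the centre of its slot. The helper below, `dodge_offsets`, is a hypothetical function written for illustration under that assumption, not part of ggplot2:

```r
# Offsets from the category centre for n_groups dodged elements,
# assuming the dodge width is split into equal slots (illustrative helper)
dodge_offsets <- function(n_groups, width) {
  slot <- width / n_groups
  (seq_len(n_groups) - (n_groups + 1) / 2) * slot
}

two_groups   <- dodge_offsets(2, 0.5)  # the case used in the example above
three_groups <- dodge_offsets(3, 0.9)  # e.g. three groups with width 0.9

print(two_groups)
print(three_groups)
```

If the dodge width or the number of groups changes, the manual x offsets in the line layer need to change with it, otherwise the line no longer meets the points.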
Setting geom_line to the same position as pointrange and errorbar will solve the problem.
i.e.,
stat_summary(aes(y = Score, group = Emotion), fun.y = mean, geom="line", size = 2.2, alpha = 1.2, width = 0.25, colour = 'gray48', position=position_dodge(width=0.3))
I'm trying to overlay the bars of two geom_bar layers derived from 2 separate data.frames.
dEQ
lab perc
1 lmP 55.9
2 lmN 21.8
3 Nt 0.6
4 expG 5.6
5 expD 0.0
6 prbN 11.2
7 prbP 5.0
and
LMD
lab perc
1 lmP 16.8
2 lmN 8.9
3 Nt 0.0
4 expG 0.0
5 expD 0.0
6 prbN 0.0
7 prbP 0.0
The first plot is:
p <- ggplot(dEQ, aes(lab, perc)) +
xlab(xlabel) + ylab(ylabel) +
geom_bar(stat="identity", colour="blue", fill="darkblue") +
geom_text(aes(vecX, vecYEQ+1.5, label=vecYlbEQ), data=dEQ, size=8.5) +
theme_bw() +
opts(axis.text.x = theme_text(size = 20, face = "bold", colour = "black")) +
opts(axis.text.y = theme_text(size = 20, face = "bold", colour = "black")) +
coord_flip() +
scale_y_continuous(breaks=c(0,10,20,30,40,50,60),
labels=c("0","","20","","40","","60"),
limits = c(0, 64), expand = c(0,0))
print(p)
but I want to overplot with another geom_bar from data.frame LMD
ggplot(LMD, aes(lab, perc)) +
geom_bar(stat="identity", colour="blue", fill="red", add=T)
and I want to have a legend.
Here is an example (stat = "identity" is needed in current ggplot2 versions because a y aesthetic is supplied):
p <- ggplot(NULL, aes(lab, perc)) +
geom_bar(aes(fill = "dEQ"), data = dEQ, stat = "identity", alpha = 0.5) +
geom_bar(aes(fill = "LMD"), data = LMD, stat = "identity", alpha = 0.5)
p
But I recommend combining them with rbind and plotting them dodged:
dEQ$name <- "dEQ"
LMD$name <- "LMD"
d <- rbind(dEQ, LMD)
p <- ggplot(d, aes(lab, perc, fill = name)) + geom_bar(stat = "identity", position = "dodge")
This answer does not directly address the OP's requirement, but many later questions on SO have been closed as duplicates of this one, so I am proposing a method for constructing bar(s) within a bar plot in ggplot2.
Example for two bars (group-wise division) within one bigger bar plot.
library(tidyverse)
set.seed(40)
df <- tibble(name = LETTERS[1:10], provision = rnorm(mean = 100, sd = 20, n = 10),
expenditure = provision - rnorm(mean = 25, sd = 10, n = 10))
df %>% mutate(savings = provision - expenditure) %>%
pivot_longer(cols = c("expenditure", "savings"), names_to = "Exp", values_to = "val") %>%
ggplot() + geom_bar(aes(x= name, y = provision/2), stat = "identity", fill = "blue", width = 0.9, alpha = 0.3) +
geom_col(aes(x=name,y=val, fill = Exp), position ="dodge", width = 0.7) +
scale_y_continuous(name = "Amount in \u20b9")
Another option for overlaying the bars without lowering transparency via alpha is to group_by the data on the fill variable, arrange(desc()) the y variable, and use position = position_identity(). The bars are then drawn on top of each other with the highest bars behind and the lower ones in front, so there is no need to change the transparency. Here is a reproducible example:
# Add name for fill aesthetic
dEQ$name <- "dEQ"
LMD$name <- "LMD"
library(dplyr)
library(ggplot2)
dEQ %>%
rbind(LMD) %>%
group_by(name) %>%
arrange(desc(perc)) %>%
ggplot(aes(x = lab, y = perc, fill = name)) +
geom_bar(stat="identity", position = position_identity())
Created on 2022-11-02 with reprex v2.0.2
As you can see, the bars overlay while keeping their original transparency.
I've been trying to superimpose a normal curve over my histogram with ggplot2.
My formula:
data <- read.csv (path...)
ggplot(data, aes(V2)) +
geom_histogram(alpha=0.3, fill='white', colour='black', binwidth=.04)
I tried several things:
+ stat_function(fun=dnorm)
....didn't change anything
+ stat_density(geom = "line", colour = "red")
...gave me a straight red line on the x-axis.
+ geom_density()
doesn't work for me because I want to keep my frequency values on the y-axis, and want no density values.
Any suggestions?
Solution found!
+geom_density(aes(y=0.045*..count..), colour="black", adjust=4)
Think I got it:
library(ggplot2)
set.seed(1)
df <- data.frame(PF = 10*rnorm(1000))
ggplot(df, aes(x = PF)) +
geom_histogram(aes(y =..density..),
breaks = seq(-50, 50, by = 10),
colour = "black",
fill = "white") +
stat_function(fun = dnorm, args = list(mean = mean(df$PF), sd = sd(df$PF)))
This has been answered here and partially here.
The area under a density curve equals 1, and the area under the histogram equals the width of the bars times the sum of their heights, i.e. the binwidth times the total number of non-missing observations. To fit both on the same graph, one or the other needs to be rescaled so that their areas match.
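The area argument can be checked numerically in base R. This is a minimal sketch on made-up data (`x`, `bw` and the break construction are assumptions chosen for illustration): the count histogram has area bw * n, and a normal density multiplied by bw * n integrates to approximately the same value.

```r
set.seed(1)
x  <- rnorm(500, mean = 20, sd = 5)  # illustrative data, not from the question
bw <- 2                              # assumed bin width

# Breaks on multiples of bw that cover the whole range of x
breaks <- seq(floor(min(x) / bw) * bw, ceiling(max(x) / bw) * bw, by = bw)
h <- hist(x, breaks = breaks, plot = FALSE)

# Area of the count histogram: sum of bar heights times bar width
hist_area <- sum(h$counts * bw)

# Normal density rescaled from area 1 to area bw * n
scaled_curve <- function(v) dnorm(v, mean = mean(x), sd = sd(x)) * bw * length(x)
curve_area <- integrate(scaled_curve,
                        lower = mean(x) - 10 * sd(x),
                        upper = mean(x) + 10 * sd(x))$value

print(c(hist_area = hist_area, curve_area = curve_area))
```

The same factor, bw * n, is what the options below apply either to the axis or to the curve.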
If you want the y-axis to have frequency counts, there are a number of options:
First simulate some data.
library(ggplot2)
set.seed(1)
dat_hist <- data.frame(
group = c(rep("A", 200), rep("B",150)),
value = c(rnorm(200, 20, 5), rnorm(150,25,10)))
# Set desired binwidth and number of non-missing obs
bw = 2
n_obs = sum(!is.na(dat_hist$value))
Option 1: Plot both histogram and density curve as density and then rescale the y axis
This is perhaps the easiest approach for a single histogram.
Using the approach suggested by Carlos, plot both histogram and density curve as density
g <- ggplot(dat_hist, aes(value)) +
geom_histogram(aes(y = ..density..), binwidth = bw, colour = "black") +
stat_function(fun = dnorm, args = list(mean = mean(dat_hist$value), sd = sd(dat_hist$value)))
And then rescale the y axis.
ybreaks = seq(0,50,5)
## On primary axis
g + scale_y_continuous("Counts", breaks = round(ybreaks / (bw * n_obs),3), labels = ybreaks)
## Or on secondary axis
g + scale_y_continuous("Density", sec.axis = sec_axis(
trans = ~ . * bw * n_obs, name = "Counts", breaks = ybreaks))
Option 2: Rescale the density curve using stat_function
With code tidied as per PatrickT's answer.
ggplot(dat_hist, aes(value)) +
geom_histogram(colour = "black", binwidth = bw) +
stat_function(fun = function(x)
dnorm(x, mean = mean(dat_hist$value), sd = sd(dat_hist$value)) * bw * n_obs)
Option 3: Create an external dataset and plot using geom_line.
Unlike the above options, this one works with facets. (EDITED to provide dplyr rather than plyr based solution). Note, the summarised dataset is being used as the primary, and the raw passed in for the histogram only.
library(tidyverse)
dat_hist %>%
group_by(group) %>%
nest(data = c(value)) %>%
mutate(y = map(data, ~ dnorm(
.$value, mean = mean(.$value), sd = sd(.$value)
) * bw * sum(!is.na(.$value)))) %>%
unnest(c(data,y)) %>%
ggplot(aes(x = value)) +
geom_histogram(data = dat_hist, binwidth = bw, colour = "black") +
geom_line(aes(y = y)) +
facet_wrap(~ group)
Option 4: Create external functions to edit the data on the fly
A bit over the top perhaps, but might be useful for someone?
## Function to create scaled dnorm data along full x axis range
dnorm_scaled <- function(data, x = NULL, binwidth = 1, xlim = NULL) {
.x <- na.omit(data[,x])
if(is.null(xlim))
xlim = c(min(.x), max(.x))
x_range = seq(xlim[1], xlim[2], length.out = 101)
setNames(
data.frame(
x = x_range,
y = dnorm(x_range, mean = mean(.x), sd = sd(.x)) * length(.x) * binwidth),
c(x, "y"))
}
## Function to apply over groups
dnorm_scaled_group <- function(data, x = NULL, group = NULL, binwidth = NULL, xlim = NULL) {
dat_hists <- lapply(
split(data, data[, group]), dnorm_scaled,
x = x, binwidth = binwidth, xlim = xlim)
for(g in names(dat_hists))
dat_hists[[g]][, "group"] <- g
setNames(do.call(rbind, dat_hists), c(x, "y", group))
}
## Single histogram
ggplot(dat_hist, aes(value)) +
geom_histogram(binwidth = bw, colour = "black") +
geom_line(data = ~ dnorm_scaled(., "value", binwidth = bw),
aes(y = y))
## With a single faceting variable
ggplot(dat_hist, aes(value)) +
geom_histogram(binwidth = 2, colour = "black") +
geom_line(data = ~ dnorm_scaled_group(
., x = "value", group = "group", binwidth = 2, xlim = c(0,50)),
aes(y = y)) +
facet_wrap(~ group)
This is an extended comment on JWilliman's answer. I found J's answer very useful. While playing around I discovered a way to simplify the code. I'm not saying it is a better way, but I thought I would mention it.
Note that JWilliman's answer provides the count on the y-axis and a "hack" to scale the corresponding density normal approximation (which otherwise would cover a total area of 1 and have therefore a much lower peak).
Main point of this comment: simpler syntax inside stat_function, by passing the needed parameters to the aesthetics function, e.g.
aes(x = x, mean = 0, sd = 1, binwidth = 0.3, n = 1000)
This avoids having to pass args = to stat_function and is therefore more user-friendly. Okay, it's not very different, but hopefully someone will find it interesting.
# parameters that will be passed to ``stat_function``
n = 1000
mean = 0
sd = 1
binwidth = 0.3 # passed to geom_histogram and stat_function
set.seed(1)
df <- data.frame(x = rnorm(n, mean, sd))
ggplot(df, aes(x = x, mean = mean, sd = sd, binwidth = binwidth, n = n)) +
theme_bw() +
geom_histogram(binwidth = binwidth,
colour = "white", fill = "cornflowerblue", size = 0.1) +
stat_function(fun = function(x) dnorm(x, mean = mean, sd = sd) * n * binwidth,
color = "darkred", size = 1)
This code should do it:
set.seed(1)
z <- rnorm(1000)
qplot(z, geom = "blank") +
geom_histogram(aes(y = ..density..)) +
stat_density(geom = "line", aes(colour = "bla")) +
stat_function(fun = dnorm, aes(x = z, colour = "blabla")) +
scale_colour_manual(name = "", values = c("red", "green"),
breaks = c("bla", "blabla"),
labels = c("kernel_est", "norm_curv")) +
theme(legend.position = "bottom", legend.direction = "horizontal")
Note: I used qplot but you can use the more versatile ggplot.
Here's a tidyverse informed version:
Setup
library(tidyverse)
Some data
d <- read_csv("https://vincentarelbundock.github.io/Rdatasets/csv/openintro/speed_gender_height.csv")
Preparing data
We'll use a "total" histogram for the whole sample. To that end, we need to remove the grouping information from the data.
d2 <-
d |>
select(-gender)
Here's a data set with summary data:
d_summary <-
d %>%
group_by(gender) %>%
summarise(height_m = mean(height, na.rm = T),
height_sd = sd(height, na.rm = T))
d_summary
Plot it
d %>%
ggplot() +
aes() +
geom_histogram(aes(y = ..density.., x = height, fill = gender)) +
facet_wrap(~ gender) +
geom_histogram(data = d2, aes(y = ..density.., x = height),
alpha = .5) +
stat_function(data = d_summary %>% filter(gender == "female"),
fun = dnorm,
#color = "red",
args = list(mean = filter(d_summary,
gender == "female")$height_m,
sd = filter(d_summary,
gender == "female")$height_sd)) +
stat_function(data = d_summary %>% filter(gender == "male"),
fun = dnorm,
#color = "red",
args = list(mean = filter(d_summary,
gender == "male")$height_m,
sd = filter(d_summary,
gender == "male")$height_sd)) +
theme(legend.position = "none",
axis.title.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank()) +
labs(title = "Facetted histograms with overlaid normal curves",
caption = "The grey histograms show the whole distribution over both groups, i.e. females and males") +
scale_fill_brewer(type = "qual", palette = "Set1")