ggplot2: Highlight area depending on dnorm function - r

I want to highlight the area between a vertical line and a normal density curve. I know how to do it with discrete values, but stat_function confuses me. The code looks something like this:
library(ggplot2)
n1 <- 5
ggplot(data.frame(x = c(-2, 2)), aes(x)) +
stat_function(fun = dnorm, args = list(sd = 1/sqrt(n1))) +
geom_vline(xintercept = 0.5, linetype = "dashed", color = "red", size = 1) +
geom_vline(xintercept = -0.5, linetype = "dashed", color = "red", size = 1) +
ylim(c(0, 1.5)) +
theme_light() +
geom_rect(aes(xmin = 0.5, xmax = Inf, ymax = Inf, ymin = 0), fill = "grey", alpha = .3)
I know I need to change ymax so that it follows the curve for x > 0.5. The question is how?
EDIT:
I looked into the question which is supposed to be the same as mine. When I rewrite the code the way they did, the highlighting works but it doesn't give me a proper normal distribution anymore, as you can see here:
library(dplyr)
set.seed(123)
range <- seq(from = -2, to = 2, by = .01)
norm <- rnorm(range, sd = 1 / sqrt(n1))
df <- data_frame(x = density(norm)$x, y = density(norm)$y)
ggplot(data_frame(values = norm)) +
stat_density(aes(x = values), geom = "line") +
geom_vline(xintercept = 0.5, linetype = "dashed", color = "red", size = 1) +
geom_vline(xintercept = -0.5, linetype = "dashed", color = "red", size = 1) +
ylim(c(0, 1.5)) +
theme_light() +
geom_ribbon(data = filter(df, x > 0.5),
aes(x = x, ymax = y), ymin = 0, fill = "red", alpha = .5)
When I stick with stat_function and use geom_ribbon with subsetting, as proposed in the very same question, the highlighted area does not match the curve, as you can see here:
ggplot(data_frame(x = c(-2, 2)), aes(x)) +
stat_function(fun = dnorm, args = list(sd = 1/sqrt(n1))) +
geom_vline(xintercept = 0.5, linetype = "dashed", color = "red", size = 1) +
geom_vline(xintercept = -0.5, linetype = "dashed", color = "red", size = 1) +
ylim(c(0, 1.5)) +
theme_light() +
geom_ribbon(data = filter(df, x > 0.5),
aes(x = x, ymax = y), ymin = 0, fill = "red", alpha = .5)
Not satisfying yet.

Here is an approach:
library(ggplot2)
n1 <- 5
ggplot(data.frame(x = c(-2, 2)), aes(x)) +
stat_function(fun = dnorm, geom = "area", fill = "grey", alpha = 0.3, args = list(sd = 1/sqrt(n1)), xlim = c(-0.5,0.5)) +
stat_function(fun = dnorm, args = list(sd = 1/sqrt(n1))) +
geom_vline(xintercept = 0.5, linetype = "dashed", color = "red", size = 1) +
geom_vline(xintercept = -0.5, linetype = "dashed", color = "red", size = 1) +
ylim(c(0, 1.5)) +
theme_light()
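In stat_function you can set a different geom; pick the one that suits your needs. Here geom = "area" together with xlim restricts the shaded region to the interval between the two vertical lines.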
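The question's own geom_rect shades x > 0.5 rather than the middle interval, so a variant for that tail may be useful. The following is a minimal sketch, not the answerer's code: the names sd1, cutoff, base, p_tail, tail_df and p_ribbon are introduced here for illustration, and the upper bound 2 is simply the right edge of the plotted range. It shows two options: the same stat_function trick with a different xlim, and a geom_ribbon fed with exact dnorm values instead of a density estimated from a random sample.

```r
library(ggplot2)

n1     <- 5
sd1    <- 1 / sqrt(n1)   # standard deviation used in the question
cutoff <- 0.5            # position of the right-hand vertical line

# Curve and vertical lines shared by both variants
base <- ggplot(data.frame(x = c(-2, 2)), aes(x)) +
  stat_function(fun = dnorm, args = list(sd = sd1)) +
  geom_vline(xintercept = c(-cutoff, cutoff), linetype = "dashed", color = "red") +
  ylim(c(0, 1.5)) +
  theme_light()

# Option 1: let stat_function draw the area, restricted to 0.5 <= x <= 2
p_tail <- base +
  stat_function(fun = dnorm, args = list(sd = sd1), geom = "area",
                xlim = c(cutoff, 2), fill = "grey", alpha = 0.3)

# Option 2: evaluate dnorm on a grid and hand the exact values to geom_ribbon
tail_df   <- data.frame(x = seq(cutoff, 2, length.out = 200))
tail_df$y <- dnorm(tail_df$x, sd = sd1)
p_ribbon <- base +
  geom_ribbon(data = tail_df, aes(x = x, ymin = 0, ymax = y),
              fill = "red", alpha = 0.5)
```

Printing p_tail or p_ribbon draws the plot. If the probability behind the shaded tail is also needed, pnorm(cutoff, sd = sd1, lower.tail = FALSE) should give it.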

Related

shaded area between two lines in ggplot

I would like to create a blue shaded area between the two dotted lines (-0.5 and 0.5). I tried geom_polygon(), but it didn't work.
How can this be done in the best possible way?
model <- lm(Sepal.Width ~ Petal.Length, data = iris)
ggplot(data.frame(x = seq(model$residuals), y = model$residuals)) +
geom_point(aes(x, y)) +
geom_hline(yintercept = 0, linetype = "dashed") +
geom_hline(yintercept = 0.5, linetype = "dotted") +
geom_hline(yintercept = -0.5, linetype = "dotted") +
labs(x = "Index", y = "Residuals",
title = paste("Residuals of", format(model$call)))
You can use geom_ribbon
ggplot(data.frame(x = seq(model$residuals), y = model$residuals)) +
geom_point(aes(x, y)) +
geom_ribbon(aes(x, ymin = -0.5, ymax = 0.5), alpha = 0.3, fill = 'steelblue')+
geom_hline(yintercept = 0, linetype = "dashed") +
geom_hline(yintercept = 0.5, linetype = "dotted") +
geom_hline(yintercept = -0.5, linetype = "dotted") +
labs(x = "Index", y = "Residuals",
title = paste("Residuals of", format(model$call)))
With annotate:
annotate("rect", xmin = -Inf, xmax = Inf, ymin = -0.5, ymax = 0.5, alpha = .2, fill = "blue")
Output:
ggplot(data.frame(x = seq(model$residuals), y = model$residuals)) +
geom_point(aes(x, y)) +
geom_hline(yintercept = 0, linetype = "dashed") +
geom_hline(yintercept = c(-0.5, 0.5), linetype = "dotted") +
annotate("rect", xmin = -Inf, xmax = Inf, ymin = -0.5, ymax = 0.5, alpha = .2, fill = "blue") +
labs(x = "Index", y = "Residuals",
title = paste("Residuals of", format(model$call)))

Adding labels to this ggridge plot

I can't seem to figure out how to add labels to this plot:
ggplot(input_cleaned, aes(x =DAYS_TO_FA, y = fct_rev(DATE_TEXT), group = fct_rev(DATE_TEXT))) +
geom_density_ridges2(stat="binline", bins = 75, scale = 0.95, draw_baseline = FALSE) +
labs(title = 'Monthly Plots of Time to First Nose Pickin', y='Month Tracked', x = 'Days to First Pickin Action') +
theme(plot.title = element_text(hjust = 0.5), plot.subtitle = element_text(hjust = 0.5)) +
scale_x_continuous(breaks=seq(0,130,5)) +
geom_segment(aes(x=50, xend = 50, y=1,yend=5),
linetype = "dashed", size = 1.5,
color = "black") +
geom_segment(aes(x=75, xend = 75, y=5,yend=30),
linetype = "dashed", size = 1.5,
color = "black")
which produces:
I have tried this:
ggplot(input_cleaned, aes(x =DAYS_TO_FA, y = fct_rev(DATE_TEXT), group = fct_rev(DATE_TEXT))) +
geom_density_ridges2(stat="binline", bins = 75, scale = 0.95, draw_baseline = FALSE) +
geom_text(stat = "bin",
aes(y = fct_rev(input_cleaned$DATE_TEXT) + 0.95*(..count../max(..count..)), label = ifelse(..count..>0, ..count.., "")),
vjust = 1.4, size = 3, color = "white", binwidth = 1) +
labs(title = 'Monthly Plots of Time to First Nose Pickin', y='Month Tracked', x = 'Days to First Pickin Action') +
theme(plot.title = element_text(hjust = 0.5), plot.subtitle = element_text(hjust = 0.5)) +
scale_x_continuous(breaks=seq(0,130,5)) +
geom_segment(aes(x=50, xend = 50, y=1,yend=5),
linetype = "dashed", size = 1.5,
color = "black") +
geom_segment(aes(x=75, xend = 75, y=5,yend=30),
linetype = "dashed", size = 1.5,
color = "black")
based on the example found here:
Visualization of Groups of Poisson random samples using ggridges
but I can't get it to work. Nothing changes.
I know it may not be a good idea for this graph, but I am interested in seeing how it looks and more or less learning how to apply it.

Change geom_text label angle in plotly

I have the following chart I have built using ggplot + ggplotly.
I am trying to add labels to the red (median) and blue (percentile 90%) vertical lines without luck.
Please advise how I should fix it.
The code I have used:
p1 <- ggplot(users_d_total %>% filter(isSame, D_rank == 2), aes(x = D, fill = as.factor(train_user_id))) +
geom_density(alpha = .3) +
labs(title = paste0("Without Normalization Analysis [K = 2]")) +
scale_fill_discrete(name = "Users") +
scale_x_continuous(breaks = by_two) +
geom_vline(aes(xintercept = median(D)), col = 'red', linetype = 1, size = 1) +
geom_text(aes(x = median(D), y = 1, label = "Median"), hjust = 1, angle = 90, colour= "red") +
geom_vline(aes(xintercept = quantile(D, probs = .9)), col = 'blue', linetype = 1, size = 1) +
geom_text(aes(x = quantile(D, probs = .9), y = 1, label = "90th Percentile"), hjust = 1, angle = 90, colour = "blue") +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
ggplotly(p1)
I want the text to be vertical but using the answer from How to add legend for vertical lines in ggplot? didn't help me.
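One possible workaround, offered as an assumption rather than a confirmed fix: ggplotly appears not to carry the angle of geom_text over to the plotly figure, so the labels can instead be added on the plotly side with add_annotations and its textangle attribute. The sketch below uses made-up data (d and its column D stand in for users_d_total) and hypothetical label positions (y = 0.5).

```r
library(ggplot2)
library(plotly)

set.seed(1)
d   <- data.frame(D = rexp(200))   # made-up stand-in for the real data
med <- median(d$D)
q90 <- unname(quantile(d$D, probs = 0.9))

# ggplot part: density and vertical lines only, no geom_text layers
p <- ggplot(d, aes(x = D)) +
  geom_density(alpha = 0.3, fill = "grey") +
  geom_vline(xintercept = med, colour = "red") +
  geom_vline(xintercept = q90, colour = "blue")

# plotly part: rotated labels added as annotations in data coordinates
w <- ggplotly(p)
w <- add_annotations(w, x = med, y = 0.5, text = "Median",
                     textangle = -90, showarrow = FALSE, xanchor = "right",
                     font = list(color = "red"))
w <- add_annotations(w, x = q90, y = 0.5, text = "90th Percentile",
                     textangle = -90, showarrow = FALSE, xanchor = "right",
                     font = list(color = "blue"))
```

Printing w shows the interactive plot. The y positions are in data units and would need adjusting to the height of the real density.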

Transparent masking in ggplot2

I'm interested in ways to only include panel grid lines right near the ribbon. I can do this manually in a trivial example:
library(ggplot2)
d1 <- data.frame(x = seq(0, 1, length.out = 200))
d1$y1 <- -3*(d1$x-.5)^2 + 1
d1$y2 <- -3*(d1$x-.5)^2 + 2
ggplot(d1) +
geom_ribbon(aes(x, ymin = y1, ymax = y2),
alpha = .25) +
geom_ribbon(aes(x, ymax = y1),
ymin = .25,
fill = "white") +
geom_ribbon(aes(x, ymin = y2),
ymax = 2,
fill = "white") +
scale_y_continuous(limits = c(.25, 2.0),
expand = c(0, 0))+
scale_x_continuous(limits = c(0, 1),
expand = c(0, 0))+
theme_bw() +
theme(panel.grid = element_line(linetype = 1, color = "black"))
Is there some less hacky way to have a transparent mask for these gridlines, so they only appear underneath a ribbon?
If gridlines the same color as the background are acceptable, you can remove the actual gridlines, then use geom_hline() and geom_vline() to make your own "gridlines" that will show on ribbons but be invisible against the background:
d1$y3 <- d1$x + 0.3
d1$y4 <- d1$x + 0.4
ggplot(d1) +
geom_ribbon(aes(x, ymin = y1, ymax = y2), alpha = 0.25) +
geom_ribbon(aes(x, ymin = y3, ymax = y4), alpha = 0.25, fill = "blue") +
# use geom_vline and geom_hline to plot "gridlines" on top of ribbons
geom_hline(yintercept = seq(0, 2, by = 0.25), colour = "white") +
geom_vline(xintercept = seq(0, 1, by = 0.25), colour = "white") +
scale_y_continuous(limits = c(.25, 2.0), expand = c(0, 0)) +
scale_x_continuous(limits = c(0, 1), expand = c(0, 0)) +
theme_bw() +
theme(panel.grid.minor = element_blank(), # remove actual gridlines
panel.grid.major = element_blank())
produces this:
This is still a workaround, and it will only make gridlines that match the background color, but it is easy to use with a variety of plots, such as the situation you mentioned with multiple ribbons (I've added a second ribbon to demonstrate that this works).

Add legend to manually added lines using ggplot

I'm trying to add the corresponding legend for 3 manually added lines using ggplot. My code is the following:
library(ggplot2)
df = data.frame(error = c(0.0832544999, 0.0226680026, 0.0082536264, 0.0049199958, 0.0003917755, 0.0003859976, 0.0003888253, 0.0003953918, 0.0003958398), sDev = c(8.188111e-03, 2.976161e-03, 1.466221e-03, 2.141425e-03, 2.126976e-05, 2.139364e-05, 2.169059e-05, 2.629895e-05, 2.745938e-05))
minimum <- 6
best.model <- 5
gplot <- ggplot(df, aes(x=1:length(error), y=error)) +
scale_x_continuous(breaks = seq_along(df$error)) +
geom_point(size = 3) +
geom_line() +
geom_errorbar(data = df, aes(x = 1:length(error), ymin = error - sDev, ymax = error + sDev),
width = 0.1) +
geom_hline(data = df, aes(yintercept = error[minimum] + sDev[minimum]), linetype = "dashed") +
geom_vline(xintercept = minimum, linetype = "dotted", color = "red", size = 1) +
geom_vline(xintercept = best.model, linetype = "dotted", color = "blue", size = 1) +
theme_gray(base_size = 18) +
theme(axis.text = element_text(color = "black")) +
labs(x = "# of parameters", fontface = "bold") +
labs(y = "CV error") +
labs(title = "Cross-validation error curve")
I'd like to know how to add the legends for the 3 dotted lines in black, red, and blue.
Thanks a lot in advance!
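The trick is to map linetype and colour inside aes(), so that ggplot2 builds a legend for them, and then set the actual values with manual scales: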
gplot <- ggplot(df, aes(x=1:length(error), y=error)) +
scale_x_continuous(breaks = seq_along(df$error)) +
geom_point(size = 3) +
geom_line() +
geom_errorbar(data = df, aes(x = 1:length(error), ymin = error - sDev, ymax = error + sDev),
width = 0.1) +
geom_hline(data = df, aes(yintercept = error[minimum] + sDev[minimum], linetype="a", colour="a")) +
geom_vline(data = data.frame(type = "b", col = "b", minimum = minimum),
aes(linetype = type, colour = col, xintercept = minimum), size = 1, show.legend = TRUE) +
# show.legend replaces the deprecated show_guide argument
geom_vline(data = data.frame(best.model = best.model),
aes(linetype = "c", colour = "c", xintercept = best.model), size = 1, show.legend = TRUE) +
scale_colour_manual(name="Legend", values = c("a" = "black", "b" = "red", "c" = "blue")) +
scale_linetype_manual(name="Legend", values = c("a" = "dashed", "b" = "dotted", "c" = "dotted")) +
theme_gray(base_size = 18) +
theme(axis.text = element_text(color = "black"),
legend.key.height = grid::unit(0.1, "npc")) +
labs(x = "# of parameters", fontface = "bold") +
labs(y = "CV error") +
labs(title = "Cross-validation error curve")
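The same idea can be written more compactly by putting the manually added lines into their own data frame and mapping colour to a label column. This is a minimal sketch, not the answerer's code: the names lines_df, k, and the labels "minimum" and "best model" are introduced here for illustration, and only the vertical lines are covered.

```r
library(ggplot2)

df <- data.frame(
  k     = 1:9,
  error = c(0.0832544999, 0.0226680026, 0.0082536264, 0.0049199958,
            0.0003917755, 0.0003859976, 0.0003888253, 0.0003953918,
            0.0003958398)
)

# One row per manually added vertical line; the label column drives the legend
lines_df <- data.frame(x = c(6, 5), label = c("minimum", "best model"))

p <- ggplot(df, aes(k, error)) +
  geom_point(size = 3) +
  geom_line() +
  geom_vline(data = lines_df, aes(xintercept = x, colour = label),
             linetype = "dotted") +
  scale_colour_manual(name = "Legend",
                      values = c("minimum" = "red", "best model" = "blue")) +
  labs(x = "# of parameters", y = "CV error",
       title = "Cross-validation error curve")
```

Because colour is mapped inside aes(), the legend is generated automatically, and scale_colour_manual only decides which colour each label receives.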
