How can I make a customized plot in R - r

I want to create the following plot.
The x-axis ranges from 1 to 9, and the y-axis ranges from -0.5 to +0.5. I have also specified the colours within the boxes.

First I created some reproducible data with Y values (plotted as factor levels) and X values. You can assign the correct and incorrect colors in a new column using case_when. Use geom_col to draw the bars and scale_fill_manual to set the colors and their legend labels. Here is a reproducible example:
# Data
# rep() has no `n` argument; `times` repeats 1:9 once per Y level
df <- data.frame(Y = rep(c(0.3, -0.1, -0.3), each = 9),
                 X = rep(1:9, times = 3))
library(dplyr)
library(ggplot2)
df %>%
mutate(color = case_when(Y == 0.3 | Y == -0.3 ~ 'orange',
TRUE ~ 'grey')) %>%
ggplot(aes(x = X, y = factor(Y), fill = color)) +
geom_col(width = 1) +
scale_fill_manual('', values = c('orange' = 'orange', 'grey' = 'grey'),
labels = c('Correct', 'Incorrect')) +
theme_classic() +
labs(y = 'Y', x = '')
Created on 2022-12-03 with reprex v2.0.2
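A minimal sketch of a variation on the example above: passing `breaks` to scale_fill_manual pins each label to a specific colour. Without `breaks`, labels are matched to legend entries by position, and for a character column those entries are typically ordered alphabetically ('grey' before 'orange'). Which colour should read 'Correct' is an assumption here, since the original figure is not available.

```r
library(dplyr)
library(ggplot2)

# Same data as above, with the colour column added up front
df <- data.frame(Y = rep(c(0.3, -0.1, -0.3), each = 9),
                 X = rep(1:9, times = 3)) %>%
  mutate(color = case_when(Y %in% c(0.3, -0.3) ~ 'orange',
                           TRUE ~ 'grey'))

p <- ggplot(df, aes(x = X, y = factor(Y), fill = color)) +
  geom_col(width = 1) +
  scale_fill_manual('',
                    values = c('orange' = 'orange', 'grey' = 'grey'),
                    breaks = c('orange', 'grey'),         # legend order, stated explicitly
                    labels = c('Correct', 'Incorrect')) + # assumed mapping: orange = Correct
  theme_classic() +
  labs(y = 'Y', x = '')

p
```

If the mapping should be the other way round, swap the two entries in `labels` and leave `breaks` as it is.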
Update
Slightly modify the data:
df <- data.frame(Y = rep(c(0.45, 0.25, 0.05, -0.05, -0.25, -0.45), each = 9),
                 X = rep(1:9, times = 6))
library(dplyr)
library(ggplot2)
df %>%
mutate(color = case_when(Y %in% c(-0.45, 0.45, -0.25, 0.25) ~ 'orange',
TRUE ~ 'grey')) %>%
ggplot(aes(x = X, y = factor(Y), fill = color)) +
geom_col(width = 1) +
scale_fill_manual('', values = c('orange' = 'orange', 'grey' = 'grey'),
labels = c('Correct', 'Incorrect')) +
theme_classic() +
labs(y = 'Y', x = '')
Update to axis
You can use the following code:
# one bar of length 9 per Y level
df <- data.frame(Y = c(0.45, 0.25, 0.05, -0.05, -0.25, -0.45),
                 X = rep(9, times = 6))
library(dplyr)
library(ggplot2)
df %>%
mutate(color = case_when(Y %in% c(-0.45, 0.45, -0.25, 0.25) ~ 'orange',
TRUE ~ 'grey')) %>%
ggplot(aes(x = X, y = factor(Y), fill = color)) +
geom_col(width = 1) +
scale_fill_manual('', values = c('orange' = 'orange', 'grey' = 'grey'),
labels = c('Correct', 'Incorrect')) +
theme_classic() +
labs(y = 'Y', x = '') +
coord_cartesian(expand = FALSE, xlim = c(1, NA)) +
scale_x_continuous(breaks = seq(1, 9, by = 1))

Related

Joining 2 bar columns in barcharts with curved line

I have below ggplot:
library(ggplot2)
data = rbind(data.frame('val' = c(10, 30, 15), 'name' = c('A', 'B', 'C'), group = 'gr1'), data.frame('val' = c(30, 40, 12), 'name' = c('A', 'B', 'C'), group = 'gr2'))
ggplot(data, # Draw barplot with grouping & stacking
aes(x = group,
y = val,
fill = name)) +
geom_bar(stat = "identity",
position = "stack", width = .1)
With this, I am getting below plot
However, I want to connect these bars with a curved area where the area would be equal to the value of the corresponding bar-component. A close example could be like,
Is there any way to achieve this with ggplot?
Any pointer will be very helpful.
This is something like an alluvial plot. There are various extension packages that could help you create such a plot, but it is possible to do it in ggplot directly using a bit of data manipulation.
library(tidyverse)
alluvia <- data %>%
group_by(name) %>%
summarize(x = seq(1, 2, 0.01),
val = pnorm(x, 1.5, 0.15) * diff(val) + first(val))
ggplot(data,
aes(x = as.numeric(factor(group)),
y = val,
fill = name)) +
geom_bar(stat = "identity",
position = "stack", width = .1) +
geom_area(data = alluvia, aes(x = x), position = "stack", alpha = 0.5) +
scale_x_continuous(breaks = 1:2, labels = levels(factor(data$group)),
name = "Group", expand = c(0.25, 0.25)) +
scale_fill_brewer(palette = "Set2") +
theme_light(base_size = 20)
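The curved bands in the code above come from scaling a normal CDF between the value at the first bar and the value at the second. A minimal base R sketch of that interpolation follows; the function name `sigmoid_path` and its argument names are hypothetical, chosen only for illustration.

```r
# Smooth S-shaped transition from `from` (at x = 1) to `to` (at x = 2).
# pnorm() rises from about 0 to about 1 around `centre`, so the result
# starts near `from` and ends near `to`.
sigmoid_path <- function(from, to,
                         x = seq(1, 2, by = 0.01),
                         centre = 1.5, spread = 0.15) {
  from + (to - from) * pnorm(x, mean = centre, sd = spread)
}

# Component "A" goes from 10 in gr1 to 30 in gr2
path_A <- sigmoid_path(from = 10, to = 30)
range(path_A)
```

A smaller `spread` makes the transition steeper; a larger one makes the band leave the bars at a visible offset, because the CDF no longer reaches 0 and 1 at the bar positions.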
EDIT
A more generalized solution for more than 2 groups would be
library(tidyverse)
alluvia <- data %>%
mutate(group = as.numeric(factor(group)),
name = factor(name)) %>%
arrange(group) %>%
group_by(name) %>%
mutate(next_group = lead(group),
next_val = lead(val)) %>%
filter(!is.na(next_val)) %>%
group_by(name, group) %>%
summarise(x = seq(group + 0.01, next_group - 0.01, 0.01),
val = (next_val - val) * pnorm(x, group + 0.5, 0.15) + val)
ggplot(data,
aes(x = as.numeric(factor(group)),
y = val,
fill = name)) +
geom_bar(stat = "identity",
position = "stack", width = .1) +
geom_area(data = alluvia, aes(x = x), position = "stack", alpha = 0.5) +
scale_x_continuous(breaks = seq(length(unique(data$group))),
labels = levels(factor(data$group)),
name = "Group", expand = c(0.25, 0.25)) +
scale_fill_brewer(palette = "Set2") +
theme_light(base_size = 20)

add pvalue bars to facet plot with "fill" sub-group

I have been looking for a solution for a long time without success, so it's time to ask for help.
I would like to add p-values to boxplots organized with facet_wrap (ggplot2), similar to what the script below produces. The first part of the script is an example of what I want and works well for a single plot; the second part uses facets and doesn't work.
I would like to add p-values between all "dose" values of "OJ", the same for "VC", and also between, for example, "dose" = 1 of OJ and VC (as in the plot). This works well for a single plot, but not with facet_wrap. The error message is:
Error: Assigned data value must be compatible with existing data.
x Existing data has 6 rows.
x Assigned data has 60 rows.
ℹ Only vectors of size 1 are recycled.
Thanks for your help (if only...)
The script:
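################# DATAFRAME
library(dplyr)    # provides %>%
library(ggplot2)
library(ggpubr)   # provides stat_pvalue_manual()
data("ToothGrowth")
df <- ToothGrowth
vec <- c("A","B")
df$dose <- as.character(df$dose)
# alternate the two facet labels (in random order) over all rows
df$facet <- rep(sample(vec, 2), times = nrow(df) / 2)
View(df)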
################### STAT
df_pval <- df %>%
rstatix::group_by(dose) %>%
rstatix::wilcox_test(len ~ supp) %>%
rstatix::add_xy_position()
df_pval2 <- df %>%
rstatix::group_by(supp) %>%
rstatix::wilcox_test(len ~ dose) %>%
rstatix::add_xy_position(x = "supp", dodge = 0.8)
################### PLOT
plotx <- ggplot(df, aes(x = supp, y = len)) +
geom_boxplot(aes(fill = dose)) +
stat_pvalue_manual(df_pval,
label = "{p}",
color = "dose",
fontface = "bold",
step.group.by = "dose",
step.increase = 0.1,
tip.length = 0,
bracket.colour = "black",
show.legend = FALSE) +
stat_pvalue_manual(df_pval2,
label = "{p}",
color = "black",
fontface = "bold",
step.group.by = "supp",
step.increase = 0.1,
tip.length = 0,
bracket.colour = "black",
show.legend = FALSE)
plot(plotx)
################### STAT FACET
df_pval3 <- df %>%
rstatix::group_by(dose, facet) %>%
rstatix::wilcox_test(len ~ supp) %>%
rstatix::add_xy_position()
df_pval4 <- df %>%
rstatix::group_by(supp, facet) %>%
rstatix::wilcox_test(len ~ dose) %>%
rstatix::add_xy_position(x = "supp", dodge = 0.8)
print(df_pval)
print(df_pval2)
###################### PLOT FACET
ploty <- ggplot(df, aes(x = supp, y = len)) +
geom_boxplot(aes(fill = dose)) +
facet_wrap(~df[,4]) + stat_pvalue_manual(df_pval3,
label = "{p}",
color = "dose",
fontface = "bold",
step.group.by = "dose",
step.increase = 0.1,
tip.length = 0,
bracket.colour = "black",
show.legend = FALSE) +
stat_pvalue_manual(df_pval4,
label = "{p}",
color = "black",
fontface = "bold",
step.group.by = "supp",
step.increase = 0.1,
tip.length = 0,
bracket.colour = "black",
show.legend = FALSE)
plot(ploty)
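One possible cause of the error above, offered as an assumption rather than a confirmed diagnosis: facet_wrap(~df[,4]) supplies a fixed 60-element vector, which cannot be matched to the much smaller p-value tables used by the stat_pvalue_manual layers (the message reports 6 rows). Faceting by column name lets each layer look up its own facet column. The sketch below uses only ggplot2, and `ann` is a hypothetical annotation table standing in for df_pval3 and df_pval4.

```r
library(ggplot2)

data("ToothGrowth")
df <- ToothGrowth
df$dose <- as.character(df$dose)
df$facet <- rep(c("A", "B"), times = nrow(df) / 2)

# Hypothetical 4-row annotation table with its own `facet` column
ann <- expand.grid(supp = c("OJ", "VC"), facet = c("A", "B"),
                   stringsAsFactors = FALSE)
ann$len <- 36
ann$label <- "p"

p <- ggplot(df, aes(x = supp, y = len)) +
  geom_boxplot(aes(fill = dose)) +
  geom_text(data = ann, aes(label = label)) +
  facet_wrap(~facet)  # column name, resolved separately in each layer's data

built <- ggplot_build(p)
```

Because df_pval3 and df_pval4 are grouped by facet, they should already carry a facet column, so replacing facet_wrap(~df[,4]) with facet_wrap(~facet) in the script may be enough.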

plot density plots with confidence intervals of 95% in R

I'm trying to draw multiple density plots in one plot for comparison purposes. I want them to show their 95% confidence interval, as in the following figure. I'm working with ggplot2, and my df is a long data frame of observations for a certain location that I would like to compare across different time intervals.
I've done some experimentation following this example, but I don't have the coding knowledge to achieve what I want.
What I have managed to do so far:
library(magrittr)
library(ggplot2)
library(dplyr)
build_object <- ggplot_build(
ggplot(data=ex_long, aes(x=val)) + geom_density())
plot_credible_interval <- function(
gg_density, # ggplot object that has geom_density
bound_left,
bound_right
) {
build_object <- ggplot_build(gg_density)
x_dens <- build_object$data[[1]]$x
y_dens <- build_object$data[[1]]$y
index_left <- min(which(x_dens >= bound_left))
index_right <- max(which(x_dens <= bound_right))
gg_density + geom_area(
data=data.frame(
x=x_dens[index_left:index_right],
y=y_dens[index_left:index_right]),
aes(x=x,y=y),
fill="grey",
alpha=0.6)
}
gg_density <- ggplot(data=ex_long, aes(x=val)) +
geom_density()
gg_density %>% plot_credible_interval(tab$q2.5[[40]], tab$q97.5[[40]])
Help would be much appreciated.
This is obviously on a different set of data, but this is roughly that plot with data from 2 t distributions. I've included the data generation in case it is of use.
library(tidyverse)
x1 <- seq(-5, 5, by = 0.1)
t_dist1 <- data.frame(x = x1,
y = dt(x1, df = 3),
dist = "dist1")
x2 <- seq(-5, 5, by = 0.1)
t_dist2 <- data.frame(x = x2,
y = dt(x2, df = 3),
dist = "dist2")
t_data = rbind(t_dist1, t_dist2) %>%
mutate(x = case_when(
dist == "dist2" ~ x + 1,
TRUE ~ x
))
p <- ggplot(data = t_data,
aes(x = x,
y = y )) +
geom_line(aes(color = dist))
plot_data <- as.data.frame(ggplot_build(p)$data)
bottom <- data.frame(plot_data) %>%
mutate(dist = case_when(
group == 1 ~ "dist1",
group == 2 ~ "dist2"
)) %>%
group_by(dist) %>%
slice_head(n = ceiling(nrow(.) * 0.1)) %>%
ungroup()
top <- data.frame(plot_data) %>%
mutate(dist = case_when(
group == 1 ~ "dist1",
group == 2 ~ "dist2"
)) %>%
group_by(dist) %>%
slice_tail(n = ceiling(nrow(.) * 0.1)) %>%
ungroup()
segments <- t_data %>%
group_by(dist) %>%
summarise(x = mean(x),
y = max(y))
p + geom_area(data = bottom,
aes(x = x,
y = y,
fill = dist),
alpha = 0.25,
position = "identity") +
geom_area(data = top,
aes(x = x,
y = y,
fill = dist),
alpha = 0.25,
position = "identity") +
geom_segment(data = segments,
aes(x = x,
y = 0,
xend = x,
yend = y,
color = dist,
linetype = dist)) +
scale_color_manual(values = c("red", "blue")) +
scale_linetype_manual(values = c("dashed", "dashed"),
labels = NULL) +
ylab("Density") +
xlab("\U03B2 for AQIv") +
guides(color = guide_legend(title = "p.d.f \U03B2",
title.position = "right",
labels = NULL),
linetype = guide_legend(title = "Mean \U03B2",
title.position = "right",
labels = NULL,
override.aes = list(color = c("red", "blue"))),
fill = guide_legend(title = "Rej. area \U03B1 = 0.05",
title.position = "right",
labels = NULL)) +
annotate(geom = "text",
x = c(-4.75, -4),
y = 0.35,
label = c("RK", "OK")) +
theme(panel.background = element_blank(),
panel.border = element_rect(fill = NA,
color = "black"),
legend.position = c(0.2, 0.7),
legend.key = element_blank(),
legend.direction = "horizontal",
legend.text = element_blank(),
legend.title = element_text(size = 8))
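The answer above shades the first and last 10% of the plotted points in each curve. If the goal is instead to shade the central 95% of a set of observations, as the question describes, one option is to compute the 2.5% and 97.5% quantiles and subset a kernel density estimate between them. This is a minimal sketch under that assumption; `obs` is simulated data standing in for ex_long, which is not shown in the question.

```r
library(ggplot2)

set.seed(1)
# Simulated observations; a hypothetical stand-in for ex_long$val
obs <- data.frame(val = rnorm(500, mean = 2, sd = 1))

# Empirical 2.5% and 97.5% quantiles of the observations
q <- unname(quantile(obs$val, probs = c(0.025, 0.975)))

# Kernel density estimate, kept as a data frame so it can be subset
d <- density(obs$val)
dens <- data.frame(x = d$x, y = d$y)
shade <- dens[dens$x >= q[1] & dens$x <= q[2], ]

p <- ggplot(dens, aes(x = x, y = y)) +
  geom_line() +
  geom_area(data = shade, fill = "grey", alpha = 0.6) +
  labs(x = "val", y = "Density")

p
```

For several time intervals, the same steps could be run per group and the resulting data frames combined with a grouping column mapped to colour and fill.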

ggplot scale alpha to only one variable

Is there a straightforward way to use alpha on only one variable using ggplot2?
I would have imagined that scale_alpha_manual(values = c(0, 1)) would work like scale_color_manual(). Ultimately, I am interested in doing an animation where a colour appears gradually.
library(dplyr)     # %>%
library(reshape2)  # melt()
library(ggplot2)
df = data.frame(time = 1:100, x1 = rnorm(100, 1, 5), x2 = rnorm(100, 1, 5)) %>%
melt(id.vars = 'time')
df %>%
ggplot(aes(time, value, colour = variable)) +
geom_line() +
scale_color_manual(values = c('black', 'blue')) +
scale_alpha_manual(values = c(0, 1))
I am trying to get something like this but with an alpha
You could use the alpha as an aesthetic:
library(dplyr)     # %>%
library(reshape2)  # melt()
library(ggplot2)
df = data.frame(time = 1:100, x1 = rnorm(100, 1, 5), x2 = rnorm(100, 1, 5)) %>%
melt(id.vars = 'time')
df %>%
ggplot(aes(time, value, colour = variable, alpha=variable)) +
geom_line() +
scale_color_manual(values = c('black', 'blue')) +
scale_alpha_manual(values = c(0.3, 1))
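A small variation on the code above, as a sketch: naming the entries of `values` ties each alpha to a specific level of `variable`, so the result does not depend on level order. The long-format data is built directly here so that no reshaping package is needed.

```r
library(ggplot2)

set.seed(42)
# Long-format data: one row per time point and variable
df <- data.frame(time = rep(1:100, times = 2),
                 variable = rep(c('x1', 'x2'), each = 100),
                 value = rnorm(200, 1, 5))

p <- ggplot(df, aes(time, value, colour = variable, alpha = variable)) +
  geom_line() +
  scale_color_manual(values = c(x1 = 'black', x2 = 'blue')) +
  scale_alpha_manual(values = c(x1 = 0.3, x2 = 1))  # x1 faded, x2 fully opaque

# Inspect the alpha actually assigned to each line
built <- ggplot_build(p)$data[[1]]
unique(built[, c('colour', 'alpha')])
```

Setting the alpha of one level to 0 hides that line entirely, which may be a starting point for the gradual appearance described in the question.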

R - (ggplot) Make geom_step jumps dashed

I'm plotting a discrete CDF. I have a few questions about geom_step that I couldn't answer by searching Google.
(1) Is it possible to make the line segment representing the jump dashed rather than solid, to better show what's going on?
(2) Is it possible to add the geom_point layers more efficiently than I do (with less copy and paste)?
Below is my current solution:
library(tidyverse)
library(ggthemes)
theme_set(theme_few())
x0 <- seq(-0.5, -0.01, by = 0.01)
x1 <- seq(0, 0.99, by = 0.02)
x2 <- seq(1, 1.99, by = 0.02)
x3 <- seq(2, 2.99, by = 0.02)
x35 <- seq(3, 3.49, by = 0.01)
x4 <- seq(3.5, 3.99, by = 0.01)
tibble_ex <- tibble(
x0 = x0,
x1 = x1,
x2 = x2,
x3 = x3,
x35 = x35,
x4 = x4
)
tibble_ex %>%
gather(x, xax, x0:x4) %>%
mutate(cdf = case_when(x == 'x0' ~ 0,
x == 'x1' ~ 1/2,
x == 'x2' ~ 3/5,
x == 'x3' ~ 4/5,
x == 'x35' ~ 9/10,
x == 'x4' ~ 1)) %>%
ggplot(aes(x = xax, y = cdf)) +
geom_step() +
geom_point(aes(x = 0, y = 0), size = 3, shape = 21, fill = 'white') +
geom_point(aes(x = 1, y = 0.5), size = 3, shape = 21, fill = 'white') +
geom_point(aes(x = 2, y = 3/5), size = 3, shape = 21, fill = 'white') +
geom_point(aes(x = 3, y = 4/5), size = 3, shape = 21, fill = 'white') +
geom_point(aes(x = 3.5, y = 9/10), size = 3, shape = 21, fill = 'white') +
geom_point(aes(x = 0, y = 0.5), size = 3, shape = 21, fill = 'black') +
geom_point(aes(x = 1, y = 3/5), size = 3, shape = 21, fill = 'black') +
geom_point(aes(x = 2, y = 4/5), size = 3, shape = 21, fill = 'black') +
geom_point(aes(x = 3, y = 9/10), size = 3, shape = 21, fill = 'black') +
geom_point(aes(x = 3.5, y = 1), size = 3, shape = 21, fill = 'black') +
labs(x = 'x', y = 'F(x)')
ggplot will be more powerful to use if you can put your data into a data frame and structure it so that the characteristics of your data can be mapped directly.
Here's a way to take your data and augment it with additional rows that represent the connecting points, by matching each x with the prior cdf value. I added a column, type, to keep track of which is which. I also arrange df so that geom_segment plots the points in the right order.
new_steps <-
tibble(x = c(0:3, 3.5, 4),
cdf = c(0, .5, .6, .8, .9, 1))
df <- new_steps %>%
mutate(type = "cdf") %>%
bind_rows(new_steps %>%
mutate(type = "prior",
cdf = lag(cdf))) %>%
drop_na() %>%
arrange(x, desc(type))
Then we can map the points' fill and the geom_segments' linetype to type.
ggplot(df) +
geom_point(aes(x, cdf, fill = type),
shape = 21) +
scale_fill_manual(values = c("black", "white")) +
geom_segment(aes(x = lag(x), y = lag(cdf),
xend = x, yend = cdf,
lty = type)) +
scale_linetype_manual(values = c("dashed", "solid"))
(1) No, there is not a built-in way to make the geom_step half-dashed. But if you post this as a separate question, perhaps someone will help create a new geom for this.
(2) The answer is to put the points you want plotted in a data frame, like anything else you might want to plot:
point_data = data.frame(x = rep(c(0, 1, 2, 3, 3.5), 2),
y = c(0, rep(c(.5, .6, .8, .9), 2), 1),
z = rep(c("a", "b"), each = 5))
# calling your gathered/mutated version of tibble_ex df
ggplot(df, aes(x = xax, y = cdf)) +
geom_step() +
geom_point(data = point_data, aes(x = x, y = y, fill = z), shape = 21) +
scale_fill_manual(values = c("white", "black"), guide = "none") +  # guide = FALSE is deprecated
labs(x = 'x', y = 'F(x)')
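Since geom_step itself has no option for dashed jumps, one workaround is to draw the CDF from two sets of segments: solid horizontal pieces and dashed vertical jumps. This is a minimal sketch using the jump positions and values from the question; the data frame names `steps` and `flat` are hypothetical.

```r
library(ggplot2)

# One row per jump: CDF value just before and at the jump
steps <- data.frame(x = c(0, 1, 2, 3, 3.5),
                    lower = c(0, .5, .6, .8, .9),
                    upper = c(.5, .6, .8, .9, 1))

# Horizontal pieces: the CDF is flat between consecutive jumps
flat <- data.frame(x = c(-0.5, steps$x),
                   xend = c(steps$x, 4),
                   y = c(0, steps$upper))

p <- ggplot() +
  geom_segment(data = flat,
               aes(x = x, xend = xend, y = y, yend = y)) +
  geom_segment(data = steps,
               aes(x = x, xend = x, y = lower, yend = upper),
               linetype = "dashed") +
  # open points mark the excluded value, filled points the attained value
  geom_point(data = steps, aes(x = x, y = lower),
             shape = 21, fill = "white", size = 3) +
  geom_point(data = steps, aes(x = x, y = upper),
             shape = 21, fill = "black", size = 3) +
  labs(x = "x", y = "F(x)")

p
```

The points are added after the segments so that the markers sit on top of the line ends.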
For the second part of your question, you can put all the coordinates in a separate data frame and call geom_point only once:
ddf <- data.frame(xax = rep(c(0:3, 3.5), 2),
cdf = c(0, .5, .6, .8, .9, .5, .6, .8, .9, 1),
col = rep(c("white", "black"), each = 5))
dev.new()
tibble_ex %>%
gather(x, xax, x0:x4) %>%
mutate(cdf = case_when(x == 'x0' ~ 0,
x == 'x1' ~ 1/2,
x == 'x2' ~ 3/5,
x == 'x3' ~ 4/5,
x == 'x35' ~ 9/10,
x == 'x4' ~ 1)) %>%
ggplot(aes(x = xax, y = cdf)) +
geom_step() +
geom_point(data = ddf, aes(fill = I(col)), size = 3, shape = 21) +
labs(x = 'x', y = 'F(x)')
