ggdist stat_halfeye not scaling correctly - r

I seem to have an error in the way my distribution looks. The bottom ridges of each of the faceted graphs are not at the same scale as the other ridges above, or relative to the number of counts (i.e. scale dots shown).
Is there a way to scale all distributions relative to one another?
season_names <- c(`0` = "COOL", `1` = "HOT-DRY",`2` = "HOT-WET")
dCLEAN %>%
ggplot(aes(x = tdb, y = as.factor(tsv), fill = as.factor(season))) +
ggdist::stat_halfeye(
adjust = 0.9,
justification = -0.15,
.width = 0,
point_colour = NA) +
geom_boxplot(
width = 0.2,
outlier.colour = NA,
alpha = 0.5)+
ggdist::stat_dots(
side = "left",
justification = 1.18,
binwidth = 0.1) +
facet_wrap(~ season, labeller = as_labeller(season_names)) +
theme_bw() +
theme(strip.background = element_rect(fill="white")) +
theme(legend.position = "none") +
scale_color_grey()+
scale_fill_grey()
Image of current graph (errors seem to be in the Cool graph's -2 distribution, the Hot-Dry graph's -1 distribution, and the Hot-Wet graph's -1 distribution)

By default, the densities are scaled to have equal area regardless of the number of observations. If you wish to scale the areas according to the number of observations, you can set aes(thickness = stat(pdf*n)) in stat_halfeye(); in ggplot2 3.3.0 and later, after_stat(pdf*n) is the preferred spelling of the same mapping. This sets the thickness of the slab according to the product of two computed variables generated by stat_halfeye(): the density (pdf) and the number of observations per group (n).
Here's an example from the raincloud plots section of the dotsinterval vignette:
set.seed(12345) # for reproducibility
data.frame(
abc = c("a", "b", "b", "c"),
value = rnorm(200, c(1, 8, 8, 3), c(1, 1.5, 1.5, 1))
) %>%
ggplot(aes(y = abc, x = value, fill = abc)) +
stat_slab(aes(thickness = stat(pdf*n)), scale = 0.7) +
stat_dotsinterval(side = "bottom", scale = 0.7, slab_size = NA) +
scale_fill_brewer(palette = "Set2")
This example uses stat_slab() in place of stat_halfeye(). stat_slab() is a shortcut stat equivalent to stat_halfeye() but without the point and the interval (so it saves you passing .width = 0 and point_color = NA to stat_halfeye()).
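To see why multiplying the density by n changes the relative sizes, here is a minimal base R sketch. It does not use ggdist; density() stands in for whatever density estimate the stat computes, and the two groups (small_group, large_group) and the helper density_area() are hypothetical names made up for illustration.

```r
set.seed(12345) # for reproducibility

# Two hypothetical groups from the same distribution but with very different sizes
small_group <- rnorm(20)
large_group <- rnorm(200)

# Approximate the area under a kernel density estimate with a Riemann sum
density_area <- function(d, weight = 1) {
  sum(d$y * weight) * diff(d$x[1:2])
}

d_small <- density(small_group)
d_large <- density(large_group)

# Unweighted: both areas are close to 1, so a sparse group is drawn as "big" as a dense one
area_small <- density_area(d_small)
area_large <- density_area(d_large)

# Weighted by n (the analogue of pdf * n): areas now track the group sizes
area_small_n <- density_area(d_small, weight = length(small_group))
area_large_n <- density_area(d_large, weight = length(large_group))

round(c(area_small, area_large, area_small_n, area_large_n), 2)
```

The unweighted areas are what make a ridge with few observations look as prominent as its neighbours; the weighted ones are what the thickness mapping asks for.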

how to plot a bidirectional flip bargraph with count data?

I am analysing some data with a binomial distribution.
We have 2 possible choices for a stimulus, and patients (female and male) have to decide whether they feel pain (1) or not (0).
I would like to plot a bargraph showing the number of patients who choose 0 or 1, in a rotated way.
An idea of the graph I am looking for is the following, from Sevarika et al, 2022.
#my data
id<-c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10)
trt<-c("C","E","C","E","C","E","C","E","C","E","C","E","C","E","C","E","C","E","E","C")
response<-c(0,1,0,1,0,1,1,1,0,1,0,1,0,1,1,1,0,1,0,1)
sex<-c(rep("male",5),rep("female",5))
data<-data.frame(id,trt,response,sex)
So my objective is a flipped bar plot where females and males are separated, and the number of 1 or 0 is shown on each side of the axis. I mean, where it says control, let it say 0, where it says treatment let it say 1, and the top bar should be males and the bottom bar should be females.
Thank you very much, best regards
Base R way:
# compute percentages
tab <- t(table(data$response, data$sex) * c(-1, 1))
tab <- tab / rowSums(abs(tab)) * 100
# positions of x axis labels
lab.x <- seq(-100, 100, 25)
# initiate new plot
frame()
par(mar=c(2.5, 1, 2, 1))
plot.window(range(lab.x), c(0, 1))
# draw x axis
axis(1, at=lab.x, labels=abs(lab.x))
# draw vertical lines
abline(v=0)
abline(v=c(-1, 1)*50, lty=2)
# bar middle y coordinates
bar.mid <- c(.3, .7)
# bar height
bar.ht <- .25
# draw bars
rect(c(tab), bar.mid - bar.ht/2, 0, bar.mid + bar.ht/2,
col=rep(gray(c(.8, .2)), each=2))
mtext(c('No pain', 'Pain'), 3, at=c(-1, 1)*40, cex=1.3, line=.5)
# print bar labels (male, female)
text(min(lab.x), bar.mid, rownames(tab), adj=0)
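The key step in the base R code above is the signed percentage table, so it may help to look at it on its own. This sketch rebuilds it from the question's response vector; the only change is that sex is written out to full length (the question's data.frame() call relies on recycling a length-10 vector), and the intermediate names counts and signed are introduced here for readability.

```r
# Same responses as in the question
response <- c(0,1,0,1,0,1,1,1,0,1,0,1,0,1,1,1,0,1,0,1)
# sex written out in full; matches what data.frame() produces by recycling
sex <- rep(rep(c("male", "female"), each = 5), times = 2)

counts <- table(response, sex)             # rows: response 0/1, columns: female/male
signed <- t(counts * c(-1, 1))             # negate the response == 0 row, then put sex in rows
tab <- signed / rowSums(abs(signed)) * 100 # percentages within each sex

tab
```

Negative entries are drawn to the left of zero ("No pain") and positive entries to the right ("Pain"), which is why rect() can use the table values directly as bar endpoints.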
You probably need to wrangle your data into a more appropriate format for plotting. Here's one method of doing it:
library(tidyverse)
data %>%
count(response, sex) %>%
mutate(n = ifelse(response == 0, -n, n)) %>%
ggplot(aes(sex, n, fill = factor(response))) +
geom_hline(yintercept = 0) +
geom_hline(yintercept = c(-3, 3), linetype = 2, size = 0.2) +
geom_col(position = 'identity', color = 'black', width = 0.5) +
coord_flip() +
scale_y_continuous(breaks = seq(-7, 7), name = 'count', limits = c(-7, 7)) +
scale_fill_manual(values = c("#bebebe", "#2a2a2a"), guide = 'none') +
annotate('text', y = c(-4, 4), x = c(2.8, 2.8), vjust = 1, size = 6,
label = c('RESPONSE = 0', 'RESPONSE = 1'), fontface = 2) +
scale_x_discrete(expand = c(0, 1), name = NULL) +
theme_minimal(base_size = 16) +
theme(axis.line.x = element_line(),
axis.ticks.x = element_line(),
panel.grid = element_blank())

Combined bar plot and points with offset in points when X values are not numbers ggplot2

I'm trying to obtain a plot like the one shown in Combined bar plot and points in ggplot2, with the points off to the side of the bars. One of the answers suggests subtracting an offset from the x values in geom_point(), and that works; but my problem comes when x values are not numbers, since I can't subtract a number from them.
For example, this works:
df = data.frame(Xval = c(2, 4, 6), Yval = c(5, 6.1, 5.4))
ggplot() +
geom_bar(df, mapping = aes(Xval, Yval), stat = "identity", width = 0.5, color = "black", fill = "#92DAB8") +
geom_point(df, mapping = aes(Xval-.5, Yval))
But this does not work:
df2 = data.frame(Xval = c("A", "B", "C"), Yval = c(5, 6.1, 5.4))
ggplot() +
geom_bar(df2, mapping = aes(Xval, Yval), stat = "identity", width = 0.5, color = "black", fill = "#92DAB8") +
geom_point(df2, mapping = aes(Xval-.5, Yval))
Is there any way to do the offset like in the first plot? Ideally, I would like to have a solution that works in both cases since I want to make a "plotter" function and you wouldn't know beforehand whether the values of X are numbers, but any solution (even one that implies doing the trick in two different ways depending on the X values) will be helpful. Maybe there is a way to get the actual X values in the plot, but I searched for solutions following this idea and found nothing. Thanks in advance!
In your case one option would be to use position_nudge to shift the points:
plot_fun <- function(.data, nudge_x = 0) {
ggplot(.data) +
geom_bar(aes(Xval, Yval), stat = "identity", width = 0.5, color = "black", fill = "#92DAB8") +
geom_point(aes(Xval, Yval), position = position_nudge(x = nudge_x))
}
library(ggplot2)
df = data.frame(Xval = c(2, 4, 6), Yval = c(5, 6.1, 5.4))
df2 = data.frame(Xval = c("A", "B", "C"), Yval = c(5, 6.1, 5.4))
plot_fun(df, nudge_x = -.5)
plot_fun(df2, nudge_x = -.5)
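The reason position_nudge works for non-numeric x is that ggplot2 places discrete values at integer positions (1, 2, 3, ...) before position adjustments are applied. A small sketch to check this, using layer_data() to read back the positions ggplot2 computed; the object names p_nudge and nudged_x are just illustrative.

```r
library(ggplot2)

df2 <- data.frame(Xval = c("A", "B", "C"), Yval = c(5, 6.1, 5.4))

p_nudge <- ggplot(df2) +
  geom_point(aes(Xval, Yval), position = position_nudge(x = -0.5))

# layer_data() returns a layer's data after scales and position adjustments
nudged_x <- as.numeric(layer_data(p_nudge, 1)$x)
nudged_x
```

The nudged positions sit half a unit to the left of the integer slots used for "A", "B" and "C", which is the same offset the numeric version gets from Xval - .5.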

How to draw both positive mirror bar graph in r?

I would like to draw a graph similar to the image here:
I tried to find similar mirror bar graphs on google, but I could not find similar graph to the image above.
Tricky parts of the graph are that 1) both +ve and -ve y axis have positive values, and 2) both +ve and -ve y axis have different y-axis labellings.
Thank you in advance for your help.
This is as close as I could get so far to that graph.
It's really tricky:
The Y axis has to show positive values on the negative side.
On the negative side, the numbers have to look 5 times smaller because the labels on that half of the Y axis are 5 times smaller [from 1 to 5 instead of 1 to 25].
Uncertainty bars need to be drawn.
X labels are doubled.
What I couldn't do:
set up the Y axis names in a proper manner, [if anyone knows and can help..!]
understand what a and b are and with which logic to place them [you need to explain this one better]
library(dplyr)
library(ggplot2)
# your data
n <- 100
set.seed(42)
df <- tibble(var1 = factor(rep(c("Mamou", "Crowley"), each = 8 * n), levels = c("Mamou", "Crowley"), ordered = TRUE),
var2 = factor(rep(c("RWW-M1", "RWW-M2", "RWW-C1", "RWW-C2"), each = 4* n), levels = c("RWW-M1", "RWW-M2", "RWW-C1", "RWW-C2"), ordered = TRUE),
var3 = factor(rep(rep(c("Shoot dry weight (g)", "Root dry weight (g)"), each = 2*n), 4), levels = c("Shoot dry weight (g)", "Root dry weight (g)"), ordered = TRUE),
varc = rep(rep(c("white", "black"), each = n), 8),
value = abs(c(
rnorm(2*n, mean = 5 , sd = 0.2),
rnorm(2*n, mean = 3 , sd = 0.04),
rnorm(2*n, mean = 15 , sd = 0.2),
rnorm(2*n, mean = 4 , sd = 0.04),
rnorm(2*n, mean = 5 , sd = 0.2),
rnorm(2*n, mean = 2.5, sd = 0.04),
rnorm(2*n, mean = 5 , sd = 0.2),
rnorm(2*n, mean = 2.5, sd = 0.04))))
# edit your data this way [a little trick to set bars up and down the line and make them look like 5 times bigger]
df <- df %>% mutate(value = if_else(var3 == "Root dry weight (g)", -value*5, value))
# calculate statistics you want to plot
df <- df %>%
group_by(var1, var2, var3, varc) %>%
summarise(mean = mean(value), min = min(value), max = max(value)) %>%
ungroup()
df %>%
ggplot(aes(x = var2)) +
# plot dodged bars
geom_col(aes(y = mean, fill = varc),
position = position_dodge(width = 0.75),
colour = "black", width = 0.5) +
# plot dodged errorbars
geom_errorbar(aes(ymin = min, ymax = max, group = varc),
position = position_dodge(width = 0.75), width = 0.2, size = 1) +
# make line on zero more visible
geom_hline(aes(yintercept = 0)) +
# set up colour of the bars, don't show legend
scale_fill_manual(values = c("white", "gray75"), guide = "none") +
# set up labels of y axis
# dont change positive, make negative look positive and 5 times smaller
# set up breaks every 5 [ggplot will calc labels after breaks]
scale_y_continuous(labels = function(x) if_else(x<0, -x/5, x),
breaks = function(x) as.integer(seq(x[1]-x[1]%%5, x[2]-x[2]%%5, 5))) +
# put labels and x axis on top
scale_x_discrete(position = "top") +
# set up var1 labels on top
facet_grid( ~ var1, space = 'free', scales = 'free') +
# show proper axis names
labs(x = "", y = "Root dry weight (g) Shoot dry weight (g)") +
# set up theme
theme_classic() +
theme(axis.line.x = element_blank(),
axis.ticks.x = element_blank(),
panel.grid = element_blank(),
# this is to put names of facet grid on top
strip.placement = 'outside',
# this is to remove background from labels on facet grid
strip.background = element_blank(),
# this is to make facets close to each other
panel.spacing.x = unit(0,"line"))
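The relabelling trick in scale_y_continuous() above is easier to follow in isolation. This is a base R sketch of the same labelling rule applied to a hand-picked set of break positions; relabel and the breaks vector are illustrative stand-ins, not part of the answer's code, and ifelse() is used in place of dplyr's if_else().

```r
# Break positions as they sit on the plotted axis: root values were multiplied by -5
breaks <- seq(-25, 15, by = 5)

# Base R equivalent of the labelling function passed to scale_y_continuous()
relabel <- function(x) ifelse(x < 0, -x / 5, x)

axis_labels <- relabel(breaks)
data.frame(position = breaks, label = axis_labels)
```

Positions below zero get their sign flipped and are divided by 5, undoing the -value*5 transformation for display only, while positions at or above zero are labelled as they are.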
Something like this perhaps?
library(ggplot2)
df <- data.frame(x = rep(letters[1:3], each = 4),
y = c(2, -2, 3, -3, 4, -4, 5, -5, 2, -2, 3, -3),
dodgegroup = factor(rep(rep(1:2, each = 2), 3)))
ggplot(df, aes(x, y, fill = dodgegroup)) +
geom_col(position = position_dodge(width = 0.75),
colour = "black", width = 0.5) +
geom_hline(aes(yintercept = 0)) +
scale_fill_manual(values = c("white", "gray75")) +
scale_y_continuous(breaks = 0:10 - 5,
labels = c(5:0, 5 * 1:5)) +
theme_classic()
Created on 2020-08-07 by the reprex package (v0.3.0)
Try this. While the answer by Edo looks most like what you have asked for, this method does not need you to transform your data. However, the scale on both sides of the axis is the same.
Call geom_col twice but with - before the values for Root, then we use labels=abs to make both sides of the y-axis positive numbers:
Edit - fixed the y-axis
library(ggplot2)
df <- data.frame(x = rep(c("RWW-M1", "RWW-M2", "RWW-C1", "RWW-C2"), each = 2),
Shoot = c(5, 6, 7, 8, 4, 5, 5, 7),
Root = c(1, 2, 3, 4, 2, 3, 1, 2),
Condition = rep(c("control", "test"), each = 1))
p <- ggplot(df, aes(x=x, fill=Condition)) +
geom_col(aes(y=Shoot), position = position_dodge(width = 0.75), width = 0.5, colour = "black")+
geom_col(aes(y=-Root), position = position_dodge(width = 0.75), width = 0.5, colour = "black")+
geom_hline(aes(yintercept = 0)) +
scale_fill_manual(values = c("white", "gray75")) +
ylab("Root weight (g) / Shoot weight (g)")+
xlab("")+
scale_y_continuous(breaks = 0:15 - 5, labels=abs) +
theme_bw()
p

How to stop ggrepel labels moving between gganimate frames in R/ggplot2?

I would like to add labels to the end of lines in ggplot, avoid them overlapping, and avoid them moving around during animation.
So far I can put the labels in the right place and hold them static using geom_text, but the labels overlap, or I can prevent them overlapping using geom_text_repel but the labels do not appear where I want them to and then dance about once the plot is animated (this latter version is in the code below).
I thought a solution might involve effectively creating a static layer in ggplot (p1 below) then adding an animated layer (p2 below), but it seems not.
How do I hold some elements of a plot constant (i.e. static) in an animated ggplot? (In this case, the labels at the end of lines.)
Additionally, with geom_text the labels appear as I want them - at the end of each line, outside of the plot - but with geom_text_repel, the labels all move inside the plotting area. Why is this?
Here is some example data:
library(dplyr)
library(ggplot2)
library(gganimate)
library(ggrepel)
set.seed(99)
# data
static_data <- data.frame(
hline_label = c("fixed_label_1", "fixed_label_2", "fixed_label_3", "fixed_label_4",
"fixed_label_5", "fixed_label_6", "fixed_label_7", "fixed_label_8",
"fixed_label_9", "fixed_label_10"),
fixed_score = c(2.63, 2.45, 2.13, 2.29, 2.26, 2.34, 2.34, 2.11, 2.26, 2.37))
animated_data <- data.frame(condition = c("a", "b")) %>%
slice(rep(1:n(), each = 10)) %>%
group_by(condition) %>%
mutate(time_point = row_number()) %>%
ungroup() %>%
mutate(score = runif(20, 2, 3))
and this is the code I am using for my animated plot:
# colours for use in plot
condition_colours <- c("red", "blue")
# plot static background layer
p1 <- ggplot(static_data, aes(x = time_point)) +
scale_x_continuous(breaks = seq(0, 10, by = 2), expand = c(0, 0)) +
scale_y_continuous(breaks = seq(2, 3, by = 0.10), limits = c(2, 3), expand = c(0, 0)) +
# add horizontal line to show existing scores
geom_hline(aes(yintercept = fixed_score), alpha = 0.75) +
# add fixed labels to the end of lines (off plot)
geom_text_repel(aes(x = 11, y = fixed_score, label = hline_label),
hjust = 0, size = 4, direction = "y", box.padding = 1.0) +
coord_cartesian(clip = 'off') +
guides(colour = "none") +
labs(title = "[Title Here]", x = "Time", y = "Mean score") +
theme_minimal() +
theme(panel.grid.minor = element_blank(),
plot.margin = margin(5.5, 120, 5.5, 5.5))
# animated layer
p2 <- p1 +
geom_point(data = animated_data,
aes(x = time_point, y = score, colour = condition, group = condition)) +
geom_line(data = animated_data,
aes(x = time_point, y = score, colour = condition, group = condition),
show.legend = FALSE) +
scale_color_manual(values = condition_colours) +
geom_segment(data = animated_data,
aes(xend = time_point, yend = score, y = score, colour = condition),
linetype = 2) +
geom_text(data = animated_data,
aes(x = max(time_point) + 1, y = score, label = condition, colour = condition),
hjust = 0, size = 4) +
transition_reveal(time_point) +
ease_aes('linear')
# render animation
animate(p2, nframes = 50, end_pause = 5, height = 1000, width = 1250, res = 120)
Suggestions for consideration:
The specific repelling direction / amount / etc. in geom_text_repel is determined by a random seed. You can set seed to a constant value in order to get the same repelled positions in each frame of animation.
I don't think it's possible for repelled text to go beyond the plot area, even if you turn off clipping & specify some repel range outside plot limits. The whole point of that package is to keep text labels away from one another while remaining within the plot area. However, you can extend the plot area & use geom_segment instead of geom_hline to plot the horizontal lines, such that these lines stop before they reach the repelled text labels.
Since there are more geom layers using animated_data as their data source, it would be cleaner to put animated_data & associated common aesthetic mappings in the top level ggplot() call, rather than static_data.
Here's a possible implementation. Explanation in annotations:
p3 <- ggplot(animated_data,
aes(x = time_point, y = score, colour = condition, group = condition)) +
# static layers (assuming 11 is the desired ending point)
geom_segment(data = static_data,
aes(x = 0, xend = 11, y = fixed_score, yend = fixed_score),
inherit.aes = FALSE, colour = "grey25") +
geom_text_repel(data = static_data,
aes(x = 11, y = fixed_score, label = hline_label),
hjust = 0, size = 4, direction = "y", box.padding = 1.0, inherit.aes = FALSE,
seed = 123, # set a constant random seed
xlim = c(11, NA)) + # specify repel range to be from 11 onwards
# animated layers (only specify additional aesthetic mappings not mentioned above)
geom_point() +
geom_line() +
geom_segment(aes(xend = time_point, yend = score), linetype = 2) +
geom_text(aes(x = max(time_point) + 1, label = condition),
hjust = 0, size = 4) +
# static aesthetic settings (limits / expand arguments are specified in coordinates
# rather than scales, margin is no longer specified in theme since it's no longer
# necessary)
scale_x_continuous(breaks = seq(0, 10, by = 2)) +
scale_y_continuous(breaks = seq(2, 3, by = 0.10)) +
scale_color_manual(values = condition_colours) +
coord_cartesian(xlim = c(0, 13), ylim = c(2, 3), expand = FALSE) +
guides(colour = "none") +
labs(title = "[Title Here]", x = "Time", y = "Mean score") +
theme_minimal() +
theme(panel.grid.minor = element_blank()) +
# animation settings (unchanged)
transition_reveal(time_point) +
ease_aes('linear')
animate(p3, nframes = 50, end_pause = 5, height = 1000, width = 1250, res = 120)

ggplot2 - separating box plot labels by colour

I am trying to create a box plot with labels for some of the individual data. The box plot is separated by two variables, mapped to x and colour. However when I add labels using geom_text_repel from the ggrepel package (necessary for the real data) they separate by x but not colour. See this minimal reproducible example:
library(ggplot2)
library(ggrepel)
## create dummy data frame
rep_id <- c("a", "a", "b", "b", "c", "c", "d", "d", "e", "e")
dil <- c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2)
bleach_time <- c(0, 24, 0, 24, 0, 24, 0, 24, 0, 24)
a_i <- c(0.1, 0.2, 0.35, 0.2, 0.01, 0.4, 0.23, 0.1, 0.2, 0.5)
iex <- data.frame(rep_id, dil, bleach_time, a_i)
rm(rep_id, dil, bleach_time, a_i)
## Plot box plot of a_i separated by bleach_time and dil
p <- ggplot(iex, aes(x = as.character(bleach_time), y = a_i, fill = as.factor(dil))) +
geom_boxplot() +
geom_text_repel(aes(label = rep_id, colour = as.factor(dil)), na.rm = TRUE, segment.alpha = 0)
p
As you can see the labels are colour coded, but they are all lined up around the centre of each pair of plots rather than separated by the plots. I've tried nudge_x but that moves all the labels together. Is there a way I can move each set of labels individually?
For comparison here is the plot of my full data set with the outliers labelled - you can see how each set of labels isn't centred around the points it's labelling, complicating interpretation:
It looks like geom_text_repel needs position = position_dodge(width = __), not just the position = "dodge" shorthand I'd suggested, hence the error. You can mess around with setting the width; 0.7 looked okay to me.
library(tidyverse)
library(ggrepel)
ggplot(iex, aes(x = as.character(bleach_time), y = a_i, fill = as.factor(dil))) +
geom_boxplot() +
geom_text_repel(aes(label = rep_id, colour = as.factor(dil)), na.rm = TRUE,
segment.alpha = 0, position = position_dodge(width = 0.7))
Since you're plotting distributions, it might be important to keep positions along the y-axis the same, and only let geom_text_repel jitter along the x-axis, so I repeated the plot with direction = "x", which made me notice something interesting...
ggplot(iex, aes(x = as.character(bleach_time), y = a_i, fill = as.factor(dil))) +
geom_boxplot() +
geom_text_repel(aes(label = rep_id, colour = as.factor(dil)), na.rm = TRUE,
segment.alpha = 0, position = position_dodge(width = 0.7), direction = "x")
There are a couple of labels being obscured because they have the same color as the fill of the boxplots! You can fix this with a better combination of color + fill palettes. The quick fix I did was turning down the luminosity of the color and turning up the luminosity of the fill in the scale_*_discrete calls to make them distinct (but also pretty ugly).
ggplot(iex, aes(x = as.character(bleach_time), y = a_i, fill = as.factor(dil))) +
geom_boxplot() +
geom_text_repel(aes(label = rep_id, colour = as.factor(dil)), na.rm = TRUE,
segment.alpha = 0, position = position_dodge(width = 0.7), direction = "x") +
scale_color_discrete(l = 30) +
scale_fill_discrete(l = 100)
Note that you can also adjust the force used in the repel, so if you need the labels to not overlap but to also hug closer to the middles of the boxplots, you can mess around with that setting as well.
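As a rough sketch of that last suggestion, the force argument of geom_text_repel() can be lowered from its default of 1 so labels are pushed apart less. The value 0.5 below is an arbitrary starting point rather than a recommendation, and iex is rebuilt here with data.frame() so the block runs on its own.

```r
library(ggplot2)
library(ggrepel)

# Same dummy data as in the question
iex <- data.frame(
  rep_id = c("a", "a", "b", "b", "c", "c", "d", "d", "e", "e"),
  dil = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  bleach_time = c(0, 24, 0, 24, 0, 24, 0, 24, 0, 24),
  a_i = c(0.1, 0.2, 0.35, 0.2, 0.01, 0.4, 0.23, 0.1, 0.2, 0.5)
)

p_force <- ggplot(iex, aes(x = as.character(bleach_time), y = a_i, fill = as.factor(dil))) +
  geom_boxplot() +
  geom_text_repel(aes(label = rep_id, colour = as.factor(dil)), na.rm = TRUE,
                  segment.alpha = 0, position = position_dodge(width = 0.7),
                  direction = "x",
                  force = 0.5) # weaker repulsion; assumed value, tune by eye

p_force
```

Smaller values keep labels closer to their dodged positions at the cost of more potential overlap, so it is worth trying a few values on the real data.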
