I am analysing some data with a binomial distribution.
We have 2 possible choices for a stimulus, and patients (female and male) have to decide whether they feel pain (1) or not (0).
I would like to plot a bar graph showing the number of patients who chose 0 or 1, with the bars rotated so that they run horizontally.
An example of the graph I am looking for is the following, from Sevarika et al., 2022.
#my data
id<-c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10)
trt<-c("C","E","C","E","C","E","C","E","C","E","C","E","C","E","C","E","C","E","E","C")
response<-c(0,1,0,1,0,1,1,1,0,1,0,1,0,1,1,1,0,1,0,1)
sex<-c(rep("male",5),rep("female",5))
data<-data.frame(id,trt,response,sex)
So my objective is a flipped bar plot where females and males are separated, and the number of 1s or 0s is shown on each side of the axis. That is, where the example says control it should say 0, where it says treatment it should say 1, the top bar should be males, and the bottom bar should be females.
Thank you very much, best regards
Base R way:
# compute percentages
tab <- t(table(data$response, data$sex) * c(-1, 1))
tab <- tab / rowSums(abs(tab)) * 100
# positions of x axis labels
lab.x <- seq(-100, 100, 25)
# initiate new plot
frame()
par(mar=c(2.5, 1, 2, 1))
plot.window(range(lab.x), c(0, 1))
# draw x axis
axis(1, at=lab.x, labels=abs(lab.x))
# draw vertical lines
abline(v=0)
abline(v=c(-1, 1)*50, lty=2)
# bar middle y coordinates
bar.mid <- c(.3, .7)
# bar height
bar.ht <- .25
# draw bars
rect(c(tab), bar.mid - bar.ht/2, 0, bar.mid + bar.ht/2,
col=rep(gray(c(.8, .2)), each=2))
mtext(c('No pain', 'Pain'), 3, at=c(-1, 1)*40, cex=1.3, line=.5)
# print bar labels (male, female)
text(min(lab.x), bar.mid, rownames(tab), adj=0)
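To see what the first two lines of this answer produce, here is a minimal sketch that rebuilds the question's response and sex columns and prints the signed percentage table. Multiplying the table by c(-1, 1) relies on R recycling the vector down each column, so the response = 0 row becomes negative. The name counts is introduced here only for illustration; nothing is plotted.

```r
# rebuild the two columns of the question's data that the table uses
response <- c(0,1,0,1,0,1,1,1,0,1,0,1,0,1,1,1,0,1,0,1)
sex <- c(rep("male", 5), rep("female", 5))
data <- data.frame(response, sex)   # sex is recycled from 10 to 20 rows

# counts: rows are response (0, 1), columns are sex (female, male)
counts <- table(data$response, data$sex)

# c(-1, 1) is recycled down each column, so the response = 0 row turns negative;
# t() then puts one sex per row
tab <- t(counts * c(-1, 1))

# rescale each row so its absolute values sum to 100
tab <- tab / rowSums(abs(tab)) * 100
print(tab)
```

With the question's data this gives -30 and 70 for females and -50 and 50 for males, which are the left and right bar ends passed to rect().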
You probably need to wrangle your data into a more appropriate format for plotting. Here's one method of doing it:
library(tidyverse)
data %>%
count(response, sex) %>%
mutate(n = ifelse(response == 0, -n, n)) %>%
ggplot(aes(sex, n, fill = factor(response))) +
geom_hline(yintercept = 0) +
geom_hline(yintercept = c(-3, 3), linetype = 2, size = 0.2) +
geom_col(position = 'identity', color = 'black', width = 0.5) +
coord_flip() +
scale_y_continuous(breaks = seq(-7, 7), name = 'count', limits = c(-7, 7)) +
scale_fill_manual(values = c("#bebebe", "#2a2a2a"), guide = 'none') +
annotate('text', y = c(-4, 4), x = c(2.8, 2.8), vjust = 1, size = 6,
label = c('RESPONSE = 0', 'RESPONSE = 1'), fontface = 2) +
scale_x_discrete(expand = c(0, 1), name = NULL) +
theme_minimal(base_size = 16) +
theme(axis.line.x = element_line(),
axis.ticks.x = element_line(),
panel.grid = element_blank())
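As a rough illustration of the wrangling step on its own, the sketch below runs only the count() and mutate() part of the pipeline on the question's response and sex columns, so the data frame that reaches ggplot() can be inspected. The name counts is hypothetical and used only here.

```r
library(dplyr)

# the two columns of the question's data that the pipeline uses
response <- c(0,1,0,1,0,1,1,1,0,1,0,1,0,1,1,1,0,1,0,1)
sex <- c(rep("male", 5), rep("female", 5))
data <- data.frame(response, sex)   # sex is recycled from 10 to 20 rows

counts <- data %>%
  count(response, sex) %>%                    # one row per response/sex pair
  mutate(n = ifelse(response == 0, -n, n))    # response = 0 counts go left of zero

print(counts)
```

The negative counts are what place the response = 0 bars on the left of the zero line after coord_flip().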
I'm trying to slightly reposition the labels of a discrete colorbar so that they don't overlap, without changing the values of the breaks themselves. In the below plot, the two center labels (bracketing the near-zero data) are too close together, so that it looks like '-11' instead of '-1' and '1'. I'd like to nudge them to either side, or change the justification of each half of the scale (left justify the negatives and right justify the positives), or anything to create more space between the labels while retaining the spacing of the actual colorbar. (Making the colorbar wider is not an option in my actual figure.)
Here is the code used to create this plot:
library(dplyr)
library(ggplot2)
library(scales)
df <- data.frame(
x = runif(1000),
y = runif(1000),
z1 = rnorm(100)*10
)
df %>% ggplot() +
geom_point(aes(x=x,y=y, color=z1)) +
scale_color_steps2(low = muted("darkblue"), mid = "white", high = muted("darkred"),
midpoint = 0, guide = guide_colorbar(barwidth = 20),
breaks = c(-20, -10, -5, -1, 1, 5, 10, 20)) +
theme_minimal() +
theme(legend.position = 'bottom') +
labs(x='', y='', color='')
This is always a bit hacky, and you get a warning, but one option is to pass a vector to the hjust argument of element_text() to align the -1 to the right and the 1 to the left:
library(ggplot2)
set.seed(123)
df <- data.frame(
x = runif(1000),
y = runif(1000),
z1 = rnorm(100)*10
)
ggplot(df) +
geom_point(aes(x=x,y=y, color=z1)) +
scale_color_steps2(low = scales::muted("darkblue"), mid = "white", high = scales::muted("darkred"),
midpoint = 0, guide = guide_colorbar(barwidth = 20),
breaks = c(-20, -10, -5, -1, 1, 5, 10, 20)) +
theme_minimal() +
theme(legend.position = 'bottom') +
labs(x='', y='', color='') +
theme(legend.text = element_text(hjust = c(rep(.5, 3), 1, 0, rep(.5, 3))))
#> Warning: Vectorized input to `element_text()` is not officially supported.
#> Results may be unexpected or may change in future versions of ggplot2.
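If the breaks change, the hjust vector can be derived from them rather than typed by hand. The sketch below is one way to do that, assuming the labels to separate are exactly -1 and 1; label_hjust is a hypothetical name, and the justification values are the ones used in the answer.

```r
breaks <- c(-20, -10, -5, -1, 1, 5, 10, 20)

# centre every label, then change only the two labels around zero
label_hjust <- rep(0.5, length(breaks))
label_hjust[breaks == -1] <- 1   # value used in the answer for the -1 label
label_hjust[breaks == 1]  <- 0   # value used in the answer for the 1 label

print(label_hjust)
```

The result can be passed as theme(legend.text = element_text(hjust = label_hjust)); it is still vectorized input to element_text(), so the same warning applies.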
Something looks wrong with my distributions. The bottom ridge in each of the faceted graphs is not on the same scale as the ridges above it, or relative to the number of counts (i.e. the dots shown).
Is there a way to scale all distributions relative to one another?
season_names <- c(`0` = "COOL", `1` = "HOT-DRY",`2` = "HOT-WET")
dCLEAN %>%
ggplot(aes(x = tdb, y = as.factor(tsv), fill = as.factor(season))) +
ggdist::stat_halfeye(
adjust = 0.9,
justification = -0.15,
.width = 0,
point_colour = NA) +
geom_boxplot(
width = 0.2,
outlier.colour = NA,
alpha = 0.5)+
ggdist::stat_dots(
side = "left",
justification = 1.18,
binwidth = 0.1) +
facet_wrap(~ season, labeller = as_labeller(season_names)) +
theme_bw() +
theme(strip.background = element_rect(fill="white")) +
theme(legend.position = "none") +
scale_color_grey()+
scale_fill_grey()
Image of current graph (errors seem to be in the Cool graph -2 distribution, the Hot-Dry graph -1 distribution, and the Hot-Wet graph -1 distribution).
By default, the densities are scaled to have equal area regardless of the number of observations. If you wish to scale the areas according to the number of observations, you can set aes(thickness = stat(pdf*n)) in stat_halfeye(). This sets the thickness of the slab according to the product of two computed variables generated by stat_halfeye(): the density (pdf) and the number of observations per group (n).
Here's an example from the raincloud plots section of the dotsinterval vignette:
library(dplyr)
library(ggplot2)
library(ggdist)
set.seed(12345) # for reproducibility
data.frame(
abc = c("a", "b", "b", "c"),
value = rnorm(200, c(1, 8, 8, 3), c(1, 1.5, 1.5, 1))
) %>%
ggplot(aes(y = abc, x = value, fill = abc)) +
stat_slab(aes(thickness = stat(pdf*n)), scale = 0.7) +
stat_dotsinterval(side = "bottom", scale = 0.7, slab_size = NA) +
scale_fill_brewer(palette = "Set2")
This example uses stat_slab() in place of stat_halfeye(). stat_slab() is a shortcut stat equivalent to stat_halfeye() but without the point and the interval (so it saves you passing .width = 0 and point_color = NA to stat_halfeye()).
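Applied to the stat_halfeye() call from the question, the same idea might look like the sketch below. The data frame sim is a hypothetical stand-in for dCLEAN with deliberately unequal group sizes, and after_stat() is used in place of stat(), which is deprecated in recent ggplot2 versions; the other arguments are copied from the question.

```r
library(ggplot2)
library(ggdist)

set.seed(1)
# hypothetical stand-in for dCLEAN: three tsv levels with very different group sizes
sim <- data.frame(
  tsv = factor(rep(c(-1, 0, 1), times = c(20, 200, 80))),
  tdb = c(rnorm(20, 24, 1), rnorm(200, 27, 1.5), rnorm(80, 30, 1))
)

p <- ggplot(sim, aes(x = tdb, y = tsv)) +
  stat_halfeye(
    aes(thickness = after_stat(pdf * n)),  # density times group size
    adjust = 0.9,
    justification = -0.15,
    .width = 0,
    point_colour = NA
  ) +
  theme_bw()
p
```

Under this mapping the smallest group should be drawn with the thinnest slab, rather than every group having the same area.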
I would like to draw a graph similar to the image here:
I tried to find similar mirrored bar graphs on Google, but I could not find a graph similar to the image above.
Tricky parts of the graph are that 1) both +ve and -ve y axis have positive values, and 2) both +ve and -ve y axis have different y-axis labellings.
Thank you in advance for your help.
This is as close as I could get to that graph so far.
It's really tricky, because:
- The Y axis has to be positive on the negative side.
- On the negative side the numbers have to look 5 times smaller, because the numbers on the Y axis are 5 times smaller [from 1 to 5 instead of 1 to 25].
- Uncertainty bars need to be drawn.
- X labels are doubled.
What I couldn't do:
- Set up the Y axis names in a proper manner [if anyone knows and can help..!].
- Understand what a and b are and with which logic to place them [you need to explain this one better].
library(dplyr)
library(ggplot2)
# your data
n <- 100
set.seed(42)
df <- tibble(var1 = factor(rep(c("Mamou", "Crowley"), each = 8 * n), levels = c("Mamou", "Crowley"), ordered = TRUE),
var2 = factor(rep(c("RWW-M1", "RWW-M2", "RWW-C1", "RWW-C2"), each = 4* n), levels = c("RWW-M1", "RWW-M2", "RWW-C1", "RWW-C2"), ordered = TRUE),
var3 = factor(rep(rep(c("Shoot dry weight (g)", "Root dry weight (g)"), each = 2*n), 4), levels = c("Shoot dry weight (g)", "Root dry weight (g)"), ordered = TRUE),
varc = rep(rep(c("white", "black"), each = n), 8),
value = abs(c(
rnorm(2*n, mean = 5 , sd = 0.2),
rnorm(2*n, mean = 3 , sd = 0.04),
rnorm(2*n, mean = 15 , sd = 0.2),
rnorm(2*n, mean = 4 , sd = 0.04),
rnorm(2*n, mean = 5 , sd = 0.2),
rnorm(2*n, mean = 2.5, sd = 0.04),
rnorm(2*n, mean = 5 , sd = 0.2),
rnorm(2*n, mean = 2.5, sd = 0.04))))
# edit your data this way [a little trick to set bars up and down the line and make them look like 5 times bigger]
df <- df %>% mutate(value = if_else(var3 == "Root dry weight (g)", -value*5, value))
# calculate statistics you want to plot
df <- df %>%
group_by(var1, var2, var3, varc) %>%
summarise(mean = mean(value), min = min(value), max = max(value)) %>%
ungroup()
df %>%
ggplot(aes(x = var2)) +
# plot dodged bars
geom_col(aes(y = mean, fill = varc),
position = position_dodge(width = 0.75),
colour = "black", width = 0.5) +
# plot dodged errorbars
geom_errorbar(aes(ymin = min, ymax = max, group = varc),
position = position_dodge(width = 0.75), width = 0.2, size = 1) +
# make line on zero more visible
geom_hline(aes(yintercept = 0)) +
# set up colour of the bars, don't show legend
scale_fill_manual(values = c("white", "gray75"), guide = "none") +
# set up labels of y axis
# dont change positive, make negative look positive and 5 times smaller
# set up breaks every 5 [ggplot will calc labels after breaks]
scale_y_continuous(labels = function(x) if_else(x<0, -x/5, x),
breaks = function(x) as.integer(seq(x[1]-x[1]%%5, x[2]-x[2]%%5, 5))) +
# put labels and x axis on top
scale_x_discrete(position = "top") +
# set up var1 labels on top
facet_grid( ~ var1, space = 'free', scales = 'free') +
# show proper axis names
labs(x = "", y = "Root dry weight (g) Shoot dry weight (g)") +
# set up theme
theme_classic() +
theme(axis.line.x = element_blank(),
axis.ticks.x = element_blank(),
panel.grid = element_blank(),
# this is to put names of facet grid on top
strip.placement = 'outside',
# this is to remove background from labels on facet grid
strip.background = element_blank(),
# this is to make facets close to each other
panel.spacing.x = unit(0,"line"))
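The two helper functions passed to scale_y_continuous() can be tried outside the plot to see what they return. In this sketch the function names and the axis limits c(-22, 17) are made up for illustration, and base ifelse() stands in for the answer's if_else() so that no packages are needed.

```r
# label function: negative positions are shown as positive values divided by 5
root_shoot_labels <- function(x) ifelse(x < 0, -x / 5, x)

# break function from the answer: multiples of 5 derived from the axis limits
breaks_by_5 <- function(x) as.integer(seq(x[1] - x[1] %% 5, x[2] - x[2] %% 5, 5))

# hypothetical axis limits, chosen only for illustration
brks <- breaks_by_5(c(-22, 17))
print(brks)                      # break positions on the plotted (transformed) scale
print(root_shoot_labels(brks))   # what the axis would display at those positions
```

For these limits the breaks run from -25 to 15 in steps of 5, and the labels read 5, 4, 3, 2, 1, 0, 5, 10, 15, so the root side appears to go from 0 to 5 while the shoot side keeps its real values.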
Something like this perhaps?
library(ggplot2)
df <- data.frame(x = rep(letters[1:3], each = 4),
y = c(2, -2, 3, -3, 4, -4, 5, -5, 2, -2, 3, -3),
dodgegroup = factor(rep(rep(1:2, each = 2), 3)))
ggplot(df, aes(x, y, fill = dodgegroup)) +
geom_col(position = position_dodge(width = 0.75),
colour = "black", width = 0.5) +
geom_hline(aes(yintercept = 0)) +
scale_fill_manual(values = c("white", "gray75")) +
scale_y_continuous(breaks = 0:10 - 5,
labels = c(5:0, 5 * 1:5)) +
theme_classic()
Created on 2020-08-07 by the reprex package (v0.3.0)
Try this. While the answer by Edo looks most like what you have asked for, this method does not need you to transform your data. However, the scale is the same on both sides of the axis.
Call geom_col twice but with - before the values for Root, then we use labels=abs to make both sides of the y-axis positive numbers:
Edit - fixed the y-axis
library(ggplot2)
df <- data.frame(x = rep(c("RWW-M1", "RWW-M2", "RWW-C1", "RWW-C2"), each = 2),
Shoot = c(5, 6, 7, 8, 4, 5, 5, 7),
Root = c(1, 2, 3, 4, 2, 3, 1, 2),
Condition = rep(c("control", "test"), each = 1))
p <- ggplot(df, aes(x=x, fill=Condition)) +
geom_col(aes(y=Shoot), position = position_dodge(width = 0.75), width = 0.5, colour = "black")+
geom_col(aes(y=-Root), position = position_dodge(width = 0.75), width = 0.5, colour = "black")+
geom_hline(aes(yintercept = 0)) +
scale_fill_manual(values = c("white", "gray75")) +
ylab("Root weight (g) / Shoot weight (g)")+
xlab("")+
scale_y_continuous(breaks = 0:15 - 5, labels=abs) +
theme_bw()
p
I would like to add labels to the end of lines in ggplot, avoid them overlapping, and avoid them moving around during animation.
So far I can put the labels in the right place and hold them static using geom_text, but the labels overlap, or I can prevent them overlapping using geom_text_repel but the labels do not appear where I want them to and then dance about once the plot is animated (this latter version is in the code below).
I thought a solution might involve effectively creating a static layer in ggplot (p1 below) then adding an animated layer (p2 below), but it seems not.
How do I hold some elements of a plot constant (i.e. static) in an animated ggplot? (In this case, the labels at the end of lines.)
Additionally, with geom_text the labels appear as I want them - at the end of each line, outside of the plot - but with geom_text_repel, the labels all move inside the plotting area. Why is this?
Here is some example data:
library(dplyr)
library(ggplot2)
library(gganimate)
library(ggrepel)
set.seed(99)
# data
static_data <- data.frame(
hline_label = c("fixed_label_1", "fixed_label_2", "fixed_label_3", "fixed_label_4",
"fixed_label_5", "fixed_label_6", "fixed_label_7", "fixed_label_8",
"fixed_label_9", "fixed_label_10"),
fixed_score = c(2.63, 2.45, 2.13, 2.29, 2.26, 2.34, 2.34, 2.11, 2.26, 2.37))
animated_data <- data.frame(condition = c("a", "b")) %>%
slice(rep(1:n(), each = 10)) %>%
group_by(condition) %>%
mutate(time_point = row_number()) %>%
ungroup() %>%
mutate(score = runif(20, 2, 3))
and this is the code I am using for my animated plot:
# colours for use in plot
condition_colours <- c("red", "blue")
# plot static background layer
p1 <- ggplot(static_data, aes(x = time_point)) +
scale_x_continuous(breaks = seq(0, 10, by = 2), expand = c(0, 0)) +
scale_y_continuous(breaks = seq(2, 3, by = 0.10), limits = c(2, 3), expand = c(0, 0)) +
# add horizontal line to show existing scores
geom_hline(aes(yintercept = fixed_score), alpha = 0.75) +
# add fixed labels to the end of lines (off plot)
geom_text_repel(aes(x = 11, y = fixed_score, label = hline_label),
hjust = 0, size = 4, direction = "y", box.padding = 1.0) +
coord_cartesian(clip = 'off') +
guides(colour = "none") +
labs(title = "[Title Here]", x = "Time", y = "Mean score") +
theme_minimal() +
theme(panel.grid.minor = element_blank(),
plot.margin = margin(5.5, 120, 5.5, 5.5))
# animated layer
p2 <- p1 +
geom_point(data = animated_data,
aes(x = time_point, y = score, colour = condition, group = condition)) +
geom_line(data = animated_data,
aes(x = time_point, y = score, colour = condition, group = condition),
show.legend = FALSE) +
scale_color_manual(values = condition_colours) +
geom_segment(data = animated_data,
aes(xend = time_point, yend = score, y = score, colour = condition),
linetype = 2) +
geom_text(data = animated_data,
aes(x = max(time_point) + 1, y = score, label = condition, colour = condition),
hjust = 0, size = 4) +
transition_reveal(time_point) +
ease_aes('linear')
# render animation
animate(p2, nframes = 50, end_pause = 5, height = 1000, width = 1250, res = 120)
Suggestions for consideration:
The specific repelling direction / amount / etc. in geom_text_repel is determined by a random seed. You can set seed to a constant value in order to get the same repelled positions in each frame of animation.
I don't think it's possible for repelled text to go beyond the plot area, even if you turn off clipping & specify some repel range outside plot limits. The whole point of that package is to keep text labels away from one another while remaining within the plot area. However, you can extend the plot area & use geom_segment instead of geom_hline to plot the horizontal lines, such that these lines stop before they reach the repelled text labels.
Since there are more geom layers using animated_data as their data source, it would be cleaner to put animated_data & associated common aesthetic mappings in the top level ggplot() call, rather than static_data.
Here's a possible implementation. Explanation in annotations:
p3 <- ggplot(animated_data,
aes(x = time_point, y = score, colour = condition, group = condition)) +
# static layers (assuming 11 is the desired ending point)
geom_segment(data = static_data,
aes(x = 0, xend = 11, y = fixed_score, yend = fixed_score),
inherit.aes = FALSE, colour = "grey25") +
geom_text_repel(data = static_data,
aes(x = 11, y = fixed_score, label = hline_label),
hjust = 0, size = 4, direction = "y", box.padding = 1.0, inherit.aes = FALSE,
seed = 123, # set a constant random seed
xlim = c(11, NA)) + # specify repel range to be from 11 onwards
# animated layers (only specify additional aesthetic mappings not mentioned above)
geom_point() +
geom_line() +
geom_segment(aes(xend = time_point, yend = score), linetype = 2) +
geom_text(aes(x = max(time_point) + 1, label = condition),
hjust = 0, size = 4) +
# static aesthetic settings (limits / expand arguments are specified in coordinates
# rather than scales, margin is no longer specified in theme since it's no longer
# necessary)
scale_x_continuous(breaks = seq(0, 10, by = 2)) +
scale_y_continuous(breaks = seq(2, 3, by = 0.10)) +
scale_color_manual(values = condition_colours) +
coord_cartesian(xlim = c(0, 13), ylim = c(2, 3), expand = FALSE) +
guides(colour = "none") +
labs(title = "[Title Here]", x = "Time", y = "Mean score") +
theme_minimal() +
theme(panel.grid.minor = element_blank()) +
# animation settings (unchanged)
transition_reveal(time_point) +
ease_aes('linear')
animate(p3, nframes = 50, end_pause = 5, height = 1000, width = 1250, res = 120)
For teaching purposes I'm looking to create and plot multiple distributions onto one graph. The code I've been using to do this is:
library(ggplot2)
library(ggfortify)
# Create an initial graph with 1 distribution
p3 <- ggdistribution(dnorm,
seq(-5, 10,length=1000),
colour='blue',
mean=0.15,
sd=0.24,
fill='blue')
# Update p3 with second distribution
p3 <- ggdistribution(dnorm, seq(-5, 10,length=1000),
mean = 1.11,
sd = 0.55,
colour='green',
fill='green',p=p3)
# View p3
p3
Initially, this seems great because it produces a graph with both distributions:
The problems start when I try to change the appearance of the graph.
(1) First when I attempt to change the y-axis scale so that it ranges from 0 to 1 instead of the percentages it shows by default, I am able to do so, but something happens to the distributions. Here is the code I am using:
p3 <- p3 + ylim(0,1) + xlim (-2, 6) + labs(title="Plotting Multiple Distributions", x="Mean difference", y="Density")
And this returns the following graph:
Any advice on how I can change the y-axis without ruining the distribution would be very appreciated!
(2) Second, when I try to add 2 lines along the axes using this code:
p3 <- p3 + geom_segment(aes(x=0, y=0, xend=0, yend=0.98),
size=1,
arrow = arrow(length = unit(0.4,"cm")))
p3 <- p3 + geom_segment(aes(x=-2, y=0, xend=6, yend=0),
size=1)
...R returns the following error message:
Error in eval(expr, envir, enclos) : object 'ymin' not found
Any advice as to how I might add these lines to improve the aesthetics of the graph would be very appreciated.
Thank you in advance for your time.
It sounds like you wish to change the y-axis labels to the range (0, 1) without actually changing the underlying distribution. Here's one approach:
# after obtaining p3 from the two ggdistribution() functions
# get the upper limit for p3's current y-axis range, rounded up
y.orig <- layer_scales(p3)$y$range$range[2] # = 1.662259 in my case, yours may
# differ based on the distribution
y.orig <- ceiling(y.orig * 10) / 10 # = 1.7
p3 +
xlim(-2, 6) +
scale_y_continuous(breaks = seq(0, y.orig, length.out = 5),
labels = scales::percent(seq(0, 1, length.out = 5))) +
annotate("segment", x = 0, y = 0, xend = 0, yend = y.orig,
size = 1, arrow = arrow(length = unit(0.4, "cm"))) +
annotate("segment", x = -2, y = 0, xend = 6, yend = 0,
size = 1)
Or if you prefer to keep labels close to the fake axis created from line segments, include expand = c(0, 0) for x / y:
p3 +
scale_x_continuous(limits = c(-2, 6), expand = c(0, 0)) +
scale_y_continuous(breaks = seq(0, y.orig, length.out = 5),
labels = scales::percent(seq(0, 1, length.out = 5)),
expand = c(0, 0)) +
annotate("segment", x = 0, y = 0, xend = 0, yend = y.orig,
size = 1, arrow = arrow(length = unit(0.4, "cm"))) +
annotate("segment", x = -2, y = 0, xend = 6, yend = 0,
size = 1)
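Since the question asked for an axis running from 0 to 1 rather than percentages, the break positions and labels can also be built as plain numbers. This is only a sketch: y.max is the upper limit reported in the answer and will differ for other distributions, and brks and labs01 are hypothetical names.

```r
# upper y limit reported in the answer; yours may differ
y.max <- 1.662259
y.orig <- ceiling(y.max * 10) / 10   # round up to one decimal place

# five evenly spaced break positions on the real density scale
brks <- seq(0, y.orig, length.out = 5)

# labels running from 0 to 1 instead of percentages
labs01 <- seq(0, 1, length.out = 5)

print(data.frame(position = brks, label = labs01))
```

These could then be supplied as scale_y_continuous(breaks = brks, labels = labs01) in place of the scales::percent() labels. As in the answer, only the labels change; the curves stay on their original density scale.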