ggplot2 heatmap with tile height and width as aes()


I'm trying to create a heat map for an OD matrix, but I want to scale the rows and columns by certain weights. Since these weights are constant across each category, I would expect the plot to keep the row and column structure.
library(dplyr)
library(ggplot2)
# Tidy OD matrix
df <- data.frame (origin = c(rep("A", 3), rep("B", 3),rep("C", 3)),
destination = rep(c("A","B","C"),3),
value = c(0, 1, 10, 5, 0, 11, 15, 6, 0))
# Weights
wdf <- data.frame(region = c("A","B","C"),
w = c(1,2,3))
# Add weights to the data.
plot_df <- df %>%
merge(wdf %>% rename(w_origin = w), by.x = 'origin', by.y = 'region') %>%
merge(wdf %>% rename(w_destination = w), by.x = 'destination', by.y = 'region')
Here's what the data looks like:
> plot_df
destination origin value w_origin w_destination
1 A A 0 1 1
2 A C 15 3 1
3 A B 5 2 1
4 B A 1 1 2
5 B B 0 2 2
6 B C 6 3 2
7 C B 11 2 3
8 C A 10 1 3
9 C C 0 3 3
However, when passing the weights as width and height in the aes() I get this:
ggplot(plot_df,
aes(x = destination,
y = origin)) +
geom_tile(
aes(
width = w_destination,
height = w_origin,
fill = value),
color = 'black')
It seems to be working for the size of the columns (width), but not quite, because the proportions are not right. And the rows are all over the place and not aligned.
I'm only using geom_tile because I can pass height and width as aesthetics, but I'm open to other suggestions.

The issue is that your tiles are overlapping. While you can pass the widths and heights as aesthetics, geom_tile will not adjust the x and y positions of the tiles for you. Because you are mapping a discrete variable on x and y, your tiles are positioned on an equidistant grid; in your case they are centered at 1, 2 and 3. The tiles are then drawn at these positions with the specified width and height, so the wider ones spill over their neighbours.
This can easily be seen by adding some transparency to your plot:
library(ggplot2)
library(dplyr)
ggplot(plot_df,
aes(x = destination,
y = origin)) +
geom_tile(
aes(
width = w_destination,
height = w_origin,
fill = value), color = "black", alpha = .2)
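To make the overlap concrete, here is a minimal base R sketch that computes the horizontal extent of each column of tiles. It assumes the three destinations are centered at the integer positions 1, 2 and 3 that a discrete scale assigns; the names centers, widths, extent and overlaps are only for illustration.

```r
# Assumed discrete x positions of destinations A, B, C
centers <- c(A = 1, B = 2, C = 3)
# Widths taken from wdf$w
widths <- c(A = 1, B = 2, C = 3)

# Each tile spans center - width / 2 to center + width / 2
extent <- data.frame(
  xmin = centers - widths / 2,
  xmax = centers + widths / 2
)
print(extent)

# A column overlaps its right-hand neighbour when its xmax exceeds the neighbour's xmin
overlaps <- extent$xmax[-3] > extent$xmin[-1]
print(overlaps)
```

Both comparisons come out TRUE here, which is why the tiles are drawn on top of each other.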
To achieve your desired result you have to manually compute the x and y positions from the desired widths and heights so that the boxes no longer overlap. To this end you could switch to a continuous scale and set the desired breaks and labels via scale_x_continuous and scale_y_continuous:
breaks <- wdf %>%
mutate(cumw = cumsum(w),
pos = .5 * (cumw + lag(cumw, default = 0))) %>%
select(region, pos)
plot_df <- plot_df %>%
left_join(breaks, by = c("origin" = "region")) %>%
rename(y = pos) %>%
left_join(breaks, by = c("destination" = "region")) %>%
rename(x = pos)
ggplot(plot_df,
aes(x = x,
y = y)) +
geom_tile(
aes(
width = w_destination,
height = w_origin,
fill = value), color = "black") +
scale_x_continuous(breaks = breaks$pos, labels = breaks$region, expand = c(0, 0.1)) +
scale_y_continuous(breaks = breaks$pos, labels = breaks$region, expand = c(0, 0.1))
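As a quick sanity check of the midpoint formula used for breaks, the following base R sketch recomputes the positions for the weights 1, 2, 3 and verifies that the resulting tiles sit edge to edge. The variable names (w, left, pos) are only for illustration.

```r
w <- c(A = 1, B = 2, C = 3)      # weights from wdf
cumw <- cumsum(w)                # right edge of each tile
left <- c(0, head(cumw, -1))     # left edge of each tile (lagged cumsum, default 0)
pos <- 0.5 * (cumw + left)       # tile centers, same formula as in breaks

print(pos)

# Center minus/plus half the width reproduces the edges, so tiles touch without overlapping
stopifnot(isTRUE(all.equal(unname(pos - w / 2), unname(left))))
stopifnot(isTRUE(all.equal(unname(pos + w / 2), unname(cumw))))
```

The centers come out unevenly spaced because each tile needs room for half of its own width plus half of its neighbour's.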

So I think I have a partial solution for you. After playing around with geom_tile, it appears that the order of your data frame matters when you are using height and width.
Here is some example code I came up with based on yours (run your code first). I converted your data frame to a tibble (part of dplyr) to make it easier to sort by a column.
# Converted your dataframe to a tibble dataframe
plot_df_tibble = tibble(plot_df)
# Sorted your dataframe by your w_origin column:
plot_df_tibble2 = plot_df_tibble[order(plot_df_tibble$w_origin),]
# Plotted the sorted data frame:
ggplot(plot_df_tibble2,
aes(x = destination,
y = origin)) +
geom_tile(
aes(
width = w_destination,
height = w_origin,
fill = value),
color = 'black')
And got this plot:
Link to image I made
I should note that if you plot the converted tibble before sorting it, you get the same plot you posted.
It seems like the height and width arguments may not be fully developed for this portion of geom_tile, as I feel that the order of the df should not matter.
Cheers

Related

Visualizing stacked bar chart in the format of Jonathan A. Schwabish (JEP 2014)

I am trying to plot the following data (df_input) in the format of a stacked bar graph where we can also see the change over time by line. Any idea how to do it?
df_input <- data.frame( Year= c(2010,2010,2010,2010,2020,2020,2020,2020), village= c("A","B","C","D","A","B","C","D"), share = c(40,30,20,10,30,30,25,15))
df_input_2 <- data.frame( Year= c(2010,2010,2010,2010,2015,2015,2015,2015,2020,2020,2020,2020), village= c("A","B","C","D","A","B","C","D","A","B","C","D"), share = c(40,30,20,10,30,30,25,15,20,10,30,40))

One option to achieve that would be via a geom_col and a geom_line. For the geom_line you have to group by the variable mapped on fill, set position to "stack" and adjust the start/end positions to account for the widths of the bars. Additionally you have to manually set the orientation for the geom_line to y:
library(ggplot2)
width <- .6 # Bar width
ggplot(df_input, aes(share, factor(Year), fill = village)) +
geom_col(width = width) +
geom_line(aes(x = share,
y = as.numeric(factor(Year)) + ifelse(Year == 2020, -width / 2, width / 2),
group = village), position = "stack", orientation = "y")
EDIT: With more than two years things get a bit trickier. In that case I would switch to geom_segment. Additionally we have to do some data wrangling to prepare the data for use with geom_segment:
library(ggplot2)
library(dplyr)
# Example data with three years
df_input_2 <- data.frame( Year= c(2010,2010,2010,2010,2015,2015,2015,2015,2020,2020,2020,2020), village= c("A","B","C","D","A","B","C","D","A","B","C","D"), share = c(40,30,20,10,30,30,25,15,20,10,30,40))
width = .6
# Data wrangling
df_input_2 <- df_input_2 %>%
group_by(Year) %>%
arrange(desc(village)) %>%
mutate(share_cum = cumsum(share)) %>%
group_by(village) %>%
arrange(Year) %>%
mutate(Year = factor(Year),
Year_lead = lead(Year), share_cum_lead = lead(share_cum))
ggplot(df_input_2, aes(share, factor(Year), fill = village)) +
geom_col(width = width) +
geom_segment(aes(x = share_cum, xend = share_cum_lead, y = as.numeric(Year) + width / 2, yend = as.numeric(Year_lead) - width / 2, group = village))
#> Warning: Removed 4 rows containing missing values (geom_segment).
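The wrangling step can be checked without plotting. The base R sketch below mirrors the dplyr pipeline above (it is not the answer's code): it accumulates the shares within each year from village D up to A, which I assume matches the default stacking order of geom_col, and then looks up the next year's value for the same village. The four NA values in the last year are the rows the warning refers to.

```r
df <- data.frame(
  Year    = rep(c(2010, 2015, 2020), each = 4),
  village = rep(c("A", "B", "C", "D"), times = 3),
  share   = c(40, 30, 20, 10, 30, 30, 25, 15, 20, 10, 30, 40)
)

# Cumulative share within each year, accumulated from village D up to A
df <- df[order(df$Year, -xtfrm(df$village)), ]
df$share_cum <- ave(df$share, df$Year, FUN = cumsum)

# Next year's cumulative share for the same village (NA for the last year)
df <- df[order(df$village, df$Year), ]
df$share_cum_lead <- ave(df$share_cum, df$village,
                         FUN = function(v) c(v[-1], NA))

print(df)
# Number of segments without an end point
print(sum(is.na(df$share_cum_lead)))
```

Each row with a non-missing share_cum_lead describes one segment, running from the upper edge of a village's bar in one year to the same edge in the next year.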

Is there a way to make a line plot that connects empirical pairs of words with ggplot2?

I'm not sure what the correct name for this type of plot would be, but let's say we have a list of names (or letters here): data <- data.frame(letters[1:10])
Let's also say that we want to illustrate which of these names are connected based on some empirical decision, so we have a list of observations we want to connect in a plot like the following (done in PowerPoint):
Can this be done in ggplot?

Yes, it can be done in ggplot.
Let's start by setting up a data frame of letters, with associated positions on the x and y axis of a plot. We'll make the x values 1 and 2 (though this is arbitrary), and the y values 1:10 (also arbitrary, as long as they are evenly spaced)
labels <- data.frame(x = c(rep(1, 10), rep(2, 10)),
y = rep(1:10, 2),
labs = rep(LETTERS[10:1], 2),
stringsAsFactors = FALSE)
Now we also need some way of deciding which letters will be joined. Let's do this by having a simple data frame of "left" and "right" values, where each row describes which two letters will be joined:
set.seed(69)
joins <- data.frame(left = sample(LETTERS[1:10], 6, TRUE),
right = sample(LETTERS[1:10], 6, TRUE),
stringsAsFactors = FALSE)
joins
#> left right
#> 1 A G
#> 2 B B
#> 3 H J
#> 4 G D
#> 5 G J
#> 6 F B
Now we can assign start and end x and y co-ordinates for the lines by matching the letters in these two columns to the columns in our labels data frame:
joins$x <- rep(1.05, nrow(joins))
joins$xend <- rep(1.9, nrow(joins))
joins$y <- labels$y[match(joins$left, labels$labs)]
joins$yend <- labels$y[match(joins$right, labels$labs)]
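For readers unfamiliar with match(), here is a small sketch of the lookup on its own. It rebuilds the labels data frame from above and uses a hand-picked vector of letters (an illustrative choice, not the sampled joins) to show how each letter is translated into its y position.

```r
labels <- data.frame(x = c(rep(1, 10), rep(2, 10)),
                     y = rep(1:10, 2),
                     labs = rep(LETTERS[10:1], 2),
                     stringsAsFactors = FALSE)

# match() returns the row of the first occurrence of each letter in labels$labs
idx <- match(c("A", "J", "G"), labels$labs)
print(idx)

# Indexing y with those rows gives the vertical position of each letter
print(labels$y[idx])
```

Because match() returns only the first hit, the positions always come from the left-hand column of labels, which is fine here since both columns share the same y values.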
This just leaves the plot. We want to get rid of all the axes, titles and legends so we use theme_void:
library(ggplot2)
ggplot(labels, aes(x, y)) +
geom_text(aes(label = labs), size = 8) +
geom_segment(data = joins, aes(xend = xend, yend = yend, color = left),
arrow = arrow(type = "closed", length = unit(0.02, "npc"))) +
coord_cartesian(xlim = c(0.5, 2.5)) +
theme_void() +
theme(legend.position = "none")
Created on 2020-07-10 by the reprex package (v0.3.0)

This solution could be tidied up, but gives a start using geom_segment
library(tidyverse)
tibble(x0 = 0, x1 = 1, y0 = sample(letters[1:10]), y1 = sample(letters[1:10])) %>%
mutate(y0 = factor(y0, levels = rev(letters[1:10])),
y1 = factor(y1, levels = rev(letters[1:10]))) %>%
ggplot(aes(x = x0, xend = x1, y = y0, yend = y1)) +
geom_segment(arrow = arrow(length = unit(0.03, "npc"))) +
geom_text(aes(x = x1, y = y1, label = y1), nudge_x = 0.01)

matching of shape, color and legend in bubble plot with subset of variable

I have some data
library(data.table)
wide <- data.table(id=c("A","C","B"), var1=c(1,6,1), var2=c(2,6,5), size1=c(11,12,13), size2=c(10,12,10), flag=c(FALSE,TRUE,FALSE))
> wide
id var1 var2 size1 size2 flag
1: A 1 2 11 10 FALSE
2: C 6 6 12 12 TRUE
3: B 1 5 13 10 FALSE
which I would like to plot as bubble plots where id is ordered by var2, and bubbles are as follows:
ID A and B: var1 is plotted in size1 and "empty bubbles" and var2 is plotted in size2 with "filled" bubbles.
ID C is flagged because there is only one value (this is why var1=var2) and it should have a "filled bubble" of a different color.
I have tried this as follows:
cols <- c("v1"="blue", "v2"="red", "flags"="green")
shapes <- c("v1"=16, "v2"=21, "flags"=16)
p1 <- ggplot(data = wide, aes(x = reorder(id,var2))) + scale_size_continuous(range=c(5,15))
p1 <- p1 + geom_point(aes(size=size1, y = var1, color = "v1", shape = "v1"))
p1 <- p1 + geom_point(aes(size=size2, y = var2, color = "v2", shape = "v2", stroke=1.5))
p1 <- p1 + geom_point(data=subset(wide,flag), aes(size=size2[flag], y=var2[flag], color= "flags", shape="flags"))
p1 <- p1 + scale_color_manual(name = "test",
values = cols,
labels = c("v1", "v2", "flags"))
p1 <- p1 + scale_shape_manual(name = "test",
values = shapes,
labels = c("v1", "v2", "flags"))
which gives (in my theme)
but two questions remain:
What happened to the order in the legend? I have followed the recipe of the bottom solution in Two geom_points add a legend but somehow the order does not match.
How to get rid of the stroke around the green bubble and why is it there?
Overall, something appears to go wrong in matching shape and color.

I admit, it took me a while to understand your slightly convoluted plot. Forgive me, but I have allowed myself to change the way to plot, and make (better?) use of ggplot.
The data shape is less than ideal. ggplot works extremely well with long data.
It took a bit of guesswork to reshape your data, and I decided to go the quick and dirty way and simply bind the rows from selected columns.
Now you can see, that you can achieve the new plot with a single call to geom_point. The rest is "scale_aesthetic" magic...
In order to combine the shape and color legend, safest is to use override.aes. But beware! It does not take named vectors, so the order of the values needs to be in the exact order given by your legend keys - which is usually alphabetic, if you don't have the factor levels defined.
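A minimal sketch of that ordering step, assuming the legend keys are the character values "v1", "v2" and "flag" with no factor levels set: sort the names alphabetically and reorder the colour vector to match before handing it to override.aes. The names keys and cols_ordered are only for illustration.

```r
cols <- c(v1 = "blue", v2 = "red", flag = "green")

# Legend keys of a character variable appear in alphabetical order
keys <- sort(names(cols))
print(keys)

# Reorder the colours to follow the keys and drop the names,
# since the values are matched by position only
cols_ordered <- unname(cols[keys])
print(cols_ordered)
```

Doing the reordering in code rather than by hand avoids a silent mismatch if a key is renamed later.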
update re: request to order x labels
This depends heavily on the actual data structure. If it is originally as you have presented it, I'd first make id a factor with the levels ordered by your var2. Then do the data shaping.
library(tidyverse)
# data reshape
wide <- data.frame(id=c("C","B","A"), var1=c(1,6,1), var2=c(2,6,5), size1=c(11,12,13), size2=c(10,12,10), flag=c(FALSE,TRUE,FALSE))
wide <- wide %>% mutate(id = reorder(id, var2))
wide1 <- wide %>% filter(!flag) %>% select(id, var = var1, size = size1)
wide2 <- wide %>% filter(!flag) %>% select(id, var = var2, size = size2)
wide3 <- wide %>% filter(flag) %>% select(id, var = flag, size = size2) %>%
mutate(var = 6)
long <- bind_rows(list(v1 = wide1, v2 = wide2, flag = wide3), .id = "var_id")
# rearrange the vectors for scales aesthetic
cols <- c(flag="green", v1 ="blue", v2="red" )
shapes <- c(flag=16, v1=16, v2 =21 )
ggplot(data = long, aes(x = id, y = var)) +
geom_point(aes(size=size, shape = var_id, color = var_id), stroke=1.5) +
scale_size_continuous(limits = c(5,15),breaks = seq(5,15,5)) +
scale_shape_manual(name = "test", values = shapes) +
scale_color_manual(values = cols, guide = "none") +
guides(shape = guide_legend(override.aes = list(color = cols)))
P.S. the reason for the red stroke around the green bubble in your plot is that you also plotted the 'var2' behind your flag.
Created on 2020-04-08 by the reprex package (v0.3.0)

Geom_freqpoly with Predefined Count

I can plot geom_freqpoly without problems using the number of observation
ggplot(data=demo) +
geom_freqpoly(mapping=aes(x = value))
But I'd like to use the precalculated observation count contained in the data.
I tried using stat = "identity" but it apparently doesn't work.
ggplot(data=demo) +
geom_freqpoly(mapping=aes(x = value, y = cnt), stat = "identity")
This is my sample data
demo <- tribble(
~value, ~cnt,
.25, 20,
.25, 30,
.1, 40
)

TL;DR: You didn't get the graph you want, because the data of pre-calculated counts you passed to ggplot was NOTHING like what was used to produce the freqpoly graph.
Since you didn't include code for the original demo used to generate graph 1, I'll venture a guess:
demo.orig <- data.frame(value = c(0.25, 0.25, 0.1))
p <- ggplot(demo.orig, aes(x = value)) +
geom_freqpoly()
p # show plot to verify its appearance, which matches the graph in the question
layer_data(p) # look at the calculated data used by geom_freqpoly
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
y count x xmin xmax width density ncount ndensity PANEL group colour size linetype alpha
1 0 0 0.09310345 0.09051724 0.09568966 0.005172414 0.00000 0.0 0.0 1 -1 black 0.5 1 NA
2 1 1 0.09827586 0.09568966 0.10086207 0.005172414 64.44444 0.5 0.5 1 -1 black 0.5 1 NA
3 0 0 0.10344828 0.10086207 0.10603448 0.005172414 0.00000 0.0 0.0 1 -1 black 0.5 1 NA
... (omitted to conserve space)
30 0 0 0.24310345 0.24051724 0.24568966 0.005172414 0.00000 0.0 0.0 1 -1 black 0.5 1 NA
31 2 2 0.24827586 0.24568966 0.25086207 0.005172414 128.88889 1.0 1.0 1 -1 black 0.5 1 NA
32 0 0 0.25344828 0.25086207 0.25603448 0.005172414 0.00000 0.0 0.0 1 -1 black 0.5 1 NA
From a small dataframe with only two unique values, stat_bin generated a much larger dataframe with the x-axis split into 30 bins (the default number), and count / y = 0 everywhere except for the two bins containing the original values.
> geom_freqpoly
function (mapping = NULL, data = NULL, stat = "bin", position = "identity",
..., na.rm = FALSE, show.legend = NA, inherit.aes = TRUE)
{
params <- list(na.rm = na.rm, ...)
if (identical(stat, "bin")) {
params$pad <- TRUE
}
layer(data = data, mapping = mapping, stat = stat, geom = GeomPath,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = params)
}
A quick check by printing geom_freqpoly to console shows that its underlying geom is simply GeomPath, which plots x/y pairs in sequential order.
In other words, if you want to get the peaks from graph 1, you need to provide a similar dataset, with rows indicating where y should drop to 0. While it's certainly possible to calculate this by digging into the code for StatBin$compute_group, I'd think it's simpler to expand from the data of pre-calculated counts and let ggplot do its normal job:
demo %>%
tidyr::uncount(cnt) %>%
ggplot(aes(x = value)) +
geom_freqpoly() +
theme_minimal()
Edit: solution without fully expanding dataframe of aggregated counts
Sample dataset with 2 groups:
demo <- data.frame(value = c(0.25, 0.5, 0.1, 0.25, 0.75, 0.1),
cnt = c(5, 2, 4, 3, 8, 7) * 10e8,
group = rep(c("a", "b"), each = 3))
Code:
library(ggplot2)
library(dplyr)
demo %>%
rename(x = value, y = cnt) %>% # rename here so approach below can be easily applied
# to other datasets with different column names
tidyr::nest(data = c(x, y)) %>% # nest to apply same approach for each group
mutate(data = purrr::map(
data,
function(d) ggplot2:::bin_vector( # cut x's range into appropriate bins
x = d$x,
bins = ggplot2:::bin_breaks_bins(
x_range = range(d$x),
bins = 30), # default bin count is 30; change if desired
pad = TRUE) %>%
select(x, xmin, xmax) %>%
# place y counts into the corresponding x bins (this is probably similar
# to interval join, but I don't have that package installed on my machine)
tidyr::crossing(d %>% rename(x2 = x)) %>%
mutate(y = ifelse(x2 >= xmin & x2 < xmax, y, 0)) %>%
select(-x2) %>%
group_by(x) %>%
filter(y == max(y)) %>%
ungroup() %>%
unique())) %>%
tidyr::unnest(cols = c(data)) %>% # unnest to get one flat dataframe back
ggplot(aes(x = x, y = y, colour = group)) + # plot as per normal
geom_path() +
theme_bw()
# package versions used: dplyr 1.0.0, ggplot2 3.3.1, tidyr 1.1.0, purrr 0.3.4

Based on the similar problem for histograms the solution seems to be as simple as to use the weight parameter in the aesthetics.
The solution using the sample data from the other answer would be
demo <- data.frame(value = c(0.25, 0.5, 0.1, 0.25, 0.75, 0.1),
cnt = c(5, 2, 4, 3, 8, 7) * 10e8,
group = rep(c("a", "b"), each = 3))
ggplot(demo, aes(value, weight = cnt, color = group)) + geom_freqpoly()
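To see why the weight aesthetic gives the same result as expanding the data, here is a base R sketch using the three-row sample from the question. It assumes that weighted binning sums the weights in each bin instead of counting rows; the bin edges in breaks are hypothetical and chosen only to keep the example small.

```r
demo <- data.frame(value = c(0.25, 0.25, 0.1),
                   cnt = c(20, 30, 40))
breaks <- c(0, 0.15, 0.3)  # hypothetical bin edges

# Weighted binning: sum the precalculated counts per bin
bin <- cut(demo$value, breaks = breaks, include.lowest = TRUE)
weighted <- tapply(demo$cnt, bin, sum)
print(weighted)

# Expanded binning: repeat each value cnt times, then count rows per bin
expanded <- rep(demo$value, times = demo$cnt)
counted <- table(cut(expanded, breaks = breaks, include.lowest = TRUE))
print(counted)
```

Both approaches give the same totals per bin, but the weighted version never materialises the expanded rows, which matters when the counts are very large.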

overlay/superimpose grouped bar plots in ggplot2

I'd like to make a bar plot featuring an overlay of data from two time points, 'before' and 'after'.
At each time point, participants were asked two questions ('pain' and 'fear'), which they would answer by stating a score of 1, 2, or 3.
My existing code plots the counts for the data from the 'before' time point nicely, but I can't seem to add the counts for the 'after' data.
This is a sketch of what I'd like the plot to look like with the 'after' data added, with the black bars representing the 'after' data:
I'd like to make the plot in ggplot2() and I've tried to adapt code from How to superimpose bar plots in R? but I can't get it to work for grouped data.
Many thanks!
#DATA PREP
library(dplyr)
library(ggplot2)
library(tidyr)
df <- data.frame(before_fear=c(1,1,1,2,3),before_pain=c(2,2,1,3,1),after_fear=c(1,3,3,2,3),after_pain=c(1,1,2,3,1))
df <- df %>% gather("question", "answer_option") # Get the counts for each answer of each question
df2 <- df %>%
group_by(question,answer_option) %>%
summarise (n = n())
df2 <- as.data.frame(df2)
df3 <- df2 %>% mutate(time = factor(ifelse(grepl("before", question), "before", "after"),
c("before", "after"))) # change classes and split data into two data frames
df3$n <- as.numeric(df3$n)
df3$answer_option <- as.factor(df3$answer_option)
df3after <- df3[ which(df3$time=='after'), ]
df3before <- df3[ which(df3$time=='before'), ]
# CODE FOR 'BEFORE' DATA ONLY PLOT - WORKS
ggplot(df3before, aes(fill=answer_option, y=n, x=question)) + geom_bar(position="dodge", stat="identity")
# CODE FOR 'BEFORE' AND 'AFTER' DATA PLOT - DOESN'T WORK
ggplot(mapping = aes(x, y,fill)) +
geom_bar(data = data.frame(x = df3before$question, y = df3before$n, fill= df3before$index_value), width = 0.8, stat = 'identity') +
geom_bar(data = data.frame(x = df3after$question, y = df3after$n, fill=df3after$index_value), width = 0.4, stat = 'identity', fill = 'black') +
theme_classic() + scale_y_continuous(expand = c(0, 0))

I think the key is to set the width of the "after" bars, but to dodge them as if their width were 0.9 (i.e. the same default width as the "before" bars). In addition, because we don't map fill for the "after" bars, we need to use the group aesthetic instead to achieve the dodging.
I prefer to have only one data set and just subset it in each call to geom_col.
ggplot(mapping = aes(x = question, y = n, fill = factor(ans))) +
geom_col(data = d[d$t == "before", ], position = "dodge") +
geom_col(data = d[d$t == "after", ], aes(group = ans),
fill = "black", width = 0.5, position = position_dodge(width = 0.9))
Data:
set.seed(2)
d <- data.frame(t = rep(c("before", "after"), each = 6),
question = rep(c("pain", "fear"), each = 3),
ans = 1:3, n = sample(12))
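The interplay of the two widths can be worked out by hand. The sketch below reflects my understanding of the arithmetic behind position_dodge rather than its actual source: the dodge width decides where the bar centers go, while the bar width only decides how wide each bar is drawn. All variable names are illustrative.

```r
n_groups    <- 3    # answer options 1, 2, 3
dodge_width <- 0.9  # position_dodge(width = 0.9)
bar_width   <- 0.5  # width = 0.5 on the "after" bars
x           <- 1    # position of the first question on the x axis

# Bar centers depend only on the dodge width
idx <- seq_len(n_groups)
centers <- x + dodge_width * ((idx - 0.5) / n_groups - 0.5)
print(centers)

# The drawn width of each bar depends only on the bar width
drawn_width <- bar_width / n_groups
print(drawn_width)
```

Since the "before" bars use the same dodge width of 0.9, their centers coincide with these, and the narrower black bars end up centered inside them.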
Alternative data preparation using data.table, starting with your original 'df':
library(data.table)
d <- melt(setDT(df), measure.vars = names(df), value.name = "ans")
d[ , c("t", "question") := tstrsplit(variable, "_")]
Either pre-calculate the counts and proceed as above with geom_col
# d2 <- d[ , .(n = .N), by = .(t, question, ans)]
Or let geom_bar do the counting:
ggplot(mapping = aes(x = question, fill = factor(ans))) +
geom_bar(data = d[d$t == "before", ], position = "dodge") +
geom_bar(data = d[d$t == "after", ], aes(group = ans),
fill = "black", width = 0.5, position = position_dodge(width = 0.9))
Data:
df <- data.frame(before_fear = c(1,1,1,2,3), before_pain = c(2,2,1,3,1),
after_fear = c(1,3,3,2,3),after_pain = c(1,1,2,3,1))

My solution is very similar to @Henrik's, but I wanted to point out a few things.
First, you're building your data frames inside your geom_cols, which is probably messier than you need it to be. If you've already created df3after, etc., you might as well use it inside your ggplot.
Second, I had a hard time following your tidying. I think there are a couple tidyr functions that might make this task easier on you, so I went a different route, such as using separate to create the columns of time and measure, rather than essentially searching for them manually, making it more scalable. This also lets you put "pain" and "fear" on your x-axis, rather than still having "before_pain" and "before_fear", which are no longer accurate representations once you have "after" values on the plot as well. But feel free to disregard this and stick with your own method.
library(tidyverse)
df <- data.frame(before_fear = c(1,1,1,2,3),
before_pain = c(2,2,1,3,1),
after_fear = c(1,3,3,2,3),
after_pain = c(1,1,2,3,1))
df_long <- df %>%
gather(key = question, value = answer_option) %>%
mutate(answer_option = as.factor(answer_option)) %>%
count(question, answer_option) %>%
separate(question, into = c("time", "measure"), sep = "_", remove = FALSE)
df_long
#> # A tibble: 12 x 5
#> question time measure answer_option n
#> <chr> <chr> <chr> <fct> <int>
#> 1 after_fear after fear 1 1
#> 2 after_fear after fear 2 1
#> 3 after_fear after fear 3 3
#> 4 after_pain after pain 1 3
#> 5 after_pain after pain 2 1
#> 6 after_pain after pain 3 1
#> 7 before_fear before fear 1 3
#> 8 before_fear before fear 2 1
#> 9 before_fear before fear 3 1
#> 10 before_pain before pain 1 2
#> 11 before_pain before pain 2 2
#> 12 before_pain before pain 3 1
I split this into before & after datasets, as you did, then plotted them with 2 geom_cols. I still put df_long into ggplot, treating it almost as a dummy to get uniform x and y aesthetics. Like @Henrik said, you can use different widths in the geom_col and in its position_dodge to dodge the bars at a width of 90% but give the bars themselves a width of only 40%.
df_before <- df_long %>% filter(time == "before")
df_after <- df_long %>% filter(time == "after")
ggplot(df_long, aes(x = measure, y = n)) +
geom_col(aes(fill = answer_option),
data = df_before, width = 0.9,
position = position_dodge(width = 0.9)) +
geom_col(aes(group = answer_option),
data = df_after, fill = "black", width = 0.4,
position = position_dodge(width = 0.9))
What you could do instead of making the two separate data frames is filter inside each geom_col. This is generally my preference unless the filtering is more complex. This code produces the same plot as above.
ggplot(df_long, aes(x = measure, y = n)) +
geom_col(aes(fill = answer_option),
data = . %>% filter(time == "before"), width = 0.9,
position = position_dodge(width = 0.9)) +
geom_col(aes(group = answer_option),
data = . %>% filter(time == "after"), fill = "black", width = 0.4,
position = position_dodge(width = 0.9))