gganimate: two layers with different geometries and timepoints

The problem is similar to this question, but here the two layers use different geometries, geom_tile and geom_point. The idea is to have tiles show up at different locations only in frames 2, 5, and 8, while the point moves along the diagonal in every frame.
When I try to run the following example, I get the error:
Error: time data must be the same class in all layers
Example
require(data.table)
require(ggplot2)
require(gganimate)
# 3 tiles along x = 10-y; present at time points 2, 5, 8
dtP1 = data.table(x = c(1, 5, 9),
y = c(9, 5, 1),
t = c(2, 5, 8))
# 9 points along x=y; present at every time point
dtP2 = data.table(x = 1:9,
y = 1:9,
t = 1:9)
p = ggplot() +
geom_tile(data = dtP1,
aes(x = x,
y = y),
color = "#000000") +
geom_point(data = dtP2,
aes(x = x,
y = y),
color = "#FF0000") +
gganimate::transition_time(t) +
gganimate::ease_aes('linear')
pAnim = gganimate::animate(p,
renderer = av_renderer("~/test.mp4"),
fps = 1,
nframes = 9,
height = 400, width = 400)

Does the following work for you?
library(dplyr)
p <- rbind(dtP1 %>% mutate(group = "group1"),
dtP2 %>% mutate(group = "group2")) %>%
tidyr::complete(t, group) %>%
ggplot(aes(x = x, y = y)) +
geom_tile(data = . %>% filter(group == "group1"),
color = "black") +
geom_point(data = . %>% filter(group == "group2"),
color = "red") +
ggtitle("{frame_time}") + # added this to show the frame explicitly; optional
transition_time(t) +
ease_aes('linear')
animate(p, nframes = 9, fps = 1)
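To see what the reshaping in the answer does, it can help to inspect the combined data before it reaches ggplot. The sketch below assumes the error comes from t being numeric in dtP1 but integer in dtP2 (an assumption, not confirmed in the thread). It uses plain data frames with the hypothetical names tiles and points in place of the data.tables, and bind_rows() in place of rbind().

```r
library(dplyr)
library(tidyr)

# Same values as dtP1 and dtP2 in the question
tiles  <- data.frame(x = c(1, 5, 9), y = c(9, 5, 1), t = c(2, 5, 8))
points <- data.frame(x = 1:9, y = 1:9, t = 1:9)

# Stacking the two tables coerces t to a single common type
combined <- bind_rows(
  mutate(tiles, group = "group1"),
  mutate(points, group = "group2")
)
class(combined$t)

# complete() adds a row for every (t, group) combination;
# x and y are NA where a group has no observation at that time
completed <- complete(combined, t, group)
nrow(completed)
sum(is.na(completed$x))
```

With one shared data frame, both layers read the same t column, so the class mismatch between layers should no longer arise.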

Related

Why geom_line legend if show.legend = FALSE and why different colours

After executing the code here below, I was wondering:
1- Why the "A.line" and "B.line" variables appear in the geom_point() legend.
2- Why there are four colors in the legend.
I guess both answers are related, but I cannot tell what is going on.
I would like to have the legend just with "A.points" and "B.points".
I would also like the same colors in both lines and points (I guess this I can do manually).
Thanks in advance for your help.
Best,
David
data.frame(x = rep(1:2,2),
names.points = rep(c("A.point","B.point"), 2),
y.point = c(2, 4, 7, 9),
names.lines = rep(c("A.line","B.line"), each = 2),
y.line = c(3, 3, 8, 8)) %>%
ggplot() +
geom_point(aes(x = x, y = y.point, group = names.points, colour = names.points), size = 5) +
geom_line(aes(x = x, y = y.line, group = names.lines, colour = names.lines), show.legend = FALSE)
Legends are not tied to geoms but to scales: they display the categories (or the range of values) mapped to an aesthetic. Hence, you get four colors because you have four categories mapped to the color aesthetic. The geoms only affect how each legend key is drawn, via the so-called key glyph, which is a point for geom_point and a line for geom_line. So show.legend = FALSE only means that the key glyph for geom_line is not drawn in the legend keys, i.e. each legend key shows a point but no line.
To remove the categories related to the lines from your legend, use e.g. the breaks argument of scale_color_discrete instead.
library(ggplot2)
library(dplyr)
data.frame(
x = rep(1:2, 2),
names.points = rep(c("A.point", "B.point"), 2),
y.point = c(2, 4, 7, 9),
names.lines = rep(c("A.line", "B.line"), each = 2),
y.line = c(3, 3, 8, 8)
) %>%
ggplot() +
geom_point(aes(x = x, y = y.point, group = names.points, colour = names.points), size = 5) +
geom_line(aes(x = x, y = y.line, group = names.lines, colour = names.lines), show.legend = FALSE) +
scale_color_discrete(breaks = c("A.point", "B.point"))
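As a rough check of the explanation above, one can look at the colours that the shared colour scale assigns to each layer. This is only a sketch: it rebuilds the question's plot under the hypothetical names df and p and uses layer_data() to read the computed colours.

```r
library(ggplot2)

df <- data.frame(
  x = rep(1:2, 2),
  names.points = rep(c("A.point", "B.point"), 2),
  y.point = c(2, 4, 7, 9),
  names.lines = rep(c("A.line", "B.line"), each = 2),
  y.line = c(3, 3, 8, 8)
)

p <- ggplot(df) +
  geom_point(aes(x = x, y = y.point, group = names.points, colour = names.points), size = 5) +
  geom_line(aes(x = x, y = y.line, group = names.lines, colour = names.lines), show.legend = FALSE)

# Colours computed for the point layer (1) and the line layer (2)
point_colours <- unique(layer_data(p, 1)$colour)
line_colours  <- unique(layer_data(p, 2)$colour)

# Both layers draw from one colour scale with four categories
all_colours <- union(point_colours, line_colours)
length(all_colours)
```

The four distinct values show that the line categories remain part of the scale even though show.legend = FALSE hides their key glyph.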
UPDATE: To give lines and points the same colors, you could use a named color vector:
pal_col <- rep(c("darkblue","darkred"), 2)
names(pal_col) <- c("A.point", "B.point", "A.line", "B.line")
data.frame(
x = rep(1:2, 2),
names.points = rep(c("A.point", "B.point"), 2),
y.point = c(2, 4, 7, 9),
names.lines = rep(c("A.line", "B.line"), each = 2),
y.line = c(3, 3, 8, 8)
) %>%
ggplot() +
geom_point(aes(x = x, y = y.point, group = names.points, colour = names.points), size = 5) +
geom_line(aes(x = x, y = y.line, group = names.lines, colour = names.lines), show.legend = FALSE) +
scale_color_manual(breaks = c("A.point", "B.point"),
values = pal_col)

ggplot2: scale_fill_manual with symbol and shape

How do I assign circle and period shapes in ggplot2? Right now I can do two shapes or two symbols but not both.
data.frame(
x = rnorm(10),
y = rnorm(10),
group = gl(2, 5, labels = c("circle", "period"))
) %>%
ggplot(aes(x = x, y = y, shape = group)) +
geom_point(size = 4) +
# scale_shape_manual(values = c("a", ".")) + # okay
# scale_shape_manual(values = c("circle", "square")) + # okay
scale_shape_manual(values = c("circle", ".")) # error
Based on this comment, I don't think you can mix symbols and number references. But the same comment shows all possible shapes and their corresponding numbers: the period is 46 and the open circle is 1.
Using the numbers for the shapes you want:
data.frame(
x = rnorm(10),
y = rnorm(10),
group = gl(2, 5, labels = c("circle", "period"))
) %>%
ggplot(aes(x = x, y = y, shape = group)) +
geom_point(size = 4) +
scale_shape_manual(values = c(1, 46))
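The number 46 is not arbitrary. In base R graphics, pch values 32 to 127 draw the corresponding ASCII character, and 46 is the code of the period. The short sketch below looks the code up instead of hard-coding it; the names period_code and shape_values are hypothetical.

```r
# ASCII code of the period character
period_code <- utf8ToInt(".")
period_code

# Going the other way recovers the character
intToUtf8(46)

# A numeric shape vector that could be passed to scale_shape_manual(values = ...)
shape_values <- c(circle = 1, period = period_code)
shape_values
```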
I like using a named vector to define the manual shapes for each group.
plotting_shapes <- c("group_1" = 1,
"group_2" = 46)
data.frame(
x = rnorm(10),
y = rnorm(10),
group = gl(2, 5, labels = c("group_1", "group_2"))
) %>%
ggplot(aes(x = x, y = y, shape = group)) +
geom_point(size = 4) +
scale_shape_manual(values = plotting_shapes)
And so if you really wanted to use "." to refer to shape 46, then you could achieve this through the following:
plotting_shapes <- c("circle" = 1,
"." = 46)
data.frame(
x = rnorm(10),
y = rnorm(10),
group = gl(2, 5, labels = c("group_1", "group_2"))
) %>%
ggplot(aes(x = x, y = y, shape = group)) +
geom_point(size = 4) +
scale_shape_manual(values = plotting_shapes[c("circle", ".")] %>%
unname()) # unname() is required because
# the names don't correspond to
# the group names "group_1" and "group_2"

Visualizing Vargha and Delaney's A with ggplot

I would like to visualize Vargha & Delaney's A in ggplot for educational purposes.
A is an effect size used to compare ordinal data from two groups. It depends on how each data point compares (upward, downward, or equal) to every data point of the other group.
For this, I would like to be able to show all upward, downward, and equal comparisons of data points in different colors. For an example of what I'm looking for, check out this rough scribble
For reproducibility's sake here is some data to try it with:
library(tidyverse)
data_VD <- tibble(
A = c(1, 2, 3, 6),
B = c(1, 3, 7, 9)
)
For reference to how A is calculated, see https://journals.sagepub.com/doi/10.3102/10769986025002101, though it shouldn't be necessary for creating the plot.
You could do:
library(tidyverse)
long_dat <- data_VD %>%
{expand.grid(A = .$A, B = .$B)} %>%
mutate(change = factor(sign(B - A)))
ggplot(pivot_longer(data_VD, everything()), aes(x = name, y = value)) +
geom_segment(data = long_dat, size = 1.5,
aes(x = 'A', xend = 'B', y = A, yend = B, color = change)) +
geom_point(size = 4) +
scale_color_manual(values = c('#ed1e26', '#fff205', '#26b24f')) +
theme_classic(base_size = 20) +
scale_y_continuous(breaks = 1:10) +
labs(x = '', y = '') +
theme(legend.position = 'none')
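The same pairwise comparisons that the segments show can also be counted to obtain a number. The sketch below assumes the usual definition A = P(A > B) + 0.5 * P(A = B), estimated over all pairs; the direction of the comparison and the names pairs, n_greater, n_equal and a_stat are assumptions for illustration.

```r
data_VD <- data.frame(
  A = c(1, 2, 3, 6),
  B = c(1, 3, 7, 9)
)

# Every value of A paired with every value of B (16 pairs)
pairs <- expand.grid(A = data_VD$A, B = data_VD$B)

n_greater <- sum(pairs$A > pairs$B)   # pairs where the A value is larger
n_equal   <- sum(pairs$A == pairs$B)  # tied pairs count half

a_stat <- (n_greater + 0.5 * n_equal) / nrow(pairs)
a_stat
# 0.3125
```

Here 4 pairs have A larger, 2 are ties, and 10 have B larger, which matches the three segment colors in the plot.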

2D summary plot with counts as labels

I have measurements of a quantity (value) at specific points (lon and lat), like the example data below:
library(ggplot2)
set.seed(1)
dat <- data.frame(lon = runif(1000, 1, 15),
lat = runif(1000, 40, 60),
value = rnorm(1000))
I want to make a 2D summary (e.g. mean) of the measured values with color in space and on top of that I want to show the counts as labels.
I can plot the summary and the labels separately:
## Left plot
ggplot(dat) +
aes(x = lon, y = lat, z = value) +
stat_summary_hex(bins = 5, fun = "mean", geom = "hex")
## Right plot
ggplot(dat) +
aes(x = lon, y = lat, z = value) +
stat_binhex(aes(label = ..count..), bins = 5, geom = "text")
But when I combine both, I lose the summary:
ggplot(dat) +
aes(x = lon, y = lat, z = value) +
stat_summary_hex(bins = 5, fun = "mean", geom = "hex") +
stat_binhex(aes(label = ..count..), bins = 5, geom = "text")
I can achieve the opposite, counts as color and summary as labels:
ggplot(dat, aes(lon, lat, z = value)) +
geom_hex(bins = 5) +
stat_summary_hex(aes(label=..value..), bins = 5,
fun = function(x) round(mean(x), 3),
geom = "text")
While writing the question, which took some hours of testing, I found a solution: adding fill = NULL or fill = mean(value) to the aesthetics of the text layer gives me what I want. The code and the resulting plots are below; the only difference is the legend title.
But it feels very hacky, so I would appreciate a better solution.
ggplot(dat) +
aes(x = lon, y = lat, z = value) +
stat_summary_hex(bins = 5, fun = "mean", geom = "hex") +
stat_binhex(aes(label = ..count.., fill = NULL), bins = 5, geom = "text") +
theme_bw()
ggplot(dat) +
aes(x = lon, y = lat, z = value) +
stat_summary_hex(bins = 5, fun = "mean", geom = "hex") +
stat_binhex(aes(label = ..count.., fill = mean(value)), bins = 5, geom = "text") +
theme_bw()
I propose a completely different approach to this problem. However, it needs to be clarified a bit first. You write "I have measurements of a quantity (value) at specific points (lon and lat)" but you do not specify these points exactly. Your data (generated) contains 1000 lon points and the same number of lat points.
Anyway, see for yourself.
library(tidyverse)
set.seed(1)
dat <-
tibble(
lon = runif(1000, 1, 15),
lat = runif(1000, 40, 60),
value = rnorm(1000)
)
dat %>% distinct(lon) %>% nrow() #1000
dat %>% distinct(lat) %>% nrow() #1000
My guess is that for real data you have a much smaller set of values for lon and lat.
Let me round the coordinates to a grid with a spacing of 2.
grid = 2
# keep the summarised table so it can be plotted below
datg = dat %>% mutate(
lon = round(lon/grid)*grid,
lat = round(lat/grid)*grid,
) %>%
group_by(lon, lat) %>%
summarise(
mean = mean(value),
label = n()
)
datg
As you can see, after rounding, the data was grouped by these two variables, and then I calculated the statistics you are interested in (mean and number of observations).
Also note that these statistics are generated at the intersections of lon and lat, so we have a square grid. In your solution, this is not the case at all: you are not getting the number of observations at these points, and your grid is not square.
So let's make a graph.
datg %>% ggplot(aes(lon,lat,z=mean)) +
geom_contour_filled(binwidth = 0.25) +
geom_text(aes(label = label)) +
theme_bw()
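The expression round(lon/grid)*grid used above snaps each coordinate to the nearest multiple of grid. A tiny sketch with made-up longitudes (the values and the name snapped are hypothetical):

```r
grid <- 2
lon <- c(1.2, 2.9, 3.1, 6.8)

# Divide by the spacing, round to the nearest integer, scale back
snapped <- round(lon / grid) * grid
snapped
# 2 2 4 6
```

Points that land on the same multiple end up in the same group, which is what group_by(lon, lat) relies on.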
Nothing stands in the way of increasing your grid a bit, let's say 4.
grid = 4
datg = dat %>% mutate(
lon = round(lon/grid)*grid,
lat = round(lat/grid)*grid,
) %>%
group_by(lon, lat) %>%
summarise(
mean = mean(value),
label = n()
)
datg %>% ggplot(aes(lon,lat,z=mean)) +
geom_contour_filled(binwidth = 0.25) +
geom_text(aes(label = label)) +
theme_bw()
Using such a solution, we can easily supplement the labels in the points of interest to us, e.g. with the average value. This time we will use grid = 1.5.
grid = 1.5
datg = dat %>% mutate(
lon = round(lon/grid)*grid,
lat = round(lat/grid)*grid,
) %>%
group_by(lon, lat) %>%
summarise(
mean = mean(value),
label = n(),
lab2 = paste0("(", round(mean, 2), ")")
)
datg %>% ggplot(aes(lon,lat,z=mean)) +
geom_contour_filled(binwidth = 0.25) +
geom_text(aes(label = label)) +
geom_text(aes(label = lab2), nudge_y = -.5, size = 3) +
theme_bw()
Hope this solution fits your needs much better than the stat_binhex based solution.
The problem here is that both layers share the same fill scale.
Because the two ranges are very different (0 to 40 for the counts versus -1.5 to 0.5 for the means), the larger range dominates, and the values of the smaller range all appear in (almost) the same color.
This is why displaying the count as color works, but the opposite doesn't seem to.
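A quick calculation illustrates how little of a shared scale the means would occupy. The ranges are the approximate ones quoted above, and to_unit is a hypothetical helper for linear rescaling, not a ggplot2 function.

```r
count_range <- c(0, 40)      # approximate range of the counts
mean_range  <- c(-1.5, 0.5)  # approximate range of the means

# Limits of a scale trained on both sets of values
shared_limits <- range(c(count_range, mean_range))

# Linear rescaling of values onto [0, 1] for given limits
to_unit <- function(x, limits) (x - limits[1]) / (limits[2] - limits[1])

# Fraction of the colour gradient covered by the means
mean_span <- diff(to_unit(mean_range, shared_limits))
mean_span
```

The means cover less than 5% of the gradient, so the hexagons look nearly uniform.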
As an illustration, if you rescale the mean calculation, the color variations become visible:
rescaled_mean <- function(x) mean(x)*40
ggplot(dat) +
aes(x = lon, y = lat, z = value) +
stat_summary_hex(bins = 5, fun = "rescaled_mean", geom = "hex")+
stat_binhex(aes(label = ..count..), bins = 5, geom = "text") +
theme_bw()
To be fair, I find this very strange behaviour. I like your solution though: I really don't find it very hacky to add fill = NULL. On the contrary, I find it very elegant. Here is a more hacky approach that gives basically the same result, but with one more line. It uses ggnewscale.
library(ggplot2)
set.seed(1)
dat <- data.frame(lon = runif(1000, 1, 15),
lat = runif(1000, 40, 60),
value = rnorm(1000))
ggplot(dat) +
aes(x = lon, y = lat,z = value) +
stat_summary_hex(bins = 5, fun = "mean", geom = "hex") +
ggnewscale::new_scale_fill() +
stat_binhex(aes(label = ..count..), bins = 5, geom = "text")
Created on 2022-02-17 by the reprex package (v2.0.1)

Add count as label to points in geom_count

I used geom_count to visualise overlaying points as sized groups, but I also want to add the actual count as a label to the plotted points, like this:
However, to achieve this, I had to create a new data frame containing the counts and use these data in geom_text as shown here:
#Creating two data frames
data <- data.frame(x = c(2, 2, 2, 2, 3, 3, 3, 3, 3, 4),
y = c(1, 2, 2, 2, 2, 2, 3, 3, 3, 3),
id = c("a", "b", "b", "b", "c",
"c", "d", "d", "d", "e"))
data2 <- data %>%
group_by(id) %>%
summarise(x = mean(x), y = mean(y), count = n())
# Creating the plot
ggplot(data = data, aes(x = x, y = y)) +
geom_count() +
scale_size_continuous(range = c(10, 15)) +
geom_text(data = data2,
aes(x = x, y = y, label = count),
color = "#ffffff")
Is there any way to achieve this in a more elegant way (i.e. without the need for the second data frame)? I know that you can access the count in geom_count using ..n.., yet if I try to access this in geom_text, this is not working.
Are you expecting this:
ggplot(data %>%
group_by(id) %>%
summarise(x = mean(x), y = mean(y), count = n()),
aes(x = x, y = y)) + geom_point(aes(size = count)) +
scale_size_continuous(range = c(10, 15)) +
geom_text(aes(label = count),
color = "#ffffff")
update:
If using geom_count is a must, then the expected output can be achieved with:
p <- ggplot(data = data, aes(x = x, y = y)) +
geom_count() + scale_size_continuous(range = c(10, 15))
p + geom_text(data = ggplot_build(p)$data[[1]],
aes(x, y, label = n), color = "#ffffff")
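To see where the label column n comes from, one can inspect the data that geom_count computes for its layer. This sketch uses layer_data(p, 1), which, as far as I know, returns the same table as ggplot_build(p)$data[[1]]; the name counts is hypothetical.

```r
library(ggplot2)

data <- data.frame(x = c(2, 2, 2, 2, 3, 3, 3, 3, 3, 4),
                   y = c(1, 2, 2, 2, 2, 2, 3, 3, 3, 3))

p <- ggplot(data = data, aes(x = x, y = y)) +
  geom_count()

# One row per distinct (x, y) location, with the count in column n
counts <- layer_data(p, 1)
counts[, c("x", "y", "n")]
```

There is one row per plotted circle, so passing this table to geom_text places each count on its circle.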
Here is a solution for a plot with discrete axes:
f <- ggplot(data = STest, aes(x = x, y = y)) +
geom_count() +
scale_x_discrete(labels = c("strong decrease", "decrease", "no change", "increase", "strong increase", "no opinion")) +
scale_y_discrete(labels = c("strong decrease", "decrease", "no change", "increase", "strong increase", "no opinion"))
# build the layer data from f itself, not from another plot object
f + geom_text(data = ggplot_build(f)$data[[1]], aes(x, y, label = n), vjust = -2)
Thank you so much!
A much easier way to change this is to use the labs() function so in this case it would be ...labs(size = "Count") + ....
That should be all you need.
