Possible to animate polygon fill using gganimate in R? - r

I have county level data recording the year an invasive insect pest was first detected in that county between 2002 and 2018. I created a map using ggplot2 and the maps package that fills the county polygons with a color according to the year the pest was detected.
**Is there a way to use the gganimate package to animate this map with the first frame filling in only polygons with a detection date of 2002, the second frame filling polygons with a detection date of 2003 or earlier (so 2002 and 2003), a third frame for detection dates of 2004 or earlier (2002, 2003, 2004), etc.?**
Clarification: I'd like it so all the county polygons are always visible and filled in with white initially and each frame of the animation adds fills in counties based on the year of detection.
I've tried using the transition_reveal(data$detect_year) with the static plot but get an error that "along data must either be integer, numeric, POSIXct, Date, difftime, orhms".
Here's some code for a reproducible example:
library(dplyr)
library(purrr)
library(maps)
library(ggplot2)
library(gganimate)
# Reproducible example
set.seed(42)
map_df <- map_data("county") %>%
filter(region == "minnesota")
# Add random detection year to each county
years <- 2002:2006
map_list <- split(map_df, f = map_df$subregion)
map_list <- map(map_list, function(.x) {
  # return the county's rows with one randomly drawn year attached
  mutate(.x, detection_years = sample(years, 1))
})
# collapse list back to data frame
map_df <- bind_rows(map_list)
map_df$detection_years <- as.factor(map_df$detection_years)
# Make plot
static_plot <- ggplot(map_df,
aes(x = long,
y = lat,
group = group)) +
geom_polygon(data = map_df, color = "black", aes(fill = detection_years)) +
scale_fill_manual(values = terrain.colors(n = length(unique(map_df$detection_years))),
name = "Year EAB First Detected") +
theme_void() +
coord_fixed(1.3)
animate_plot <- static_plot +
transition_reveal(detection_years)
If it's possible to do this with gganimate, I'd like to but I'm also open to other solutions if anyone has ideas.

After getting an answer from @RLave that almost did what I wanted and spending a little time with the documentation, I figured out a way to do what I want. It isn't very clean, but it works.
Essentially, I created a copy of my data frame for each year that needed a frame in the animation. In each copy, I edited the detection_year variable so that any county detected in that frame's year or earlier kept its value, and any county with no detection yet was set to a placeholder value (2001) that I plot as white. This ensures that every county is drawn in every frame. I then used transition_manual with a unique ID assigned to each copy of the data frame to set the order of the frames.
library(dplyr)
library(purrr)
library(maps)
library(ggplot2)
library(gganimate)
# Reproducible example
set.seed(42)
years <- 2002:2006
map_df <- map_data("county") %>%
filter(region == "minnesota")
map_df <- map_df %>%
group_by(subregion) %>%
mutate(detection_year = sample(years,1))
animate_data <- data.frame()
for(i in 2002:2006){
temp_dat <- map_df %>%
mutate(detection_year = as.numeric(as.character(detection_year))) %>%
mutate(detection_year = case_when(
detection_year <= i ~ detection_year,
detection_year > i ~ 2001
),
animate_id = i - 2001
)
animate_data <- bind_rows(animate_data, temp_dat)
}
animate_data$detection_year <- as.factor(as.character(animate_data$detection_year))
# Make plot
static_plot <- ggplot(animate_data,
aes(x = long,
y = lat,
group = group)) +
geom_polygon(data = animate_data, color = "black", aes(fill = detection_year)) +
scale_fill_manual(values = c("white",
terrain.colors(n = 5)),
name = "Year First Detected") +
theme_void() +
coord_fixed(1.3)
# To check each frame as a static panel, add: + facet_wrap(~animate_id)
animate_plot <- static_plot +
transition_manual(frames = animate_id)
animate_plot
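
The frame-building loop above can also be written without growing a data frame inside a for loop. The following is only a sketch on a toy stand-in for the county table: the names counties, frame_year and shown_year are made up for illustration, and counties not yet detected are marked NA instead of the 2001 placeholder.

```r
library(dplyr)

# Toy stand-in for the county data: one row per county (hypothetical values)
counties <- data.frame(
  subregion = c("a", "b", "c"),
  detection_year = c(2002, 2004, 2006)
)
years <- 2002:2006

# One copy of the data per frame; counties not yet detected in that frame get NA
frames <- bind_rows(lapply(years, function(yr) {
  counties %>%
    mutate(frame_year = yr,
           shown_year = ifelse(detection_year <= yr, detection_year, NA))
}))

frames
```

With the real polygon data, shown_year would be mapped to fill as a factor, scale_fill_manual(na.value = "white") would keep undetected counties white, and transition_manual(frame_year) would step through the frames.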

Possibly this, but I'm not sure that this is the expected output.
I changed your code, probably you don't need to split. I used group_by to assign a year to each region.
set.seed(42)
years <- 2002:2006
map_df <- map_data("county") %>%
filter(region == "minnesota")
map_df <- map_df %>%
group_by(subregion) %>%
mutate(detection_year = sample(years,1))
For the transition you need to define the id, here the same as the grouping (subregion or group) and a correct date format for the transition (along) variable (I used lubridate::year())
# Make plot
static_plot <- ggplot(map_df,
aes(x = long,
y = lat,
group = group)) +
geom_polygon(color = "black", aes(fill = as.factor(detection_year))) +
scale_fill_manual(values = terrain.colors(n = length(unique(map_df$detection_year))),
name = "Year EAB First Detected") +
theme_void() +
coord_fixed(1.3)
animate_plot <- static_plot +
transition_reveal(subregion, # same as the group variable
lubridate::year(paste0(detection_year, "-01-01"))) # move along years
Does this do it for you?
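
For context on the error in the question: detection_years was converted to a factor before being passed to transition_reveal, and a factor is none of the accepted types. A small base R sketch of converting a factor of years back to numbers (the values here are arbitrary):

```r
# A factor of years, as created by as.factor() in the question
f <- factor(c(2003, 2002, 2006))

# as.numeric() alone returns the internal level codes, not the years
codes <- as.numeric(f)

# Going through character first recovers the actual years
yrs <- as.numeric(as.character(f))

codes  # 2 1 3
yrs    # 2003 2002 2006
```

Keeping a numeric copy of the year for the transition and a factor copy for the fill scale avoids having to convert back and forth.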

Related

How to plot worldmap in R

I'm working on a project and a small part of it consists of drawing a world map with 43 countries on my list. My dataset is as follows:
How do I put this on the world map with different colors for development status as follows?
Data is here :
https://wetransfer.com/downloads/0960ed96fba15e9591a2e9c14ac852fa20220301181615/dc25f41a87fc2ba165a72ab6712dd8d020220301181640/832b5e
A quick example using fake data:
library(dplyr)
library(ggplot2)
# simulate data for subset of countries
mydata <- map_data("world") %>%
distinct(region) %>%
mutate(fakedata = runif(n())) %>%
slice_sample(n = 200)
# add simulated values and remove Antarctica
worldmap <- map_data("world") %>%
filter(region != "Antarctica") %>%
left_join(mydata)
ggplot(worldmap) +
geom_polygon(aes(long, lat, group = group, fill = fakedata)) +
coord_quickmap() +
scale_fill_viridis_c(option = "plasma", na.value = NA) +
theme_void()
Also look into the {sf} package and geom_sf(), which among other things makes it easier to use different / less distorted / less biased map projections.
Similar to @zephryl's answer, but using tmap. The first step is joining your data with the World data by country name. The next step is drawing the map.
library(dplyr)
library(tmap)
# Get World data
data("World")
# Dummy data frame similar to your data
df <- data.frame(location = World$name,
devStat = rnorm(length(World$name), 5, 2.5))
# Join by country name
# Just need to make sure that country names are written exactly the same
# in the two datasets
df2 <- World |>
left_join(df, by = c("name" = "location"))
# Create map
# Shape to add to the map
tm_shape(df2) +
# Draw the previous shape as polygons
# Set the attribute to which the polygons will be coloured
tm_polygons("devStat",
# select palette
palette = "-plasma",
# Set palette categories as order
style = "order",
# Horizontal legend
legend.is.portrait = FALSE) +
# Remove frame from layout
tm_layout(frame = FALSE,
# Put legend outside the frame
legend.outside = TRUE,
legend.outside.position = "top")

R - Having trouble making US heatmap of states

I am trying to make a US state heatmap of the Tidy Tuesday data this week. I am having a lot of trouble getting it to work and I am thinking this should only take a few lines of code.
Is USMAP the best way to do mapping like this in R?
Is there a ggplot way to do this instead of using the usmap package?
What am I doing wrong in my example?
library(usmap)
library(tidyverse)
nurses <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-10-05/nurses.csv')
plot_data <- nurses %>%
filter(Year == 2020) %>%
select(State, `Total Employed RN`) %>%
rename("Total_Employed" = 2)
# Works with no data
plot_usmap()
# Does not work with data ??
plot_usmap(data = plot_data,
values = Total_Employed,
color = "blue") +
scale_fill_continuous(low = "white",
high = "blue")
# The column is real and I can access it
plot_data %>%
mutate(test_column = (1 + Total_Employed))
# Trying to emulate an example from : https://cran.r-project.org/web/packages/usmap/vignettes/mapping.html
statepop <- statepop %>%
rename(State = 3)
merged_df <- left_join(plot_data, statepop, by = "State") %>%
select(fips, abbr, Total_Employed, State)
merged_df
# Still does not work even though data is in the same format
plot_usmap(data = merged_df,
values = Total_Employed,
color = "blue") +
scale_fill_continuous(low = "white",
high = "blue")
Various answers in this thread: https://twitter.com/Indescribled/status/1445234775858368514
My preferred one - https://twitter.com/ivelasq3/status/1445242672021606401
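
Since the linked answers live off-site, here is a minimal sketch of one likely fix, based on how the usmap documentation describes plot_usmap: the data needs a column named state or fips, and values takes the column name as a string. The numbers below are made up for illustration and are not the Tidy Tuesday data.

```r
library(usmap)
library(ggplot2)

# Made-up values for illustration only
plot_data <- data.frame(
  state = c("California", "Texas", "New York"),  # column must be named `state` (or `fips`)
  Total_Employed = c(300, 200, 100)
)

# `values` is the column name as a string, not a bare symbol
p <- plot_usmap(data = plot_data, values = "Total_Employed", color = "blue") +
  scale_fill_continuous(low = "white", high = "blue", na.value = "grey90")
p
```

With the question's data, that would mean renaming State to state (for example with rename(state = State)) and quoting "Total_Employed".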

how to loop a geographic mapping function over a list of dataframes (or a subsetted dataframe)

I have a dataframe consisting of species names and longitude and latitude coordinates. There are 115 different species with 25000 lat/long coordinates. I need to make individual maps that show the observations for each species.
First, I created a function called platmaps that generates the kind of map I want. When I call the function on my full dataset (platmaps(df1)), it creates a map displaying all lat/long observations.
Then I constructed a for loop which was supposed to subset my df by species name and pass that subsetted dataframe to my platmaps function. It runs for a couple of minutes and then nothing happens.
So then I split the dataframe by species name, created a list of dataframes (out1), and used lapply(out1, platmaps), but it only returned a list of the names of my dfs.
Then I tried a variation of an example that I saw here, but it also did not work.
function
platmaps<-function(df1){
  wm <- borders("world", colour="gray50", fill="gray50")
  ggplot()+
    coord_fixed()+
    wm +
    geom_point(data =df1 , aes(x = decimalLongitude, y = decimalLatitude),
               colour = "pink", size = 0.5)
}
subset
for(i in 1:nrow(PP)){
query<-paste(PP$species[i])
p<-subset(df1, df1$species== query)
platmaps(p)
}
list
for (i in 1:length(out1)){
pp<-out1[[i]]
platmaps(pp)
}
applied example
p =
wm <- wm <- borders("world", colour="gray50", fill="gray50")
ggplot()+
coord_fixed()+
wm +
geom_point(data =df1 , aes(x = decimalLongitude, y = decimalLatitude),
colour = "pink", size = 0.5)
plots = df1 %>%
group_by(species) %>%
do(plots = p %+% . + facet_wrap(~species))
the error for the applied example is:
Error: Cannot add ggproto objects together. Did you forget to add this
object to a ggplot object?
As I'm new to R (and coding), I assume I'm getting the syntax wrong, or am not applying my function correctly to/within either of my loops, or I fundamentally misunderstand the way looping works.
data frame sample
species decimalLongitude decimalLatitude
Platanthera lacera -71.90000 42.80000
Platanthera lacera -90.54861 40.12083
Platanthera lacera -71.00889 42.15500
Platanthera lacera -93.20833 45.20028
Platanthera lacera -72.45833 41.91666
Platanthera bifolia 5.19800 59.64310
Platanthera sparsiflora -117.67472 34.36278
fixed platmaps function
# s is the species name to plot; requires ggplot2 and dplyr
platmaps <- function(df1, s){
  ggplot(data=df1 %>% filter(species == s))+
    coord_fixed()+
    borders("world", colour="gray50", fill="gray50")+
    geom_point(aes(x = decimalLongitude, y = decimalLatitude),
               colour = "pink", size = 0.5)+
    labs(title=as.character(s))
}
Because you didn't provide a test data set, let me give you a general idea how to make multiple plots you can inspect later. The code below will plot a parameter for a number of countries and save plot pdfs to a given path. You can replace the code behind the pl variable in the loop with your function.
library(ggplot2)
library(dplyr)
df <- data.frame(country = c(rep('USA',20), rep('Canada',20), rep('Mexico',20)),
wave = c(1:20, 1:20, 1:20),
par = c(1:20 + 5*runif(20), 21:40 + 10*runif(20), 1:20 + 15*runif(20)))
countries <- unique(df$country)
plot_list <- list()
i <- 1
for (c in countries){
pl <- ggplot(data = df %>% filter(country == c)) +
geom_point(aes(wave, par), size = 3, color = 'red') +
labs(title = as.character(c), x = 'wave', y = 'value') +
theme_bw(base_size = 16)
plot_list[[i]] <- pl
i <- i + 1
}
# Set the page size when opening the device; pdf.options() only affects devices opened afterwards
pdf('path/to/pdf', width = 9, height = 7)
for (i in 1:length(plot_list)){
print(plot_list[[i]])
}
dev.off()
After the plots are obtained (the plot_list variable), we open the PDF device and print them to it. At the end, we close the device with dev.off().
There is a neat way to apply any function to a list of items. I have outlined a way to do this with the data you added. I could not get platmaps to work, so I have just made a scatter plot.
The method is to split your data frame into individual subsets using split() and then apply the plotting function to the resulting list using lapply(). Since lapply() returns a list, this can be passed directly to a function such as ggpubr::ggarrange() for visualizing.
library(ggplot2)
library(dplyr) # provides the %>% pipe used below
plot_function <- function(x){
p <- ggplot(x, aes(x = decimalLongitude, y = decimalLatitude)) + geom_point()
p
}
plot_list <-
df %>%
split(.$species) %>% # Separate df into subset dfs based on species column
lapply(., plot_function) # map plot_function to list
# Display on a grid (many ways to do this - I just find this package simple)
ggpubr::ggarrange(plotlist = plot_list)
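
One more point that may explain why the original for loop appeared to do nothing: ggplot objects are only drawn when they are printed, which happens automatically at the console but not inside a for loop, so a loop needs print() or ggsave(). Below is a sketch that writes one file per species using a few rows from the sample data in the question. The output directory (tempdir()) and the file naming are assumptions for illustration.

```r
library(ggplot2)

# A few rows from the sample data in the question
df1 <- data.frame(
  species = c("Platanthera lacera", "Platanthera lacera", "Platanthera bifolia"),
  decimalLongitude = c(-71.90000, -90.54861, 5.19800),
  decimalLatitude = c(42.80000, 40.12083, 59.64310)
)

# One plot per species, titled with the species name
plot_list <- lapply(split(df1, df1$species), function(d) {
  ggplot(d, aes(x = decimalLongitude, y = decimalLatitude)) +
    geom_point() +
    labs(title = d$species[1])
})

# Save each plot explicitly; nothing is drawn in a loop unless printed or saved
out_dir <- tempdir()
out_files <- file.path(out_dir, paste0(gsub(" ", "_", names(plot_list)), ".png"))
for (i in seq_along(plot_list)) {
  ggsave(out_files[i], plot = plot_list[[i]], width = 6, height = 4)
}
```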

R ggplot automatic recalculation with geom tile when subsetting

I am attempting to create heat maps with a large data set that has several factors. I'd like to get a birds eye view first, by plotting the heat map of all values and all factors. THEN, I'd like to subset the heat map plot by a variety of factors - but have ggplot2::geom_tile re-calculate the heat map so it plots the relative abundance based on whatever factors I've subsampled.
library(reshape2)
library(ggplot2)
library(dplyr)
#Test data
df <- data.frame(
Measurement = c(1:30),
CA = rep(rnorm(30, mean=20, sd=5)),
TX = rep(rnorm(30, mean=18, sd=5)),
NY = rep(rnorm(30, mean=34, sd=2))
)
df.melt <- melt(df,id = c("Measurement"))
Basic heat map plot code. My actual data includes several factors/columns from which I want to pull data for various comparisons.
#Basic plot
ggplot(data = df.melt,
aes(x = variable, y = Measurement, colors = value, fill = value)) +
geom_tile(color = "black") +
scale_fill_gradientn(colors = c("lightyellow", "darkred"))
I want the output colors to correspond to relative abundance by measurement. So I can look at Relative changes across CA, TX, and NY. This would be my "Base plot".
df.melt.reabun <- df.melt %>% group_by(Measurement) %>%
mutate(RelAbun = value/sum(value))
df.melt.reabun <- as.data.frame(df.melt.reabun)
#New plot with relative abundance
ggplot(data = df.melt.reabun,
aes(x = variable, y = Measurement, colors = RelAbun, fill = RelAbun)) +
geom_tile(color = "black") +
scale_fill_gradientn(colors = c("lightyellow", "darkred"))
What I also want to do is be able to re-plot however I want and the relative abundance to automatically calculate within ggplot tile.
#Assign plot object
heat <- ggplot(data = df.melt.reabun,
aes(x = variable, y = Measurement, colors = RelAbun, fill = RelAbun)) +
geom_tile(color = "black")+
scale_fill_gradientn(colors = c("lightyellow", "darkred"))
#Select variable to subset data
alt <- c("CA", "TX")
#Subset ggplot object
heat %+% subset(df.melt.reabun, variable %in% alt)
But this output is incorrect, because it is only showing relative abundance from the calculation that included CA, TX, and NY.
I want the relative abundance to re-calculate every time I subset the df to plot at this step: heat %+% subset()
I have a feeling I can smoothly combine group_by and geom_tile to do this automatically.. but I can't quite figure it out. Any help would be appreciated. I have MANY MANY combinations of heat maps I want to look at and I do NOT want to re-calculate the relative abundance "manually" each time.
It's generally advisable to do your data wrangling before passing the data frame to ggplot. In this case, something like the following could work:
subsetFun <- function(df, var.filter){
return(df %>%
filter(variable %in% var.filter) %>%
group_by(Measurement) %>%
mutate(RelAbun = value / sum(value)) %>%
ungroup())
}
heat %+% subsetFun(df.melt.reabun, alt)
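
As a quick check that the helper recomputes the shares on the subset only, here is a sketch with a tiny hand-made data frame (the values are arbitrary and chosen so the arithmetic is easy to follow):

```r
library(dplyr)

# Tiny long-format data: two measurements across three states
df.melt <- data.frame(
  Measurement = rep(1:2, times = 3),
  variable = rep(c("CA", "TX", "NY"), each = 2),
  value = c(10, 20, 30, 20, 60, 60)
)

# Same helper as above: filter first, then compute shares within each measurement
subsetFun <- function(df, var.filter) {
  df %>%
    filter(variable %in% var.filter) %>%
    group_by(Measurement) %>%
    mutate(RelAbun = value / sum(value)) %>%
    ungroup()
}

out <- subsetFun(df.melt, c("CA", "TX"))
out$RelAbun  # 0.25 0.50 0.75 0.50
```

Because NY is filtered out before the division, the shares within each measurement sum to 1 across CA and TX only.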

create a map with the adapted size of states

Hi visualization lovers,
I am trying to create a colored map plot like this one:
(source: https://github.com/hrbrmstr/albersusa)
But I want the map to be distorted so that the area of each state is proportional to a value I provide (in particular, GDP).
That is, I want some states to look bigger and some smaller than they are in reality, while still resembling the real USA map as much as possible.
It's fine if the states move or their shapes get distorted.
Any ideas? Any ready-made solutions?
I currently use R and the albersusa package because that is what I am familiar with, but I'm open to change!
My current code for the plot is:
gmap<-
ggplot() +
geom_map(data = counties@data, map = cmap,
aes(fill =atan(y/x),alpha=x+y, map_id = name),
color = "gray50") +
geom_map(data = smap, map = smap,
aes(x = long, y = lat, map_id = id),
color = "black", size = .5, fill = NA) +
theme_map(base_size = 12) +
theme(plot.title=element_text(size = 16, face="bold",margin=margin(b=10))) +
theme(plot.subtitle=element_text(size = 14, margin=margin(b=-20))) +
theme(plot.caption=element_text(size = 9, margin=margin(t=-15),hjust=0)) +
scale_fill_viridis()+guides(alpha=F,fill=F)
Here's a very ugly first try to get you started, using the outlines from the maps package and some data manipulation from dplyr.
library(maps)
library(dplyr)
library(ggplot2)
# Generate the base outlines
mapbase <- map_data("state.vbm")
# Load the centroids
data(state.vbm.center)
# Coerce the list to a dataframe, then add in state names
# Then generate some random value (or your variable of interest, like population)
# Then rescale that value to the range 0.25 to 0.95
df <- state.vbm.center %>% as.data.frame() %>%
mutate(region = unique(mapbase$region),
somevalue = rnorm(50),
scaling = scales::rescale(somevalue, to = c(0.25, 0.95)))
df
# Join your centers and data to the full state outlines
df2 <- df %>%
full_join(mapbase)
df2
# Within each state, scale the long and lat points to be closer
# to the centroid by the scaling factor
df3 <- df2 %>%
group_by(region) %>%
mutate(longscale = scaling*(long - x) + x,
latscale = scaling*(lat - y) + y)
df3
# Plot both the outlines for reference and the rescaled polygons
ggplot(df3, aes(long, lat, group = region, fill = somevalue)) +
geom_path() +
geom_polygon(aes(longscale, latscale)) +
coord_fixed() +
theme_void() +
scale_fill_viridis_c() # ggplot2's built-in viridis scale; no extra package needed
These outlines aren't the best, and the centroid positions they shrink toward cause the polygons to sometimes overlap the original state outline. But it's a start; you can find better shapes for US states and various centroid algorithms.
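
One detail worth noting if the goal is area proportional to the value: scaling coordinates toward the centroid by a factor s changes the area by s squared, so the linear factor should be the square root of the desired area share. A small base R sketch on a square (the shape and the 25% share are arbitrary choices for illustration):

```r
# A 2 x 2 square standing in for a state outline
square <- data.frame(long = c(0, 2, 2, 0), lat = c(0, 0, 2, 2))
cx <- mean(square$long)
cy <- mean(square$lat)

# Same transformation as in the answer: move each point toward the centre
shrink <- function(v, centre, s) s * (v - centre) + centre

area_share <- 0.25        # we want 25% of the original area
s <- sqrt(area_share)     # so the linear scaling factor is 0.5

scaled <- data.frame(
  long = shrink(square$long, cx, s),
  lat = shrink(square$lat, cy, s)
)
scaled
```

The scaled square has side 1 and area 1, which is 25% of the original area of 4.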
