R: gganimate with geom_density

I am trying to create an animated graph with gganimate. My default static graph looks something like this:
But instead of 3 grouping variables I have 12 (year 0, year 1, year 2, etc.). Instead of plotting all 12 variables together I would like to animate it. To avoid:
Those kernel densities plots are made as follows:
data_decidious %>% tidyr::gather("YEAR", "NDVI", colsPostNDVI) %>%
mutate(YEAR = str_remove(YEAR, 'meanNDVIyear')) %>% mutate(YEAR = str_remove(YEAR, 'meanprefire_NDVI')) %>% mutate(YEAR = as.factor(YEAR)) %>%
ggplot(aes(NDVI,fill=YEAR)) + geom_density(alpha=.2) + xlim(0.3, 0.7) + ylim(0,46) +
xlab("Mean NDVI") + ylab("Kernel density") + guides(fill=guide_legend(title="Comparative"))
I have found that this geom_density() only works when I add mutate(YEAR = as.factor(YEAR)). That means when I add:
transition_time(YEAR) + ease_aes('linear')
I get the error:
Error: time data must either be integer, numeric, POSIXct, Date, difftime, orhms
In addition: Warning message:
In min(cl[cl != 0]) : no non-missing arguments to min; returning Inf
Any idea to animate my graph?

Converting YEAR to a factor is not necessary. Instead, map factor(YEAR) to the fill aesthetic and keep YEAR itself numeric. That way YEAR can still be passed to transition_time().
Using the gapminder::gapminder dataset as example data the following code plots and animates the density of worldwide life-expectancy over time.
(BTW: Instead of using a categorical color scale you can map YEAR directly on fill to get a continuous color scale. However, in this case you have to map YEAR also on the group aesthetic):
library(ggplot2)
library(dplyr)
library(gganimate)
p <- gapminder::gapminder %>%
ggplot(aes(lifeExp, fill = factor(year))) +
geom_density(alpha=.2) +
xlab("Life Expectancy") +
ylab("Kernel density") +
guides(fill = guide_legend(title = "Year"))
p +
transition_time(year) +
ease_aes('linear')
Created on 2020-04-17 by the reprex package (v0.3.0)
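The continuous-fill variant mentioned in the note above could look roughly like the following sketch. It uses a small synthetic data frame (toy, with made-up values and placeholder column names) instead of gapminder, so it only needs ggplot2.

```r
library(ggplot2)

# Synthetic stand-in data: three "years" with slightly shifted distributions.
set.seed(1)
toy <- data.frame(
  year  = rep(c(2000L, 2005L, 2010L), each = 50),
  value = rnorm(150, mean = rep(c(50, 55, 60), each = 50), sd = 3)
)

# A numeric fill gives a continuous colour scale, but ggplot2 then no longer
# splits the data by year on its own, so group is mapped explicitly.
p_cont <- ggplot(toy, aes(value, fill = year, group = year)) +
  geom_density(alpha = 0.2)

# With gganimate loaded, the transition could be added as in the answer:
# p_cont + transition_time(year) + ease_aes("linear")
```

Without the group mapping, all rows would be pooled into a single density curve, which is why the note asks for both aesthetics.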
Edit:
As far as I can tell without having seen your dataset, you have to adapt your code as follows. From inspecting your code, I guess that YEAR is a character, so you have to convert it to an integer:
data_long <- data_decidious %>%
tidyr::gather("YEAR", "NDVI", colsPostNDVI) %>%
mutate(YEAR = str_remove(YEAR, 'meanNDVIyear')) %>%
mutate(YEAR = str_remove(YEAR, 'meanprefire_NDVI')) %>%
# Convert YEAR to integer
mutate(YEAR = as.integer(YEAR))
p <- data_long %>%
ggplot(aes(NDVI,fill=factor(YEAR))) +
geom_density(alpha=.2) +
xlim(0.3, 0.7) +
ylim(0,46) +
xlab("Mean NDVI") +
ylab("Kernel density") +
guides(fill=guide_legend(title="Comparative"))
p +
transition_time(YEAR) +
ease_aes('linear')
anim_save("test.gif")
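One detail worth checking in the conversion step: if a column name is removed entirely by the string replacement, the result is an empty string, and as.integer("") yields NA. The sketch below illustrates this in base R. The column names are hypothetical (the real names in data_decidious are not shown), and treating the pre-fire column as year 0 is only an assumption.

```r
# Hypothetical column names; the real ones are not shown in the question.
cols <- c("meanprefire_NDVI", "meanNDVIyear1", "meanNDVIyear2")

# Same idea as the two str_remove() calls, written with base R sub().
year_chr <- sub("meanNDVIyear", "", cols, fixed = TRUE)
year_chr <- sub("meanprefire_NDVI", "", year_chr, fixed = TRUE)

# The pre-fire name is now "", which converts to NA rather than a number.
year_int <- as.integer(year_chr)

# Assumption for illustration: treat the pre-fire measurement as year 0.
year_int[is.na(year_int)] <- 0L
print(year_int)
```

Rows whose YEAR is NA would otherwise have no valid time value for transition_time(), so it seems safer to decide explicitly how the pre-fire column should be coded.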

Related

ggplot: No legend when using scale_fill_brewer with geom_contour_filled

I've plotted a specific set of meteorological data using ggplot as described in the R code below. However, when I use scale_fill_brewer to specify the fill color, a legend does not appear.
What changes are necessary for the legend to appear?
library(tidyverse)
library(lubridate)
library(ggplot2)
library(RColorBrewer)
qurl <- "https://www.geo.fu-berlin.de/met/ag/strat/produkte/qbo/singapore.dat"
sing <- read_table(qurl, skip=4)
# The data file adds a 100mb data row starting in 1997, increasing the number of rows per year from
# 14 to 15. So, one calculation must be applied to rnum <= 140 and a different one to rnum > 140.
sing2 <- sing %>% separate(1,into=c('hpa','JAN'),sep='\\s+') %>% drop_na() %>%
subset(hpa != 'hPa') %>%
mutate(rnum = row_number(),
hpa=as.integer(hpa)) %>%
mutate(year = case_when(rnum <=140 ~ 1987 + floor(rnum/14), # the last year with 14 rows of data
rnum >=141 ~ 1987 + floor(rnum+10/15))) %>% # the first year with 15 rows of data
relocate(year, .before='hpa') %>% arrange(year,hpa) %>%
pivot_longer(cols=3:14, names_to='month',values_to='qbo') %>%
mutate(date=ymd(paste0(year,'-',month,'-15')),
hpa=as.integer(hpa),
qbo=as.numeric(qbo))
sing2 <- sing %>% separate(1,into=c('hpa','JAN'),sep='\\s+') %>% drop_na() %>%
subset(hpa != 'hPa') %>%
mutate(year=1987+floor(row_number()/15),
hpa=as.integer(hpa)) %>%
relocate(year, .before='hpa') %>% arrange(year,hpa) %>%
pivot_longer(cols=2:13, names_to='month',values_to='qbo') %>%
mutate(date=ymd(paste0(year,'-',month,'-15')),
hpa=as.integer(hpa),
qbo=as.numeric(qbo))
# End Data Massaging. It's ready to be graphed
# A simple call to ggplot with geom_contour_filled generates a legend
sing2 %>%
ggplot(aes(x=date,y=hpa)) +
geom_contour_filled(aes(z=qbo*0.1)) +
scale_y_reverse()
# Adding scale_fill_brewer removes the legend.
# Adding show.legend = TRUE to the geom_contour_filled options has no effect.
limits = c(-1,1)*max(abs(sing2$qbo),na.rm=TRUE)
zCuts <- round(seq(limits[1], limits[2], length.out = 11), digits=0)
sing2 %>%
ggplot() +
geom_contour_filled(aes(x=date,y=hpa, z = qbo*0.1),breaks=zCuts*0.1) +
scale_y_reverse(expand=c(0,0)) +
scale_x_date(expand=c(0,0), date_breaks = '1 year', date_labels = '%Y') +
scale_fill_brewer(palette = 5,type='div',breaks=zCuts) +
theme_bw() +
theme(legend.position = 'right',
axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
OP, I don't have a direct answer for you, since your example can't be reproduced (I was unable to access the data you linked). Instead, I can offer some debugging advice, since the issue seems to be related to the breaks= argument of scale_fill_brewer(). As you mention, you get a legend when using geom_contour_filled() alone, but not when you add the scale_fill_brewer() part.
Let me use the example from the documentation for geom_contour_filled(), which uses the built-in dataset faithfuld, to illustrate this behavior.
I'll add your palette and type choices, leaving out the breaks argument for now:
v <- ggplot(faithfuld, aes(waiting, eruptions, z = density))
v + geom_contour_filled() +
scale_fill_brewer(palette = 5, type='div')
If you do the same thing, but add in a "nonsensical" breaks argument, you get the same plot, but without a legend (like you are seeing):
v + geom_contour_filled() +
scale_fill_brewer(palette = 5, type='div', breaks=1:4)
For me, this is good evidence that the issue in your code is that the values passed to breaks= are not within the range the scale expects. Is this just a typo? Note that you use breaks=zCuts in scale_fill_brewer(), yet breaks=zCuts*0.1 in geom_contour_filled(). This makes each break on your color scale 10 times larger than the corresponding break for the contours themselves. I'd be willing to bet that this change to the scale_fill_brewer() line will do the trick:
# earlier plot code
... +
scale_fill_brewer(palette = 5,type='div',breaks=zCuts*0.1) +
...
# remaining plot code
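To debug along these lines, it can help to look at the values the fill scale actually receives. The sketch below reuses the faithfuld example and inspects the computed level variable with layer_data(). As far as I understand, a discrete scale matches its breaks against these labels, so comparing them with whatever is passed to breaks= is a quick sanity check. This is an illustration only, not a tested fix for the original data.

```r
library(ggplot2)

# Same documentation example as above, without any fill scale added.
v <- ggplot(faithfuld, aes(waiting, eruptions, z = density)) +
  geom_contour_filled()

# geom_contour_filled() fills by the computed variable `level`, an ordered
# factor whose labels describe the contour bands.
fill_levels <- levels(layer_data(v, 1)$level)
print(fill_levels)
```

If none of the supplied breaks appear in this vector, the legend has nothing to show.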

Why can't I get the right horizontal axis labels on my ggplot2 chart?

I am trying to do a faceted plot of a grouped dataframe with ggplot2, using geom_line(). My dataframe has a Date column and I would like to have dates on the horizontal axis. If I just use Date in aes(x=Date, ...) I get nice labels on the horizontal axis. However, the line has an almost horizontal section where the date jumps from the end of one group to the beginning of the next group. The following code and chart show this:
dts <- seq.Date(as.Date("2020-01-01"), as.Date("2021-12-31"), by="day")
mos <- sapply(dts, month)
df <- data.frame(Date=dts, Month=mos)
nr <- nrow(df)
df$X <- rep(1, nr)
df %>%
group_by(Month) -> dfgrp
dfgrp %>%
group_by(Month) %>%
mutate(Time = Date[1:n()],
Z = cumsum(X)) %>%
ggplot(aes(x=Date, y=Z)) +
geom_line(color="darkgreen", size=0.5) +
facet_grid(. ~ Month, scale="free_x") +
theme(axis.text.x = element_text(angle=45, size=7))
I would not like my chart to have those almost-horizontal lines when the date changes by a large amount. I was able to generate a chart without those lines using integers on aes() as follows:
dfgrp %>%
mutate(Time = 1:n() %>% as.integer(),
Z = cumsum(X)) %>%
ggplot(aes(x=Time, y=Z)) +
geom_line(color="darkgreen", size=0.5) +
facet_grid(. ~ Month, scale="free_x") +
scale_x_continuous(breaks = seq(from=1, to=nr, by=10) %>% as.integer(),
labels = function(x) as.character(dfgrp$Date[x])) +
theme(axis.text.x = element_text(angle=45, size=7))
The line on the chart looks like I want it but the dates on the horizontal axis are not correct: they end in February 2020 in every facet while the dates in the dataframe end in December 2021 and the dates in the first chart begin and end on different months in different facets.
I tried many things but nothing worked. Any suggestions on how to have a chart with dates like in the first chart above and lines like in the second chart above?
Help will be much appreciated.
You may want to shift all dates into the same year while keeping the original year as a separate variable:
library(lubridate)
dfgrp %>%
group_by(Month) %>%
mutate(year = year(Date),
adj_date = ymd(paste(2020, month(Date), day(Date)))) %>%
# 2020 was leap year so 2/29 won't be lost
mutate(Time = Date[1:n()],
Z = cumsum(X)) %>%
ggplot(aes(x=adj_date, y=Z, color = year, group = year)) +
geom_line(size=0.5) +
facet_grid(. ~ Month, scale="free_x") +
theme(axis.text.x = element_text(angle=45, size=7))
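The same-year adjustment can be sketched on a few dates in isolation. This version uses base R format() rather than the lubridate call in the answer, purely to keep the example dependency-free; the sample dates are made up.

```r
# A few made-up dates spanning two years, including a leap day.
dates <- as.Date(c("2020-02-29", "2021-03-15", "2021-12-31"))

# Keep the original year as its own variable (used for color/group above).
orig_year <- as.integer(format(dates, "%Y"))

# Rewrite every date into 2020, keeping month and day.
# 2020 is a leap year, so 29 February remains a valid date.
adj_date <- as.Date(format(dates, "2020-%m-%d"))

print(data.frame(dates, orig_year, adj_date))
```

Because every facet then shares one calendar year on the x-axis, each original year is drawn as its own line and nothing connects December of one year to January of the next.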

How to Combine 2 Line Graphs Together

I'm new to using R so please bear with me as my code might not look the best. So I want to combine these two line graphs together since right now I have written code for each item that I am analyzing. This is the dataset I am using: https://github.com/rfordatascience/tidytuesday/blob/master/data/2020/2020-09-01/readme.md I used the "Arable_Land" dataset!
##USA Arable Land
plot_arable_land_USA <- arable_land %>%
filter(Code == "USA") %>%
select(c(Year, Code, `Arable land needed to produce a fixed quantity of crops ((1.0 = 1961))`)) %>%
pivot_longer(-c(Year, Code)) %>%
ggplot(aes(x = Year, y = value,color=name,group=name)) +
geom_line() +
facet_wrap(.~name,scales = 'free_y') +
theme_light() +
theme(legend.position = 'none')
ggplotly(plot_arable_land_USA)
##Canada Arable Land
plot_arable_land_CAN <- arable_land %>%
filter(Code == "CAN") %>%
select(c(Year, Code, `Arable land needed to produce a fixed quantity of crops ((1.0 = 1961))`)) %>%
pivot_longer(-c(Year, Code)) %>%
ggplot(aes(x = Year, y = value,color=name,group=name)) +
geom_line() +
facet_wrap(.~name,scales = 'free_y') +
theme_light() +
theme(legend.position = 'none')
ggplotly(plot_arable_land_CAN)
Ideally, I would like one graph showing both: one line (in purple) for the USA and another line (in brown) for Canada.
Thank you!
Try this. It is good practice to reshape the data to long format, as you did. In your case you can use filter() to choose the desired countries, then reshape to long and build the plot. The key is mapping both color and group to Code to obtain one line per country. You can set the colors using scale_color_manual(); I have kept the facet to get the title. Here is the code:
library(plotly)
library(tidyverse)
#Code
plot_arable_land_CAN <- arable_land %>% select(-Entity) %>%
filter(Code %in% c('USA','CAN')) %>%
pivot_longer(-c(Code,Year)) %>%
ggplot(aes(x = Year, y = value,color=Code,group=Code)) +
geom_line() +
facet_wrap(.~name,scales = 'free_y') +
theme_light() +
theme(legend.position = 'none')+
scale_color_manual(values = c('brown','purple'))
#Transform
ggplotly(plot_arable_land_CAN)
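The unnamed color vector in the answer relies on the alphabetical order of the Code values (CAN first, then USA). A named vector makes the mapping explicit. The sketch below uses a tiny made-up data frame (toy) in place of arable_land, so the values are placeholders.

```r
library(ggplot2)

# Made-up stand-in for the filtered, long-format arable_land data.
toy <- data.frame(
  Year  = rep(2000:2002, times = 2),
  Code  = rep(c("USA", "CAN"), each = 3),
  value = c(1.00, 0.90, 0.80, 1.00, 0.95, 0.90)
)

# Named values tie each country code to its color regardless of ordering.
country_cols <- c(USA = "purple", CAN = "brown")

p_named <- ggplot(toy, aes(x = Year, y = value, color = Code, group = Code)) +
  geom_line() +
  scale_color_manual(values = country_cols) +
  theme_light()
```

With named values, adding or reordering countries later does not silently swap the colors.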

How to adjust distances between years on x-axis and adjust line of geom_line(), in ggplot, R-studio?

I would like to create a line plot using ggplot's geom_line() where all distances between years are equal independent of the actual value the year-variable takes and where the dots of geom_point() are connected if there are only two years in between but not if the temporal distance is more than that.
Example:
my.data<-data.frame(
year=c(2001,2003,2005,NA,NA,NA,NA,NA,NA,2019),
value=c(runif(10)))
As for the plot I have tried two different things, both of which are not ideal:
Plotting year as continuous variable with breaks=year and minor_breaks=F, where, obviously the distances between the first three observations are much smaller than the distance between 2005 and 2019, and where, unfortunately, all dots are connected:
library(ggplot2)
library(dplyr)
my.data %>%
ggplot(aes(x=year,y=value)) +
geom_line() +
geom_point() +
scale_x_continuous(breaks=c(2001,2003,2005,2019), minor_breaks=F) +
theme_minimal()
Removing NAs and plotting year as factor which yields equal spacing between the years, but obviously removes the lines between data points:
my.data %>%
filter(!is.na(year)) %>%
ggplot(aes(x=factor(year),y=value)) +
geom_line() +
geom_point() +
theme_minimal()
Are there any solutions to these issues? What am I overlooking?
First attempt:
Second attempt:
What I need (but ideally without the help of Paint):
Maybe something like this would work:
my.data %>%
ggplot(aes(x=year)) +
geom_line(aes(y = ifelse(year <= 2005,value,NA))) +
geom_point(aes(y = value)) +
scale_x_continuous(breaks=c(2001,2003,2005,2019), minor_breaks=F) +
theme_minimal()
I came up with a somewhat convoluted and not very clean solution, but it might get the job done. I use lead() to check whether each year should be connected to the next one, and "remove" the unwanted connections by coloring them white. The dummy column is there to put all years on one line instead of two.
library(dplyr) # filter(), mutate(), lead(), if_else()
library(tidyr) # fill()
my.data = data.frame(year=c(2001,2003,2005,2008,2009,2012,2015,2016,NA,2019),
value=c(runif(10))) %>%
filter(!is.na(year)) %>%
mutate(grouped = if_else(lead(year) - year <= 2, "yes", "no")) %>%
fill(grouped, .direction = "down") %>%
mutate(dummy = "all")
my.data %>%
ggplot(aes(x = factor(year),y = value)) +
geom_line(aes(y = value, group = dummy, color = grouped), show.legend = FALSE) +
geom_point() +
scale_color_manual(values = c("yes" = "black", "no" = "white")) +
theme_classic()
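The lead() comparison is the core of this approach, so here is a minimal sketch of it on the four years from the question. The vector name connect is only illustrative.

```r
library(dplyr)

years <- c(2001, 2003, 2005, 2019)

# lead() shifts the vector by one position, so lead(years) - years is the
# gap from each year to the next one. The last year has no successor.
connect <- lead(years) - years <= 2
print(connect)
```

The first two gaps are two years and therefore connected, the jump from 2005 to 2019 is not, and the final NA is what the fill(grouped, .direction = "down") call in the answer takes care of.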

gganimate: handling of missing values

I'm trying the gganimate package for the first time and have run into a problem with the handling of missing values (NA). I apologize if my question is trivial, but I couldn't find a solution.
Here is a reproducible example of what I'm trying to do:
# Load libraries:
library(ggplot2)
library(gganimate)
library(dplyr)
library(tidyr)
# Create some data
## Monthly sales are in 100:1000
## Expected sales are 400/month, increasing by 5% every year
set.seed(123)
df <- data_frame(Year = rep(2015:2018, each=12),
Month = rep(1:12, 4),
Sales = unlist(lapply(1:4,
function(x){cumsum(sample(100:1000, 12))})),
Expected = unlist(lapply(1:4,
function(x){cumsum(rep(400*1.05^(x-1),12))})))
# gganimate works fine here:
df %>%
tidyr::gather("Type", "value", Sales:Expected) %>%
ggplot(aes(Month, value, col=Type)) +
geom_point() +
geom_line() +
gganimate::transition_time(Year)
# Now data for the end of Year 2018 are missing:
df[df$Year==2018 & df$Month %in% 9:12,"Sales"] = NA
# Plotting with ggplot2 works (and gives a warning about missing values):
df %>%
tidyr::gather("Type", "value", Sales:Expected) %>%
dplyr::filter(Year == "2018") %>%
ggplot(aes(Month, value, col=Type)) +
geom_point() +
geom_line()
# But gganimate fails
df %>%
tidyr::gather("Type", "value", Sales:Expected) %>%
ggplot(aes(Month, value, col=Type)) +
geom_point() +
geom_line() +
gganimate::transition_time(Year)
# I get the following error:
## Error in rep(seq_len(nrow(polygon)), splits + 1) : incorrect 'times' argument
I tried to play with the enter_() / exit_() functions of gganimate but without success.
Thank you for your help.
EDIT: (using the suggestion of MattL)
This works:
df %>%
# filter(!is.na(Sales)) %>% ##Proposed by Matt L but removes Expected values too
gather("Type", "value",Sales:Expected) %>%
filter(!is.na(value)) %>% ## Remove NA values
ggplot(aes(Month, value, col=Type)) +
geom_point() +
geom_line() +
labs(title = "Year: {frame_time}") + ## Add title
gganimate::transition_time(Year) +
gganimate::exit_disappear(early=TRUE) ## Removes 2017 points appearing in Year 2018
I still have the feeling that gganimate should be able to handle these NA values like ggplot does though.
Thanks!
Filter out the missing values before "piping" to the ggplot function:
df %>%
filter(!is.na(Sales)) %>%
tidyr::gather("Type", "value", Sales:Expected) %>%
ggplot(aes(Month, value, col=Type)) +
geom_point() +
geom_line() +
gganimate::transition_time(Year)
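As the edit in the question points out, where the filter sits relative to the reshaping step matters. The following sketch shows the difference on a three-row made-up data frame (toy): filtering on Sales before gathering also drops the matching Expected value, while filtering on value afterwards removes only the missing cell.

```r
library(dplyr)
library(tidyr)

# Made-up data: the second month has no Sales figure yet.
toy <- data.frame(
  Month    = 1:3,
  Sales    = c(100, NA, 300),
  Expected = c(400, 800, 1200)
)

# Filter first: the whole row for month 2 disappears, Expected included.
filtered_before <- toy %>%
  filter(!is.na(Sales)) %>%
  gather("Type", "value", Sales:Expected)

# Reshape first: only the missing Sales value is removed.
filtered_after <- toy %>%
  gather("Type", "value", Sales:Expected) %>%
  filter(!is.na(value))

print(nrow(filtered_before))
print(nrow(filtered_after))
```

The second variant keeps the Expected line complete, which is usually what is wanted when only one series has gaps.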

Resources