ggplot: No legend when using scale_fill_brewer with geom_contour_filled - r

I've plotted a specific set of meteorological data using ggplot as described in the R code below. However, when I use scale_fill_brewer to specify the fill color, a legend does not appear.
What changes are necessary for the legend to appear?
library(tidyverse)
library(lubridate)
library(ggplot2)
library(RColorBrewer)
qurl <- "https://www.geo.fu-berlin.de/met/ag/strat/produkte/qbo/singapore.dat"
sing <- read_table(qurl, skip=4)
# the data file adds a 100mb data row starting in 1997, increasing the number of rows per year from
# 14 to 15. So, one calculation must be applied to rnum <= 140 and a different one to rnum >= 141.
sing2 <- sing %>% separate(1,into=c('hpa','JAN'),sep='\\s+') %>% drop_na() %>%
subset(hpa != 'hPa') %>%
mutate(rnum = row_number(),
hpa=as.integer(hpa)) %>%
mutate(year = case_when(rnum <=140 ~ 1987 + floor(rnum/14), # the last year with 14 rows of data
rnum >=141 ~ 1987 + floor((rnum+10)/15))) %>% # the first year with 15 rows of data; parentheses so 10 is added before dividing
relocate(year, .before='hpa') %>% arrange(year,hpa) %>%
pivot_longer(cols=3:14, names_to='month',values_to='qbo') %>%
mutate(date=ymd(paste0(year,'-',month,'-15')),
hpa=as.integer(hpa),
qbo=as.numeric(qbo))
sing2 <- sing %>% separate(1,into=c('hpa','JAN'),sep='\\s+') %>% drop_na() %>%
subset(hpa != 'hPa') %>%
mutate(year=1987+floor(row_number()/15),
hpa=as.integer(hpa)) %>%
relocate(year, .before='hpa') %>% arrange(year,hpa) %>%
pivot_longer(cols=2:13, names_to='month',values_to='qbo') %>%
mutate(date=ymd(paste0(year,'-',month,'-15')),
hpa=as.integer(hpa),
qbo=as.numeric(qbo))
# End Data Massaging. It's ready to be graphed
# A simple call to ggplot with geom_contour_filled generates a legend
sing2 %>%
ggplot(aes(x=date,y=hpa)) +
geom_contour_filled(aes(z=qbo*0.1)) +
scale_y_reverse()
# Adding scale_fill_brewer removes the legend.
# Adding show.legend = TRUE to the geom_contour_filled options has no effect.
limits = c(-1,1)*max(abs(sing2$qbo),na.rm=TRUE)
zCuts <- round(seq(limits[1], limits[2], length.out = 11), digits=0)
sing2 %>%
ggplot() +
geom_contour_filled(aes(x=date,y=hpa, z = qbo*0.1),breaks=zCuts*0.1) +
scale_y_reverse(expand=c(0,0)) +
scale_x_date(expand=c(0,0), date_breaks = '1 year', date_labels = '%Y') +
scale_fill_brewer(palette = 5,type='div',breaks=zCuts) +
theme_bw() +
theme(legend.position = 'right',
axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

OP, I can't give you a direct answer because your example can't be reproduced (I was unable to access the data you linked). Instead, I can offer some debugging advice, since the issue appears to be related to the breaks= argument of scale_fill_brewer(). As you mention, you get a legend with geom_contour_filled() alone, but not once you add the scale_fill_brewer() part.
Let me illustrate this behavior with the example from the documentation for geom_contour_filled(), which uses the built-in dataset faithfuld.
I'll add in your own palette and type choices, leaving out the breaks argument:
v <- ggplot(faithfuld, aes(waiting, eruptions, z = density))
v + geom_contour_filled() +
scale_fill_brewer(palette = 5, type='div')
If you do the same thing, but add in a "nonsensical" breaks argument, you get the same plot, but without a legend (like you are seeing):
v + geom_contour_filled() +
scale_fill_brewer(palette = 5, type='div', breaks=1:4)
For me, this is good evidence that the issue in your code is that the values passed to breaks= are not within the range the scale expects. Is this just a typo? Note that you pass breaks=zCuts to scale_fill_brewer(), yet breaks=zCuts*0.1 to geom_contour_filled(). This makes the breaks for your color scale 10 times larger than the breaks for the contours themselves. I'd be willing to bet that this change to the scale_fill_brewer() line will do the trick:
# earlier plot code
... +
scale_fill_brewer(palette = 5,type='div',breaks=zCuts*0.1) +
...
# remaining plot code
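A possible refinement, which the answer above does not state and which is offered here only as an assumption about how ggplot2 behaves: geom_contour_filled() appears to map fill to a factor of bin labels (the computed `level` variable), so scale_fill_brewer() acts as a discrete scale and its breaks= would need to match those labels rather than numbers. A minimal sketch with faithfuld for inspecting the labels (the names `p` and `lv` are illustrative):

```r
library(ggplot2)

# Same example plot as in the answer, built on the faithfuld dataset.
p <- ggplot(faithfuld, aes(waiting, eruptions, z = density)) +
  geom_contour_filled()

# The computed `level` column holds the bin labels that the fill scale sees.
lv <- levels(layer_data(p)$level)
print(lv)

# Passing labels taken from lv (or simply omitting breaks=) keeps the legend.
p + scale_fill_brewer(palette = 5, type = "div", breaks = lv)
```

If numeric breaks still leave the legend empty in your own plot, dropping breaks= from scale_fill_brewer() and keeping it only in geom_contour_filled() may be worth trying.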

Related

inner labelling for heatmap, in R ggplot

I am trying to add a number label on each cell of a heatmap. Because it also needs marginal barcharts I have tried two packages. iheatmapr and ComplexHeatmap.
(1st try) iheatmapr makes it easy to add bars as below, but I couldn't see how to add labels inside the heatmap on individual cells.
library(tidyverse)
library(iheatmapr)
library(RColorBrewer)
in_out <- data.frame(
'Economic' = c(2,1,1,3,4),
'Education' = c(0,3,0,1,1),
'Health' = c(1,0,1,2,0),
'Social' = c(2,5,0,3,1) )
rownames(in_out) <- c('Habitat', 'Resource', 'Combined', 'Protected', 'Livelihood')
GreenLong <- colorRampPalette(brewer.pal(9, 'Greens'))(12)
lowGreens <- GreenLong[1:5]  # R indexing starts at 1; index 0 is silently dropped
in_out_matrix <- as.matrix(in_out)
main_heatmap(in_out_matrix, colors = lowGreens)
in_out_plot <- iheatmap(in_out_matrix,
colors=lowGreens) %>%
add_col_labels() %>%
add_row_labels() %>%
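# assumption: the undefined `bcio` and `total` refer to this matrix and its grand total
add_col_barplot(y = colSums(in_out_matrix)/sum(in_out_matrix)) %>%
add_row_barplot(x = rowSums(in_out_matrix)/sum(in_out_matrix))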
in_out_plot
Then used: save_iheatmap(in_out_plot, "iheatmapr_test.png")
This is because I couldn't use ggsave() (with device = ragg::agg_png etc.) on the iheatmapr object.
The iheatmapr object's apparent incompatibility with ggsave() (maybe I am wrong) is a problem for me because I normally use the ragg package to export images with AGG so that font sizes are preserved. I suspect some other heatmap packages also create custom objects that may be incompatible with patchwork and ggsave().
ggsave("png/iheatmapr_test.png", plot = in_out_plot,
device = ragg::agg_png, dpi = 72,
units="in", width=3.453, height=2.5,
scaling = 0.45)
(2nd try) ComplexHeatmap makes it easy to label individual number "cells" inside a heatmap, and also offers marginal bars among its "Annotations". I have tried it, but its colour palette system (which uses integers to refer to a set of colours) doesn't suit my RGB vector colour gradient, and overall it is a sophisticated package clearly designed for graphics more advanced than what I am doing.
I am aiming for style as shown in screenshot example below, which was made in Excel.
Please can anyone suggest a more suitable R package for a simple heatmap like this with marginal bars, and number labels inside?
Instead of relying on packages which offer out-of-the-box solutions one option to achieve your desired result would be to create your plot from scratch using ggplot2 and patchwork which gives you much more control to style your plot, to add labels and so on.
Note: The issue with iheatmapr is that it returns a plotly object, not a ggplot. That's why you can't use ggsave.
library(tidyverse)
library(patchwork)
in_out <- data.frame(
'Economic' = c(1,1,1,5,4),
'Education' = c(0,0,0,1,1),
'Health' = c(1,0,1,0,0),
'Social' = c(1,1,0,3,1) )
rownames(in_out) <- c('Habitat', 'Resource', 'Combined', 'Protected', 'Livelihood')
in_out_long <- in_out %>%
mutate(y = rownames(.)) %>%
pivot_longer(-y, names_to = "x")
# Summarise data for marginal plots
yin <- in_out_long %>%
group_by(y) %>%
summarise(value = sum(value)) %>%
mutate(value = value / sum(value))
xin <- in_out_long %>%
group_by(x) %>%
summarise(value = sum(value)) %>%
mutate(value = value / sum(value))
# Heatmap
ph <- ggplot(in_out_long, aes(x, y, fill = value)) +
geom_tile() +
geom_text(aes(label = value), size = 8 / .pt) +
scale_fill_gradient(low = "#F7FCF5", high = "#00441B") +
theme(legend.position = "bottom") +
labs(x = NULL, y = NULL, fill = NULL)
# Marginal plots
py <- ggplot(yin, aes(value, y)) +
geom_col(width = .75) +
geom_text(aes(label = scales::percent(value)), hjust = -.1, size = 8 / .pt) +
scale_x_continuous(expand = expansion(mult = c(.0, .25))) +
theme_void()
px <- ggplot(xin, aes(x, value)) +
geom_col(width = .75) +
geom_text(aes(label = scales::percent(value)), vjust = -.5, size = 8 / .pt) +
scale_y_continuous(expand = expansion(mult = c(.0, .25))) +
theme_void()
# Glue plots together
px + plot_spacer() + ph + py + plot_layout(ncol = 2, widths = c(2, 1), heights = c(1, 2))
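Since the question also asks about exporting with ggsave(), here is a minimal sketch of saving a patchwork layout. The two stand-in plots and the temporary file name are hypothetical; the size and dpi values are copied from the question.

```r
library(ggplot2)
library(patchwork)

# Two stand-in plots (hypothetical; any ggplot objects would do).
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(factor(cyl))) + geom_bar()

# Combining plots with patchwork yields an object that ggsave() accepts.
combined <- p1 + p2

out_file <- tempfile(fileext = ".png")
ggsave(out_file, plot = combined,
       width = 3.453, height = 2.5, units = "in", dpi = 72)
# If ragg is installed, device = ragg::agg_png can be added as in the question.
```

To save the layout from the answer, assign it to a variable first (for example `combined <- px + plot_spacer() + ph + py + plot_layout(...)`) and pass that as `plot`.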

Additional x axis on ggplot

I'm aware there are similar posts but I could not get those answers to work in my case.
e.g. Here and here.
Example:
diamonds %>%
ggplot(aes(scale(price) %>% as.vector)) +
geom_density() +
xlim(-3, 3) +
facet_wrap(vars(cut))
Returns a plot:
Since I used scale(), the numbers on the x axis are z-scores, i.e. standard deviations away from the mean.
I would like to add a row underneath each break showing the equivalent unscaled raw value.
Tried:
diamonds %>%
ggplot(aes(scale(price) %>% as.vector)) +
geom_density() +
xlim(-3, 3) +
facet_wrap(vars(cut)) +
geom_text(aes(label = price))
Gives:
Error: geom_text requires the following missing aesthetics: y
My primary question is how can I add the raw values underneath -3:3 of each break? I don't want to change those breaks, I still want 6 breaks between -3:3.
Secondary question, how can I get -3 and 3 to actually show up in the chart? They have been trimmed.
[edit]
I've been trying to make it work with geom_text but keep hitting errors:
diamonds %>%
ggplot(aes(x = scale(price) %>% as.vector)) +
geom_density() +
xlim(-3, 3) +
facet_wrap(vars(cut)) +
geom_text(label = price)
Error in layer(data = data, mapping = mapping, stat = stat, geom = GeomText, :
object 'price' not found
I then tried changing my call to geom_text()
geom_text(data = diamonds, aes(price), label = price)
This results in the same error message.
You can make a custom labeling function for your axis. This takes each label on the axis and performs a custom transform for you. In your case you could paste the z score, a line break, and the z-score times the standard deviation plus the mean. Because of the distribution of prices in the diamonds data set, this means that z scores below about -1 represent negative prices. This may not be a problem in your own data. For clarity I have drawn in a vertical line representing $0:
labeller <- function(x) {
paste0(x,"\n", scales::dollar(sd(diamonds$price) * x + mean(diamonds$price)))
}
diamonds %>%
ggplot(aes(scale(price) %>% as.vector)) +
geom_density() +
geom_vline(aes(xintercept = -0.98580251364833), linetype = 2) +
facet_wrap(vars(cut)) +
scale_x_continuous(labels = labeller, limits = c(-3, 3)) +
xlab("price")
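The back-transformation used by the labelling function, and the position of the $0 line, can be checked with a few lines of arithmetic. This is a sketch on the diamonds data; the variable names are illustrative.

```r
library(ggplot2)  # provides the diamonds dataset

mean_price <- mean(diamonds$price)
sd_price   <- sd(diamonds$price)

# z-score at which the back-transformed price equals $0
z_zero <- (0 - mean_price) / sd_price
print(z_zero)

# Back-transform the axis breaks -3:3 to raw prices
z_breaks   <- -3:3
raw_prices <- sd_price * z_breaks + mean_price
print(data.frame(z = z_breaks, price = round(raw_prices)))
```

Every break to the left of `z_zero` maps to a negative price, which is why the lowest labels show negative dollar amounts.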
We can use the sec_axis functionality in scale_x_continuous. To use this functionality we need to manually scale your data. This will add a secondary axis at the top of the plot, not underneath. So it's not quite exactly what you're looking for.
library(tidyverse)
# manually scale the data
mean_price <- mean(diamonds$price)
sd_price <- sd(diamonds$price)
diamonds$price_scaled <- (diamonds$price - mean_price) / sd_price
# make the plot
ggplot(diamonds, aes(price_scaled))+
geom_density()+
facet_wrap(~cut)+
scale_x_continuous(sec.axis = sec_axis(~ mean_price + (sd_price * .)),
limits = c(-3, 4), breaks = -3:3)
You could cheat a bit by passing some dummy data to geom_text:
geom_text(data = tibble(label = round(((-3:3) * sd_price) + mean_price),
y = -0.25,
x = -3:3),
aes(x, y, label = label))

R: gganimate with geom_density

I am trying to create an animated graph with gganimate. My default static graph looks something like this:
But instead of 3 grouping variables I have 12 (year 0, year 1, year 2, etc.). Instead of plotting all 12 variables together I would like to animate it, to avoid this:
Those kernel densities plots are made as follows:
data_decidious %>%
  tidyr::gather("YEAR", "NDVI", colsPostNDVI) %>%
  mutate(YEAR = str_remove(YEAR, 'meanNDVIyear')) %>%
  mutate(YEAR = str_remove(YEAR, 'meanprefire_NDVI')) %>%
  mutate(YEAR = as.factor(YEAR)) %>%
  ggplot(aes(NDVI, fill = YEAR)) +
  geom_density(alpha = .2) +
  xlim(0.3, 0.7) +
  ylim(0, 46) +
  xlab("Mean NDVI") +
  ylab("Kernel density") +
  guides(fill = guide_legend(title = "Comparative"))
I have found that this geom_density() only works when I add mutate(YEAR = as.factor(YEAR)). That means when I add:
transition_time(YEAR) + ease_aes('linear')
I get the error:
Error: time data must either be integer, numeric, POSIXct, Date, difftime, orhms
In addition: Warning message:
In min(cl[cl != 0]) : no non-missing arguments to min; returning Inf
Any idea to animate my graph?
Converting YEAR to a factor is not necessary. Instead simply map factor(YEAR) on fill. This way you can use YEAR in transition time and everything is fine.
Using the gapminder::gapminder dataset as example data the following code plots and animates the density of worldwide life-expectancy over time.
(BTW: Instead of using a categorical color scale you can map YEAR directly on fill to get a continuous color scale. However, in this case you have to map YEAR also on the group aesthetic):
library(ggplot2)
library(dplyr)
library(gganimate)
p <- gapminder::gapminder %>%
ggplot(aes(lifeExp, fill = factor(year))) +
geom_density(alpha=.2) +
xlab("Life Expectancy") +
ylab("Kernel density") +
guides(fill = guide_legend(title = "Year"))
p +
transition_time(year) +
ease_aes('linear')
Created on 2020-04-17 by the reprex package (v0.3.0)
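The continuous-colour variant mentioned in the aside above could look like the following static sketch, where year is mapped to both fill and group. The names `p_cont` and `n_groups` are illustrative, and the gapminder package is assumed to be installed.

```r
library(ggplot2)

# Map the numeric year to fill (continuous colour scale) and to group,
# so that one density is still drawn per year.
p_cont <- ggplot(gapminder::gapminder,
                 aes(lifeExp, fill = year, group = year)) +
  geom_density(alpha = 0.2) +
  labs(x = "Life Expectancy", y = "Kernel density", fill = "Year")

# Number of separate densities that ggplot2 builds
n_groups <- length(unique(layer_data(p_cont)$group))
print(n_groups)

p_cont
```

Without the group aesthetic, a numeric fill does not split the data, so only a single density would be drawn.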
Edit:
As far as I can tell without having seen your dataset you have to adapt your code like so (from inspecting your code I guess that YEAR is a character. So you have to convert it to an integer):
data_long <- data_decidious %>%
tidyr::gather("YEAR", "NDVI", colsPostNDVI) %>%
mutate(YEAR = str_remove(YEAR, 'meanNDVIyear')) %>%
mutate(YEAR = str_remove(YEAR, 'meanprefire_NDVI')) %>%
# Convert YEAR to integer
mutate(YEAR = as.integer(YEAR))
p <- data_long %>%
ggplot(aes(NDVI,fill=factor(YEAR))) +
geom_density(alpha=.2) +
xlim(0.3, 0.7) +
ylim(0,46) +
xlab("Mean NDVI") +
ylab("Kernel density") +
guides(fill=guide_legend(title="Comparative"))
p +
transition_time(YEAR) +
ease_aes('linear')
anim_save("test.gif")
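One thing to watch in the conversion above: a column whose whole name is removed becomes an empty string, which as.integer() turns into NA. A small base-R sketch, using hypothetical column names that follow the patterns in the question; treating the pre-fire column as year 0 is an assumption.

```r
# Hypothetical column names following the patterns in the question
cols <- c("meanprefire_NDVI", "meanNDVIyear1", "meanNDVIyear2", "meanNDVIyear10")

# Strip the fixed prefixes, as str_remove() does in the answer
stripped <- sub("meanNDVIyear", "", cols, fixed = TRUE)
stripped <- sub("meanprefire_NDVI", "", stripped, fixed = TRUE)

# "" converts to NA, the digit strings convert to integers
year_int <- as.integer(stripped)
print(year_int)

# Assumption: the pre-fire column should count as year 0
year_int[is.na(year_int)] <- 0L
print(year_int)
```

If NA values are left in YEAR, those rows are unlikely to animate as intended, so recoding them (or dropping them deliberately) before transition_time() seems safer.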

Fill area under time series based on factor value

I am trying to fill the area under a time series line based on a factor value of 0 and 1. The area should only be filled if the value is equal to 1.
I have managed to colour code the time series line based on the factor value with the following code:
install.packages("scales")
library("scales")
library("ggplot2")
ggplot(plot.timeseries) +
geom_line(aes(x = Date, y = Price, color = Index, group = 1)) +
scale_x_date(labels = date_format("%Y"), breaks = date_breaks("years")) +
scale_colour_manual(values = c("red3", "green3"))
This provides the following graph:
I have also tried this:
ggplot(plot.timeseries, aes(x=Date, y = Price, fill=Index)) +
geom_area(alpha=0.6) +
theme_classic() +
scale_fill_manual(values=c("#999999", "#32CD32"))
which comes out as a complete mess:
Ideally the final result should look like plot1 where the parts of the line in green are filled.
The time series data can be accessed here:
https://drive.google.com/file/d/1qWsuJk41_fJZktLCAZSgfGvoDLqTt-jk/view?usp=sharing
Any help would be greatly appreciated!
Okay, here is what I did to get the graph shown below if that is what you want.
# -------------------------------------------------------------------------
# load required packages #
library(scales)
library("ggplot2")
library(dplyr)
# -------------------------------------------------------------------------
# load the data to a df #
plot.timeseries <- get(load("TimeSeries_Data.RData"))
# -------------------------------------------------------------------------
# transform the data (my_fill_color will have green and NA values)
my_object <- plot.timeseries %>%
select(Price, Index, Date) %>%
mutate(Index_ord_factor = factor(Index, levels = unique(Index), ordered=TRUE),
my_fill_color = case_when(
Index_ord_factor > 0 ~ "green" # ordered factor enables the '>' operation
))
# -------------------------------------------------------------------------
# Plot your graph using the transformed data
ggplot(my_object, mapping = aes(x=Date, y=Price)) +
geom_line(aes(color = Index, group = 1))+
geom_col(fill =my_object$my_fill_color, width = 1)
# -------------------------------------------------------------------------
Let me know if you need any elaboration to understand the script. Attached is the output on my end.
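The comparison on an ordered factor above depends on the order in which the levels happen to appear in the data. A variation that avoids this, sketched on made-up data because the original file is not available here (the `toy` data frame and the `fill_price` column are hypothetical), keeps the price only where Index equals 1:

```r
library(ggplot2)

# Made-up stand-in for plot.timeseries
set.seed(1)
toy <- data.frame(
  Date  = as.Date("2020-01-01") + 0:99,
  Price = 100 + cumsum(rnorm(100)),
  Index = factor(rep(c(0, 1, 0, 1), each = 25))
)

# Keep the price only where Index is 1; everything else becomes NA
toy$fill_price <- ifelse(toy$Index == "1", toy$Price, NA)

p_fill <- ggplot(toy, aes(Date, Price)) +
  geom_col(aes(y = fill_price), fill = "green3", alpha = 0.4,
           width = 1, na.rm = TRUE) +
  geom_line(aes(colour = Index, group = 1)) +
  scale_colour_manual(values = c("red3", "green3"))
p_fill
```

The NA rows are simply not drawn, so only the stretches where Index is 1 are filled.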
For those who are interested, I also received this alternative solution from Erick Chacon.
You can view his tutorial here for a better understanding of the ggplot2 extension he designed, which is used in this solution.
# Installing and loading necessary packages
install.packages("remotes")
remotes::install_github("ErickChacon/mbsi")
library(mbsi)
library(ggplot2)
load("timeseries.RData")
#converting factor to numeric
plot.timeseries$Index <- as.numeric(levels(plot.timeseries$Index))[plot.timeseries$Index]
ggplot(plot.timeseries, aes(Date, Price)) +
geom_line() +
stat_events(aes(event = I(1 * (Index > 0)), fill = "Index"),
threshold = min(plot.timeseries$Price),
fill = "green", alpha = 0.3)

How can I make axis x vertical so the long names of countries become understandable in the plot

The name of the countries are long and are on top of each other in the x labels, how can I make it readable?
ggplot(results, aes(x = Nationality, horiz=TRUE)) +
theme_solarized() +
geom_bar() +
labs(y = "Number of Medals",
title = "Number of Medals by Country")
Welcome to stackoverflow. Here are some suggestions on how you can deal with the many values. In both methods, I am using the forcats library within the tidyverse. You can read more about it here: https://r4ds.had.co.nz/factors.html
First, some fake data & replicating your problem
library(tidyverse)
df <-
mpg %>%
arrange(manufacturer) %>%
mutate(
n = row_number(),
vehicle = paste(year, manufacturer, model)
) %>%
uncount(n)
# this replicates your problem
ggplot(df, aes(vehicle)) +
geom_bar() +
coord_flip()
Option 1: consolidate
df %>%
mutate(
vehicle = # making heavy use of forcats here
fct_lump(vehicle, 35) %>% # keep only the 35 most frequent values, others in "Other" category
fct_infreq() %>% # order them by frequency
fct_rev() #reverse the order
) %>%
ggplot(aes(vehicle)) +
geom_bar() +
coord_flip()
Option 2: facet
Someone may have a more elegant way of getting these groups but I use this method quite a bit
df %>%
mutate(
vehicle = # similar methods to earlier
fct_infreq(vehicle) %>%
fct_rev(),
num_fct = as.integer(vehicle), # generates a number for each factor
facet = (max(num_fct)-num_fct) %/% 20 # will make groups of 20, but they need to be in descending order within each facet
) %>%
ggplot(aes(vehicle)) +
geom_bar() +
coord_flip() +
facet_wrap(~facet, scales = "free_y", nrow = 1) +
theme(
strip.background = element_blank(),
strip.text = element_blank()
)
Hope this helps.
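For completeness, the literal request in the title (vertical x-axis labels) can also be met by rotating the axis text instead of flipping the coordinates. A minimal sketch on the built-in mpg data, which stands in for the unavailable results data frame:

```r
library(ggplot2)

# Rotate the x-axis labels by 90 degrees so long names do not overlap
p_rot <- ggplot(mpg, aes(x = manufacturer)) +
  geom_bar() +
  labs(y = "Count", title = "Rotated x-axis labels") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
p_rot
```

If a complete theme such as theme_solarized() is used, the theme(axis.text.x = ...) call should come after it, otherwise the complete theme overrides the rotation.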
