R: adding a legend to a density plot

I am working with the R programming language. I am following this tutorial here on density plots: https://www.r-graph-gallery.com/2d-density-plot-with-ggplot2.html
I am trying to figure out how to add a "legend" to the density plots, so that the user can see roughly how many observations are located within a given region of the density plot.
I was able to figure out how to do this for a basic plot (by following the tutorial):
#load library
library(ggplot2)
#create data
a <- data.frame( x=rnorm(20000, 10, 1.9), y=rnorm(20000, 10, 1.2) )
b <- data.frame( x=rnorm(20000, 14.5, 1.9), y=rnorm(20000, 14.5, 1.9) )
c <- data.frame( x=rnorm(20000, 9.5, 1.9), y=rnorm(20000, 15.5, 1.9) )
data <- rbind(a,b,c)
#make density plot
ggplot(data, aes(x=x, y=y) ) +
geom_bin2d(bins = 70) +
scale_fill_continuous(type = "viridis") +
theme_bw()
As seen in the above plot, a legend has been automatically created ("count").
But when I try to do this for the other plots in the tutorial, no legends are added:
# plot 1
ggplot(data, aes(x=x, y=y) ) +
stat_density_2d(aes(fill = ..density..), geom = "raster", contour = FALSE) +
scale_fill_distiller(palette=4, direction=-1) +
scale_x_continuous(expand = c(0, 0)) +
scale_y_continuous(expand = c(0, 0)) +
theme(
legend.position='none'
)
# plot 2
ggplot(data, aes(x=x, y=y) ) +
stat_density_2d(aes(fill = ..density..), geom = "raster", contour = FALSE) +
scale_fill_distiller(palette=4, direction=1) +
scale_x_continuous(expand = c(0, 0)) +
scale_y_continuous(expand = c(0, 0)) +
theme(
legend.position='none'
)
# plot 3
ggplot(data, aes(x=x, y=y) ) +
stat_density_2d(aes(fill = ..density..), geom = "raster", contour = FALSE) +
scale_fill_distiller(palette= "Spectral", direction=1) +
scale_x_continuous(expand = c(0, 0)) +
scale_y_continuous(expand = c(0, 0)) +
theme(
legend.position='none'
)
Can someone please show me if it is possible to add legends to these plots?
Thanks
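The three plots above each end with theme(legend.position='none'), which suppresses the legend. A minimal sketch of plot 1 with the legend restored follows. It assumes ggplot2 3.3.0 or later (for after_stat()) and uses a smaller simulated sample than the question, purely to keep it quick. Note that the legend reports the estimated density, not a raw count as in geom_bin2d().

```r
library(ggplot2)

set.seed(1)
# Smaller sample than in the question; an assumption made only for speed.
data <- rbind(
  data.frame(x = rnorm(2000, 10, 1.9),   y = rnorm(2000, 10, 1.2)),
  data.frame(x = rnorm(2000, 14.5, 1.9), y = rnorm(2000, 14.5, 1.9)),
  data.frame(x = rnorm(2000, 9.5, 1.9),  y = rnorm(2000, 15.5, 1.9))
)

p <- ggplot(data, aes(x = x, y = y)) +
  # after_stat(density) is the current spelling of ..density..
  stat_density_2d(aes(fill = after_stat(density)), geom = "raster", contour = FALSE) +
  scale_fill_distiller(palette = 4, direction = -1, name = "density") +
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0)) +
  # "right" is the default; any value other than "none" shows the legend
  theme(legend.position = "right")
p
```

The same change (dropping legend.position='none', or setting it to "right", "top", etc.) should apply to plots 2 and 3 as well.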

Related

Bunched up x axis ticks on multi panelled plot in ggplot

I am attempting to make a multi-panelled plot from three individual plots (see images). However, I am unable to fix the bunched x-axis tick labels when the plots are in the multi-panel format. The following is the script for the individual plots and the multi-panel plot:
Individual Plot:
NewDat [[60]]
EstRes <- NewDat [[60]]
EstResPlt = ggplot(EstRes,aes(Distance3, `newBa`))+geom_line() + scale_x_continuous(n.breaks = 10, limits = c(0, 3500))+ scale_y_continuous(n.breaks = 10, limits = c(0,25))+ xlab("Distance from Core (μm)") + ylab("Ba:Ca concentration(μmol:mol)") + geom_hline(yintercept=2.25, linetype="dashed", color = "red")+ geom_vline(xintercept = 1193.9, linetype="dashed", color = "grey")+ geom_vline(xintercept = 1965.5, linetype="dashed", color = "grey") + geom_vline(xintercept = 2616.9, linetype="dashed", color = "grey") + geom_vline(xintercept = 3202.8, linetype="dashed", color = "grey")+ geom_vline(xintercept = 3698.9, linetype="dashed", color = "grey")
EstResPlt
Multi-panel plot:
MultiP <- grid.arrange(MigrPlt,OcResPlt,EstResPlt, nrow =1)
I have attempted to include:
MultiP <- grid.arrange(MigrPlt,OcResPlt,EstResPlt, nrow =1)+
theme(axis.text.x = element_text (angle = 45)) )
MultiP
but have only received errors. It's not necessary for all tick marks to be included. An initial, mid and end value is sufficient, so the labels would not all need to be shown or angled. I'm just not sure how to do this. Assistance would be much appreciated.
There are several options to resolve the crowded axes. Let's consider the following example which parallels your case. The default labelling strategy wouldn't overcrowd the x-axis.
library(ggplot2)
library(patchwork)
library(scales)
df <- data.frame(
x = seq(0, 3200, by = 20),
y = cumsum(rnorm(161))
)
p <- ggplot(df, aes(x, y)) +
geom_line()
(p + p + p) / p &
scale_x_continuous(
name = "Distance (um)"
)
However, because you've given n.breaks = 10 to the scale, it becomes crowded. So a simple solution would just be to remove that.
(p + p + p) / p &
scale_x_continuous(
n.breaks = 10,
name = "Distance (um)"
)
Alternatively, you could convert the micrometers to millimeters, which makes the labels less wide.
(p + p + p) / p &
scale_x_continuous(
n.breaks = 10,
labels = label_number(scale = 1e-3, accuracy = 0.1),
name = "Distance (mm)"
)
Yet another alternative is to put breaks only every n units; in the case below, every 1000. By chance, this gives the same breaks as omitting n.breaks = 10.
(p + p + p) / p &
scale_x_continuous(
breaks = breaks_width(1000),
name = "Distance (um)"
)
Created on 2021-11-02 by the reprex package (v2.0.1)
I thought it would be better to show this with an example.
What I meant was: you made MigrPlt, OcResPlt and EstResPlt each with ggplot() + ... For each plot whose x-axis text you want to rotate, add + theme(axis.text.x = element_text(angle = 45)).
For example, with the iris data, rotate the x-axis text only for plot a:
a <- ggplot(iris, aes(Sepal.Width, Sepal.Length)) +
geom_point() +
theme(axis.text.x = element_text (angle = 45))
b <- ggplot(iris, aes(Petal.Width, Petal.Length)) +
geom_point()
gridExtra::grid.arrange(a,b, nrow = 1)
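Since the question says that an initial, mid and end value would be enough, another option is to pass explicit breaks to the x scale of each individual plot before arranging them. The sketch below uses made-up data (the question's NewDat is not available) and assumes the 0 to 3500 axis range from the question.

```r
library(ggplot2)

set.seed(1)
# Illustrative data only; stands in for the question's NewDat[[60]].
df <- data.frame(x = seq(0, 3500, by = 20))
df$y <- cumsum(rnorm(nrow(df)))

p_sparse <- ggplot(df, aes(x, y)) +
  geom_line() +
  scale_x_continuous(
    breaks = c(0, 1750, 3500),  # start, middle and end of the axis range
    limits = c(0, 3500),
    name = "Distance from Core (um)"
  )
p_sparse
```

With only three labels per panel, the plots can be placed side by side with grid.arrange() without rotating the text.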

Creating polar heatmap but leaving a fan-shaped space to annotate what each ring indicates using R

I would like to create a polar heatmap like the heatmap from the Lancet paper "Height and body-mass index trajectories of school-aged children and adolescents from 1985 to 2019 in 200 countries and territories: a pooled analysis of 2181 population-based studies with 65 million participants":
I appreciate the idea of annotating the age each layer of ring represents (age 5 to 19 years) by creating a fan-shaped opening of the polar heatmap (manually circled in red). I refer to 5-19 as the Y-AXIS LABELS hereafter.
Below is the code from @Cyrus Mohammadian describing how to arrange the positions of the Y-AXIS LABELS of polar heatmaps. I replicate Cyrus Mohammadian's code below:
library(grid)
library(gtable)
library(reshape)
library(ggplot2)
library(plyr)
nba <- read.csv("http://datasets.flowingdata.com/ppg2008.csv")
nba$Name <- with(nba, reorder(Name, PTS))
nba.m <- melt(nba)
nba.m <- ddply(nba.m, .(variable), transform, value = scale(value))
# Convert the factor levels (variables) to numeric + quantity to determine size of hole.
nba.m$var2 = as.numeric(nba.m$variable) + 15
# Labels and breaks need to be added with scale_y_discrete.
y_labels = levels(nba.m$variable)
y_breaks = seq_along(y_labels) + 15
nba.labs <- subset(nba.m, variable==levels(nba.m$variable) [nlevels(nba.m$variable)])
nba.labs <- nba.labs[order(nba.labs$Name),]
nba.labs$ang <- seq(from=(360/nrow(nba.labs))/1.5, to=(1.5* (360/nrow(nba.labs)))-360, length.out=nrow(nba.labs))+80
nba.labs$hjust <- 0
nba.labs$hjust[which(nba.labs$ang < -90)] <- 1
nba.labs$ang[which(nba.labs$ang < -90)] <- (180+nba.labs$ang)[which(nba.labs$ang < -90)]
p<-ggplot(nba.m, aes(x=Name, y=var2, fill=value)) +
geom_tile(colour="white") +
geom_text(data=nba.labs, aes(x=Name, y=var2+1.5,
label=Name, angle=ang, hjust=hjust), size=2.5) +
scale_fill_gradient(low = "white", high = "steelblue") +
ylim(c(0, 50)) +
coord_polar(theta="x") +
theme(panel.background=element_blank(),
axis.title=element_blank(),
panel.grid=element_blank(),
axis.text.x=element_blank(),
axis.ticks=element_blank(),
axis.text.y=element_text(size=5))+ theme(axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank())
lab = textGrob((paste("G MIN PTS FGM FGA FGP FTM FTA FTP X3PM X3PA X3PP ORB DRB TRB AST STL BLK TO PF")),
x = unit(.1, "npc"), just = c("left"),
gp = gpar(fontsize = 7))
gp = ggplotGrob(p)
gp = gtable_add_rows(gp, unit(10, "grobheight", lab), -1)
gp = gtable_add_grob(gp, lab, t = -2, l = gp$layout[gp$layout$name == "panel",]$l)
grid.newpage()
grid.draw(gp)
This is the resultant figure:
The Y-AXIS LABELS are placed at the bottom of the heatmap instead of being positioned immediately next to each ring, as in the Lancet paper. Is it possible to modify Cyrus Mohammadian's plot so that the Y-AXIS LABELS are positioned next to each ring instead of outside the heatmap? In addition, it would be preferable to be able to control the size of the fan-shaped opening, so that it can be adjusted to the length of the Y-AXIS LABEL texts.
A second request is to place the color legend in the center of the heatmap and make it curved. An example is illustrated in the figure below, which is from Fig 3 of the paper "Infectious diseases in children and adolescents in China: analysis of national surveillance data from 2008 to 2017":
Note that the color legend is centrally located and curved. How could this be done?
Thank you.
Here is some example code for how you can shape something like a legend and add it to your plot. Due to some restrictions on annotation_custom() in relation to polar coordinates, I decided to use the devel version of patchwork from github to use the new inset_element() function (devtools::install_github("thomasp85/patchwork")).
library(ggplot2)
library(patchwork)
df <- reshape2::melt(volcano[1:20, 1:20])
breaks <- scales::extended_breaks()(df$value)
breaks <- scales::discard(breaks, range(df$value))
main <- ggplot(df, aes(Var1, Var2, fill = value)) +
geom_tile() +
scale_y_continuous(limits = c(-20, NA)) +
guides(fill = "none") +
coord_polar()
legend <- ggplot() +
geom_tile(
aes(
x = seq(min(df$value), max(df$value), length.out = 255),
y = 1, fill = after_stat(x)
)
) +
annotate(
"text", x = breaks, y = -0.1, label = breaks, size = 3
) +
annotate(
"segment", x = breaks, xend = breaks, y = 0.5, yend = 0.7,
colour = "white", size = 1
) +
annotate(
"segment", x = breaks, xend = breaks, y = 1.5, yend = 1.3,
colour = "white", size = 1
) +
guides(fill = "none") +
scale_y_continuous(limits = c(-2, 2)) +
scale_x_continuous(expand = c(0.1, 0)) +
coord_polar() +
theme_void()
legend <- ggplotGrob(legend)
main + inset_element(legend, 0.3, 0.3, 0.7, 0.7) &
theme(plot.background = element_blank())
Created on 2020-11-06 by the reprex package (v0.3.0)

How can I make a density scatterplot with log scale in R?

I'd like to make a density scatterplot with log10 scale in R. I tried to plot it using ggplot and stat_density2d in R. I used this code:
ggplot(data=vod_agb_df, aes(vod, agb)) +
stat_density2d(aes(fill = ..density..), geom = "tile", contour = FALSE, n = 100) +
scale_fill_distiller(palette = 'YlOrRd', direction = 1) +
scale_x_continuous(breaks=seq(0, 1, 0.25), limits = c(0, 1)) +
scale_y_continuous(breaks=seq(0, 300, 50), limits = c(0, 300)) +
labs(x='L-VOD', y='AGB(Mg/ha)') +
theme_bw()
But the result looks strange.
[Image: the density scatterplot produced by my code]
[Image: the plot I want to produce]
[Image: the original scatterplot]
You can log10-transform the density; here's a minimal, reproducible example:
library(MASS)
library(tidyverse)
set.seed(2020)
mvrnorm(100, mu = c(0, 0), Sigma = matrix(c(1, 0.5, 0.5, 1), 2, 2)) %>%
as_tibble() %>%
ggplot(aes(V1, V2)) +
stat_density2d(
aes(fill = log10(..density..)), geom = "tile", contour = FALSE, n = 100) +
scale_fill_distiller(palette = 'YlOrRd', direction = 1) +
theme_bw()
Update
It's not clear to me what you mean by "I'd like to make the density scatterplot in the point distributed area, not the whole area of the plot." If you're asking how to increase the height of the gradient colour bar, you can do the following:
set.seed(2020)
mvrnorm(100, mu = c(0, 0), Sigma = matrix(c(1, 0.5, 0.5, 1), 2, 2)) %>%
as_tibble() %>%
ggplot(aes(V1, V2)) +
stat_density2d(
aes(fill = log10(..density..)), geom = "tile", contour = FALSE, n = 100) +
scale_fill_distiller(palette = 'YlOrRd', direction = 1) +
theme_bw() +
guides(fill = guide_colorbar(barheight = unit(3.5, "in"), title.position = "right"))
To produce the kind of plot you show as your expected output, you can use the following code:
library(tidyverse)
# Bin size control + color palette
ggplot(iris, aes(x=Sepal.Length, y=Petal.Length) ) +
geom_bin2d(bins = 20) +
scale_fill_distiller(palette = 'YlOrRd', direction = 1) +
theme_bw() +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())

Perfectly Align Density plots to scatterplot using cowplot

I am trying to build a function for bivariate plotting that takes 2 variables and draws a scatterplot with two marginal density plots.
The problem is that the density plot on the right does not align with the bottom axis.
Here is a sample data:
g1 = c(rnorm(200, mean=350, sd=100), rnorm(200, mean=700, sd=100))
g2 = c(rnorm(200, mean=350, sd=100), rnorm(200, mean=500, sd=100))
df_exp = data.frame(var1=log2(g1 + 1) , var2=log2(g2 + 1))
Here is the function:
bivariate_plot <- function(df, var1, var2, density = T, box = F) {
require(ggplot2)
require(cowplot)
scatter = ggplot(df, aes(eval(parse(text = var1)), eval(parse(text = var2)), color = "red")) +
geom_point(alpha=.8)
plot1 = ggplot(df, aes(eval(parse(text = var1)), fill = "red")) + geom_density(alpha=.5)
plot1 = plot1 + ylab("G1 density")
plot2 = ggplot(df, aes(eval(parse(text = var2)),fill = "red")) + geom_density(alpha=.5)
plot2 = plot2 + ylab("G2 density")
plot_grid(scatter, plot1, plot2, nrow=1, labels=c('A', 'B', 'C')) #Or labels="AUTO"
# Avoid displaying duplicated legend
plot1 = plot1 + theme(legend.position="none")
plot2 = plot2 + theme(legend.position="none")
# Homogenize scale of shared axes
min_exp = min(df[[var1]], df[[var2]]) - 0.01
max_exp = max(df[[var1]], df[[var2]]) + 0.01
scatter = scatter + ylim(min_exp, max_exp)
scatter = scatter + xlim(min_exp, max_exp)
plot1 = plot1 + xlim(min_exp, max_exp)
plot2 = plot2 + xlim(min_exp, max_exp)
plot1 = plot1 + ylim(0, 2)
plot2 = plot2 + ylim(0, 2)
first_row = plot_grid(scatter, labels = c('A'))
second_row = plot_grid(plot1, plot2, labels = c('B', 'C'), nrow = 1)
gg_all = plot_grid(first_row, second_row, labels=c('', ''), ncol=1)
# Display the legend
scatter = scatter + theme(legend.justification=c(0, 1), legend.position=c(0, 1))
# Flip axis of gg_dist_g2
plot2 = plot2 + coord_flip()
# Remove some duplicate axes
plot1 = plot1 + theme(axis.title.x=element_blank(),
axis.text=element_blank(),
axis.line=element_blank(),
axis.ticks=element_blank())
plot2 = plot2 + theme(axis.title.y=element_blank(),
axis.text=element_blank(),
axis.line=element_blank(),
axis.ticks=element_blank())
# Modify margin c(top, right, bottom, left) to reduce the distance between plots
#and align G1 density with the scatterplot
plot1 = plot1 + theme(plot.margin = unit(c(0.5, 0, 0, 0.7), "cm"))
scatter = scatter + theme(plot.margin = unit(c(0, 0, 0.5, 0.5), "cm"))
plot2 = plot2 + theme(plot.margin = unit(c(0, 0.5, 0.5, 0), "cm"))
# Combine all plots together and crush graph density with rel_heights
first_col = plot_grid(plot1, scatter, ncol = 1, rel_heights = c(1, 3))
second_col = plot_grid(NULL, plot2, ncol = 1, rel_heights = c(1, 3))
perfect = plot_grid(first_col, second_col, ncol = 2, rel_widths = c(3, 1),
axis = "lrbl", align = "hv")
print(perfect)
}
And here is the call for plotting:
bivariate_plot(df = df_exp, var1 = "var1", var2 = "var2")
It is important to point out that this alignment problem is always present, even when the data change.
And this is what happens with my real data:
This can be accomplished easily using the ggExtra package, rather than rolling your own solution.
library(ggExtra)
library(ggplot2)
g1 = c(rnorm(200, mean=350, sd=100), rnorm(200, mean=700, sd=100))
g2 = c(rnorm(200, mean=350, sd=100), rnorm(200, mean=500, sd=100))
df_exp = data.frame(var1=log2(g1 + 1) , var2=log2(g2 + 1))
g <- ggplot(df_exp, aes(x=var1, y=var2)) + geom_point()
ggMarginal(g)
Output:
There are so many bugs in your code that I don't quite know where to start. The code below fixes them, to the extent that I understand what the intended result is.
g1 = c(rnorm(200, mean=350, sd=100), rnorm(200, mean=700, sd=100))
g2 = c(rnorm(200, mean=350, sd=100), rnorm(200, mean=500, sd=100))
df_exp = data.frame(var1=log2(g1 + 1) , var2=log2(g2 + 1))
bivariate_plot <- function(df, var1, var2, density = T, box = F) {
require(ggplot2)
require(cowplot)
scatter = ggplot(df, aes_string(var1, var2)) +
geom_point(alpha=.8, color = "red")
plot1 = ggplot(df, aes_string(var1)) + geom_density(alpha=.5, fill = "red")
plot1 = plot1 + ylab("G1 density")
plot2 = ggplot(df, aes_string(var2)) + geom_density(alpha=.5, fill = "red")
plot2 = plot2 + ylab("G2 density")
# Avoid displaying duplicated legend
plot1 = plot1 + theme(legend.position="none")
plot2 = plot2 + theme(legend.position="none")
# Homogenize scale of shared axes
min_exp = min(df[[var1]], df[[var2]]) - 0.01
max_exp = max(df[[var1]], df[[var2]]) + 0.01
scatter = scatter + ylim(min_exp, max_exp)
scatter = scatter + xlim(min_exp, max_exp)
plot1 = plot1 + xlim(min_exp, max_exp)
plot2 = plot2 + xlim(min_exp, max_exp)
plot1 = plot1 + ylim(0, 2)
plot2 = plot2 + ylim(0, 2)
# Flip axis of gg_dist_g2
plot2 = plot2 + coord_flip()
# Remove some duplicate axes
plot1 = plot1 + theme(axis.title.x=element_blank(),
axis.text=element_blank(),
axis.line=element_blank(),
axis.ticks=element_blank())
plot2 = plot2 + theme(axis.title.y=element_blank(),
axis.text=element_blank(),
axis.line=element_blank(),
axis.ticks=element_blank())
# Modify margin c(top, right, bottom, left) to reduce the distance between plots
#and align G1 density with the scatterplot
plot1 = plot1 + theme(plot.margin = unit(c(0.5, 0, 0, 0.7), "cm"))
scatter = scatter + theme(plot.margin = unit(c(0, 0, 0.5, 0.5), "cm"))
plot2 = plot2 + theme(plot.margin = unit(c(0, 0.5, 0.5, 0), "cm"))
# Combine all plots together and crush graph density with rel_heights
perfect = plot_grid(plot1, NULL, scatter, plot2,
ncol = 2, rel_widths = c(3, 1), rel_heights = c(1, 3))
print(perfect)
}
bivariate_plot(df = df_exp, var1 = "var1", var2 = "var2")
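The answer above replaces eval(parse(text = ...)) with aes_string(), which newer ggplot2 releases mark as deprecated. A minimal sketch of the same idea using the .data pronoun (available in ggplot2 3.0.0 and later) is shown below. The helper name scatter_by_name and the demo data are hypothetical and not part of the answer.

```r
library(ggplot2)

# Hypothetical helper: builds the scatter plot from column names given as
# strings, using .data[[...]] instead of aes_string().
scatter_by_name <- function(df, var1, var2) {
  ggplot(df, aes(x = .data[[var1]], y = .data[[var2]])) +
    geom_point(alpha = 0.8, colour = "red")
}

set.seed(42)
# Made-up demo data, only to show the call.
df_demo <- data.frame(var1 = rnorm(50), var2 = rnorm(50))
p_demo <- scatter_by_name(df_demo, "var1", "var2")
p_demo
```

The same substitution should work for the two density plots, for example aes(x = .data[[var1]]) in place of aes_string(var1).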

Problems adding legend to ggplot2 + ggfortify

I'm having trouble using the scale_colour_manual function of ggplot. I tried guide = "legend" to force the legend to appear, but it doesn't work. Reproducible code:
library(ggfortify)
library(ggplot2)
p <- ggdistribution(pgamma, seq(0, 100, 0.1), shape = 0.92, scale = 22,
colour = 'red')
p2 <- ggdistribution(pgamma, seq(0, 100, 0.1), shape = 0.9, scale = 5,
colour = 'blue', p=p)
p2 +
theme_bw(base_size = 14) +
theme(legend.position ="top") +
xlab("Precipitación") +
ylab("F(x)") +
scale_colour_manual("Legend title", guide = "legend",
values = c("red", "blue"), labels = c("Observado","Reforecast")) +
ggtitle("Ajuste Gamma")
A solution with stat_function:
library(ggplot2)
library(scales)
cols <- c("LINE1"="red","LINE2"="blue")
df <- data.frame(x=seq(0, 100, 0.1))
ggplot(data=df, aes(x=x)) +
stat_function(aes(colour = "LINE1"), fun=pgamma, args=list(shape = 0.92, scale = 22)) +
stat_function(aes(colour = "LINE2"), fun=pgamma, args=list(shape = 0.9, scale = 5)) +
theme_bw(base_size = 14) +
theme(legend.position ="top") +
xlab("Precipitación") +
ylab("F(x)") +
scale_colour_manual("Legend title", values=c(LINE1="red",LINE2="blue"),
labels = c("Observado","Reforecast")) +
scale_y_continuous(labels=percent) +
ggtitle("Ajuste Gamma")
This appears to be a bug with ggfortify.* You can achieve identical results simply using geom_line() from ggplot2 though:
library(ggplot2)
# Sequence of values to draw from dist(s) for plotting
x = seq(0, 100, 0.1)
# Defining dists
d1 = pgamma(x, shape=0.92, scale=22)
d2 = pgamma(x, shape=0.90, scale=5)
# Plotting
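# Map colour to the legend labels and name the values, so that each curve
# gets the intended colour (unnamed values are matched to the keys in
# alphabetical order, which would swap the colours here)
p1 = ggplot() +
geom_line(aes(x, d1, colour = 'Observado')) +
geom_line(aes(x, d2, colour = 'Reforecast')) +
theme_bw(base_size = 14) +
theme(legend.position="top") +
ggtitle("Ajuste Gamma") +
xlab("Precipitación") +
ylab("F(x)") +
scale_colour_manual("Legend title",
guide = "legend",
values = c(Observado = "red", Reforecast = "blue"))
p1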
* Related question: Plotting multiple density distributions on one plot
