I'd like to make a density scatterplot with log10 scale in R. I tried to plot it using ggplot and stat_density2d in R. I used this code:
ggplot(data=vod_agb_df, aes(vod, agb)) +
stat_density2d(aes(fill = ..density..), geom = "tile", contour = FALSE, n = 100) +
scale_fill_distiller(palette = 'YlOrRd', direction = 1) +
scale_x_continuous(breaks=seq(0, 1, 0.25), limits = c(0, 1)) +
scale_y_continuous(breaks=seq(0, 300, 50), limits = c(0, 300)) +
labs(x='L-VOD', y='AGB(Mg/ha)') +
theme_bw()
But the result looks strange:
[Figure: the density scatterplot produced by my code]
This is the plot I want:
[Figure: the original scatterplot]
You can log10-transform the density. Here's a minimal, reproducible example:
library(MASS)
library(tidyverse)
set.seed(2020)
mvrnorm(100, mu = c(0, 0), Sigma = matrix(c(1, 0.5, 0.5, 1), 2, 2)) %>%
as_tibble() %>%
ggplot(aes(V1, V2)) +
stat_density2d(
aes(fill = log10(..density..)), geom = "tile", contour = FALSE, n = 100) +
scale_fill_distiller(palette = 'YlOrRd', direction = 1) +
theme_bw()
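To see what the transform does to the values themselves, here is a rough sketch outside ggplot. It assumes that calling MASS::kde2d() directly approximates the grid that stat_density2d builds (ggplot2 documents kde2d as its estimator, but its defaults may differ slightly), and the names xy and dens are only illustrative.

```r
library(MASS)

set.seed(2020)
xy <- mvrnorm(100, mu = c(0, 0), Sigma = matrix(c(1, 0.5, 0.5, 1), 2, 2))

# kde2d() evaluates a 2D kernel density estimate on an n x n grid
dens <- kde2d(xy[, 1], xy[, 2], n = 100)

# Raw densities span several orders of magnitude;
# log10 spreads the small values over the colour range
range(dens$z)
range(log10(dens$z))
```

The low-density tiles that all look equally pale on the linear scale get distinct colours after the transform, which is why the log10 version shows more structure away from the peak.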
Update
It's not clear to me what you mean by "I'd like to make the density scatterplot in the point distributed area, not the whole area of the plot." If you're asking how to increase the height of the gradient colour bar, you can do the following:
set.seed(2020)
mvrnorm(100, mu = c(0, 0), Sigma = matrix(c(1, 0.5, 0.5, 1), 2, 2)) %>%
as_tibble() %>%
ggplot(aes(V1, V2)) +
stat_density2d(
aes(fill = log10(..density..)), geom = "tile", contour = FALSE, n = 100) +
scale_fill_distiller(palette = 'YlOrRd', direction = 1) +
theme_bw() +
guides(fill = guide_colorbar(barheight = unit(3.5, "in"), title.position = "right"))
For the plot you show as your expected output, you can use the following code:
library(tidyverse)
# Bin size control + color palette
ggplot(iris, aes(x=Sepal.Length, y=Petal.Length) ) +
geom_bin2d(bins = 20) +
scale_fill_distiller(palette = 'YlOrRd', direction = 1) +
theme_bw() +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())
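Since the question asks for a log10 scale, one possible variation (a sketch, not part of the original answer) is to map the fill to log10 of the bin count with after_stat(). This assumes ggplot2 3.3.0 or later, and the legend title is just a label I chose.

```r
library(ggplot2)

# Same bins as above, but the fill shows log10(count) instead of the raw count
p <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) +
  geom_bin2d(aes(fill = after_stat(log10(count))), bins = 20) +
  scale_fill_distiller(palette = "YlOrRd", direction = 1, name = "log10(count)") +
  theme_bw()

print(p)
```

Empty bins are dropped by default, so every drawn tile has a count of at least 1 and the log10 value is never below 0.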
I was curious if anyone knew how to create a heatmap with an isoquant curve that identifies all x and y combinations whose product equals a certain constant. The final product should look like the following picture:
[Figure: scatterplot with isoquant curve]
Here is the code I use to generate my plot, but as of right now I can't get the curve in the plot as depicted in the picture above:
vs.vpd.by.drop_days <- ggplot(event_drops, aes(vs, vpd)) +
geom_point(aes(color = day_since), size = 2, alpha = 0.2) +
scale_color_gradientn(colors = c("darkblue","green","yellow","red"),
breaks = c(0,25,50,75),
limits = c(0,75),
name = "Days since \n first drop") +
ggtitle("Drops by VPD and Wind Speed") +
theme(plot.title = element_text(size = 18, face = "bold", hjust = 0.5),
axis.title = element_text(size = 15)) +
xlab(label = "Wind Speed (mph)") +
ylab(label = "Vapor Pressure Deficit") +
expand_limits(x = 0, y = 0) +
scale_x_continuous(expand = c(0, 0), limits = c(0,20)) +
scale_y_continuous(expand = c(0, 0), limits = c(0,5))
vs.vpd.by.drop_days
One option to achieve that would be geom_function.
Using mtcars as example data:
library(ggplot2)
ggplot(mtcars, aes(hp, mpg, color = disp)) +
geom_point() +
geom_function(fun = function(x) 3000 / x)
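If you need several constants, the same idea can be wrapped in a small helper. This is only a sketch: isoquant is a hypothetical name, and the constants 3000 and 4000 are arbitrary values picked for mtcars.

```r
library(ggplot2)

# Hypothetical helper: returns the curve y = k / x,
# i.e. all (x, y) combinations with x * y = k
isoquant <- function(k) function(x) k / x

p <- ggplot(mtcars, aes(hp, mpg, color = disp)) +
  geom_point() +
  geom_function(fun = isoquant(3000), color = "black") +
  geom_function(fun = isoquant(4000), color = "black", linetype = "dashed")

print(p)
```

Each call to isoquant() returns a new function of x, so adding another curve only needs another geom_function() layer.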
I am including marginal distribution plots on a scatterplot of a continuous and an integer variable. However, in the marginal distribution plot for the integer variable (y-axis), a zig-zag pattern shows up because the y-values are all integers. Is there any way to increase the "width" (not sure that's the right term) of the bins/values over which the function calculates the distribution density?
The goal is to get rid of that zig-zag pattern that develops because the y-values are integers.
library(GlmSimulatoR)
library(ggplot2)
library(patchwork)
### Create right-skewed dataset that has one continuous variable and one integer variable
set.seed(123)
df1 <- data.frame(matrix(ncol = 2, nrow = 1000))
x <- c("int","cont")
colnames(df1) <- x
df1$int <- round(rgamma(1000, shape = 1, scale = 1),0)
df1$cont <- round(rgamma(1000, shape = 1, scale = 1),1)
p1 <- ggplot(data = df1, aes(x = cont, y = int)) +
geom_point(shape = 21, size = 2, color = "black", fill = "black", stroke = 1, alpha = 0.4) +
xlab("Continuous Value") +
ylab("Integer Value") +
theme_bw() +
theme(panel.grid = element_blank(),
text = element_text(size = 16),
axis.text.x = element_text(size = 16, color = "black"),
axis.text.y = element_text(size = 16, color = "black"))
dens1 <- ggplot(df1, aes(x = cont)) +
geom_density(alpha = 0.4) +
theme_void() +
theme(legend.position = "none")
dens2 <- ggplot(df1, aes(x = int)) +
geom_density(alpha = 0.4) +
theme_void() +
theme(legend.position = "none") +
coord_flip()
dens1 + plot_spacer() + p1 + dens2 +
plot_layout(ncol = 2, nrow = 2, widths = c(6,1), heights = c(1,6))
From ?geom_density:
adjust: A multiplicate [sic] bandwidth adjustment. This makes it possible
to adjust the bandwidth while still using the a bandwidth
estimator. For example, ‘adjust = 1/2’ means use half of the
default bandwidth.
So as a start try e.g. geom_density(..., adjust = 2) (bandwidth twice as wide as default) and go from there.
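To check what adjust does numerically, here is a small sketch with stats::density(), which also takes an adjust argument. I am assuming geom_density passes adjust through to the same kind of estimator, so treat this as an illustration rather than a look inside ggplot2.

```r
set.seed(123)
int <- round(rgamma(1000, shape = 1, scale = 1), 0)

d_default <- density(int)              # default bandwidth
d_wide    <- density(int, adjust = 2)  # default bandwidth multiplied by 2

# The bandwidth actually used by each estimate
c(default = d_default$bw, wide = d_wide$bw)
```

The wider bandwidth smooths over the gaps between neighbouring integers, which is what removes the zig-zag.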
I have successfully created a raincloud plot, but I would like to know whether I can use a histogram instead of the density curve (it suits my dataset better).
Here is my code, in case it is useful:
ATSC <- ggplot(data = data, aes(y = atsc, x = numlecteur, fill = numlecteur)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0), alpha = .5) +
geom_point(aes(y = atsc, color = numlecteur), position = position_jitter(width = .15), size = .5, alpha = 0.8) +
geom_point(data = sumld, aes(x = numlecteur, y = mean), position = position_nudge(x = 0.25), size = 2.5) +
geom_errorbar(data = sumld, aes(ymin = lower, ymax = upper, y = mean), position = position_nudge(x = 0.25), width = 0) +
guides(fill = FALSE) +
guides(color = FALSE) +
scale_color_brewer(palette = "Spectral") +
scale_y_continuous(breaks=c(0,2,4,6,8,10), labels=c("0","2","4","6","8","10"))+
scale_fill_brewer(palette = "Spectral") +
coord_flip() +
theme_bw() +
expand_limits(y=c(0, 10))+
xlab("Lecteur") + ylab("Age total sans check")+
raincloud_theme
I thought geom_histogram() might work, but it doesn't.
Thank you in advance for your help!
(Sources: https://peerj.com/preprints/27137v1.pdf
https://neuroconscience.wordpress.com/2018/03/15/introducing-raincloud-plots/)
This is actually not quite easy. There are a few challenges:
geom_histogram is "horizontal by nature", and the custom geom_flat_violin is vertical, as are boxplots. Hence the final call to coord_flip in that tutorial. To combine both, I think it is best to switch x and y, drop coord_flip, and use ggstance::geom_boxploth instead.
Creating separate histograms for each category is another challenge. My workaround is to create facets and "merge them together".
The histograms are scaled much larger than the width of the points/boxplots. My workaround is to scale them via the after_stat function.
Nudging the histograms to the right position above the boxplot and points is the last challenge. I convert the discrete scale to a continuous one by mapping a constant numeric to the global y aesthetic, and then use the facet labels as discrete labels.
library(tidyverse)
my_data<-read.csv("https://data.bris.ac.uk/datasets/112g2vkxomjoo1l26vjmvnlexj/2016.08.14_AnxietyPaper_Data%20Sheet.csv")
my_datal <-
my_data %>%
pivot_longer(cols = c("AngerUH", "DisgustUH", "FearUH", "HappyUH"), names_to = "EmotionCondition", values_to = "Sensitivity")
# use y = -... to position boxplot and jitterplot below the histogram
ggplot(data = my_datal, aes(x = Sensitivity, y = -.5, fill = EmotionCondition)) +
# after_stat for scaling
geom_histogram(aes(y = after_stat(count/100)), binwidth = .05, alpha = .8) +
# from ggstance
ggstance::geom_boxploth( width = .1, outlier.shape = NA, alpha = 0.5) +
geom_point(aes(color = EmotionCondition), position = position_jitter(width = .15), size = .5, alpha = 0.8) +
# merged those calls to one
guides(fill = "none", color = "none") +
# scale_y_continuous(breaks = 1, labels = unique(my_datal$EmotionCondition))
scale_color_brewer(palette = "Spectral") +
scale_fill_brewer(palette = "Spectral") +
# facetting, because each histogram needs its own y
# strip position = left to fake discrete labels in continuous scale
facet_wrap(~EmotionCondition, nrow = 4, scales = "free_y" , strip.position = "left") +
# remove all continuous labels from the y axis
theme(axis.title.y = element_blank(), axis.text.y = element_blank(),
axis.ticks.y = element_blank())
Created on 2021-04-15 by the reprex package (v1.0.0)
I'm plotting some graphs to introduce the concept of a mathematical function to high school students. Right now, I'd like to give them an example of what is NOT a function by plotting a horizontal parabola:
x <- seq(from = -3, to = 3, by = 0.001)
y <- -x^2 + 5
grafico <- ggplot()+
geom_hline(yintercept = 0)+
geom_vline(xintercept = 0)+
geom_line(mapping = aes(x = x, y = y),color="darkred",size=1)+
theme_light()+
xlab("")+
ylab("")+
scale_x_continuous(breaks = seq(from = -100, to = 100, by = 1))+
scale_y_continuous(breaks = seq(from = -100, to = 100, by = 1))+
coord_flip(ylim = c(-1.5,5.5), xlim = c(-3,3),expand = FALSE)
print(grafico)
Which outputs the following image:
This is quite close to what I want, but I would like both axes' scales to match, to keep things simple for the students. For this, I tried using coord_equal, but unfortunately it seems to cancel coord_flip's effects:
x <- seq(from = -3, to = 3, by = 0.001)
y <- -x^2 + 5
grafico <- ggplot()+
geom_hline(yintercept = 0)+
geom_vline(xintercept = 0)+
geom_line(mapping = aes(x = x, y = y),color="darkred",size=1)+
theme_light()+
xlab("")+
ylab("")+
scale_x_continuous(breaks = seq(from = -100, to = 100, by = 1))+
scale_y_continuous(breaks = seq(from = -100, to = 100, by = 1))+
coord_flip(ylim = c(-1.5,5.5), xlim = c(-3,3),expand = FALSE)+
coord_equal()
print(grafico)
My question is: Is there a simple way to include coord_flip functionality into coord_equal?
For example, I know I can get coord_cartesian functionality by using the parameters ylim and xlim.
Based on your use case, it doesn't look like you really need to flip the coordinates: you can just reverse the order of inputs for x & y, and use geom_path() instead of geom_line() to force the plot to follow the order in your inputs.
The ggplot help file states:
geom_path() connects the observations in the order in which they
appear in the data. geom_line() connects them in order of the
variable on the x axis.
ggplot() +
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0) +
geom_path(mapping = aes(x = y, y = x), color="darkred", size = 1) + # switch x & y here
theme_light() +
xlab("") +
ylab("") +
scale_x_continuous(breaks = seq(from = -100, to = 100, by = 1)) +
scale_y_continuous(breaks = seq(from = -100, to = 100, by = 1)) +
coord_equal(xlim = c(-1.5, 5.5), ylim = c(-3, 3), expand = FALSE) # switch x & y here
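To see why the row order matters here, a quick base-R sketch (param and curve_df are illustrative names, and the step is coarser than in the question to keep the output short):

```r
# The parameter runs along the vertical axis; the horizontal position is computed from it
param <- seq(from = -3, to = 3, by = 0.5)
curve_df <- data.frame(x = -param^2 + 5, y = param)

# x rises and then falls again, so it is not sorted
is.unsorted(curve_df$x)

# Ordering the rows by x, as geom_line() does, would not keep the original row order
identical(order(curve_df$x), seq_len(nrow(curve_df)))
```

Because sorting by x interleaves the upper and lower branches of the parabola, geom_line() would draw a zig-zag between them, while geom_path() follows the rows as given.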
I'm having trouble using the scale_colour_manual function of ggplot. I tried guide = "legend" to force the legend to appear, but it doesn't work. Reproducible code:
library(ggfortify)
library(ggplot2)
p <- ggdistribution(pgamma, seq(0, 100, 0.1), shape = 0.92, scale = 22,
colour = 'red')
p2 <- ggdistribution(pgamma, seq(0, 100, 0.1), shape = 0.9, scale = 5,
colour = 'blue', p=p)
p2 +
theme_bw(base_size = 14) +
theme(legend.position ="top") +
xlab("Precipitación") +
ylab("F(x)") +
scale_colour_manual("Legend title", guide = "legend",
values = c("red", "blue"), labels = c("Observado","Reforecast")) +
ggtitle("Ajuste Gamma")
A solution with stat_function:
library(ggplot2)
library(scales)
cols <- c("LINE1"="red","LINE2"="blue")
df <- data.frame(x=seq(0, 100, 0.1))
ggplot(data=df, aes(x=x)) +
stat_function(aes(colour = "LINE1"), fun=pgamma, args=list(shape = 0.92, scale = 22)) +
stat_function(aes(colour = "LINE2"), fun=pgamma, args=list(shape = 0.9, scale = 5)) +
theme_bw(base_size = 14) +
theme(legend.position ="top") +
xlab("Precipitación") +
ylab("F(x)") +
scale_colour_manual("Legend title", values = cols,
labels = c("Observado","Reforecast")) +
scale_y_continuous(labels=percent) +
ggtitle("Ajuste Gamma")
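The names on the values vector are what tie each colour to its key. Without names, ggplot2 assigns the values by position to the scale's levels, and for character keys those levels are sorted alphabetically. A base-R sketch of that positional matching, using colour names as hypothetical keys (this mimics the behaviour, it does not call ggplot2):

```r
# Keys as they might appear in aes(colour = ...)
keys <- c("red", "blue")

# Discrete levels for character keys end up in alphabetical order
lv <- sort(unique(keys))
lv

# Unnamed values are matched by position to the sorted levels
unnamed <- setNames(c("red", "blue"), lv)
unnamed[["blue"]]  # the key "blue" is paired with the colour "red"
```

With named values such as c(LINE1 = "red", LINE2 = "blue"), the pairing no longer depends on that ordering.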
This appears to be a bug with ggfortify.* You can achieve identical results simply using geom_line() from ggplot2 though:
library(ggplot2)
# Sequence of values to draw from dist(s) for plotting
x = seq(0, 100, 0.1)
# Defining dists
d1 = pgamma(x, shape=0.92, scale=22)
d2 = pgamma(x, shape=0.90, scale=5)
# Plotting
p1 = ggplot() +
geom_line(aes(x, d1, colour = "Observado")) +
geom_line(aes(x, d2, colour = "Reforecast")) +
theme_bw(base_size = 14) +
theme(legend.position="top") +
ggtitle("Ajuste Gamma") +
xlab("Precipitación") +
ylab("F(x)") +
# Named values tie each colour to its key; unnamed values are matched to the
# alphabetically sorted keys, which would swap the colours and labels here
scale_colour_manual("Legend title",
guide = "legend",
values = c(Observado = "red", Reforecast = "blue"))
* Related question: Plotting multiple density distributions on one plot