I am creating a ridge plot to compare a few groups (using ggridges package) and would like to add significance brackets to show comparisons between some group levels (using ggsignif package).
But this doesn't seem to work because the computation fails in stat_signif().
Here is a reproducible example:
set.seed(123)
library(ggsignif)
library(ggridges)
library(ggplot2)
ggplot(iris, aes(x = Sepal.Length, y = Species)) +
geom_density_ridges(scale = 1) +
coord_flip() +
geom_signif(comparisons = list(c("setosa", "versicolor")))
#> Picking joint bandwidth of 0.181
#> Warning in f(..., self = self): NAs introduced by coercion
#> Warning: Computation failed in `stat_signif()`:
#> missing value where TRUE/FALSE needed
Created on 2021-07-29 by the reprex package (v2.0.0)
How can I get these two packages to work with each other? Thanks.
I did not manage to combine (A) geom_density_ridges and (B) geom_signif. The reason is that (A) requires a numerical variable as x and categories as y, while (B) requires a numerical variable as y and categories as x, and I have not found a way to override this behaviour.
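One possible way around the clash is to skip geom_signif entirely, run the test by hand, and draw the bracket with annotate(). The following is only a minimal sketch: the Wilcoxon rank-sum test, the bracket position (bracket_x) and the label format are illustrative assumptions, not something either package prescribes.

```r
library(ggplot2)
library(ggridges)

# Run the two-group test by hand (Wilcoxon rank-sum, chosen for illustration)
setosa     <- iris$Sepal.Length[iris$Species == "setosa"]
versicolor <- iris$Sepal.Length[iris$Species == "versicolor"]
p_value    <- wilcox.test(setosa, versicolor, exact = FALSE)$p.value
p_label    <- paste("p =", format.pval(p_value, digits = 2))

# Hypothetical bracket position, just beyond the largest Sepal.Length value
bracket_x <- max(iris$Sepal.Length) + 0.4

ridge_plot <- ggplot(iris, aes(x = Sepal.Length, y = Species)) +
  geom_density_ridges(scale = 1) +
  # Species levels sit at y = 1, 2, 3, so the bracket spans y = 1 to y = 2
  annotate("segment", x = bracket_x, xend = bracket_x, y = 1, yend = 2) +
  annotate("text", x = bracket_x + 0.15, y = 1.5, label = p_label, angle = 270)

ridge_plot
```

The bracket here is a plain line segment without end ticks; adding coord_flip() would turn it into a horizontal bracket above the ridges.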
But I assume that you chose ridge plots over simple boxplots because you are interested in a more informative visualization of the distribution. For that purpose there is an alternative that works well with geom_signif: the so-called violin plot. See below a standard boxplot (with labelled significance):
ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species)) +
geom_boxplot() +
geom_signif(comparisons = list(c("setosa", "versicolor")), test = "t.test")
See below a violin plot (with jitter and labelled significance):
ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species)) +
geom_violin(trim = FALSE) + geom_jitter() +
geom_signif(comparisons = list(c("setosa", "versicolor")), test = "t.test")
This does the job unless you are particularly interested in making ggridges and ggsignif work together. Please note that a violin plot is just a mirrored density plot (see https://en.wikipedia.org/wiki/Violin_plot#:~:text=A%20violin%20plot%20is%20a,by%20a%20kernel%20density%20estimator for more details).
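The link between a violin and a density plot can be made concrete by building a violin by hand from a kernel density estimate. This is a rough sketch for one species only; violin_df and its column names are made up for the illustration, and geom_violin itself may use different bandwidth and scaling defaults.

```r
library(ggplot2)

# Kernel density estimate for one group
d <- density(iris$Sepal.Length[iris$Species == "setosa"])

# Mirror the density around zero: the right half uses +density,
# the left half uses -density, traversed in reverse to close the outline
violin_df <- data.frame(
  value = c(d$x, rev(d$x)),
  width = c(d$y, -rev(d$y))
)

ggplot(violin_df, aes(x = width, y = value)) +
  geom_polygon(fill = "grey80", colour = "black") +
  labs(x = "density (mirrored)", y = "Sepal.Length")
```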
For the same purpose, see also the sina plot (suggestion by tjebo):
library(ggforce)
ggplot(iris, aes(x = Species, y = Sepal.Length, colour = Species)) +
geom_sina() +
geom_signif(comparisons = list(c("setosa", "versicolor")), test = "t.test")
Thanks to a new pull request to ggsignif, the following now works:
set.seed(123)
library(ggsignif)
library(ggridges)
library(ggplot2)
ggplot(iris, aes(x = Sepal.Length, y = Species)) +
geom_density_ridges(scale = 1) +
coord_flip() +
geom_signif(comparisons = list(c("setosa", "versicolor")),
y_position = 9)
#> Picking joint bandwidth of 0.181
Created on 2021-08-06 by the reprex package (v2.0.1)
I'm looking to add a legend to my plot, for the moment the code I wrote is:
library(effects)  # provides allEffects()
plot(allEffects(covid.lm, residuals = TRUE), # plot with countries on graph
     band.colors = "grey2",
     residuals.color = adjustcolor("steelblue3", alpha.f = 0.5),
     residuals.pch = 16, smooth.residuals = FALSE,
     id = list(n = length(d$COUNTRY), cex = 0.5))
Basically, I added numbers to the points in the plot (for which I created a linear model, covid.lm). With that done, I need to add a legend for those points (that is, a list of countries). Thanks in advance.
library(ggplot2)
data(iris)
m <- lm(Petal.Length ~ Sepal.Length, data = iris)
iris$Fitted <- predict(m)
iris$Species_num <- as.numeric(iris$Species)
ggplot(iris, aes(x = Petal.Length, y = Fitted)) +
geom_point(aes(color = as.factor(Species_num))) +
geom_text(aes(label = Species_num), hjust = 1.1, vjust = 1.1) +
labs(title = "Residuals", x = "Observed", y = "Fitted") +
guides(color=guide_legend(title="New Legend Title"))
Created on 2022-04-08 by the reprex package (v2.0.1)
I have some data that I would like to plot a threshold on, but only if the data approaches the threshold. I would therefore like a horizontal line at my threshold that does not extend the y-axis limits when the threshold would not already have been included. Because my data is faceted, pre-calculating the limits is not feasible, and since I am doing this for many different data sets it would get very messy. This question seems to be asking the same thing, but the answers are not relevant to me: ggplot2: Adding a geom without affecting limits
Simple example.
library(ggplot2)
#> Warning: package 'ggplot2' was built under R version 3.5.3
ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length)) +
  geom_point() +
  facet_wrap(~Species, scales = "free") +
  geom_hline(yintercept = 7)
which gives me
But I would like this (created in paint) where the limits have not been impacted by the geom_hline
Created on 2020-01-21 by the reprex package (v0.3.0)
You can automate this by checking whether a given facet has a maximum y-value that exceeds the threshold.
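library(dplyr)    # for %>%, group_by() and filter()
library(ggplot2)
threshold <- 7
iris %>%
  ggplot(aes(Sepal.Width, Sepal.Length)) +
  geom_point() +
  facet_wrap(~Species, scales = "free") +
  # keep only the facets whose maximum reaches the threshold
  geom_hline(data = . %>%
               group_by(Species) %>%
               filter(max(Sepal.Length, na.rm = TRUE) >= threshold),
             yintercept = threshold)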
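To see which facets end up with a line, the same grouping and filtering step can be run on its own. This is only a small check of the idea, with kept_species as a made-up helper name:

```r
library(dplyr)

threshold <- 7

# Same filter as in the plot: keep only groups whose maximum reaches the threshold
kept_species <- iris %>%
  group_by(Species) %>%
  filter(max(Sepal.Length, na.rm = TRUE) >= threshold) %>%
  distinct(Species) %>%
  pull(Species) %>%
  as.character()

kept_species
```

Groups that are filtered out contribute no rows to the geom_hline layer, so nothing is drawn in their facets and their free y-axis limits are left alone.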
Adapting from this post:
How can I add a line to one of the facets?
library(tidyverse)
iris %>%
ggplot(aes(x = Sepal.Width, y = Sepal.Length)) +
geom_point() +
facet_wrap(~Species, scales = "free") +
geom_hline(data = . %>% filter(Species != "setosa"), aes(yintercept = 7))
I use ggridges in R to visualize my data. But a lot of the lines are overlapping and are hard to read.
My code is:
ggplot(task1, aes(x = ibu, y = style, fill = style)) +
geom_density_ridges(alpha=1) +
theme_ridges() +
theme(legend.position = "none")
What should I change to make this visualization more readable?
You can use the scale parameter to adjust the overall height scaling. Just set it to a number that produces results you like.
library(ggplot2)
library(ggridges)
#>
#> Attaching package: 'ggridges'
#> The following object is masked from 'package:ggplot2':
#>
#> scale_discrete_manual
ggplot(iris, aes(x = Sepal.Length, y = Species, fill = Species)) +
geom_density_ridges()
#> Picking joint bandwidth of 0.181
ggplot(iris, aes(x = Sepal.Length, y = Species, fill = Species)) +
geom_density_ridges(scale = 0.5)
#> Picking joint bandwidth of 0.181
Created on 2019-11-03 by the reprex package (v0.3.0)
How do you only display the correlation coefficient in ggpubr::stat_cor, and not the p-value? There doesn't seem to be an argument within stat_cor to specify only one statistic or the other. Is there some other creative work-around?
Following Ben's solution, I am including a reproducible example.
First let's use a simple example:
library('ggplot2')
library('ggpubr')
data(iris)
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width )) +
geom_point() + theme_bw() +
stat_cor(method = "pearson")
Now, you wanted to display only the correlation coefficient, meaning R.
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() + theme_bw() +
  # after_stat(r.label) is the current spelling of the older ..r.label.. notation
  stat_cor(method = "pearson", aes(label = after_stat(r.label)))
Actually, you can also calculate the R-value independently and add a text box to the plot using annotate() (example from the iris dataset):
round(cor(iris$Sepal.Length, iris$Sepal.Width),2)
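The annotate() step itself could look roughly like this. It is a sketch only; the label position (x = 4.5, y = 4.4) is an arbitrary choice for the iris data and would need adjusting for other data.

```r
library(ggplot2)

# Pearson correlation, rounded for display
r_value <- round(cor(iris$Sepal.Length, iris$Sepal.Width), 2)

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() + theme_bw() +
  # Hand-placed text label; the coordinates are in data units
  annotate("text", x = 4.5, y = 4.4,
           label = paste("R =", r_value), hjust = 0)
```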
I'm having trouble substituting color() for facet_grid() when I want to 'split' my data by a variable. Instead of generating individual plots with regression lines, I'm looking to generate a single plot with all regression lines.
Here's my code:
ggplot(data, aes(x = Rooms, y = Price)) +
geom_point(size = 1, alpha = 1/100) +
geom_smooth(method = "lm", color = Type) # Single plot with all regression lines
ggplot(data, aes(x = Rooms, y = Price)) +
geom_point(size = 1, alpha = 1/100) +
geom_smooth(method = "lm") + facet_grid(. ~ Type) # Individual plots with regression lines
(The first plot doesn't work) Here's the output:
"Error in grDevices::col2rgb(colour, TRUE) : invalid color name 'Type'
In addition: Warning messages:
1: Removed 12750 rows containing non-finite values (stat_smooth).
2: Removed 12750 rows containing missing values (geom_point)."
Here's a link to the data:
Dataset
You need to supply an aesthetic mapping to geom_smooth, not just a parameter, which means you need to put colour inside aes(). This is what you need to do any time you want a graphical element to correspond to something in the data rather than to a fixed parameter.
Here's an example with the built-in iris dataset. In fact, if you move colour to the ggplot call so it is inherited by geom_point as well, then you can colour the points as well as the lines.
library(ggplot2)
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
geom_point() +
geom_smooth(aes(colour = Species), method = "lm")
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, colour = Species)) +
geom_point() +
geom_smooth(method = "lm")
Created on 2018-07-20 by the reprex package (v0.2.0).