I'm plotting a wrapped ggscatter like the image below. I want to color each panel according to its R and P values. For example, when P is not significant, I want the plot gray; when P is significant, I want the plot colored according to the R value on a continuous scale. The problem is that I don't know how to get those values to write an if statement inside ggscatter. Can anyone help me? Thank you!
Example of dataset:
conc exposure col
11.16 21294 0.139275104
11.16 18018 0.150012216
13.8 26208 0.067379679
18.1 29484 0.013190731
Plot:
ggscatter(data, x = "exposure", y = col, add = "reg.line", conf.int = TRUE, color = cor.hilab[1], cor.coef = TRUE) +
facet_wrap(~conc)+
ylab("OD")+
xlab("Exposure")
This is now implemented in package 'ggpmisc' on GitHub (future version 0.4.4), but not yet in version 0.4.3 available on CRAN. It does not do exactly what you asked, because at the moment stat_poly_eq() does not return R, which is not meaningful for polynomials of order higher than 1. Keeping to the grammar of graphics paradigm allows easier customization, at the cost of lengthier code than the all-in-one functions from package 'ggpubr'. Each approach has its advantages and drawbacks. I show here three examples of different possible approaches using package 'ggpmisc' (together with 'ggplot2').
library(ggpmisc)
#> Loading required package: ggpp
#> Loading required package: ggplot2
#>
#> Attaching package: 'ggpp'
#> The following object is masked from 'package:ggplot2':
#>
#> annotate
# First approach: faint lines for non-significant fits, with bands
ggplot(mpg, aes(displ, hwy)) +
geom_point() +
stat_poly_eq(aes(label = paste(after_stat(rr.label),
after_stat(p.value.label),
sep = "*\", \"*")),
label.x = "right") +
stat_poly_line(aes(colour = stage(after_scale = ifelse(p.value < 0.05,
alpha(colour, 1),
alpha(colour, 0.25)))),
se = TRUE,
mf.values = TRUE) +
facet_wrap(~class, ncol = 2) +
theme_bw()
# Second approach: faint lines for non-significant fits, no-bands
# colour mapped to class
ggplot(mpg, aes(displ, hwy)) +
geom_point(aes(colour = class)) +
stat_poly_eq(aes(label = paste(after_stat(rr.label),
after_stat(p.value.label),
sep = "*\", \"*")),
label.x = "right") +
stat_poly_line(aes(colour = stage(start = class,
after_scale = ifelse(p.value < 0.05,
alpha(colour, 1),
alpha(colour, 0.25)))),
se = FALSE,
mf.values = TRUE) +
facet_wrap(~class, ncol = 2) +
theme_bw()
#> Warning: Failed to apply `after_scale()` modifications to legend
# Third approach: no bands or lines for non-significant fits
ggplot(mpg, aes(displ, hwy)) +
geom_point(aes(colour = class)) +
stat_poly_eq(aes(label = paste(after_stat(rr.label),
after_stat(p.value.label),
sep = "*\", \"*")),
label.x = "right") +
stat_poly_line(aes(colour = stage(after_scale = ifelse(p.value < 0.05,
colour,
NA)),
fill = stage(after_scale = ifelse(p.value < 0.05,
fill,
NA))),
se = TRUE,
mf.values = TRUE) +
facet_wrap(~class, ncol = 2) +
theme_bw()
#> Warning: Duplicated aesthetics after name standardisation: NA
#> Warning: Failed to apply `after_scale()` modifications to legend
Created on 2021-09-07 by the reprex package (v2.0.1)
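The three approaches above change transparency or visibility; they do not map R to a continuous colour scale. A minimal sketch of what the question describes follows, assuming it is acceptable to compute the per-panel Pearson correlation with cor.test() before plotting and to use plain 'ggplot2' instead of ggscatter(). The names cor_stats, r_sig and plot_data are illustrative only, and the 0.05 threshold is an assumption.

```r
library(ggplot2)

# Per-panel Pearson correlation and p-value, computed before plotting
panels <- split(mpg, mpg$class)
cor_stats <- do.call(rbind, lapply(names(panels), function(cl) {
  ct <- cor.test(panels[[cl]]$displ, panels[[cl]]$hwy)
  data.frame(class = cl, r = unname(ct$estimate), p = ct$p.value)
}))

# Keep R only where the correlation is significant (assumed threshold: 0.05);
# NA values are drawn with the scale's na.value colour
cor_stats$r_sig <- ifelse(cor_stats$p < 0.05, cor_stats$r, NA)

# Attach the per-panel statistics to every observation
plot_data <- merge(mpg, cor_stats, by = "class")

ggplot(plot_data, aes(displ, hwy, colour = r_sig)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  scale_colour_gradient2(name = "R", limits = c(-1, 1), na.value = "grey60") +
  facet_wrap(~class, ncol = 2) +
  theme_bw()
```

Panels with a non-significant correlation are drawn in gray, and the remaining panels take their colour from a diverging scale centred on R = 0.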
I am using the function stat_poly_eq to display the R-squared and p-value of a linear regression within a ggplot. I have two questions to optimize my output:
How do I remove the equals sign (=) from the p-value, such that only the less-than sign (<) remains?
How can I set a desired number of significant digits to display? For example, I would like to see 3 significant digits for both the R-squared and p-value.
Here's some reproducible code to show the issue:
data(mtcars)
ggplot(data=mtcars, aes(x=mpg,y=hp)) +
geom_point() +
geom_smooth(method = "lm",formula = y ~ x,se=TRUE, color="black") +
stat_poly_eq(formula = y ~ x,
aes(label = paste(..rr.label.., ..p.value.label.., sep = "*`,`~")),
parse = TRUE,label.x.npc = "right",size=8)
The number of digits is easy to control with the argument rr.digits. I can't replicate your problem with the equals sign, but if you update ggplot2 and ggpmisc and use the modern after_stat() syntax rather than the deprecated .. syntax, you should get the same result as demonstrated in this reprex:
library(ggplot2)
library(ggpmisc)
#> Loading required package: ggpp
#>
#> Attaching package: 'ggpp'
#> The following object is masked from 'package:ggplot2':
#>
#> annotate
ggplot(mtcars, aes(mpg, hp)) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ x, color = "black") +
stat_poly_eq(formula = y ~ x,
aes(label = paste(after_stat(rr.label), "*`,`~",
after_stat(p.value.label))),
parse = TRUE, label.x.npc = "right", size = 8, rr.digits = 3)
Created on 2022-12-23 with reprex v2.0.2
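As a cross-check on the plotted labels, R-squared and the p-value of the overall F-test can also be computed directly with lm() and rounded to three significant digits in base R. This is only a sketch for verifying the numbers, not a description of how ggpmisc builds its labels; the variable names are illustrative.

```r
# Fit the same model that is drawn in the plot
fit <- lm(hp ~ mpg, data = mtcars)
fit_summary <- summary(fit)

r_squared <- fit_summary$r.squared

# fstatistic is a named vector with elements value, numdf and dendf
f_stat <- fit_summary$fstatistic
p_value <- pf(f_stat[["value"]], f_stat[["numdf"]], f_stat[["dendf"]],
              lower.tail = FALSE)

# Three significant digits for both quantities
signif(r_squared, 3)
signif(p_value, 3)
```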
I'm making a plot of several linear regressions and I would like to find the slope of each of them. The problem is that I can't find out how to do it in my case.
As you can see in my plot, I'm testing the weight as a function of the temperature, a quality (my two colors) and a quantity (my facet wrap).
Here is my code for this plot:
g = ggplot(donnees_tot, aes(x=temperature, y=weight, col = quality))+
geom_point(aes(col=quality), size = 3)+
geom_smooth(method="lm", span = 0.8,aes(col=quality, fill=quality))+
scale_color_manual(values=c("S" = "aquamarine3",
"Y" = "darkgoldenrod3"))+
scale_fill_manual(values=c("S" = "aquamarine3",
"Y" = "darkgoldenrod3"))+
scale_x_continuous(breaks=c(20,25,28), limits=c(20,28))+
annotate("text", x= Inf, y = - Inf, label =eqn, parse = T, hjust=1.1, vjust=-.5)+
facet_wrap(~quantity)
g
Also, if you have any tips on how to write them on my plot, I would be really grateful!
Thank you
Using the ggpmisc package, I've added these lines to my code and it works!
stat_poly_line() +
stat_poly_eq(aes(label = paste(after_stat(eq.label),
after_stat(rr.label), sep = "*\", \"*"))) +
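If the slopes are also needed as numbers rather than only as labels on the plot, one option is to fit a separate lm() to each colour and facet combination. The sketch below uses the built-in mtcars data as a stand-in, because donnees_tot is not available: wt, mpg, am and gear play the roles of temperature, weight, quality and quantity, and that substitution is an assumption made purely for illustration.

```r
# One subset per combination of the two grouping variables
# (drop = TRUE discards combinations with no observations)
groups <- split(mtcars, list(mtcars$am, mtcars$gear), drop = TRUE)

# Slope of the linear regression fitted within each subset
slopes <- sapply(groups, function(d) coef(lm(mpg ~ wt, data = d))[["wt"]])
slopes
```

With the original data, the same pattern would presumably become split(donnees_tot, list(donnees_tot$quality, donnees_tot$quantity), drop = TRUE) with the model formula weight ~ temperature.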
I am trying to put an R² value on a graph where I have filtered and ordered and summarised some values, and it will not let me. It allows a line of best fit but the R² value will not appear despite a stat_regline_equation line.
Here is my code:
s3average <- summarySE(S2_data, measurevar="Oxygen_consumption_rate", groupvars=c("Species", "Min", "Max", "Median", "Feeding"))
s3average$FeedingOrder <- factor(s3average$Feeding, levels= c("NA", "Large nektonic", "Mixed", "Small nektonic", "Planktonic"))
Core_feeding <- filter(s3average, grepl('Large nektonic|Small nektonic|Planktonic', Feeding))
diet2 <- ggplot(Core_feeding, aes(FeedingOrder, Oxygen_consumption_rate))+
geom_point()+
geom_smooth(method=lm, na.rm = TRUE, aes(group=1),colour="black", se = FALSE)+
stat_regline_equation(aes(label = ..rr.label..))
print(diet2)
It winds up looking like this:
Despite the line of code requesting the value it does not give it and I cannot figure out why.
Add group=1 to your top level aesthetics rather than in geom_smooth():
data(iris)
library(ggplot2)
library(ggpubr)
ggplot(iris, aes(x = Species, y = Sepal.Length, group = 1)) +
geom_point() +
geom_smooth(method = lm, na.rm = TRUE, colour = "black", se = FALSE) +
stat_regline_equation(aes(label = ..rr.label..))
Created on 2022-03-28 by the reprex package (v2.0.1)
In the following plot, which is a simple scatter plot + theme_apa(), I would like both axes to pass through 0.
I tried some of the solutions proposed in answers to similar questions, but none of them worked.
A MWE to reproduce the plot:
library(papaja)
library(ggplot2)
library(MASS)
plot_two_factor <- function(factor_sol, groups) {
the_df <- as.data.frame(factor_sol)
the_df$groups <- groups
p1 <- ggplot(data = the_df, aes(x = MR1, y = MR2, color = groups)) +
geom_point() + theme_apa()
}
set.seed(131340)
n <- 30
group1 <- mvrnorm(n, mu=c(0,0.6), Sigma = diag(c(0.01,0.01)))
group2 <- mvrnorm(n, mu=c(0.6,0), Sigma = diag(c(0.01,0.01)))
factor_sol <- rbind(group1, group2)
colnames(factor_sol) <- c("MR1", "MR2")
groups <- as.factor(rep(c(1,2), each = n))
print(plot_two_factor(factor_sol, groups))
The papaja package can be installed via
devtools::install_github("crsh/papaja")
What you request cannot be achieved in ggplot2, and for a good reason: if you include the axes and tick labels within the plotting area, they will sooner or later overlap with points or lines representing data. I used the answers by @phiggins and @Job Nmadu as a starting point. I changed the order of the geoms to make sure the "data" are plotted on top of the axes. I changed the theme to theme_minimal() so that axes are not drawn outside the plotting area. I modified the offsets used for the data to better demonstrate how the code works.
library(ggplot2)
library(magrittr) # provides the %>% pipe used below
iris %>%
ggplot(aes(Sepal.Length - 5, Sepal.Width - 2, col = Species)) +
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0) +
geom_point() +
theme_minimal()
This gets as close as possible to answering the question using ggplot2.
Using package 'ggpmisc' we can slightly simplify the code.
library(ggpmisc)
iris %>%
ggplot(aes(Sepal.Length - 5, Sepal.Width - 2, col = Species)) +
geom_quadrant_lines(linetype = "solid") +
geom_point() +
theme_minimal()
This code produces exactly the same plot as shown above.
If you want to always have the origin centered, i.e., symmetrical plus and minus limits in the plots irrespective of the data range, then package 'ggpmisc' provides a simple solution with function symmetric_limits(). This is how quadrant plots for gene expression and similar bidirectional responses are usually drawn.
iris %>%
ggplot(aes(Sepal.Length - 5, Sepal.Width - 2, col = Species)) +
geom_quadrant_lines(linetype = "solid") +
geom_point() +
scale_x_continuous(limits = symmetric_limits) +
scale_y_continuous(limits = symmetric_limits) +
theme_minimal()
The grid can be removed from the plotting area by adding + theme(panel.grid = element_blank()) after theme_minimal() to any of the three examples.
Loading 'ggpmisc' just for function symmetric_limits() is overkill, so here I show its definition, which is extremely simple:
symmetric_limits <- function (x)
{
max <- max(abs(x))
c(-max, max)
}
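A quick way to see what this function does is to call it on a data range that is not centred on zero. The sketch repeats the definition so that it runs on its own; the input values are arbitrary.

```r
symmetric_limits <- function(x) {
  max <- max(abs(x))
  c(-max, max)
}

# The largest absolute value sets both limits
symmetric_limits(c(-1, 3))
#> [1] -3  3
```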
For the record, the following also works as above.
iris %>%
ggplot(aes(Sepal.Length-6.2, Sepal.Width-3.2, col = Species)) +
geom_point() +
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0)
Setting xlim and ylim should work.
library(tidyverse)
# default
iris %>%
ggplot(aes(Sepal.Length, Sepal.Width, col = Species)) +
geom_point()
# setting xlim and ylim
iris %>%
ggplot(aes(Sepal.Length, Sepal.Width, col = Species)) +
geom_point() +
xlim(c(0,8)) +
ylim(c(0,4.5))
Created on 2020-06-12 by the reprex package (v0.3.0)
While the question is not very clear, PoGibas seems to think that this is what the OP wanted.
library(tidyverse)
# default
iris %>%
ggplot(aes(Sepal.Length-6.2, Sepal.Width-3.2, col = Species)) +
geom_point() +
xlim(c(-2.5,2.5)) +
ylim(c(-1.5,1.5)) +
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0)
Created on 2020-06-12 by the reprex package (v0.3.0)
I am looking to format the value labels with "," separators, particularly on the strata (bar columns) of an alluvial/Sankey plot using R ggalluvial.
While similar answers were found on other charts, the same attempt has returned an error (notice the missing value labels and messed up flow connections):
library(ggplot2)
library(ggalluvial)
library(scales)
vaccinations$freq = vaccinations$freq * 1000
ggplot(vaccinations,
aes(x = survey, stratum = response, alluvium = subject,
y = freq,
fill = response, label = comma(freq))) +
scale_x_discrete(expand = c(.1, .1)) +
geom_flow() +
geom_stratum(alpha = .5) +
geom_text(stat = "stratum", size = 3) +
theme(legend.position = "bottom") +
ggtitle("vaccination survey responses at three points in time")
Warning message:
Removed 12 rows containing missing values (geom_text).
The internals of ggalluvial prevent this from working, as @TobiO suspects. Specifically, when a numeric-valued variable is passed to label and processed by one of the alluvial stats, it is automatically totaled. When a character-valued variable is passed to label, this can't be done. So the formatting must take place after the statistical transformation.
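The order of operations can be illustrated in base R, without ggalluvial: numeric values can be totaled and then formatted, whereas values that were already formatted as text cannot be totaled. This is only an analogy for the behaviour described above, with made-up numbers.

```r
freq <- c(48500, 12300)  # made-up counts within one stratum

# Total first, format afterwards: works
total <- sum(freq)
label <- format(total, big.mark = ",")
label

# Format first, total afterwards: sum() fails on character input
formatted <- format(freq, big.mark = ",")
result <- tryCatch(sum(formatted), error = function(e) "cannot total text")
result
```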
A solution is provided by ggfittext: The function geom_fit_text() has a formatter parameter to which a formatting function can be passed—though the function must be compatible with the type of variable passed to label! Here's an example:
library(ggalluvial)
#> Loading required package: ggplot2
library(ggfittext)
library(scales)
data(vaccinations)
vaccinations <- transform(vaccinations, freq = freq * 1000)
ggplot(vaccinations,
aes(x = survey, stratum = response, alluvium = subject,
y = freq,
fill = response, label = freq)) +
scale_x_discrete(expand = c(.1, .1)) +
geom_flow() +
geom_stratum(alpha = .5) +
geom_fit_text(stat = "stratum", size = 10, min.size = 6, formatter = comma) +
theme(legend.position = "bottom") +
ggtitle("vaccination survey responses at three points in time")
Created on 2019-09-04 by the reprex package (v0.2.1)