ggplot add geom (specifically geom_hline) which doesn't affect limits - r

I have some data that I would like to plot a threshold on, only if the data approaches the threshold. Therefore I would like to have a horizontal line at my threshold, but not extend the y axis limits if this value wouldn't have already been included. As my data is faceted it is not feasible to pre-calculate limits and I am doing it for many different data sets so would get very messy. This question seems to be asking the same thing but the answers are not relevant to me: ggplot2: Adding a geom without affecting limits
Simple example.
library(ggplot2)
#> Warning: package 'ggplot2' was built under R version 3.5.3
ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length))+geom_point()+facet_wrap(~Species, scales = "free")+geom_hline(yintercept = 7)
which gives me
But I would like this (created in paint) where the limits have not been impacted by the geom_hline
Created on 2020-01-21 by the reprex package (v0.3.0)

You can automate this by checking whether a given facet has a maximum y-value that exceeds the threshold.
threshold = 7
iris %>%
ggplot(aes(Sepal.Width, Sepal.Length)) +
geom_point() +
facet_wrap(~Species, scales = "free") +
geom_hline(data = . %>%
group_by(Species) %>%
filter(max(Sepal.Length, na.rm=TRUE) >= threshold),
yintercept = threshold)

Adapting from this post:
How can I add a line to one of the facets?
library(tidyverse)
iris %>%
ggplot(aes(x = Sepal.Width, y = Sepal.Length)) +
geom_point() +
facet_wrap(~Species, scales = "free") +
geom_hline(data = . %>% filter(Species != "setosa"), aes(yintercept = 7))

Related

Same y-axis range with ggarrange if I am already using facets and cannot use them again

my question is basically a follow-up to this question. However, the problem is that in the said question the answer completely bypasses the fact that ggarrange is used and instead transfers the whole issue to be handled by the facets functionality of ggplot.
This doesn't work for me since I already am using facets in the sub-plots and I cannot use them again.
Here is some example code. I am wondering how to achieve that the two plots which are joined with ggarrange have the same range of y-axis (of course, not setting the limits manually).
mtcars %>%
group_split(vs) %>%
map(~ggplot(., aes(x = mpg, y = wt)) +
geom_point() +
facet_grid(rows = vars(am), cols = vars(gear))) %>%
ggarrange(plotlist = .)
As you can see, the left image's y-axis ranges from 2 to 5, while the right plot's y-axis ranges from 1.5 to 3.5. How can I make them be the same?
I'm once again arguing for abandoning the 'ggarrange' approach, this time in favour of the {patchwork} package, which allows you to apply an operation to all previous plots. In this case, we can use & scale_y_continuous(limits = ...) to set the limits for all plots.
library(ggplot2)
library(dplyr)
library(purrr)
library(patchwork)
mtcars %>%
group_split(vs) %>%
map(~ggplot(., aes(x = mpg, y = wt)) +
geom_point() +
facet_grid(rows = vars(am), cols = vars(gear))) %>%
wrap_plots() &
scale_y_continuous(limits = range(mtcars$wt))
Created on 2022-12-08 by the reprex package (v2.0.0)
One option would be to compute and add the range of your x and y variables to your dataset before splitting, which could then be used to set the limits.
library(dplyr)
library(ggplot2)
library(ggpubr)
library(purrr)
mtcars %>%
mutate(across(c(mpg, wt), list(range = ~list(range(.x))))) %>%
group_split(vs) %>%
map(~ggplot(., aes(x = mpg, y = wt)) +
geom_point() +
scale_x_continuous(limits = .$mpg_range[[1]]) +
scale_y_continuous(limits = .$wt_range[[1]]) +
facet_grid(rows = vars(am), cols = vars(gear))) %>%
ggarrange(plotlist = .)

Is there a way to get multi-paneled plots in ggplot with different color axes for each plot? [duplicate]

Say, I make a gpplot2 plot like the following with several facets:
ggplot(iris) +
geom_tile(aes(x = Petal.Width, fill = Sepal.Width, y = Petal.Length)) +
facet_wrap(~Species)
Note that there is one colourbar for all three plots, but each facet could potentially have a very different values. Is it possible to have a separate colourbar for each facet?
I agree with Alex's answer, but against my better scientific and design judgment, I took a stab at it.
require(gridExtra)
require(dplyr)
iris %>% group_by(Species) %>%
do(gg = {ggplot(., aes(Petal.Width, Petal.Length, fill = Sepal.Width)) +
geom_tile() + facet_grid(~Species) +
guides(fill = guide_colourbar(title.position = "top")) +
theme(legend.position = "top")}) %>%
.$gg %>% arrangeGrob(grobs = ., nrow = 1) %>% grid.arrange()
Of course, then you're duplicating lots of labels, which is annoying. Additionally, you lose the x and y scale information by plotting each species as a separate plot, instead of facets of a single plot. You could fix the axes by adding ... + coord_cartesian(xlim = range(iris$Petal.Width), ylim = range(iris$Petal.Length)) + ... within that ggplot call.
To be honest, the only way this makes sense at all is if it's comparing two different variables for the fill, which is why you don't care about comparing their true value between plots. A good alternative would be rescaling them to percentiles within a facet using dplyr::group_by() and dplyr::percent_rank.
Edited to update:
In the two-different-variables case, you have to first "melt" the data, which I assume you've already done. Here I'm repeating it with the iris data. Then you can look at the relative values by examining the percentiles, rather than the absolute values of the two variables.
iris %>%
tidyr::gather(key = Sepal.measurement,
value = value,
Sepal.Length, Sepal.Width) %>%
group_by(Sepal.measurement) %>%
mutate(percentilevalue = percent_rank(value)) %>%
ggplot(aes(Petal.Length, Petal.Width)) +
geom_tile(aes(fill = percentilevalue)) +
facet_grid(Sepal.measurement ~ Species) +
scale_fill_continuous(limits = c(0,1), labels = scales::percent)
Separate palettes for facets in ggplot facet_grid
It has been asked before. This is the best solution I have seen so far, however I think having a common palette is more ideal from a visualization standpoint.
If this is what you want then there is a simple hack to it.
tf1 <- iris
tf1$COL <- rep(1:50, each=3)
ggplot(tf1) +
geom_tile(aes(x = Petal.Width, fill = interaction(Petal.Length,COL), y = Petal.Length)) +
facet_wrap(~Species, scales = "free") + theme(legend.position="none")

How To Center Axes in ggplot2

In the following plot, which is a simple scatter plot + theme_apa(), I would like that both axes go through 0.
I tried some of the solutions proposed in the answers to similar questions to that but none of them worked.
A MWE to reproduce the plot:
library(papaja)
library(ggplot2)
library(MASS)
plot_two_factor <- function(factor_sol, groups) {
the_df <- as.data.frame(factor_sol)
the_df$groups <- groups
p1 <- ggplot(data = the_df, aes(x = MR1, y = MR2, color = groups)) +
geom_point() + theme_apa()
}
set.seed(131340)
n <- 30
group1 <- mvrnorm(n, mu=c(0,0.6), Sigma = diag(c(0.01,0.01)))
group2 <- mvrnorm(n, mu=c(0.6,0), Sigma = diag(c(0.01,0.01)))
factor_sol <- rbind(group1, group2)
colnames(factor_sol) <- c("MR1", "MR2")
groups <- as.factor(rep(c(1,2), each = n))
print(plot_two_factor(factor_sol, groups))
The papaja package can be installed via
devtools::install_github("crsh/papaja")
What you request cannot be achieved in ggplot2 and for a good reason, if you include axis and tick labels within the plotting area they will sooner or later overlap with points or lines representing data. I used #phiggins and #Job Nmadu answers as a starting point. I changed the order of the geoms to make sure the "data" are plotted on top of the axes. I changed the theme to theme_minimal() so that axes are not drawn outside the plotting area. I modified the offsets used for the data to better demonstrate how the code works.
library(ggplot2)
iris %>%
ggplot(aes(Sepal.Length - 5, Sepal.Width - 2, col = Species)) +
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0) +
geom_point() +
theme_minimal()
This gets as close as possible to answering the question using ggplot2.
Using package 'ggpmisc' we can slightly simplify the code.
library(ggpmisc)
iris %>%
ggplot(aes(Sepal.Length - 5, Sepal.Width - 2, col = Species)) +
geom_quadrant_lines(linetype = "solid") +
geom_point() +
theme_minimal()
This code produces exactly the same plot as shown above.
If you want to always have the origin centered, i.e., symmetrical plus and minus limits in the plots irrespective of the data range, then package 'ggpmisc' provides a simple solution with function symmetric_limits(). This is how quadrant plots for gene expression and similar bidirectional responses are usually drawn.
iris %>%
ggplot(aes(Sepal.Length - 5, Sepal.Width - 2, col = Species)) +
geom_quadrant_lines(linetype = "solid") +
geom_point() +
scale_x_continuous(limits = symmetric_limits) +
scale_y_continuous(limits = symmetric_limits) +
theme_minimal()
The grid can be removed from the plotting area by adding + theme(panel.grid = element_blank()) after theme_minimal() to any of the three examples.
Loading 'ggpmisc' just for function symmetric_limits() is overkill, so here I show its definition, which is extremely simple:
symmetric_limits <- function (x)
{
max <- max(abs(x))
c(-max, max)
}
For the record, the following also works as above.
iris %>%
ggplot(aes(Sepal.Length-6.2, Sepal.Width-3.2, col = Species)) +
geom_point() +
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0)
Setting xlim and slim should work.
library(tidyverse)
# default
iris %>%
ggplot(aes(Sepal.Length, Sepal.Width, col = Species)) +
geom_point()
# setting xlim and ylim
iris %>%
ggplot(aes(Sepal.Length, Sepal.Width, col = Species)) +
geom_point() +
xlim(c(0,8)) +
ylim(c(0,4.5))
Created on 2020-06-12 by the reprex package (v0.3.0)
While the question is not very clear, PoGibas seems to think that this is what the OP wanted.
library(tidyverse)
# default
iris %>%
ggplot(aes(Sepal.Length-6.2, Sepal.Width-3.2, col = Species)) +
geom_point() +
xlim(c(-2.5,2.5)) +
ylim(c(-1.5,1.5)) +
geom_hline(yintercept = 0) +
geom_vline(xintercept = 0)
Created on 2020-06-12 by the reprex package (v0.3.0)

Binning continuous variable then visualizing it with ggplot does not reproduce example in textbook?

Given "diamonds" dataset in tidyverse,
and filtering it as per ch.7 in "R for Data Science":
smaller <- diamonds %>% filter(carat < 3),
I expect
ggplot(data = smaller, mapping = aes(x = carat, y = price)) +
+ geom_boxplot(mapping = aes(group = cut_width(carat, 0.1)))
to return
price vs carat (binned),
but instead see this
returned.
Why is this? Is it because of a change in ggplot2, or is it another reason?
Yes. With the release of ggplot2 3.3.0 bi-directional geoms and stats have been introduced, which also result in a kind of direction determination by ggplot2. See here:
For this reason you have to add orientation=x to get the plot from R4DS:
library(ggplot2)
library(dplyr)
smaller <- diamonds %>% filter(carat < 3)
ggplot(data = smaller, mapping = aes(x = carat, y = price)) +
geom_boxplot(mapping = aes(group = cut_width(carat, 0.1)), orientation = "x")
Created on 2020-04-13 by the reprex package (v0.3.0)

Different colourbar for each facet in ggplot figure

Say, I make a gpplot2 plot like the following with several facets:
ggplot(iris) +
geom_tile(aes(x = Petal.Width, fill = Sepal.Width, y = Petal.Length)) +
facet_wrap(~Species)
Note that there is one colourbar for all three plots, but each facet could potentially have a very different values. Is it possible to have a separate colourbar for each facet?
I agree with Alex's answer, but against my better scientific and design judgment, I took a stab at it.
require(gridExtra)
require(dplyr)
iris %>% group_by(Species) %>%
do(gg = {ggplot(., aes(Petal.Width, Petal.Length, fill = Sepal.Width)) +
geom_tile() + facet_grid(~Species) +
guides(fill = guide_colourbar(title.position = "top")) +
theme(legend.position = "top")}) %>%
.$gg %>% arrangeGrob(grobs = ., nrow = 1) %>% grid.arrange()
Of course, then you're duplicating lots of labels, which is annoying. Additionally, you lose the x and y scale information by plotting each species as a separate plot, instead of facets of a single plot. You could fix the axes by adding ... + coord_cartesian(xlim = range(iris$Petal.Width), ylim = range(iris$Petal.Length)) + ... within that ggplot call.
To be honest, the only way this makes sense at all is if it's comparing two different variables for the fill, which is why you don't care about comparing their true value between plots. A good alternative would be rescaling them to percentiles within a facet using dplyr::group_by() and dplyr::percent_rank.
Edited to update:
In the two-different-variables case, you have to first "melt" the data, which I assume you've already done. Here I'm repeating it with the iris data. Then you can look at the relative values by examining the percentiles, rather than the absolute values of the two variables.
iris %>%
tidyr::gather(key = Sepal.measurement,
value = value,
Sepal.Length, Sepal.Width) %>%
group_by(Sepal.measurement) %>%
mutate(percentilevalue = percent_rank(value)) %>%
ggplot(aes(Petal.Length, Petal.Width)) +
geom_tile(aes(fill = percentilevalue)) +
facet_grid(Sepal.measurement ~ Species) +
scale_fill_continuous(limits = c(0,1), labels = scales::percent)
Separate palettes for facets in ggplot facet_grid
It has been asked before. This is the best solution I have seen so far, however I think having a common palette is more ideal from a visualization standpoint.
If this is what you want then there is a simple hack to it.
tf1 <- iris
tf1$COL <- rep(1:50, each=3)
ggplot(tf1) +
geom_tile(aes(x = Petal.Width, fill = interaction(Petal.Length,COL), y = Petal.Length)) +
facet_wrap(~Species, scales = "free") + theme(legend.position="none")

Resources