Different colourbar for each facet in ggplot figure - r

Say I make a ggplot2 plot like the following, with several facets:
ggplot(iris) +
geom_tile(aes(x = Petal.Width, fill = Sepal.Width, y = Petal.Length)) +
facet_wrap(~Species)
Note that there is one colourbar for all three plots, but each facet could potentially have very different values. Is it possible to have a separate colourbar for each facet?

I agree with Alex's answer, but against my better scientific and design judgment, I took a stab at it.
library(ggplot2)
library(gridExtra)
library(dplyr)
iris %>% group_by(Species) %>%
do(gg = {ggplot(., aes(Petal.Width, Petal.Length, fill = Sepal.Width)) +
geom_tile() + facet_grid(~Species) +
guides(fill = guide_colourbar(title.position = "top")) +
theme(legend.position = "top")}) %>%
.$gg %>% arrangeGrob(grobs = ., nrow = 1) %>% grid.arrange()
Of course, then you're duplicating lots of labels, which is annoying. Additionally, you lose the x and y scale information by plotting each species as a separate plot, instead of facets of a single plot. You could fix the axes by adding ... + coord_cartesian(xlim = range(iris$Petal.Width), ylim = range(iris$Petal.Length)) + ... within that ggplot call.
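As a rough sketch of that axis fix, the per-species plots can share limits computed once from the full data. This version swaps the do() pipeline for split() and lapply(), which is my own substitution rather than the answer's code; the names x_lim, y_lim and plots are made up for illustration.

```r
library(ggplot2)
library(gridExtra)

# Axis limits computed once from the full data set, so every panel matches
x_lim <- range(iris$Petal.Width)
y_lim <- range(iris$Petal.Length)

# One plot per species, each with its own colourbar but shared axis limits
plots <- lapply(split(iris, iris$Species), function(d) {
  ggplot(d, aes(Petal.Width, Petal.Length, fill = Sepal.Width)) +
    geom_tile() +
    facet_grid(~Species) +
    coord_cartesian(xlim = x_lim, ylim = y_lim) +
    theme(legend.position = "top")
})

# Draw the three plots side by side
grid.arrange(grobs = plots, nrow = 1)
```

The axis labels are still repeated in every panel; this only restores comparable x and y scales.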
To be honest, the only way this makes sense at all is if it's comparing two different variables for the fill, which is why you don't care about comparing their true value between plots. A good alternative would be rescaling them to percentiles within a facet using dplyr::group_by() and dplyr::percent_rank.
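A minimal sketch of the within-facet rescaling for the single-variable case from the question, assuming the fill is Sepal.Width and the facets are Species. The column name sepal_width_pct is hypothetical.

```r
library(dplyr)
library(ggplot2)

# Rank Sepal.Width within each species; values fall between 0 and 1
iris_pct <- iris %>%
  group_by(Species) %>%
  mutate(sepal_width_pct = percent_rank(Sepal.Width)) %>%
  ungroup()

# A single shared colourbar now shows relative position within each facet
p_pct <- ggplot(iris_pct, aes(Petal.Width, Petal.Length, fill = sepal_width_pct)) +
  geom_tile() +
  facet_wrap(~Species) +
  scale_fill_continuous(limits = c(0, 1), labels = scales::percent)

p_pct
```

The trade-off is that the colourbar no longer shows absolute Sepal.Width values, only each observation's rank within its own species.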
Edited to update:
In the two-different-variables case, you have to first "melt" the data, which I assume you've already done. Here I'm repeating it with the iris data. Then you can look at the relative values by examining the percentiles, rather than the absolute values of the two variables.
iris %>%
tidyr::gather(key = Sepal.measurement,
value = value,
Sepal.Length, Sepal.Width) %>%
group_by(Sepal.measurement) %>%
mutate(percentilevalue = percent_rank(value)) %>%
ggplot(aes(Petal.Length, Petal.Width)) +
geom_tile(aes(fill = percentilevalue)) +
facet_grid(Sepal.measurement ~ Species) +
scale_fill_continuous(limits = c(0,1), labels = scales::percent)

Separate palettes for facets in ggplot facet_grid
It has been asked before. This is the best solution I have seen so far, however I think having a common palette is more ideal from a visualization standpoint.

If this is what you want then there is a simple hack to it.
tf1 <- iris
tf1$COL <- rep(1:50, each=3)
ggplot(tf1) +
geom_tile(aes(x = Petal.Width, fill = interaction(Petal.Length,COL), y = Petal.Length)) +
facet_wrap(~Species, scales = "free") + theme(legend.position="none")

Related

Same y-axis range with ggarrange if I am already using facets and cannot use them again

My question is basically a follow-up to this question. However, the answer to that question completely bypasses the fact that ggarrange is used and instead hands the whole issue over to ggplot's faceting functionality.
This doesn't work for me since I already am using facets in the sub-plots and I cannot use them again.
Here is some example code. I am wondering how to achieve that the two plots which are joined with ggarrange have the same range of y-axis (of course, not setting the limits manually).
mtcars %>%
group_split(vs) %>%
map(~ggplot(., aes(x = mpg, y = wt)) +
geom_point() +
facet_grid(rows = vars(am), cols = vars(gear))) %>%
ggarrange(plotlist = .)
As you can see, the left image's y-axis ranges from 2 to 5, while the right plot's y-axis ranges from 1.5 to 3.5. How can I make them be the same?
I'm once again arguing for abandoning the 'ggarrange' approach, this time in favour of the {patchwork} package, which allows you to apply an operation to all previous plots. In this case, we can use & scale_y_continuous(limits = ...) to set the limits for all plots.
library(ggplot2)
library(dplyr)
library(purrr)
library(patchwork)
mtcars %>%
group_split(vs) %>%
map(~ggplot(., aes(x = mpg, y = wt)) +
geom_point() +
facet_grid(rows = vars(am), cols = vars(gear))) %>%
wrap_plots() &
scale_y_continuous(limits = range(mtcars$wt))
Created on 2022-12-08 by the reprex package (v2.0.0)
One option would be to compute and add the range of your x and y variables to your dataset before splitting, which could then be used to set the limits.
library(dplyr)
library(ggplot2)
library(ggpubr)
library(purrr)
mtcars %>%
mutate(across(c(mpg, wt), list(range = ~list(range(.x))))) %>%
group_split(vs) %>%
map(~ggplot(., aes(x = mpg, y = wt)) +
geom_point() +
scale_x_continuous(limits = .$mpg_range[[1]]) +
scale_y_continuous(limits = .$wt_range[[1]]) +
facet_grid(rows = vars(am), cols = vars(gear))) %>%
ggarrange(plotlist = .)

ggplot add geom (specifically geom_hline) which doesn't affect limits

I have some data that I would like to plot a threshold on, only if the data approaches the threshold. Therefore I would like to have a horizontal line at my threshold, but not extend the y axis limits if this value wouldn't have already been included. As my data is faceted it is not feasible to pre-calculate limits and I am doing it for many different data sets so would get very messy. This question seems to be asking the same thing but the answers are not relevant to me: ggplot2: Adding a geom without affecting limits
Simple example.
library(ggplot2)
#> Warning: package 'ggplot2' was built under R version 3.5.3
ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length))+geom_point()+facet_wrap(~Species, scales = "free")+geom_hline(yintercept = 7)
which gives me
But I would like this (created in paint) where the limits have not been impacted by the geom_hline
Created on 2020-01-21 by the reprex package (v0.3.0)
You can automate this by checking whether a given facet has a maximum y-value that exceeds the threshold.
library(dplyr)
library(ggplot2)
threshold <- 7
iris %>%
ggplot(aes(Sepal.Width, Sepal.Length)) +
geom_point() +
facet_wrap(~Species, scales = "free") +
geom_hline(data = . %>%
group_by(Species) %>%
filter(max(Sepal.Length, na.rm=TRUE) >= threshold),
yintercept = threshold)
Adapting from this post:
How can I add a line to one of the facets?
library(tidyverse)
iris %>%
ggplot(aes(x = Sepal.Width, y = Sepal.Length)) +
geom_point() +
facet_wrap(~Species, scales = "free") +
geom_hline(data = . %>% filter(Species != "setosa"), aes(yintercept = 7))

How to graph "before and after" measures using ggplot with connecting lines and subsets?

I'm totally new to ggplot and relatively fresh with R, and I want to make a smashing "before-and-after" scatterplot with connecting lines to illustrate the movement in percentages of different subgroups before and after a special training initiative. I've tried some options, but have yet to:
show each individual observation separately (now same values are overlapping)
connect the related before and after measures (x=0 and X=1) with lines to more clearly illustrate the direction of variation
subset the data along class and id using shape and colors
How can I best create a scatter plot using ggplot (or other) fulfilling the above demands?
Main alternative: geom_point()
Here is some sample data and example code using geom_point():
x <- c(0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1) # 0=before, 1=after
y <- c(45,30,10,40,10,NA,30,80,80,NA,95,NA,90,NA,90,70,10,80,98,95) # percentage of "feelings of peace"
class <- c(0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,1) # 0=multiple days 1=one day
id <- c(1,1,2,3,4,4,4,4,5,6,1,1,2,3,4,4,4,4,5,6) # id = per individual
df <- data.frame(x,y,class,id)
ggplot(df, aes(x=x, y=y), fill=id, shape=class) + geom_point()
Alternative: scale_size()
I have explored stat_sum() to summarize the frequencies of overlapping observations, but then not being able to subset using colors and shapes due to overlap.
ggplot(df, aes(x=x, y=y)) +
stat_sum()
Alternative: geom_dotplot()
I have also explored geom_dotplot() to clarify the overlapping observations that arise from using geom_point(), as in the example below. However, I have yet to understand how to combine the before and after measures into the same plot.
df1 <- df[1:10,] # data before
df2 <- df[11:20,] # data after
p1 <- ggplot(df1, aes(x=x, y=y)) +
geom_dotplot(binaxis = "y", stackdir = "center",stackratio=2,
binwidth=(1/0.3))
p2 <- ggplot(df2, aes(x=x, y=y)) +
geom_dotplot(binaxis = "y", stackdir = "center",stackratio=2,
binwidth=(1/0.3))
grid.arrange(p1,p2, nrow=1) # GridExtra package
Or maybe it is better to summarize data by x, id, class as mean/median of y, filter out ids producing NAs (e.g. ids 3 and 6), and connect the points by lines? So in case if you don't really need to show variability for some ids (which could be true if the plot only illustrates tendencies) you can do it this way:
library(ggplot2)
library(dplyr)
#library(ggthemes)
df <- df %>%
group_by(x, id, class) %>%
summarize(y = median(y, na.rm = TRUE)) %>%
ungroup() %>%
mutate(
id = factor(id),
x = factor(x, labels = c("before", "after")),
class = factor(class, labels = c("multiple days", "one day")) # 0 = multiple days, 1 = one day
) %>%
group_by(id) %>%
mutate(nas = any(is.na(y))) %>%
ungroup() %>%
filter(!nas) %>%
select(-nas)
ggplot(df, aes(x = x, y = y, col = id, group = id)) +
geom_point(aes(shape = class)) +
geom_line(show.legend = FALSE) +
#theme_few() +
#theme(legend.position = "none") +
ylab("Feelings of peace, %") +
xlab("")
Here's one possible solution for you.
First - to get the color and shapes determined by variables, you need to put these into the aes function. I turned several into factors, so the labs function fixes the labels so they don't appear as "factor(x)" but just "x".
To address multiple points, one solution is to use geom_smooth with method = "lm". This plots the regression line, instead of connecting all the dots.
The option se = FALSE prevents confidence intervals from being plotted - I don't think they add a lot to your plot, but play with it.
Connecting the dots is done by geom_line - feel free to try that as well.
Within geom_point, the option position = position_jitter(width = .1) adds random noise to the x-axis so points do not overlap.
ggplot(df, aes(x=factor(x), y=y, color=factor(id), shape=factor(class), group = id)) +
geom_point(position = position_jitter(width = .1)) +
geom_smooth(method = 'lm', se = FALSE) +
labs(
x = "x",
color = "ID",
shape = 'Class'
)

Split data to plot histograms side-by-side in R

I am learning R with the Australian athletes data set.
By using ggplot, I can plot a histogram like this.
library(DAAG)
ggplot(ais, aes(wt, fill = sex)) +
geom_histogram(binwidth = 5)
According to summary(ais$wt), the 3rd quartile is 84.12. Now I want to split the data at wt = 84.12 and plot two similar histograms side by side.
The split is:
ais1 = ais$wt[which(ais$wt>=0 & ais$wt<=84.12)]
ais2 = ais$wt[which(ais$wt>84.12)]
But I don’t know how to fit them in the plotting. I tried but it doesn't work:
ggplot(ais1, aes(wt, fill = sex)) +...
How can I plot the histograms (2 similar histograms accordingly, side by side)?
Add the split as a column to your data
ais$wt_3q = ifelse(ais$wt <= 84.12, "Quartiles 1-3", "Quartile 4")
Then use facets:
ggplot(ais, aes(wt, fill = sex)) +
geom_histogram(binwidth = 5) +
facet_wrap(~ wt_3q)
The created variable is a character vector, so the facets appear in alphabetical order. To order the facets differently, convert it to a factor with the levels in the order you want (this works the same way as reordering bars in a ggplot bar plot, which many questions here cover). You can also let the scales vary between facets; see ?facet_wrap for details.
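As an illustrative sketch of both suggestions: ifelse() returns a character vector here, so wrapping it in factor() with explicit levels fixes the panel order. The level order and the choice of scales = "free_x" are assumptions made for this example, not part of the original answer.

```r
library(DAAG)     # provides the ais data set
library(ggplot2)

# Build the grouping column as a factor with an explicit level order,
# so the "Quartiles 1-3" panel is drawn first
ais$wt_3q <- factor(
  ifelse(ais$wt <= 84.12, "Quartiles 1-3", "Quartile 4"),
  levels = c("Quartiles 1-3", "Quartile 4")
)

ggplot(ais, aes(wt, fill = sex)) +
  geom_histogram(binwidth = 5) +
  facet_wrap(~ wt_3q, scales = "free_x")  # each panel gets its own x range
```

With scales = "free_x" the two panels no longer share an x axis, which suits the split here but makes bar positions harder to compare across panels.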
Generally, you shouldn't create more data frames. Creating ais1 and ais2 is usually avoidable, and your life will be simpler if you use a single data frame for a single data set. Adding a new column for grouping makes it easy to keep things organized.
We can do this with ggarrange to arrange the plot objects for each subset
library(DAAG)
library(dplyr)   # needed for filter() and %>%
library(ggplot2)
library(ggpubr)
p2 <- ais %>%
filter(wt>=0, wt<=84.12) %>%
ggplot(., aes(wt, fill = sex)) +
geom_histogram(binwidth = 5) +
coord_cartesian(ylim = c(0, 30))
p1 <- ais %>%
filter(wt>84.12) %>%
ggplot(., aes(wt, fill = sex)) +
geom_histogram(binwidth = 5) +
coord_cartesian(ylim = c(0, 30))
ggarrange(p1, p2, ncol =2, nrow = 1, labels = c("p1", "p2"))