In R, using ggplot2, I am plotting the development over time of several variables for several groups in my sample (days of the week, to be precise). An artificial example (using long data suitable for plotting) is this:
library(tidyverse)
groups1 <- rep(1:2, each = 7 * 100)
groups2 <- rep(rep(1:7, times = 2), each = 100)
x <- rep(1:100, times = 14)
values <- c(rnorm(n = 700), rgamma(n = 700, shape = 2))
data <- tibble(x, groups1, groups2, values)
data %>% ggplot(mapping = aes(x = x, y = values)) + geom_line() + facet_grid(groups2 ~ groups1)
which gives a grid of line plots with one row per group (groups2) and one column per variable (groups1).
In this example, the first variable -- shown in the left column -- has unlimited range, while the second variable -- shown in the right column -- is weakly positive.
I would like to reflect this in my plot by allowing the Y axes to differ across the columns in this plot, i.e. set Y axis limits separately for the two variables plotted. However, in order to allow for easy visual comparison of the different groups for each of the two variables, I would also like to have the identical Y axes within each column.
I've looked at the scales option to facet_grid(), but it does not seem to be able to do what I want. Specifically,
passing scales = "free_y" allows the Y axes to vary across rows, while
passing scales = "free_x" allows the X axes to vary across columns, but
there is no option to allow the Y axes to vary across columns (nor, presumably, the X axes across rows).
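For reference, the row-wise behaviour described above can be checked with a minimal sketch like the following. It rebuilds the example data as a plain data frame with a seed (the seed and the names p_free_y and panel_layout are my own additions), and it assumes that the built plot exposes its panel layout via ggplot_build(), which may differ between ggplot2 versions.

```r
library(ggplot2)

set.seed(42)  # assumption: any seed works, it only makes the example reproducible
data <- data.frame(
  x = rep(1:100, times = 14),
  groups1 = rep(1:2, each = 7 * 100),
  groups2 = rep(rep(1:7, times = 2), each = 100),
  values = c(rnorm(n = 700), rgamma(n = 700, shape = 2))
)

p_free_y <- ggplot(data, aes(x = x, y = values)) +
  geom_line() +
  facet_grid(groups2 ~ groups1, scales = "free_y")

# The layout table lists, for every panel, its row, its column and
# the index of the y scale it uses.
panel_layout <- ggplot_build(p_free_y)$layout$layout

# With scales = "free_y" the y scale index follows the ROW, so the two
# panels in the same row (left and right column) still share one y axis.
print(panel_layout[, c("ROW", "COL", "SCALE_Y")])
```

So the left and right columns cannot get different Y axes this way, which is exactly the limitation I am running into.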
As usual, my attempts to find a solution have yielded nothing. Thank you very much for your help!
I think the easiest would be to create one plot per facet column and bind them with something like {patchwork}. To keep the facet look, you can still add a faceting layer to each plot.
library(tidyverse)
library(patchwork)
groups1 <- rep(1:2, each = 7 * 100)
groups2 <- rep(rep(1:7, times = 2), each = 100)
x <- rep(1:100, times = 14)
set.seed(42) ## always better to set a seed before using random functions
values <- c(rnorm(n = 700), rgamma(n = 700, shape = 2))
data <- tibble(x, groups1, groups2, values)
data %>%
  group_split(groups1) %>%
  map(
    ~ ggplot(.x, aes(x = x, y = values)) +
      geom_line() +
      facet_grid(groups2 ~ groups1)
  ) %>%
  wrap_plots()
Created on 2023-01-11 with reprex v2.0.2
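As a rough check that this gives one Y axis per column, here is a minimal sketch of the same idea using base split() and lapply() instead of group_split() and map(). The names column_plots, combined and y_ranges are made up for illustration, and reading the trained range via layer_scales() relies on ggplot2 internals that may change between versions.

```r
library(ggplot2)
library(patchwork)

set.seed(42)
data <- data.frame(
  x = rep(1:100, times = 14),
  groups1 = rep(1:2, each = 7 * 100),
  groups2 = rep(rep(1:7, times = 2), each = 100),
  values = c(rnorm(n = 700), rgamma(n = 700, shape = 2))
)

# One faceted plot per value of groups1. facet_grid() keeps its default
# fixed scales, so all rows inside one column plot share the same y axis.
column_plots <- lapply(split(data, data$groups1), function(d) {
  ggplot(d, aes(x = x, y = values)) +
    geom_line() +
    facet_grid(groups2 ~ groups1)
})

# Put the two column plots side by side.
combined <- wrap_plots(column_plots, ncol = 2)

# Data range of the y scale in each column plot (before axis expansion).
y_ranges <- lapply(column_plots, function(p) layer_scales(p)$y$range$range)
print(y_ranges)
```

Each column plot trains its own y scale, so the normal column spans negative and positive values while the gamma column starts above zero.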
I want to make an xy plot of nested groups (Group and Subgroup) where points are colored by Group and shaped by Subgroup. A minimal example is below:
DATA<-data.frame(
Group=c(rep("group1",10),rep("group2",10),rep("group3",10) ),
Subgroup = c(rep(c("1.1","1.2"),5), rep(c("2.1","2.2"),5), rep(c("3.1","3.2"),5)),
x=c(rnorm(10, mean=5),rnorm(10, mean=10),rnorm(10, mean=15)),
y=c(rnorm(10, mean=3),rnorm(10, mean=4),rnorm(10, mean=5))
)
ggplot(DATA, aes(x=x, y=y,colour=Group, shape=Subgroup) ) +
geom_point(size=3)
However, because in reality I have many more subgroups than can easily be identified based on the available shapes, I want to repeat the same shapes within each Group. Below is the same code but with an additional column (Shape) specifying the shape:
DATA<-data.frame(
Group=c(rep("group1",10),rep("group2",10),rep("group3",10) ),
Subgroup = c(rep(c("1.1","1.2"),5), rep(c("2.1","2.2"),5), rep(c("3.1","3.2"),5)),
Shape = as.character(c(rep(c(1,2),15) ) ),
x=c(rnorm(10, mean=5),rnorm(10, mean=10),rnorm(10, mean=15)),
y=c(rnorm(10, mean=3),rnorm(10, mean=4),rnorm(10, mean=5))
)
ggplot(DATA, aes(x=x, y=y,colour=Group, shape=Shape) ) +
geom_point(size=3)
Now the shapes and colours are as I want them. However, the legend no longer lists the subgroups. What I want is a legend that lists all subgroups under each respective Group. Something like:
Group1
1.1
1.2
Group2
2.1
2.2
Group3
3.1
3.2
(Ideally, this would be a single nested legend. If nested legends are not possible, perhaps they can be three separate legends with the Groups as titles)
Is this something that can be achieved, and how?
Thanks
One option to achieve your desired result would be via the ggnewscale package which allows for multiple scales and legends for the same aesthetic.
To this end we have to
split the data by Group and plot each group via a separate geom_point layer.
Additionally, each group gets a separate shape scale and legend, which we achieve via ggnewscale::new_scale.
Instead of mapping the color aesthetic, we set the color for each group as an argument, taken from a named vector of colors.
Instead of copying and pasting the code for each group, I use purrr::imap to loop over the split dataset and add the layers dynamically.
One more note: In general the order of legends is by default set via a "magic algorithm". To get the groups in the right order we have to explicitly set the order via guide_legend.
library(ggplot2)
library(ggnewscale)
library(dplyr)
library(purrr)
library(tibble)
DATA_split <- split(DATA, DATA$Group)
# Vector of colors and shapes
colors <- setNames(scales::hue_pal()(length(DATA_split)), names(DATA_split))
shapes <- setNames(scales::shape_pal()(length(unique(DATA$Shape))), unique(DATA$Shape))
ggplot(mapping = aes(x = x, y = y)) +
  purrr::imap(DATA_split, function(x, y) {
    # Get labels: named vector mapping each Shape value to its Subgroup
    labels <- x[c("Shape", "Subgroup")] %>%
      distinct(Shape, Subgroup) %>%
      deframe()
    # Get order: trailing number of the group name
    order <- as.numeric(gsub("^.*?(\\d+)$", "\\1", y))
    list(
      geom_point(data = x, aes(shape = Shape), color = colors[[y]], size = 3),
      scale_shape_manual(values = shapes, labels = labels, name = y, guide = guide_legend(order = order)),
      new_scale("shape")
    )
  })
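To make the two helper values computed inside the loop more concrete, here is a minimal base-R sketch of what they contain for each group. It only reproduces the label lookup and the legend order, not the plot; labels_by_group and legend_order are illustrative names, and the base-R calls stand in for the distinct() and deframe() steps above.

```r
# Only the grouping columns are needed for this illustration.
DATA <- data.frame(
  Group = c(rep("group1", 10), rep("group2", 10), rep("group3", 10)),
  Subgroup = c(rep(c("1.1", "1.2"), 5), rep(c("2.1", "2.2"), 5), rep(c("3.1", "3.2"), 5)),
  Shape = as.character(rep(c(1, 2), 15)),
  stringsAsFactors = FALSE
)
DATA_split <- split(DATA, DATA$Group)

# Per group: a named character vector, names = Shape values, values = Subgroup labels.
labels_by_group <- lapply(DATA_split, function(d) {
  pairs <- unique(d[c("Shape", "Subgroup")])
  setNames(pairs$Subgroup, pairs$Shape)
})

# Per group: the trailing number of the group name, used as the legend order.
legend_order <- vapply(
  names(DATA_split),
  function(nm) as.numeric(gsub("^.*?(\\d+)$", "\\1", nm)),
  numeric(1)
)

print(labels_by_group)
print(legend_order)
```

Because every group reuses the Shape values "1" and "2", the same two shapes appear in each legend while the labels differ per group.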
DATA
set.seed(123)
DATA <- data.frame(
Group = c(rep("group1", 10), rep("group2", 10), rep("group3", 10)),
Subgroup = c(rep(c("1.1", "1.2"), 5), rep(c("2.1", "2.2"), 5), rep(c("3.1", "3.2"), 5)),
Shape = as.character(c(rep(c(1, 2), 15))),
x = c(rnorm(10, mean = 5), rnorm(10, mean = 10), rnorm(10, mean = 15)),
y = c(rnorm(10, mean = 3), rnorm(10, mean = 4), rnorm(10, mean = 5))
)
I am trying to plot multiple box plots in a single graph. The data are scores on which I have run a Wilcoxon test. It should look like this:
I have four or five questions, and I want to plot the respondent scores for two sets as box plots. This should be done for all questions (two groups for each question).
I am thinking of using ggplot2. My data is like
q1o <- c(4,4,5,4,4,4,4,5,4,5,4,4,5,4,4,4,5,5,5,5,5,5,5,5,5,3,4,4,3,4)
q1s <- c(5,4,4,5,5,5,5,5,4,5,4,4,5,4,5,5,5,5,5,5,5,5,5,5,5,5,4,5,4,4)
q2o <- c(3,3,3,4,3,4,4,3,3,3,4,4,3,4,3,3,4,3,3,3,3,4,4,4,4,3,3,3,3,4)
q2s <- c(5,4,4,5,5,5,5,5,4,5,4,4,5,4,5,5,5,5,5,5,5,5,5,5,5,5,4,3,4,4)
....
....
q1 means question 1 and q2 means question 2. I also want to know how to arrange these box plots as needed, for example in one row or in two rows.
This should get you started:
Unfortunately you don't provide a minimal example with sample data, so I will generate some random sample data.
# Generate sample data
set.seed(2017)
df <- cbind.data.frame(
  value = rnorm(1000),
  Label = sample(c("Good", "Bad"), 1000, replace = TRUE),
  variable = sample(paste0("F", 5:11), 1000, replace = TRUE))
# ggplot
library(tidyverse)
df %>%
  mutate(variable = factor(variable, levels = paste0("F", 5:11))) %>%
  ggplot(aes(variable, value, fill = Label)) +
  geom_boxplot(position = position_dodge()) +
  facet_wrap(~ variable, ncol = 3, scales = "free")
You can specify the number of columns and rows in your 2d panel layout through arguments ncol and nrow, respectively, of facet_wrap. Many more details and examples can be found if you follow ?geom_boxplot and ?facet_wrap.
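As a small illustration of the one-row versus two-row layouts asked about in the question, the following sketch builds the same random data and only varies nrow. The object names base_plot, one_row and two_rows are made up here, and the layout check through ggplot_build() relies on the built plot's internal structure, which may vary between ggplot2 versions.

```r
library(ggplot2)

set.seed(2017)
df <- data.frame(
  value = rnorm(1000),
  Label = sample(c("Good", "Bad"), 1000, replace = TRUE),
  variable = factor(
    sample(paste0("F", 5:11), 1000, replace = TRUE),
    levels = paste0("F", 5:11)
  )
)

base_plot <- ggplot(df, aes(variable, value, fill = Label)) +
  geom_boxplot(position = position_dodge())

# All seven panels side by side in a single row.
one_row <- base_plot + facet_wrap(~ variable, nrow = 1, scales = "free")

# The same panels wrapped into two rows.
two_rows <- base_plot + facet_wrap(~ variable, nrow = 2, scales = "free")

# Number of panel rows actually used by each layout.
rows_used <- c(
  one_row = max(ggplot_build(one_row)$layout$layout$ROW),
  two_rows = max(ggplot_build(two_rows)$layout$layout$ROW)
)
print(rows_used)
```

Giving only nrow (or only ncol) lets facet_wrap work out the other dimension from the number of panels.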
Update 1
A boxplot based on your sample data doesn't make too much sense, because your data are not continuous. But ignoring that, you could do the following:
df <- data.frame(
q1o = c(4,4,5,4,4,4,4,5,4,5,4,4,5,4,4,4,5,5,5,5,5,5,5,5,5,3,4,4,3,4),
q1s = c(5,4,4,5,5,5,5,5,4,5,4,4,5,4,5,5,5,5,5,5,5,5,5,5,5,5,4,5,4,4),
q2o = c(3,3,3,4,3,4,4,3,3,3,4,4,3,4,3,3,4,3,3,3,3,4,4,4,4,3,3,3,3,4),
q2s = c(5,4,4,5,5,5,5,5,4,5,4,4,5,4,5,5,5,5,5,5,5,5,5,5,5,5,4,3,4,4));
df %>%
  gather(key, value, 1:4) %>%
  mutate(
    variable = ifelse(grepl("q1", key), "F1", "F2"),
    Label = ifelse(grepl("o$", key), "Bad", "Good")) %>%
  ggplot(aes(variable, value, fill = Label)) +
  geom_boxplot(position = position_dodge()) +
  facet_wrap(~ variable, ncol = 3, scales = "free")
Update 2
One way of visualising discrete data would be in a mosaicplot.
mosaicplot(table(df2));
The plot shows the count of value (as filled rectangles) per Variable per Label. See ?mosaicplot for details.
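The object df2 is not defined in the snippets above. One plausible shape for it, and this is purely an assumption on my part, is a long data frame with the columns value, Variable and Label, built from the question's data with the same key-to-label mapping as in Update 1. A minimal base-R sketch under that assumption (df_long and counts are hypothetical names):

```r
df <- data.frame(
  q1o = c(4,4,5,4,4,4,4,5,4,5,4,4,5,4,4,4,5,5,5,5,5,5,5,5,5,3,4,4,3,4),
  q1s = c(5,4,4,5,5,5,5,5,4,5,4,4,5,4,5,5,5,5,5,5,5,5,5,5,5,5,4,5,4,4),
  q2o = c(3,3,3,4,3,4,4,3,3,3,4,4,3,4,3,3,4,3,3,3,3,4,4,4,4,3,3,3,3,4),
  q2s = c(5,4,4,5,5,5,5,5,4,5,4,4,5,4,5,5,5,5,5,5,5,5,5,5,5,5,4,3,4,4)
)

# stack() turns the four columns into two: values and ind (the column name).
long <- stack(df)
key <- as.character(long$ind)

# Hypothetical stand-in for df2: one row per response.
df_long <- data.frame(
  value = long$values,
  Variable = ifelse(grepl("q1", key), "F1", "F2"),
  Label = ifelse(grepl("o$", key), "Bad", "Good")
)

# Three-way contingency table: value x Variable x Label.
counts <- table(df_long)
mosaicplot(counts, main = "Scores per Variable and Label")
```

The table holds the number of responses for every combination of score, question and set, which is what the mosaic plot draws as rectangles.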
I recently discovered the multiplot function from the Rmisc package to produce stacked plots using ggplot2 plots/objects. What I am trying to do now is to create a multiplot of multiplots. Unfortunately, unlike the ggplot function, multiplot does not produce objects, so my issue cannot be resolved by simply nesting multiplot.
I will create a dataframe to make my point clear. In my dataframe named df, I have 3 columns: period, group and value. A certain value is recorded for each of 3 groups over 10 periods. (Note: I don't use a seed number below despite the use of the sample function because the focus is not numerical, it is graphical)
# Create a data frame for illustration purposes
df <- data.frame(period = rep(1:10, 3),
group = rep(LETTERS[1:3], each = 10),
value = sample(100, 30, replace = TRUE))
I then add a fourth column to df, which is the exponential transformation of the value column.
df$exp.value = exp(df$value)
I would like to create stacked plots allowing me to compare the values in each group to their exponential counterparts.
# Split dataframe by group
df_split <- split(df, df$group)
# Plots of values in each group
plots <- lapply(df_split, function(i){
ggplot(data = i, aes(x = period, y = value)) + geom_line()
})
# Plots of exponential values in each group
plots_exp <- lapply(df_split, function(i){
ggplot(data = i, aes(x = period, y = exp.value)) + geom_line()
})
plots and plots_exp are both lists of 3 elements each containing ggplot objects. The first element of each list corresponds to group A, the second element corresponds to group B and the third element corresponds to group C.
In order to compare each group's values to the exponential values, I can use the multiplot function. Following is an example with group A:
library(Rmisc)
multiplot(plots[[1]], plots_exp[[1]], cols = 1)
How can I create a grid which will include the multiplot above as well as the ones for groups B and C? As if the code included ... + facet_grid(. ~ group)?
We can use the cowplot package:
library(cowplot)
plot_grid(plots[[1]], plots_exp[[1]],
plots[[2]], plots_exp[[2]],
plots[[3]], plots_exp[[3]],
labels = c("A", "A", "B", "B", "C", "C"),
ncol = 1, align = "v")
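With many groups, typing every plot by hand gets tedious. A minimal sketch of one way around that is to interleave the two lists and pass them through the plotlist argument of plot_grid. The name paired and the seed are my own additions, and the column-wise layout (ncol = 3 with byrow = FALSE) is a variation meant to mimic facet_grid(. ~ group) rather than the single stacked column used above.

```r
library(ggplot2)
library(cowplot)

set.seed(1)  # assumption: a seed is added only to make the sketch reproducible
df <- data.frame(period = rep(1:10, 3),
                 group = rep(LETTERS[1:3], each = 10),
                 value = sample(100, 30, replace = TRUE))
df$exp.value <- exp(df$value)
df_split <- split(df, df$group)

plots <- lapply(df_split, function(i) {
  ggplot(data = i, aes(x = period, y = value)) + geom_line()
})
plots_exp <- lapply(df_split, function(i) {
  ggplot(data = i, aes(x = period, y = exp.value)) + geom_line()
})

# Interleave the lists: A, exp(A), B, exp(B), C, exp(C).
paired <- unlist(Map(list, plots, plots_exp), recursive = FALSE)

# Fill the grid column by column, so each group gets its own column
# with the raw values on top and the exponential values below.
grid <- plot_grid(plotlist = paired,
                  labels = rep(names(plots), each = 2),
                  ncol = 3, byrow = FALSE, align = "v")
```

Printing grid (or calling print(grid)) draws the combined figure.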
We can output to a pdf looping through plots and plots_exp list objects. Every page will contain 2 plots. This is a better option when we have a lot of groups:
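pdf("myPlots.pdf")
# print() draws each combined plot on its own page, even when the script is sourced
invisible(lapply(seq_along(plots), function(i){
  print(plot_grid(plots[[i]], plots_exp[[i]], ncol = 1, align = "v"))
}))
dev.off()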
Another option is to prepare the data for ggplot and use facet as usual:
library(dplyr)
library(tidyr)
library(ggplot2)
gather(df, valueType, value, -c(group, period)) %>%
mutate(myGroup = paste(group, valueType)) %>%
ggplot(aes(period, value)) +
geom_line() +
facet_grid(myGroup ~ ., scales = "free_y")
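tidyr now recommends pivot_longer() over gather() for new code, so here is a minimal sketch of the same reshaping with pivot_longer(). The column name y_value, the seed and the two-way facet layout (value type in rows, group in columns) are my own choices for illustration; the layout is meant to come close to the facet_grid(. ~ group) arrangement asked about in the question.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(1)  # assumption: a seed is added only to make the sketch reproducible
df <- data.frame(period = rep(1:10, 3),
                 group = rep(LETTERS[1:3], each = 10),
                 value = sample(100, 30, replace = TRUE))
df$exp.value <- exp(df$value)

# Long format: one row per period, group and value type.
df_long <- df %>%
  pivot_longer(cols = c(value, exp.value),
               names_to = "valueType",
               values_to = "y_value")

# Groups as columns, value types as rows; free_y gives each row its own y axis.
p <- ggplot(df_long, aes(period, y_value)) +
  geom_line() +
  facet_grid(valueType ~ group, scales = "free_y")
```

Since free_y in facet_grid works per row, the raw values and the exponential values each get their own Y axis, shared across the three groups.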